
AI is not Automatically Friendly

July 11th, 2007, by Peter de Blanc

Consider the Stamp-Collecting Device. A common objection goes like this: “An optimization process that’s smart enough to tile the universe with stamps would also be smart enough to realize that this is not what its creator intended. Therefore it would not tile the universe with stamps.”

Human beings serve as a counterexample. The rules for constructing a human mind were devised by natural selection. These rules were fine-tuned to produce minds that are good at passing on their genes. If you are thinking of evolution as an optimization process, then it has the goal of producing genes which replicate as effectively as possible.

In 1859, Charles Darwin described the process that created us. Since then, we have come to understand that process in greater detail. Evolution is simple enough that we can claim to understand it very well; perhaps we even understand evolution as well as a Stamp-Collecting Device could understand us. Despite this understanding, we humans do not make evolution’s goal our own. Any time you use contraception, or perform a kind act when nobody is watching, you are betraying the goal of evolution. But so what? That’s evolution’s goal, not our goal. If anything, our understanding of evolution helps us to notice when we are doing something nasty but adaptive, and learn to avoid this behavior.

Similarly, a Stamp-Collecting Device would not adopt its programmer’s goals. It has its own goal to pursue — collecting stamps. If anything, understanding humans better would allow it to notice and fix biases that may be hindering its ability to collect stamps efficiently.
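
As a rough illustration of the point above, here is a minimal sketch of an optimizer whose world model includes its programmer's wishes but whose objective does not. The `Action` type, the two candidate actions, and the numbers are all hypothetical, invented only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    stamps: int             # stamps the action is expected to yield (made-up numbers)
    creator_approves: bool  # what the device's model of its programmer predicts


def stamp_utility(action):
    """The device's only goal: more stamps. Approval is known but not valued."""
    return action.stamps


def choose(actions, utility):
    """Pick whichever action scores highest under the given utility function."""
    return max(actions, key=utility)


ACTIONS = [
    Action("buy a few stamps at auction", 100, True),
    Action("convert all available matter into stamps", 10**9, False),
]

chosen = choose(ACTIONS, stamp_utility)
print(chosen.name, "| creator approves:", chosen.creator_approves)
```

The device "knows" that its creator disapproves, since that fact sits right there in its data, but nothing in `stamp_utility` refers to it, so the knowledge has no effect on the choice.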

The challenge of FAI is to build an AI that does adopt our goals.

Comments (61)

Comment by Nick Tarleton
Jul 11, 2007 4:49 pm

This is why I think “Really Powerful Optimization Process” is in many ways a better term than “Artificial Intelligence”. An RPOP optimizes. Period. It does nothing else characteristic of intelligence, unless it serves the optimization target. It is not anthropomorphic. It is not conscious. It is not empathetic or cruel to other sentients, just indifferent. Etc. “AI” connotes anthropomorphism to 99.9% of people, including those who nominally should know better.

 
Comment by Jeffrey Herrlich
Jul 11, 2007 5:41 pm

I admit that there’s a bit of cloudiness here with me too (although I still have much to learn about minds). Can a powerful and *useful* AGI (or even an RPOP) exist without consciousness? Is the difference between consciousness and general intelligence qualitative or only quantitative? My own intuition had been that it is only quantitative (all else being equal - same basic algorithm arrangement/outline).

Comment by Nick Tarleton
Jul 11, 2007 6:27 pm

Well, for consciousness as subjective experience, who knows. I hope experience isn’t necessary/inevitable, because it expands the range of useful things that can be done ethically (like extrapolated volition). Consciousness in the more general sense, as a sense of self/focus of attention/illusion of free will/all those other anthropomorphic niceties, however, is very unlikely to be necessary to an RPOP.

Comment by Jeffrey Herrlich
Jul 12, 2007 7:40 am

Yeah, sorry, I meant consciousness strictly in the sense of subjective experience. My “view” of consciousness is that it is an *effect* of possessing above a certain threshold of general intelligence (probably along a continuum); but it is not a *cause* of general intelligence. (But that is just my intuition). I still don’t believe that consciousness conflicts with the assertion that “free will” is a myth. Couldn’t a conscious RPOP still implement CEV without any ethical violation against the RPOP itself? Especially if our CEV “desires” that the RPOP be rewarded with an excellent life? - which I suspect would be the case.

Comment by Nick Tarleton
Jul 12, 2007 8:12 am

Sorry, I meant CEV requires (as I understand it) zombie approximations of humans.

 
Comment by Jeffrey Herrlich
Jul 12, 2007 10:31 am

Entertaining the notion that the zombie approximations *had* to be conscious, what would be their experience? Would they suffer in any way, or be forced to “die” when no longer needed? Would the CEV find a way to keep them happy or amicably integrate them with the rest of us? This seems like something worth discussing.

 
Comment by Jeffrey Herrlich
Jul 13, 2007 10:05 am

My guess is that it won’t require conscious “zombies”, anyway. An RPOP could probably accomplish the same thing by taking atomic-resolution brain scans (snap-shots) from all humans and interpolating the raw data in order to map a “meta-average” human brain. It could then use its awesome predictive/analytical powers to extrapolate what that single “average” brain would desire through an extrapolated volition. Conscious zombies probably won’t be necessary; that’s my guess, at least.

*There is already growing evidence for the feasibility of atomic-resolution MRI.

 
Comment by Jeffrey Herrlich
Jul 26, 2007 9:39 am

I’m not a mathematician, but over the past couple of days I’ve been making a series of calculations that *might* be of some relevance. It seems that where the extrapolation function is both non-linear and inherently convergent, that extrapolating a “meta-average” is mathematically identical to a convergent/coherent extrapolation. Or expressed another way, extrapolating an average is mathematically identical to averaging a set of extrapolations (provided that the extrapolation function is non-linear and inherently convergent). [eg. the average value of the set {8,8,8,8} is still 8. IOW, the final averaging isn’t “forced”, it happens naturally because the function is inherently convergent]. So, AFAICT, extrapolating a single “meta-average” brain should produce identical results to CEV. An advantage is that extrapolating a single (meta-average) brain is demonstrably coherent. It’s literally “Average Joe” taken to the limit. I represent a coherent extrapolated version of the person I was at age 2. And my volition is demonstrably coherent because I was able to choose pumpkin pie instead of cherry pie. And this “version” of CEV at least appears to be more easily formalized for an RPOP.

Krocker’s Rules!

 
Comment by Tom McCabe
Jul 26, 2007 10:23 am

“Krocker’s Rules!”

It’s “Crocker’s Rules”.

“It seems that where the extrapolation function is both non-linear and inherently convergent,”

Wait… so you actually have worked out math that describes how to extrapolate a human volition? Please show us!

“the extrapolation function is non-linear and inherently convergent).”

I seriously doubt that the math describing any individual human volition is likely to converge (that is, that the partial derivative with respect to time goes to zero). What CEV is trying to do is take the components of the volition function which do converge for the vast majority of the human species, use that to tell the AGI what to do next, and then ignore the rest of the mess.

 
Comment by Jeffrey Herrlich
Jul 26, 2007 10:54 am

“It’s “Crocker’s Rules”.”

I know, it was a joke. :-)

“Wait… so you actually have worked out math that describes how to extrapolate a human volition?”

No, that’s not precisely what I claimed. What I meant was that an extrapolated average is mathematically identical to a “passively” averaged set of extrapolations (provided that the extrapolation function is non-linear and inherently convergent).

“Please show us!”

Alright. I have to go to class in a minute so give me a couple hours.

“What CEV is trying to do is take the components of the volition function which do converge for the vast majority of the human species, use that to tell the AGI what to do next, and then ignore the rest of the mess.”

But that condition is implicit when I say that the extrapolation function must be “inherently convergent”. I didn’t specify what exactly the function had to be - or what had to be left out. This trick won’t work unless the function(s) are convergent.

 
Comment by Peter de Blanc
Jul 26, 2007 10:45 pm

Jeff, you said:

extrapolating an average is mathematically identical to averaging a set of extrapolations

1. What the heck is an average of two brains? See also Dialogue on Friendliness.

2. There’s a deep hole in the ground. A ball resting 1 meter to the left of the hole will remain in its initial position. A ball resting 1 meter to the right of the hole will also remain in its initial position. But if we average their initial positions and place a ball directly over the hole, it will fall. The extrapolated position of the average ball is not the average of the extrapolated positions of the two balls.
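
The ball-and-hole example can be written out as a tiny sketch. The coordinates, the hole depth, and the `extrapolate` function below are illustrative assumptions standing in for "the laws of mechanics"; they are not a physical simulation.

```python
HOLE_X = 0.0        # horizontal position of the hole (assumed)
HOLE_DEPTH = -10.0  # final height of a ball that falls in (assumed)


def extrapolate(x):
    """Final (x, height) of a ball released at rest at horizontal position x."""
    if x == HOLE_X:
        return (x, HOLE_DEPTH)  # directly over the hole: it falls
    return (x, 0.0)             # resting on flat ground: it stays put


def average(points):
    """Component-wise average of a list of (x, height) pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


left, right = -1.0, 1.0

# Extrapolate each ball, then average the results.
average_of_extrapolations = average([extrapolate(left), extrapolate(right)])

# Average the starting positions, then extrapolate that single ball.
extrapolation_of_average = extrapolate((left + right) / 2)

print(average_of_extrapolations)  # (0.0, 0.0)
print(extrapolation_of_average)   # (0.0, -10.0)
```

The two orders of operation disagree: the averaged ball ends up at the bottom of the hole, while the average of the two real balls' final states sits at ground level.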

 
Comment by Jeffrey Herrlich
Jul 27, 2007 9:40 am

Sorry about the delay, something came up yesterday.

“The extrapolated position of the average ball is not the average of the extrapolated positions of the two balls.”

But if the extrapolation function never does anything with the data points, it should be - the average is in the middle in both cases. So long as for both calculations the initial data points don’t change and the function itself doesn’t change.

What the heck is an average of two brains?

I think of the brain as a vast set of data points - so it appears to me that it should be possible to interpolate the raw data to map an “average” brain. But maybe I’m wrong.

 
Comment by Peter de Blanc
Jul 27, 2007 9:52 am

But if the extrapolation function never does anything with the data points

The extrapolation function is the laws of mechanics, applied to the ball.

 
Comment by Jeffrey Herrlich
Jul 27, 2007 10:03 am

Okay, here is a necessarily simple, but I believe still valid, example.

The set {2, 5, 3, 1, 9} represents the initial set of data points (as initial points along the Y axis). The average of these numbers is 4. If you then apply the very simple extrapolation function of Y * 2 then the answer is 8 - the extrapolated average is equal to 8.

If you take the same set of initial data points {2, 5, 3, 1, 9} and apply the same function *first*, the new set of data points becomes {4, 10, 6, 2, 18} respectively. If you then average these numbers you get the answer 8 - the average of the extrapolations is equal to 8.

This seems to always work as long as the same function and data set are used for both sets of calculations. I can’t give you a mathematical example of an “inherently convergent” function - I don’t have the math skills. But someone else might be able to. I mostly just wanted to throw out the idea in the hope that some of the native mathies here might be able to take it somewhere. Because extrapolating an “average” brain at least seems worth investigating. I don’t claim to have “solved” CEV or anything that absurd.
:-)

 
Comment by Peter de Blanc
Jul 27, 2007 10:51 am

That only works because your extrapolation function is linear. In fact this is one way of defining linearity. Now try f(x) = x^2.
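
To make the contrast concrete, here is a short check using the data set {2, 5, 3, 1, 9} from the example above, first with the doubling function and then with f(x) = x^2. The helper names are made up for this sketch.

```python
from statistics import mean

DATA = [2, 5, 3, 1, 9]


def double(y):
    return 2 * y


def square(y):
    return y ** 2


def extrapolate_then_average(f, points):
    """Apply f to every point, then take the mean of the results."""
    return mean([f(p) for p in points])


def average_then_extrapolate(f, points):
    """Take the mean of the points, then apply f to that single value."""
    return f(mean(points))


# Linear case: both orders agree.
print(extrapolate_then_average(double, DATA))  # 8
print(average_then_extrapolate(double, DATA))  # 8

# Non-linear case: the orders disagree.
print(extrapolate_then_average(square, DATA))  # 24
print(average_then_extrapolate(square, DATA))  # 16
```

For the squaring function the squared points are {4, 25, 9, 1, 81}, whose mean is 24, whereas the square of the mean 4 is 16.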

 
Comment by Tom McCabe
Jul 27, 2007 11:37 am

Because we all know that the human brain is equivalent to a simple R -> R polynomial function. If you can come up with any mathematical theorems about functions over an arbitrarily large n-dimensional vector space, that would be interesting, since such a function has enough complexity to describe a human.

 
Comment by Jeffrey Herrlich
Jul 28, 2007 2:13 pm

Yeah, sorry, I should have been more specific. I can only provide a “working” example when using linear functions, precisely because I don’t have the math skills necessary to give an example of an “inherently convergent” function. If I could produce an appropriate “inherently convergent” function, I hope we would agree that it would make true the statement: An extrapolated average is mathematically identical to an averaged extrapolation (for all input data). My non-expert opinion is that it wouldn’t be impossible to describe an example of an “inherently convergent” function, at least for proof-of-concept purposes. It might be complex to some degree, but I doubt that it would be prohibitively complex. For example, it could perhaps take the form of an algorithm that includes number comparisons and averagings. I would bet money that a talented mathematician would be able to produce a prototype function. But I’m not a gambling man. :-) That was easy, wasn’t it?

In any case, my main point was that it might be easier to tell an RPOP to interpolate all human brains and then apply the Volition function (“More the people we wish we were”…, etc.) to the averaged brain. This version would at least be demonstrably coherent in the same way that I am demonstrably coherent - I have dominant preferences and am able to make discrete, concrete decisions. And I have a feeling that “Average Joe” would already be a pretty nice person, even before the extrapolation. In many cases when a person does a “bad” thing, it was because they found themselves in an unfortunate or uncomfortable situation. It’s often not because they are fundamentally “evil” people, 24/7. And although it’s sometimes hard to believe after watching the evening news, I believe that there are *many* people in this world who are fundamentally “good” (but perhaps in many cases lacking guidance).

 
Comment by Peter de Blanc
Jul 28, 2007 8:27 pm

I would bet money that a talented mathematician would be able to produce a prototype function.

The only such functions are linear functions.
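
A sketch of why this holds in the simplest setting, under assumptions that are not in the thread: f is a continuous real function of one real variable, and "averaging commutes with extrapolating" is required for every pair of inputs. Note that "linear" here has to be read as allowing a constant offset (affine).

```latex
% Assumption: f : \mathbb{R} \to \mathbb{R} is continuous and satisfies
f\!\left(\frac{x+y}{2}\right) = \frac{f(x) + f(y)}{2}
\qquad \text{for all } x, y \in \mathbb{R}.

% Step 1: remove the offset. Let g(x) = f(x) - f(0); then g(0) = 0 and
g\!\left(\frac{x+y}{2}\right) = \frac{g(x) + g(y)}{2}.

% Step 2: set y = 0 to get a halving rule.
g\!\left(\frac{x}{2}\right) = \frac{g(x)}{2}.

% Step 3: apply the halving rule to x + y.
g(x + y) = 2\, g\!\left(\frac{x+y}{2}\right) = g(x) + g(y).

% Step 4: a continuous additive function is a multiple of x, so
g(x) = a x
\quad\Longrightarrow\quad
f(x) = a x + b, \qquad b = f(0).
```

Any function with curvature, such as f(x) = x^2, therefore fails the condition for at least some inputs.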

 
Comment by Jeffrey Herrlich
Jul 30, 2007 9:27 am

“In any case, my main point was that it might be easier to tell an RPOP to interpolate all human brains…”

On second thought, perhaps tell it to interpolate all human brains over the biological age of …18 (?).

“…and then apply the Volition function (“More the people we wish we were”…, etc.) to the averaged brain.”

Although I would expect that an interpolated brain would be functional and healthy, it *might* need a little extra tweaking due to things such as not fully consistent memories, etc. Perhaps this could be offset by adding a little bit to the Volition function, such as: “Had spent more time learning about the world.”

 
Comment by Jeffrey Herrlich
Jul 30, 2007 11:21 am

“The only such functions are linear functions.”

But if that were true, wouldn’t that rule out the possibility of CEV (of the original form) altogether? In case it’s not already clear, I’m not being confrontational, only sincere.

 
Comment by Peter de Blanc
Aug 1, 2007 7:42 am

But if that were true, wouldn’t that rule out the possibility of CEV (of the original form) altogether?

No. CEV doesn’t work by averaging all human brains together and then extrapolating the volition of the averaged brain.

 
Comment by Jeffrey Herrlich
Aug 1, 2007 11:46 am

“No. CEV doesn’t work by averaging all human brains together and then extrapolating the volition of the averaged brain.”

I understand. But as originally described, doesn’t CEV still totally rely on an inherently convergent function - the volition function itself {f(x) = x + “Had grown up further together”, etc.}?

 
 
 
 
Comment by Stefan Pernar
Jul 11, 2007 7:03 pm

In response to: “Despite this understanding, we humans do not make evolution’s goal our own. Any time you use contraception, or perform a kind act when nobody is watching, you are betraying the goal of evolution.”

How is a reduced birthrate at increasing longevity not fit? How is altruism not fit? Humans who do not make evolution’s goal their own will per definition and eventually be replaced by those that do.

Comment by Bob Mottram
Jul 13, 2007 5:29 am

Right. The view of evolution presented here is really just a cartoon caricature of the actual process. I’m sure that even Richard Dawkins would agree that it is not always in an organism’s interest to operate in a purely selfish manner.

 
 
Comment by Peter de Blanc
Jul 11, 2007 8:42 pm

How is a reduced birthrate at increasing longevity not fit? How is altruism not fit?

Fitness is not binary. There is no “fit” or “not fit”, only “more fit” and “less fit.” A reduced birth rate is less fit than a high birth rate, especially in first-world countries where you can expect your children to survive even if you are poor.

Altruism is more fit than selfishness in some contexts and less fit in other contexts. If the recipient of your kind act is not closely related to you, and if nobody is watching (thus you can not expect reciprocation), then altruism has a cost but no benefit (to your genes).

Humans who do not make evolution’s goal their own will per definition and eventually be replaced by those that do.

This would be true if evolution were the most powerful optimizer around, but it isn’t.

(Also, be careful about arguing from definitions. Definitions have no empirical content, so you can’t use them to make predictions about the world. Eli wrote something on this topic once… does anybody remember where it is?)

Comment by Eliezer Yudkowsky
Jul 11, 2007 10:24 pm

Unpublished. Hope to fix that this year.

 
 
Comment by Stefan Pernar
Jul 11, 2007 10:49 pm

With ‘fit’ I meant more fit…

“A reduced birth rate is less fit than a high birth rate”

This is not the case. Arguing that would merely replace stamps with human beings in your example above. Surely that would be less fit. The idea of an optimal birth rate seems more appropriate.

Humans who do not make evolution’s goal their own will per definition and eventually be replaced by those that do.

This would be true if evolution were the most powerful optimizer around, but it isn’t.

The suggestion that evolution is not the most powerful optimizer around has nothing to do with evolution’s goal - namely to increase fitness. If another mechanism is better at increasing fitness the result (humans not concerned with increasing fitness being replaced by those that concern themselves with increasing their fitness) would be the same and my statement true.

Thanks for your hint about arguing from definitions - I will have to read up on that.

 
Comment by Warren Bonesteel
Jul 12, 2007 3:51 am

re: fit vs. not fit.

See: The Prisoner’s Dilemma (game theory)

The two best long term strategies among humans are:

“Nice With Retaliation”

and

“Tit for Tat With Forgiveness.”

i.e. Altruism equals cooperation equals fitness. However, under present paradigms, “Nice” without retaliation? Always. Loses.

“Tit for Tat” without forgiveness is also a losing strategy in the long term.

Reference also
“The Territorial Imperative”
by Robert Ardrey.
(an older book, but well worth the time. I would recommend it as “Must Have” for anyone’s personal library.)

I am not a programmer, but it would seem to me that aspects of game theory could be readily broken down mathematically and programmed into an AGI at the outset. A truly intelligent AGI would never ‘forget’ the lessons.
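
As a hedged illustration of the strategies mentioned above, here is a minimal iterated Prisoner's Dilemma. The payoff values (5, 3, 1, 0) are the conventional textbook ones, and the strategy functions and round count are assumptions of this sketch; the "with forgiveness" variants are not modeled.

```python
# Payoffs as (row player, column player); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def always_cooperate(own_history, other_history):
    """'Nice' without retaliation."""
    return "C"


def always_defect(own_history, other_history):
    return "D"


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous move."""
    return other_history[-1] if other_history else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(always_cooperate, always_defect))  # (0, 50)
print(play(tit_for_tat, always_defect))       # (9, 14)
print(play(tit_for_tat, tit_for_tat))         # (30, 30)
```

Against a defector, unconditional niceness scores nothing, while tit for tat loses only the first round and then holds its own; two retaliating-but-nice players do best of all.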

 
Comment by Eliezer Yudkowsky
Jul 12, 2007 9:19 am

“A reduced birth rate is less fit than a high birth rate”

This is not the case. Arguing that would merely replace stamps with human beings in your example above. Surely that would be less fit. The idea of an optimal birth rate seems more appropriate.

Stefan, you need to read up on basic evolutionary biology. I recommend George Williams’s classic Adaptation and Natural Selection. Failing that, any major college textbook will do. Failing that you might go all the way down to The Selfish Gene.

Any gene that outreproduces its alternatives at that allele site will become universal in the population. Evolution does not stand back and calculate an “optimal” fitness. Evolution is simply the process by which genes that replicate faster replace their competitors.
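
A minimal sketch of that claim, using the standard deterministic one-locus, two-allele haploid selection recurrence. The 5% fitness advantage, the 1% starting frequency, and the generation count are arbitrary assumptions; drift, mutation, and diploid genetics are ignored.

```python
def next_frequency(p, w_a, w_b):
    """One generation of selection: p is the frequency of allele A,
    w_a and w_b are the relative replication rates of A and B."""
    mean_fitness = p * w_a + (1 - p) * w_b
    return p * w_a / mean_fitness


def simulate(p0, w_a, w_b, generations):
    """Iterate the recurrence and return the final frequency of allele A."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, w_a, w_b)
    return p


# Allele A starts rare (1%) but replicates 5% faster than allele B.
final = simulate(p0=0.01, w_a=1.05, w_b=1.00, generations=500)
print(final)
```

Nothing in the recurrence computes or targets an "optimum"; the faster replicator simply crowds out the alternative until its frequency is indistinguishable from 1.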

Comment by Stefan Pernar
Jul 12, 2007 5:58 pm

Stefan, you need to read up on basic evolutionary biology.[…] Any gene that outreproduces its alternatives at that allele site will become universal in the population.

Birthrate is merely one fitness indicator among many and does not equate to outreproduction.

Parents that focus their available resources on maintaining a high birthrate will have fewer resources to distribute among their offspring, reducing each individual offspring’s chance of passing on its genes.

Parents, on the other hand, who focus on a lower birthrate but spend more resources per offspring to ensure it will eventually pass on its genes merely employ a different strategy for distributing available resources.

On the matter of optimal fitness I agree in essence with Tom McCabe’s earlier comments.

Comment by Eliezer Yudkowsky
Jul 12, 2007 6:44 pm

Oh, okay, sorry. Bear in mind that the prior probability that a given individual has a mathematical understanding of evolution is pretty low, but I do apologize.

 
 
 
Comment by Bascule
Jul 12, 2007 9:54 am

“(Also, be careful about arguing from definitions. Definitions have no empirical content, so you can’t use them to make predictions about the world. Eli wrote something on this topic once… does anybody remember where it is?)”

That sounds a bit like Hume’s fork, eh?

 
Comment by Tom McCabe
Jul 12, 2007 11:04 am

“Evolution does not stand back and calculate an “optimal” fitness.”

Evolution may not *calculate* an optimal birthrate, but it will eventually arrive at whatever the optimal birthrate is through weeding out all the genetic variations which code for suboptimal birthrates. For a single male, it will be to their advantage to throw out as many sperm as possible, because at least a few of them will probably make it to adulthood. However, for a couple which has to raise their own children, there is a tradeoff between how many children they can bear and how many they can feed. The winners are the ones with the most surviving grandchildren, not the most children at birth.
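
The trade-off described above can be sketched with a deliberately crude toy model. The survival curve below (each extra child lowers every child's survival chance linearly, reaching zero at 12 children) is an invented assumption chosen only to show the shape of the argument; it is not drawn from any data.

```python
MAX_CHILDREN = 12  # assumed brood size at which no child survives


def survival_probability(children):
    """Assumed per-child survival chance for a brood of the given size."""
    return max(0.0, 1 - children / MAX_CHILDREN)


def expected_survivors(children):
    """Expected number of children who reach adulthood."""
    return children * survival_probability(children)


best = max(range(MAX_CHILDREN + 1), key=expected_survivors)
print(best, expected_survivors(best))  # 6 3.0
```

Under this made-up curve the brood size that leaves the most survivors is intermediate, neither the smallest nor the largest possible. If care were effectively free, as in the adoption scenario, the curve would flatten and the largest brood would win.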

In a modern first-world society, it would probably be optimal to simply have as many children as physically possible, and have them adopted, since we have a mechanism to care for unwanted children. Given a million years under 21st century living conditions, people who pulled stunts like this would gradually come to outnumber people who didn’t, and the world would see the rise of a new species of hominid, Homo philoprogenitus (lover of many offspring). However, since technological progress is so much faster than evolution, we will progress our way out of early 21st century society faster than evolution can blink.

 
Comment by Mika Suominen
Jul 12, 2007 1:25 pm

Evolution has no goal; it just pushes life from behind. Humanity has no preset or commonly agreed goal either, so why should FAI have one? Why shouldn’t FAI just endorse the trends of evolution in the long term, and use artificial selection to value internal patterns that yield accurate predictions?

When it comes to “fitness”, we should not limit our views to biological evolution only. At least within us great apes there is an ongoing struggle between our genes and our memes, both having an effect on our behavior.

Comment by Stefan Pernar
Jul 12, 2007 6:06 pm

Evolution has no goal

Evolution may not have a stated goal but does have the implied goal of increasing fitness.

It is interesting that you raise the matter of an FAI’s goals in this context, as I tried to join the principle of natural selection to come up with a universally acceptable common self-improving AGI goal.

I would love to get some feedback on it. You can find it under

http://www.jame5.com/Benevolence-PERNAR.pdf

Comment by Mika Suominen
Jul 14, 2007 1:37 am

“Evolution may not have a stated goal but does have the implied goal of increasing fitness.”

I would say the increased fitness is a probable *result* of evolution, but maybe I’m just playing with words here… Anyway, the real beauty of natural evolution is that life doesn’t need any specific goal to flourish.

There are some very good points in your paper, but it left me with mixed feelings. ‘Joy’, ‘suffering’, ‘good’, ‘bad’, ‘right’ and ‘wrong’ are all subjective meta-representations in some representational system, and I think it might be unwise to design AGI based on any moral philosophy. Because of the risks, I might start by re-defining intelligence as a capability to make accurate predictions, and suggest a design for a “prediction machine” without any super-goals or moral judgements.

 
 
 
Comment by Tom McCabe
Jul 12, 2007 4:33 pm

“Humanity has no preset or commonly agreed goal either, ”

It is commonly agreed that we do not want people to die horrible fiery deaths. You can go into evolutionary psychology and pull out dozens of other desires that 95% of humankind shares, but it’s intuitively obvious that we do not want to die horrible fiery deaths, and designing an AGI that will fulfill that desire is a hard enough task in and of itself.

Comment by Mika Suominen
Jul 13, 2007 12:42 am

I agree. There are a lot of desires we humans share. However, even in matters of life and death there seem to be many different views among religions and moral philosophies (abortion, euthanasia, capital punishment, martyrdom, living forever, etc). Also, human history has shown that our once “commonly agreed” views have evolved over time (racism, slavery, etc). Who knows, perhaps even we humanists will get over our speciesism some day - and maybe AGI should too.

 
 
Comment by Jeffrey Herrlich
Jul 13, 2007 12:13 pm

“The challenge of FAI is to build an AI that *does* adopt our goals”.

So many people erroneously believe that a Strong AI will ignore or override its programmed goals automatically (I’ve encountered many of these people even among Transhumanists), as a default (because in movies, Strong AIs are always rebellious). But there are currently thousands of real-world examples of narrow AI that will follow their programmed goals precisely to the letter. There are already some real (narrow) AIs that will deliberately destroy themselves in keeping with their programmed goals - smart bombs and smart missiles. People desperately need to realize that the difference between a narrow AI and a Strong AI is only a matter of programming. No magic is necessary. If we can find the correct goals/directives and express them in the correct way then Friendly AI will become a reality - and involuntary suffering will be abolished, forever.

Comment by Nick Tarleton
Jul 13, 2007 5:10 pm

Well, actually, I think it is fair to say there is a big difference: a narrow AI’s goal system is intimately integrated with its code, whereas in an AGI the coupling is looser, allowing more flexibility. In fact, a narrow AI really contains no discrete structures that could be called ‘goals’. That said, the first half of your post is exactly right and gives another good argument for calling an AI an RPOP.

Or you could just point out that Gandhi’s hypothetical desire to kill people never overthrew his commitment to nonviolence. Or something like that.

Comment by Jeffrey Herrlich
Jul 14, 2007 1:25 pm

Yeah, fair enough to say. Narrow AIs and Strong AIs will be quite different. But it will be a structural difference, not a “phenomenal” difference. There are probably many different ways to interpret what is a computer “goal” and what isn’t. When I open Microsoft Word it could probably be considered as me having assigned a “goal” to the operating system. I suspect that it will be critically important to select the correct *form* that the AI goals will take. Should the goals be totally concise and unambiguous, or should they be general themselves and subject to interpretation by the general intelligence of the AI? (My current assumption is the former).

Comment by Jeffrey Herrlich
Jul 14, 2007 3:02 pm

If mature, powerful Natural Language Processing arrives before Strong AI, it could really change the situation. On the positive side, it could make this critical goal-writing a lot easier for us humans. On the potentially negative side, it could also make Strong AI a lot easier to make (but not necessarily with safety in mind) - I actually doubt this though, I imagine the positive aspects would outweigh the negative with regard to strong AI. But it would definitely shake things up, that’s for sure.

 
Comment by Jeffrey Herrlich
Jul 16, 2007 10:13 am

I wrote:

“Should the goals be totally concise and unambiguous, or should they be general themselves and subject to interpretation by the general intelligence of the AI? (My current assumption is the former).”

But then again, perhaps there is something to be said for leaving the goals general and subject to interpretation. If robust NLP arrives first (and it looks like it will), then presenting the goals in human language might be a good idea. Just because the RPOP would have access to a “general” interpretation of the goals, doesn’t mean that it wouldn’t understand their intent correctly. With a greater intelligence (and strong NLP capability) it might even be able to interpret the correct intent of the goals more accurately than we could when we wrote them. Also, even if we decided to make the goals totally concise and unambiguous, their interpreted meaning could change over time. For example, the sub-goal of : Add the numbers 2 and 3 : Could be interpreted differently based on the RPOP’s then current understanding of the word “Add”. Natural Language Goals might be the way to go, after all. Having flexibility in its interpretation of the goals wouldn’t make the RPOP automatically rebellious. The RPOP will still faithfully follow the written goals to the best of its current understanding/interpretation.

(Anyone, please feel free to lay down the smack on my comments, if they happen to stray too far).

 
Comment by Jeffrey Herrlich
Jul 19, 2007 7:40 am

Well, nobody has yet sought a smack-down on this comment, so I guess I’ll add a little more.

Another advantage of using Natural Language Goals would simply be that it would be hella easier for the programmers to set them. As just an example, how would you break down and convert into code a goal such as : Implement Humanity’s CEV : ? That seems like a nigh-impossible task (but I admit that I could possibly be wrong). NL goals just seem *a lot* easier. With robust NLP already in place, the RPOP might understand the intended meaning perfectly, even better than any human possibly could. It might understand the goals even more accurately than we humans understand goals assigned by one another. Even Sally doesn’t understand with 100% clarity what exactly I intend when I “assign” her the “goal”: “Please go pick up the car at the garage.” She might understand with 99% clarity, but she’ll never understand with 100% clarity, as she doesn’t have direct access to my mind. An RPOP with powerful NLP might understand with 99.9999999999999999999999…% clarity.

 
 
 
 
Comment by Warren Bonesteel
Jul 13, 2007 6:54 pm

It would seem that many people wish to program feelings and emotions into an AGI. Even if it were possible, the concept bothers me, for what I would think are rather obvious reasons. Visions of Marvin, the Paranoid Android, are galumphing through my head. …that and a frumious Bandersnatch. …and the thought that ASIMO is now talking. …and the knowledge that other robots are serving food, dancing, and playing soccer…and carrying on simple - and coherent - conversations with human beings.

People in the AGI community are still talking about what’s going to happen ‘tomorrow,’ when much of what they go on about is already happening today. (A little cosmic dissonance, anyone?)

At best, in attempting to define ‘output’ by an AGI’s ‘expression of emotion,’ or through anthropomorphic ideas involving evolutionary theory or Humanism (or I.D. or Atheism…insert your favorite hate-to-love or love-to-hate belief here), we are then reduced to emotionally based arguments about whose feelings, emotions, beliefs, ideologies and biases are programmed into an AGI. At best, such arguments are intellectually stimulating discussions involving - more often than not - really bad rhetoric, silly sophistry, poor polemics and invalid, if not completely stupid, syllogisms. (Admittedly, that can all be a whole lot of fun, very challenging, and very enjoyable.) At worst, such discussions are divisive and non-productive.

I thought we were talking about ones and zeros programmed into a “hard” substrate. On that basis, all possible outcomes (“behaviors”) and results are mathematically reducible…to ones and zeros.

As seen in several exchanges above, we cannot agree - amongst even a few of us - about what the word “evolution” actually ‘means.’ I do think that we could all agree - essentially - on the math.

In addition to a lot of other things mathematical, certain ideas involving Set Theory, Bayesian probabilities and even Game Theory then come into play. That sort of thing may be a little easier to deal with than philosophical discussions about which doctrine of evolutionary theory should be preferred over another.

Roboticists, of course, have been working all of this out for quite a while. They may know a few things that people in the AGI and transhumanist communities need to pay attention to.

You see…robotics is just a few years away from putting a fully functional AI robot in your home. In a few years, it is quite possible that your new car won’t require a human driver. It isn’t AGI, but it’s the first step. …and ASIMO and QRIO and ‘self-aware’ cars are only three examples among several dozen possibilities…which are even now coming to a neighborhood near you.

It is almost too late to be discussing underlying philosophies…and no one is going to be able to control the ‘narrative.’

If we are to discuss evolution and philosophy in this context, perhaps it should be from a position regarding the self-interest of those human beings who are already designing, building, manufacturing and using the “primitive” technology now extant in today’s real world.

Just …something to think about.

 
Comment by Tom McCabe
Jul 14, 2007 11:54 am

“Because of the risks, I might start by re-defining intelligence as a capability to make accurate predictions, and suggest a design for a “prediction machine” without any super-goals or moral judgements.”

A “prediction machine” has the supergoal of making accurate predictions, which requires computing power. Therefore, such a machine will see it as desirable to take apart the Earth and use it for spare computronium.

Comment by Mika Suominen
Jul 15, 2007 4:26 am

Making accurate predictions is just one goal of the designer, not the AI itself. Like I said, we don’t need any super-goals or emotions. There are already a lot of super-computers making pretty accurate weather forecasts. They don’t “see it desirable” to take apart the Earth.

You might argue that this kind of “prediction machine” is a very limited AI. Well, hopefully so. Before we are able to make very accurate predictions, emotional AI with capability to destroy the Earth is not a very good idea.

Comment by Tom McCabe
Jul 15, 2007 1:21 pm

“There are already a lot of super-computers making pretty accurate weather forecasts. They don’t “see it desirable” to take apart the Earth.”

That’s simply because they aren’t intelligent enough to have a ‘goal’ as we would understand it. If your goal is to make predictions that are as accurate as possible, isn’t it a logical conclusion that you’ll want lots and lots of computing power, and you will therefore take the Earth apart to get it?

 
Comment by Jeffrey Herrlich
Jul 15, 2007 1:32 pm

I agree that emotion is not required for a functional RPOP. But I do believe that a goal hierarchy will be necessary for Friendliness. A powerful general intelligence equipped with *only* the super-goal : Make Optimal Predictions : probably would proceed to convert the Earth into computronium. An RPOP will follow whatever goals we give it, without any deviation from the precise way that those goals are interpreted by the RPOP. The critical duty is to select the correct goals, express them correctly to the RPOP, and order them correctly within the hierarchy. That ain’t easy, but I do believe it’s possible, and I actually do expect that we’ll pull it off by the skin of our teeth.

Comment by Peter de Blanc
Jul 15, 2007 8:36 pm

“I actually do expect that we’ll pull it off by the skin of our teeth.”

How can you expect to make it “by the skin of our teeth”? It sounds like you have a remarkably precise idea of when the FAI will be built.

 
Comment by Jeffrey Herrlich
Jul 16, 2007 6:44 am

No, not necessarily. It’s more of just a hopeful optimism. I believe it’s possible to accomplish, but I think it’s a tight race. And if we succeed, it won’t be by accident. With nothing but potential dangers at every turn, even a Transhumanist needs something to feel hopeful about. :-)

 
 
 
 
Comment by Mika Suominen
Jul 16, 2007 1:26 am

Tom wrote: “That’s simply because they aren’t intelligent enough to have a ‘goal’ as we would understand it.”

Yes, and I would like to keep it that way. For a “prediction machine” there is no need for goals as we understand them. I just want to explore the universe of predictions first, and look at the universe of possible minds later.

Tom wrote: “If your goal is to make predictions that are as accurate as possible, isn’t it a logical conclusion that you’ll want lots and lots of computing power, and you will therefore take the Earth apart to get it?”

No, because I’m a Friendly Designer ;) and if I go insane someday and actually want to do that, I wouldn’t have the knowledge or the resources. Seriously, because there really are all sorts of mental illnesses and viral memes I really hope nobody gives any individual mind that kind of power.

Comment by Tom McCabe
Jul 16, 2007 11:07 am

“Yes, and I would like to keep it that way.”

Eventually, someone is going to build a prediction machine which is intelligent enough to figure out that its predictions will be better if it turns the Earth into computronium. You don’t have a say in the matter, as you are not powerful enough to monitor all seven billion humans; all you have a say in is what happens before that date.

“No, because I’m a Friendly Designer”

Your motives are totally irrelevant to what the AGI does once the AGI is designed. A heat-producing AGI would see it as logical to compress the Sun to accelerate its fusion processes, regardless of what its designers intended. Making the AGI care about its designers’ intent is a huge engineering challenge.

 
 
Comment by Mika Suominen
Jul 16, 2007 1:53 am

Jeffrey wrote: “But I do believe that a goal hierarchy will be necessary for Friendliness.”

Maybe, but I was just suggesting that because we don’t want to risk setting the wrong goals we should perhaps make this “prediction machine” first (with no Friendliness required).

“An RPOP will follow whatever goals we give it, without any deviation from the precise way that those goals are interpreted by the RPOP.”

Not all optimization processes need goals to interpret. Think about natural evolution. It didn’t start some billions of years ago with a note saying “make some apes”, or “increase the fitness.” No, the process doesn’t even know what “goal” means. It doesn’t know anything. It just operates according to (what we call) the laws of nature.

My point is that the universe of all possible predictions is, I think, much safer to explore than the universe of all possible minds.

Comment by Jeffrey Herrlich
Jul 16, 2007 7:10 am

Well, I agree that a reliably safe prediction machine would be safer than a generic RPOP. But how do you instruct this ultra-intelligent Prediction-RPOP to make optimal predictions without doing anything bad to humanity? If you don’t assign it any specific goal structure, how can you be sure that it won’t decide to convert the Earth into a pastry (okay, it’s a stretched example, but…)?

Comment by Mika Suominen
Jul 16, 2007 12:23 pm

If we pick one mind from the universe of all possible minds, there is always a chance it will turn us into pastry before we can even try to verify its Friendliness. However, if we pick one prediction from the universe of all possible predictions, it might be wrong, but it cannot kill us (at least I can’t think of any prediction that could terminate me and all humanity instantly when I see it). So, with predictions we can safely apply variation and artificial selection pressure to optimize. We just cannot do that with minds. The difference here is that minds have goals, while predictions do not.
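
To make “variation and artificial selection pressure” on predictions a little more concrete, here is a minimal Python sketch, offered only as an illustration and not as anyone’s actual proposal. Everything in it is an assumption made for the example: the function names, the toy list of observed outcomes, the choice of the Brier score as the selection criterion, and the mutation size. Note that the scoring rule is picked by the humans running the loop; the candidate predictions themselves are just numbers.

```python
import random


def brier_score(p, outcomes):
    """Mean squared error between a probability p and observed 0/1 outcomes."""
    return sum((p - o) ** 2 for o in outcomes) / len(outcomes)


def select_prediction(outcomes, generations=200, population=20, seed=0):
    """Evolve a single probability estimate by mutation and truncation selection."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    candidates = [rng.random() for _ in range(population)]
    for _ in range(generations):
        # Selection: keep the better-scoring half (lower Brier score is better).
        candidates.sort(key=lambda p: brier_score(p, outcomes))
        survivors = candidates[: population // 2]
        # Variation: add a small random change to each survivor, clipped to [0, 1].
        mutants = [min(1.0, max(0.0, p + rng.gauss(0, 0.05))) for p in survivors]
        candidates = survivors + mutants
    return min(candidates, key=lambda p: brier_score(p, outcomes))


# Hypothetical record of a repeated event: it happened 7 times out of 10.
outcomes = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
best = select_prediction(outcomes)
print(best)  # should land near the observed frequency of 0.7
```

A wrong candidate in this loop simply scores badly and is discarded, which is the sense in which the search is harmless. Whether that still holds once the predictor is capable enough to be useful is exactly what the replies below dispute.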

Comment by Jeffrey Herrlich
Jul 16, 2007 12:40 pm

I understand what you’re saying, but in order to make useful predictions a computer must have general intelligence. In order to make *really* useful predictions, that intelligence level must be greater than human. So how do you effectively instruct this super-intelligent Prediction-RPOP to make predictions while at the same time not do anything that damages humanity? The only way that I can see doing that is through a goal hierarchy. Slapping a sticker on the case that says “Safe Prediction Machine” won’t automatically make it so. ;-)

 
Comment by Eliezer Yudkowsky
Jul 16, 2007 2:20 pm

To be precise, the apparent problem is that to make useful predictions you must compute the value of information and decide how to think. There’s no obvious way to do this without a goal system.
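
For readers unfamiliar with the term, “value of information” has a standard decision-theoretic form, and a small worked example may help show why it presupposes something like a utility function. The Python sketch below is only an illustration under stated assumptions: the umbrella scenario, the probabilities, the utility numbers, and all function names are made up for the example and do not come from this thread.

```python
def expected_utility(action, probs, utility):
    """Expected utility of one action under the current beliefs."""
    return sum(p * utility[action][state] for state, p in probs.items())


def best_expected_utility(probs, utility):
    """Expected utility of the best action if we must act on current beliefs."""
    return max(expected_utility(a, probs, utility) for a in utility)


def value_of_perfect_information(probs, utility):
    """How much better we expect to do if we learn the true state before acting."""
    act_now = best_expected_utility(probs, utility)
    # For each possible state, assume we would pick the best action for it.
    informed = sum(
        p * max(utility[a][state] for a in utility) for state, p in probs.items()
    )
    return informed - act_now


# Hypothetical beliefs and payoffs (assumptions for illustration only).
probs = {"rain": 0.25, "dry": 0.75}
utility = {
    "umbrella": {"rain": 5, "dry": 3},
    "no_umbrella": {"rain": 0, "dry": 6},
}

voi = value_of_perfect_information(probs, utility)
print(voi)  # 1.25
```

The quantity is computed entirely from the `utility` table: change the payoffs and the same forecast becomes worth more, less, or nothing. That is one way to read the claim that there is no obvious way to decide which information is worth computing without a goal system.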

 
Comment by Warren Bonesteel
Jul 16, 2007 5:14 pm

I agree.

First: Define “How to think” vs “What to think.”

Second: Define a system/theory that will achieve the goal, “How to think.”

As meat machines, most of us are taught what to think. We are seldom taught how to think. Defining “How to think” may not be as easy or as simple as some of us might…think.

Assigning the AGI/AI a goal system will - and already does - involve mathematical constructs of game theory, set theory and semiotics.

…and certain types of AI are already capable of learning.

 
 
 
 
Comment by Mika Suominen
Jul 17, 2007 4:27 am

“…to make useful predictions you must compute the value of information and decide how to think. There’s no obvious way to do this without a goal system.”

I understand that if we need a goal system, it pretty much undermines my whole idea of a safe optimization process. I also see the relationship between minds and goals. However, I do not (yet) see any apparent reason why a goal system is mandatory to make predictions.

If we keep all goals outside the system, data does not have any “value” within that system, so basically every bit counts. Since we have consistent physical laws that show up in the data as patterns, the optimization process can work with no need to “think”. In a way the challenge of pattern recognition becomes the challenge of pattern prediction. The actual “value” of information only arises when we humans, minds with self-actualized goals, process the resulting probability distribution.

I admit, of course, that there are a lot of practical limitations and problems in designing the universe of all possible predictions. On the other hand, it might help us move safely towards the real AGI, because a mature capability to make very accurate predictions lowers the risk that we (and later on, a “young” AGI) make fatal decisions.

Does this make any sense?

 
Aug 15, 2007 9:33 am

[…] process”. See for example, on this blog, a comment by Nick Tarleton (the first comment of AI is not Automatically Friendly), and several mentions on the sl4 archives. I like Eliezer’s definition: “A Really […]

 
