
When is it Optimal to Launch a Friendly AI?

July 9th, 2007 by Seth Baum

Suppose we have coded what we believe to be a Friendly AI. We then face a moment like the one in the film Pi: “12:50, press Return”. In other words, we face three options: launch the AI now, try improving it further and launch it later, or never launch it. Why might we not launch right away, or perhaps ever? Maybe we made a mistake in the code somewhere, or maybe we don’t need to launch and would be better off not chancing it.

Ideally, we’d have an infinite amount of time to make sure we got the code right, but in practice, we don’t have this luxury: The longer we wait, the more risk we run of an existential event (including an unFriendly AI launch from a different AI project) occurring while we wait. Our hesitation could save us, but it could also spell our doom.

So, when should we push the button?

The answer to this question depends on how we define the word “should”. I’ll answer from three different points of view:

1) We should minimize the expected number of unwanted deaths that occur. This view comes from the discussion at Transhumanism as Simplified Humanism. It also resembles “prior-existence utilitarianism” (see Total vs. Prior Existence on Felicifia). In the analysis below, we assume that a Friendly Singularity would include successful life extension, thereby preventing the unwanted deaths of all alive after the Singularity. We also assume that in the absence of a Friendly Singularity, everyone would eventually die.

2) We should minimize the chance that an existential event occurs. This view comes from an analysis of infinite utility at Felicifia which, while apparently flawed, does closely resemble my views as well as “total utilitarianism” (see prior link on infinite utility). In the analysis below, we assume that a Friendly Singularity would eliminate existential risk. We also assume that in the absence of a Friendly Singularity, an existential event would eventually occur.

3) We should minimize the chance that we launch an unFriendly AI. This view comes from the precautionary principle, which states that “if an action or policy might cause severe or irreversible harm to the public, in the absence of a scientific consensus that harm would not ensue, the burden of proof falls on those who would advocate taking the action”.

To simplify our analysis, let’s assume that there are only two possible occasions to launch our AI, t0 and t1, with t1 coming later. The analysis can be readily extended to multiple time periods or to continuous time. Also assume that we must launch at either t0 or t1 (unless of course we’re wiped out before t1). To further help, let’s define a few more variables:

n0 = # individuals alive at t0
n1 = # individuals alive at t1
d01 = # individuals who die between t0 & t1
pF0 = probability that an AI launched by us at t0 would be Friendly
pF1 = probability that an AI launched by us at t1 would be Friendly
pE01 = probability that an existential event occurs between t0 & t1

Analysis

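1) Minimize the expected number of unwanted deaths that occur. Here, we simply compare the expected number of unwanted deaths of those alive at t0 that would occur under the two possible launch moments and pick the lower one. Launching at t0 is expected to save n0*pF0 of them, while launching at t1 is expected to save (n0-d01)*pF1, so fewer expected deaths means more expected lives saved: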
If ( n0*pF0 > (n0-d01)*pF1 ) then we should launch at t0.
Otherwise, we should launch at t1.

2) Minimize the chance that an existential event occurs. Here we compare the chance of us causing an existential event (i.e. by launching an unFriendly AI) at t0 with the same for t1, factoring in the chance that one might happen between t0 and t1 if we don’t launch:
If ( pF0 > pF1 - pE01 ) then we should launch at t0.
Otherwise, we should launch at t1.

3) Minimize the chance that we launch an unFriendly AI. Here we compare the chance of us causing an existential event at t0 with the same for t1, ignoring the chance that one might happen between t0 and t1 if we don’t launch:
If ( pF0 > pF1 ) then we should launch at t0.
Otherwise, we should launch at t1.
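
To see how the three views can disagree, here is a minimal Python sketch of the decision rules exactly as written above. The class and function names are my own, and the numbers in the example are purely hypothetical placeholders chosen for illustration, not estimates of anything.

```python
from dataclasses import dataclass


@dataclass
class LaunchInputs:
    """Variables from the post; all values supplied are the caller's own estimates."""
    n0: float    # individuals alive at t0
    d01: float   # individuals who die between t0 and t1
    pF0: float   # probability an AI launched at t0 would be Friendly
    pF1: float   # probability an AI launched at t1 would be Friendly
    pE01: float  # probability of an existential event between t0 and t1


def view1_min_deaths(x: LaunchInputs) -> str:
    # View 1: launch at t0 if it saves more of those alive at t0 in expectation.
    return "t0" if x.n0 * x.pF0 > (x.n0 - x.d01) * x.pF1 else "t1"


def view2_min_existential(x: LaunchInputs) -> str:
    # View 2: discount the later launch by the risk run while waiting.
    return "t0" if x.pF0 > x.pF1 - x.pE01 else "t1"


def view3_precaution(x: LaunchInputs) -> str:
    # View 3: compare only our own chances of launching a Friendly AI.
    return "t0" if x.pF0 > x.pF1 else "t1"


# Hypothetical inputs: waiting improves pF slightly, at some cost in lives and risk.
example = LaunchInputs(n0=6.6e9, d01=5.5e7, pF0=0.90, pF1=0.905, pE01=0.01)

for rule in (view1_min_deaths, view2_min_existential, view3_precaution):
    print(rule.__name__, "->", rule(example))
```

With these made-up inputs, views (1) and (2) favor launching at t0 while view (3) favors waiting until t1. Note that rule (2) as written treats the two risks as additive; if one instead assumed the existential event and our AI's Friendliness were independent, the comparison would be pF0 against (1-pE01)*pF1, which gives nearly the same answer when both risks are small.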

Discussion

I’m guessing the easiest view for us to reject is (3), if we can agree that ignoring other existential risks is foolish. However, given past disagreement on the life extension issue (in the Transhumanism as Simplified Humanism comment thread), deciding between (1) and (2) (or some other view) may prove trickier. One thing to take from all of this is that even if we manage to avoid programming our own personal values into the AI (such as by successfully implementing coherent extrapolated volition), our values may enter into the project nonetheless. Indeed, our very decision to participate in the project is a reflection of our values.

So what about postponing a launch indefinitely? Clearly this is what (3) would recommend, since it’s the only way we can’t cause harm. We would recommend this under (2) if we could never get the chance of an existential event occurring to be lower with our AI launch than without. Finally, we would recommend this under (1) if life extension has been solved without AI and other causes of death (e.g. existential events) are less likely without the AI launch.

A take-home message from this is that we would be wise to assess both the chance that an existential event might prevent us from launching in the first place and the chance that any AI we launch would succeed. Both of these inevitably require some subjective/Bayesian probability estimation, so understanding cognitive bias will be important. Many here are familiar with resources on the topic including the Overcoming Bias blog and some of the materials on SIAI’s cognitive science reading list. For following other existential risks, see SIAI’s global catastrophic risks reading list and perhaps also my DailyKos series and other materials.

One final thought: Any time we spend estimating when we should or shouldn’t launch is time not spent improving our code, reducing other existential risk, etc. How we decide what to spend our time on seems as much an art as a science. (See Decision Procedures on Felicifia for more.) Hopefully we’ll make good decisions.

Comments (8)

Comment by Jeffrey Herrlich
Jul 10, 2007 7:01 am

The existential risks posed by Genetics and Nanotechnology seem at least “survivable” through successfully colonizing space. I think that Friendly AI and space colonization are the two areas that currently need the most critical attention (in addition to other concerns of course). Not only because they have great leverage but because the Friendly AI issue in particular seems to receive the least degree of serious attention, at present.

 
Comment by Robert Bradbury
Jul 10, 2007 7:28 am

Starting with assumption 1, the analysis is fundamentally flawed. There is a significant probability that the presence of even any significantly advanced “friendly AI” in our solar system would lead to the extinction of a significant fraction of humanity. You would just be replacing a fictional “god” (the faith of most current humans) with a real “god”. There are likely to be many humans (starting with myself) who would become very bored with living in an artificial reality imposed by a real “god”. These humans would seek either to restore the former order of things or commit suicide. So an advanced FAI would IMO doom humanity (except those seeking a permanent Mommy/Daddy entity) to extinction.

The second problem deals with “successful lifespan extension”. I happen to know a bit about this area (more than the average person who runs around discussing AIs) and we now understand this problem and how to address it. It *does not* require an AI. It does require that people recognize the problem can be solved and commit to solving it (as was the case for landing men on the moon). Unless you intend for your FAI to reach into the minds of all the people who will not “believe” it until 50 or 100 years from now (and we have a bunch of 150 year old people wandering around) acceptance of this is not going to happen overnight.

A sufficiently powerful AI could “force” humanity into accepting its wisdom (as parents must sometimes force children to eat their vegetables). But I for one see it as highly unlikely that that acceptance process would be one in which the maximum number of human lives is saved.

 
Comment by Grant Czerepak
Jul 10, 2007 1:46 pm

The interesting thing about this post is we are assuming that:

1. The development of the intelligence is binary instead of incremental

2. The implementation of the intelligence is binary instead of incremental

Why can’t an artificial intelligence develop in the same manner as an individual, acquiring increasing levels of intelligence and capability over time that can be controlled? If there are an adequate number of check points can we not judge whether the AI is developing in a manner that is favorable or not and cut back on the intelligence and resources available to it if necessary?

In the same way that you control a nuclear reaction why can’t you have control rods for an AI that you can feed in and feed out? Why are we thinking about AI like it is the bomb? It seems like a path of reasoning that is more careless than I care to entertain. If creating an AI and just flipping the switch for an uncontrolled chain reaction is the policy of the Singularity Institute, I withdraw my support.

 
Comment by Tom McCabe
Jul 10, 2007 3:23 pm

“Why can’t an artificial intelligence develop in the same manner as an individual, acquiring increasing levels of intelligence and capability over time”

It can; the crucial threshold is when the intelligence is able to improve itself sustainably.

“can be controlled?”

Once the AGI is substantially smarter than we are, we won’t be able to control it any more than monkeys can control us.

“If there are an adequate number of check points can we not judge whether the AI is developing in a manner that is favorable or not and cut back on the intelligence and resources available to it if necessary?”

If the AGI is undergoing a failure of Friendliness, and it can’t self-improve, the only rational thing to do is to pull the plug to avoid any risk of a hard takeoff. If it can self-improve substantially faster than humans can program it, we’re already screwed. You cannot control a superintelligence like you can control a toaster; it is much more knowledgeable and better at strategy than you are.

“In the same way that you control a nuclear reaction why can’t you have control rods for an AI that you can feed in and feed out?”

Because humans *cannot* control a superintelligent AGI. This is an unavoidable side-effect of the tremendous good we want the AGI to do; anything that has the power to do more good than humans must be more powerful than humans.

 
Comment by Tom McCabe
Jul 10, 2007 3:27 pm

“You would just be replacing a fictional “god” (the faith of most current humans) with a real “god”.”

Fictional gods have no more bearing on the development of the universe than The Matrix. See http://www.acceleratingfuture.com/tom/?p=12 for why it is bad to mention futurism and fiction in the same sentence.

“(except those seeking a permanent Mommy/Daddy entity)”

This is a perfect example of what I warn against in the above link; cross-importation of concepts from fiction to reality. The fiction of religion is that God is a “father figure”; therefore, we assume that AGIs will also act like a “father figure”, because we have already thought of the two as being roughly analogous.

 
Comment by Kaj Sotala
Jul 11, 2007 9:44 am

There are likely to be many humans (starting with myself) who would become very bored with living in an artificial reality imposed by a real “god”. These humans would seek either to restore the former order of things or commit suicide.

The risk of this depends entirely on the AI’s behavior - certainly there are ways for it to behave that would make a large fraction of humanity prefer death. That, however, would in itself be a failure of Friendliness. The point is precisely to design an AI that is intelligent enough to calculate a behavior that leads to the least deaths.

The second problem deals with “successful lifespan extension”. I happen to know a bit about this area (more than the average person who runs around discussing AIs) and we now understand this problem and how to address it. It *does not* require an AI.

Biological lifespans can probably be extended without an AI, though it will be easier with one. But death due to aging is only one cause of death - one of the worst, yes, but still only one. An AI could potentially help with them all.

 
Comment by Seth Baum
Jul 11, 2007 5:04 pm

Life extension science is not at all my area of expertise. However, it does seem plausible that an AI could help with the social persuasion part in addition to the basic research part, if life extension was a goal.

As for boredom in artificial reality, if this would be a problem, and if the AI is so talented (and the assumption here seems to be that it would be), then it presumably would be sufficiently talented to make the artificial reality interesting.

As for flipping a switch to an uncontrollable outcome, I’ll admit I’m not enthusiastic about this myself, hence the discussion of “postponing a launch indefinitely”. However, if some switch is going to be flipped anyway, then I would prefer we flip the best switch, whatever that means. In the face of existential risk, not flipping an AI switch could amount to flipping a switch leading to our doom.

 
Comment by Grant Czerepak
Jul 12, 2007 7:31 pm

You can control a superior intelligence.

Look how many of us work for idiot managers.

 
