In humans, higher levels of organization are generally more accessible to introspection. It is not surprising if the internal cognitive events called "thoughts", as described in the last section, seem strangely familiar; we listen to thoughts all day. The danger for AI developers is that cognitive content which is open to introspection is sometimes temptingly easy to translate directly into code. But if humans have evolved a cyclic interaction of thought and imagery, this fact alone does not prove (or even argue) that the design is a good one. What is the material benefit to intelligence of using blackboard mental imagery and sequiturs, instead of the simpler fixed algorithms of "reasoning" under classical AI?
Evolution is characterized by ascending levels of organization of increasing elaboration, complexity, flexibility, richness, and computational costliness; the complexity of the higher layers is not automatically emergent solely from the bottom layer, but is instead subject to selection pressures and the evolution of complex functional adaptation - adaptation which is relevant at that level, and, as it turns out, sometimes preadaptive for the emergence of higher levels of organization. This design signature emerges at least in part from the characteristic blindness of evolution, and may not be a necessary idiom of minds-in-general. Nonetheless, past attempts to directly program cognitive phenomena which arise on post-modality levels of organization have failed profoundly. There are specific AI pathologies that emerge from the attempt, such as the symbol grounding problem and the commonsense problem. In humans concepts are smoothly flexible and expressive because they arise from modalities; thoughts are smoothly flexible and expressive because they arise from concepts. Even considering the value of blackboard imagery and sequiturs in isolation - for example, by considering an AI architecture that used fixed algorithms of deliberation but used those algorithms to create and invoke DGI thoughts - there are still necessary reasons why deliberative patterns must be built on behaviors of the thought level, rather than being implemented as independent code; there are AI pathologies that would result from the attempt to implement deliberation in a purely top-down way. There is top-down complexity in deliberation - adaptive functionality that is best viewed as applying to the deliberation level and not the thought level - but this complexity is mostly incarnated as behaviors of the thought level that support deliberative patterns.
Because the deliberation level is flexibly emergent out of the sequiturs of the thought level, a train of thought can be diverted without being destroyed. To use the example given earlier, if a deliberative mind wonders "Why is X a Y?" but no explanation is found, this local failure is not a disaster for deliberation as a whole. The mind can mentally note the question as an unsolved puzzle and continue with other sequiturs. A belief violation does not destroy a mind; it becomes a focus of attention and one more thing to ponder. Discovering inconsistent beliefs does not cause a meltdown, as it would in a system of monotonic logic, but instead shifts the focus of attention to checking and revising the deductive logic. Deliberation weaves multiple, intersecting threads of reasoning through intersecting imagery, with the waystations and even the final destination not always known in advance.
In the universe of bad TV shows, speaking the Epimenides Paradox¹ "This sentence is false" to an artificial mind causes that mind to scream in horror and collapse into a heap of smoldering parts. This is based on a stereotype of thought processes that cannot divert, cannot halt, and possess no bottom-up ability to notice regularities across an extended thought sequence. Given how deliberation emerges from the thought level, it is possible to imagine a sufficiently sophisticated, sufficiently reflective AI that could naturally surmount the Epimenides Paradox. Encountering the paradox "This sentence is false" would probably indeed lead to a looping thought sequence at first, but this would not cause the AI to become permanently stuck; it would instead lead to categorization across repeated thoughts (like a human noticing the paradox after a few cycles), a categorization which would then become salient and could be pondered in its own right by other sequiturs. If the AI is sufficiently competent at deductive reasoning and introspective generalization, it could generalize across the specific instances of "If the statement is true, it must be false" and "If the statement is false, it must be true" as two general classes of thoughts produced by the paradox, and show that reasoning from a thought of one class leads to a thought of the other class; if so, the AI could deduce - not just inductively notice, but deductively confirm - that the thought process is an eternal loop. Of course, we won't know whether it really works this way until we try it.
The use of a blackboard sequitur model is not automatically sufficient for deep reflectivity; an AI that possessed a limited repertoire of sequiturs, no reflectivity, no ability to employ reflective categorization, and no ability to notice when a train of thought hasn't yielded anything useful for a while, might still loop eternally through the paradox as the emergent but useless product of the sequitur repertoire. Transcending the Epimenides Paradox requires the ability to perform inductive generalization and deductive reasoning on introspective experiences. But it also requires bottom-up organization in deliberation, so that a spontaneous introspective generalization can capture the focus of attention. Deliberation must emerge from thoughts, not just use thoughts to implement rigid algorithms.
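The kind of bottom-up noticing described above can be pictured with a toy sketch. It is only an illustration, not a DGI design: the two-entry `SEQUITUR` table, the functions `ponder` and `confirm_cycle`, and the `notice_after` threshold are all hypothetical names invented for this example. A thought sequence follows its sequiturs; a counter notices that a thought has recurred (standing in for the inductive step), and a separate check walks the sequiturs to confirm that the sequence really returns to its starting point (standing in for the deductive step).

```python
from collections import Counter

# Hypothetical toy "sequiturs": each thought about the paradox yields the opposite one.
SEQUITUR = {
    "if true, then false": "if false, then true",
    "if false, then true": "if true, then false",
}

def confirm_cycle(start):
    """Walk the sequitur table from `start`; return the cycle if it leads back to `start`."""
    path = [start]
    current = SEQUITUR.get(start)
    while current is not None and current not in path:
        path.append(current)
        current = SEQUITUR.get(current)
    return path if current == start else None

def ponder(first_thought, notice_after=2, max_steps=100):
    """Follow sequiturs until a recurring thought is noticed or the step budget runs out."""
    seen = Counter()
    trace = []
    thought = first_thought
    for _ in range(max_steps):
        trace.append(thought)
        seen[thought] += 1
        if seen[thought] >= notice_after:
            # The recurrence has become salient; now check it deductively.
            return trace, confirm_cycle(thought)
        thought = SEQUITUR[thought]
    return trace, None

trace, loop = ponder("if true, then false")
print(len(trace), loop)
```

A mind with only the `SEQUITUR` table and no counter would correspond to the limited, non-reflective AI that loops forever; the sketch does nothing more than make that contrast concrete.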
Having reached the deliberation level, we finally turn from our long description of what a mind is, and focus at last on what a mind does - the useful operations implemented by sequences of thoughts that are structures of concepts that are abstracted from sensory experience in sensory modalities.
Philosophers frequently define "truth" as an agreement between belief and reality; formally, this is known as the "correspondence theory" of truth [James11]. Under the correspondence theory of truth, philosophers of Artificial Intelligence have often defined "knowledge" as a mapping between internal data structures and external physical reality [Newell80]. Considered in isolation, the correspondence theory of knowledge is easily abused; it can be used to argue on the basis of mappings which turn out to exist entirely in the mind of the programmer.
Intelligence is an evolutionary advantage because it enables us to model and predict and manipulate reality. In saying this, I am not advocating the philosophical position that only useful knowledge can be true. There is enough regularity in the activity of acquiring knowledge, over a broad spectrum of problems that require knowledge, that evolution has tended to create independent cognitive forces for truthseeking. Individual organisms are best thought of as adaptation-executers rather than fitness-maximizers [Tooby92]. "Seeking truth", even when viewed as a mere local subtask of a larger problem, has sufficient functional autonomy that many human adaptations are better thought of as "truthseeking" than "useful-belief-seeking". Furthermore, under my own philosophy, I would say that beliefs are useful because they are true, not "true" because they are useful.
But usefulness is a stronger and more reliable test of truth; it is harder to cheat. The social process of science applies prediction as a test of models, and the same models that yield successful predictions are often good enough approximations to construct technology (manipulation).
I would distinguish four successively stronger grades of binding between a model and reality: sensory, predictive, decisive, and manipulative.
A string of several discrete or quantitative variables creates a patterned variable, which is also likely to be computationally intractable for exhaustive forward search. Binding a patterned goal to a patterned action, if the relation is not one of direct identity, requires (again) a causal belief that specifies a reversible relation between the antecedent and the consequent, or (if no such belief is forthcoming) deliberative analysis of complex regularities in the relation between the action and the outcome, or exploratory tweaking followed by induction on which tweaks increase the apparent similarity between the outcome and the desired outcome.
There are levels of organization within bindings; a loose binding at one level can give rise to a tighter binding at a higher level. The rods and cones of the retina correspond to incoming photons that correspond to points on the surface of an object. The binding between a metaphorical pixel in the retina and a point in a real-world surface is very weak, very breakable; a stray ray of light can wildly change the detected optical intensity. But the actual sensory experience occupies one level of organization above individual pixels. The fragile sensory binding between retinal pixels and surface points, on a lower level of organization, gives rise to a solid sensory binding between our perception of the entire object and the object itself. A match between two discrete variables or two rough quantitative variables can arise by chance; a match between two patterned variables on a higher holonic level of organization is far less likely to arise from complete coincidence, though it may arise from a cause other than the obvious. The concept kernels in human visual recognition likewise bind to the entire perceptual experience of an object, not to individual pixels of the object. On an even higher level of organization, the manipulative binding between human intelligence and the real world is nailed down by many individually tight sensory bindings between conceptual imagery and real-world referents. Under the human implementation, there are at least three levels of organization within the correspondence theory of truth! The AI pathology that we perceive as "weak semantics" - which is very hard to define, but is an intuitive impression shared by many AI philosophers - may arise from omitting levels of organization in the binding between a model and its referent.
The series of motor actions I use to strike a key on my keyboard has enough degrees of freedom that "which key I strike", as a discrete variable, or "the sequence of keys struck", as a patterned variable, are both subject to direct specification. I do not need to engage in complex planning to strike the key sequence "hello world" or "labm4"; I can specify the words or letters directly. My motor areas and cerebellum do an enormous amount of work behind the scenes, but it is work that has been optimized to the point of subjective invisibility. A keystroke is thus an action for pragmatic purposes, although for a novice typist it might be a goal. As a first approximation, goal imagery has been reduced to action imagery when the imagery can direct a realtime skill in the relevant modality. This does not necessarily mean that actions are handed off to skills with no further interaction; realtime manipulations sometimes go wrong, in which case the interrelation between goals and actions and skills becomes more intricate, sometimes with multiple changing goals interacting with realtime skills. Imagery approaches the action level as it becomes able to interact with realtime skills.
Sometimes a goal does not directly reduce to actions because the goal referent is physically distant or physically separated from the "effectors" - the motor appendages or their virtual equivalents - so that manipulating the goal referent depends on first overcoming the physical separation as a subproblem. However, in the routine activity of modern-day humans, another very common reason why goal imagery does not translate directly into action imagery is that the goal imagery is a high-level abstract characteristic, cognitively separated from the realm of direct actions. I can control every keystroke of my typing, but the quantitative percept of writing quality² referred to by the goal imagery of high writing quality is not subject to direct manipulation. I cannot directly set my writing quality to equal that of Shakespeare, in the way that I can directly set a keystroke to equal "H", because writing quality is a derived, abstract quantity. A better word than "abstract" is "holonic", the term from [Koestler67] used earlier to describe the way in which a single quality may simultaneously be a whole composed of parts, and a part in a greater whole. Writing quality is a quantitative holon which is eventually bound to the series of discrete keystrokes. I can directly choose keystrokes, but cannot directly choose the writing-quality holon. To increase the writing quality of a paragraph I must link the writing-quality holon to lower-level holons such as correct spelling and omitting needless words, which are qualities of the sentence holons, which are created through keystroke actions. Action imagery is typically, though not always, the level on which variables are completely free (directly specifiable with many degrees of freedom); higher levels involve interacting constraints which must be resolved through deliberation.
The very-high-level abstract goal imagery for writing quality is bound to directly specifiable action imagery for words and keystrokes through an intermediate series of child goals which inherit desirability from parent goals. But what are goals? What is desirability? So far I have been using an intuitive definition of these terms, which often suffices for describing how the goal system interacts with other systems, but is not a description of the goal system itself.
Unfortunately, the human goal system is somewhat... confused... as you know if you're a human. Most of the human goal system originally evolved in the absence of deliberative intelligence, and as a result, behaviors that contribute to survival and reproduction tend to be evolved as independent drives. Taking the intentionalist stance toward evolution, we would say that the sex drive is a child goal of reproduction. Over evolutionary time this might be a valid stance. But individual organisms are best regarded as adaptation-executers rather than fitness-maximizers, and the sex drive is not cognitively a child goal of reproduction; hence the modern use of contraception. Further complications are introduced at the primate level by the existence of complex social groups; consequently primates have "moral" adaptations, such as reciprocal altruism, third-party intervention to resolve conflicts ("community concern"), and moralistic aggression against community offenders [Flack00]. Still further complications are introduced by the existence of deliberative reasoning and linguistic communication in humans; humans are imperfectly deceptive social organisms that argue about each other's motives in adaptive contexts. This has produced what I can only call "philosophical" adaptations, such as the ways we reason about causation in moral arguments - ultimately giving us the ability to pass (negative!) judgement on the moral worth of our evolved goal systems and evolution itself.
It is not my intent to untangle that vast web of causality in this paper, although I have written (informally but at length) about the problem elsewhere [Yudkowsky01], including a description of the cognitive and motivational architectures required for a mind to engage in such apparently paradoxical behaviors as passing coherent judgement on its own top-level goals. (For example, a mind may regard the current representation of morals as a probabilistic approximation to a moral referent that can be reasoned about.) The architecture of morality is a pursuit that goes along with the pursuit of general intelligence, and the two should not be parted, for reasons that should be obvious and will become even more obvious in Part III; but unfortunately there is simply not enough room to deal with the issues here. I will note, however, that the human goal system sometimes does the Wrong Thing³, and I do not believe AI should follow in those footsteps; a mind may share our moral frame of reference without being a functional duplicate of the human goal supersystem.
Within this paper I will set aside the question of moral reasoning and take for granted that the system supports moral content. The question then becomes how moral content binds to goal imagery and ultimately to actions.
The imagery that describes the supergoal is the moral content and describes the events or world-states that the mind regards as having intrinsic value. In classical terms, the supergoal description is analogous to the intrinsic utility function. Classically, the total utility of an event or world-state is its intrinsic utility, plus the sum of the intrinsic utilities (positive or negative) of the future events to which that event is predicted to lead, multiplied in each case by the predicted probability of the future event as a consequence. (Note that predicted consequences include both direct and indirect consequences, i.e., consequences of consequences are included in the sum.) This may appear at first glance to be yet another oversimplified Good Old-Fashioned AI definition, but for once I shall argue in favor; the classical definition is more fruitful of complex behaviors than first apparent. The property desirability should be coextensive with, and should behave identically to, the property is-predicted-to-lead-to-intrinsic-utility.
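The classical definition above can be written out as a short recursive sketch. This is an illustration under stated assumptions rather than a proposed goal-system implementation: the function name `total_utility`, the dictionaries `intrinsic` and `leads_to`, and the example events are hypothetical; the event graph is assumed to be acyclic; and the probability of an indirect consequence is assumed to be the product of the probabilities along the chain.

```python
def total_utility(event, intrinsic, leads_to):
    """Intrinsic utility of `event` plus the probability-weighted utility of its consequences.

    intrinsic: maps an event to its intrinsic utility (absent events count as 0).
    leads_to:  maps an event to a list of (consequence, predicted probability) pairs.
    Consequences of consequences are included through the recursion.
    """
    value = intrinsic.get(event, 0.0)
    for consequence, probability in leads_to.get(event, []):
        value += probability * total_utility(consequence, intrinsic, leads_to)
    return value

# Hypothetical chain: action A leads to B, which leads to the intrinsically valued Z.
intrinsic = {"Z": 10.0}
leads_to = {"A": [("B", 0.9)], "B": [("Z", 0.5)]}

print(total_utility("A", intrinsic, leads_to))
print(total_utility("B", intrinsic, leads_to))
```

In this toy chain neither A nor B has any intrinsic utility, so whatever value they receive is exactly their predicted-to-lead-to-intrinsic-utility-ness, which is the sense in which desirability is meant to be coextensive with prediction.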
Determining which actions are predicted to lead to the greatest total intrinsic utility, and inventing actions which lead to greater intrinsic utility, has subjective regularities when considered as a cognitive problem and external regularities when considered as an event structure. These regularities are called subgoals. Subgoals define areas where the problem can be efficiently viewed from a local perspective. Rather than the mind needing to rethink the entire chain of reasoning "Action A leads to B, which leads to C, which leads to D, [...], which leads to actual intrinsic utility Z", there is a useful regularity that actions which lead to B are mostly predicted to lead through the chain to Z. Similarly, the mind can consider which of subgoals B1, B2, B3 are most likely to lead to C, or consider which subgoals C1, C2, C3 are together sufficient for D, without rethinking the rest of the logic to Z.
This network (not hierarchical) event structure is an imperfect regularity; desirability is heritable only to the extent, and exactly to the extent, that predicted-to-lead-to-Z-ness is heritable. Our low-entropy universe has category structure, but not perfect category structure. Using imagery to describe an event E which is predicted to lead to event F is never perfect; perhaps most real-world states that fit description E lead to events that fit description F, but it would be very rare, outside of pure mathematics, to find a case where the prediction is perfect. There will always be some states in the volume carved out by the description E that lead to states outside the volume carved out by description F. If C is predicted to lead to D, and B is predicted to lead to C, then usually B will inherit C's predicted-to-lead-to-D-ness. However, it may be that B leads to a special case of C which does not lead to D; in this case, B would not inherit C's predicted-to-lead-to-D-ness. Therefore, if C had inherited desirability from D, B would not inherit C's desirability either.
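The special-case failure described above can be made concrete with a small sketch. Everything in it is a hypothetical construction for illustration: descriptions are modeled as plain sets of concrete states, each state is assumed to have a single deterministic successor, and the names `NEXT`, `B`, `C`, `D`, and `fraction_reaching` are invented here.

```python
# Hypothetical toy world: each concrete state has exactly one successor state.
NEXT = {"c1": "d", "c2": "d", "c3": "x", "b2": "c3"}

# Descriptions are modeled as sets of concrete states (an assumption of this sketch).
B = {"b2"}
C = {"c1", "c2", "c3"}
D = {"d"}

def fraction_reaching(description, target, steps=1):
    """Fraction of states in `description` that land in `target` after `steps` transitions."""
    hits = 0
    for state in description:
        for _ in range(steps):
            state = NEXT.get(state)  # None if the state has no recorded successor
        if state in target:
            hits += 1
    return hits / len(description)

b_to_c = fraction_reaching(B, C)                   # B reliably leads into C
c_to_d = fraction_reaching(C, D)                   # most of C leads to D
naive_b_to_d = b_to_c * c_to_d                     # chaining the two regularities
actual_b_to_d = fraction_reaching(B, D, steps=2)   # following B's actual successors

print(b_to_c, c_to_d, naive_b_to_d, actual_b_to_d)
```

Here B lands in the one corner of C that does not lead to D, so the chained estimate is well above zero while the actual fraction is zero; B should inherit none of the desirability that C inherits from D.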
To deal with a world of imperfect regularities, goal systems model the regularities in the irregularities, using descriptive constraints, distant entanglements, and global heuristics. If events fitting description E usually but not always lead to events fitting description F, then the mental imagery describing E, or even the concepts making up the description of E, may be refined to narrow the extensional class to eliminate events that seem to fit E but that don't turn out to lead to F. These "descriptive constraints" drive the AI to focus on concepts and categories that expose predictive, causal, and manipulable regularities in reality, rather than just surface regularities.
A further refinement is "distant entanglements"; for example, an action A that leads to B which leads to C, but which also simultaneously has side effects that block D, which is C's source of desirability. Another kind of entanglement is when action A leads to unrelated side effect S, which has negative utility outweighing the desirability inherited from B.
"Global heuristics" describe goal regularities that are general across many problem contexts, and which can therefore be used to rapidly recognize positive and negative characteristics; the concept "margin for error" is a category that describes an important feature of many plans, and the belief "margin for error supports the local goal" is a global heuristic that positively links members of the perceptual category margin for error to the local goal context, without requiring separate recapitulation of the inductive and deductive support for the general heuristic. Similarly, in self-modifying or at least self-regulating AIs, "minimize memory usage" is a subgoal that many other subgoals and actions may impact, so the perceptual recognition of events in the "memory usage" category or "leads to memory usage" categories implies entanglement with a particular distant goal.
Descriptive constraints, distant entanglements, and global heuristics do not violate the desirability-as-prediction model; descriptive constraints, distant entanglements, and global heuristics are also useful for modeling complex predictions, in the same way and for the same reasons as they are useful in modeling goals. However, there are at least three reasons for the activity of planning to differ from the activity of prediction. First, prediction typically proceeds forward from a definite state of the universe to determine what comes after, while planning often (though not always) reasons backward from goal imagery to pick out one point in a space of possible universes, with the space's dimensions determined by degrees of freedom in available actions. Second, desirabilities are differential, unlike predictions; if A and ~A both lead to the same endpoint E, then from a predictive standpoint this may increase the confidence in E, but from a planning standpoint it means that neither A nor ~A will inherit net desirability from E. The final effect of desirability is that an AI chooses the most desirable action, an operation which is comparative rather than absolute; if both A and ~A lead to E, neither A nor ~A transmits differential desirability to actions.
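The second point, that desirability is comparative, can be sketched in a few lines. The names `expected_utility` and `advantage`, the outcome tables, and the utility values are hypothetical and chosen only for illustration.

```python
def expected_utility(outcomes, utility):
    """Sum of utility over predicted outcomes, weighted by predicted probability."""
    return sum(probability * utility[event] for event, probability in outcomes.items())

def advantage(action, alternative, predictions, utility):
    """Differential desirability: how much better `action` is than `alternative`."""
    return (expected_utility(predictions[action], utility)
            - expected_utility(predictions[alternative], utility))

utility = {"E": 5.0}

# Case 1: both A and not-A lead to E with certainty.
same_endpoint = {"A": {"E": 1.0}, "not_A": {"E": 1.0}}

# Case 2: A makes E much more likely than not-A does.
different_odds = {"A": {"E": 0.9}, "not_A": {"E": 0.2}}

print(advantage("A", "not_A", same_endpoint, utility))
print(advantage("A", "not_A", different_odds, utility))
```

In the first case E is highly predictable, yet the choice between A and ~A makes no difference to it, so the advantage is zero; only in the second case does E's value give a reason to prefer A.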
Third, while both implication and causation are useful for reasoning about predictions, only causal links are useful in reasoning about goals. If the observation of A is usually followed by the observation of B, then this makes A a good predictor of B - regardless of whether A is the direct cause of B, or whether there is a hidden third cause C which is the direct cause of both A and B. I would regard implication as an emergent property of a directed network of events whose underlying behavior is that of causation; if C causes A, and then causes B, then A will imply B. Both "A causes B" (direct causal link) and "A implies B" (mutual causal link from C) are useful in prediction. However, in planning, the distinction between "A directly causes B" and "A and B are both effects of C" leads to a distinction between "Actions that lead to A, as such, are likely to lead to B" and "Actions that lead directly to A, without first leading through C, are unlikely to have any effect on B". This distinction also means that experiments in manipulation tend to single out real causal links in a way that predictive tests do not. If A implies B then it is often the case that C causes both A and B, but it is rarer in most real-world problems for an action intended to affect A to separately and invisibly affect the hidden third cause C, giving rise to false confirmation of direct causality⁴. (Although it happens, especially in economic and psychological experiments.)
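The difference between observing A and acting to bring about A can be shown with a small exact calculation. The model is a deliberately extreme, hypothetical one: a hidden cause C is assumed to determine both A and B completely, A has no effect on B, and the names `worlds`, `prob_b_given_a`, and `force_a` are invented for this sketch.

```python
def worlds(p_c, force_a=None):
    """Yield (probability, a, b) for each value of the hidden cause C.

    p_c:     probability that the hidden cause C is present.
    force_a: if given, an action sets A to this value directly, bypassing C.
    """
    for c, probability in ((1, p_c), (0, 1 - p_c)):
        a = c if force_a is None else force_a  # an action overrides C's effect on A
        b = c                                   # B depends only on C, never on A
        yield probability, a, b

def prob_b_given_a(p_c, force_a=None):
    """P(B=1 | A=1), either as passively observed or under an action that sets A=1."""
    rows = [(p, b) for p, a, b in worlds(p_c, force_a) if a == 1]
    total = sum(p for p, _ in rows)
    return sum(p for p, b in rows if b == 1) / total

observed = prob_b_given_a(0.25)               # A is a perfect predictor of B
manipulated = prob_b_given_a(0.25, force_a=1)  # forcing A leaves B at its base rate

print(observed, manipulated)
```

Passive observation makes A look like a perfect sign of B, while directly setting A leaves B exactly as likely as it was before; this is the sense in which a manipulative test singles out causal links that a predictive test cannot.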
So far, this section has introduced the distinction between sensory, predictive, decisive, and manipulative models; discrete, quantitative, and patterned variables; the holonic model of high-level and low-level patterns; and supergoal referents, goal imagery, and actions. These ideas provide a framework for understanding the immediate subtasks of intelligence - the moment-to-moment activities of deliberation. In carrying out a high-level cognitive task such as design a bicycle, the subtasks consist of crossing gaps from very high-level holons such as good transport to the holon fast propulsion to the holon pushing on the ground to the holon wheel to the holons for spokes and tires, until finally the holons become directly specifiable in terms of design components and design materials directly available to the AI.
The activities of intelligence can be described as knowledge completion in the service of goal completion. To complete a bicycle, one must first complete a design for a bicycle. To carry out a plan, one must complete a mental picture of a plan. Because both planning and design make heavy use of knowledge, they often spawn purely knowledge-directed activities such as explanation, prediction, and discovery. These activities are messy, non-inclusive categories, but they illustrate the general sorts of things that general minds do.
Knowledge activities are carried out both on a large scale, as major strategic goals, and on a small scale, in routine subtasks. For example, "explanation" seeks to extend current knowledge, through deduction or induction or experiment, to fill the gap left by the unknown cause of a known effect. The unknown cause will at least be the referent of question imagery, which will bring into play sequiturs and verifiers which react to open questions. If the problem becomes salient enough, and difficult enough, finding the unknown cause may be promoted from question imagery to an internal goal, allowing the AI to reason deliberatively about which problem-solving strategies to deploy. The knowledge goal for "building a plan" inherits desirability from the objective of the plan, since creating a plan is required for (is a subgoal of) achieving the objective of the plan. The knowledge goal for explaining an observed failure might inherit desirability from the goal achievable when the failure is fixed. Since knowledge goals can govern actual actions and not just the flow of sequiturs, they should be distinguished from question imagery. Knowledge goals also permit reflective reasoning about what kind of internal actions are likely to lead to solving the problem; knowledge goals may invoke sequiturs that search for beliefs about solving knowledge problems, not just beliefs about the specific problem at hand.
Explanation fills holes in knowledge about the past. Prediction fills holes in knowledge about the future. Discovery fills holes in knowledge about the present. Design fills gaps in the mental model of a tool. Planning fills gaps in a model of future strategies and actions. Explanation, prediction, discovery, and design may be employed in the pursuit of a specific real-world goal, or as an independent pursuit in the anticipation of the resulting knowledge being useful in future goals - "curiosity". Curiosity fills completely general gaps (rather than being targeted on specific, already-known gaps), and involves the use of forward-looking reasoning and experimentation, rather than backward chaining from specific desired knowledge goals; curiosity might be thought of as filling the very abstract goal of "finding out X, where X refers to anything that will turn out to be a good thing to know later on, even though I don't know specifically what X is." (Curiosity involves a very abstract link to intrinsic utility, but one which is nonetheless completely true - curiosity is useful.)
What all the activities have in common is that they involve reasoning about a complex, holonic model of causes and effects. "Explanation" fills in holes about the past, which is a complex system of cause and effect. "Prediction" fills in holes in the future, which is a complex system of cause and effect. "Design" reasons about tools, which are complex holonic systems of cause and effect. "Planning" reasons about strategies, which are complex holonic systems of cause and effect. Intelligent reasoning completes knowledge goals and answers questions in a complex holonic causal model, in order to achieve goal referents in a complex holonic causal system.
This gives us the three elements of DGI:
The evolutionary context of intelligence has historically included environmental adaptive contexts, social adaptive contexts (modeling of other minds), and reflective adaptive contexts (modeling of internal reality). In evolving to fit a wide variety of adaptive contexts, we have acquired much cognitive functionality that is visibly specialized for particular adaptive problems, but we have also acquired cognitive functionality that is adaptive across many contexts, and adaptive functionality that coopts previously specialized functionality for wider use. Humans can acquire substantial competence in modeling, predicting, and manipulating fully general regularities of our low-entropy universe. We call this ability "general intelligence". In some ways our ability is very weak; we often solve general problems abstractly instead of perceptually, so we can't deliberatively solve problems on the order of realtime visual interpretation of a 3D scene. But we can often say something which is true enough to be useful and simple enough to be tractable. We can deliberate on how vision works, even though we can't deliberate fast enough to perform realtime visual processing.
There is currently a broad trend toward one-to-one mappings of cognitive subsystems to domain competencies. While in popular psychology this often degenerates into phrenology, such abuses are of course irrelevant to genuine hypotheses about mappings between specialized domain competencies and specialized computational subsystems, or decisions to pursue specialized AI. In DGI, human intelligence is held to consist of a supersystem with complex interdependent subsystems that exhibit internal functional specialization, but this does not rule out the existence of other subsystems that contribute solely or primarily to specific cognitive talents and domain competencies, or subsystems that contribute more heavily to some cognitive talents than others. The mapping from computational subsystems to cognitive talents is many-to-many, and the mapping from cognitive talents plus acquired expertise to domain competencies is also many-to-many, but this does not rule out specific correspondences between human variances in the "computing power" (generalized cognitive resources) allocated to computational subsystems and observed variances in cognitive talents or domain competencies.
However, the subject matter of AI is not the variance between humans, but the base of adaptive complexity common to all humans (or at least all neurologically intact humans). If increasing the resources allocated to a cognitive subsystem yields an increase in a cognitive talent or domain competency, it does not follow that the talent or competency can be implemented by that subsystem alone. It should also be noted that under the traditional paradigm of programming, programmers' thoughts about solving specific problems are translated into code, and this is the idiom underlying most branches of classical AI; for example, expert systems engineers supposedly translate the beliefs in specific domains directly into the cognitive content of the AI. This would naturally tend to yield a view of intelligence in which there is a one-to-one mapping between subsystems and competencies. I believe this is the underlying cause of the atmosphere in which the quest for intelligent AI is greeted with the reply: "AI that is intelligent in what domain?"
This does not mean that exploration in specialized AI is entirely worthless; in fact, DGI's levels of organization suggest a specific class of cases where specialized AI may prove fruitful. Sensory modalities lie directly above the code level; sensory modalities were some of the first specialized cognitive subsystems to evolve and hence are not as reliant on a supporting supersystem framework, although other parts of the supersystem depend heavily on modalities. This suggests a specialized approach, with programmers directly writing code, may prove fruitful if the project is constructing a sensory modality. And indeed, AI research that focuses on creating sensory systems and sensorimotor systems continues to yield real progress. Such researchers are following evolution's incremental path, often knowingly so, and thereby avoiding the pitfalls that result from violating the levels of organization.
However, I still do not believe it is possible to match the deliberative supersystem's inherently broad applicability by implementing a separate computational subsystem for each problem context. Not only is it impossible to duplicate general intelligence through the sum of such subsystems, I suspect it is impossible to achieve humanlike performance in most single contexts using specialized AI. Occasionally we use abstract deliberation to solve modality-level problems for which we lack sensory modalities, and in this case it is possible for AI projects to solve the problem on the modality level, but the resulting problem-solving method will be very different from the human one, and will not generalize outside the specific domain. Hence Deep Blue.
Even on the level of individual domain competencies, not all competencies are unrelated to each other. Different minds may have different abilities in different domains; a mind may have an "ability surface", with hills and spikes in areas of high ability; but a spike in an area such as learning or self-improvement tends to raise the rest of the ability surface [Voss01]. The talents and subsystems that are general in the sense of contributing to many domain competencies - and the domain competencies of self-improvement; see Part III - occupy a strategic position in AI analogous to the central squares in chess.
When can an AI legitimately use the word "I"?
(For the sake of this discussion, I must give the AI a temporary proper name; I will use "Aisa" during this discussion.)
A classical AI that contains a LISP token for "hamburger" knows nothing about hamburgers; at most the AI can recognize recurring instances of a letter-sequence typed by programmers. Giving an AI a suggestively named data structure or function does not make that component the functional analogue of the similarly named human feature [McDermott76]. At what point can Aisa talk about something called "Aisa" without Drew McDermott popping up and accusing us of using a term that might as well translate to "G0025"?
Suppose that Aisa, in addition to modeling virtual environments and/or the outside world, also models certain aspects of internal reality, such as the effectiveness of heuristic beliefs used on various occasions. The degrees of binding between a model and reality are sensory, predictive, decisive, and manipulative. Suppose that Aisa can sense when a heuristic is employed, notice that heuristics tend to be employed in certain contexts and that they tend to have certain results, and use this inductive evidence to formulate expectations about when a heuristic will be employed and predict the results of its employment. Aisa now predictively models Aisa; it forms beliefs about its operation by observing the introspectively visible effects of its underlying mechanisms. Tightening the binding from predictive to manipulative requires that Aisa link introspective observations to internal actions; for example, Aisa may observe that devoting discretionary computational power to a certain subprocess yields thoughts of a certain kind, and that thoughts of this kind are useful in certain contexts, and subsequently devote discretionary power to that subprocess in those contexts.
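The progression from sensing to predicting to acting on a self-model can be sketched in a few lines of Python. This is only a toy illustration under strong simplifying assumptions, not a proposed implementation: the class `SelfModel`, its method names, and the heuristic and context labels are all hypothetical, and a "heuristic" is reduced to a name with a recorded success rate. The `choose` method stands in loosely for the decisive and manipulative bindings.

```python
from collections import defaultdict


class SelfModel:
    """Toy model of an AI's beliefs about its own heuristics (hypothetical)."""

    def __init__(self):
        # Maps (heuristic, context) to [successes, trials].
        self._stats = defaultdict(lambda: [0, 0])

    def observe(self, heuristic, context, succeeded):
        """Sensory binding: record that a heuristic was used and how it went."""
        record = self._stats[(heuristic, context)]
        if succeeded:
            record[0] += 1
        record[1] += 1

    def predict(self, heuristic, context):
        """Predictive binding: estimated success rate with a uniform prior."""
        successes, trials = self._stats.get((heuristic, context), (0, 0))
        return (successes + 1) / (trials + 2)

    def choose(self, heuristics, context):
        """Link predictions to an internal action: pick what to invoke next."""
        return max(heuristics, key=lambda h: self.predict(h, context))


# Illustrative usage with made-up observations.
model = SelfModel()
for outcome in (True, True, False):
    model.observe("analogy", "geometry", outcome)
model.observe("brute_search", "geometry", False)
preferred = model.choose(["analogy", "brute_search"], "geometry")
```

Nothing in this sketch distinguishes modeling "Aisa" from modeling itself; it only shows the tightening of the binding between a model and the thing modeled.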
A manipulative binding between Aisa and Aisa's model of Aisa is enough to let Aisa legitimately say "Aisa is using heuristic X", such that using the term "Aisa" is materially different from using "hamburger" or "G0025". But can Aisa legitimately say, "I am using heuristic X"?
My favorite quote on this subject comes from Douglas Lenat, although I cannot find the reference and am thus quoting from memory: "While Cyc knows that there is a thing called Cyc, and that Cyc is a computer, it does not know that it is Cyc." Personally, I would question whether Cyc knows that Cyc is a computer - but regardless, Lenat has made a legitimate and fundamental distinction. Aisa modeling a thing called Aisa is not the same as Aisa modeling itself.
In an odd sense, assuming that the problem exists is enough to solve the problem. If another step is required before Aisa can say "I am using heuristic X", then there must be a material difference between saying "Aisa is using heuristic X" and "I am using heuristic X". And that is one possible answer: Aisa can say "I" when the behavior of modeling itself is materially different, because of the self-reference, from the behavior of modeling another AI that happens to look like Aisa.
One specific case where self-modeling is materially different from other-modeling is in planning. Employing a complex plan in which the actions A, B, and C, taken in sequence, are individually necessary and together sufficient to accomplish goal G requires an implicit assumption that the AI will follow through on its own plans; action A is useless unless it is followed by actions B and C, and action A is therefore not desirable unless actions B and C are predicted to follow. Making complex plans does not actually require self-modeling, since many classical AIs engage in planning-like behaviors using programmatic assumptions in place of reflective reasoning, and in humans the assumption is usually automatic rather than being the subject of deliberation. However, deliberate reflective reasoning about complex plans requires an understanding that the future actions of the AI are determined by the decisions of the AI's future self, that there is some degree of continuity (although not perfect continuity) between present and future selves, and that there is thus some degree of continuity between present decisions and future actions.
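The dependence of action A's desirability on predicted follow-through can be made concrete with a small expected-value calculation. The sketch below assumes, purely for illustration, that each step has a fixed cost, that the goal has a fixed payoff, and that the probabilities of following through on later steps are independent; the function name and all numbers are hypothetical and do not come from the text.

```python
def value_of_starting(step_costs, follow_through_probs, goal_value):
    """Expected value of taking the first step of a linear plan.

    step_costs[i] is the cost of step i. follow_through_probs[i] is the
    predicted probability that the agent carries out step i + 1 after
    completing step i. Every step is necessary; together they suffice.
    """
    expected = -step_costs[0]
    reach = 1.0  # probability of getting as far as the current step
    for cost, prob in zip(step_costs[1:], follow_through_probs):
        reach *= prob
        expected -= reach * cost
    return expected + reach * goal_value


# Same plan (A, B, C), different predictions about the future self.
reliable = value_of_starting([1, 1, 1], [1.0, 1.0], goal_value=10)
unreliable = value_of_starting([1, 1, 1], [0.1, 0.1], goal_value=10)
```

Under these assumptions, the first step is worth taking only when the agent predicts that its future self will complete the remaining steps; with low follow-through probabilities the same first step has negative expected value.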
An intelligent mind navigates a universe with four major classes of variables: Random factors, variables with hidden values, the actions of other agents, and the actions of the self. The space of possible actions differs from the spaces carved out by other variables because the space of possible actions is under the AI's control. One difference between "Aisa will use heuristic X" and "I will use heuristic X" is the degree to which heuristic usage is under Aisa's deliberate control - the degree to which Aisa has goals relating to heuristic usage, and hence the degree to which the observation "I predict that I will use heuristic X" affects Aisa's subsequent actions. Aisa, if sufficiently competent at modeling other minds, might predict that a similar AI named Aileen would also use heuristic X, but beliefs about Aileen's behaviors would be derived from predictive modeling of Aileen, and not decisive planning of internal actions based on goal-oriented selection from the space of possibilities. There is a cognitive difference between Aisa saying "I predict Aileen will use heuristic X" and "I plan to use heuristic X". On a systemic level, the global specialness of "I" would be nailed down by those heuristics, beliefs, and expectations that individually relate specially to "I" because of introspective reflectivity or the space of undecided but decidable actions. It is my opinion that such an AI would be able to legitimately use the word "I", although in humans the specialness of "I" may be nailed down by additional cognitive forces as well. (Legitimate use of "I" is explicitly not offered as a necessary and sufficient condition for the "hard problem of conscious experience" [Chalmers95] or social, legal, and moral personhood.)
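The contrast between "I predict Aileen will use heuristic X" and "I plan to use heuristic X" can be illustrated with a deliberately minimal sketch. It assumes that each variable has a finite set of options, that predictions are summarized by a likelihood function, and that goals are summarized by a utility function; the names `VariableKind` and `resolve`, and the numbers used, are hypothetical.

```python
from enum import Enum, auto


class VariableKind(Enum):
    RANDOM = auto()       # random factors
    HIDDEN = auto()       # variables with hidden values
    OTHER_AGENT = auto()  # actions of other agents
    SELF_ACTION = auto()  # actions of the self


def resolve(kind, options, likelihood, utility):
    """Return the value the agent should expect for one variable.

    For every kind except SELF_ACTION the agent can only predict, so it
    takes the most likely option. For SELF_ACTION the value is decided,
    so the agent selects the option that its goals rate highest.
    """
    if kind is VariableKind.SELF_ACTION:
        return max(options, key=utility)
    return max(options, key=likelihood)


# Illustrative usage with made-up numbers.
options = ["heuristic_x", "heuristic_y"]
likelihood = {"heuristic_x": 0.8, "heuristic_y": 0.2}.get
utility = {"heuristic_x": 1.0, "heuristic_y": 3.0}.get

predicted_for_aileen = resolve(VariableKind.OTHER_AGENT, options, likelihood, utility)
planned_for_self = resolve(VariableKind.SELF_ACTION, options, likelihood, utility)
```

The same options yield different answers depending on whether the variable is modeled predictively or selected by goal-oriented choice, which is the cognitive difference the paragraph above describes.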