Self-Awareness in the Age of Machines - The Power of Reflection

Know Thyself: The Science of Self-Awareness - Stephen M Fleming 2021


As technology progresses, an ever more intimate mix of human and machine takes shape. You’re hungry; Yelp suggests some good restaurants. You pick one; GPS gives you directions. You drive; car electronics does the low-level control. We are all cyborgs already.

—PEDRO DOMINGOS, The Master Algorithm

From the entirely objective point of view of information and computer theories, all scientific knowledge of dynamic systems is knowledge of the aspect that is machine-like. Nevertheless, the questions are still being asked: Can a machine know it is a machine? Has a machine an internal self-awareness?

—W. ROSS ASHBY, Mechanisms of Intelligence

In June 2009, an Air France flight from Rio de Janeiro to Paris disappeared into the Atlantic Ocean. The three pilots were flying an Airbus A330—one of the most advanced aircraft in the world, laden with autopilot and safety mechanisms and notoriously difficult to crash. As night fell during the ocean crossing, the pilots noticed a storm on the flight path. Storms are usually straightforward for modern airliners to handle, but on this occasion ice in the clouds caused a sensor on the plane to seize up and stop functioning, leading the autopilot to disconnect and requiring the pilots to take control. Pierre-Cédric Bonin, the inexperienced copilot, tried to take over manual control, but he began flying unsteadily and started to climb. The thin air made the aircraft stall and lose altitude.

Even at this stage, there was little danger. All the pilots needed to do to fix the problem was to level out the plane and regain airspeed. It’s the kind of basic maneuver that is taught to novice pilots in their first few hours of flying lessons. But Bonin kept on climbing, with disastrous consequences. The investigation report found that, despite the pilots having spent many hours in the cockpit of the A330, most of these hours had been spent monitoring the autopilot, rather than flying manually. They were unwilling to believe that all this automation would let them make the series of errors that ended in disaster.

The moral of this tragic story is that sometimes offloading control to automation can be dangerous. We have already seen that a well-learned skill such as a golf swing can become so automatic that thinking about what we are doing becomes unnecessary and even detrimental to performing well. But this kind of automation is all within the same brain: our own. When things go awry (when we hook a golf ball, fluff a tennis backhand, or put the car in the wrong gear), we are usually jolted back into awareness of what we are doing. The picture is very different when a machine has taken control. Paradoxically, as automation becomes more and more sophisticated, human operators become less and less relevant. Complacency sets in, and skills may decline.

Such worries about the consequences of offloading to technology are not new. Socrates tells the mythical story of Egyptian god Theuth, who is said to have discovered writing. When Theuth offered the gift to Thamus, king of Egypt, the king was not impressed and worried that it would herald the downfall of human memory, introducing a pandemic of forgetfulness. He complained that people who used it “will appear to be omniscient and will generally know nothing; they will be tiresome company, having the show of wisdom without the reality.”1

Thankfully, these worries about writing and reading did not come to pass. But AI and machine learning may prove different. The intellectual boosts provided by writing, or the printing press, or even the computer, Internet, or smartphone, are the result of transparent systems. They respond lawfully to our requests: We set up a particular printing block, and this systematically reproduces the same book, time and time again. Each time I press the return key when writing on my laptop, I get a new line. The intersection of machine learning and automation is different. It is often nontransparent; we not only do not always know how it is working, but we also do not know whether it will continue to work in the same way tomorrow, the next day, or the day after that. In one sense, machine learning systems have minds of their own, but they are minds that are not currently able to explain how they are solving a particular problem. The remarkable developments in artificial intelligence have not yet been accompanied by comparable developments in artificial self-awareness.

In fact, as technology gets smarter, the relevance of our self-awareness might also diminish. A powerful combination of data and machine learning may end up knowing what we want or need better than we know ourselves. The Amazon and Netflix recommendation systems offer up the next movie to watch; dating algorithms take on the job of finding our perfect match; virtual assistants book hair appointments before we are aware that we need them; online personal shoppers send us clothes that we didn’t even know we wanted.

As human consumers in such a world, we may no longer need to know how we are solving problems or making decisions, because these tasks have been outsourced to AI assistants. We may end up with only weak metacognitive contact with our machine assistants, contact that is too thin for us to intervene when they are doing things they were not designed to do. Even if we are alerted, we may find it is too late to do anything about it. Machines would not need to explain how they are solving problems or making decisions either, because they never needed to do so in the first place. The outsourcing of intelligence could lead to meaningful self-awareness gradually fading into the background of a technology-dependent society.

So what? we might say. A radical futurist may see this melding of mind and machine as the next logical step in human evolution, with changes to how we think and feel being a small price to pay for such technological advances. But I think we need to be careful here. We have already seen in this book that any kind of mind—silicon or biological—is likely to need a degree of metacognition in order to solve scientific and political problems on a grand scale.

I see two broad solutions to this problem:

• Seek to engineer a form of self-awareness into machines (but risk losing our own autonomy in the process).

• Ensure that when interfacing with future intelligent machines, we do so in a way that harnesses rather than diminishes human self-awareness.

Let’s take a look at each of these possibilities.

Self-Aware Machines

Ever since Alan Turing devised the blueprints for the first universal computer in 1937, our position as owners of uniquely intelligent minds has looked increasingly precarious. Artificial neural networks can now recognize faces and objects at superhuman speed, fly planes or pilot spaceships, make medical and financial decisions, and master traditionally human feats of intellect and ingenuity such as chess and computer games. The field of machine learning is now so vast and fast-moving that I won’t attempt a survey of the state of the art and instead refer readers to excellent recent books on the topic in the endnote. But let’s try to draw out a few key principles by considering how machine learning relates to some of the building blocks of metacognition we encountered in Part I.2

A useful starting point is to look at the components needed for a robot to begin to perceive its environment. From there, we can see what extra components we might need to create a form of machine self-awareness. Let’s call our robot Val, after the cyberneticist Valentino Braitenberg, whose wonderful book Vehicles was one of the inspirations for this chapter. Val is pictured here. She is a toy car with a camera on the front and motors on the wheels. There are two lights positioned in front of her: a blue light to the right, and a green light to the left. But so far, Val is nothing more than a camera. Her eyes work, but there is no one home. To allow Val to start seeing what is out there, we need to give her a brain.3

An artificial neural network is a piece of computer software that takes an input, such as a digital image, and feeds the information through a set of simulated layers of “neurons.” In fact, each neuron is very simple: it computes a weighted sum of the inputs from neurons in the layers below, and passes the result through a nonlinear function to generate its level of “activation.” This value is then passed on to the next layer, and so on. The clever thing about neural networks is that the weights connecting each layer can be adjusted, through trial and error, to begin to classify the inputs that it is fed. For instance, if you present a neural network with a series of images of cats and dogs and ask it to respond “cat” or “dog,” you can tell the network each time it is right and each time it is wrong. This is known as supervised learning—the human is supervising the neural network to help it get better at the task of classifying cats and dogs. Over time, the weights between the layers are adjusted so that the network gets better and better at giving the right answer all by itself.4
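The weighted-sum-then-nonlinearity recipe described above can be sketched in a few lines of Python. This is only an illustration: the sigmoid nonlinearity, the bias term, the function names, and the hand-picked weights are all assumptions made for the example, not details taken from any particular network.

```python
import math

def sigmoid(x):
    """Nonlinear squashing function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_activation(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, passed through a nonlinearity."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)

def layer_forward(inputs, weight_rows, biases):
    """Activations of a whole layer: one neuron per row of weights."""
    return [neuron_activation(inputs, row, b) for row, b in zip(weight_rows, biases)]

# Two inputs feed a hidden layer of two neurons, which feeds one output neuron.
# The weights are arbitrary illustrative values, not learned ones.
hidden = layer_forward([1.0, 0.0], [[2.0, -1.0], [-2.0, 1.0]], [0.0, 0.0])
output = neuron_activation(hidden, [1.5, -1.5])
```

Each layer's activations simply become the next layer's inputs, which is all that "passing the value on" amounts to here. Learning would consist of nudging the numbers in the weight lists.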


Val without a brain

Neural networks have a long history in artificial intelligence research, but the first networks were considered too simple to compute anything useful. That changed in the 1980s and 1990s with the advent of more computing power, more data, and clever ways of training the networks. Today’s artificial neural networks can classify thousands of images with superhuman performance. One particularly efficient approach is to use deep networks, which have multiple layers, just like the visual system of the human brain.5


A diagram of an artificial neural network

To allow Val to start perceiving the world, then, we could hook up all the pixels from her digital camera to the input layer of a deep neural network. We could switch on each of the blue and green lights in turn, and train the output layers of the network to categorize the light as either blue or green. By providing corrective feedback (just as we might do to an infant who is learning the names of things in the world), the weights between the layers of Val’s network would gradually get adjusted so that she is reliably perceiving the blue light as blue and the green light as green.
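To make the idea of corrective feedback concrete, here is a minimal sketch of how Val might learn to tell the two lights apart. It makes several simplifying assumptions: a single trainable unit stands in for the deep network, Val's camera is reduced to two hypothetical numbers (average green and blue intensity), and the noise level, learning rate, and function names are invented for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_example(rng, colour):
    """Hypothetical camera reading: noisy (green, blue) channel intensities."""
    green, blue = (0.9, 0.1) if colour == "green" else (0.1, 0.9)
    noisy = [green + rng.gauss(0, 0.1), blue + rng.gauss(0, 0.1)]
    label = 1.0 if colour == "blue" else 0.0      # 1 = "blue", 0 = "green"
    return noisy, label

def train(examples, epochs=50, learning_rate=0.5):
    """Adjust the weights a little after every example, in proportion to the error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            prediction = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = label - prediction            # the corrective feedback
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

def classify(weights, bias, inputs):
    p_blue = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    return "blue" if p_blue > 0.5 else "green"

rng = random.Random(0)
examples = [make_example(rng, colour) for colour in ["green", "blue"] * 50]
weights, bias = train(examples)
```

After training, calling `classify(weights, bias, [0.1, 0.9])` should report the blue light and `classify(weights, bias, [0.9, 0.1])` the green one. A real deep network would have many layers and start from raw pixels, but the feedback loop has the same shape.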

There is a curious side effect of this kind of training. Once a network has learned to classify aspects of its environment, parts of the network start to respond to similar features. In Val’s case, after learning to discriminate between blue and green lights, some neurons in her brain will tend to activate more strongly for green, and others more strongly for blue. These patterned responses to the outside world are known as representations of the environment. A representation is something inside a cognitive system that “keeps track of” or is “about” some aspect of the outside world, just as we might say that a painting “represents” a landscape. Representations play an important role in the general idea that the brain performs computations. If something inside my head can represent a house cat, then I can also do things like figure out that it is related to a lion, and that both belong to a larger family of animals known as cats.6

It is likely no coincidence that the way successful artificial image-classification networks are wired is similar to the hierarchical organization of the human brain. Lower layers contain neurons that handle only small parts of the image and keep track of features such as the orientation of lines or the difference between light and shade. Higher layers contain neurons that process the entire image and represent things about the object in the image (such as whether it contains features typical of a cat or a dog). Computational neuroscientists have shown that exactly this kind of progression—from computing local features to representing more global properties—can be found in the ventral visual stream of human and monkey brains.7

Scaled-up versions of this kind of architecture can be very powerful indeed. By combining artificial neural networks with reinforcement learning, the London-based technology company DeepMind has trained algorithms to solve a wide range of board and video games, all without being instructed about the rules in advance. In March 2016, its flagship algorithm, AlphaGo, beat Lee Sedol, the world champion at the board game Go and one of the greatest players of all time. In Go, players take turns placing their stones on intersections of a nineteen-by-nineteen grid, with the objective of encircling or capturing the other player’s stones. Compared to chess, the number of board positions is vast, outstripping the estimated number of atoms in the universe. But by playing against itself millions of times, and updating its predictions about valuable moves based on whether it won or lost, AlphaGo could achieve superhuman skill at a game that is considered so artful that it was once one of four essential skills that Chinese aristocrats were expected to master.8
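The comparison with atoms can be checked with some rough arithmetic. The sketch below counts every way of filling the board, which is only an upper bound because it includes arrangements the rules of Go forbid. The figure of 10 to the power 80 atoms is a commonly quoted estimate and is an assumption of this example.

```python
BOARD_POINTS = 19 * 19            # 361 intersections
STATES_PER_POINT = 3              # empty, black stone, or white stone

# Upper bound: counts every arrangement, including ones the rules forbid.
upper_bound = STATES_PER_POINT ** BOARD_POINTS

# Assumption: about 10**80 atoms in the observable universe (a commonly quoted estimate).
ATOMS_IN_UNIVERSE = 10 ** 80

digits_in_bound = len(str(upper_bound))      # 173 digits, roughly 1.7 x 10**172
outstrips_atoms = upper_bound > ATOMS_IN_UNIVERSE
```

Even allowing for the many illegal arrangements included in this count, the gap of roughly ninety orders of magnitude leaves the comparison comfortably intact.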

These kinds of neural networks learn from feedback. Whether through supervised learning or reinforcement learning, they have to learn whether they are right or wrong by training on a series of examples. After being trained, they acquire rich representations of their environment and reward functions that tell them what is valuable. These algorithms can be stunningly powerful, intelligent, and resourceful, but they have limited self-awareness of what they do or do not know. It is also unlikely that self-awareness will simply emerge as a by-product of designing ever more intelligent machines. As we have seen, good ability does not necessarily lead to good metacognition. You might be performing expertly at a task (recall the chick sexers) and yet have no self-awareness of what you are doing. Instead, this kind of AI is likely to become more and more intelligent in particular domains, while perhaps remaining no more self-aware than a pocket calculator.

We can make this discussion more precise by considering what building blocks machines would need to become self-aware. Many of these components are things we have already encountered in Part I, such as the ability to track uncertainty and self-monitor actions. Machines often do not have these second-order capabilities, partly because, in most cases in which the problem is clearly specified, they do not need to. William James, the grandfather of modern psychology, anticipated this idea when he mused that a machine with a “clock-work whose structure fatally determines it to a certain rate of speed” would be unlikely to make errors in the first place, let alone need to correct them. He contrasted this against the effortless self-awareness of the human mind: “If the brain be out of order and the man says ‘Twice four are two,’ instead of ‘Twice four are eight’ … instantly there arises a consciousness of error.”9

The machines in James’s day were simple enough that errors were the exception rather than the rule. This is no longer the case. In fact, a key problem with modern machine learning techniques is that they are often overconfident in the real world; they think they know the answer when they would be better off hedging their bets. This poses a serious problem for operating AI devices in novel environments—for instance, the software installed in self-driving cars can be fooled by inputs it has not encountered before or different lighting conditions, potentially leading to accidents.10

Another problem is that, once a neural network is trained, it is hard to know why it is doing what it is doing. As we have seen, modern AI is not usually set up with the goal of self-explanation. The philosopher Andy Clark and psychologist Annette Karmiloff-Smith anticipated this idea in a landmark article in 1993. They suggested that artificial neural networks are unable to monitor what they are doing precisely because their knowledge remains stored “in” the system, in the weights connecting each layer.

They gave the example of training an artificial neural network to predict whether individuals would default on their loans (something that was hypothetical in 1993 but is now a routine aspect of machine learning at major banks). The network can draw on a huge database of information (such as postal address or income level) about people in a particular country (such as the UK) to refine its predictions about who will and will not prove likely to default. Clark and Karmiloff-Smith asked whether, after all this learning, the network would be able to communicate what it had learned to a new network that the bank wanted to set up in Australia. In other words, would it know what it had learned and be able to use that self-knowledge to teach others? They concluded it would not: “What the system needs, in order to be able to tell the Australian system anything useful, is some more abstract and transportable knowledge concerning relevant factors in loan assessment.… The original network gets by without such explicit abstractions, since it only develops the minimal representations needed to succeed in the version of the task for which it was trained.” In other words, Clark and Karmiloff-Smith were suggesting that, even with generous computing resources and ample data, neural networks solving complex problems are unlikely to become self-aware of what they know. Knowledge within these networks remains buried in a changing pattern of weights between the layers—it is knowledge “in” the network, rather than “for” the network. They went on to suggest that, in self-aware minds, a process of “representational redescription” also occurs (at least for some types of learning), allowing us not only to perceive and categorize a cat or a dog, but also to know that we are perceiving a cat as a cat and a dog as a dog.11

This all sounds quite abstract and theoretical. But the process of creating meta-representations is actually relatively simple. Just as we can have neural networks that take in information from the outside world, transform it, and spit out an answer, we can have metacognitive networks that model how other neural networks are operating.

One of the first attempts at creating artificial metacognition was made by Nicholas Yeung, Jonathan Cohen, and Matthew Botvinick in a landmark paper published in Psychological Review in 2004. They trained a neural network to solve a Stroop task (stating the color of a word rather than reading the word out loud), similar to the one we encountered earlier in the experiments on typewriting. The researchers then added a simple metacognitive network to monitor what was happening in their first network. In fact, this second network was so simple that it was just a single unit, or neuron, receiving inputs from the main network. It calculated how much “conflict” there was between the two responses: if both the word and the color were competing for control over the response, this would indicate that the judgment was difficult, and that an error might occur, and it would be wise to slow down. The simple addition of this layer was able to account for an impressive array of data on how humans detect their own errors on similar tasks.12
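A single monitoring unit of this kind is easy to sketch. The version below is an illustration rather than the published model: it assumes conflict can be approximated by multiplying the activations of the two competing responses, and the threshold and function names are hypothetical.

```python
def conflict(say_red, say_green):
    """Single monitoring unit: high only when both responses are active together."""
    return say_red * say_green

def should_slow_down(say_red, say_green, threshold=0.25):
    """Hypothetical control rule: be cautious when conflict crosses a threshold."""
    return conflict(say_red, say_green) > threshold

# The word GREEN printed in green ink: only one response is strongly active.
easy = conflict(say_red=0.1, say_green=0.9)      # about 0.09
# The word RED printed in green ink: both responses compete for control.
hard = conflict(say_red=0.7, say_green=0.8)      # about 0.56
```

The monitoring unit knows nothing about words or colors. It only watches the first network's outputs, which is what makes it a (very minimal) second-order process.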

Other work has focused on attempting to build computer simulations of neurological conditions, such as blindsight, in which metacognition is impaired but lower-level performance is maintained. Axel Cleeremans’s team has attempted to mimic this phenomenon by training an artificial neural network to discriminate among different stimulus locations. They then built a second neural network that tracked both the inputs and outputs of the first network. This metacognitive network was trained to gamble on whether the first network would get the answer correct. By artificially damaging the connections between the two networks, the team could mimic blindsight in their computer simulation. The first network remained able to correctly select the location, but the system’s metacognitive sensitivity—the match between confidence and performance—was abolished, just as in the patients. In other experiments looking at how metacognitive knowledge emerges over time, Cleeremans’s team found that the simulation’s self-awareness initially lagged behind its performance; it was as if the system could initially perform the task intuitively, without any awareness of what it was doing.13
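The logic of such a simulated lesion can be illustrated with a toy model. This is a sketch under strong assumptions and not a reconstruction of the published simulations: the second-order "network" here is hand-wired rather than trained, it bets high whenever the signal it receives is strong, and the lesion simply replaces that signal with unrelated noise. All names and parameter values are invented for the example.

```python
import random

def run_trials(n_trials, connection_intact, seed=1):
    rng = random.Random(seed)              # drives the first-order network
    lesion_rng = random.Random(seed + 1)   # noise seen by a disconnected monitor
    outcomes = []
    for _ in range(n_trials):
        location = rng.choice([-1, 1])             # stimulus on the left or right
        evidence = location + rng.gauss(0, 1.0)    # noisy first-order signal
        choice = 1 if evidence > 0 else -1         # first-order decision
        correct = (choice == location)
        # What the second-order network receives about the first network's state.
        received = evidence if connection_intact else lesion_rng.gauss(0, 1.0)
        wager_high = abs(received) > 1.0           # bet high on strong signals
        outcomes.append((correct, wager_high))
    return outcomes

def accuracy(outcomes):
    return sum(correct for correct, _ in outcomes) / len(outcomes)

def metacognitive_sensitivity(outcomes):
    """High-wager rate on correct trials minus high-wager rate on errors."""
    on_correct = [wager for correct, wager in outcomes if correct]
    on_error = [wager for correct, wager in outcomes if not correct]
    return sum(on_correct) / len(on_correct) - sum(on_error) / len(on_error)

intact = run_trials(20000, connection_intact=True)
lesioned = run_trials(20000, connection_intact=False)
```

Because the first-order network is untouched, `accuracy(intact)` and `accuracy(lesioned)` are identical, while `metacognitive_sensitivity` should be clearly positive for the intact system and close to zero after the lesion. Performance survives while the match between wagers and performance does not.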

Various ingenious solutions are now being pursued to build a sense of confidence into AI, creating “introspective” robots that know whether they are likely to be right before they make a decision, rather than after the fact. One promising approach, known as dropout, runs multiple copies of the network, each with a slightly different architecture because a random subset of its units has been switched off. The range of predictions that the copies make provides a useful proxy for how uncertain the network should be about its decision. In another version of these algorithms, autonomous drones were trained to navigate themselves around a cluttered environment, just as a parcel-delivery drone would need to do if operating around the streets and skyscrapers of Manhattan. The researchers trained a second neural network within the drone to detect the likelihood of crashes during test flights. When their souped-up introspective drone was released in a dense forest, it was able to bail out of navigation decisions that it had predicted would lead to crashes. By explicitly baking this kind of architecture into machines, we may be able to endow them with the same metacognitive building blocks that we saw are prevalent in animals and human infants. No one is yet suggesting that the drone is self-aware in a conscious sense. But by being able to reliably predict its own errors, it has gained a critical building block for metacognition.14
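The dropout idea can be sketched with a toy network. Everything specific here is an assumption made for illustration: the hidden-unit parameters are hand-picked rather than trained, the output is simply the average of the surviving units, and the names and settings are hypothetical. The point is only to show how the spread of predictions across randomly thinned copies can serve as an uncertainty signal.

```python
import math
import random
import statistics

# Hypothetical, hand-picked hidden-unit parameters (slope, offset); not trained.
HIDDEN_UNITS = [(1.0, 0.5), (1.2, -0.5), (0.8, 1.0), (1.5, -1.0), (1.0, 0.0), (0.9, -0.8)]

def predict_with_dropout(x, rng, keep_probability=0.5):
    """One forward pass with a random subset of hidden units switched off."""
    kept = [unit for unit in HIDDEN_UNITS if rng.random() < keep_probability]
    if not kept:                          # make sure at least one unit survives
        kept = [rng.choice(HIDDEN_UNITS)]
    activations = [math.tanh(slope * x + offset) for slope, offset in kept]
    return sum(activations) / len(activations)

def prediction_and_uncertainty(x, n_passes=200, seed=0):
    """Mean prediction and its spread across many randomly thinned copies."""
    rng = random.Random(seed)
    samples = [predict_with_dropout(x, rng) for _ in range(n_passes)]
    return statistics.mean(samples), statistics.pstdev(samples)

# An input on which every hidden unit agrees, and one on which they disagree.
clear_mean, clear_spread = prediction_and_uncertainty(5.0)
ambiguous_mean, ambiguous_spread = prediction_and_uncertainty(0.0)
```

For the first input the copies barely differ, so the spread is tiny. For the second, which units happen to survive matters a great deal, and the larger spread is the network's cue that it should be less sure of itself.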

There may be other side benefits of building metacognition into machines, beyond rectifying their tendency for overconfidence. Think back to the start of the book, where we encountered our student Jane studying for an upcoming exam. In making decisions about when and where to study, Jane is likely drawing on abstract knowledge she has built up about herself over many instances, regardless of the subject she was studying. The neural machinery for creating abstract knowledge about ourselves is similar to the kind of machinery needed for creating abstract knowledge about how things work more generally. For instance, when going to a foreign country, you might not know how to operate the metro system, but you expect, based on experience in other cities, that there will be similar components such as ticket machines, tickets, and barriers. Leveraging this shared knowledge makes learning the new system much faster. And such abstractions are exactly what Andy Clark and Annette Karmiloff-Smith recognized that we need to build into neural networks to allow them to know what they know and what they don’t know.15

Knowledge about ourselves is some of the most transferable and abstract knowledge of all. After all, “I” am a constant feature across all situations in which I have to learn. Beliefs about my personality, skills, and abilities help me figure out whether I will be the kind of person who will be able to learn a new language, play an unfamiliar racket sport, or make friends easily. These abstract facts about ourselves live at the top of our metacognitive models, and, because of their role in shaping how the rest of the mind works, they exert a powerful force on how we live our lives. It is likely that similar abstract self-beliefs will prove useful in guiding autonomous robots toward tasks that fit their niche—for instance, to allow a drone to know that it should seek out more parcel-delivery jobs rather than try to vacuum the floor.

Let’s imagine what a future may look like in which we are surrounded by metacognitive machines. Self-driving cars could be engineered to glow gently in different colors, depending on how confident they were that they knew what to do next—perhaps a blue glow for when they are confident and yellow for when they are uncertain. These signals could be used by their human operators to take control in situations of high uncertainty and increase the humans’ trust that the car did know what it was doing at all other times. Even more intriguing is the idea that these machines could share metacognitive information with each other, just as self-awareness comes into its own when humans begin to collaborate and interact. Imagine two autonomous cars approaching an intersection, each signaling to turn in different directions. If both have a healthy blue glow, then they can proceed, safe in the knowledge that the other car has a good idea of what is happening. But if one or both of them begins to glow yellow, it would be wise to slow down and proceed with caution, just as we would do if a driver on the other side of the intersection didn’t seem to know what our intentions were.

Intriguingly, this kind of exchange would itself be interactive and dynamic, such that if one car began to glow yellow, it could lead others to drop their confidence too. Every car at the intersection would begin to hesitate until it was safe to proceed. The sharing of metacognitive information between machines is still a long way from endowing them with a full-blown theory of mind. But it may provide the kind of minimal metacognitive machinery that is needed to manage interactions between human-machine and machine-machine teams. More elaborate versions of these algorithms—for instance, versions that know why they are uncertain (Is it the change in lighting? A new vehicle it’s never encountered before?)—may also begin to approach a form of narrative explanation about why errors were made.
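As a thought experiment, the glowing-car scenario can be written down as a few simple rules. The sketch below is purely hypothetical: the confidence threshold, the rule that each car moves halfway toward the least confident car, and all function names are assumptions chosen to illustrate the idea, not a description of any real system.

```python
def glow(confidence, threshold=0.7):
    """Map a car's confidence (0 to 1) onto the colour it displays."""
    return "blue" if confidence >= threshold else "yellow"

def share_confidence(confidences):
    """Each car moves halfway toward the least confident car at the intersection."""
    lowest = min(confidences)
    return [(own + lowest) / 2 for own in confidences]

def may_proceed(confidences, threshold=0.7):
    """Proceed only if every car is glowing blue."""
    return all(glow(c, threshold) == "blue" for c in confidences)

# Two confident cars: both keep a blue glow and can proceed.
both_sure = share_confidence([0.9, 0.95])
# One hesitant car drags the other's confidence down, so both now glow yellow.
one_unsure = share_confidence([0.9, 0.4])
```

Here `may_proceed(both_sure)` holds while `may_proceed(one_unsure)` does not. A single yellow glow is enough to make every car at the intersection hesitate.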

That is our first scenario: building minimal forms of artificial metacognition and self-awareness into machines. This research is already well underway. But there is also a more ambitious alternative: augmenting machines with the biology of human self-awareness.

Know Thy Robot

Imagine that when we step into a self-driving car of the future, we simply hook it up to a brain-computer interface while we drive it around the block a few times. The signals streaming back from the car while we drive gradually lead to changes in neural representations in the PFC just as they have already been shaped by a variety of other tools we use. We could then let the car take over; there would be nothing left for “us” to do in terms of routine driving. Critically, however, the brain-computer interface would ensure we have strong, rather than weak, metacognitive contact with the car. Just as we have limited awareness of the moment-to-moment adjustments of our actions, our vehicle would go places on our behalf. But if things were to go awry, we would naturally become aware of this—much as we might become aware of stumbling on a stationary escalator or hitting a poor tennis shot.

This scenario seems far-fetched. But there is nothing in principle to stop us from hooking up our self-awareness to other devices. Thanks to the plasticity of neural circuits, we already know it is possible for the brain to adopt external devices as if they were new senses or limbs. One of the critical developments in brain-computer interfaces occurred in the early 1980s. Apostolos Georgopoulos, a neuroscientist then at Johns Hopkins University, was recording the activity of multiple neurons in the monkey motor cortex. He found that when the monkey made arm movements in different directions, each cell had a particular direction for which its rate of firing was highest. When the firing of the entire population of cells, each with a different preferred direction, was examined, the vector sum of firing rates could predict with substantial accuracy where the monkey’s hand actually went. It was not long before other labs were able to decode these population codes and show that monkeys could be trained to control robot arms by modulating patterns of neural activity.16
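The vector-sum idea can be illustrated with a small simulation. The sketch assumes idealized cells with cosine-shaped tuning and evenly spaced preferred directions, and it weights each cell by its firing rate above a baseline. The cell count, firing rates, and function names are hypothetical values chosen for the example.

```python
import math

N_CELLS = 8
PREFERRED = [2 * math.pi * i / N_CELLS for i in range(N_CELLS)]   # evenly spaced
BASELINE, GAIN = 10.0, 8.0      # hypothetical firing rates in spikes per second

def firing_rates(movement_angle):
    """Cosine tuning: each cell fires most for its own preferred direction."""
    return [BASELINE + GAIN * math.cos(movement_angle - p) for p in PREFERRED]

def population_vector(rates):
    """Sum each cell's preferred-direction vector, weighted by its rate above baseline."""
    x = sum((r - BASELINE) * math.cos(p) for r, p in zip(rates, PREFERRED))
    y = sum((r - BASELINE) * math.sin(p) for r, p in zip(rates, PREFERRED))
    return math.atan2(y, x)

# Simulate a reach at 60 degrees and read the direction back out of the population.
decoded = population_vector(firing_rates(math.radians(60)))
```

No single cell signals 60 degrees, yet `math.degrees(decoded)` recovers the movement direction from the population as a whole. With real, noisy neurons the match would be approximate rather than exact.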

Matt Nagle, a tetraplegic left unable to move after being attacked and stabbed, was one of the first patients to receive the benefit of this technology in 2002. After receiving an implant made by a commercial company, Cyberkinetics, he learned to move a computer cursor and change TV channels by thought alone. Companies such as Elon Musk’s Neuralink have recently promised to accelerate the development of such technologies by developing surgical robots that can integrate the implant with neural tissue in ways that would be near impossible for human surgeons. We might think that undergoing neurosurgery is a step too far to be able to control our AI devices. But other companies are harnessing noninvasive brain-scanning devices such as EEG, which, when combined with machine learning, may allow similarly precise control of external technology.17

Most current research on brain-computer interfaces is seeking to find ways of harnessing brain activity to control external technology. But there seems no principled reason why it would not also be possible for the brain to monitor autonomous devices rather than control them directly. Remember that metacognition has a wide purview. If we can interface autonomous technology with the brain, it is likely that we will be able to monitor it using the same neural machinery that supports self-awareness of other cognitive processes. In these cases, our brain-computer interface would be tapping into a higher level of the system: the level of metacognition and self-awareness, rather than perception and motor control.18

By designing our partnership with technology to take advantage of our natural aptitude for self-awareness, we can ensure humans stay in the loop. Progress in AI will provide new and rich sources of raw material to incorporate into our metacognitive models. If the pilots of the doomed Air France Airbus had had such a natural awareness of what their autopilot was doing, the act of stepping in and taking over control might not have been such a jarring and nerve-racking event.

It may not matter that we don’t understand how these machines work, as long as they are well interfaced with metacognition. Only a small number of biologists understand in detail how the eye works. And yet, as humble users of eyes, we can instantly recognize when an image may be out of focus or when we need the help of reading glasses. Few people understand the complex biomechanics of how muscles produce the movements of our arms, and yet we can recognize when we have hit a tennis serve or golf swing poorly and need to go back to have more coaching. In exactly the same way, the machines of the future may be monitored by our own biological machinery for self-awareness, without us needing an instruction manual to work out how to do so.

What Kind of World Do We Want?

Which route we pursue depends on which world we want to live in. Do we want to share our world with self-aware machines? Or would we rather our AIs remain smart and un-self-aware, helping us to augment our natural cognitive abilities?

One concern with the first route is a moral one. Given the human tendency to ascribe moral responsibility to agents that possess self-awareness, enabling machines with even the first building blocks of metacognition may quickly raise difficult questions about the rights and responsibilities of our robot collaborators. For now, though, the richer metacognitive networks being pursued by AI researchers remain distinct from the flexible architecture of human self-awareness. Prototype algorithms for metacognition—such as a drone predicting that it might be about to crash—are just as un-self-aware as the kind of regular neural networks that allow Facebook and Google to classify images and Val to navigate her toy world.

Instead, current versions of artificial metacognition are quite narrow, learning to monitor performance on one particular task, such as classifying visual images. In contrast, as we have seen, human self-awareness is a flexible resource and can be applied to evaluate a whole range of thoughts, feelings, and behaviors. Developing domain-specific metacognitive capacity in machines, then, is unlikely to approach the kind of self-awareness that we associate with human autonomy.19

A second reason for the brittleness of most computer simulations of metacognition is that they mimic implicit, or “model-free,” ways of computing confidence and uncertainty, but do not actively model what the system is doing. In this sense, AI metacognition has started to incorporate the unconscious building blocks of uncertainty and error monitoring that we encountered in Part I, but not the kind of explicit metacognition that, as we saw, emerges late in child development and is linked to our awareness of other minds. If a second-order, Rylean view of self-awareness is correct, then humanlike self-awareness may emerge only if and when AI achieves a capacity for fully fledged theory of mind.

But we should also not ignore what the future holds for our own self-awareness. Paradoxically, the neuroscience of metacognition tells us that by melding with AI (the second route) we may retain more autonomy and explainability than if we continue on the path toward creating clever but unconscious machines. This paradox is highlighted in current debates about explainable AI. Solutions here often focus on providing readouts or intuitive visualizations of the inner workings of the black box. The idea is that if we can analyze the workings of the machine, we will have a better idea of why it is making particular decisions. This might be useful for simple systems. But for complex problems it is unlikely to be helpful. It would be like providing an fMRI scan (or worse, a high-resolution map of prefrontal cortical cell firing) to explain why I made a particular decision about which sandwich to have for lunch. This would not be an explanation in the usual sense of the term and would not be one that would be recognized in a court of law. Instead, for better or worse, humans effortlessly lean on self-awareness to explain to each other why we did what we did, and a formal cross-examination of such explanations forms the basis of our sense of autonomy and responsibility.20

In practice, I suspect a blend of both approaches will emerge. Machines will gain domain-specific abilities to track uncertainty and monitor their actions, allowing them to effectively collaborate and build trust with each other and their human operators. By retaining humans in the loop, we can leverage our unique capacity for self-narrative to account for why our machines did what they did in the language of human explanation. I suspect it was this metacognitive meaning of consciousness that the historian Yuval Noah Harari had in mind when he wrote, “For every dollar and every minute we invest in improving artificial intelligence, it would be wise to invest a dollar and a minute in advancing human consciousness.” The future of self-awareness may mean we end up not only knowing thyself, but also knowing thy machine.21