Chapter 5 Maps in the Brain - Part 1 A New Understanding of the Brain

A Thousand Brains: A New Theory of Intelligence - Jeff Hawkins 2021

It took years for us to deduce that reference frames exist throughout the neocortex, but in hindsight, we could have understood this a long time ago with a simple observation. Right now, I am sitting in a small lounge area of Numenta’s office. Near me are three comfortable chairs similar to the one I am sitting in. Beyond the chairs are several freestanding desks. Beyond the desks, I see the old county courthouse across the street. Light from these objects enters my eyes and is projected onto the retina. Cells in the retina convert light into spikes. This is where vision starts, at the back of the eye. Why, then, do we not perceive objects as being in the eye? If the chairs, desks, and courthouse are imaged next to each other on my retina, how is it that I perceive them to be at different distances and different locations? Similarly, if I hear a car approaching, why do I perceive the car as one hundred feet away to my right and not in my ear, where the sound actually is?

This simple observation, that we perceive objects as being somewhere—not in our eyes and ears, but at some location out in the world—tells us that the brain must have neurons whose activity represents the location of every object that we perceive.

At the end of the last chapter, I told you that we were worried about submitting our first paper about reference frames because, at that time, we didn’t know how neurons in the neocortex could do this. We were proposing a major new theory about how the neocortex works, but the theory was largely based on logical deduction. It would be a stronger paper if we could show how neurons did it. The day before we submitted, I added a few lines of text suggesting that the answer might be found in an older part of the brain called the entorhinal cortex. I am going to tell you why we suggested that with a story about evolution.

An Evolutionary Tale

When animals first started moving about in the world, they needed a mechanism to decide which way to move. Simple animals have simple mechanisms. For example, some bacteria follow gradients. If the quantity of a needed resource, such as food, is increasing, then they are more likely to keep moving in the same direction. If the quantity is decreasing, then they are more likely to turn and try a different direction. A bacterium doesn’t know where it is; it doesn’t have any way to represent its location in the world. It just goes forward and uses a simple rule for deciding when to turn. A slightly more sophisticated animal, such as an earthworm, might move to stay within desirable ranges of warmth, food, and water, but it doesn’t know where it is in the garden. It doesn’t know how far away the brick path is, or the direction and distance to the nearest fence post.

Now consider the advantages afforded to an animal that knows where it is, an animal that always knows its location relative to its environment. The animal can remember where it found food in the past and the places it used for shelter. The animal can then calculate how to get from its current location to these and other previously visited locations. The animal can remember the path it traveled to the watering hole and what happened at various locations along the way. Knowing your location and the location of other things in the world has many advantages, but it requires a reference frame.

Recall that a reference frame is like the grid of a map. For example, on a paper map you might locate something using labeled rows and columns, such as row D and column 7. The rows and columns of a map are a reference frame for the area represented by the map. If an animal has a reference frame for its world, then as it explores it can note what it found at each location. When the animal wants to get someplace, such as a shelter, it can use the reference frame to figure out how to get there from its current location. Having a reference frame for your world is useful for survival.

Being able to navigate the world is so valuable that evolution discovered multiple methods for doing it. For example, some honeybees can communicate distance and direction using a form of dance. Mammals, such as ourselves, have a powerful internal navigation system. There are neurons in the old part of our brain that are known to learn maps of the places we have visited, and these neurons have been under evolutionary pressure for so long that they are fine-tuned to do what they do. In mammals, the old brain parts where these map-creating neurons exist are called the hippocampus and the entorhinal cortex. In humans, these organs are roughly the size of a finger. There is one set on each side of the brain, near the center.

Maps in the Old Brain

In 1971, scientist John O’Keefe and his student Jonathan Dostrovsky placed a wire into a rat’s brain. The wire recorded the spiking activity of a single neuron in the hippocampus. The wire went up toward the ceiling so they could record the activity of the cell as the rat moved and explored its environment, which was typically a big box on a table. They discovered what are now called place cells: neurons that fire every time the rat is in a particular location in a particular environment. A place cell is like a “you are here” marker on a map. As the rat moves, different place cells become active in each new location. If the rat returns to a location where it was before, the same place cell becomes active again.

In 2005, scientists in the lab of May-Britt Moser and Edvard Moser used a similar experimental setup, again with rats. In their experiments, they recorded signals from neurons in the entorhinal cortex, adjacent to the hippocampus. They discovered what are now called grid cells, which fire at multiple locations in an environment. The locations where a grid cell becomes active form a grid pattern. If the rat moves in a straight line, the same grid cell becomes active again and again, at equally spaced intervals.

The details of how place cells and grid cells work are complicated and still not completely understood, but you can think of them as creating a map of the environment occupied by the rat. Grid cells are like the rows and columns of a paper map, but overlaid on the animal’s environment. They allow the animal to know where it is, to predict where it will be when it moves, and to plan movements. For example, if I am at location B4 on a map and want to get to location D6, I can use the map’s grid to know that I have to go two squares to the right and two squares down.
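The B4-to-D6 arithmetic can be written out as a minimal Python sketch. It assumes, as in the paper-map example, that letters label rows from top to bottom and numbers label columns from left to right. The function names parse_square and grid_offset are made up for this illustration and say nothing about how grid cells actually compute.

```python
def parse_square(label: str) -> tuple[int, int]:
    """Turn a label such as 'B4' into zero-based (row, column) indices."""
    row = ord(label[0].upper()) - ord("A")   # letters label rows, top to bottom
    col = int(label[1:]) - 1                 # numbers label columns, left to right
    return row, col


def grid_offset(start: str, goal: str) -> tuple[int, int]:
    """Return (squares down, squares right) needed to get from start to goal.

    Negative values mean up or left.
    """
    start_row, start_col = parse_square(start)
    goal_row, goal_col = parse_square(goal)
    return goal_row - start_row, goal_col - start_col


down, right = grid_offset("B4", "D6")
print(f"{right} squares right, {down} squares down")  # 2 squares right, 2 squares down
```

The point of the sketch is that the offset comes from the grid labels alone; nothing about what is printed in either square is needed to plan the move.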

But grid cells alone don’t tell you what is at a location. For example, if I told you that you were at location A6 on a map, that information doesn’t tell you what you will find there. To know what is at A6, you need to look at the map and see what is printed in the corresponding square. Place cells are like the details printed in the square. Which place cells become active depends on what the rat senses at a particular location. Place cells tell the rat where it is based on sensory input, but place cells alone aren’t useful for planning movements—that requires grid cells. The two types of cells work together to create a complete model of the rat’s environment.

Every time a rat enters an environment, the grid cells establish a reference frame. If it is a novel environment, the grid cells create a new reference frame. If the rat recognizes the environment, the grid cells reestablish the previously used reference frame. This process is analogous to you entering a town. If you look around and realize that you have been there before, you pull out the correct map for that town. If the town looks unfamiliar, then you take out a blank piece of paper and start creating a new map. As you walk around the town, you write on your map what you see at each location. That is what grid cells and place cells do. They create unique maps for every environment. As a rat moves, the active grid cells and the active place cells change to reflect the new location.

Humans have grid cells and place cells too. Unless you are completely disoriented, you always have a sense of where you are. I am now standing in my office. Even if I close my eyes, my sense of location persists, and I continue to know where I am. Keeping my eyes closed, I take two steps to my right and my sense of location in the room changes. The grid cells and place cells in my brain have created a map of my office, and they keep track of where I am in my office, even when my eyes are closed. As I walk, the set of active cells changes to reflect my new location. Humans, rats, indeed all mammals use the same mechanism for knowing our location. We all have grid cells and place cells that create models of the places we have been.

Maps in the New Brain

When we were writing our 2017 paper about locations and reference frames in the neocortex, I had some knowledge of place cells and grid cells. It occurred to me that knowing the location of my finger relative to a coffee cup is similar to knowing the location of my body relative to a room. My finger moves around the cup in the same way that my body moves about a room. I realized that the neocortex might have neurons that are equivalent to the ones in the hippocampus and entorhinal cortex. These cortical place cells and cortical grid cells would learn models of objects in a similar way to how place cells and grid cells in the old brain learn models of environments.

Given their role in basic navigation, place cells and grid cells are almost certainly evolutionarily older than the neocortex. Therefore, I figured it was more likely that the neocortex creates reference frames using a derivative of grid cells than that it evolved a new mechanism from scratch. But in 2017, we were not aware of any evidence that the neocortex had anything similar to grid cells or place cells—it was informed speculation.

Shortly after our 2017 paper was accepted, we learned of recent experiments that suggested grid cells might be present in parts of the neocortex. (I will discuss these experiments in Chapter 7.) This was encouraging. The more we studied the literature related to grid cells and place cells, the more confident we became that cells that perform similar functions exist in every cortical column. We first made this argument in a 2019 paper, titled “A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex.”

Again, to learn a complete model of something you need both grid cells and place cells. Grid cells create a reference frame to specify locations and plan movements. But you also need sensed information, represented by place cells, to associate sensory input with locations in the reference frame.

The mapping mechanisms in the neocortex are not an exact copy of ones in the old brain. Evidence suggests that the neocortex uses the same basic neural mechanisms, but it is different in several ways. It is as if nature stripped down the hippocampus and entorhinal cortex to a minimal form, made tens of thousands of copies, and arranged them side by side in cortical columns. That became the neocortex.

Grid cells and place cells in the old brain mostly track the location of one thing: the body. They know where the body is in its current environment. The neocortex, on the other hand, has about 150,000 copies of this circuit, one per cortical column. Therefore, the neocortex tracks thousands of locations simultaneously. For example, each small patch of your skin and each small patch of your retina has its own reference frame in the neocortex. Your five fingertips touching a cup are like five rats exploring a box.

Huge Maps in Tiny Spaces

So, what does a model in the brain look like? How does the neocortex stuff hundreds of models into each square millimeter? To understand how this works, let’s go back to our paper-map analogy. Say I have a map of a town. I spread it out on a table and see that it is marked with rows and columns dividing it into one hundred squares. A1 is the top left and J10 is the bottom right. Printed in each square are things I might see in that part of town.

I take a pair of scissors and cut out each square, marking it with its grid coordinates: B6, G1, etc. I also mark each square with Town 1. I then do the same for nine more maps, each map representing a different town. I now have one thousand squares: one hundred map squares for each of ten towns. I shuffle the squares and put them in a stack. Although my stack contains ten complete maps, only one location can be seen at a time. Now someone blindfolds me and drops me off at a random location in one of the ten towns. Removing my blindfold, I look around. At first, I don’t know where I am. Then I see I am standing in front of a fountain with a sculpture of a woman reading a book. I flip through my map squares, one at a time, until I see one showing this fountain. The map square is labeled Town 3, location D2. Now I know what town I am in and I know where I am in that town.

There are several things I can do next. For example, I can predict what I will see if I start walking. My current location is D2. If I walk east, I will be in D3. I search my stack of squares to find the square labeled Town 3, D3. It shows a playground. In this way I can predict what I will encounter if I move in a certain direction.
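The two uses of the stack described so far, finding where I am from what I see and predicting what I will see after a move, can be sketched in Python. This is an illustration of the paper-map analogy only, not of neurons. The names MapSquare, find_by_feature, step, and predict are hypothetical, the bakery square is invented filler, and the sketch assumes letters are rows and numbers are columns, so walking east raises the number.

```python
from typing import NamedTuple, Optional


class MapSquare(NamedTuple):
    town: str
    location: str   # grid label such as "D2": letter = row, number = column
    feature: str    # what is printed on the square


# A tiny stack. Only the fountain and the playground come from the text;
# the bakery is made up to give the scan something else to flip past.
STACK = [
    MapSquare("Town 1", "D2", "bakery"),
    MapSquare("Town 3", "D3", "playground"),
    MapSquare("Town 3", "D2", "fountain"),
]

# Compass moves as (row change, column change); north is up the page.
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}


def find_by_feature(stack, feature):
    """Flip through the squares one at a time; return every match."""
    return [sq for sq in stack if sq.feature == feature]


def step(location: str, direction: str) -> str:
    """Return the label of the neighboring square (no edge-of-map check)."""
    d_row, d_col = MOVES[direction]
    row = chr(ord(location[0]) + d_row)
    col = int(location[1:]) + d_col
    return f"{row}{col}"


def predict(stack, town: str, location: str, direction: str) -> Optional[str]:
    """Look up what the stack says lies one square away, if it is mapped."""
    target = step(location, direction)
    for sq in stack:
        if sq.town == town and sq.location == target:
            return sq.feature
    return None


here = find_by_feature(STACK, "fountain")[0]
print(here.town, here.location)                          # Town 3 D2
print(predict(STACK, here.town, here.location, "east"))  # playground
```

Because every square carries both its town and its grid label, a single match on the fountain answers two questions at once: which map applies and where on that map I am.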

Perhaps I want to go to the town library. I can search my stack of squares until I see one showing a library in Town 3. That square is labeled G7. Given that I am at D2, I can calculate that I have to travel five squares east and three squares south to get to the library. I can take several different routes to get there. Using my map squares, one at a time, I can visualize what I will encounter along any particular route. I choose one that takes me past an ice cream shop.

Now consider a different scenario. After being dropped off at an unknown location and removing my blindfold, I see a coffee shop. But when I look through my stack of squares, I find five showing a similar-looking coffee shop. Two coffee shops are in one town, and the other three are in different towns. I could be in any of these five locations. What should I do? I can eliminate the ambiguity by moving. I look at the five squares where I might be, and then look up what I will see if I walk south from each of them. The answer is different for each of the five squares. To figure out where I am, I then physically walk south. What I find there eliminates my uncertainty. I now know where I am.

This way of using maps is different from how we typically use them. First, our stack of map squares contains all our maps. In this way, we use the stack to figure out both what town we are in and where we are in that town.

Second, if we are uncertain where we are, then we can determine our town and location by moving. This is what happens when you reach into a black box and touch an unknown object with one finger. With a single touch you probably can’t determine what object you are feeling. You might have to move your finger one or more times to make that determination. By moving, you discover two things at the same time: the moment you recognize what object you are touching, you also know where your finger is on the object.

Finally, this system can scale to handle a large number of maps and do so quickly. In the paper-map analogy, I described looking at the map squares one at a time. This could take a lot of time if you had many maps. Neurons, however, use what is called associative memory. The details are not important here, but it allows neurons to search through all the map squares at once. Neurons take the same amount of time to search through a thousand maps as to search through one.
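A loose software analogy for searching all the squares at once is an index keyed by feature. The sketch below uses a Python dictionary as a stand-in; it is not a model of how neurons implement associative memory, and the name build_index and the coffee shop locations are invented for the example (a smaller version of the ambiguous coffee shop case).

```python
from collections import defaultdict

# Hypothetical squares as (town, location, feature).
SQUARES = [
    ("Town 3", "D2", "fountain"),
    ("Town 3", "D3", "playground"),
    ("Town 1", "B5", "coffee shop"),
    ("Town 1", "H2", "coffee shop"),
    ("Town 4", "C7", "coffee shop"),
]


def build_index(squares):
    """Associate each feature with every (town, location) where it appears."""
    index = defaultdict(set)
    for town, location, feature in squares:
        index[feature].add((town, location))
    return index


INDEX = build_index(SQUARES)

# One lookup returns all matching squares; no square-by-square flipping.
print(sorted(INDEX["fountain"]))      # [('Town 3', 'D2')]
print(len(INDEX["coffee shop"]))      # 3
```

A dictionary lookup takes roughly the same time on average whether the index holds one map or a thousand, which echoes the scaling claim, although the mechanism is entirely different from a neural one.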

Maps in a Cortical Column

Now let’s consider how maplike models are implemented by neurons in the neocortex. Our theory says that every cortical column can learn models of complete objects. Therefore, every column—every square millimeter of the neocortex—has its own set of map squares. How a cortical column does this is complicated, and we don’t yet understand it completely, but we understand the basics.

Recall that a cortical column has multiple layers of neurons. Several of these layers are needed to create the map squares. Here is a simplified diagram to give you a flavor of what we think is happening in a cortical column.

[Figure: A model of a cortical column]

This figure represents two layers of neurons (the shaded boxes) in one cortical column. Although a column is tiny, about one millimeter wide, each of these layers might have ten thousand neurons.

The upper layer receives the sensory input to the column. When an input arrives, it causes several hundred neurons to become active. In the paper-map analogy, the upper layer represents what you observe at some location, such as the fountain.

The lower layer represents the current location in a reference frame. In the analogy, the lower layer represents a location (such as Town 3, D2) but doesn’t represent what is observed there. It is like a blank square, labeled only with Town 3, location D2.

The two vertical arrows represent connections between the blank map squares (the lower layer) and what is seen at that location (the upper layer). The downward arrow is how an observed feature, such as the fountain, is associated with a particular location in a particular town. The upward arrow associates a particular location—Town 3, D2—with an observed feature. The upper layer is roughly equivalent to place cells and the lower layer is roughly equivalent to grid cells.

Learning a new object, such as a coffee cup, is mostly accomplished by learning the connections between the two layers, the vertical arrows. Put another way, an object such as a coffee cup is defined by a set of observed features (upper layer) associated with a set of locations on the cup (lower layer). If you know the feature, then you can determine the location. If you know the location, you can predict the feature.

The basic flow of information goes as follows: A sensory input arrives and is represented by the neurons in the upper layer. This invokes the location in the lower layer that is associated with the input. When movement occurs, such as moving a finger, then the lower layer changes to the expected new location, which causes a prediction of the next input in the upper layer.

If the original input is ambiguous, such as the coffee shop, then the network activates multiple locations in the lower layer—for example, all the locations where a coffee shop exists. This is what happens if you touch the rim of a coffee cup with one finger. Many objects have a rim, so you can’t at first be certain what object you are touching. When you move, the lower layer changes all the possible locations, which then make multiple predictions in the upper layer. The next input will eliminate any locations that don’t match.
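The flow just described (sense, activate candidate locations, move, predict, eliminate) can be sketched as a toy Python class. This is not the simulation from the paper: sets and dictionaries stand in for neural activity, locations are plain (x, y) offsets on an object, and the class name ColumnSketch, the bowl, and the specific features are all assumptions made for the illustration.

```python
class ColumnSketch:
    """Toy stand-in for the two-layer circuit; sets replace neural activity."""

    def __init__(self):
        self.feature_at = {}      # (object, location) -> feature   (upward arrow)
        self.locations_of = {}    # feature -> {(object, location)} (downward arrow)
        self.candidates = set()   # lower layer: locations currently active

    def learn(self, obj, location, feature):
        """Store the two-way association between a location and a feature."""
        self.feature_at[(obj, location)] = feature
        self.locations_of.setdefault(feature, set()).add((obj, location))

    def sense(self, feature):
        """A sensory input activates, or narrows down, the candidate locations."""
        matches = self.locations_of.get(feature, set())
        if self.candidates:
            self.candidates &= matches      # drop candidates that do not match
        else:
            self.candidates = set(matches)  # no candidates yet: activate all matches
        return self.candidates

    def move(self, dx, dy):
        """Movement shifts every candidate location by the same amount."""
        self.candidates = {
            (obj, (x + dx, y + dy)) for obj, (x, y) in self.candidates
        }

    def predict(self):
        """Each candidate location predicts the feature stored there, if any."""
        return {self.feature_at.get(c) for c in self.candidates} - {None}


column = ColumnSketch()
# Hypothetical objects: location (0, 0) is the rim, (0, -1) is just below it.
column.learn("cup", (0, 0), "rim")
column.learn("cup", (0, -1), "handle")
column.learn("bowl", (0, 0), "rim")
column.learn("bowl", (0, -1), "smooth curve")

column.sense("rim")                # ambiguous: could be the cup or the bowl
column.move(0, -1)                 # the finger slides down
print(sorted(column.predict()))    # ['handle', 'smooth curve']
print(column.sense("handle"))      # {('cup', (0, -1))}
```

In this sketch, once only one candidate remains, the object and the finger's location on it are known together, because each candidate is a pair of object and location.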

We simulated this two-layer circuit in software using realistic assumptions for the number of neurons in each layer. Our simulations showed that not only can individual cortical columns learn models of objects, but each column can learn hundreds of them. The neural mechanism and simulations are described in our 2019 paper “Locations in the Neocortex: A Theory of Sensorimotor Object Recognition Using Cortical Grid Cells.”

Orientation

There are other things a cortical column must do to learn models of objects. For example, there needs to be a representation of orientation. Say you know what town you are in and you know your location in that town. Now I ask you, “What will you see if you walk forward one block?” You would reply, “Which direction am I walking?” Knowing your location is not sufficient to predict what you will see when you walk; you also need to know which way you are facing, your orientation. Orientation is also required to predict what you will see from a particular location. For example, standing on a street corner, you might see a library when you face north and a playground when you face south.
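The street-corner example can be made concrete with a small Python sketch in which a prediction is keyed by both location and orientation. The names COMPASS, VIEWS, turn, and predict_view are hypothetical, and restricting orientation to four compass headings is a simplification made only for this example.

```python
COMPASS = ["north", "east", "south", "west"]   # clockwise order

# What is visible from one street corner, keyed by (location, heading).
VIEWS = {
    ("corner", "north"): "library",
    ("corner", "south"): "playground",
}


def turn(heading, quarter_turns):
    """Rotate clockwise by quarter turns (negative means counterclockwise)."""
    return COMPASS[(COMPASS.index(heading) + quarter_turns) % 4]


def predict_view(location, heading):
    """The same location yields different predictions for different headings."""
    return VIEWS.get((location, heading))


heading = "north"
print(predict_view("corner", heading))   # library
heading = turn(heading, 2)               # turn around without moving
print(predict_view("corner", heading))   # playground
```

The location never changes in this example; only the heading does, and that alone is enough to change what is predicted.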

There are neurons in the old brain called head direction cells. As their name suggests, these cells represent the direction an animal’s head is facing. Head direction cells act like a compass, but they aren’t tied to magnetic north. They are aligned to a room or environment. If you stand in a familiar room and then close your eyes, you retain a sense of which way you are facing. If you turn your body, while keeping your eyes closed, your sense of direction changes. This sense is created by your head direction cells. When you rotate your body, your head direction cells change to reflect your new orientation in the room.

Cortical columns must have cells that perform an equivalent function to head direction cells. We refer to them by the more generic term orientation cells. Imagine you are touching the lip of the coffee cup with your index finger. The actual impression on the finger depends on the orientation of the finger. You can, for example, keep your finger in the same location but rotate it around the point of contact. As you do, the sensation of the finger changes. Therefore, in order to predict its input, a cortical column must have a representation of orientation. For simplicity, I didn’t show orientation cells and other details in the above diagram of a cortical column.

To summarize, we proposed that every cortical column learns models of objects. The columns do this using the same basic method that the old brain uses to learn models of environments. Therefore, we proposed that each cortical column has a set of cells equivalent to grid cells, another set equivalent to place cells, and another set equivalent to head direction cells, all of which were first discovered in parts of the old brain. We came to our hypothesis by logical deduction. In Chapter 7, I will list the growing experimental evidence that supports our proposal.

But first, we are going to turn our attention to the neocortex as a whole. Recall that each cortical column is small, about the width of a piece of thin spaghetti, and the neocortex is large, about the size of a dinner napkin. Therefore, there are about 150,000 columns in a human neocortex. Not all of the cortical columns are modeling objects. What the rest of the columns are doing is the topic of the next chapter.