3.1 Learning - Learning and Memory

MCAT Behavioral Sciences Review - Kaplan Test Prep 2021–2022



After Chapter 3.1, you will be able to:

· Apply principles of habituation, dishabituation, and sensitization to real-life scenarios

· Identify the conditioned stimulus, unconditioned stimulus, conditioned response, and unconditioned response in a Pavlovian learning paradigm

· Distinguish between negative reinforcement, positive reinforcement, negative punishment, and positive punishment

· Predict how a reinforcement schedule will affect the relative frequency of a behavioral response in an operant conditioning scenario


To a psychologist, learning refers specifically to the way in which we acquire new behaviors. To understand learning, we must start with the concept of a stimulus. A stimulus can be defined as anything to which an organism can respond, including all of the sensory inputs we discussed in Chapter 2 of MCAT Behavioral Sciences Review. The combination of stimuli and responses serves as the basis for all behavioral learning.

Responses to stimuli can change over time depending on the frequency and intensity of the stimulus. For instance, repeated exposure to the same stimulus can cause a decrease in response called habituation. This is seen in many first-year medical students: students often have an intense physical reaction the first time they see a cadaver or treat a severe laceration, but as they get used to these stimuli, the reaction lessens until they are unbothered by these sights. Note that a stimulus too weak to elicit a response is called a subthreshold stimulus.

The opposite process can also occur. Dishabituation is defined as the recovery of a response to a stimulus after habituation has occurred. Dishabituation is often noted when, late in the habituation of a stimulus, a second stimulus is presented. The second stimulus interrupts the habituation process and thereby causes an increase in response to the original stimulus. Imagine, for example, that you’re taking a long car trip and driving for many miles on a highway. After a while, your brain will get used to the sights, sounds, and sensations of highway driving: the dashed lines dividing the lanes, the sound of the engine and the tires on the road, and so on. Habituation has occurred. At some point you use an exit ramp, and these sensations change. As you merge onto the new highway, you pay more attention to the sensory stimuli coming in. Even if the stimuli are more or less the same as on the previous highway, the presentation of a different stimulus (using the exit ramp) causes dishabituation and a new awareness of—and response to—these stimuli. Dishabituation is temporary and always refers to changes in response to the original stimulus, not the new one.

Key Concept

Dishabituation is the recovery of a response to a stimulus, usually after a different stimulus has been presented. Note that the term refers to changes in response to the original stimulus, not the new one.

Learning, then, is a change in behavior that occurs in response to a stimulus. While there are many types of learning, the MCAT focuses on two types: associative learning and observational learning.


Associative Learning

Associative learning is the creation of a pairing, or association, either between two stimuli or between a behavior and a consequence. On the MCAT, you’ll be tested on two kinds of associative learning: classical and operant conditioning.

Classical Conditioning

Classical conditioning is a type of associative learning that takes advantage of biological, instinctual responses to create associations between two unrelated stimuli. For many people, the first name that comes to mind for research in classical conditioning is Ivan Pavlov. His experiments on dogs were not only revolutionary, but also provide a template for the way the MCAT will test classical conditioning.

Classical conditioning works, first and foremost, because some stimuli cause an innate or reflexive physiological response. For example, we reflexively salivate when we smell bread baking in an oven, or we may jump or recoil when we hear a loud noise. Any stimulus that brings about such a reflexive response is called an unconditioned stimulus, and the innate or reflexive response is called an unconditioned response. Many stimuli do not produce a reflexive response and are known as neutral stimuli.

In Pavlov’s experiment, the unconditioned stimulus was meat, which would cause the dogs to salivate reflexively, and the neutral stimulus was a ringing bell. Through the course of the experiment, Pavlov repeatedly rang the bell before placing meat in the dogs’ mouths. Initially, the dogs did not react much when they only heard the bell ring without receiving meat. However, after this procedure was repeated several times, the dogs began to salivate when they heard the bell ring. In fact, the dogs would salivate even if Pavlov only rang the bell and did not deliver any meat. Pavlov thereby turned a neutral stimulus into a conditioned stimulus: a normally neutral stimulus that, through association, now causes a reflexive response called a conditioned response. The process of using a reflexive, unconditioned stimulus to turn a neutral stimulus into a conditioned stimulus is termed acquisition, as shown in Figure 3.1.

Figure 3.1. Acquisition in Classical Conditioning. UCS = unconditioned stimulus, UCR = unconditioned response, CS = conditioned stimulus, CR = conditioned response.

Notice that the stimuli change in this experiment, but the response is the same throughout. Because salivation in response to food is natural and requires no conditioning, it is an unconditioned response in this context. On the other hand, when paired with the conditioned stimulus of the bell, salivation is considered a conditioned response.

MCAT Expertise

On the MCAT, the key to telling conditioned and unconditioned responses apart will be to look at which stimulus is causing them: unconditioned stimuli cause an unconditioned response, while conditioned stimuli cause a conditioned response.

However, it is important to recognize that just because a conditioned response has been acquired, that does not mean the conditioned response is permanent. Extinction refers to the loss of a conditioned response and can occur if the conditioned stimulus is repeatedly presented without the unconditioned stimulus. Applying this concept to the Pavlov example, if the bell rings often enough without the dog getting meat, the dog may stop salivating when the bell sounds. Interestingly, this extinction of a response is not always permanent; after some time, presenting subjects again with an extinguished conditioned stimulus will sometimes produce a weak conditioned response, a phenomenon called spontaneous recovery.

There are a few processes that can modify the response to a conditioned stimulus after acquisition has occurred. Generalization is a broadening effect by which a stimulus similar enough to the conditioned stimulus can also produce the conditioned response. In one famous experiment, researchers conditioned a child called Little Albert to be afraid of a white rat by pairing the presentation of the rat with a loud noise. Subsequent tests showed that Little Albert’s conditioning had generalized such that he also exhibited a fear response to a white stuffed rabbit, a white sealskin coat, and even a man with a white beard.

Finally, in stimulus discrimination (sometimes referred to as just discrimination), an organism learns to distinguish between similar stimuli. Discrimination is the opposite of generalization. Pavlov’s dogs could have been conditioned to discriminate between bells of different tones by having one tone paired with meat and another tone presented without meat. In this case, association could have occurred with one tone but not the other.

MCAT Expertise

Classical conditioning is a favorite topic on the MCAT. Expect at least one question to describe a Pavlovian experiment and ask you to identify the role of one of the stimuli or responses described.

Operant Conditioning

Whereas classical conditioning is concerned with instincts and biological responses, the study of operant conditioning examines the ways in which consequences of voluntary behaviors change the frequency of those behaviors. Just as the MCAT will test you on the difference between conditioned and unconditioned responses and stimuli, it will ask you to distinguish between reinforcement and punishment too. Operant conditioning is associated with B. F. Skinner, who is considered the father of behaviorism, the theory that all behaviors are conditioned. The four possible relationships between stimulus and behavior are summarized in Figure 3.2.

Figure 3.2. Terminology of Operant Conditioning


Almost all animals will innately search for resources in their environment. These reward-seeking behaviors, such as foraging and approach behaviors, are modified over time as the animal interacts with various stimuli and adjusts its behaviors accordingly. Reinforcement is the process of increasing the likelihood that an animal will perform a behavior. Reinforcers are divided into two categories. Positive reinforcers increase the frequency of a behavior by adding a positive consequence or incentive following the desired behavior. Money is an example of a common and strong positive reinforcer: employees will continue to work if they are paid. Negative reinforcers act similarly in that they increase the frequency of a behavior, but they do so by removing something unpleasant. For example, taking an aspirin reduces a headache, so the next time you have a headache, you are more likely to take one. Negative reinforcement is often confused with punishment, which will be discussed in the next section, but remember that the frequency of the behavior is the distinguishing factor: any reinforcement—positive or negative—increases the likelihood that a behavior will be performed.

Real World

This concept of learning by consequence forms the foundation for behavioral therapies for many disorders, including phobias, anxiety disorders, and obsessive-compulsive disorder.

Negative reinforcement can be subdivided into escape learning and avoidance learning, which differ in whether the unpleasant stimulus occurs or not. Escape learning describes a situation where the animal experiences the unpleasant stimulus and, in response, displays the desired behavior in order to trigger the removal of the stimulus. So, in this type of learning, the desired behavior is used to escape the stimulus. In contrast, avoidance learning occurs when the animal displays the desired behavior in anticipation of the unpleasant stimulus, thereby avoiding the unpleasant stimulus.

Avoidance learning often develops from multiple experiences of escape learning. An example of this progression from escape learning to avoidance learning is the seat belt warning in a car. If a driver begins driving without buckling her seat belt, then the car will produce an annoying beeping noise, which only ends when the seat belt is buckled. In this example, the desired behavior is to buckle the seat belt. This behavior is reinforced by the removal of an unpleasant stimulus (the audible beeping), so this type of learning is negative reinforcement. More specifically, this example illustrates escape learning, since the driver first experiences the unpleasant stimulus and then exhibits the desired behavior in order to escape it. However, after forgetting to buckle her seat belt several times, the driver will eventually learn to preemptively buckle up before driving the car in order to avoid the beeping sound. At that point, the escape learning has progressed to avoidance learning. Finally, this example dispels a common misconception about the term negative reinforcement: buckling one's seat belt is generally considered a "positive" behavior, in that it protects one's health. Nevertheless, the terms "positive" and "negative" in operant conditioning refer only to the addition or removal of a stimulus. So even though buckling up is a "good" thing, this example illustrates several types of negative reinforcement!

Classical and operant conditioning can be used hand-in-hand. For example, dolphin trainers take advantage of reinforcers when training dolphins to perform tricks. Sometimes, the trainers will feed the dolphin a fish after it performs a trick. The fish can be said to be a primary reinforcer because the fish is a treat that the dolphin responds to naturally. Dolphin trainers also use tiny handheld devices that emit a clicking sound. This clicker would not normally be a reinforcer on its own, but the trainers use classical conditioning to pair the clicker with fish to elicit the same response. The clicker is thus a conditioned reinforcer, which is sometimes called a secondary reinforcer. Eventually, the dolphin may even associate the presence of the trainer with the possibility of reward, making the presence of the trainer a discriminative stimulus. A discriminative stimulus indicates that reward is potentially available in an operant conditioning paradigm.


In contrast to reinforcement, punishment uses conditioning to reduce the occurrence of a behavior. Positive punishment adds an unpleasant consequence in response to a behavior to reduce that behavior; for example, in some countries a thief may be flogged for stealing, which is intended to stop him from stealing again. Because positive punishment involves using something unpleasant to discourage a behavior, it is sometimes referred to as aversive conditioning. By contrast, negative punishment is removing a stimulus in order to cause reduction of a behavior. For example, a parent may forbid her child from watching television as a consequence for bad behavior, with the goal of preventing the behavior from happening again.

Key Concept

Negative reinforcement is often confused with positive punishment. Negative reinforcement is the removal of a bothersome stimulus to encourage a behavior; positive punishment is the addition of a bothersome stimulus to reduce a behavior.


Sociological institutions often rely on punishments and rewards to adjust behavior. Within a society, formal sanctions, or rules and laws, can be used to reinforce or punish behavior. Likewise, informal sanctions, such as ostracization, praise, and shunning, can be used to reinforce or punish social behavior without depending on rules established by social institutions. Socialization and social institutions are discussed in Chapters 8 and 11 of MCAT Behavioral Sciences Review, respectively.

Reinforcement Schedules

The presence or absence of reinforcing or punishing stimuli is just a part of the story. The rate at which desired behaviors are acquired is also affected by the reinforcement schedule being used to deliver the stimuli. There are two key factors to reinforcement schedules: whether the schedule is fixed or variable, and whether the schedule is based on a ratio or an interval.

· Fixed-ratio (FR) schedules reinforce a behavior after a specific number of performances of that behavior. For example, in a typical operant conditioning experiment, researchers might reward a rat with a food pellet every third time it presses a bar in its cage. Continuous reinforcement is a fixed-ratio schedule in which the behavior is rewarded every time it is performed.

· Variable-ratio (VR) schedules reinforce a behavior after a varying number of performances of the behavior, but such that the average number of performances to receive a reward is relatively constant. With this type of reinforcement schedule, researchers might reward a rat first after two button presses, then eight, then four, then finally six.

· Fixed-interval (FI) schedules reinforce the first instance of a behavior after a specified time period has elapsed. For example, once our rat gets a pellet, it has to wait 60 seconds before it can get another pellet. Presses during those 60 seconds accomplish nothing, but the first lever press after the 60 seconds have elapsed earns a pellet.

· Variable-interval (VI) schedules reinforce a behavior the first time that behavior is performed after a varying interval of time. Instead of waiting exactly 60 seconds, for example, our rat might have to wait 90 seconds, then 30 seconds, then three minutes. In each case, once the interval elapses, the next press gets the rat a pellet.
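Because each schedule is really just a rule for deciding whether a given response earns a reward, the distinctions can be made concrete in code. The following is an illustrative sketch (not from the Kaplan text); the function names and the closure-based design are my own, and the variable-interval case is analogous to fixed-interval with a changing wait time, so it is omitted for brevity.

```python
import random

def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True   # pellet delivered
        return False
    return respond

def variable_ratio(mean):
    """VR-mean: reinforce after a varying number of responses
    whose average is roughly `mean`."""
    count = 0
    target = random.randint(1, 2 * mean - 1)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = random.randint(1, 2 * mean - 1)
            return True
        return False
    return respond

def fixed_interval(seconds):
    """FI: reinforce the first response after `seconds` have elapsed
    since the last reward. (A VI schedule would re-draw the wait time
    after each reward.)"""
    last_reward = 0.0
    def respond(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False
    return respond

# FR-3: every third bar press earns a pellet
fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])       # [False, False, True, False, False, True]

# FI-60: presses at t = 10 s and t = 45 s earn nothing; the press at t = 61 s does
fi60 = fixed_interval(60)
print([fi60(t) for t in (10, 45, 61)])  # [False, False, True]
```

Note how only the ratio schedules count responses, while the interval schedules ignore everything but the clock; this is why extra presses during an FI interval "accomplish nothing."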

Of these schedules, variable-ratio works the fastest for learning a new behavior, and is also the most resistant to extinction. The effectiveness of the various reinforcement schedules is demonstrated in Figure 3.3.

Figure 3.3. Reinforcement Schedules. Hatches correspond to instances of reinforcement. The start of each line corresponds to time zero for that schedule.

There are a few things to note in this graph. First, variable-ratio schedules have the fastest response rate: the rat will continue pressing the bar quickly in the hope that the next press will be the “right one.” Also note that fixed schedules (fixed-ratio and fixed-interval) often show a brief pause in responding just after the behavior is reinforced: once the rat has figured out what behavior is necessary to receive the pellet, it will stop hitting the lever until it wants another one.


Mnemonic

VR stands for Variable-Ratio, but it can also stand for Very Rapid and Very Resistant to extinction.

Real World

Gambling (and gambling addiction) is so difficult to extinguish because most gambling games are based on variable-ratio schedules. While the probability of winning the jackpot on any individual pull of a slot machine is the same, we get caught in the idea that the next pull will be the “right one.”

One final idea associated with operant conditioning is the concept of shaping, which is the process of rewarding increasingly specific behaviors that become closer to a desired response. For example, if you wanted to train a bird to spin around in place and then peck a key on a keyboard, you might first give the bird a treat for turning slightly to the left, then only for turning a full 90 degrees, then 180, and so on, until the bird has learned to spin around completely. Then you might only reward this behavior if done near the keyboard until eventually the bird is only rewarded once the full set of behaviors is performed. While it may take some time, the use of shaping in operant conditioning can allow for the training of extremely complicated behaviors.

Cognitive and Biological Factors in Associative Learning

It would be incorrect to say that classical and operant conditioning are the only factors that affect behavior, or that we are all mindless and robotic, unable to resist the rewards and punishments that occur in our lives. Since Skinner’s initial work, it has been found that many cognitive and biological factors can change the effects of associative learning or allow us to resist them altogether.

Many organisms undergo latent learning, which is learning that occurs without a reward but that is spontaneously demonstrated once a reward is introduced. The classic experiment associated with latent learning involves rats running a maze. Rats that were simply carried through the maze and then incentivized with a food reward for completing the maze on their own performed just as well as—and in some cases better than—rats that had been trained to run the maze using more standard operant conditioning techniques, in which they were rewarded along the way.

Problem solving is another method of learning that steps outside the standard behaviorist approach. Think of the way young children put together a jigsaw puzzle: often, they will take pieces one-by-one and try to make them fit together until they find the correct match. Many animals will also use this kind of trial-and-error approach, testing behaviors until they yield a reward. As we get older, we gain the ability to analyze the situation and respond correctly the first time, as when we seek out the correct puzzle piece and orientation based on the picture we are forming. Humans and chimpanzees alike will often avoid trial-and-error learning and instead take a step back, observe the situation, and take decisive action to solve the challenges they face.

Not all behaviors can be taught using operant conditioning techniques. Many animals are predisposed to learn (or not learn) behaviors based on their own natural abilities and instincts. Animals are most able to learn behaviors that coincide with their natural behaviors: birds naturally peck when searching for food, so rewarding them with food in response to a pecking-based behavior works well. This predisposition is known as preparedness. Similarly, it can be very difficult to teach animals behaviors that work against their natural instincts. When animals revert to an instinctive behavior after learning a new behavior that is similar, the animal has undergone instinctive (or instinctual) drift. For example, researchers used behavioral techniques to train raccoons to place coins in a piggy bank. Their efforts were ultimately unsuccessful as the learned behaviors were only temporary. Eventually, rather than placing the coins in the bank, the raccoons would pick up the coins, rub them together, and dip them into the bank before pulling them back out. The researchers concluded that the task they were trying to train the raccoons to perform was conflicting with their natural food-gathering instinct, which was to rub seeds together and wash them in a stream to clean them before eating. The researchers had far better luck training the raccoons to place a ball in a basketball net, as the ball was too large to trigger the food-washing instinct.


Observational Learning

Observational learning is the process of learning a new behavior or gaining information by watching others. The most famous and perhaps most controversial study of observational learning is Albert Bandura’s Bobo doll experiment, in which children watched an adult in a room full of toys punching and kicking an inflatable clown doll. When the children were later allowed to play in the room, many of them ignored the other toys and inflicted similar violence on the Bobo doll, just as they had seen the adult do. It is important to note that observational learning is not simply imitation, because observational learning can also teach individuals to avoid a behavior. In later iterations of the Bobo doll experiment, children who watched the adult get scolded after attacking the Bobo doll were less likely to be aggressive toward the doll themselves.

Real World

The connection between violent video games and aggressive behavior is still under active debate. While there are many interest groups on both sides of the controversy, the American Academy of Pediatrics (a major medical society) published one report in which they attributed a 13 to 22% increase in aggressive behavior to observational learning from video games.

As with associative learning, a few neurological factors affect observational learning. The most important of these are mirror neurons. These neurons are located in the frontal and parietal lobes of the cerebral cortex and fire both when an individual performs an action and when that individual observes someone else performing the same action. Mirror neurons are largely involved in motor processes, but they are also thought to be related to empathy and vicarious emotions; some mirror neurons fire both when we experience an emotion and when we observe another person experiencing the same emotion. Mirror neurons also play a role in imitative learning in a number of primates, as shown in Figure 3.4.

Figure 3.4. Use of Mirror Neurons in a Macaque. Many neonatal primates imitate facial expressions using mirror neurons.

Research suggests that observational learning through modeling is an important factor in determining an individual’s behavior throughout his or her lifetime. People learn what behaviors are acceptable by watching others perform them. Much attention is focused on violent media or domestic abuse as models for antisocial behavior, but prosocial modeling can be just as powerful. Of course, observational learning is strongest when a model’s words are consistent with his or her actions. Many parents adopt a “Do as I say, not as I do” approach when teaching their children, but research suggests that children will disproportionately imitate what the model did rather than what the model said.

MCAT Concept Check 3.1:

Before you move on, assess your understanding of the material with these questions.

1. Which of the following might cause a person to eat more food during a meal: eating each course separately and moving to the next only when finished with the current course, or interrupting the main course several times by eating side dishes?

2. A college student plays a prank on his roommate by popping a balloon behind the roommate’s head after every time he makes popcorn. Before long, the smell of popcorn makes the roommate nervous. Which part of the story corresponds to each of the classical conditioning concepts below?

o Conditioned stimulus:

o Unconditioned stimulus:

o Conditioned response:

o Unconditioned response:

3. What is the difference between negative reinforcement and positive punishment? Provide an example of each.

o Negative reinforcement:

o Positive punishment: