Psychology: Essential Thinkers, Classic Theories, and How They Inform Your World - Andrea Bonior 2016
B. F. Skinner
BORN 1904, Susquehanna, Pennsylvania
DIED 1990, Cambridge, Massachusetts
Educated at Hamilton College and Harvard University
Further pushing the “radical” in radical behaviorism was B. F. Skinner. He believed that nothing about an organism’s individual physiology could explain behavior; our actions are solely responses to environmental stimuli. To Skinner, genetics are important only in the sense that they preprogram us to be able to learn behaviors.
Skinner made the leap from Ivan Pavlov’s classical conditioning to what he himself called operant conditioning; according to Skinner, Pavlov’s conditioning prepares a dog for food, but his own conditioning actually spurs the dog to get the food. Using his own version of a puzzle box, which came to be known as the Skinner box, he refined and standardized experimental protocols and showed the world that lever-pressing behavior is just as tangible and measurable as the salivation reflex. Thus a new round of research was born.
Skinner used the term operant conditioning because it refers to behaviors that operate on the environment in order to produce consequences. He argued that it is those consequences that determine whether the behavior will ever be exhibited again. Of course, this idea is similar to Edward Thorndike’s connectionism.
But Skinner’s most groundbreaking contributions were what he called schedules of reinforcement and the proof of their crucial role in determining the strength of conditioning. He discovered that the patterns of the pairings between responses and rewards can greatly affect how strong the connections are. So you can train an animal to press a lever for food even if food is not presented every single time the lever is pressed. In fact, there are certain schedules of food presentation that will make the animal more likely to press the lever and to keep doing so.
Skinner, along with the behavioral psychologist Charles Ferster, proposed three main categories of reinforcement schedules:
• Continuous reinforcement is optimal for beginning to establish an association between a behavior and a reward. In this situation, every time the behavior is performed, a reinforcer, or motivating reward, is given.
• An interval schedule of reinforcement delivers the reinforcer after a certain amount of time has passed, no matter how many times the behavior has been performed.
• A ratio schedule requires the behavior to be repeated a certain number of times, no matter how long that takes, before the reinforcer is given.
There are also two more delineators of reinforcement schedules:
• A fixed schedule of reinforcement is one in which the reinforcer is given methodically and predictably.
• A variable schedule of reinforcement will, as its name implies, vary over time and is not as predictable.
When you work for a salary and your paycheck comes every two weeks, whether you’ve taken a day off or not, that schedule of reinforcement is a fixed interval schedule. It’s a fixed amount of time that brings the paycheck. But let’s say instead that your job pays you a certain bonus after every third sale you make, with no base salary at all. That is a fixed ratio schedule because you get paid only when you exhibit the behavior of making a sale three times. As a wage earner, you would find a variable interval schedule of reinforcement much more frustrating. Your paycheck would come after a particular period of time, but that time period could vary widely and unpredictably from one paycheck to the next. A variable ratio schedule would also be disheartening because you would get bonuses after a certain number of times you performed certain behaviors at work, but the number of work behaviors needed in order for you to be paid would vary from bonus to bonus.
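The four combinations described above can be sketched as a toy simulation. This is a minimal illustrative example, not from the book; the function names, tick-based timing, and schedule parameters are all assumptions chosen to mirror the paycheck and sales-bonus examples:

```python
import random

# Illustrative sketch (not from the book): a "subject" responds at random
# ticks, and each schedule decides whether a given response earns a reward.

def simulate(schedule, ticks=1000, respond_prob=0.5, seed=42):
    """Subject responds randomly each tick; return (responses, reinforcers)."""
    rng = random.Random(seed)
    responses = reinforcers = 0
    for t in range(ticks):
        if rng.random() < respond_prob:      # the subject presses the lever
            responses += 1
            if schedule(t):                  # does this press pay off?
                reinforcers += 1
    return responses, reinforcers

def fixed_ratio(n):
    """Every n-th response is reinforced (the 'bonus on every third sale')."""
    state = {"count": 0}
    def pays(t):
        state["count"] += 1
        if state["count"] >= n:
            state["count"] = 0
            return True
        return False
    return pays

def variable_ratio(mean_n, seed=1):
    """Reinforce after a random number of responses averaging mean_n."""
    rng = random.Random(seed)
    state = {"left": rng.randint(1, 2 * mean_n - 1)}
    def pays(t):
        state["left"] -= 1
        if state["left"] <= 0:
            state["left"] = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return pays

def fixed_interval(period):
    """First response after each `period` ticks is reinforced (the paycheck)."""
    state = {"ready_at": period}
    def pays(t):
        if t >= state["ready_at"]:
            state["ready_at"] = t + period
            return True
        return False
    return pays

def variable_interval(mean_period, seed=2):
    """Like fixed_interval, but the wait varies unpredictably around a mean."""
    rng = random.Random(seed)
    state = {"ready_at": rng.randint(1, 2 * mean_period - 1)}
    def pays(t):
        if t >= state["ready_at"]:
            state["ready_at"] = t + rng.randint(1, 2 * mean_period - 1)
            return True
        return False
    return pays

for name, sched in [("fixed ratio (3)", fixed_ratio(3)),
                    ("variable ratio (~3)", variable_ratio(3)),
                    ("fixed interval (10)", fixed_interval(10)),
                    ("variable interval (~10)", variable_interval(10))]:
    r, k = simulate(sched)
    print(f"{name:24s} responses={r:4d} reinforcers={k:4d}")
```

Note the structural difference the sketch makes visible: the ratio schedules count responses and ignore the clock, while the interval schedules watch the clock and ignore how many responses pile up in between.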
Skinner found that, on the whole, ratio schedules induce behavior at higher rates because they facilitate connections between behavior and reinforcers. You see fairly quickly and clearly how performing a certain action causes the reinforcer to be given. And variable schedules tend to create behavior that is harder to extinguish, or get rid of. That’s because, in those situations, you never know exactly when the reward might be right around the corner, so you may think: Why quit now?
This is a good time to point out that the term negative reinforcement does not mean what most people think it means; it actually refers to the removal of a stimulus rather than to an unpleasant consequence. So while many people think of positive reinforcement as a reward and negative reinforcement as a punishment, this isn’t really the case. In conditioning models, “positive” versus “negative” simply refers to the presence or absence of a stimulus. Punishment, of course, involves consequences that make a behavior less likely to be repeated, whereas reinforcement makes a behavior more likely to be repeated. Both punishment and reinforcement can be positive or negative, depending on whether a stimulus is added or removed.
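The two-by-two grid described in this passage can be written out as a small lookup table. This is an illustrative sketch, not from the book; the names are my own:

```python
# The four operant-conditioning cells, keyed by two yes/no questions:
# is a stimulus added (True) or removed (False), and does the behavior
# become more likely (reinforcement) or less likely (punishment)?
OPERANT = {
    (True,  True):  "positive reinforcement (add a pleasant stimulus)",
    (False, True):  "negative reinforcement (remove an unpleasant stimulus)",
    (True,  False): "positive punishment (add an unpleasant stimulus)",
    (False, False): "negative punishment (remove a pleasant stimulus)",
}

def classify(stimulus_added, behavior_increases):
    """Name the cell of the grid for a given pair of answers."""
    return OPERANT[(stimulus_added, behavior_increases)]

# A seatbelt alarm that stops when you buckle up: stimulus removed,
# buckling becomes more likely.
print(classify(False, True))
```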
Skinner also showed that shaping can happen: If you reinforce someone’s attempts that are in the ballpark of a desired behavior and try to guide those attempts accordingly, you can get them closer and closer to the response you want. Perhaps most famously, Skinner showed this with piano-playing cats, Ping-Pong-playing birds, and vacuum-cleaning pigs. He even partnered with the US military to train pigeons to guide missiles and saw some success, though funding for that project eventually lapsed.
Later in his career, Skinner combined his behaviorist principles with utopian ideas, writing the novel Walden Two, a reference to Walden, Henry David Thoreau’s treatise on living in natural and simplified surroundings and relying solely on oneself. In Skinner’s novelistic version of utopia, a group of people forms a community where each person can engage in work, hobbies, and the arts. The behaviorist catch, of course, is that no one in the community is thought to have free will or really much true freedom at all, as the community’s members act in accordance with what the communal environment reinforces within them. Though the title of Skinner’s additional book on the same theme, Beyond Freedom and Dignity, certainly sounds anything but utopian, its premise is that we should all attempt, as much as possible, to maximize the societal benefits of our behaviors.
Finally, Skinner argued that language is not qualitatively different from any other behavior; it is learned through repetition and reinforcement, just like anything else. This came to be another aspect of radical behaviorism, eventually bringing great controversy and dissent.
Skinner consolidated and revolutionized behaviorism into a force that shaped mid-20th-century thought. His reinforcement schedules and operant conditioning models have influenced the development of psychological treatments for disorders as varied as obsessive-compulsive disorder and substance abuse. Skinner’s behaviorist account of language also went head to head with Noam Chomsky’s nativist views in a highly publicized intellectual clash, sparked by Chomsky’s scathing review of Skinner’s book Verbal Behavior, in which the two men’s opposing theories of language development were laid out.
WHAT ABOUT ME?
Skinner’s schedules of reinforcement can be seen to underlie our behavior and motivation, in everything from work life to child discipline to gambling.
Let’s start with a slot machine. You probably know that playing this one-armed bandit is, overall, a statistically losing proposition—casinos aren’t so shiny and huge for nothing. And yet many of us continue to put our money into a slot machine, over and over again, even when we lose every time. Why is this?
The answer is Skinner’s variable reinforcement schedule. The hardest schedule on which to extinguish a behavior, it is the reason that you are still tempted with each trip to Vegas, and it’s the cause of the difficulty you have in finally knowing when to fold ’em. You know that a slot machine pays sometimes, and sometimes it pays quite big. And whether a slot machine will pay is determined not by what time it is, but by how many times it’s been spun. Oh, how difficult it can be to walk away when you know that just one more spin could send the whole thing exploding with bells, lights, whistles, and a big wad of cash!
Variable reinforcement schedules are also part of why so many of us are so addicted to our smartphones. After all, if you keep scrolling, checking, and refreshing enough, you will sometimes find something quite interesting. Perhaps you’ve had the unsettling realization that you grab your phone almost automatically, feeling somewhat uneasy without it. This is because you feel the need to engage in the behavior of checking for updates or, at the very least, listening for the “ding” that indicates them. It’s the variable ratio schedule—some amount of checking your phone will pay off, but you’re never sure what that amount will be. You know that the “ding” could just be spam from a restaurant whose mailing list you signed up for in order to get a free appetizer. But it also could be news from an old friend, a “job well done” acknowledgment from your boss, or a funny story from your sister. And so you continue, with your behavior proving particularly hard to extinguish.
Let’s think about the motivation to work. Have you ever started a blog, only to have it gradually peter out after a few months, despite your best intentions? You may have cursed yourself for being so undisciplined; after all, you enjoy blogging more than going to your job, and perhaps you even had hopes of eventually developing an empire that would allow you to quit work altogether and be supported as a full-time blogger. And yet you just couldn’t seem to stick to a schedule. That is likely because writing a blog post was not paired with any tangible reinforcement. You love the overall blogging concept, and you know it will take time to build an audience, but there are no consequences for any individual post, and there’s no direct reinforcer for writing one. You don’t have a schedule of reinforcement at all except for the vague good feeling you get from writing. But when only your Aunt Edna appears to have read your posts, that’s not quite enough to keep you going. Your day job, on the other hand, gives you a paycheck every two weeks without fail, and if you were to suddenly stop showing up for that, the paycheck would eventually become a no-show as well. So the behavior of your going to work will likely continue to be much more reliable than the behavior of your writing posts for your blog.
Childrearing is, for many, the hot spot of behavioral reinforcement. Never does it seem more crucial to create motivation for specific behaviors than when you’re trying to convince a tiny human to get up off the floor at Target and stop screaming. A time-out is a classic negative punishment—again, not because it actively presents an unpleasant stimulus per se, but because it takes away positive stimuli. There is an absence of reward: no playing, no parental attention, and no ability to move around and be stimulated during the allotted period of the time-out. For many kids, of course, there is certainly the presence of unpleasant stimuli—the tear-inducing sound of their siblings having fun when they are not. But you put the child in a time-out because you are hoping to extinguish the behavior you saw and don’t want to see recur, and because you want to break whatever association it has with any kind of reward.
Sticker charts, clicker training for dogs, and token economies are all inherently Skinnerian concepts. In fact, we have come to think of them as such fundamental tools for motivation that we may no longer even think about the psychological mechanics that underlie them, or about the exact connections that we are trying to create between our behaviors and the ways they are reinforced.