Playing A Slot Machine Is Reinforced On A Schedule

Learning Objectives

  • Distinguish between reinforcement schedules

Reinforcement is a term in psychology for the process of strengthening a directly measurable dimension of behavior, such as rate (e.g., pulling a lever more frequently), duration (e.g., pulling a lever for longer periods of time), magnitude (e.g., pulling a lever with greater force), or latency (e.g., pulling a lever more quickly after a signal appears).

Remember, the best way to teach a person or animal a behavior is to use positive reinforcement. For example, Skinner used positive reinforcement to teach rats to press a lever in a Skinner box. At first, the rat might randomly hit the lever while exploring the box, and out would come a pellet of food. After eating the pellet, what do you think the hungry rat did next? It hit the lever again, and received another pellet of food. Each time the rat hit the lever, a pellet of food came out. When an organism receives a reinforcer each time it displays a behavior, it is called continuous reinforcement. This reinforcement schedule is the quickest way to teach someone a behavior, and it is especially effective in training a new behavior. Let’s look back at the dog that was learning to sit earlier in the module. Now, each time he sits, you give him a treat. Timing is important here: you will be most successful if you present the reinforcer immediately after he sits, so that he can make an association between the target behavior (sitting) and the consequence (getting a treat).

Once a behavior is trained, researchers and trainers often turn to another type of reinforcement schedule—partial reinforcement. In partial reinforcement, also referred to as intermittent reinforcement, the person or animal does not get reinforced every time they perform the desired behavior. There are several different types of partial reinforcement schedules (Table 1). These schedules are described as either fixed or variable, and as either interval or ratio. Fixed refers to the number of responses between reinforcements, or the amount of time between reinforcements, which is set and unchanging. Variable refers to the number of responses or amount of time between reinforcements, which varies or changes. Interval means the schedule is based on the time between reinforcements, and ratio means the schedule is based on the number of responses between reinforcements.
Table 1. Reinforcement Schedules
Reinforcement Schedule | Description | Result | Example
Fixed interval | Reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes). | Moderate response rate with significant pauses after reinforcement | Hospital patient uses patient-controlled, doctor-timed pain relief
Variable interval | Reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes). | Moderate yet steady response rate | Checking Facebook
Fixed ratio | Reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses). | High response rate with pauses after reinforcement | Piecework—factory worker getting paid for every x number of items manufactured
Variable ratio | Reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses). | High and steady response rate | Gambling
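The two ratio schedules in the table differ only in whether the required response count is set or unpredictable. That contrast can be sketched in a tiny simulation; the FR-4/VR-4 parameters, and the common simplification of modeling a variable ratio as a fixed per-response probability, are illustrative assumptions rather than anything from the text above:

```python
import random

def fixed_ratio(n):
    """FR-n: reinforce exactly every n-th response."""
    count = 0
    while True:
        count += 1
        yield count % n == 0

def variable_ratio(n, rng):
    """VR-n approximated as a random-ratio schedule: each response is
    reinforced with probability 1/n, so reinforcement arrives after an
    unpredictable number of responses that averages n."""
    while True:
        yield rng.random() < 1 / n

fr = fixed_ratio(4)
vr = variable_ratio(4, random.Random(42))

fr_hits = [press for press in range(1, 21) if next(fr)]
vr_hits = [press for press in range(1, 21) if next(vr)]
print("FR-4 reinforced on presses:", fr_hits)  # evenly spaced: 4, 8, 12, 16, 20
print("VR-4 reinforced on presses:", vr_hits)  # irregular spacing
```

Running this shows the fixed schedule paying off at perfectly regular points, while the variable schedule pays off at irregular ones — the unpredictability that, as the table notes, sustains a high and steady response rate.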

Now let’s combine these four terms. A fixed interval reinforcement schedule is when behavior is rewarded after a set amount of time. For example, June undergoes major surgery in a hospital. During recovery, she is expected to experience pain and will require prescription medications for pain relief. June is given an IV drip with a patient-controlled painkiller. Her doctor sets a limit: one dose per hour. June pushes a button when pain becomes difficult, and she receives a dose of medication. Since the reward (pain relief) only occurs on a fixed interval, there is no point in exhibiting the behavior when it will not be rewarded.
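June's pump behaves like a lockout timer: a button press is reinforced only if the fixed interval has elapsed since the last dose. Here is a minimal sketch of that logic, with a hypothetical PCAPump class and a one-hour lockout chosen to match the example (an illustration, not a medical specification):

```python
class PCAPump:
    """Sketch of a patient-controlled analgesia lockout:
    at most one dose per `lockout_s` seconds (fixed interval)."""

    def __init__(self, lockout_s=3600):
        self.lockout_s = lockout_s
        self.last_dose = None

    def press(self, now):
        """Return True (dose delivered) only if the lockout has elapsed."""
        if self.last_dose is None or now - self.last_dose >= self.lockout_s:
            self.last_dose = now
            return True
        return False  # press ignored: no reinforcement before the interval

pump = PCAPump(lockout_s=3600)
print(pump.press(0))      # True  — first press is dosed
print(pump.press(1800))   # False — only 30 minutes have passed
print(pump.press(3600))   # True  — the full hour has elapsed
```

Because presses during the lockout are never reinforced, there is no payoff for responding early — which is why fixed interval schedules produce pauses after each reinforcement.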

With a variable interval reinforcement schedule, the person or animal gets the reinforcement based on varying amounts of time, which are unpredictable. Say that Manuel is the manager at a fast-food restaurant. Every once in a while someone from the quality control division comes to Manuel’s restaurant. If the restaurant is clean and the service is fast, everyone on that shift earns a $20 bonus. Manuel never knows when the quality control person will show up, so he always tries to keep the restaurant clean and ensures that his employees provide prompt and courteous service. His productivity regarding prompt service and keeping a clean restaurant are steady because he wants his crew to earn the bonus.

With a fixed ratio reinforcement schedule, there are a set number of responses that must occur before the behavior is rewarded. Carla sells glasses at an eyeglass store, and she earns a commission every time she sells a pair of glasses. She always tries to sell people more pairs of glasses, including prescription sunglasses or a backup pair, so she can increase her commission. She does not care if the person really needs the prescription sunglasses, Carla just wants her bonus. The quality of what Carla sells does not matter because her commission is not based on quality; it’s only based on the number of pairs sold. This distinction in the quality of performance can help determine which reinforcement method is most appropriate for a particular situation. Fixed ratios are better suited to optimize the quantity of output, whereas a fixed interval, in which the reward is not quantity based, can lead to a higher quality of output.

In a variable ratio reinforcement schedule, the number of responses needed for a reward varies. This is the most powerful partial reinforcement schedule. An example of the variable ratio reinforcement schedule is gambling. Imagine that Sarah—generally a smart, thrifty woman—visits Las Vegas for the first time. She is not a gambler, but out of curiosity she puts a quarter into the slot machine, and then another, and another. Nothing happens. Two dollars in quarters later, her curiosity is fading, and she is just about to quit. But then, the machine lights up, bells go off, and Sarah gets 50 quarters back. That’s more like it! Sarah gets back to inserting quarters with renewed interest, and a few minutes later she has used up all her gains and is $10 in the hole. Now might be a sensible time to quit. And yet, she keeps putting money into the slot machine because she never knows when the next reinforcement is coming. She keeps thinking that with the next quarter she could win $50, or $100, or even more. Because the reinforcement schedule in most types of gambling has a variable ratio schedule, people keep trying and hoping that the next time they will win big. This is one of the reasons that gambling is so addictive—and so resistant to extinction.
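Sarah's predicament can be illustrated with a short simulation. The 2% win probability and 40-quarter payout below are invented numbers, chosen so the machine pays out intermittently while still keeping a house edge; they are not real casino odds:

```python
import random

def play_slots(quarters, win_prob=0.02, payout=40, rng=None):
    """Sketch of a variable-ratio payoff like Sarah's slot machine.
    Each quarter wins `payout` quarters with probability `win_prob`.
    Expected value per quarter: -1 + win_prob * payout = -0.20,
    so the house keeps 20% on average, yet the wins arrive
    unpredictably enough to keep a player feeding in quarters."""
    rng = rng or random.Random(7)
    balance = 0
    for _ in range(quarters):
        balance -= 1                    # insert a quarter
        if rng.random() < win_prob:     # unpredictable reinforcement
            balance += payout           # lights, bells, quarters back
    return balance

print(play_slots(10_000))  # a net loss, despite the intermittent wins
```

The occasional payouts reinforce play on a variable ratio schedule even though the long-run balance is reliably negative — Sarah's $10 hole in miniature.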

In operant conditioning, extinction of a reinforced behavior occurs at some point after reinforcement stops, and the speed at which this happens depends on the reinforcement schedule. In a variable ratio schedule, the point of extinction comes very slowly, as described above. But in the other reinforcement schedules, extinction may come quickly. For example, if June presses the button for the pain relief medication before the allotted time her doctor has approved, no medication is administered. She is on a fixed interval reinforcement schedule (dosed hourly), so extinction occurs quickly when reinforcement doesn’t come at the expected time. Among the reinforcement schedules, variable ratio is the most productive and the most resistant to extinction. Fixed interval is the least productive and the easiest to extinguish (Figure 1).

Connect the Concepts: Gambling and the Brain

Skinner (1953) stated, “If the gambling establishment cannot persuade a patron to turn over money with no return, it may achieve the same effect by returning part of the patron’s money on a variable-ratio schedule” (p. 397).

Figure 2. Some research suggests that pathological gamblers use gambling to compensate for abnormally low levels of the hormone norepinephrine, which is associated with stress and is secreted in moments of arousal and thrill. (credit: Ted Murphy)

Skinner uses gambling as an example of the power and effectiveness of conditioning behavior based on a variable ratio reinforcement schedule. In fact, Skinner was so confident in his knowledge of gambling addiction that he even claimed he could turn a pigeon into a pathological gambler (“Skinner’s Utopia,” 1971). Beyond the power of variable ratio reinforcement, gambling seems to work on the brain in the same way as some addictive drugs. The Illinois Institute for Addiction Recovery (n.d.) reports evidence suggesting that pathological gambling is an addiction similar to a chemical addiction (Figure 2). Specifically, gambling may activate the reward centers of the brain, much like cocaine does. Research has shown that some pathological gamblers have lower levels of the neurotransmitter (brain chemical) known as norepinephrine than do normal gamblers (Roy et al., 1988). According to a study conducted by Alec Roy and colleagues, norepinephrine is secreted when a person feels stress, arousal, or thrill; pathological gamblers use gambling to increase their levels of this neurotransmitter. Another researcher, neuroscientist Hans Breiter, has done extensive research on gambling and its effects on the brain. Breiter (as cited in Franzen, 2001) reports that “Monetary reward in a gambling-like experiment produces brain activation very similar to that observed in a cocaine addict receiving an infusion of cocaine” (para. 1). Deficiencies in serotonin (another neurotransmitter) might also contribute to compulsive behavior, including a gambling addiction.

It may be that pathological gamblers’ brains are different from those of other people, and that this difference somehow led to their gambling addiction, as these studies seem to suggest. However, it is very difficult to ascertain the cause because it is impossible to conduct a true experiment (it would be unethical to try to turn randomly assigned participants into problem gamblers). Therefore, it may be that causation actually moves in the opposite direction—perhaps the act of gambling somehow changes neurotransmitter levels in some gamblers’ brains. It also is possible that some overlooked factor, or confounding variable, played a role in both the gambling addiction and the differences in brain chemistry.

Glossary

continuous reinforcement: rewarding a behavior every time it occurs
fixed interval reinforcement schedule: behavior is rewarded after a set amount of time
fixed ratio reinforcement schedule: set number of responses must occur before a behavior is rewarded
operant conditioning: form of learning in which the stimulus/experience happens after the behavior is demonstrated
variable interval reinforcement schedule: behavior is rewarded after unpredictable amounts of time have passed

variable ratio reinforcement schedule: the number of responses required before a behavior is rewarded varies

When we want a child to behave a certain way, one of the best ways to ensure this is by using reinforcement. A schedule of reinforcement indicates whether a behavior will be reinforced every time it occurs (continuous) or only some of the time (intermittent). Intermittent reinforcement can in turn be fixed or variable, and based on time (interval) or on the number of responses (ratio), resulting in four partial schedules. Of those, the strongest reinforcement schedule is the variable ratio. To better understand this schedule, think about a casino slot machine.

You put your money in, push the button to spin the wheels and rub your lucky rabbit’s foot.

The machine doesn’t pay out, then maybe it does a little, then maybe it doesn’t for quite a while…

You know there’s a big pay out in there.

You also know that you never know when it’s going to hit.

A small payout here and there keeps you in a state of anticipation.

And you don’t want to miss that big payout or unwittingly give it to “that guy” who will come behind you with a quarter and one spin and win all the money you’ve put in for the past hour and then some.

So you keep playing and playing and playing, being reinforced by the small payouts here and there…

(There’s a reason slot machines have the worst odds in the entire casino, y’all.)

A casino dealer once told me that he knew of more than one little old lady who wet herself because she didn’t want to leave the machine to use the restroom for fear she’d miss that big payout for which she’d been waiting a long time. That’s how compelling the intermittent reinforcement schedule can be. Make it a progressive game, and it becomes even more addictive because you can see the amount of the huge payout taunting you in the LCD display above your head.

When it comes to parenting, using an intermittent schedule of reinforcement in your disciplinary protocol is a particularly bad idea. Here’s how this translates:

You’re in the checkout line at the grocery store, you know, the area of parental torture where the store owners put the candy, gum, lip balm and other child must-haves. (It’s brilliant marketing, really, just like the weight-loss-promising and celebrity-drama-exposing magazines in the same area that lure adults into starting a story they’ll have to buy the magazine to finish.)

Your child asks you for M & Ms because he’s forced to stare at them while waiting in line.

You say no because it’s close to lunch time or because there’s candy at home in the pantry or because you’re on a grocery budget that doesn’t include frivolities.

Whatever the reason, your child cares not and continues to ask and ask and ask.

You say no about 16 times but relent out of frustration on time number 17.

What has your child learned? If you ask enough times, you get what you want.

What do you think will happen next time?

Or this…

You render a consequence when your daughter pulls your son’s hair. You do this consistently until one time—because you’re tired or distracted or in public or whatever—you don’t, and your child endures no repercussions for her assaultive behavior.

You’ve just significantly increased the odds that your child will try that behavior again.

Or maybe this…

The house rule is that kids are off of electronics by 6:30 p.m., and that is consistently reinforced until, one night, dinner takes a little longer and you allow the kids to stay on until 7:00.

One night two weeks later, you announce that it’s 6:30 and, therefore, time to turn off screens.

Your son is in the middle of a game on his tablet, and he asks you if he can please stay on “just a little bit longer.”

You say no and remind him that “6:30 is the rule.”

Your kid says, “But you let me do it the other night.”

I’m sure I don’t have to tell you what happens next.

Take home message: Using an intermittent variable reinforcement schedule means that children will be more inclined to test you because sometimes you pay out. And children are all about playing the odds. They’re little gamblers at heart.

Also…Specifying “just this once” or “ok, this time I’ll let you” means nothing to children. They’re going to roll the dice on you next time to see if they can get that response again, never mind that you stated previously that it won’t happen again. “It happened once,” they figure, “so maybe it will happen again if I push hard enough.” So begins the crying…and that’s just for the parents!

Fortunately, the antidote for your child’s gambling is simple: Consistency. When you swiftly and consistently render the same consequences for the same bad behavior every time, it decreases the odds that they’ll try again. Notice that I didn’t say it eliminates the odds. That’s because gambling is a difficult compulsion to treat. I’m a psychologist, not a miracle worker.

DISCLAIMER: Blog material is for informational purposes only. Blog content is not intended to be a substitute for evaluation or treatment by a licensed professional. Information contained herein should not be used to diagnose or treat a mental health issue without consulting a qualified provider. This material is copyrighted and may only be reproduced with the permission of Dr. Bellingrodt.