Reinforcement learning in behavior change.
Highlights:
- Habits form through reinforcement learning because the brain repeats what feels rewarding, not necessarily what is good for us.
- This mechanism evolved for survival, helping humans conserve energy by automating repeated actions.
- Modern environments exploit this system, reinforcing behaviors like phone use or overeating through immediate gratification.
- Dopamine strengthens habit loops by linking triggers, actions, and rewards, making behaviors more automatic over time.
- Breaking a habit requires noticing when a behavior is less rewarding than expected; that recognition is what weakens the habit.

When we do something that feels good, our brain remembers that reward. When a similar situation appears in the future, it pushes us to do it again because it seeks that same feeling. This process, reinforcement learning, is what leads us to repeat behaviors, whether good or bad. It’s what forms habits. “It’s probably the most important learning mechanism preserved through evolution in our brain,” explains Dr. Judson Brewer, a psychiatrist known for his bestselling books.
This mechanism has an evolutionary basis tied to survival and developed in conditions where resources were uncertain. It helped our ancestors navigate dangerous environments without having to rethink every decision from scratch. Each day required finding food, avoiding predators, and reacting quickly to threats. If the brain had to analyze every action as new, survival would be at risk. The brain also consumes a lot of energy, so saving effort was crucial. Habits allowed energy to be reserved for new or unexpected situations. “A habit is something we do automatically,” Brewer explains. When something is repeated over and over, the brain automates it. “Imagine waking up and having to relearn how to walk, get dressed, or make breakfast. None of us would survive.”
Today, this same mechanism can work against us by pushing us toward rewards that don’t necessarily add value. “Our ancestors didn’t face highly addictive technologies like smartphones,” Brewer notes. The brain doesn’t distinguish between what helps us and what gives immediate gratification. That’s why behaviors like mindlessly checking your phone, overeating, or watching TV instead of exercising follow the same pattern.
In its simplest form, a habit starts with a trigger in the environment. That trigger leads to an action. Then the brain evaluates the result, and if the experience is pleasant, it creates a connection between the trigger and the action, making the action more likely to be repeated. “We’re looking for food, we find it, that’s the trigger, and then we eat it, that’s the behavior.”
At the center of it all is dopamine, which is released when something unexpected happens. It directs attention to that event and helps the brain store it for the future. “The result is that dopamine activates in our brain and reminds us what you ate and where you found it.” With each repetition, the brain strengthens this circuit, making the behavior easier to trigger.
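The trigger, action, and reward loop described above can be sketched with a simple textbook-style update rule. This is only an illustration, not Brewer's own model: the function name `reinforce`, the `learning_rate` value, and the 0-to-1 reward scale are all assumptions made for the example.

```python
def reinforce(strength, reward, learning_rate=0.3):
    """Nudge the trigger-action association toward the reward just experienced.

    All names and numbers here are hypothetical, chosen for illustration.
    """
    surprise = reward - strength  # how unexpected the reward was
    return strength + learning_rate * surprise


habit_strength = 0.0  # no association between trigger and action yet
history = []
for _ in range(10):   # ten repetitions of the same pleasant experience
    habit_strength = reinforce(habit_strength, reward=1.0)
    history.append(round(habit_strength, 3))

print(history)
```

In this toy model the association grows with every repetition, while the surprise shrinks as the reward becomes expected, which loosely mirrors the idea that the behavior becomes easier to trigger over time.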
Contrary to what we often believe, changing bad habits doesn’t depend on willpower. The willpower model “has been a dominant paradigm for hundreds, if not thousands, of years,” Brewer explains. This narrative assumes we consciously evaluate options and choose deliberately. But in reality, reinforcement learning operates unconsciously. What we experience as a decision is often just an interpretation of a deeper process that already occurred. “Neuroscience shows that our impulses and passions are actually driving us.” Most of the time, the brain is comparing options without us noticing and choosing the one it predicts will feel better. When we say “I’m going to do this,” the decision has often already been made.
To change habits, you need to pay attention while the behavior is happening and observe how it actually makes you feel, bringing it into awareness. Often, you realize it’s not as pleasant or useful as you thought. “I ask people to pay attention when they smoke, and they realize cigarettes taste awful,” he explains. When you notice it’s not as good as expected, your brain starts to update its belief, and the habit gradually loses strength. This is called a “negative prediction error”: the brain expects a reward but gets something less satisfying. The brain then lowers its expectation to match reality, so the behavior loses value and the urge to repeat it weakens.
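A small worked example may make the negative prediction error concrete. The numbers and the helper names `prediction_error` and `updated_expectation` are hypothetical, and the update rule is a generic simplification rather than anything taken from Brewer's research.

```python
def prediction_error(expected_reward, actual_reward):
    """Positive if the experience beat expectations, negative if it fell short."""
    return actual_reward - expected_reward


def updated_expectation(expected_reward, actual_reward, learning_rate=0.3):
    """Shift the expectation part of the way toward what was actually felt."""
    error = prediction_error(expected_reward, actual_reward)
    return expected_reward + learning_rate * error


expected = 0.9  # assumed: the brain predicts the cigarette will feel great
actual = 0.2    # assumed: paying attention reveals it tastes awful

error = prediction_error(expected, actual)            # below zero: a negative prediction error
new_expected = updated_expectation(expected, actual)  # lower than before

print(error, new_expected)
```

Because the error is negative, the expectation drops after a single attentive experience, and a lower expected reward means a weaker pull toward the behavior.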
There’s no fixed timeline to break habits. “The idea of 21 days is a total myth. It’s not time-based,” he says. The process depends on how the brain updates the reward value of the behavior. When an experience contradicts expectations, change can happen quickly. If the signal is weaker, repeated exposures may be needed. “If we have something much better, our brain won’t go back to something worse. Why would it?”
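To illustrate why the timeline depends on the size of the signal rather than on a fixed number of days, here is a rough sketch that counts how many attentive exposures it takes for the expected reward to fall below a threshold. The function `exposures_needed`, the threshold, and every number are assumptions for the sake of the example.

```python
def exposures_needed(expected, actual, threshold,
                     learning_rate=0.3, max_exposures=100):
    """Count attentive exposures until the expected reward drops below threshold.

    Returns None if that never happens within max_exposures.
    Hypothetical helper, for illustration only.
    """
    for count in range(1, max_exposures + 1):
        expected += learning_rate * (actual - expected)
        if expected < threshold:
            return count
    return None


# Strong contradiction: the experience turns out to have no reward at all.
strong = exposures_needed(expected=0.9, actual=0.0, threshold=0.5)

# Weaker signal: the experience is only somewhat worse than expected.
weak = exposures_needed(expected=0.9, actual=0.4, threshold=0.5)

print(strong, weak)
```

Under these assumptions, the strongly contradicting experience devalues the behavior in fewer exposures than the mildly disappointing one, and an experience that stays above the threshold would never devalue it at all.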
Try it. Dr. Brewer recently stopped drinking orange juice, essentially “sugar water,” when he became aware of its real effects. “I had a sugar spike and then a crash.” Each time he drank it, he paid attention. It became a flat, overly sweet experience, and his brain stopped valuing it. I had a similar experience. Before talking with Jud, I didn’t really understand how I quit smoking in a single day. I used to tell the story trying to find a rational explanation. After this conversation, it’s clear. I suddenly realized that smoking was distancing me from the people I loved most. That realization was so strong that I quit and, almost a year later, haven’t gone back.