We're all just rats in cages - and Cory Doctorow knows it

I just stumbled across this piece by Cory Doctorow and it’s an excellent read!

It’s also important for anyone interested in the sneaky ways the incredibly adaptive forces of uncontrolled capitalism exploit our cognitive weaknesses to influence our behaviour. Doctorow is one of the few people I know of who really understands (and can express) the awe-inspiring qualities of the dopamine net we are all entangled in, and how ubiquitous it is in controlling our behaviour – both offline and online. But… I think he’s made a minor (and extremely forgivable) mistake in how he applied the concept of ‘reinforcement schedules’.

So firstly, go have a read of Cory Doctorow’s article on ‘Persuasion, Adaptation, and the Arms Race for Your Attention’.

Good stuff eh? And he’s bang on point with almost all of that. But let’s talk about reinforcement schedules for a minute.

A reinforcement schedule describes how often you get a reward in response to an action. E.g. a payout when pulling the lever on a slot machine (or a food pellet for pulling the lever in a cage – same thing). If you get a reward every time you pull the lever, that’s continuous reinforcement, and it leads to a certain level of learnt response and resulting behaviour. However, things get interesting when you manipulate the relationship between lever pulls and rewards. You can vary the interval of rewards (i.e. a reward for the first pull after each minute has passed, however many pulls came before) or the ratio of rewards (i.e. a reward for every 5 lever pulls). Each results in quite different patterns of behaviour.

And one specific kind of ratio has really interesting properties. One weird trick, you might say… If you don’t know when the next reward is coming – but it’s somehow related to how often you pull the lever – what are you going to do? You are going to mash that lever as often as possible. I.e. an intermittent reward ratio (a variable-ratio schedule, in the jargon) results in really high-intensity responding. Think loot boxes, slot machines, or compulsively checking your email or your phone for new notifications.

So not knowing when the next reward is coming (intermittent reinforcement) leads to remarkable increases in the levels of reward-generating behaviour.
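If it helps to see the difference between these schedules spelled out, here is a rough Python sketch. It only models when rewards get handed out, not how a rat (or a person) responds to them. The function names are made up for illustration, and the variable-ratio schedule is assumed to be a fixed chance of reward on every pull (sometimes called a random-ratio schedule), which is just one simple way of making the gaps unpredictable.

```python
import random


def continuous(n_pulls):
    """Continuous reinforcement: every pull is rewarded."""
    return [True] * n_pulls


def fixed_ratio(n_pulls, ratio=5):
    """Fixed ratio: every `ratio`-th pull is rewarded."""
    return [(i + 1) % ratio == 0 for i in range(n_pulls)]


def variable_ratio(n_pulls, mean_ratio=5, seed=0):
    """Variable ratio (assumption: modelled as a constant per-pull chance).

    Each pull pays out with probability 1 / mean_ratio, so rewards average
    one per `mean_ratio` pulls but the gap between them is unpredictable.
    """
    rng = random.Random(seed)
    return [rng.random() < 1 / mean_ratio for _ in range(n_pulls)]


def gaps(rewards):
    """Number of pulls it took to reach each reward."""
    out, count = [], 0
    for rewarded in rewards:
        count += 1
        if rewarded:
            out.append(count)
            count = 0
    return out


print("fixed ratio gaps:   ", gaps(fixed_ratio(20)))
print("variable ratio gaps:", gaps(variable_ratio(60)))
```

The fixed-ratio gaps are always 5, so you know exactly how far away the next reward is. The variable-ratio gaps jump around the same average, which is the "you don’t know when the next reward is coming" situation described above.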

But then Doctorow asserts:

“…Behavioral marketers know that they can prolong the efficacy of these techniques with “intermittent reinforcement” (that is, using each technique sparingly, at random intervals, which make them more resistant to our ability to grow accustomed to them),…”

Which is a bit of a misuse of the term. Intermittent reinforcement specifically refers to the relationship between an action and its reward within a given scenario (there is actually some more complexity here about conditioning stimuli etc… but let’s not get into that right now). Here Doctorow is suggesting it refers to how often the entire scenario is presented, which is something the concept doesn’t really talk about. So in all likelihood he is right that only occasionally presenting the specific situation where this kind of reinforcement occurs is likely to result in less learning and lower levels of response. Sure. But this is likely to be because people learn less when they’re exposed to the relationship less – NOT because they learn about the relationship and their behaviour is shaped accordingly.

Anyway – it’s a trivial and understandable misreading of the psych literature. And he is totally on point about the implications for tech design, media and the attention economy.