This is Part 5 of my series on leaderboards. Is this your first visit? You may want to read Parts 1, 2, 3 and 4 first.
In this segment, we turn to academic research on extrinsic rewards used as motivation in social networks and gaming.
Fifth perspective: the academics
I’ve talked fairly extensively about my own and others’ experiences with leaderboards in online gaming, gamification experts’ thoughts and psychological theories. But there’s a big component missing in the discussion thus far – that of proper research done at scale. What is actually happening when leaderboards and other game elements are introduced into mature systems?
In the following I look at two different experiments that examine this topic. The first takes on gamifying office work, and the other explores adding points, levels and leaderboards to a simple online task that previously had none.
Rosta Farzan and team conducted a series of experiments in 2008 that they reported in their paper “Results from Deploying a Participation Incentive Mechanism within the Enterprise.”
Prof Farzan’s team introduced a points-based levelling-up system into IBM’s in-house social network. Employees received points for adding content, commenting on others’ content and updating their profiles. Remember, points are a sort of extrinsic reward. The points system was only introduced to half of IBM’s employees (the test group), with the other half acting as the control group.
User engagement in the test group increased significantly when the newly gamified system was released. Participants especially increased the activities that earned them the most points, indicating a direct relationship between the reward and the behaviour it targets.
However, after just one week engagement started to decline.
In fact, for two of the three areas looked at, activity declined to levels lower than before the experiment had started.
Activity in the control group also increased once the gamified system was released: control participants were responding to the extra comments and contributions by upping their own activity. The decline mirrored the test group's too, starting the week after, and control-group activity likewise fell below pre-experiment levels. This is a very interesting finding, as it suggests that long-term use of a system like this is bad not just for the participating players, but also for the players they're playing against.
And I say players now because it is easy enough to draw parallels between this gamified system and a poker network. In the experiment, one person posts content, others comment, and the activity feeds on itself: the incentive to post content produced a sharp increase in comments on that content. Similarly, a poker player sits down at an empty table, other players join, players open more tables, and overall liquidity increases.
The researchers put this decline in activity down to a weakness in their experiment design: points didn’t expire, so level movement could only go up and not down. This made the system in the experiment less dynamic than most points systems. This was confirmed through interviews with the participants. Definitely something to keep in mind if you are building a loyalty or points system – make sure there is a risk of sliding back down the flagpole.
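The "points expire" fix the researchers suggest is easy to sketch. Here is a minimal, hypothetical Python model of such a system — the class, thresholds and 30-day lifetime are my own illustration, not taken from the paper:

```python
from datetime import datetime, timedelta

LEVEL_THRESHOLDS = [0, 100, 250, 500, 1000]  # points needed for levels 0..4
POINT_LIFETIME = timedelta(days=30)          # awarded points expire after 30 days

class Player:
    def __init__(self):
        self.awards = []  # list of (timestamp, points) tuples

    def award(self, points, when=None):
        self.awards.append((when or datetime.now(), points))

    def score(self, now=None):
        """Sum only the points that haven't expired yet."""
        now = now or datetime.now()
        return sum(p for t, p in self.awards if now - t < POINT_LIFETIME)

    def level(self, now=None):
        """Highest level whose threshold the current score meets.
        Because expired points drop out of the score, a player's
        level can fall as well as rise."""
        s = self.score(now)
        return max(i for i, need in enumerate(LEVEL_THRESHOLDS) if s >= need)
```

The key design choice is that a level is always derived from the live score rather than stored, so inactivity automatically costs you — the "risk of sliding back down" the participants said was missing.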
At the end of the experiment, the researchers concluded that “It is important to consider that the benefits to the community may not offset the costs of building the system if the incentive mechanism does not continually [incentivise] users to contribute over time.”
Speaking from experience, this is where companies fall into the danger zone. They see what great engagement they're getting from spending £X each month, so the next month they increase it to £2X. Players increase their activity further, so the operator ups the prizes the next month. And so on. Remember Ladbrokes's $1m rake race?
Verdict: inconclusive. While this experiment definitely doesn’t show that a gamified system is positive in the medium or long term, the explanation the researchers have for the decline of activity is reasonable. But it’s very possible the real story is more complex. Unknowable without more research.
Also, this experiment only looked at two of the three elements in the gamification trifecta. If you recall from Part 2, those are points, badges and leaderboards. And although this blog series is about the missing third, leaderboards, I would argue the research is relevant to what we’re discussing.
The entire report is full of interesting insights. I encourage you to purchase it and read it in full.
Elisa Mekler is a psychologist and self-proclaimed gamer at the University of Basel in Switzerland. As she said at a 2013 Gamification Conference:
“There is ample evidence in psychological research that handing out rewards […] will in fact lower people’s intrinsic motivation to engage in a certain activity. […] [However], there is, as of now, very little empirical evidence that has shown that this is true, so this is where our study comes in.”
In her 2013 research paper "Do Points, Levels and Leaderboards Harm Intrinsic Motivation? An Empirical Analysis of Common Gamification Elements," Ms Mekler and her team reference the Farzan experiment, and they draw the same conclusion as I do – adding extrinsic motivation may decrease intrinsic motivation.
In their experiment, Ms Mekler and her team worked with 295 participants. Each of these participants (let’s call them players) was presented with abstract paintings that they needed to provide annotation tags for. The players got 100 points for each tag. After they looked at all 15 of the paintings, they were given their score. They could then compare their score against four (fake) participants in a leaderboard. The players then saw a progress bar with the label “next level” with the points required to move up.
Interestingly, Ms Mekler has a different idea from Prof Werbach's about the three key elements in gamification. She sees these as points, levels and leaderboards. I suppose levels and badges are like different sides of the same coin (in some cases), but I may start thinking about the gamification trifecta as a superfecta. Or tetrad. Hm, it’s not nearly as catchy.
Behaviour was measured under each condition. The researchers found that players in all the gamified conditions were more active than players in the control group. When the experiment was repeated, the increased activity declined, but not to the level of the control group: the players in the gamified conditions remained more productive, even over time.
The researchers concluded that adding points, levels and leaderboards did not decrease participants’ intrinsic motivation. They posited that this might not have held true had they added cash prizes as incentives, which would add more of a “controlling” factor (this has to do with self-determination theory, a topic for another day).
Watch Ms Mekler’s presentation on this:
Verdict: inconclusive. This is a very interesting study, and I think it should be required reading for anyone interested in this topic. You can order a copy from this link. Of all the research I read, it’s closest to the topic, but I’ve got a few issues with its potential usefulness:
- The task was too simple. If we were looking at whether gamification can make something equivalent to data entry more fun, then the answer to that seems fairly obvious. Whether a gamified system makes more complex work fun, or whether game elements like leaderboards can make casino or poker games stickier long-term, is not answerable from this research. This was highlighted in the “limitations” section of the research paper as well.
- The experiment only measured short-term impact: a single 20-minute session. That’s pretty short.
- Their 295 participants all came from the university’s database. Aside from the immediate concern that the people in a university’s database will not be representative of the wider game-playing public, there’s volunteer bias.
Volunteer bias is the bias that arises because a sample can contain only those people who are actually willing to participate in the study or experiment (Heiman, 2002). There are apparent differences between people who are willing to participate and those who are not. According to Heiman (2002), volunteers tend to have higher social status and intelligence, exhibit an increased need for approval, and tend to be less authoritarian and conforming. People who find the topic particularly interesting are also more likely to volunteer, as are those who expect to be evaluated positively (Heiman, 2002).
Still, the research is compelling – although what it mostly does is provide some background information (always good) and highlight the need for further research.
That’s it for this week. Keep your eyes open for Part 6!
 Image courtesy of this Trekkie Wikia page. Whenever someone says “Somethingsomethingsomething in the enterprise” I get really excited for a millisecond, then I feel a bit let down.
 The experiment does utilise badges, although I haven’t referenced them in this blog, as I am trying to distill a ten-page paper into about as many paragraphs. Read the research paper for more information on this.
 Ms Mekler studies and works at the University of Basel in Switzerland. In addition to being lucky enough to be Swiss and live in one of the most beautiful countries in the world, she studies in the Mensch-Maschine Interaktion department. Although it’s no doubt the same thing as Human Computer Interaction, the name reminds me enough of Kraftwerk that I haven’t been able to get this song out of my head all day.
 Thanks, Psychwiki.