Three Field Experiments on Procrastination and Willpower
Nicholas Burger, Gary Charness, and John Lynham*
September 17, 2008
Abstract: We conducted three field experiments to investigate how people schedule and
complete tasks, providing some of the first data concerning procrastination and willpower under
financial incentives. In our first study, we paid students $95 if they completed 75 hours of
monitored studying over a five-week period. We also required people to meet interim weekly
targets in one treatment, but not in the other. In a second study, the task consisted of answering multiple-choice questions on seven consecutive days, with staggered start dates and an endogenous task ordering (tasks varied by number of questions). In our third study, participants answered 20 multiple-choice questions over two consecutive days, varying whether this was during the week or on the weekend. Participants were assigned to either an easy or difficult Stroop test (used by psychologists to deplete willpower) on the first day, before any questions could be answered. We find evidence of procrastination and willpower depletion/replenishment, as well as evidence suggesting a self-reputation interpretation. And yet the behavioral interventions we used led to outcomes that surprised us in all three studies, although these outcomes are largely consistent with the standard neo-classical model.
Keywords: Field experiment, Incentives, Procrastination, Studying, Willpower
JEL Codes: A13, A22, B49, C93, D0
* Contact: Nicholas Burger, Rand Corporation, email@example.com; Gary Charness, Dept. of Economics, University of California at Santa Barbara, firstname.lastname@example.org; John Lynham, Dept. of Economics, University of Hawai’i at Manoa, email@example.com
1. INTRODUCTION
People experience self-control problems when their preferences are not consistent across time. One form of self-control problem concerns persistent bad habits or addictions, such as overeating or cigarette smoking. An individual knows that he or she will later regret a current choice of self-indulgence, but nevertheless engages in the activity. The other side of the coin is a situation where an individual is faced with an activity that will lead to future benefits, but is unappealing at the moment. This often leads to procrastination, common in everyday life.
People vow to stop smoking, stop eating ice cream, or start exercising tomorrow. Procrastination has been found to be quite pervasive among students: Ellis and Knaus (1977) find that 95% of college students procrastinate, while Solomon and Rothblum (1984) find that 46% nearly always or always procrastinate in writing a term paper.
There have been at least a handful of studies that consider how one might overcome self-control problems. Aside from exerting willpower in the face of a disagreeable task, one approach is to bind one’s own behavior with costly restrictions. Wertenbroch (1998) presents anecdotal examples of binding behavior, including tactics such as putting savings into a Christmas-club account that does not pay interest or buying small packages of goods such as cigarettes or ice cream.1 Schelling (1992) mentions reforming drug addicts who send out self-incriminating letters, to be divulged in the case of a relapse into drug use.
Empirical studies of habits and procrastination are new in economics. Recently, there have been some field interventions that attempt to study these issues in a controlled environment. Angrist and Lavy (2002) offer substantial cash incentives in Israel for matriculation; while this is ineffective when individual students are selected for the treatment, matriculation rates do increase when this program is school-wide. Charness and Gneezy (2006) pay students at two American universities to attend a gym during a period of time, finding that attendance rates increase substantially not only during this period, but also after the intervention ends. Angrist, Lang, and Oreopoulos (2007) offer merit scholarships to undergraduates at a Canadian university, with some success in improving performance, but mixed results overall.
1 See Ariely and Wertenbroch (2002) for a more complete literature review.
Another device is to set deadlines for oneself; for example, many a researcher has agreed to present a yet-unwritten paper in the future, in the hope that the embarrassment of being forced to cancel or make a change will be a strong motivation for writing the paper prior to the presentation. In fact, many activities seem deadline-driven, particularly in our contemporary society in which people seem to be short on time. Ariely and Wertenbroch (2002) assign three tasks to be completed over a three-week period and find that externally-imposed costly deadlines during this period are more effective than self-imposed (and binding) costly deadlines, which in turn are more effective than having no additional deadlines. Burger and Lynham (2007) examine weight-loss bets in England, where one could bet on achieving a weight goal by a deadline; however, the vast majority of bettors lost their bets with the agency.
In this paper, we report the results of three field experiments designed to provide data on procrastination and willpower. The primary goal of our research is to identify patterns in behavior that will both aid other researchers, and perhaps policy-makers, in designing mechanisms that are effective in overcoming obstacles to performance and inform theorists so that more descriptive models can be developed.2 We also discuss our results with respect to several models of self-control and willpower.
In our first study, we paid students $95 to complete 75 hours of studying at a monitored location in the campus library over a five-week period. In one treatment, participants were required to complete at least 12 hours during the first week, at least 24 hours by the end of the second week, etc., while there were no interim requirements in the second treatment. We expected people to procrastinate with their timing, leaving the bulk of the required studying until the end; in line with the results in Ariely and Wertenbroch (2002), we expected that externally-imposed costly deadlines would be effective, so that the group with the weekly studying requirement would be more likely to complete the task.3 However, completion rates were actually 50% higher with no interim requirements, as would be predicted by a standard neoclassical model. The patterns of study time show a pronounced weekly cycle, even in the no-weekly-requirement treatment, with little difference in the aggregate from week to week; however, individual analysis reveals substantial heterogeneity, with some people logging the bulk of the hours in the early weeks and some other people doing so in the late weeks. We find evidence that, over time, students who achieve the studying goal improve their performance in the course relative to those students who did not. Finally, women complete the studying task more often than men do.
2 In this respect, this paper falls into all categories (“Searching for Facts”, “Whispering in the Ears of Princes”, and “Speaking to Theorists”) of the Roth (1995) taxonomy.
Having observed different behavior on weekends and weekdays, we designed a second study in which the task consisted of answering different numbers of multiple-choice questions (the order was endogenous) on each of seven consecutive days; seven groups started on different days of the week. We asked people to designate in advance their plans for task completion and then observed their actual behavior. We offered people lottery draws for iPods for submitting their plan, for completing the study, and for answering enough questions correctly.
We find that completion rates vary substantially across the starting-day groups, even though everyone must perform a (chosen) task on each day of the week. Further, people are the least likely to drop out on a Monday or Tuesday, regardless of the start date. We do see evidence of procrastination among even the most disciplined people in the population, as the average number of questions answered for people who succeed at the task increases steadily over the course of the seven days. People who stick with their plan are no more likely to complete the task than people who do not. Finally, women are twice as likely to succeed at the task as men.
3 Fischer (2001) also presents arguments for breaking a task into smaller components. On p. 261, she states: “Therefore, the best way for a supervisor to …reduce the risk of missing the ultimate deadline may be to break it into smaller tasks with more deadlines to better compete with the other demands on the student’s time.”
For our third study, we reduced the duration of the task (answering 20 multiple-choice questions) to two consecutive days and allowed the participants to complete the entire task in one sitting, if desired. We promised definite amounts of money for completing the task and for each correct answer, rather than lottery chances. We also required people to complete either an easy or difficult Stroop test on the first day. The difficult version features cognitively-discordant tasks, and is considered by psychologists to be willpower-depleting.4 Most people (63%) answered all 20 questions during the two days, and about 40% of these people answered all the questions on the first day. Indeed, among those who completed the task, those people assigned the difficult Stroop answered significantly fewer questions on the first day. However, a real surprise is that people who were assigned the difficult Stroop were somewhat more likely to finish during the allotted two days than those people assigned the easy Stroop. There is also evidence that people didn’t try as hard on the second day, as the percentage of correct answers was significantly lower then. Finally, in contrast to our first two studies, males are significantly more likely to finish this task of more limited duration.
The remainder of this paper is organized as follows: We provide details of our experimental design in section 2, and we present some theoretical models and their predictions in section 3. We describe our experimental results in section 4, offer some discussion in section 5 and conclude in section 6.
4 We thank Emre Ozdenoren, Stephen Salant and Dan Silverman for suggesting many of these design changes.
Study 1
Our experiment was conducted at the University of California at Santa Barbara. We obtained permission to have anonymous access to the records of students in a large introductory class and then recruited as many as possible from this class. We then advertised the session to first-year students in the general experimental subject pool.5 All students were told that they could attend an introductory meeting about an experiment that would involve a non-trivial amount of money to be earned over time. Interested students were randomly assigned to an introductory meeting.6 Participation was voluntary and everyone who showed up was guaranteed $5 if they were not interested in participating. At the meetings, we explained the nature and rules of the experiment. This process led to a total of 73 eventual participants (out of 87 students who showed up to the meetings); 42 were from the class and 31 were from the campus-wide experimental subject pool. As we shall see later, there was no appreciable difference in behavior across these two sets of participants.
We chose the task of studying because it is a common activity for students, but one that is susceptible to procrastination. Studying has obvious long-term benefits, but is costly in the short run insofar as other activities have more immediate appeal.7 Nevertheless, there are already incentives in place for the studier; thus, we did not pay the usual average per-hour rates for experiments, but chose to pay $95 (not as salient as $100) for 75 hours of monitored studying.
5 We thank ORSEE for the free recruiting software, which allowed selective invitations.
6 All the students in a particular informational meeting were assigned to the same treatment group. This was done to reduce social interaction threats (Cook and Campbell, 1979).
7 Students may therefore wish to do more studying than they actually manage; this is similar to self-control problems such as dieting or smoking.
We showed participants the studying location, a room in the library that was frequently (but intermittently) monitored.8 This study area was available for between 12 and 16 hours each day. Subject to the availability constraint, students were free to log in and out by handing over an ID card to the monitor who would then log the student in or out on a computer. In addition, students were each given a large identifying number, unique to each individual. This was visible to the monitor at all times. The studying area was monitored hourly at a varying time each hour to ensure students were present at the studying location when signed in.
Each student was assigned a web page where he or she could check on the number of hours logged, and could then contact us in the case of any discrepancy. In addition, students who satisfied weekly studying requirements ‘banked’ their contingent earnings; their web pages had a check-like graphic showing the credit already amassed (of course, this credit was only to be paid if the student completed the overall 75-hour requirement); students who failed to meet a weekly requirement were notified at the end of the applicable week that they were no longer eligible to earn the $95. At the end of the five-week period, those students who had completed the requirement(s) received their earnings and filled out a short questionnaire.
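The payment rule described above can be summarized in a short sketch. The Python below is illustrative only and is not the software used in the experiment; the function name `earnings` and its parameters are hypothetical. Only the week-1 and week-2 cumulative targets (12 and 24 hours) are taken from the text, since the later interim targets are not enumerated, so the example passes just those two.

```python
from typing import Mapping, Optional, Sequence

PAYMENT = 95         # dollars paid for completing the task (from the text)
TOTAL_REQUIRED = 75  # total monitored hours over five weeks (from the text)


def earnings(weekly_hours: Sequence[float],
             interim_targets: Optional[Mapping[int, float]] = None) -> int:
    """Hypothetical helper: payout for one participant.

    weekly_hours: hours logged in each week, in order.
    interim_targets: maps a 1-based week number to the cumulative hours
        required by the end of that week. None (or an empty mapping)
        corresponds to the treatment with no interim requirements.
    """
    targets = interim_targets or {}
    cumulative = 0.0
    for week, hours in enumerate(weekly_hours, start=1):
        cumulative += hours
        target = targets.get(week)
        if target is not None and cumulative < target:
            # Missing any interim target ends eligibility for the payment.
            return 0
    # The payment is all-or-nothing on the overall hour requirement.
    return PAYMENT if cumulative >= TOTAL_REQUIRED else 0


# A back-loaded schedule that still reaches 75 hours in total.
late_starter = [5, 10, 20, 20, 20]
no_interim = earnings(late_starter)                      # 95
with_interim = earnings(late_starter, {1: 12, 2: 24})    # 0
```

Under these assumptions, a participant who back-loads hours earns the $95 when there are no interim requirements but loses eligibility in week 1 when weekly targets apply, which is one mechanical reason completion can differ across the two treatments.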
8 We thank the UCSB Library staff (and in particular, Eric Forte) for helping to arrange this for us.
We would like to immediately address two possible concerns. First, students had access to both computers and wireless Internet, so we cannot be certain how much of their time in the library was devoted to studying. However, the anecdotal evidence from the monitors is that, although students were occasionally just sitting and checking e-mail or Facebook, etc., studying was by far the most common activity.9 Second, one might also be concerned about contamination, since people from both treatments studied in the same area. Again, we have only anecdotal evidence against this: 1) Monitors did not observe students in conversation with one