About that Harvard Instagram study…

You may have heard about the Harvard study wherein researchers “trained a machine to spot depression on Instagram.” The paper’s subject is perfectly weaponized to make the media rounds, combining data, AI, health, a popular social network, and an enticing question to encourage clicks (what filter means you’re depressed?). MIT Technology Review, Wired, The Next Web, The Independent, and others all hit post. The story has been lighting up Twitter for nearly a week now.

But once the depression filter was revealed (Inkwell, of course), I’m pretty sure everyone stopped reading. Had they kept going, they’d have found a different story about depression: the crowdsourced workers who fuel the algorithms that will evaluate us are very depressed.

To find this sad story, let’s run down the numbers cited in the MIT Technology Review article:

  • The researchers asked 500 workers on Amazon’s crowdsource platform, Mechanical Turk, to complete a questionnaire that included a standard clinical depression survey.
  • The researchers asked those 500 workers if they would share their Instagram posts for the study. 170 workers agreed.
  • Out of those 170 workers, 70 were clinically depressed, based on their survey responses.
  • The researchers had the shared Instagram photos sorted (again, by Mechanical Turk workers) and used that data to train a machine learning algorithm.
  • The researchers tested their algorithm on the photos of 100 individuals and correctly identified 70% of those who were depressed. (Who these individuals are, how their photos were obtained, and how they were diagnosed is not specified; I assume the researchers repeated the steps above.) 70% accuracy sounds pretty good! Allegedly, this hit rate beats that of general practitioners. But it is hardly statistically significant: a test group of 100 is laughably small, and the paper has yet to be peer reviewed. (Nick Stockton covers this in Wired, atoning for the publication’s earlier breathlessness.)
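To see why 100 test subjects is thin evidence, here is a rough back-of-the-envelope sketch. Assuming the reported figure means 70 correct calls out of 100 (the paper’s exact metric isn’t specified here), a simple normal-approximation confidence interval for that proportion is wide:

```python
import math

# Hypothetical reading of the result: 70 correct out of 100 test subjects.
successes, n = 70, 100
p_hat = successes / n

# Standard error of a proportion, then a 95% normal-approximation interval.
se = math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"95% CI: {lo:.2f} to {hi:.2f}")  # → 95% CI: 0.61 to 0.79
```

An interval spanning roughly 61% to 79% leaves plenty of room for the algorithm to be only modestly better than a coin flip weighted by base rates, which is why a larger, peer-reviewed evaluation matters.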

But they’ve buried the real story.

The depression rate among adults in the United States is 6.7%.

The depression rate among the crowdsourced workers who shared their photos is 41.2%. Over six times the national norm.
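The arithmetic behind those two figures, using the counts reported above (70 depressed workers out of the 170 who shared photos, against the 6.7% national rate):

```python
# Figures as cited in the article above.
depressed, shared = 70, 170
national_rate = 0.067  # US adult depression rate

rate = depressed / shared
print(f"{rate:.1%}")                    # → 41.2%
print(f"{rate / national_rate:.1f}x")   # → 6.1x
```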

Working on Mechanical Turk, it appears, is crushing.

Mechanical Turk does not pay well. Because of their status as independent contractors, Turkers (as they are called) are not covered by labor laws. Their hourly pay ranges from $1 to $5.

But poverty does not appear to be the driver for this high depression rate. According to the CDC, poverty doubles the average US depression rate. Mechanical Turk, according to the Instagram study, multiplies it by six.

With the recent rise of deep learning, Mechanical Turk has become a training ground for algorithms. Turkers sort data which will be used to create machine learning products. The best summary of Mechanical Turk, its workers, and the machines they train is this episode of NPR’s Planet Money.

Listening to Planet Money, it’s easy to see how crowd work can spur frustration and feelings of helplessness beyond poverty itself. There are no bosses or structure, just rapidly cycling tasks. Pay for repetitive work is generally insultingly low. There are no avenues for recourse other than self-organization and open letters to Amazon, which generate no response.

When we discuss the issues inherent in AI and machine learning, we usually focus on the perils of allowing computers to make decisions humans currently own. We rarely discuss the people whose work or attention creates the algorithms themselves.

This is a mistake. Crowd work will only grow in the future, whether through sharing-economy applications or online work. Its existence without appropriate, modern regulation is worth discussion.

In an ironic twist, the decisions made by the powerless people on Mechanical Turk will be amplified in algorithms which will eventually have power over us all. Do the depressed judge depression or photos differently than the happy? If the people training these machines do not represent us, we will cede decisions to algorithms with which we will likely disagree. The case discussed here regarding Mechanical Turk is even worse: the work of sorting itself could turn a representative population into a depressed one, making skewed decisions unavoidable.

It is a missed opportunity that crowd work remains largely invisible while its output, machine learning, is a topic du jour.