Debunking a Viral Internet Post About Breastfeeding Racism

Posted by ryuzaki0 on Thursday November 13, 2014 @05:50AM from the believe-the-worst dept.

Bennett Haselton writes: An editorial with 24,000 Facebook shares highlights the differences in public reaction to two nearly identical breastfeeding photos, one showing a black woman and one showing a white woman, each breastfeeding an infant. The editorial decries the outrage provoked by the black woman's photo compared to the mild reaction elicited by the white woman's photo, and attributes the difference to racism. I tried an experiment using Amazon's Mechanical Turk to test that theory. Read on to see the kind of results Bennett found.

You can see the side-by-side pictures in the November 10 editorial by Ruby Hamad. My first thought, upon seeing the pictures, was that this is not a controlled experiment -- the woman on the left is breastfeeding in public, while the woman on the right is breastfeeding against a blank wall inside a presumably private room. While I think breastfeeding in public should be completely normalized, it's not the same thing as breastfeeding in private, and so that might have accounted for the difference in reactions, if there was any.

My second thought was that the data on people's reactions was not collected in a systematic way. According to the editorial, the photo of the black mother, Karlesha Thurman, was posted on the Facebook page Black Women Do Breastfeed, and "[w]hile Karlesha received many supportive comments, the backlash was so severe, she eventually deleted the photo." The photo of the Australian woman, Jacci Sharkey, was posted by the University of the Sunshine Coast on their Facebook page, where it received 275,000 Facebook "likes", but also, according to the editorial, "more than a few detractors, proving that breastfeeding in public is (still!) a contentious issue for women of all races." There's no apples-to-apples comparison gauging people's reactions to the two photos under similar conditions.

But just because the methodology was imprecise doesn't mean that the underlying phenomenon isn't real. Maybe Internet users really do have different gut reactions to pictures of black women and white women breastfeeding.

One quick way to get a rough answer is Amazon's Mechanical Turk service, where you can pay legions of workers some small amount of money per person to complete some menial task that can't be automated by a computer. I've used it dozens of times for surveys (such as gauging whether people would strongly prefer slideout keyboard phones) and for amateur psychological experiments (including one experiment which suggested that people who answered a math problem correctly were more likely to disagree with an attorney general's dubious legal argument). So I created a poll on Mechanical Turk, limited to U.S. users and with a payout of 25 cents for each person who answered. The poll asked:

Our academic department has asked everyone to submit a "fun" photo of themselves, so that our photos can be displayed together on the department home page. One of our employees submitted a photo that has caused some internal debate about whether the photo is inappropriate. I wanted to do a poll to get the opinion of a random sample of Internet users of different backgrounds.

Do you think this is an appropriate picture to be used in a photo collection on our academic department home page?

Since the original photos had been published in different contexts anyway, I tried to find a middle ground for the wording of the survey question, to emphasize that the photos were going to be published in a "fun" setting, but still integrated into the women's professional environments. The survey-takers were then (randomly) shown either the black woman's photo or the white woman's photo, and answered "Yes, the image is fine" or "No, the image is inappropriate". Then respondents were asked to fill in their age, gender, ethnicity, and education level.

(One thing that I've found with all of my previous surveys on Mechanical Turk is that there is strong evidence that survey-takers are not answering randomly. Strong correlations often occur where you would expect them to -- for example, in a survey about what are the greatest causes of global strife, the same people tend to select "Energy shortages" and "Environmental damage" above other options, whereas another subgroup will tend to select both "Atheism" and "Decline of traditional values". And in any survey where I've added a textbox for users to enter "more thoughts", most users enter something reasonably thoughtful which corresponds to the multiple-choice answers they've selected. Formal research by the psychologist Samuel Gosling has similarly found that Internet surveys can be useful for psychological research and are not plagued with bot-responders or random answers. So I'm working under that assumption.)

The results: Out of 47 respondents who saw the black woman's picture, 36 said the image was inappropriate (77%). Out of 54 respondents who saw the white woman's picture, 38 said the image was inappropriate (70%). For such a small sample, that's not enough to definitively say whether the small difference is due to random chance, or due to small differences in opinion in the population being surveyed. What it does show, even with such a small sample, is that in the underlying population there's almost certainly no huge gap between people's opinions of black women vs. white women breastfeeding in photos.
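As a rough check on those two headline counts, here is a minimal sketch of a two-proportion z-test with a confidence interval for the gap. It is not part of the original survey analysis; the function name `two_proportion_test` is made up for illustration, and the normal approximation it relies on is itself only approximate at samples this small.

```python
from statistics import NormalDist

def two_proportion_test(x1, n1, x2, n2, level=0.95):
    """Two-sided z-test and confidence interval for p1 - p2 (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Test statistic uses the pooled proportion (null hypothesis: p1 == p2).
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Interval uses the unpooled standard error.
    se_unpooled = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return diff, p_value, ci

# Counts from the survey: 36 of 47 (black woman's photo), 38 of 54 (white woman's photo).
diff, p_value, ci = two_proportion_test(36, 47, 38, 54)
print(f"difference = {diff:+.3f}, p-value = {p_value:.2f}")
print(f"95% CI for the difference: {ci[0]:+.3f} to {ci[1]:+.3f}")
```

Under these assumptions the p-value comes out a little under 0.5 and the interval runs from roughly 11 points in one direction to roughly 23 points in the other, so the data are compatible with no gap at all but cannot exclude a moderate one.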

In both surveys, both male and female respondents voted the photos "inappropriate" with about the same frequency. For the black woman's photo, 22 out of 26 men (85%) and 14 out of 21 women (67%) voted the photo inappropriate; for the white woman's photo, 19 out of 30 men (63%) and 19 out of 24 women (79%) voted it inappropriate. There also didn't appear to be any correlation between the age of the respondents and their responses. (You can view the breakdown of answers in terms of respondent demographics here for the black woman's picture and here for the white woman's picture; the crummy layout is because I just copied-and-pasted the output from my own custom-written survey-taking tool, where I usually just view the results for myself.) As for the gap between black and white survey-takers, in the case of the black woman's photo, 24 out of 34 white survey-takers (71%) and 5 out of 6 black survey-takers (83%) voted it inappropriate, while for the white woman's photo, 25 out of 36 white survey-takers (69%) and 4 out of 4 (100%) of black survey-takers voted it inappropriate -- but those discrepancies probably don't mean much, since the population of self-identified black respondents was too small in both cases to draw any conclusions.
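To see how little the smallest subgroups pin down, one option is a Wilson score interval for each count. The sketch below is an illustration only, not something the survey tool produced; `wilson_interval` and the `subgroups` labels are hypothetical names, and only the counts come from the text above.

```python
from statistics import NormalDist

def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * ((p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5) / denom
    return center - half, center + half

# (voted "inappropriate", subgroup size), taken from the breakdown above.
subgroups = {
    "black photo, men": (22, 26),
    "black photo, women": (14, 21),
    "white photo, men": (19, 30),
    "white photo, women": (19, 24),
    "black photo, black respondents": (5, 6),
    "white photo, black respondents": (4, 4),
}

for label, (k, n) in subgroups.items():
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%}, 95% interval {lo:.0%} to {hi:.0%}")
```

For the six black respondents who saw the black woman's photo, the interval spans from the low 40s to the high 90s in percent, which is the quantitative version of "too small to draw any conclusions".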

Even with small samples, though, I would argue that this is a better way to answer the question of latent racism than to draw fuzzy conclusions based on the trolling comments posted on a Facebook photo. My guess is that even if there was an underlying difference in the frequency of negative comments posted to the two photos, part of it could have been due to the photo being posted in a Facebook group titled "Black Women Do Breastfeed", a group name that is practically begging for trolls to wait for a chance to try and provoke an outraged response. The white woman's photo, on the other hand, was posted on the University of the Sunshine Coast Facebook page, which is not the kind of place that maladjusted nitwits hang out trying to start a flame war. And for the trolls who did post on the white woman's photo, their natural inclination would be to make some immature comment about b00bs; whereas for the trolls posting on the black woman's photo, the easiest cheap shot would be to make it about race. But that doesn't mean that there is actually a racially motivated difference in people's reactions to the photos.

Besides, if you want to use Facebook to raise awareness of racism, there are properly controlled scientific experiments that have demonstrated the extent of prejudice, such as the infamous 2003 resume callback experiment which showed that resumes with white-sounding names on them received about 50% more callbacks than resumes with black-sounding names. A viral story with 24,000 Facebook shares, about two isolated incidents under different circumstances, is not necessarily evidence of racism. It might be. But you have to do some kind of controlled experiment to check first.

from the believe-the-worst dept by enjar · 2014-11-13 06:12 · Score: 3, Informative

"Bennett Haselton writes"
Yep. Checks out. But I don't believe it.
I also don't understand the point of this post. Is Slashdot hoping to get picked up on HuffPo and on a bunch of mommy blogger sites? I don't really see how Bennett's keyboard diarrhea this week is anything remotely related to "News for Nerds".

Re:Seconded. by bluefoxlucid · 2014-11-13 08:00 · Score: 4, Informative

Actually, his sample sizes are small. He says this about your 70% to 77%:

For such a small sample, that's not enough to definitively say whether the small difference is due to random chance, or due to small differences in opinion in the population being surveyed. What it does show, even with such a small sample, is that in the underlying population there's almost certainly no huge gap between people's opinions of black women vs. white women breastfeeding in photos.
This is correct: for around 20 or 30 people, you can expect random chance of e.g. 20% (I don't care to remember how to do the math here). That's 20% of the value: if 70% of group A respond one way, then you would be within random chance if group B's responses fell within 56% and 84%, and not have any conclusion. Bennett says here that groupings of 70% and 77% don't conclude a difference due to random chance, but they DO indicate a small magnitude.
Let's say that the actual numbers are 72% and 71.5%. If you performed a properly controlled experiment with tens of thousands of people on each side, you'd find one group showing 72% and one showing 71.5%. Your alpha value would be around 0.001%, so you'd expect an identical population to show something like 72% and 71.93%. A value of 71.5% would be conclusive of a nearly 0.4% difference between populations.
With the small sample size, you'd need a bigger gap. If the numbers were 70% and 20%, you'd have conclusive evidence of a significant difference between populations. At 70% and 77%, you have no evidence for any difference at all; a small difference could exist, but it is exceedingly unlikely that a LARGE difference exists.
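For anyone who does want to do the math skipped above, a minimal sketch of the usual normal-approximation margin of error for a single proportion follows. The helper name `margin_of_error` is hypothetical, and the result is in absolute percentage points rather than a percentage of the observed value.

```python
from statistics import NormalDist

def margin_of_error(p, n, level=0.95):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * (p * (1 - p) / n) ** 0.5

# How the margin shrinks with sample size when about 70% answer one way.
for n in (25, 50, 100, 1000):
    print(f"n = {n:>4}: 70% +/- {margin_of_error(0.70, n):.1%}")
```

With 25 respondents the margin is about 18 points either way, which is in the same ballpark as the rough 20% figure used in this comment.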
Following this logic, 86% and 67% are about the same, and 63% and 79% are about the same. If you want these values to be different from each other, you need bigger sample sizes. Small sample sizes like this are only good for striking divides such as "is your skin more like a banana or chocolate?" surveying black vs white people.
To put this into perspective: out of 14 trials with a deck of 20 red/black cards shuffling 5 times and then predicting the top card, I am 68% likely to predict the correct card drawn from a deck; out of 180 trials, I am 54% likely; out of 700 trials, I am 53.8% likely to correctly predict the card. I did better on early trials, consistently getting 2/3 or more correct. Even hundreds of trials in, I haven't closed on random chance; but we also have about a 5% confidence value at 700 trials, and 53.8% - 5% is less than 50%, so it's quite possible I'm exactly 50% likely to select correctly.

There is no power. by FhnuZoag · 2014-11-13 23:42 · Score: 4, Informative

I am actually a statistician. And this 'study' looks pretty worthless.
The problem is the issue of a 'huge gap'. What gap is huge? Well, we can try and do a power calculation. How big does the gap between the black and white targets *need* to be, to have a good chance of showing up in this test?
This is simple enough to calculate. Plug in some numbers:
1. Sample size in each group - 50
2. Level of Significance - 0.05
3. Power - i.e. the desired probability of finding there to be a significant difference, *if a difference exists*. I've chosen a standard number of 0.8 - i.e. allow for a 20% chance of missing a true effect by accident.
Fixing the proportion of inappropriates for the white woman at 70%, we find.... 91.8%.
In other words, with this sample size, the smallest gap we can reliably detect is 70% vs 91.8%, which corresponds to an over 2/3rds drop in the proportion of people finding the picture appropriate.
To rephrase: if the truth was that 2/3rds of the people who think a white woman breastfeeding is appropriate would *not* think a black woman breastfeeding is appropriate - a situation that I think you'd agree is very racist - then we'd miss such an effect in an experiment like this over 1/5th of the time. Even assuming the experiment was conducted ideally, and no one was just randomly clicking to earn money.
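The calculation above can be sketched as follows. The comment does not say which tool or formula produced 91.8%, so this assumes a two-sided z-test with the standard normal-approximation power formula; `power_two_proportions` and `detectable_p2` are hypothetical helper names.

```python
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two groups of size n."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = (2 * p_bar * (1 - p_bar) / n) ** 0.5        # standard error if p1 == p2
    se_alt = (p1 * (1 - p1) / n + p2 * (1 - p2) / n) ** 0.5  # standard error under the true gap
    # Ignores the negligible chance of rejecting in the wrong direction.
    return norm.cdf((abs(p2 - p1) - z_crit * se_null) / se_alt)

def detectable_p2(p1, n, alpha=0.05, power=0.8):
    """Smallest p2 above p1 that reaches the target power, found by bisection."""
    lo, hi = p1, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if power_two_proportions(p1, mid, n, alpha) < power:
            lo = mid
        else:
            hi = mid
    return hi

# Inputs listed above: 50 per group, significance 0.05, power 0.8, baseline 70%.
p2 = detectable_p2(0.70, 50)
print(f"smallest reliably detectable rate: {p2:.1%}")
```

Under these assumptions the threshold lands within a fraction of a point of the 91.8% quoted above; other power formulas (for example, ones with a continuity correction) would shift it somewhat.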
This article is meaningless.