ER 13: On Community, ChatGPT, and Human-Centered AI
Or, Emmie complains about driving (and dominant paradigms in AI ethics)
Welcome back to the Ethical Reckoner. What’s this, two consecutive months of publishing? I'd better be careful or I’ll get on a roll. Today, I have a story for you, some musings about everyone’s new online BFF, ChatGPT, and some thoughts on where ethical AI should go from here.
On December 30, a friend and I were driving on the I-15 from Los Angeles to Las Vegas. This highway runs through the middle of the desert. There’s never any traffic. And yet, on the afternoon of the 30th, we saw more brake lights than cacti.1 As we crawled through the third major slowdown—this one with no apparent cause—Google Maps pinged us about an alternate route that would save us 17 minutes if we exited the highway, drove along an access road for a few miles, then got back on the highway past a portion of the congestion. After hours in the car already and no end in sight, anything seemed better than staying on the highway, so we took the exit.
Unfortunately, so did a bunch of other people. We zoomed along for a couple miles, then encountered a line of cars whose drivers all either knew these backcountry roads intimately or (more likely) had gotten the exact same Google Maps suggestion we had. We sat, unmoving, for several minutes. Then a car came driving back the other direction, flashing its brights. “What are they doing? Is the ramp closed?” I asked. “No idea, but it’s unsettling,” responded my friend. We wondered if we should turn around, but didn’t relish the idea of getting back on the bumper-to-bumper highway. When we still hadn’t moved and a second car came by with brights flashing, though, we made the call. I swung the car around and we booked it away. Of course, now all the people in front of and behind us who had done the same thing had to get back on the congested highway too. The segments of green, traffic-free road that Google had recommended we use to bypass the initial congestion shone bright red and even maroon on the map. We despaired of ever making it to Vegas.
In the end, we did some tactical dirt roading (not recommended by Google) to get around a massive backup of cars trying to get back on the highway at the next ramp, but getting off the highway had not, in fact, been 17 minutes faster. Following Google’s recommendation had been many minutes slower for us, for everyone else who got off, and ultimately even for the drivers who had chosen to stay the course but had to deal with the throngs trying to get back on the highway.
I’m telling you this story not to complain about Google Maps screwing up2 but because Google Maps screwed up in an interesting way. Google Maps is designed to optimize the route that you, a single car, are taking, based on present traffic conditions. It is not designed to account in advance for how the recommendations it makes to everyone else might impact that route. In other words, it doesn’t consider that it has already sent 300 cars down a side road that has less congestion at the moment but will become worse than the main road if it keeps routing cars that way. Research suggests that more people using navigation apps may actually make traffic worse overall, partly because of how they route users down side streets. The Atlantic dubs this “perfect selfishness” on the part of commuters, but I think the blame lies with the app providers—we as drivers are conditioned to believe that using mapping apps is the best option for everyone, and we don’t realize that the apps could be harming the wider community of people using them.
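To make that failure mode concrete, here’s a toy sketch in Python. The delay curves, the 300-driver figure, and the two-road setup are all invented for illustration; this is not Google’s routing model, just the general shape of the problem: the detour is scored against conditions as they are right now, not against the congestion the suggestion itself is about to create.

```python
# Toy model of "snapshot" routing. All numbers are made up for illustration;
# this is not Google's algorithm, just the general failure mode.

def highway_time(cars):
    """Minutes to clear the congested stretch, given how many cars are on it."""
    return 60 + 0.05 * cars

def side_road_time(cars):
    """Minutes on the narrow access road: fast when empty, chokes quickly."""
    return 20 + 0.5 * cars

N = 300  # hypothetical number of drivers who all get the "save 17 minutes" ping

# What the app shows each driver: the side road as it looks right now.
promised = side_road_time(0)

# What actually happens once everyone takes the same suggestion.
actual = side_road_time(N)
stayed = highway_time(N)

print(f"promised detour: {promised:.0f} min")
print(f"detour after {N} drivers divert: {actual:.0f} min")
print(f"just staying on the highway: {stayed:.0f} min")

# A community-aware router would instead pick the split that minimizes the
# average delay across all drivers, not the route that looks best to each
# driver in isolation.
best_split = min(
    range(N + 1),
    key=lambda s: (s * side_road_time(s) + (N - s) * highway_time(N - s)) / N,
)
print(f"best for the group: {best_split} divert, {N - best_split} stay")
```

The specifics are fake, but the structure is the real issue: each recommendation is evaluated against current conditions, with no accounting for the fact that the same recommendation is going out to everyone else stuck in the same jam.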
Another area that could use some community orientation is the recent explosion of generative AI—tools that create photos, text, audio, or video that can be indistinguishable from content created by humans. The most recent one to make waves is ChatGPT, and between giggling about biblical verses on removing a peanut butter sandwich from a VCR and trying to get around its safety barriers by asking for poems about how to hotwire a car, people started raising questions about how it will impact the wider world. A big concern comes from teachers, who fear that ChatGPT will make it far easier for students to cheat; The Atlantic dubbed it “the end of high-school English” because of its ability to generate a pretty good five-paragraph essay and our corresponding inability to reliably tell which text was written by a human and which by ChatGPT. Whether or not it’s hyperbolic to ring the death knell for the essay, students are already using ChatGPT to cheat, and generated-content detectors aren’t up to the task. The NYC public school system blocked ChatGPT on its devices and networks, but critics have noted that the block will be fairly easy to get around; acting like an ostrich won’t have any effect. Generative AI is here to stay.
This is forcing teachers to scramble and rethink the structures of their classes. Some teachers are embracing it, reasoning that using ChatGPT in the classroom could help teach students to function in a world transformed by generative AI, but the point remains that forcing often-overworked teachers to completely revamp their teaching philosophy partway through the school year seems unfair, to say the least. It’s also detrimental to the students who may be tempted to use ChatGPT to cheat and thus won’t learn as much as they otherwise would have. OpenAI is working on ways to identify generated text, but CEO Sam Altman has said that there will still be ways to get around them. That is true, but comparing the rise of ChatGPT in schools to the rise of calculators in math class is a little disingenuous. Calculators weren’t suddenly dropped into the pocket of every student; ChatGPT was an overnight sensation.
Generative AI is an incredible set of technologies, and it may lead to some amazing new educational methods. But what probably would have been better for the community is consulting with teachers and students about what they would need from OpenAI to minimize disruption. This might have involved releasing a ChatGPT-detection tool at the same time as ChatGPT, giving teachers time to think about whether and how to incorporate it into the curriculum while preventing cheating on their existing assignments.3
This is just another example of big tech companies overlooking the impacts of their work on certain communities. There’s another community that is often ignored when discussing AI: the data labelers. AI often relies on training datasets that are labeled by humans, which it uses to learn how to classify new items. Whether on crowdwork websites like Mechanical Turk (where workers complete micro-tasks for small payments) or via outsourcing firms in other countries, AI companies try to find the cheapest way to label datasets. For OpenAI, this meant using a firm called Sama, a dedicated data annotation and labeling company that employs workers in Kenya, Uganda, and India. It has worked with Google, Microsoft, Walmart, and Facebook, and is a certified B Corp that claims to have helped 50,000 people lift themselves out of poverty.
OpenAI contracted with Sama to label examples of harmful content, which OpenAI used to train a toxic-content detection tool.4 Despite the fact that Sama advertises itself as an “ethical AI” company, working conditions are… not great. Workers on the OpenAI project were paid between $1.32 and $2.00 an hour, including a monthly bonus for having to deal with explicit content. This is a relatively reasonable wage in Kenya, where the minimum wage for receptionists in Nairobi5 last year was $1.52/hour, but in general, receptionists aren’t working in workplaces “characterized by mental trauma, intimidation, and alleged suppression of the right to unionize.” Labelers and moderators had to read hundreds of graphic, violent pieces of content per day, including descriptions of murder, suicide, and child sexual abuse.6 Employees reported being time-pressured and unable to access effective mental health services, despite Sama’s claims to the contrary. OpenAI claims to have been unaware of the conditions, but after a “miscommunication” in which Sama sent it images of child sexual abuse—which some employee had to have collected—OpenAI terminated its contracts with Sama. Some workers were transferred to jobs without the explicit-content bonus but left with all the trauma of having been exposed to it; others were fired. In the end, this community of workers was left mentally scarred, not wealthier—many reported living hand-to-mouth, and threatened strikes for higher pay were quashed—and ultimately, perhaps, jobless.
This all speaks to something I haven’t been sure how to talk about for a while, and it’s this: I’m profoundly uncomfortable with the idea of “human-centered AI” (HCAI) as it’s usually discussed. For those of you not in the AI research bubble, human-centered AI is the idea that AI should prioritize the well-being of humanity and “enhance humans rather than replace them.” It is the guiding principle of the EU’s AI strategy, and institutions including Stanford, Oxford, and the University of Bologna have research groups dedicated to its study.
In principle, I agree with the core idea of human-centered AI: AI should be good for humans. However, it should also be good for a lot more. Human-centered AI tends to focus on two levels: the individual (or user) and humanity. Generally7, it tries to build AI that respects individual humans through principles like transparency, fairness, and justice, and/or AI that addresses problems humanity faces, like climate change. It also uses measures like human rights impact assessments and keeping humans “in the loop” of decision-making to ensure that humans aren’t undermined.
However, there’s a lot that gets lost between the micro and the macro levels. Take OpenAI, for example. They don’t explicitly claim to be a human-centered AI company, but their mission statement is “to ensure that artificial intelligence benefits all of humanity,” and their white papers generally address issues that impact individuals, like bias and transparency. For instance, one of the main issues they had to address with DALL-E (their image generation tool) was reducing bias; they implemented a technique that makes generated images for prompts that don’t specify race or gender “more accurately reflect the diversity of the world’s population.” The prompt “A portrait of a teacher” originally returned six middle-aged white women, but after mitigation the results included two men and multiple people of color. This reduces representational bias, which is rife in other image generation tools like Lensa, which is notorious for returning highly sexualized images of women, especially Asian women, because of what’s in its training dataset.8 What’s interesting to me is how they couch it in the language of being better for the world. Yes, it is undoubtedly better for the world to have more diversity in what will likely replace stock images, and to have individuals not exposed to such bias, but it’s also better for the specific communities that are otherwise offensively represented or under-represented.
In this case, we still got to a more ethical result, but looking at the ethics of AI as either an individual-level or a humanity-level problem doesn’t always get us to the best conclusion. Let’s return to the Sama workers in Kenya whose labor made ChatGPT less likely to spew hateful content. The individual level looks at the result of using AI for a single user: when a user asks ChatGPT something, they are less likely to get an offensive response. The humanity level looks at the aggregate results of using ChatGPT: there is less hateful content online than there otherwise would be, and society is thus less exposed to it. Check, check, it’s human-centered. But what about the workers who were traumatized? Yes, someone had to label the training dataset. But to delegate that heavy, heady task to poorly paid workers at a company that refused to provide adequate mental health resources, in a faraway country where OpenAI could claim plausible deniability about working conditions? We cannot call that human-centered. It is literally treating humans as cogs in the AI machine—the antithesis of prioritizing them.
We—as individuals, society, and humanity—would be better off if we took a community-centered approach to AI. Communities, be they offline or online, familial or not, cannot thrive unless their individual members do, and humanity cannot thrive unless the multiplicity of communities that we naturally form are thriving. A community-centered approach also gives us a framework by which to prioritize AI applications and risk mitigation. If AI creates harmful effects that manifest most clearly at the community level—like polarization within a specific affinity group—we can bound the problem more clearly. Or, if a community is suffering from a problem that only impacts each individual a small amount, we can use this framework to consider whether AI could help. This still prioritizes the well-being of humans, but it forces us to look at every community that’s impacted by AI at every stage of the development cycle; we can’t use the pseudo-utilitarian excuse that something benefits all of humanity to avoid addressing the ugliness of smaller-scale impacts. Yes, trade-offs will have to be made. Labeling harmful content will always be a horrible job, but we can at least make sure that the workers doing it are treated with respect, paid well, and have access to mental health resources during and after the job is done.
The White House’s new Blueprint for an AI Bill of Rights does a good job of centering the community, especially when considering pernicious emergent effects, and the AI community should keep expanding on this. One shortcoming of the Blueprint is that it hardly mentions the environment, even though communities are impacted by local environmental issues as well as by climate change as a whole. We don’t have to start from scratch here. The philosophy of “ubuntu,” which originated in Southern Africa, is summarized by Rev. Desmond Tutu as “a person is a person through other people.” In other words, it is our social bonds, our communities, that enable our humanity. We are made people by how we “enable the community around [us].” In the AI age, our communities are not just our immediate family, not just our neighbors. Our communities are the people touched by the tools we use, and it’s vital that we remember this and prioritize community care. This is especially critical in AI, where not all communities are represented in the development process. Rather than wait for the slow reshaping of the developer pipeline, we can actively identify vulnerable communities and bring in their feedback every step of the way. But you don’t have to be an AI developer to put this into action; there are ways big and small to practice it: don’t take the exit in a traffic jam. Even better: Google, stop recommending the exit in a traffic jam. If you’re leaving Twitter, gather like-minded people on Discord. Form tech labor unions that include not just the coders, but the cooks, cleaners, and data labelers. It boils down to this: recognize the communities you’re in. Care about them. Work to make them better, and work to make sure the AI tools we’re building make them better too.
1. We probably should have known that Times Square isn’t the only American New Year’s destination and found out later that there were 400,000 people on the Strip for New Year’s Eve. Having come to Vegas to rock climb, we were not among them.
2. Ok, I am definitely complaining about the drive and Google Maps screwing up, but I promise there’s an intellectual point.
3. There is an argument that such an approach would enable a “head-in-the-sand” response, letting teachers pretend that nothing is changing, but I think that will be untenable given how explosive ChatGPT’s growth has been, and given that many teachers actually seem intrigued by what it can do and how it can benefit their teaching and their students.
4. For certain kinds of machine learning, the algorithm needs a dataset in which examples are labeled with the characteristic it needs to learn. So, if you’re trying to train an algorithm to distinguish between cats and dogs, you would give it thousands (if not millions) of pictures of cats and dogs, each labeled as either a cat or a dog. The algorithm eventually learns the patterns that define “cat-ness” or “dog-ness” and decides which is more likely for new examples. In this case, OpenAI fed it examples of text labeled as harmful.
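If you’d like to see the shape of that in code, here’s a minimal sketch using Python and scikit-learn. The four “documents” and their labels are invented, and this is nowhere near OpenAI’s actual toxicity classifier in scale or technique; the point is just that the human-applied labels are the only thing telling the model what “harmful” means.

```python
# A minimal sketch of learning from labeled examples, not OpenAI's actual
# toxicity classifier: the dataset below is invented and absurdly small.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Human-labeled examples: 1 = harmful, 0 = benign. In the real case, these
# labels are exactly the work the Sama annotators were doing.
texts = [
    "I hope something terrible happens to you",
    "you are worthless and everyone hates you",
    "have a wonderful day, friend",
    "the weather is lovely this afternoon",
]
labels = [1, 1, 0, 0]

# The model learns which word patterns separate the two labels...
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# ...and then guesses a label for text it has never seen.
print(model.predict(["have a lovely afternoon"]))         # should lean benign (0)
print(model.predict(["you are worthless and terrible"]))  # should lean harmful (1)
```

Scale that up from four toy sentences to hundreds of thousands of genuinely disturbing ones, and you get a sense of what the labeling work actually involves.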
5. Minimum wage in Kenya depends on profession and area, but Kenya’s absolute minimum wage at the time was ~$0.73/hour, meaning Sama’s workers were making less than double that. Applying the same ratio to the US federal minimum wage of $7.25/hour: would you live in the US and do this job for $13.05/hour? Didn’t think so.
6. Facebook used Sama as a base for its entire Sub-Saharan Africa content moderation operation; moderators were paid similar wages to review graphic posts (including images and videos of murder, suicide, and sexual abuse) and decide if they violated community standards. Sama will not be renewing its contract with Facebook, meaning that 200 jobs will be cut.
7. I am painting this with an extremely broad brush that will not capture all of the nuances of all HCAI research.
8. For more on the need to audit the datasets used for generative AI, see “On the Dangers of Stochastic Parrots.”
In the spirit of community care: Thanks to Jackson for putting up with me on the drive and helping generate the seeds of this post, and to Isabella for beta-reading and giving some excellent feedback.
The thumbnail image was generated by DALL-E with the prompt “Community, ChatGPT, human-centered AI, oil painting.” I am aware of the ethical questionability of using generated art, but I ask for forgiveness because Substack defaulted the preview to a massive picture of my face and it was horrifying.