ER3: On Algorithmic (In)Justice
Algorithmic bias, criminal justice, and chocolate chip cookies.
Welcome back to the Ethical Reckoner. Today, we’re entering Minority Report territory and talking about algorithmic justice: the use of AI and predictive algorithms in the criminal justice system. This includes the courtroom, where algorithms aid in parole, bail, and sentencing decisions, and “the field,” where “predictive policing” algorithms attempt to foresee where crimes will be committed. Like all predictive algorithms, these systems work by taking in data (here, about a defendant’s past behaviour or prior crimes) and using the patterns in that data to make predictions about new cases. After going through some of the technical issues with these algorithms, we’re going to zoom out to interrogate the system that they’re supporting. I’ll mostly be talking about these algorithms in the American context,* but unfortunately, the algorithms and structures we’re discussing are widespread, especially in Western democracies.
Hypothetically, if we could perfectly predict where crimes would be committed and who would commit them, we would have a safer society. But there’s a reason Minority Report is a dystopian movie; human behaviour is complex, and we can’t predict it perfectly. First, the quantity of data required is prohibitive; would we have to model every person’s brain from the moment of birth to figure out when and where they were going to commit a crime? Second, and more relevantly, algorithms can only make predictions from the data we feed them, so if that data is biased or inaccurate, it’s garbage in, garbage out.
To illustrate this, suppose that a cookie company is building an algorithm to predict what kinds of cookies will be popular based on past sales in a certain area. This could be a great way to boost sales without doing costly market research. However, say that the people in the area the company is drawing its data from generally dislike chocolate chips in cookies.** When the company asks its algorithm whether a peanut butter chocolate chip cookie would be successful, it would probably say no, even though such a cookie would surely be a smash hit with the general population.
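To make the garbage-in, garbage-out point concrete, here’s a minimal Python sketch of such a predictor; every ingredient and sales figure below is invented, and the scoring rule is deliberately simplistic:

```python
# A toy "popularity predictor" trained only on sales from one (hypothetical)
# chocolate-chip-averse region. All data here is invented for illustration.
from collections import defaultdict

# Past regional sales: (ingredients, units sold per week)
regional_sales = [
    ({"oatmeal", "raisin"}, 900),
    ({"peanut butter"}, 850),
    ({"sugar"}, 800),
    ({"chocolate chip"}, 120),           # locals dislike chocolate chips
    ({"chocolate chip", "walnut"}, 90),
]

# "Training": the average sales associated with each ingredient
totals, counts = defaultdict(float), defaultdict(int)
for ingredients, sold in regional_sales:
    for ing in ingredients:
        totals[ing] += sold
        counts[ing] += 1
ingredient_score = {ing: totals[ing] / counts[ing] for ing in totals}

def predicted_popularity(ingredients):
    """Predict weekly sales for a new cookie as the mean score of its ingredients."""
    return sum(ingredient_score.get(ing, 0) for ing in ingredients) / len(ingredients)

# The model has only ever "seen" this region, so it pans the new cookie,
# even though the general population would love it.
print(predicted_popularity({"peanut butter", "chocolate chip"}))  # 477.5 -> "meh"
```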
This sort of algorithmic bias gets really serious when you think about anything less trivial than cookie flavours. Remember stop-and-frisk? It was a policy in New York where police officers could stop and search people in public. Officers were ostensibly required to have a “reasonable belief” that the person was about to be, or had been, involved in a crime in order to conduct a stop, and that the person could be armed in order to conduct a search, but in many cases this “reasonable belief” seemed to be grounded in racism. In New York City, Black and Latino people were nine times more likely to be stopped than white residents, but white detainees were twice as likely to be carrying a gun. The policy was ruled illegal for being “implicit racial profiling” and the NYPD is being reformed; crime continues to fall in NYC despite the policy’s elimination.
In the meantime, “stop and search,” a similar UK policy, shows a growing racial disparity. In 2017, Black people in England and Wales were 14 times more likely to be stopped than white people. In 2019? 40 times more likely. And yet, rates of prohibited item possession are the same across ethnicities.
Feeding this data into an algorithm to predict who should be stopped and searched would just result in even more Black people being unjustifiably detained. In fact, that’s exactly what’s happening with predictive policing algorithms, which attempt to predict where crimes will be committed to better allocate police resources. Unlike in Minority Report, these algorithms (like the leading PredPol, or the Durham police force’s HART) don’t try to predict individual behaviour, but patterns of group behaviour. The problem is that many of the crimes being fed into them are not violent crimes but “Part 2” or “nuisance” crimes—things like vagrancy, small-scale drug offences, and aggressive panhandling—which would likely go unreported unless a cop was there to observe them. As Cathy O’Neil explains in her book “Weapons of Math Destruction,” “these nuisance crimes are endemic to many impoverished neighbourhoods… including them in the model threatens to skew the analysis.” Feeding more of these crimes into the algorithm sends more police into those neighbourhoods, where they’re “more likely to arrest more people,” creating a “pernicious feedback loop” where “the policing itself spawns new data, which justifies more policing” (O’Neil 2016).
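Here’s a minimal sketch of how that feedback loop compounds. The numbers, the hotspot-style allocator, and the assumption that nuisance crime only enters the data where a patrol is present are all mine for illustration; this is not PredPol’s or HART’s actual logic.

```python
# Two neighbourhoods with identical true rates of nuisance crime, but A starts
# with slightly more *recorded* incidents. A hotspot-style allocator sends the
# extra patrols wherever the data says crime is higher, and nuisance crime only
# enters the data where a patrol is present. All numbers are invented.

TRUE_DAILY_INCIDENTS = 10        # the same in both neighbourhoods
RECORDS_PER_PATROL = 1           # each patrol records roughly one incident a day
BASE_PATROLS, HOTSPOT_PATROLS = 2, 6

recorded = {"A": 12, "B": 10}    # a tiny initial disparity in the historical data

for day in range(30):
    # "Predictive" step: extra patrols go to whichever area has more recorded crime.
    hot = max(recorded, key=recorded.get)
    patrols = {h: BASE_PATROLS + (HOTSPOT_PATROLS if h == hot else 0)
               for h in ("A", "B")}
    # Recording step: you can only log the incidents your patrols actually see.
    for h in ("A", "B"):
        recorded[h] += min(patrols[h] * RECORDS_PER_PATROL, TRUE_DAILY_INCIDENTS)

print(patrols)    # {'A': 8, 'B': 2}
print(recorded)   # {'A': 252, 'B': 70} -- the data now "justifies" the skew
```

Both neighbourhoods generate the same number of incidents every day, but after a month the recorded data says one has several times the crime of the other, and the allocation looks fully “justified” by that data.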
But, you say, what if we don’t put race into the equation? Well, those algorithms don’t include race, but American cities are still highly segregated, so an algorithm that makes judgements based on geography is also making judgements based on race. Companies making these algorithms, including Northpointe, bend over backwards to stress that they do not include data about race. Northpointe (now Equivant) developed COMPAS, a courtroom risk assessment algorithm that assigns “risk scores” of 1 to 10 to defendants to assess how likely they are to recidivate, or commit crimes in the future. The scores are intended to be used for parole decisions and treatment program eligibility. Northpointe does not disclose their proprietary algorithm, but it is derived from publicly available criminal record data and a defendant survey that includes questions like “How many of your friends/acquaintances are taking drugs illegally?” and asks defendants to agree or disagree with statements like “A hungry person has a right to steal.” It does not include race.
And yet, the algorithm is biased: a ProPublica analysis found that Black defendants were twice as likely as white defendants to be mis-categorised as high risk, while white defendants were more likely to be mis-categorised as low risk. Even controlling for criminal history, recidivism, age, and gender, Black defendants were more likely to be predicted to commit future crimes.
So, the algorithm clearly seems biased, even though it doesn’t explicitly take race into account. However, it does use variables like geography and income that are highly correlated with race. These are called “proxy variables” because they serve as representations of another variable (here, race) in the data.
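Here’s a tiny illustration of a proxy variable at work, with made-up data: race is never given to the scoring rule, but because neighbourhood is strongly correlated with race, a rule keyed on neighbourhood reproduces the racial disparity anyway.

```python
# Invented data: race is never an input to the "model", but neighbourhood is,
# and neighbourhood is strongly correlated with race.

# Each record: (neighbourhood, race, actually_reoffended)
defendants = [
    ("north", "Black", False), ("north", "Black", True),
    ("north", "Black", False), ("north", "white", False),
    ("south", "white", False), ("south", "white", True),
    ("south", "white", False), ("south", "Black", False),
]

# Historical arrest data is heavier in "north" (an over-policed area),
# so the proxy-based rule labels everyone from "north" as high risk.
def risk_score(neighbourhood):
    return "high" if neighbourhood == "north" else "low"

# Race was never an input, yet high-risk labels fall mostly on Black defendants.
for race in ("Black", "white"):
    group = [d for d in defendants if d[1] == race]
    high = sum(risk_score(d[0]) == "high" for d in group)
    print(race, f"{high}/{len(group)} labelled high risk")
# Black 3/4 labelled high risk
# white 1/4 labelled high risk
```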
Is the algorithm really biased, though? Northpointe says no, claiming that at a certain high-risk score threshold, the likelihood of recidivism is equal across racial groups. Alexandra Chouldechova (2017) explains that this is a consequence of applying different definitions of fairness; ProPublica was looking at the “error rate balance,” while Northpointe was using a “predictive parity” approach. It’s mathematically impossible to satisfy both at the same time. And to make it even more complicated, there are even more definitions of fairness than just those two.
In America, risk assessment tools are used in 49 states and the District of Columbia. COMPAS is used in at least four states, including Florida (where the ProPublica analysis was conducted) and Wisconsin, where a defendant, Eric Loomis, sued on the grounds that using COMPAS as a factor in his sentencing violated his due process rights. In State v. Loomis, the court ruled that COMPAS did not violate due process rights, but that judges using it had to be presented with five written warnings: that the way COMPAS risk scores are calculated is not disclosed; that COMPAS cannot make meaningful predictions about individuals because it is based on group-level data; that there has been no validation study in Wisconsin; that studies “have raised questions” about racial bias in the algorithm; and that it wasn’t developed to be used for sentencing decisions.
This speaks to a broader issue with algorithmic risk assessments. COMPAS scores were originally only intended to be used in parole eligibility decisions and post-sentencing decisions. However, human cognitive biases favour the use of data because it has a veneer of impartiality, even though it’s produced by messy, biased humans and is thus a reflection of our biases. So, judges are prompted to over-rely on the algorithm and use it beyond its intended purpose. Courtroom algorithms are meant to be “another tool in the judicial tool box”—a supplement to judicial discretion, not a replacement—but risk turning the judiciary into “a farce where the machine hijacks the judge, completely squeezing a judge’s discretion.”***
These algorithms risk supplanting judicial autonomy, but they can also amplify its worst tendencies. The process where algorithms provide input to judges—for decisions that may or may not be part of their original purpose—allows both judicial and algorithmic bias to enter proceedings; judges may be more or less likely to take the algorithm’s assessment at face value depending on, say, the race of the defendant. This in turn could produce further inequitable outcomes that are fed as data into algorithms, whose decisions reinforce judges’ biases, and we end up in a negative spiral of increasing computer and human bias.
So, let’s take a step back. Criminal justice algorithms are biased, but whether they count as biased depends on which definition of fairness you use. Ultimately, though, the reason that it is impossible for COMPAS to satisfy all the definitions of fairness at once is that Black and white defendants do recidivate at different rates. In the dataset ProPublica analysed, 51% of Black defendants re-offended, compared to 39% of white defendants. The error rate balance ProPublica examined trades off against Northpointe’s predictive parity: “when dealing with two populations with different recidivism rates, better fairness in one sense (predictive parity) can only be achieved by reducing fairness in the other sense (error rate balance).” Mathematically speaking, making classifications about two groups with different recidivism rates will always result in some unfairness.
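To see the arithmetic, here’s a short sketch using the relation Chouldechova derives between the base rate, the positive predictive value (PPV), and the error rates. The base rates are the ones above from the ProPublica dataset; the PPV and miss-rate values are hypothetical, chosen only to show the trade-off.

```python
# Chouldechova's relation: FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR).
# If both groups get the *same* PPV (predictive parity) and the same miss rate,
# different recidivism base rates force different false positive rates.

def false_positive_rate(base_rate, ppv, fnr):
    """False positive rate implied by a base rate, a PPV, and a miss rate (FNR)."""
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * (1 - fnr)

PPV = 0.70   # hypothetical: 70% of "high risk" labels are correct, in BOTH groups
FNR = 0.30   # hypothetical: 30% of actual recidivists are missed, in BOTH groups

for group, base_rate in [("Black defendants", 0.51), ("white defendants", 0.39)]:
    print(group, round(false_positive_rate(base_rate, PPV, FNR), 3))
# Black defendants 0.312
# white defendants 0.192  -> equal predictive parity, unequal error rates
```

Holding predictive parity fixed (the same 70% PPV for both groups) pushes the false positive rate for Black defendants to roughly 31%, against roughly 19% for white defendants, which is the kind of error rate imbalance ProPublica flagged.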
This all goes back to my point about the data being fed into algorithms. Criminal justice, especially in America, is historically a racist institution. The rise of the Black Lives Matter movement has brought this to the forefront of the public consciousness, but it’s been true since Southern slave patrols started chasing down individuals escaping slavery in the 18th century. According to a 2016 Sentencing Project report, Black people in America today are incarcerated at a rate over five times that of white people. In twelve states, more than half of the prison population is Black. To illustrate this beyond the statistics, over the last year we saw peaceful BLM protestors greeted on the steps of the Capitol by armed guards in riot gear, while mostly-white rioters were carefully ushered down the stairs after invading the building.
This does not mean that Black people are inherently more criminal. First, our policing system is biased. If you’re stopping a higher proportion of Black people than anyone else, even if the rate at which they have contraband is the same as anyone else, you’ll end up arresting a disproportionate number of Black people. Second, much of the rest of our society is set up to incarcerate people of colour more often than white people. Cathy O’Neil explains that Black communities are over-policed to begin with, and are also often under-served by resources and interventions that can improve individual circumstances and decrease crime. Poverty and race are highly correlated (another example of proxy variables), and those who can’t afford a lawyer are assigned an often-overloaded public defender, resulting in lower-quality defence.
Over-policing is only exacerbated by the vicious cycle created by PredPol and other predictive policing algorithms, which by their nature are meant to detect crimes “on the street,” rather than white-collar crimes and other less visible crimes—the crimes of the wealthy and white (O’Neil 2016). Our criminal justice system chooses what areas and crimes it will police, and that results in more people of colour being sent to prison.
Ultimately, the goal of algorithmic justice is rooted in detecting and punishing criminals. Perfectly predicting where crimes would be committed and who would commit them just so you can go and arrest them Minority Report-style focuses on retributive justice, not rehabilitative or restorative justice—its goal is to punish, not to rehabilitate. So, if we could know every young Black man who is going to be involved in a shooting, that would just result in us locking up more young Black men, instead of focusing on social interventions that would prevent shootings and obviate the need to lock up more people.
Instead of quibbling over which definition of fairness is the least unfair in algorithmic justice, we need to focus on actually changing the underlying system that’s creating the unfairness in the first place. This means substantial, bedrock reform to the criminal justice system. Much has been written by those better informed than I am on where to start—the Equal Justice Initiative is doing great work to reduce excessive punishment, overturn wrongful convictions, and advocate for the underserved—but I will point out that the algorithms currently contributing to these problems could actually be part of the solution. Instead of treating them as an impartial source of predictive truth, treat them as a quantification of human bias. Analyse why they are unfair, then look at the processes that are feeding them data—what crimes we’re prosecuting, where we’re sending police, who we’re granting parole—and try to change them. One could argue that we should just look at incarceration rates, but it’s easy to become inured to the obvious disparities that have persisted for so long. Using algorithms as a tool to “objectively” quantify our biases could serve as the nudge we need: a reminder at just the right distance that our institutions are racist and need to be changed, and an indicator of where to start. We may even be able to reverse the negative spiral we find ourselves whirling in and turn it into a positive one—what Luciano Floridi dubs a constructive “normative cascade”—where citizens become aware of the need to change how we approach a system ethically, those ethics feed into the law, the law feeds into government, and government feeds back to the citizens. If we row against the downward spiral we’re in, we can turn it into a virtuous cycle that helps create a truly just society.
*I want to acknowledge that, as a white woman, I’ve historically benefitted from the inequalities of the American criminal justice system, and also that I am not a criminal justice expert. I’ve tried to inform myself, but have likely gotten things wrong, so dialogue is welcome. Also, people of other races are undoubtedly targets of racism in criminal justice, but most of the literature around algorithmic justice focuses on anti-Black racism, which informs the focus of this article.
**Let the record show that wherever this is, I do not want to live there.
***This is from a translation of a piece about “smart justice” in China, which is trying to make courts more efficient through automated administrative tools to ease filing processes, as well as judge-assistant tools. Well worth a read.
Thumbnail generated by DALL·E 3 via ChatGPT with the request “Generate an abstract Impressionist painting of the concept of algorithmic injustice in a dark color palette”.
Thanks for reading. If you found this interesting, please share and/or subscribe.