Mathwashing: How Algorithms Can Hide Gender and Racial Biases
Scholars have long pointed out that the way languages are structured and used can say a lot about the worldview of their speakers: what they believe, what they hold sacred, and what their biases are. We know humans have their biases, yet many of us have the impression that machines are somehow inherently objective. But does that assumption apply to a new generation of intelligent, algorithmically driven machines that are learning our languages and training on human-generated datasets? By virtue of being designed by humans, and by learning natural human languages, might these artificially intelligent machines also pick up some of those same human biases?
It seems that machines can and do indeed assimilate human prejudices, whether they are based on race, gender, age or aesthetics. Experts are now finding more evidence that supports this phenomenon of algorithmic bias. As sets of instructions that help machines to learn, reason, recognize patterns and perform tasks on their own, algorithms increasingly pervade our lives. And in a world where algorithms already underlie many of those big decisions that can change lives forever, researchers are finding that many of these algorithms aren’t as objective as we assume them to be.
In one recent study, Princeton University researchers trained an off-the-shelf machine learning system on 2.2 million words and used a word-association technique to map out what kinds of links the system would form between words and concepts. They found that the system rated words such as “flower” and “music” as more pleasant than words like “insects” and “weapons.”
But even more telling was how the system also interpreted European-American names as more pleasant than their African-American counterparts, or how it also associated the words “woman” and “girl” with the arts, instead of science and mathematics. In analyzing the connections made during the process of natural language learning, the machine learning system did indeed seem to take on some of those existing gender and racial biases that humans might espouse.
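To make the idea of a word-association measurement concrete, here is a minimal sketch. It assumes that words are represented as numeric vectors and that "association" is measured with cosine similarity, which is one common approach. It is not the researchers' code, and the two-dimensional vectors below are invented purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word_vec, pleasant_vecs, unpleasant_vecs):
    """Mean similarity to pleasant words minus mean similarity to unpleasant words."""
    pleasant = sum(cosine(word_vec, p) for p in pleasant_vecs) / len(pleasant_vecs)
    unpleasant = sum(cosine(word_vec, q) for q in unpleasant_vecs) / len(unpleasant_vecs)
    return pleasant - unpleasant

# Hypothetical toy vectors, made up for this example (not real embeddings).
toy_vectors = {
    "flower": [0.9, 0.1],
    "insect": [0.1, 0.9],
    "love":   [1.0, 0.0],
    "peace":  [0.8, 0.2],
    "hatred": [0.0, 1.0],
    "filth":  [0.2, 0.8],
}
pleasant = [toy_vectors["love"], toy_vectors["peace"]]
unpleasant = [toy_vectors["hatred"], toy_vectors["filth"]]

flower_score = association(toy_vectors["flower"], pleasant, unpleasant)
insect_score = association(toy_vectors["insect"], pleasant, unpleasant)

# A positive score means "closer to the pleasant words", a negative score the opposite.
print(f"flower: {flower_score:+.3f}, insect: {insect_score:+.3f}")
```

Swapping in names or occupations for "flower" and "insect" is what turns a harmless measurement like this into a test for social bias: the arithmetic is identical, and only the word lists change.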
The nature of these findings was echoed in another experiment that showed men were more likely to be shown targeted Google advertisements for high-paying jobs, compared to women.
“In all cases where machine learning aids in perceptual tasks, the worry is that if machine learning is replicating human biases, it’s also reflecting that back at us,” said Princeton computer scientist and one of the paper’s authors Arvind Narayanan on IEEE Spectrum. “Perhaps it would further create a feedback loop in perpetuating those biases.”
Elections, Loans, Jobs
At first glance, these potential prejudices might not seem like a big deal, but the real-world consequences can be quite serious. For example, the recommendation engines behind social media newsfeeds surface targeted bits of information, which may ultimately end up swaying an election.
Biases underlying digital lending software algorithms may also discriminate against people by giving them a lower credit rating due to factors unrelated to their personal creditworthiness, such as their social media connections, what they buy, what kind of SAT scores they have, whether they are smokers, or whether they use punctuation properly in their text messages.
There might be a loose correlation between these factors and creditworthiness, but correlation does not necessarily imply causation. “[Digital lenders] brag that they score people in part based on their ability to use punctuation, capitalization and spelling, which is obviously a proxy for quality of education,” as Cathy O’Neil, a former math professor who now leads an algorithmic auditing company, points out on American Banker. “It has nothing to do with creditworthiness. Someone who is illiterate can still pay their bills.”
The same concerns apply to applicant tracking software that helps companies match applicants to job openings. Machine learning algorithms can be used to filter and weed out the majority of resumes (up to 72 percent, according to one source) even before a human sees them. The problem is that if there’s human bias baked in, whether intentionally or not, such programs could end up discriminating against applicants based on gender, race, age or disability, which is against the law.
While recruiting software can save HR departments time and money, the problem is that it’s difficult to fully understand how the underlying algorithms in all these pieces of software work, even for their creators. These algorithms are often proprietary, and companies are often not very transparent about how they function.
Perhaps one can cope with not getting that dream job. But the potential for life-altering consequences stands out starkly in the automated risk-assessment reports done for the criminal justice system. The scores on these reports evaluate which prisoners are most likely to re-offend, and are used by the courts to decide who should be granted parole, given alternative treatment, or released. As a recent ProPublica report revealed, the algorithms behind one program that generated risk assessments were twice as likely to erroneously score blacks as higher-risk reoffenders, while mislabeling whites who would later go on to commit more crimes as lower-risk offenders.
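The kind of disparity described here can be expressed as a difference in error rates between groups. The sketch below is an illustration only: the records are hypothetical, invented to mirror the "twice as likely" pattern, and are not ProPublica's data or method. The field names `labeled_high_risk` and `reoffended` are likewise assumptions made for this example.

```python
def false_positive_rate(records):
    """Share of people who did NOT reoffend but were still labeled high risk."""
    did_not_reoffend = [r for r in records if not r["reoffended"]]
    if not did_not_reoffend:
        return 0.0
    flagged = sum(1 for r in did_not_reoffend if r["labeled_high_risk"])
    return flagged / len(did_not_reoffend)

def false_negative_rate(records):
    """Share of people who DID reoffend but were labeled low risk."""
    reoffended = [r for r in records if r["reoffended"]]
    if not reoffended:
        return 0.0
    missed = sum(1 for r in reoffended if not r["labeled_high_risk"])
    return missed / len(reoffended)

def make_records(flagged, reoffended, count):
    """Build `count` identical hypothetical records."""
    return [{"labeled_high_risk": flagged, "reoffended": reoffended}
            for _ in range(count)]

# Hypothetical toy data for two groups of 20 people each (invented numbers).
group_a = (make_records(True, False, 4) + make_records(False, False, 6)
           + make_records(True, True, 5) + make_records(False, True, 5))
group_b = (make_records(True, False, 2) + make_records(False, False, 8)
           + make_records(True, True, 3) + make_records(False, True, 7))

fpr_a, fpr_b = false_positive_rate(group_a), false_positive_rate(group_b)
fnr_a, fnr_b = false_negative_rate(group_a), false_negative_rate(group_b)

print(f"False positive rate: group A {fpr_a:.0%}, group B {fpr_b:.0%}")
print(f"False negative rate: group A {fnr_a:.0%}, group B {fnr_b:.0%}")
```

In this made-up example, group A is wrongly flagged as high risk twice as often as group B, while group B's actual reoffenders are more often waved through as low risk. An audit that only looked at overall accuracy could miss both gaps.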
“This is a very common issue with machine learning,” said computer scientist Moritz Hardt of the University of California, Berkeley on Science News. “You’re very likely to end up in a situation that will have fairness issues [even if the algorithm was not intentionally designed this way]. This is more the default than the exception.”
In another instance, “predictive policing” algorithms can be used to anticipate when and where future crimes might occur, using data on the times and locations of past crimes. These predictions are then used by law enforcement to determine where patrols should go and when. But as some experts point out, the problem is that crimes can happen anywhere in a city, and this algorithmic approach may lead police to rely too much on biased algorithms, prompting them to unfairly target certain neighborhoods or profile certain people, thus creating a self-perpetuating cycle.
So far, there’s been little evidence that risk-assessment reports and predictive policing models are effective at preventing crime. As William Isaac, an analyst with the Human Rights Data Analysis Group, points out in a report in Science: “They’re not predicting the future. What they’re actually predicting is where the next recorded police observations are going to occur.”
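The self-perpetuating cycle can be shown with a deliberately simplified simulation. This is a toy model under stated assumptions, not any vendor's actual algorithm: there are two hypothetical neighborhoods with identical true crime levels, a single patrol is sent each day to whichever area has the most recorded crime, and only crime in the patrolled area gets observed and recorded.

```python
def simulate_patrols(initial_records, true_daily_crimes, days):
    """Send one patrol per day to the area with the most recorded crime.

    Only crimes in the patrolled area are observed, so only that
    area's record grows. Returns the recorded counts after `days`.
    """
    records = dict(initial_records)  # copy so the caller's data is untouched
    for _ in range(days):
        target = max(records, key=records.get)      # "predicted" hot spot
        records[target] += true_daily_crimes[target]  # observed, then recorded
    return records

# Hypothetical starting point: nearly equal records, identical true crime.
initial = {"north": 11, "south": 10}
true_daily_crimes = {"north": 5, "south": 5}

final = simulate_patrols(initial, true_daily_crimes, days=30)
print(final)  # {'north': 161, 'south': 10}
```

Although both areas have the same amount of actual crime, a one-incident difference in the historical record sends every patrol to the same place, and the data then appears to confirm the prediction. Real systems are more nuanced, but the sketch shows how a model trained on recorded observations can end up predicting where police will look rather than where crime occurs.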
Bias Correction and “Mathwashing”
But computers are only as good as their programmers, and it is possible to mitigate these hidden prejudices by identifying any unwanted biases, double-checking results and recalibrating algorithms accordingly. The use of multiple algorithms might also help, as well as keeping a human in the loop to manually monitor the system. Implementing bias-correction algorithms that reorganize search results without altering their rankings is another option, as some researchers are proposing.
What these trends seem to indicate is that we trust algorithms far too much, even though we don’t fully grasp how they work. This blind faith that mathematical models are immune to bias, or “mathwashing” of the potential pitfalls of biases hidden in our algorithms, can have huge unintended consequences. In a world where socio-economic inequality is growing, these algorithmic biases may end up reinforcing these inequalities, unfairly stripping opportunities from those who objectively qualify for college admission, a loan or a job. That would ultimately deprive the world of their potential contributions and inadvertently separate those who can get an algorithmic pass from those who cannot.
Google is a sponsor of The New Stack.
Images: ikukevk.com, Princeton University, rawpixel.com, Mitchel Lensink