Guest post: RIRO
Originally a comment by latsot on To be seen.
It’s not difficult to train face recognition to work with black people. If you trained the machine learning systems on plenty of both black and white faces, they would be fine with both.
However, most facial recognition software has historically been trained mostly on white people, so it has trouble with darker skin. It presumably didn’t occur to the people who trained the systems that some faces are not white.
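To make that concrete, here is the sort of per-group check that exposes the problem. It’s only a sketch; the predictions, labels and group names below are invented for illustration.

```python
# Minimal sketch (hypothetical data): measuring per-group error rates is the
# quickest way to see the imbalance described above.
from collections import defaultdict

def error_rate_by_group(predictions, labels, groups):
    """Return the misclassification rate for each demographic group."""
    totals, errors = defaultdict(int), defaultdict(int)
    for pred, true, group in zip(predictions, labels, groups):
        totals[group] += 1
        if pred != true:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Example: a model trained on a mostly-white corpus will typically show a
# much higher error rate for the under-represented group.
preds  = ["match", "match", "no-match", "no-match", "match", "no-match"]
truth  = ["match", "match", "match",    "no-match", "match", "match"]
groups = ["white", "white", "black",    "white",    "white", "black"]
print(error_rate_by_group(preds, truth, groups))
# -> {'white': 0.0, 'black': 1.0} on this toy data
```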
There are lots of other examples of software that has taken on the racism of its trainers. For example, many police forces in the US (and some in the UK) use software to predict where crime is likely to happen, for the purposes of resource planning and management. It’s trained on historical data, and since the police’s arrest records contain a racial bias toward arresting people who aren’t white, the software predicts that future crimes will occur in areas where the population is mostly non-white. And since the police are institutionally racist, they see this as ‘working’.
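To see how that loop feeds itself, here is a deliberately crude toy simulation (all numbers invented): the underlying crime rate is identical in two neighbourhoods, but the ‘prediction’ is driven by past arrests, so the patrols, and therefore the arrests, stay concentrated where they always were.

```python
# Toy sketch of the feedback loop described above (all numbers invented).
import random

random.seed(0)
true_crime_rate = {"mostly_white": 0.10, "mostly_nonwhite": 0.10}  # identical
arrests = {"mostly_white": 20, "mostly_nonwhite": 80}              # biased history

for year in range(5):
    total = sum(arrests.values())
    # "Predictive" model: allocate 100 patrols in proportion to past arrests.
    patrols = {area: round(100 * count / total) for area, count in arrests.items()}
    # Arrests scale with patrol presence, not with the (equal) crime rate.
    for area, n_patrols in patrols.items():
        arrests[area] += sum(
            1 for _ in range(n_patrols) if random.random() < true_crime_rate[area]
        )
    print(year, patrols)
# The patrol share in the 'mostly_nonwhite' area never falls, despite equal crime.
```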
It could be classified as a GIGO problem, yes, but it’s better classified as Racism In Racism Out.
I also recall a US lawyer on Twitter discussing a piece of DOJ software that is used to suggest sentencing ranges. The ‘expert system’ gets loaded with pretty much everything known about the case and the accused and suggests a sentencing range. The problem, again, is that it’s been trained on past cases and so embodies all the previous ‘quirks’ of the system.
Even in my field of work, we’ve started exploring machine learning systems to evaluate our data. Because there is some subjective input in describing and parsing the data, we’re finding that the system picks up the peculiarities of the person doing the training. It’s the sort of bias that is going to take very careful and open specialist input to mitigate. I’m not sure you can ever be certain that bias will not exist in such systems.
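As a rough illustration of the kind of check that helps here (the annotators and labels are made up): compare how often each annotator assigns a given label to comparable records. If the rates diverge, the model will be learning the annotator rather than the data.

```python
# Minimal sketch (invented labels) of an annotator-bias check to run before
# training: per-annotator label rates that diverge flag subjective drift.
from collections import defaultdict

def positive_rate_per_annotator(records):
    """records: iterable of (annotator, label) pairs with labels 0/1."""
    counts = defaultdict(lambda: [0, 0])  # annotator -> [positives, total]
    for annotator, label in records:
        counts[annotator][0] += label
        counts[annotator][1] += 1
    return {a: pos / tot for a, (pos, tot) in counts.items()}

records = [("alice", 1), ("alice", 1), ("alice", 0),
           ("bob",   0), ("bob",   0), ("bob",   1)]
print(positive_rate_per_annotator(records))
# -> roughly {'alice': 0.67, 'bob': 0.33}: a gap like this needs reconciling
# before the labels are used for training.
```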
More likely it didn’t occur to them that it would make a difference in facial discernment. I’ve worked on enough research/data analysis teams to be accustomed to discovering mid-project that our corpus is lacking in some relevant fashion.
Rob:
Yes, that sentencing software exists but I’ve lost track of how widely it is used.
You’re right that one can never be certain that a training set doesn’t contain bias. It’s one of the reasons machine learning isn’t a good solution to every problem. Or at least isn’t necessarily a good solution. Investors, of course, see it differently. Their criteria for whether a solution is a good or bad fit for a particular problem differ greatly from those of the people whose lives it affects. See also: blockchain.
Null:
Yeah, I was being deliberately glib. In fact, I suspect there will have been multiple failures in constructing the training set, including your suggestion (“well, we’ve all got the same basic features, right?”), the conference-panel excuse (“we just couldn’t find any pictures of black people”), time and cost pressure, etc. But at some point the results will have been evaluated and found good by people who can’t have tested them very carefully, because they were themselves racist and/or because they had so much invested in the system appearing to work. Racism doesn’t have to be of the flaming-pitchforks variety; treating convenience and economic decisions as more important than the possibility of people’s lives being adversely affected counts too.
Anyway, horses for courses, and right now ML is being used in lots of places it ought not to be. AI isn’t going to destroy the world by becoming actually intelligent (not anytime soon), but it’s going to destroy a lot of lives by being thoughtlessly built and deployed. It already has.
Interesting problem. I’ve been working in AI for about 20 years now, and keeping bias at bay is a daily challenge. Twenty years ago I used to get blank stares when I made a statement like ‘The US medical system is actually a white male medical system’, since most of the knowledge at the time was derived from the signs and symptoms of white middle-aged guys.
Once we started using more granular gender and ethnic classifiers in our training programs, we saw a pretty dramatic separation of diagnostics and profound differences in the signs and symptoms that had been missed or ignored. In particular, it turned out that women in general were much more stoic than men with regard to symptoms. Now it’s more common knowledge that fatigue and insomnia are as likely to be signs of cardiac ischemia in women as chest pain is in men.
Of course, MLP also detected huge socioeconomic classifiers that were clearly associated with disease prevalence in particular ethnic subpopulations.
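A toy illustration of why that granularity matters (all numbers invented): pool the cases and a symptom like fatigue looks only moderately informative; split by sex and it turns out to be a strong signal in women and a weak one in men.

```python
# Toy illustration (numbers invented): pooled statistics can hide a
# symptom-to-diagnosis link that is strong in one subgroup and weak in another.
def symptom_given_disease(rows, symptom, group=None):
    """P(symptom | disease) over the given rows, optionally within one group."""
    cases = [r for r in rows if r["disease"] and (group is None or r["sex"] == group)]
    return sum(r[symptom] for r in cases) / len(cases)

# Hypothetical cardiac-ischemia records: 1 = symptom present / diagnosed.
rows = (
    [{"sex": "M", "disease": 1, "chest_pain": 1, "fatigue": 0}] * 40
    + [{"sex": "M", "disease": 1, "chest_pain": 0, "fatigue": 1}] * 10
    + [{"sex": "F", "disease": 1, "chest_pain": 1, "fatigue": 0}] * 15
    + [{"sex": "F", "disease": 1, "chest_pain": 0, "fatigue": 1}] * 35
)

print("pooled fatigue rate:", symptom_given_disease(rows, "fatigue"))      # 0.45
print("women:",              symptom_given_disease(rows, "fatigue", "F"))  # 0.70
print("men:",                symptom_given_disease(rows, "fatigue", "M"))  # 0.20
```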
And wouldn’t ya know it – about the time all this science was poised to overturn the apple cart, the American people decided to simply do away with truth as it is too inconvenient.
One additional caveat: in my experience, machine or deep learning is not a complete solution to any problem. It is a useful component in a more comprehensive solution set. The current fad in MLP is a great example, as is the Watson problem.
And sexism, and age bias. The early voice recognition efforts out of Microsoft were famously able to work well only with white men between the ages of 25 and 35. You are what your training corpus was!