“Algorithmic fairness” has become a hot topic, but it’s not really solving the right problem. I’ve only tangentially worked in this area, so apologies in advance for my ignorance.
Recent accusations that algorithms exhibit racial and other biases have led to widespread efforts to make machine learning (and other sociotechnical systems) more “Fair, Accountable, and Transparent”. But “fair” does not imply “just”. If we were to start history over from a blank slate, some of the fairness criteria that have been developed might help avoid some kinds of discrimination. But we must deal with the real history of past unfairness.
People and organizations in power are increasingly using algorithms powered by machine learning to exert their influence. The math inside the algorithms may be morally neutral, but whoever owns the data and whoever defines the objective that the algorithm attempts to maximize have power. Fundamentally, the objective of most algorithms is to score best at predicting present data given past data. But if past data bear the scars of past oppression (e.g., redlining), an algorithm can score highly by predicting that those scars will persist (e.g., that a person in a historically redlined area will not repay a home loan). Algorithms perpetuate the biased status quo.
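This feedback loop can be sketched in a few lines. Everything below is hypothetical and synthetic: two neighborhoods with identical true repayment ability, but historical labels that record the old biased denials. A model that simply maximizes accuracy on those labels reproduces the discrimination:

```python
import random

random.seed(0)

# Hypothetical synthetic history: applicants from neighborhoods A and
# (historically redlined) B have identical true repayment ability, but
# past loan officers denied B applicants regardless of merit, so the
# recorded labels carry that bias.
def past_label(neighborhood, repays):
    if neighborhood == "B":
        return 0  # historically denied regardless of merit
    return 1 if repays else 0

history = []
for _ in range(1000):
    n = random.choice("AB")
    repays = random.random() < 0.8  # same true ability in both groups
    history.append((n, repays, past_label(n, repays)))

# An "accuracy-maximizing" model over this history: for each neighborhood,
# predict whichever past label was most common there.
def fit_majority(history):
    model = {}
    for n in "AB":
        labels = [y for (grp, _, y) in history if grp == n]
        model[n] = 1 if sum(labels) > len(labels) / 2 else 0
    return model

model = fit_majority(history)
print(model)  # the model scores highly by approving A and denying B outright
```

The model is "correct" by its objective: denying everyone in B matches the historical labels almost perfectly, even though B applicants repay at the same rate as A applicants.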
Efforts towards fairness in machine learning have attempted to define alternative objectives and measures that, for example, strive towards “equal opportunity” in loan prediction. But the full scope of the problem is very broad:
- Different treatment at the hands of intelligent systems (loan approval, etc.)
- Different treatment by people using intelligent systems (e.g., predictive policing, risk assessment tools)
- Different usefulness of intelligent systems to minorities (e.g., speech recognition not working as well for minorities, or photo libraries erroneously grouping all my Native American friends’ faces together)
- Different influence on relationships (e.g., the “Algorithmic glass ceiling in social networks”)
- Different self-perceptions (e.g., when I search for my dream profession, the people I see don’t look like me)
… just to name a few. The problem directly affects those who are minorities (thus not well represented in some data), historically oppressed (thus vulnerable to predictive injustice), and under-enfranchised (thus unable to voice how decisions of powerful systems affect them). But it also affects majority groups: for example, biased resume classifiers may keep minorities out of some workplaces, and lack of diversity has documented negative effects on creativity and team productivity. Organizations will also be subject to accusations of injustice made through the legal system or social or political activism.
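To make the "equal opportunity" criterion mentioned above concrete: in Hardt et al.'s formulation, it asks that the true-positive rate (the chance that an applicant who would in fact repay gets approved) be equal across groups. A minimal sketch, with made-up numbers chosen only to show a violation:

```python
def true_positive_rate(y_true, y_pred):
    """Fraction of actual positives that the model predicts positive."""
    positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(positives) / len(positives)

# Hypothetical outcomes (1 = would repay) and model decisions (1 = approve)
# for two demographic groups; the numbers are illustrative only.
y_true_a = [1, 1, 1, 1, 0, 0]
y_pred_a = [1, 1, 1, 0, 0, 1]
y_true_b = [1, 1, 1, 1, 0, 0]
y_pred_b = [1, 0, 0, 0, 0, 0]

tpr_a = true_positive_rate(y_true_a, y_pred_a)  # 3/4 = 0.75
tpr_b = true_positive_rate(y_true_b, y_pred_b)  # 1/4 = 0.25
print(f"TPR gap: {abs(tpr_a - tpr_b):.2f}")  # prints "TPR gap: 0.50"
```

Here creditworthy applicants in group B are approved far less often than equally creditworthy applicants in group A, so equal opportunity is violated even if overall accuracy looks fine.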
Those who hold the data and design the algorithms are, in one sense, in the best position to address the injustice done or furthered by their algorithms. Most large tech companies now have teams working on “fairness”, and Google and Microsoft publish prolifically in this area. But can criteria and measures developed by the current tech elite really ensure equitable (not just equal) treatment for all? And can they even allow being “unfair” in ways that seek to heal past injustice? Solutions will also require government involvement, both to drive policy about what sorts of systems will be allowed to interact with citizens in what ways, and to drive procurement of the tools that governments themselves will use. Ensuring that all this actually does good in practice, though, requires researchers, journalists, and others to act as auditors and watchdogs.
Predictable consequences of work towards algorithmic fairness include:
- Increase in attempts (internal and external) to hold organizations accountable for their algorithmic decision-making. Along with this will come an increase in attempts to access sensitive data in order to be able to audit these decisions, which often occupies an ethical and legal grey zone — so law will develop to govern these practices.
- For those in sizeable minorities, this increase in accountability will lead to a reduction in obvious harm.
- For those who are in groups that are not clearly quantifiable in commonly collected demographic data, the effect will be more mixed. Since detecting and documenting harm typically requires identifying who is harmed, groups who are harder to identify will be harder to protect. Some approaches may result in reduction of harm for these groups also (e.g., Distributionally Robust Optimization), but many will not, especially if the algorithm is able to discern enough of a difference to predict differently for them.
- Algorithm design choices, “interpretable” visualizations of machine learning algorithms, and aggregate “fairness” measures will be offered as evidence in legal discrimination cases. This evidence will be misunderstood.
- Technology suppliers who are able to document the “fairness” of their technologies will be chosen as lower-risk options by government and private procurement processes. Since established suppliers with lots of data and resources will be better able to produce this documentation, and thus gain access to yet more data, the “rich will get richer”.
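The Distributionally Robust Optimization approach mentioned above can be illustrated with a toy example. This is only a sketch of the group-DRO idea, with all numbers invented: instead of minimizing the population-weighted average loss, optimize for the worst-off group, so that a small group's error cannot be traded away for majority accuracy:

```python
# Toy one-parameter model: pick a threshold t; each group's loss is the
# squared distance from that group's (hypothetical) ideal threshold.
def loss(t, ideal):
    return (t - ideal) ** 2

group_ideals = {"majority": 0.0, "minority": 1.0}
group_sizes = {"majority": 0.9, "minority": 0.1}  # population shares

candidates = [i / 100 for i in range(101)]

# Standard empirical risk minimization: weight each group by its share,
# so the small group barely moves the objective.
erm = min(candidates,
          key=lambda t: sum(group_sizes[g] * loss(t, group_ideals[g])
                            for g in group_ideals))

# Group DRO: minimize the WORST group's loss, regardless of group size.
dro = min(candidates,
          key=lambda t: max(loss(t, group_ideals[g]) for g in group_ideals))

print(erm, dro)  # ERM sits near the majority (0.1); DRO balances both (0.5)
```

The average-loss solution lands almost on top of the majority's preference, while the worst-group objective lands midway, which is why DRO-style objectives can reduce harm even for groups the system cannot explicitly identify, provided their losses show up in the worst case.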