The Proxy Problem in Algorithmic Discrimination

Algorithmic decision-making tools are increasingly used across all sectors of society. At the same time, there are growing concerns about AI’s potential to perpetuate and exacerbate existing societal biases and discrimination. One way AI algorithms produce discrimination is through reliance on proxies, leading to a subtle yet pervasive form of unfair and unjust outcomes for marginalized and underrepresented groups. This article explores the proxy problem, its implications, and some strategies to address it.

What is a Proxy?

A proxy is essentially a stand-in or substitute for something else. In the context of data systems, a proxy is a variable or factor used to represent another variable that is either difficult to measure directly or not readily available because of legal restrictions.

Human minds often rely on proxies to make judgments or decisions in everyday life. For example, we might use job title as a proxy for someone’s competence, income level, or educational background. Or we might infer someone’s personality or character from their physical appearance. Our minds use these shortcuts to help us navigate complex social environments and make quick judgments and decisions, especially when we don’t have complete information. These kinds of judgments have led to stereotypes, biases, and prejudices, which contribute to our unequal society.

Modeled after the human brain, algorithmic systems also fall into this trap: they too rely on proxies to make decisions and judgments when a relevant variable or characteristic is either difficult to measure directly or unavailable because of legal restrictions. For example, if an algorithm is prohibited from considering socially sensitive data in its decision-making process (such as race or gender), it will rely instead on proxies for race and gender, such as income levels, online behavior, online shopping history, hobbies or membership in certain organisations. Joining a women’s professional network might indicate someone’s gender, while participation in a cultural association could suggest their racial or ethnic background.

Algorithms rely on proxies because socially sensitive data (which is usually unavailable because of legal restrictions) has high predictive value for future outcomes. Algorithms are essentially trained to find connections between input and output data, even though they might not know the real reasons behind those connections. So, when the law restricts access to sensitive data (like data on race and gender), algorithms rely on proxy characteristics in order to remain statistically accurate.
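
To make this concrete, here is a minimal sketch in Python, using synthetic data and hypothetical feature names (a neighborhood-level proxy score and a job-relevant skill score). It is not drawn from any real system; it simply illustrates how a model that never sees the protected attribute can still reproduce group disparities through a correlated proxy feature.

```python
# Illustrative only: synthetic data, hypothetical feature names.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Protected attribute (never shown to the model): 0 = group A, 1 = group B.
group = rng.integers(0, 2, size=n)

# Proxy feature strongly correlated with group membership,
# e.g. a neighborhood-level income score tied to a postal code.
proxy = group * 1.5 + rng.normal(0, 1, size=n)

# A genuinely job-relevant feature, independent of group.
skill = rng.normal(0, 1, size=n)

# Historical outcomes were biased against group B, so the labels themselves
# encode past discrimination.
label = (skill + 1.0 - 1.2 * group + rng.normal(0, 0.5, size=n) > 0).astype(int)

# Train only on the "neutral" features; the protected attribute is excluded.
X = np.column_stack([proxy, skill])
model = LogisticRegression().fit(X, label)

# Approval rates still diverge by group, because the proxy feature lets the
# model reconstruct group membership indirectly.
pred = model.predict(X)
for g in (0, 1):
    print(f"group {'AB'[g]}: approval rate {pred[group == g].mean():.2f}")
```

In this toy setup the predicted approval rates differ sharply between the two groups even though the protected attribute was excluded from training; the correlated proxy feature does the work instead.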

The Problem?

The problem arises because relying on proxies can lead to indirect discrimination. For example, indirect discrimination could involve a hiring practice that prioritises applicants who have a certain level of proficiency in English or Afrikaans. While this requirement may appear neutral on the surface, it disproportionately disadvantages black South Africans, who may not have had the same access to Afrikaans education as other racial groups due to the country’s history. In this case, the language requirement could be seen as a proxy for race and result in indirect discrimination against black applicants.

On the face of it, the algorithm will appear to have complied with non-discrimination laws, because it did not make any decisions or recommendations based on gender or race. Yet the outcome is nonetheless discriminatory, albeit in an indirect way.

Detecting and addressing proxy discrimination is a formidable challenge: it is not always immediately apparent that an algorithm is generating biased outcomes, in part because of AI’s lack of transparency and explainability. This can make people lose trust in these systems and raises questions about how well AI is actually working. For example, a financial institution using AI to make credit decisions might unintentionally favor white male applicants and disfavor black female applicants, even though the AI system is not explicitly making credit decisions based on race or gender. This is because the AI system is using proxies for race and gender, such as area code and shopping history, which lead to the same discriminatory result.
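
One rough diagnostic, assuming the auditor has lawful access to the protected attribute for testing purposes, is to check how accurately the supposedly neutral features can predict that attribute. The sketch below uses synthetic data and hypothetical feature names; it illustrates the idea rather than a complete audit methodology.

```python
# Proxy-detection sketch: can the "neutral" features predict the protected
# attribute? Synthetic data and hypothetical feature names throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def proxy_strength(features: np.ndarray, protected: np.ndarray) -> float:
    """Mean cross-validated accuracy of predicting the protected attribute
    from the supposedly neutral features. Scores well above the base rate
    suggest the features jointly act as a proxy."""
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, features, protected, cv=5).mean()

# Synthetic illustration: hypothetical postal-code and shopping-history
# scores turn out to encode group membership.
rng = np.random.default_rng(1)
group = rng.integers(0, 2, size=5_000)
postal_score = group + rng.normal(0, 0.8, size=5_000)
shopping_score = group * 0.7 + rng.normal(0, 1.0, size=5_000)
X = np.column_stack([postal_score, shopping_score])

print(f"base rate:      {max(group.mean(), 1 - group.mean()):.2f}")
print(f"proxy strength: {proxy_strength(X, group):.2f}")
```

If the cross-validated accuracy sits well above the base rate, the features jointly encode the protected attribute and deserve closer scrutiny.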

This type of algorithmic discrimination can have far-reaching consequences beyond those directly affected by AI-based decisions. A real-world example is the use of area or postal codes as proxies for income or creditworthiness, which has been shown to disproportionately impact low-income neighborhoods, thereby reinforcing existing inequalities.

Strategies to Address the Proxy Problem

To address the proxy problem, it is crucial to incorporate human rights into the design and implementation of data-driven systems. This involves carefully designing, implementing, and monitoring AI systems to ensure they uphold fairness, equality, and transparency. In our experience, this approach builds trust in AI systems and their decision-making processes. Here are some strategies that can be used to minimize proxy discrimination in algorithms:

  • Utilizing human rights impact assessments and metrics to monitor AI algorithms can help identify disparities in outcomes across different demographic groups. By quantifying the performance of algorithms, we are able to identify risks and inform necessary adjustments to minimise discrimination and ensure compliance with ethical principles and standards (a minimal illustration follows this list).
  • Developing explainable and transparent AI systems that provide clear explanations for their decisions is crucial in promoting transparency and accountability – enabling stakeholders to better understand the decision-making process and ensure that AI systems are consistent with legal human rights principles.
  • Providing education and training for developers, data scientists, and other professionals working with AI systems on human rights, ethical AI practices, and discrimination and fairness considerations. This will build awareness, promote the development of more responsible algorithms, and help prevent discrimination against vulnerable groups.
  • Engaging with communities directly affected by AI systems is essential in gaining insight into their concerns, experiences, and perspectives. Seeking input from affected communities ensures that AI solutions are designed with their needs and interests in mind, ultimately minimizing the risk of discrimination and promoting trust in the system.
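
As a minimal illustration of the first point above, the sketch below compares selection rates across demographic groups using two common screening measures: the demographic-parity difference and the disparate-impact ratio (the so-called 80% rule). The function names, data, and thresholds are illustrative assumptions, not a standard or a complete fairness audit.

```python
# Monitoring sketch: compare positive-outcome rates across groups.
import numpy as np

def selection_rates(predictions: np.ndarray, groups: np.ndarray) -> dict:
    """Positive-outcome rate per demographic group."""
    return {g: float(predictions[groups == g].mean()) for g in np.unique(groups)}

def disparity_report(predictions: np.ndarray, groups: np.ndarray) -> dict:
    """Summary of outcome disparities across groups."""
    rates = selection_rates(predictions, groups)
    lowest, highest = min(rates.values()), max(rates.values())
    return {
        "selection_rates": rates,
        "parity_difference": highest - lowest,       # 0.0 means equal rates
        "disparate_impact_ratio": lowest / highest,  # < 0.8 is a common red flag
    }

# Usage with hypothetical audit data: model decisions plus group labels
# collected separately, solely for monitoring purposes.
preds = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
groups = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])
print(disparity_report(preds, groups))
```

In practice, figures like these would be tracked over time and broken down by intersecting attributes, alongside the human rights impact assessments mentioned above.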

Proxy discrimination is a critical issue in AI, with the potential to perpetuate discrimination at unprecedented speed and scale. By carefully considering AI system design and implementation through a rights-based approach, as well as adopting strategies to address the proxy problem, organisations can work towards creating more discrimination-aware and equitable AI solutions. Ultimately, this is an ongoing process requiring constant monitoring and evaluation, while placing human well-being at the center.

At TECHila Law, we are dedicated to helping organisations navigate the complexities of ethical AI development and deployment. With our deep understanding of human rights and AI, we’re well-placed to support clients in cultivating AI solutions that respect and uphold the rights of all individuals.

Author: Keketso Kgomosotho

Keketso Kgomosotho is an Attorney & Co-Founder of TECHila Law – a consulting firm specialising in law, human rights and emerging technology. He is also a PhD candidate at the University of Vienna.
