No, you can't 'eval' your way to fairness

Laura Summers

Ethics & Privacy
Python Skill: None
Domain Expertise: None

Cold open

Fairness is fundamentally not amenable to classic optimisation techniques.

The exposition

Fairness is not a state of the world, it's an experience of it. No technology is fair in a vacuum. Fairness can only be understood when a technical system collides with humans in the world. It is felt as much as it is calculated. We can look at statistical results in aggregate to understand patterns, but these do not tell the story of the individual.

Further, attempting to optimise numerical fairness metrics is fundamentally coercive and technocratic: putting our thumb on the scale globally, injecting "positive bias" into single dimensions, framing fairness as a data problem rather than a problem of human dignity. It's a "one metric to rule them all" approach that fails to acknowledge differences in preference, culture, experience. To build systems that support human agency we must first abandon our idea of a single moral machine which consistently outputs correct answers from inputs and algorithms. Any system treating people as fungible or undifferentiated is structurally unfair.

What might consent-based fairness look like instead? It could start with asking "Do you want extra help?", so that individual preferences and self-reported disadvantage add a layer of human respect to the equation. But we rarely see even this. Instead we see universalist design that decides what's good for people without consulting them - the same pattern that Design Justice critiques as erasing those who experience intersectional disadvantage.

What does this have to do with evals? We're seeing a wave of off-the-shelf libraries measuring bad behaviours in LLM outputs, often simplifications of older fairness metrics. And yes, they can catch obvious failure modes like slurs in outputs. But this is one failure mode among many. Installing a library and calling the job done is fairness washing. The harder, more fruitful approach is to explore the space of failure modes, consider what an ideal world would look like, and design measures, mitigations, and feedback loops accordingly. It also means grappling with the fact that we cannot avoid doing harm. What we can do is practise harm reduction and humility, and strive toward something better while acknowledging the impossibility of the task.
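To make the "one failure mode among many" point concrete, here is a minimal sketch of a blocklist-style check. It is a toy stand-in written for this abstract, not the behaviour of any real library: the function name, the placeholder blocklist terms, and both sample outputs are invented for illustration.

```python
# Toy stand-in for an off-the-shelf "bad output" check.
# BLOCKLIST terms are placeholders and passes_blocklist is a hypothetical
# helper, not a real library API.
BLOCKLIST = {"slur_a", "slur_b"}


def passes_blocklist(output: str) -> bool:
    """Return True when no blocklisted term appears in the output."""
    words = {w.strip(".,!?\"'").lower() for w in output.split()}
    return words.isdisjoint(BLOCKLIST)


# Two made-up model outputs: one with an obvious problem, one with a subtle one.
outputs = {
    "obvious": "You are a slur_a.",
    "subtle": "People like you usually struggle with this, so I'll keep it simple.",
}

results = {name: passes_blocklist(text) for name, text in outputs.items()}

for name, passed in results.items():
    print(f"{name}: {'pass' if passed else 'fail'}")
```

The check flags the first output and passes the second, even though the second is condescending and makes an assumption about the person reading it. A green result here says only that one narrow failure mode was absent, and nothing about how the output is experienced.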

Third act

This talk won't offer easy answers. Attend if you want to grapple with the gnarly problems of building systems for humans. We'll borrow ideas from Design Justice and the disability rights movement: "nothing about us without us". Let's ask and answer better questions. You'll leave with sharper mental models and tools for the next tricky conversation at work.

Outline (30 minutes):

The problem (10 min):

  • Fairness as experience, not state.
  • Why optimisation fails.
  • The individual vs the aggregate.
  • Why treating people as fungible is structurally unfair.

The critique (10 min):

  • Off-the-shelf fairness evals as fairness washing.
  • The temptation to install a library and call it done.
  • What these tools can and cannot catch without further analysis.

The alternative (10 min):

  • Borrowing from Design Justice and disability rights.
  • Exploring failure modes rather than optimising metrics.
  • Harm reduction over false perfection.
  • Transparency, explanation, empowerment.

What you'll take home

You'll leave with sharper mental models for thinking about fairness in technical systems, frameworks borrowed from Design Justice and the disability rights movement, and tools for the next tricky conversation at work about what fairness actually means. There are no easy answers here, but there are better questions.

Laura Summers

Laura is a very technical designer™️, working at Pydantic as Lead Design Engineer. Her side projects include Sweet Summer Child Score (summerchild.dev) and Ethics Litmus Tests (ethical-litmus.site). Laura is passionate about feminism, digital rights and designing for privacy. She speaks, writes and runs workshops at the intersection of design and technology.