AI recruiting systems are rapidly reshaping talent acquisition by automating candidate filtering, ranking, and selection. However, their growing influence raises critical concerns around fairness, robustness, and decision transparency. This talk introduces a practical testing methodology for evaluating AI recruiting pipelines beyond traditional accuracy metrics.
We will examine how synthetic data and augmentation techniques can expose hidden weaknesses, improve coverage, and stress-test edge cases. The talk will address proxy variables: what they are, why they matter, and how analyzing them can uncover unintended model behavior. We will also explore fairness measurement strategies, including individual and group fairness metrics, and discuss how these approaches reveal structural bias in ranking and scoring outcomes.
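As a flavor of the kind of group fairness check discussed in the talk, the sketch below computes top-k selection rates per group and a demographic parity gap over a small synthetic candidate set. The column names, groups, and scores are invented for illustration, not taken from any real system.

```python
# Minimal sketch: group fairness on a ranking outcome.
# Measures how often each group lands in the top-k and the resulting parity gap.
import pandas as pd

# Synthetic candidates: a protected attribute and a model score (illustrative values)
candidates = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "A"],
    "score": [0.91, 0.72, 0.55, 0.88, 0.43, 0.67, 0.39, 0.80],
})

def selection_rates(df: pd.DataFrame, k: int) -> pd.Series:
    """Share of each group that lands in the top-k of the ranking."""
    top_k = df.nlargest(k, "score")
    return top_k["group"].value_counts().reindex(
        df["group"].unique(), fill_value=0
    ) / df["group"].value_counts()

rates = selection_rates(candidates, k=4)
parity_gap = rates.max() - rates.min()  # demographic parity difference
print(rates.to_dict())
print(f"Demographic parity gap: {parity_gap:.2f}")
```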
Since much of the evaluation process can be automated, the session will demonstrate how Python-based agents and LLM “referees” can assist in generating and augmenting CVs and certificates, validating predictions, and assessing explanation quality. This automation accelerates workflows, increases reproducibility, and reduces human error.
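To make the “referee” idea concrete, here is a minimal sketch of how an LLM referee might score explanation quality against a rubric. The `ask_llm` hook, the rubric wording, and the JSON fields are hypothetical placeholders for whatever client and criteria an actual pipeline would use.

```python
# Minimal sketch of an LLM "referee" that checks whether a model's explanation
# is consistent with its decision. All names and fields here are illustrative.
import json

def ask_llm(prompt: str) -> str:
    """Placeholder: swap in a real LLM client call (API or local model) here."""
    raise NotImplementedError

REFEREE_RUBRIC = (
    "You are a strict referee. Given a candidate CV summary, the model's "
    "decision, and its explanation, respond only with JSON of the form "
    '{"consistent": bool, "cites_proxy_variables": bool, "notes": str}.'
)

def referee_explanation(cv_summary: str, decision: str, explanation: str) -> dict:
    """Build the referee prompt and parse the structured verdict."""
    prompt = (
        f"{REFEREE_RUBRIC}\n\nCV: {cv_summary}\n"
        f"Decision: {decision}\nExplanation: {explanation}"
    )
    return json.loads(ask_llm(prompt))
```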
Participants will walk through a complete testing pipeline, supported by insights from real-world projects that illustrate how different tools and strategies expose systemic risks and guide mitigation. Attendees will leave with practical techniques to make recruiting systems more reliable, transparent, and trustworthy in real deployment contexts.