With Large Language Models (LLMs), generating high-quality text and images is easy, and so is misusing them. As AI-generated content becomes harder to distinguish from human-generated content, developers are increasingly asking: how can we verify whether a piece of text comes from an LLM? We’ll draw on Python’s simplicity and rich ecosystem of libraries to tackle this problem. This talk introduces the foundations of LLM watermarking and shows how developers can implement these techniques entirely in Python. We’ll discuss two core approaches, the EXP sampling method and the KGW method, walk through an implementation of the KGW method using simple, transparent code, and compare it with the EXP approach. There's no need for a large model or a GPU cluster to understand how these systems work: the core ideas can be implemented in pure Python using simple code. The code repository, which includes both methods, will be provided so that attendees can follow along. Along the way, we’ll discuss the trade-offs and the limitations of current research. And for those wondering, “Do I have to implement all this myself?”, the talk concludes with a demo of MarkLLM, an existing open-source toolkit that provides a unified Python interface for experimenting with watermarking algorithms. Attendees will leave with a clear understanding of how watermarking works, when it’s useful, and how to integrate these techniques into real-world Python projects.
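As a taste of how little machinery this takes, here is a toy, pure-Python version of the KGW green-list idea (vocabulary size, hashing scheme, and green-list fraction are illustrative choices, not the talk's exact code):

```python
import hashlib
import random

VOCAB = list(range(1000))   # toy vocabulary of token ids
GREEN_FRACTION = 0.5

def green_list(prev_token: int) -> set[int]:
    # Seed a PRNG with a hash of the previous token and mark half the
    # vocabulary as "green" (watermark-favored) for this position.
    seed = int.from_bytes(hashlib.sha256(str(prev_token).encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))

def detect(tokens: list[int]) -> float:
    # z-score of the green-token count against the null hypothesis that
    # each token lands in the green list with probability GREEN_FRACTION.
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:]) if tok in green_list(prev))
    n = len(tokens) - 1
    expected = GREEN_FRACTION * n
    var = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (hits - expected) / var ** 0.5

def generate_watermarked(length: int, start: int = 0) -> list[int]:
    # A stand-in for a watermarked LLM: always sample from the green list.
    # (A real generator boosts green logits instead of hard-restricting.)
    rng = random.Random(42)
    tokens = [start]
    for _ in range(length):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens
```

Watermarked sequences score a large positive z (every bigram is green), while unwatermarked token streams hover near zero, which is the whole detection trick.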
LoRaWAN gateways typically depend on cloud-based network servers, creating a vulnerability during internet outages. This talk presents a hybrid solution: a Raspberry Pi-based mobile gateway that operates on The Things Stack Sandbox while simultaneously decoding all device messages locally. The system leverages existing network infrastructure for broad coverage during normal operation, while maintaining full local data access when connectivity fails. This is particularly valuable for emergency response scenarios and remote monitoring where sensor data must remain available regardless of network conditions. The implementation uses Python for gateway orchestration and API integration, while incorporating existing JavaScript libraries (`lora-packet` and device decoders) for LoRaWAN decryption and payload decoding. Data is stored locally in SQLite for reliability and easy access.
pytest lets you write simple tests fast - but also scales to very complex scenarios: Beyond the basics of no-boilerplate test functions, this training will show various intermediate/advanced features, as well as gems and tricks. To attend this training, you should already be familiar with the pytest basics (e.g. writing test functions, parametrize, or what a fixture is) and want to learn how to take the next step to improve your test suites. If you're already familiar with things like fixture caching scopes, autouse, or using the built-in `tmp_path`/`monkeypatch`/... fixtures: There will probably be some slides about concepts you already know, but there are also various little hidden tricks and gems I'll be showing.
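For orientation, the assumed "basics" look roughly like this self-contained example combining fixture scopes, `autouse`, `tmp_path`, and `monkeypatch` (names and values are made up):

```python
import time

import pytest

@pytest.fixture(scope="session")
def config():
    # Expensive setup runs once per session thanks to fixture caching scopes.
    return {"api_url": "https://example.test"}

@pytest.fixture(autouse=True)
def fast_sleep(monkeypatch):
    # autouse: applied to every test in this module without being requested.
    # Stub out time.sleep so retry loops don't slow the suite down.
    monkeypatch.setattr(time, "sleep", lambda seconds: None)

def test_writes_report(tmp_path, config):
    # tmp_path is a fresh pathlib.Path directory, cleaned up by pytest.
    report = tmp_path / "report.txt"
    report.write_text(config["api_url"])
    assert report.read_text() == "https://example.test"
```

If everything here already looks familiar, you meet the prerequisites; the training goes on from this baseline.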
This talk has one simple message: *please document your code*. If you attend my talk, you'll hear me explain why I praise documentation, and why you should too. While writing documentation is generally acknowledged to be a "good thing", most engineers do not document their work. I'll offer my opinionated lament on the life and death of literate programming. A lament is a poetic discourse, expressing sadness, or feeling sorry about something. I'll give some examples of the *bad things* that can happen when people don't write documentation. Then, after making you feel bad, I'll give examples of how you can *feel good*. I'll explain why writing documentation is a "good" edifying activity, which helps you to be a better person and to make a better world. I'll review types of open source documentation (Python and Unix), documentation frameworks (Diátaxis), and Python tools (Sphinx, Jupyter, Quarto) you can try out as soon as my talk is finished. Then, I'll get "cool n' futuristic" by talking about AI. I'll emphasise the importance of text to AI-assisted coding and agentic workflows for "spec-driven development" (e.g. Agent-OS with Claude Code), before tempering your excitement by giving you some old-fashioned advice on "good" writing style by George Orwell. In summary, if you come to my talk, you might experience an unusual mixture of sadness combined with hope. To conclude, I'll tell you to "please document your code". You'll laugh, go to the next talk, and forget my advice.
In every marketing project, teams strive for more data, a longer timeframe, and more detailed splits, just to fix noisy channel attribution. But what if structure played a bigger role than size and volume? In this talk, we try to prove exactly that. Using a simple toolkit of ArviZ and PyMC, we show you a simple hierarchical marketing mix model and how, by applying partial pooling, we can stabilize important KPIs like ROAS estimates across sparse channels, without the need for more data. We will go through the code, the transformations, and the real-life practices that allow us to get as close to the truth as possible, so that we can have a meaningful impact in the marketing world. The approach is centered around marketing mix models, different transformations, and how useful they are for the business.
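To make the partial-pooling intuition concrete, here is a pure-Python sketch of the shrinkage formula at its heart (the talk builds full Bayesian models in PyMC; the channel names, means, counts, and variances below are invented):

```python
def partial_pool(raw: dict[str, tuple[float, int]], tau2: float, sigma2: float) -> dict[str, float]:
    """Shrink each channel's raw ROAS toward the global mean.

    raw maps channel -> (raw ROAS estimate, number of observations).
    tau2 is the between-channel variance, sigma2 the per-observation noise.
    Sparse channels (small n) get pulled harder toward the global mean,
    which is exactly how partial pooling stabilizes their estimates.
    """
    grand_mean = sum(mean for mean, _ in raw.values()) / len(raw)
    pooled = {}
    for channel, (mean, n) in raw.items():
        weight = tau2 / (tau2 + sigma2 / n)   # trust in the channel's own data
        pooled[channel] = weight * mean + (1 - weight) * grand_mean
    return pooled

channels = {"search": (3.1, 400), "social": (2.7, 250), "podcast": (9.0, 4)}
print(partial_pool(channels, tau2=1.0, sigma2=4.0))
```

Note how the implausible ROAS of 9.0 for the sparse "podcast" channel is pulled strongly toward the pack, while the well-observed channels barely move.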
Building an agentic system that collects and evaluates company information in real time—without curated datasets—requires solving difficult challenges in data acquisition, quality control, and agent orchestration. This talk outlines the solution design for such a system, implemented with Python-based tooling including LangGraph, and emerging protocols such as A2A and MCP, within a multi-agent workflow. Because MCP and A2A are still new and lightly documented, we will share implementation lessons and a practical example of a hub-and-spoke architecture based on a recent real-world system. Attendees will learn architectural patterns for multi-agent systems, common pitfalls of using MCP/A2A in real-world scenarios, and strategies for maintaining data quality in agent-based workflows.
When you need to simulate interconnected dynamical systems in Python, scipy.integrate gives you ODE solvers but no structure for managing complex block diagrams, signal routing, or discrete events. MATLAB/Simulink offers this structure but locks you into proprietary tools. PathSim is an open-source framework that bridges this gap. Born from real engineering challenges in control systems and physics simulation, it brings block diagram modeling to pure Python while supporting modern workflows: stiff system solvers, event handling for hybrid dynamics, co-simulation through FMI, and integration with the scientific Python stack. In this talk, I'll share the development journey, demonstrate what makes PathSim different from basic ODE solving through live examples (from oscillators to event-driven systems), and show how its architecture enables everything from rapid prototyping to hardware-in-the-loop testing. You'll learn when block diagram simulation is the right tool and how to get started.
Managing asynchronous task queues in Django with tools like Celery can be overkill for many projects. Django-Q is a lightweight alternative that integrates natively with the Django admin. In this talk, you will learn how to streamline your background tasks and cron jobs, featuring a practical demo to get you started immediately.
Open-source tools have become essential for making programming education possible in schools that lack modern equipment, stable internet, or licensed software. This talk shares practical insights from teaching Python in under-resourced environments in Namibia, highlighting how tools like Linux, Python, and cloud-based notebooks enable meaningful learning even when only a few devices are available. It also draws on findings from research with teachers across different schools to show the real barriers faced when adopting open-source solutions, and the creative strategies used to overcome them. Attendees will gain a realistic understanding of what digital education looks like in low-resource settings and learn practical approaches for making programming more accessible and inclusive using open-source tools.
Increasing unit labour costs and the imperative to reduce energy consumption raise the need to enhance productivity in industrial production. Python is an excellent tool for GKN Aerospace, the world’s leading tier-one aerospace supplier, to address the need for higher utilization and unmanned operation on the shopfloor at its site in Kongsberg, Norway. As an example, the presentation shares insight into the in-house developed “Production Execution System”, consisting of a Python backend and a React frontend. The application orchestrates all necessary data at cell level, such as NC programs and additional digital services of the company’s IT environment, during unmanned production. Furthermore, it supports the operator with the information necessary to ensure the highest quality of engine parts in a work environment of increasing digitalization and workload.
Causal inference asks the hardest question in data science: "What would have happened if things were different?" While traditional methods often rely on rigid rules, statistical tests or "black box" adjustments, Probabilistic Programming Languages (PPLs) like PyMC and NumPyro offer a transparent, flexible, and powerful lens to view these problems. In this talk, we move beyond the standard "correlation is not causation" disclaimer. We will build a unified workflow that starts with robust A/B testing, moves to bias adjustment in observational data using multilevel models, and culminates with advanced Deep Causal Latent Variable Models (CEVAE).
Fine-tuning an LLM on five years of Telegram chat shows how real-world conversations challenge models: messy context, sarcasm, unbalanced data, and tokenization pitfalls. This talk covers dataset design, LoRA training, surprising behaviors, and what it means to build a “digital self” responsibly.
How mature are your data pipeline operations? A Roadmap to Operational Excellence. Data teams often struggle to scale their pipeline operations, trapped in a cycle of manual fixes and reactive fire-fighting. But what does "good" actually look like? In this talk, we introduce a standardized 5-level maturity model for Data Operations, focusing on three critical pillars: Orchestration, Data Quality, and Data SLOs. We will deconstruct the journey from "Struggling" (manual scripts, no guarantees) to "Mastery" (automated, resilient, and measured). Attendees will leave with a concrete framework to assess their team’s current standing and a clear, step-by-step roadmap to raise the bar toward operational excellence.
As data engineers, we are used to spinning up a Spark Cluster every time we want to do data processing and handle the overhead that comes with using such a mighty framework. But is this really necessary? In this talk I will argue that single-node processing with Polars is in many cases easier and cheaper. I will compare a typical ETL & Feature Engineering task in Spark and in Polars and offer a pragmatic opinion on when to use one or the other.
Testing traditional software is "simple"... same input, same output. LLMs? Not so much. Same prompt, different result every time. So how do you actually know if your AI product is good? Most teams struggle with this. Generic metrics like "Helpfulness: 4.2" sound scientific but don't drive real decisions. And when a new model releases, it's weeks of debates instead of data. This talk introduces Error Analysis: a methodology to discover the concrete failure modes of your AI product and turn them into measurable evals. You'll learn how to build a failure taxonomy that enables real prioritization. Which issues are critical? Which are frequent? What should developers fix next, and how do you measure success? The payoff: A real quality number for stakeholders. Concrete improvement tasks for developers. And when a new model drops, a ship-or-skip decision within 24 hours based on actual data. Expect a meme-powered walkthrough, real-world examples from production, and a clear path to implement this yourself starting with just 20 traces.
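The bookkeeping behind a failure taxonomy can start out as simple as counting labeled traces; a minimal sketch, with hypothetical failure-mode names:

```python
from collections import Counter

# Hypothetical traces annotated during error analysis: each trace gets
# zero or more failure-mode labels from a taxonomy built bottom-up.
traces = [
    {"id": 1, "failures": ["hallucinated_citation"]},
    {"id": 2, "failures": []},
    {"id": 3, "failures": ["ignored_context", "hallucinated_citation"]},
    {"id": 4, "failures": ["wrong_tone"]},
    {"id": 5, "failures": ["hallucinated_citation"]},
]

counts = Counter(f for t in traces for f in t["failures"])
failure_rate = sum(1 for t in traces if t["failures"]) / len(traces)

# Frequency tells you what to fix first; per-mode counts become the
# denominator for measuring whether a fix (or a new model) actually helps.
for mode, n in counts.most_common():
    print(f"{mode}: {n}/{len(traces)} traces")
print(f"overall failure rate: {failure_rate:.0%}")
```

Re-running the same counts on traces from a candidate model is what turns "should we switch?" into a ship-or-skip number.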
In the twisting vaults of a subway, metro, or U-Bahn, there’s often no reliable cell service, wifi, or GPS, which means riders have had no good way of keeping track of their stops or ETA when underground. After collecting extensive ground-truth data, we trained a motion classifier using the phone's accelerometer to identify a moving train. This prediction is fed into a location model that combines it with the train schedule to estimate a location, even when GPS fails. We cover our unique data pipeline, feature engineering, and the optimizations for high-scale, offline edge deployment to millions of users.
Free-threaded Python aims to significantly improve performance, allowing multiple native threads to execute Python bytecode concurrently. In this talk, we will explore the current state of Python's free-threading initiative and assess its practical readiness for widespread adoption.
Large Language Models (LLMs) are becoming central to modern applications, yet effectively evaluating their performance remains a significant challenge. How do you objectively compare different models, benchmark the impact of fine-tuning, or ensure your LLM responses adhere to safety guidelines (guard-railing)? This hands-on workshop addresses these critical questions.
We've been told for years that stateless services are the holy grail of scalable web architectures. But what if this foundational principle is actually hurting both development velocity and runtime performance? This talk challenges the dominant paradigm by demonstrating how stateful, object-oriented programming can automatically scale to millions of users without the typical infrastructure complexity. I'll show how keeping objects with their state in distributed memory eliminates the need for explicit caching strategies, reduces database bottlenecks, and dramatically simplifies your codebase. You'll see how a simple Python class can transparently scale across multiple servers, handling millions of concurrent users without implementing REST endpoints, message queues, or cache invalidation logic. We'll examine why the historical evolution of web services created the myth that "stateless is good" and demonstrate an alternative where Python objects live persistently in a cluster, maintaining their state while the framework handles distribution, persistence, and failover automatically. Learn how to write Python code that scales from prototype to production using the same simple object-oriented patterns throughout.
[Cold start](https://en.wikipedia.org/wiki/Cold_start_(recommender_systems)) is a critical bottleneck for marketplaces: new items lack behavioral signals and reviews, so ranking models under-expose them, delaying the very signals needed to rank them well. This talk shares practical solutions developed at scale for a travel marketplace, including guaranteed exposure at key positions, efficient real-time re-ranking under latency constraints, and targeted boosting for unactivated items. Attendees will learn how experiment-driven iteration shaped a robust system that accelerates early traction for new items without sacrificing overall marketplace health.
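As a rough illustration of the guaranteed-exposure idea, here is a pure-Python sketch (the slot positions, item fields, and newness test are invented for illustration, not the production logic):

```python
def rerank(items, reserved_slots=(2, 7), is_new=lambda i: i["impressions"] == 0):
    """Score-order the items, but reserve fixed slots for cold-start items.

    Reserved positions guarantee new items some exposure so they can start
    collecting the behavioral signals the ranking model needs.
    """
    ranked = sorted(items, key=lambda i: i["score"], reverse=True)
    new = [i for i in ranked if is_new(i)]
    old = [i for i in ranked if not is_new(i)]
    result = []
    for pos in range(len(items)):
        if pos in reserved_slots and new:
            result.append(new.pop(0))   # guaranteed exposure at a key position
        else:
            result.append(old.pop(0) if old else new.pop(0))
    return result
```

In production the interesting parts are everything this sketch omits: doing it in real time under latency constraints, and choosing slots that don't hurt overall marketplace health.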
When conducting studies of the urban form, an important resource many researchers turn to is the massive OpenStreetMap dataset. But, as extensive as this dataset is, it lacks one very important aspect of the cities it covers: the people who live there. In this talk, I show you how to add this missing element to your research by bringing in German Census data to create rich analyses capable of addressing some of the most pressing issues facing our cities today. I exemplify this by walking you through my own research in urban geography and sustainability with a study of how equitably distributed common amenities are in cities across Germany. Throughout, we look at how Python and PostgreSQL can be used as effective tools to enable this research and keep it organized.
Tired of data quality issues crashing your PySpark and Pandas pipelines? This talk introduces [dataframe-expectations](https://github.com/getyourguide/dataframe-expectations), a lightweight, open-source library for declarative data validation. We will dive into the library's design and demonstrate how to easily define and apply data quality expectations to catch errors early, reduce debugging time, and ship more reliable data products, faster. Learn to build more robust data pipelines and move from reactive problem-solving to proactive data validation.
Quantum networks connect quantum devices including quantum computers, enabling applications not realizable in classical networks, such as secure quantum computing in the cloud and quantum key distribution. These networks are now moving from theory to reality, and as part of the Quantum Internet Alliance, we are actively building a prototype quantum network in Europe, driven by applications developed in Python. In this talk, we will introduce quantum networking and demonstrate how to program quantum network applications in Python by walking through the quantum teleportation protocol. We'll conclude by sharing resources so that you can begin experimenting with quantum network programming yourself. No prior quantum experience required.
NiceGUI has grown from a small experiment into a widely used framework for building modern web-based user interfaces entirely in Python. After five years of development, thousands of users, and countless design iterations, we have gathered a rich set of insights into what makes a UI framework feel truly “Pythonic” while still leveraging the power of the web platform. This talk presents the key lessons learned while evolving NiceGUI, with a focus on how Python’s own language features can meaningfully improve the developer experience. We explore how context managers, method chaining, decorators, async/await, type hints, dataclasses, and even well-chosen default arguments contribute to a clean, expressive, and maintainable UI API. Attendees will walk away with a deeper understanding of how to design Python-first interfaces—whether for web apps, dashboards, or internal tools—without needing to write JavaScript, CSS, or frontend boilerplate.
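To illustrate one of those lessons, here is a toy, emphatically not NiceGUI's actual implementation, showing how a context-manager stack lets `with` blocks express nested layouts in plain Python:

```python
class Element:
    """Toy UI element: `with` blocks set the current parent via a stack."""

    _stack = []  # module-wide stack of open containers

    def __init__(self, tag):
        self.tag, self.children = tag, []
        if Element._stack:                       # auto-attach to the innermost
            Element._stack[-1].children.append(self)  # open `with` container

    def __enter__(self):
        Element._stack.append(self)
        return self

    def __exit__(self, *exc):
        Element._stack.pop()

    def render(self, depth=0):
        lines = ["  " * depth + self.tag]
        for child in self.children:
            lines += child.render(depth + 1)
        return lines

# Nesting in code mirrors nesting on screen, with zero boilerplate:
with Element("page") as page:
    with Element("row"):
        Element("button")
        Element("label")
    Element("footer")

print("\n".join(page.render()))
```

This is the kind of Python-first API design the talk digs into: the language feature (a context manager) does the structural work that frontend frameworks usually push onto markup.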
[Ansible](https://docs.ansible.com/) is a popular [infrastructure as code](https://en.wikipedia.org/wiki/Infrastructure_as_code) tool for server configuration and software deployment. This tutorial will cover the things I wish I had known on my first day of using Ansible to manage the projects at my work.
Python’s ecosystem evolves through thousands of developers importing, combining, and reinventing third-party packages. But how predictable is this process? And why do programming ecosystems behave in ways that are hauntingly similar to seemingly unrelated, distant realms of innovation, such as patents? In this talk, we analyse 15 years of Python code snippets from Stack Overflow and uncover a striking pattern: the way Python developers create and recombine libraries follows the same mathematical laws as those governing US patents and scientific publications. New packages enter the ecosystem at a slowing rate, while new combinations of packages appear at a remarkably stable, almost linear pace. We show how these regularities emerge naturally from a simple adjacent-possible urn model—a surprisingly powerful tool that captures how innovation unfolds as old ideas are reinforced, and new ones occasionally appear. The obtained quantitative view can explain how Python changes over time, how novelty spreads, and what governs the evolution of the software we use every day.
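For readers who want to play with the idea, here is a minimal adjacent-possible urn simulation in pure Python (the parameter values are illustrative; the talk's fitted model differs):

```python
import random

def urn_model(steps, rho=4, nu=3, seed=0):
    """Minimal adjacent-possible urn (in the spirit of Tria et al.).

    Drawing a ball reinforces it with rho extra copies (old ideas get
    reused); drawing a never-seen ball expands the adjacent possible by
    adding nu + 1 brand-new colors to the urn.
    """
    rng = random.Random(seed)
    urn = [0]            # start with a single "package"
    next_color = 1
    seen = set()
    history = []
    for _ in range(steps):
        ball = rng.choice(urn)
        history.append(ball)
        urn.extend([ball] * rho)               # reinforcement
        if ball not in seen:                   # novelty triggers expansion
            seen.add(ball)
            urn.extend(range(next_color, next_color + nu + 1))
            next_color += nu + 1
    return history

history = urn_model(2000)
print(f"{len(set(history))} distinct 'packages' after {len(history)} draws")
```

Even this tiny model reproduces the qualitative pattern from the data: the number of distinct "packages" grows sublinearly while reuse of established ones dominates.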
The traditional process for auto damage evaluation is relatively slow, subjective, and prone to fraud. This presentation shows a Multi-Agent System designed to automate and standardize car damage evaluation in real time, disrupting the initial claims workflow. The system is built around an Orchestrator Agent that coordinates specialized AI agents: a Vision Agent (powered by OpenAI GPT-5.2) for damage analysis and severity classification, two Cost Estimation Agents (powered by Perplexity's sonar-pro) that provide comparative quotes (OEM vs. Aftermarket), and a Shop Finder Agent for local repair options. The system produces a report that includes a description of the damage, its severity, comparative repair costs in local currency, and recommended repair shops, all embedded in a Gradio interface. The aim of this approach is to reduce processing time, improve transparency for customers, and provide insurers with objective data to enable faster claims resolution.
In an era where new AI models, benchmarks, and frameworks emerge daily, many of us feel caught in a relentless cycle of catching up, what is called "AI fatigue". This talk dives into the causes and consequences of that fatigue, from information overload and social media hype to the constant pressure to stay relevant. Drawing on personal experience and community insights, we explore why chasing every new paper or trend often leads to burnout rather than mastery. More importantly, we share practical, evidence-backed strategies to stay informed without losing balance: curating a focused “information diet,” setting clear boundaries, using summarization tools intelligently, maintaining a personal knowledge base, and embracing “JOMO”—the joy of missing out. We also discuss how organizations can combat fatigue structurally by promoting focus, curiosity, and psychological safety. This session is for anyone, from beginners to seasoned professionals, seeking to rediscover genuine curiosity in AI while preserving mental well-being. Attendees will leave with concrete tools, actionable habits, and a renewed sense that it is not only acceptable but healthy to not know everything.
What distinguishes a lousy plot from a beautiful chart that communicates insights effectively? This talk will show you the underlying principles of good data visualization, offer lots of practical tips and tricks and give an overview of the data visualization landscape in Python. After the talk, you will be able to create better charts, whether for exploring your own data or for communicating results to others.
Open table formats have *almost* freed us from vendor lock-in. They form a critical building block of the modern, composable data stack. The most prominent open table format is Apache Iceberg, not only because of its storage layout, but also due to its REST catalog specification. Iceberg has gained significant traction through a recent stream of feature announcements from the community itself, major cloud providers like AWS, and data platform leaders such as Snowflake and Databricks. But cutting through the hype: how does Iceberg actually perform in the real world if you are *not* Netflix or Apple, companies capable of *Building Your Own Snowflake* (BYOS)? Can you realistically migrate from legacy solutions to Iceberg and enjoy all its promises without tradeoffs? That, of course, is a rhetorical question. Some even argue that Iceberg got parts of the specification fundamentally wrong!?! Curious? Join me for another episode of Open Table Formats in the Wild™. Expect a practical look at the current state of Apache Iceberg and Apache Parquet, alongside a gentle introduction to DuckLake and Vortex as promising contenders for table and file formats, respectively.
New studies show that the brain is not a passive receiver of stimuli but an active predictor of them. People with autism have more difficulties when the predicted and received stimuli do not match. How do we create a tech workforce where autistic individuals can work more comfortably thanks to predictability?
The James Webb Space Telescope has revealed a mysterious population of "Little Red Dots": extremely distant objects that have upended our understanding of the early Universe. However, revealing the true nature of these marvels requires computationally-intensive statistical modeling of complex astronomical data. In this talk, we explore how we used JAX and NumPyro to help solve this puzzle. We will introduce these powerful Python tools, demonstrate how they accelerate complex statistical data analysis, and show how they provided evidence that Little Red Dots may in fact be "Black Hole Stars."
AI-assisted coding has become the default. Tools like GitHub Copilot, Cursor, and Claude can generate hundreds of lines of Python in seconds. However, the real challenge isn’t how fast we generate code — it’s how we ensure that generated code actually represents our intent, follows best practices, and integrates cleanly into existing systems. In this talk, I share my personal journey adopting *Spec-Driven Development (SDD)*, a way to engineer the context in which AI writes code. Using real examples from my daily work building production-grade RAG systems, I show how specifications can become a practical way to interact with AI coding tools. I present an explicitly opinionated comparison of emerging tools such as SpecKit and OpenSpec, focusing on what each tool is good at, where it breaks down, and when I would (or would not) use it.
Traditional RAG systems struggle to understand holistic connections in distributed, constantly changing knowledge sources that characterize real-world organizations. While document-based approaches using vector embeddings provide basic retrieval, they fail to capture relationships and answer complex questions about interconnected information. Graph-based RAG offers a solution, but existing implementations like Microsoft's GraphRAG explicitly avoid dynamic operations due to complexity, requiring costly rebuilds when knowledge changes. This talk introduces a production-ready dynamic knowledge graph system that supports real-time insertion, querying, and deletion of information. Through practical implementation details you will learn to build maintainable knowledge graphs that evolve with data, handle ambiguous entities and preserve information lineage.
Time series foundation models (TSFMs) such as Chronos, Lag-Llama, TimesFM, and Siemens’ own GTT have shown strong generalization capabilities across diverse forecasting tasks. However, integrating these models into a large organization is primarily a software engineering and MLOps challenge rather than a modeling one. In this talk, we present a real-world case study based on Siemens KPI Forecast, a Python-based forecasting platform that operationalizes multiple TSFMs as reusable, production-grade services. The platform integrates both open research models and Siemens-developed models behind a unified API, supporting zero-shot inference, fine-tuning jobs, and fine-tuned inference depending on user needs and operational constraints. We focus on how Python is used to compose heterogeneous components including open and closed-source models, internal data products, APIs, and orchestration layers into a consistent time series specialist user experience. The session also covers operating such services with clear SLAs in a B2B environment, including monitoring, versioning, and governance. Attendees will gain practical insights into turning TSFMs into reliable Python services that scale across teams and use cases.
AI development usually focuses on feasibility and implementation, but a new buzzword is now being used: 'sovereignty'. While customers are excited about it, what does it mean for them and for AI developers? In this presentation, we analyse different aspects of sovereignty and explore how it can be used to build trustworthy AI solutions. We will also discuss current examples from politics and development to identify the best practices for secure data processing.
Have you ever asked yourself: Why is there no good food option close to this main station? This talk tries to find out if this is a systematic problem - using publicly available data and Google APIs. After this talk, you will know about the best- and worst-rated restaurants close to main stations in Germany, if kebabs or pizza places are systematically a better choice, and which station is the worst to eat in all of Germany.
Why did the model say "No"? In an era where machine learning models increasingly influence high-stakes decisions, "trust me" isn't a sufficient explanation. Yet the logic behind many model decisions remains a black box, often hiding bias and making it difficult to establish trust. In this talk, we move beyond the mystery of the "ghost in the machine" and into practical debugging using a structured *XAI Decision Tree*. Instead of guessing which method to use, we will walk through a logical framework that narrows down the field based on a few critical questions: the type of data you have, the level of model access available, and whether you need to explain a single prediction or the entire system. The audience will leave with a clear path to choosing the right explainable AI (XAI) method, such as SHAP, LIME, or Integrated Gradients, and the corresponding Python framework for their specific use case. This session will cover:

- Importance of XAI: understanding why XAI is crucial, using a real-world example
- XAI landscape: an overview of existing XAI methods and how they are related
- XAI decision tree: how to use the structured XAI decision tree to choose the right explanation method for your use case
- Local vs. global: a common understanding of local vs. global explainability
- XAI in practice: a demo showcasing XAI in practice, as well as corresponding Python frameworks to use
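As a flavor of the approach, here is an (over-)simplified decision helper in Python; the branches below are illustrative and much coarser than the talk's actual decision tree:

```python
def choose_xai_method(data_type: str, model_access: str, scope: str) -> str:
    """Pick an XAI method from a few critical questions (simplified sketch).

    data_type: "tabular", "image", or "text"
    model_access: "white-box" (gradients/internals) or "black-box" (I/O only)
    scope: "local" (one prediction) or "global" (the entire system)
    """
    if model_access == "white-box":
        if data_type in ("image", "text"):
            return "Integrated Gradients"   # gradient-based, needs model internals
        return "SHAP (model-specific explainer)"
    # Black-box: only inputs and outputs are available.
    if scope == "local":
        return "LIME"                        # perturb a single instance locally
    return "SHAP (KernelExplainer) or permutation importance"

print(choose_xai_method("tabular", "black-box", "local"))
```

The point is not this particular mapping but the habit it encodes: answer the access, data, and scope questions first, and the method choice largely follows.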
If you have worked with real-world data before, you know that processing it can be challenging. Data often comes scattered across tables, in inconsistent encodings, with duplicated rows and is generally dirty. In this tutorial, you will learn how to process large amounts of data reliably and quickly using `polars` and `dataframely`. What we love about `polars` is that it's easy to use, fast and elegant — it allows us to build and compose complex transformations with ease. On this basis, we built `dataframely`: a library for defining and validating contents of polars data frames. With `dataframely`, we can build pipelines without ever getting confused about what's in our data frames. We document and validate our expectations and assumptions clearly, which makes our pipeline code simpler and easier to understand. "Is this join correct?", and "where did this column come from?" are questions you will not have to worry about anymore. In this tutorial, you will become familiar with `polars` basics by writing a simple pipeline: you will read data, transform it to make it ready for use, and you will learn how to do that fast. With `dataframely` schemas, you will upgrade your code from "it works" to "it's beautiful!", and along the way, `dataframely` will help you eliminate entire classes of bugs you will never have to think about again. After the tutorial, you will be all set to use these tools in your own work.
Python makes it easy to build APIs quickly, but many APIs that start clean become fragile, difficult to change, and risky to maintain as users, features, and teams grow. This talk focuses on designing Python APIs that remain stable, understandable, and adaptable over time. Drawing from real backend systems built and maintained in production, I will share practical lessons on API design decisions that worked, those that failed, and the trade-offs behind them. We will cover topics such as boundary definition, validation strategies, versioning, authentication design, and how small early choices can either support or block long-term evolution. This talk is aimed at developers who already build APIs with Python and want to move beyond “it works” toward systems that can survive real users, ongoing change, and long-term maintenance.
Imposter syndrome affects engineers everywhere, but underrepresented professionals often face amplified self-doubt due to geography, limited access, and systemic biases. In this talk, I share my 7-year journey as an African AI engineer building global impact from outside major tech hubs. From founding DataFestAfrica and leading remote AI opportunities to getting the attention of organizations like Huawei, MongoDB, McKinsey, and AnyScale, I’ll show how community, mentorship, open source, media presence, and strategic partnerships can create opportunities and influence. Attendees will gain practical strategies to overcome self-doubt, expand their reach, and make a meaningful difference in tech, no matter where they are.
Large-scale distributed systems are inherently complex: hundreds of asynchronous services continuously emit updates, retries, corrections, and partial state. Turning that constant stream of noisy events into something that can be comprehensively searched through in real time, at the scale of hundreds of billions of records per day, can be harder than it looks. Whether you're building large-scale data systems, fighting real-time processing bottlenecks, or simply enjoy Kafka horror stories, you'll leave with practical ideas, a few scars, and hopefully fewer retries.
Want to make your tech tutorials accessible but don't know where to start? This talk shares practical techniques anyone can use. In June 2025, I started creating tutorials for deaf and hard-of-hearing learners because my partner is hard of hearing. I learned that accessible content helps everyone: international learners, people on noisy trains, junior developers and tired seniors at the end of the day. In this talk, I will share practical techniques for creating accessible tech tutorials: • Creating videos with meaningful subtitles (manual timing, simple language) • Principles of simple language for technical content • Structuring content so everyone can navigate it easily I am a content creator who learned these techniques through experimentation while teaching Excel. The talk presents my actual workflow with examples from creating tutorials for deaf/hard-of-hearing learners. Whether you're creating video tutorials, writing documentation, or teaching workshops, you'll leave with actionable steps to make your content more accessible. Why it matters: Tech education is growing globally. Making our content accessible isn't just good ethics—it makes our teaching better for everyone.
### **DungeonPy** – an interactive Dungeons & Dragons app for remote campaigns Tabletop RPGs are secretly distributed systems: one canonical world state, many clients, lossy links (players), and strict access control (“no peeking at the DM notes”). This talk introduces **DungeonPy**, which evolves a Python D&D companion from two local apps – a Pygame battle map and a PySimpleGUI initiative/condition tracker – connected by lightweight TCP messages, into an authoritative server with multiple role-aware clients. The result is a fully real-time interactive setup, where the DM controls the full state and can reveal information selectively – under the hood it’s all about client intents, server validation, state updates, event broadcasting and periodic snapshots. We will cover protocol design (deltas vs snapshots, ordering/idempotency), server-side view projections (DM omniscience vs per-player truth and fog-of-war), UI-safe concurrency, and testing your homemade message bus without summoning race conditions. Expect patterns you can reuse in any stateful client/server app – just with more goblins.
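The intent/validation/delta cycle described above can be sketched with nothing but the standard library — the message shapes and field names below are illustrative, not DungeonPy's actual wire format:

```python
import json

# A client sends an *intent*; the server validates it against canonical
# state and broadcasts a *delta* (or a rejection) to the relevant clients.
intent = {"type": "intent", "seq": 17, "actor": "player-2",
          "action": "move", "payload": {"token": "goblin-3", "to": [4, 7]}}

def apply_intent(state: dict, msg: dict) -> dict:
    """Validate an intent and return the message to broadcast."""
    token = msg["payload"]["token"]
    if token not in state["tokens"]:
        return {"type": "reject", "seq": msg["seq"], "reason": "unknown token"}
    state["tokens"][token]["pos"] = msg["payload"]["to"]
    # Deltas carry only what changed; periodic snapshots carry full state.
    return {"type": "delta", "seq": msg["seq"],
            "changes": {token: {"pos": msg["payload"]["to"]}}}

state = {"tokens": {"goblin-3": {"pos": [4, 6], "visible_to": ["dm"]}}}
delta = apply_intent(state, intent)
wire = json.dumps(delta)          # what actually travels over TCP
```

The `seq` field is what makes ordering and idempotency checks possible on the receiving side.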
Python relies heavily on special values such as `None`, `NotImplemented`, `Ellipsis`, and `dataclasses.MISSING`. These values are not incidental: they encode language semantics, enable control flow between objects, and shape API design. This talk examines sentinel values as a first-class concept in Python. We will look at why `None` is often the wrong representation for absence, how `NotImplemented` enables double dispatch in rich comparisons, and where sentinel values appear throughout the standard library. A central focus is typing. While sentinel values are ubiquitous at runtime, Python currently has no standardized way to express them precisely in type hints. We will examine why `Optional`, overloads, and `Literal` fall short, what limited narrowing is possible today, and why creating a “real” custom sentinel with reliable type narrowing is still unsolved. Finally, we will discuss [PEP 661](https://peps.python.org/pep-0661/), i.e., the deferred proposal to standardize sentinel values and their typing semantics, and what its deferral means in practice. Using real-world examples, including Pydantic’s experimental missing concept, this talk provides a clear mental model for sentinel values and realistic guidance for using them in typed Python codebases today.
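Two of the mechanisms mentioned above fit in a short sketch: `NotImplemented` driving double dispatch in `__eq__`, and a module-level sentinel distinguishing "no argument passed" from "caller passed `None`" (class and function names here are illustrative):

```python
class Meters:
    def __init__(self, value): self.value = value
    def __eq__(self, other):
        if isinstance(other, Meters):
            return self.value == other.value
        return NotImplemented      # tell Python to try other.__eq__(self)

class Feet:
    def __init__(self, value): self.value = value
    def __eq__(self, other):
        if isinstance(other, Meters):      # Feet knows how to compare to Meters
            return self.value * 0.3048 == other.value
        return NotImplemented

# Meters.__eq__ returns NotImplemented, so Python falls back to Feet.__eq__:
same = Meters(1) == Feet(1)

_MISSING = object()   # a runtime sentinel: distinct from every user value

def lookup(mapping, key, default=_MISSING):
    if key in mapping:
        return mapping[key]
    if default is _MISSING:        # "no default given" vs. "default=None"
        raise KeyError(key)
    return default
```

Note that `_MISSING` works perfectly at runtime — precisely typing `lookup`'s `default` parameter is where the difficulties discussed in the talk begin.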
Modern systems are complex - and testing them in real environments is often expensive, risky, or simply not reproducible. Simulation is a practical way to explore behavior under controlled conditions: run scenarios, validate assumptions, inject failures on purpose, and repeat experiments without touching production. In this talk, I build a concrete event-based simulation with `SimPy` to compare load-balancing algorithms under different conditions. I’ll show how `SimPy`’s processes and events fit together, how to structure the simulation cleanly, and how to move beyond a one-off demo by making runs reproducible and configurable - using configuration files and a simple command-line interface.
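At its core, discrete-event simulation is a clock plus a priority queue of scheduled events. This pure-stdlib sketch (deliberately *not* `SimPy`'s API — SimPy wraps this idea in processes and generators) simulates a single-server queue, the kind of model that load-balancing comparisons build on:

```python
import heapq
import itertools

def simulate(arrivals, service_time, servers):
    """Tiny discrete-event queue; returns completion times of all jobs."""
    counter = itertools.count()            # tie-breaker for equal timestamps
    events = []                            # min-heap of (time, id, kind)
    for t in arrivals:
        heapq.heappush(events, (t, next(counter), "arrival"))
    free, waiting, finished = servers, 0, []
    while events:
        t, _, kind = heapq.heappop(events)
        if kind == "arrival":
            if free > 0:                   # a server is idle: start service
                free -= 1
                heapq.heappush(events, (t + service_time, next(counter), "done"))
            else:                          # all busy: queue up
                waiting += 1
        else:                              # a job finished
            if waiting:                    # pull the next job from the queue
                waiting -= 1
                heapq.heappush(events, (t + service_time, next(counter), "done"))
            else:
                free += 1
            finished.append(t)
    return finished

done = simulate(arrivals=[0, 0, 1], service_time=2, servers=1)
```

Swapping in different dispatch policies at the "arrival" branch is exactly where load-balancing algorithms would plug in.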
Have you ever used GPS and realised that it was not working properly? The Sun could be responsible. In this talk, I present a **real-world machine learning forecasting system** designed to predict a Space Weather phenomenon affecting GNSS accuracy and radio communications. The system is based on **CatBoost** and integrates data from space- and ground-based observations. **SHAP** is used to debug model behaviour and to build trust in model outputs. The talk focuses on **model design and evaluation choices**, showing how interpretability and uncertainty-aware forecasting can be combined in a real-time operational pipeline.
Life sciences compliance isn't forgiving. When your software helps companies navigate FDA regulations, ISO 13485, and EU MDR, "move fast and break things" isn't an option. Audit trails matter. Documentation is mandatory. Getting it wrong means regulatory findings, delayed product launches, or worse — patient safety risks. During the development of our AI Assistant we made every mistake in the most unforgiving environment possible. After more than a year building with PydanticAI, pydantic-evals, and Claude — nearly 3,000 commits and 20+ contributors — here are 7 anti-lessons so you don't have to repeat them: 1. **"We need a multi-agent system"** — We built one. Then deleted it. 2. **"Agents need sophisticated planning"** — A todo list beat our workflow engine. 3. **"Give the agent lots of specific tools"** — Two high-level tools replaced dozens. 4. **"Encode workflows in code"** — Markdown files the agent reads at runtime won. 5. **"It works when I test it"** — Simple tests ≠ real user journeys. Realistic evals or you're blind. 6. **"Automate everything"** — Human stays in the driver's seat, not the trunk. 7. **"Apply what made you successful before"** — Your engineering instincts might hurt you here. Real code, real git commits, real mistakes from a domain where mistakes are expensive. **Come for the mistakes. Leave with shortcuts.**
Develop FastAPI applications faster with the contract-first approach using the OpenAPI Generator, no GenAI required. Machine learning models are often deployed as APIs, but the "agreement" between the consumer and the service is often fragile. How does the consuming app know if a parameter is optional or required? When the code diverges from the documentation, integration breaks. In this tutorial you will learn to define an API contract using the OpenAPI specification. We will use the OpenAPI Generator to automatically generate API endpoints and strictly typed Pydantic data models. Following this approach for all applications supports standardization, consistency, and maintainability across all projects. The session will cover three key areas: **Design**: We will define an OpenAPI specification as our single source of truth for the API and end consumer. **Generate**: We will use the OpenAPI Generator to create a FastAPI skeleton and show possibilities for customization to fit specific project needs. **Implement**: We will connect our generated app to an ML model where we will create Mystic Creatures for Real Life Problems.
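As a taste of the "single source of truth" the design step produces, here is an illustrative OpenAPI fragment — endpoint and schema names are invented for this sketch, not taken from the tutorial:

```yaml
openapi: 3.0.3
info:
  title: Mystic Creatures API      # illustrative name
  version: "1.0"
paths:
  /creatures:
    post:
      operationId: createCreature
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreatureRequest"
      responses:
        "200":
          description: Generated creature
components:
  schemas:
    CreatureRequest:
      type: object
      required: [problem]          # generator emits a required Pydantic field
      properties:
        problem:
          type: string
        temperature:
          type: number             # optional -> Optional[float] in the model
```

From a contract like this, the OpenAPI Generator can derive both the FastAPI route skeleton and the typed Pydantic request model, so "optional or required" is never a guessing game.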
Every week, development in AI brings us another groundbreaking release, another model version, another must-have integration. In this rapidly shifting landscape, how does one build production systems that won’t be obsolete by the time you deploy them? We'll explain how trusting in proven engineering principles from software development and machine learning, like separation of concerns and evaluation practices, became our anchor in an ever-changing landscape of AI development. We share lessons learned from building two MCP applications using FastMCP and PydanticAI, and show how, against these challenges, fundamental engineering principles provided the foundation we needed. Attendees who are developing AI tools will leave with practical strategies for building AI-powered systems that are flexible enough to adapt, yet stable enough to trust.
Modern applications rarely fail in obvious ways. Instead, they break at the edges: unexpected inputs, race conditions, misused APIs, and assumptions nobody realized they were making. This talk presents ten practical and repeatable ways to intentionally break an application, using a QA mindset with a strong Python focus. The session is designed to help QAs sharpen their investigative approach and move beyond happy-path testing, while giving developers concrete insight into where real-world failures often originate. Each “way to break an application” highlights a common risk area such as data handling, state management, timing, configuration, or integration boundaries. Attendees will learn how to think more destructively (in a productive way), design better tests, and recognize fragile design decisions earlier. The goal is not to assign blame, but to improve collaboration and software quality by understanding how systems actually fail in practice.
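One concrete flavor of "breaking at the edges": throw boundary and oddly-shaped inputs at a parser that looks fine on the happy path. The function below is hypothetical, written for this sketch — note the two inputs that *pass* unexpectedly, examples of assumptions nobody realized they were making:

```python
def parse_port(raw: str) -> int:
    """A hypothetical, happy-path-friendly parser."""
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port

# Destructive inputs a QA mindset throws at it:
cases = ["8080", " 8080 ", "0", "65536", "-1", "", "80.0", "0x50", "8_080"]
results = {}
for raw in cases:
    try:
        results[raw] = parse_port(raw)
    except ValueError as exc:
        results[raw] = type(exc).__name__
```

`int()` silently strips whitespace and accepts underscores, so `" 8080 "` and `"8_080"` are accepted — behavior the author almost certainly never specified, and exactly the kind of fragility this testing style surfaces.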
Everyone talks about LLMs, RAG, and AI agents - but who truly understands them? Marketing promises magic while documentation assumes expertise. Recent research from Gartner reveals the consequences: only 8% of HR leaders believe their managers possess adequate AI competency, while companies that restructure work around AI achieve revenue goals twice as often as those who merely train employees. The problem isn't lack of information; it's the lack of genuine understanding through experience. We took a different approach. Instead of slides or tutorials, we built "AI Factory" - a non-profit educational platform in the form of an escape room game where players learn by doing. Craft prompts under budget pressure. Watch guardrails fail in real-time. Break their own RAG pipeline. Each mistake teaches more than any documentation ever could. In this talk, we'll share what we discovered while building and testing this game with real users: why failure-driven learning outperforms tutorials, how game mechanics create memorable "aha moments," and the surprising concepts that clicked only through play.
The timeless phrase “garbage in, garbage out” is even more important today with the growing usage of non-deterministic generative neural networks, which amplifies the effect of bad data quality. This presentation describes Data Quality Monitor — a tool to bring transparency into data quality and help drive real improvements. In the talk, we'll cover what defines a successful data quality monitoring solution and share findings from our initial evaluation of available open-source frameworks. Next, we'll showcase our implementation based on DQX. DQX is a lightweight, open-source framework for performing row-level data quality checks programmatically, with business rules organized in manageable YAML files. DQX, originally developed by Databricks Labs, integrates seamlessly with PySpark, making it easy and affordable to run data quality checks within our IoT data lake. Finally, we will discuss the organizational processes and structures required to effectively respond to data quality issues.
The City of Munich is modernizing its communication: With the transition to the Zammad ticketing system, there is a unique opportunity to not only manage citizen inquiries but to proactively process them using Artificial Intelligence. The Zammad-AI project utilizes a two-stage process consisting of intelligent classification and RAG-based (Retrieval-Augmented Generation) response drafting to significantly reduce the workload of administrative staff. In this talk, we demonstrate how we integrated Zammad-AI via an internal Kafka message bus to process tickets in real-time. We explore the technical workflow—from thematic context analysis to the generation of valid response drafts based on a department-specific knowledge base.
How do you evaluate performance when you predict more than 10 million time series each day? While a good plot can be worth more than a thousand metrics for a single time series, with large-scale machine learning models implemented with *LightGBM* and *PyTorch* we have to resort to meaningful aggregations. We will share insights and learnings from the past 2 years of deploying and operating our article-level demand forecasting models at the pricing department of Zalando. This talk moves beyond basic metrics to showcase the pitfalls of aggregated error measures and the best practices we’ve developed to keep our stakeholders informed and our models accurate.
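One pitfall of aggregated error measures can be shown with toy numbers (invented for this sketch): a global weighted MAPE pooled across series is dominated by high-volume series and can look excellent while a small series is completely wrong.

```python
# Two series: one high-volume ("A"), one low-volume ("B").
actuals   = {"A": [100, 100, 100], "B": [2, 2, 2]}
forecasts = {"A": [ 98, 102, 100], "B": [4, 0, 4]}

def wmape(a, f):
    """Weighted MAPE: total absolute error over total actual volume."""
    return sum(abs(x - y) for x, y in zip(a, f)) / sum(a)

# Pooling everything hides the small series entirely:
global_wmape = wmape(actuals["A"] + actuals["B"],
                     forecasts["A"] + forecasts["B"])   # ~3.3% -- looks fine

per_series = {k: wmape(actuals[k], forecasts[k]) for k in actuals}
# per_series["B"] is 100% error, invisible in the global number.
```

This is why meaningful aggregations (per-segment, volume-stratified, or distributional views) matter once you cannot eyeball ten million plots.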
AI recruiting systems are increasingly used to filter, rank, and select applicants at scale. Yet their deployment raises essential questions: How reliable are these models in real hiring environments, and how do we ensure fairness and safety across diverse applicant profiles? This talk presents a structured approach to testing and validating AI-driven recruiting pipelines. It highlights the role of synthetic test data, data augmentation, and fairness metrics in uncovering systemic risks and mitigating bias. Attendees will walk through a complete evaluation workflow. The session also incorporates insights from real-world testing practices, demonstrating how rigorous validation can increase trust and transparency in recruitment AI.
AI is sometimes hard to explain, especially for people outside of tech. With robots, AI becomes visible and tangible. In this talk we want to show how we can use Python and the Hugging Face Reachy Mini as an example to make AI more concrete, interactive, and engaging for beginners and non-experts.
Many machine learning tools assume abundant, independent data, rely on a single data split plus cross-validation, and leave test-set separation to the user. In application-driven domains such as industrial materials science and pharmaceutical development, data are scarce, high-dimensional, and often correlated, creating conditions under which standard ML pipelines frequently fail. Small datasets are highly sensitive to the random seed used for splitting, and common pitfalls such as feature selection before splitting or distributing correlated samples across train and test sets cause data leakage and inflated performance metrics. Octopus is an open-source Python AutoML library explicitly designed for the small-data, high-dimensional regime. It enforces strict nested cross-validation for model and hyperparameter selection, quantifies performance variability across multiple splits, and tightly controls data leakage. Its modular architecture embeds an internal ML engine, several feature selection methods (e.g., MRMR, Boruta), and external AutoML solutions such as AutoGluon into a unified, rigorous validation framework, enabling systematic and fair comparison of methods on limited data. In addition, Octopus supports survival analysis, addressing time-to-event problems common in healthcare and materials science. This talk will use realistic small-scale datasets to illustrate how conventional pipelines can be misleading and how to obtain more reliable models when every sample matters.
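The nested cross-validation structure that Octopus enforces can be illustrated in pure Python (a didactic sketch with a toy 1-nearest-neighbors model, not Octopus code): the inner loop selects hyperparameters using only the training portion, the outer loop scores on data the selection never saw.

```python
def kfold(n, k):
    """Deterministic contiguous folds (real code must also group correlated samples)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for s in sizes:
        folds.append(list(range(start, start + s)))
        start += s
    return folds

def knn_predict(train, x, k):
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mse(k, train, test):
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in test) / len(test)

def nested_cv(data, ks, outer=3, inner=2):
    outer_scores = []
    for test_idx in kfold(len(data), outer):
        held_out = [data[i] for i in test_idx]
        trainval = [d for i, d in enumerate(data) if i not in set(test_idx)]

        def inner_score(k):   # hyperparameter selection on trainval ONLY
            total = 0.0
            for fold in kfold(len(trainval), inner):
                fit = [trainval[i] for i in range(len(trainval)) if i not in set(fold)]
                val = [trainval[i] for i in fold]
                total += mse(k, fit, val)
            return total / inner

        best_k = min(ks, key=inner_score)
        # Unbiased estimate: score best_k on data the selection never saw.
        outer_scores.append(mse(best_k, trainval, held_out))
    return outer_scores

scores = nested_cv([(i, 2 * i) for i in range(12)], ks=[1, 3])
```

Selecting `best_k` on the same data that produces the reported score — the shortcut many pipelines take — is exactly the leakage this structure prevents.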
As one of Europe’s largest retail corporations, REWE Group owns and manages prominent supermarket chains such as REWE and PENNY, among many other subsidiaries. In this talk I will give a brief overview of how we introduced a formal mentoring program, Pair & Share, at the central analytics department of REWE Group with its more than 150 data scientists, engineers, analysts and other colleagues. Before Pair & Share, there was no formal process for personal, technical or methodological growth: although there were plenty of possibilities, further training and education were self-organized and fragmented. We created the program to foster growth among our colleagues and to build and strengthen inter-team exchange. This talk will cover a brief overview of REWE Group and our analytics department followed by a motivation for Pair & Share. Afterwards I will explain how we planned the mentoring program and defined parameters like the matching process, the time frame and how to recruit participants. I will also share my experiences of the first six months of mentoring, and what kind of roadblocks but also pleasant surprises we encountered. The talk will conclude with an outline of how we plan to continue and improve the program.
Wolt’s Universal Venue Ranker (UVR) is a large-scale, sequence-aware ranking model for personalized restaurant recommendations, deployed across more than 30 countries. UVR replaces three previously independent models—Neural Collaborative Filtering, a second-pass ranker, and a first-time-user model—by combining a transformer with a gradient-boosted decision tree for ranking. The model follows a two-stage design. In the first stage, an encoder-style transformer learns a personalized user state representation from historical restaurant purchase sequences enriched with spatiotemporal signals such as time and location. In the second stage, a CatBoostRanker uses the transformer output as an input feature alongside additional user-, venue-, user–venue-, and delivery-specific features to score and rank candidate venues. In this talk, we present the model and service architecture, the training and evaluation setup, and both offline and online results from a multi-country online A/B test, demonstrating significant improvements in global conversion rate and new venue trial rate. We also share practical lessons from deploying and operating a multi-stage ranking model under strict latency constraints at global scale.
Modern web frameworks such as Hono have renewed interest in schema-driven development and the “Lambdalith” architecture, where an application is delivered as a single AWS Lambda function. While this model provides a predictable developer experience, Python-based serverless systems often struggle to achieve the same consistency, validation, and maintainability in production. Deploying Python web frameworks to AWS Lambda frequently requires additional execution layers—such as ASGI adapters or container-based runtimes—which add complexity and blur data boundaries. For teams that prefer clear, minimal Lambda handlers, these abstractions can hinder both development and operations. This session shares production-proven patterns for building schema-driven Lambdalith applications in Python using AWS Lambda Powertools and Pydantic, without relying on heavy framework abstractions. Through real-world examples, we show how these tools simplify handler logic, standardize request and response validation, and improve observability and error handling. Attendees will leave with practical techniques for building reliable and maintainable Python Lambdalith systems, and insights they can immediately apply to modernizing existing serverless codebases or delivering new production services with confidence.
Large language models have been widely used in tool-calling workflows thanks to their strong performance in generating appropriate function calls. However, due to their size and cost, they are inaccessible to small-scale builders, and server-side computing makes data privacy challenging. Small language models (SLMs) are a promising, affordable alternative that can run on local hardware, ensuring higher privacy. Unfortunately, SLMs struggle with this task - they pass wrong arguments when calling functions with many parameters, and make mistakes when the conversation spans multiple turns. On the other hand, for production applications with specific API sets, we often don't need general-purpose LLMs—we need reliable, specialized models. This talk demonstrates how to increase the accuracy of SLMs (under 8B parameters) on custom tool-calling tasks. We will share how leveraging knowledge distillation helps to get the most out of SLMs in low-data settings - they can even outperform LLMs! We will present the whole pipeline, from data generation and fine-tuning to local deployment.
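The core idea of knowledge distillation — training the student to match the teacher's temperature-softened output distribution rather than hard labels — fits in a few lines of pure Python. The toy logits below stand in for "which tool to call" scores and are invented for this sketch:

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)   # teacher's soft targets
    q = softmax(student_logits, temperature)   # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy logits over three candidate functions:
teacher = [3.0, 1.0, 0.2]
loss_far  = distillation_loss(teacher, [0.0, 2.0, 1.0])   # student disagrees
loss_near = distillation_loss(teacher, [2.9, 1.1, 0.3])   # student close
```

The soft targets carry the teacher's *relative* preferences between tools — far more signal per example than a hard label, which is why distillation pays off in low-data settings.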
Weather and environmental data power analytics, ML, and operations—but APIs differ wildly and data prep is slow. Wetterdienst is a Python library that provides a unified, Polars‑first interface to multiple weather services (DWD, ECCC, EA, NOAA/NWS, Geosphere Austria, IMGW, Eaufrance, WSV, and more). It standardizes request patterns, returns tidy (long) data, converts to SI units, handles caching, timezones (UTC by default), and retries—so teams can focus on analysis instead of plumbing. This talk introduces Wetterdienst’s provider architecture, core request patterns, performance practices with Polars, and how to integrate via Python, CLI, or its REST API. We’ll walk through real examples (station discovery, parameter selection, timeseries retrieval), exporting to databases, and patterns for robust pipelines in ETL and ML.
E-commerce cataloging at idealo operates at extreme scale: 4.5 billion offers from 50,000+ shops across six countries, with peak ingestion rates of 4.8 million offers per minute. While large language models (LLMs) provide strong classification accuracy, they are too slow and costly for billion-scale real-time processing. This talk shows how idealo builds a cost-efficient, high-throughput machine learning system that leverages LLM knowledge without deploying full models in production. We present how knowledge distillation from a large e5 instruction model enables a compact multilingual MiniLM encoder to achieve high accuracy, and how optimized inference runtimes and specialized hardware such as AWS Neuron help meet strict latency and cost requirements. Beyond modeling, we highlight key operational challenges: constructing training datasets from massively imbalanced data, selecting the right encoder architecture from today’s model landscape, and designing a robust MLOps lifecycle with automated data sampling, training, deployment, and monitoring. Attendees will learn practical techniques for scaling ML systems under real-world constraints, how to extract value from LLMs when they are too large to serve directly, and how to transition research prototypes into reliable, high-volume production pipelines.
"Zero-copy" data transfer promises free communication between Spark's JVM and Python workers, but at 6 billion rows daily, the reality is far more complex. This session explores the low-level mechanics of distributed inference, focusing on the serialization bottlenecks that plague large-scale Gradient Boosted Tree models. We will conduct a forensic analysis of execution plans generated by `pandas_udf`, `mapInPandas`, and SynapseML. By profiling memory hierarchies and CPU cycles, we visualize the true cost of pickling, Arrow record batching, and JNI context switching. Join this deep dive to understand the physics of distributed inference and learn how to tune `spark.sql.execution.arrow.maxRecordsPerBatch` to prevent OOMs without starving the CPU.
Forecasting talks love a clean ending: “and then we improved WMAPE by 3.7%.” Nice. Now put that model into production without suffering from instability. You retrain your model on a few new weeks of data and suddenly the one-year forecast jumps 15–20%. Planning teams redo decisions, trust erodes, and your “accurate” model becomes unusable. This talk is about forecast stability: how much forecasts change when you add new data and rerun the same pipeline. We run a simple experiment: train a model, forecast one year ahead, add recent data, retrain, and measure forecast-to-forecast change. We repeat this across common forecasting approaches including ETS/ARIMA, Prophet, XGBoost with lag features, AutoGluon ensembles, neural/global models, and TimeGPT-style APIs. You will see that high accuracy does not guarantee usable forecasts, and that some models are systematically more volatile than others. We then cover practical ways to stabilise forecasts without freezing them, focusing on reconciliation and ensembling (including origin ensembling). This talk is for forecasting practitioners who want models users actually trust, not just good metrics.
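The forecast-to-forecast change measured in the experiment can be expressed as a simple metric — this is an illustrative sketch with made-up numbers, not the talk's exact definition:

```python
def overlap_change(prev, new, shift):
    """Mean absolute relative change between two forecast vintages on their
    overlapping horizon. `prev` was produced `shift` periods before `new`;
    each list starts the period after its own forecast origin."""
    pairs = list(zip(prev[shift:], new))
    changes = [abs(n - p) / abs(p) for p, n in pairs if p != 0]
    return sum(changes) / len(changes)

prev = [100, 102, 104, 106, 108]   # one-year forecast made at origin t
new  = [110, 112, 114]             # forecast after retraining at origin t+2
instability = overlap_change(prev, new, shift=2)   # ~5.7% average jump
```

Tracking this number across retraining cycles, alongside accuracy, is what separates a model users trust from one they quietly override.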
In the daily work of a data engineer, building new data pipelines often takes priority, while maintaining them and ensuring their correctness becomes an afterthought. This focus can quickly turn into a pitfall: failures go undetected, incorrect data silently propagates, and complaints from stakeholders arrive before engineers notice any issues. In practice, incorporating observability into every new data pipeline helps avoid these problems and enables teams to steadily increase system complexity while maintaining trust and peace of mind. In this talk, I introduce observability in the context of data pipelines, covering its three core pillars: metrics, alarms, and logs. We will explore concepts such as white-box versus black-box monitoring and best practices like the four golden signals and how they apply to data pipelines. I will show easy-to-implement first steps and share real-world experiences where improved observability helped uncover previously unknown incorrect behavior, gradually improve data quality, and build trust in data systems. This talk is well suited for data engineers who have had little exposure to observability and want strategies for keeping sane while managing a jungle of pipelines.
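A first step toward white-box observability can be as small as a context manager that emits latency, success/error counts, and a row-count metric for every pipeline step. The in-memory `metrics` dict below is a stand-in for a real metrics backend; the step name is invented:

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

metrics = {}   # stand-in for a real metrics backend (e.g. a push gateway)

@contextmanager
def observed_step(name):
    """Record latency and outcome for one pipeline step."""
    start = time.monotonic()
    try:
        yield metrics
        metrics[f"{name}.ok"] = metrics.get(f"{name}.ok", 0) + 1
    except Exception:
        metrics[f"{name}.error"] = metrics.get(f"{name}.error", 0) + 1
        log.exception("step %s failed", name)
        raise
    finally:
        metrics[f"{name}.latency_s"] = time.monotonic() - start

with observed_step("load_orders") as m:
    rows = [{"id": 1}, {"id": 2}]               # pretend extraction
    m["load_orders.rows_out"] = len(rows)       # traffic: a golden signal
```

Latency, traffic, and errors from this sketch cover three of the four golden signals; saturation usually comes from the execution platform instead.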
Containers are a fundamental part of the modern developer's toolkit, yet they are frequently misunderstood and described as "lightweight virtual machines." This talk demystifies containerization by building a functional, minimal engine from scratch using only the Python standard library. We will step away from high-level tools like Docker to explore how the Linux kernel provides isolation through features like `chroot`. Using a hands-on approach, we will demonstrate how to set up a sandboxed environment, isolate a filesystem, and execute processes within it. This session is designed for developers who use containers daily but haven't yet had the opportunity to look under the hood or explore the underlying operating system principles. By implementing a simplified version of these tools, you will gain a clearer, more practical understanding of the core mechanics that make containerization possible.
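The filesystem-isolation step can be sketched with the standard library alone. This is Linux-only and actually executing the jail requires root plus a prepared root filesystem, so the example is a sketch of the mechanism rather than the talk's full engine:

```python
import os

def minimal_rootfs(root):
    """Directories a bare chroot jail needs before copying in a static binary."""
    return [os.path.join(root, d) for d in ("bin", "proc", "dev")]

def run_in_chroot(new_root, argv):
    """Fork, jail the child with chroot, and exec argv inside (requires root)."""
    pid = os.fork()
    if pid == 0:                       # child process
        os.chroot(new_root)            # filesystem isolation
        os.chdir("/")                  # don't keep a handle outside the jail
        os.execvp(argv[0], argv)       # replace the child with the target
    _, status = os.waitpid(pid, 0)     # parent waits for the jailed child
    return os.waitstatus_to_exitcode(status)

# With root and a rootfs in place, usage would look like:
# run_in_chroot("/srv/minirootfs", ["/bin/sh", "-c", "echo hello"])
```

Real container engines layer namespaces and cgroups on top of this; `chroot` alone only hides the host filesystem, which is exactly the distinction from a "lightweight VM" the talk unpacks.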
A data scientist builds a Streamlit or Dash prototype, the business wants to validate it, and the hard parts begin: getting access to live data, making the app available company-wide, and ensuring every user only sees what they are allowed to see. Following "best practices" turns a simple demo into weeks of platform work, leaving data scientists frustrated and blocking them from shipping apps to end users. In this talk we will **live-demo** Merck's self-service app service we have developed and hardened over multiple years. It lets **teams deploy Streamlit (and friends) in 3 minutes** while meeting best practices like SSO, CI/CD, and governed data access control. The platform has become essential for Merck to ship data apps at scale: in 2025 it powered **750+ active apps** reaching **8,000+ unique end users**. **Under the hood, we show:** how a use-case based access model enables scoped resource permissions so apps can safely access data on-behalf of the user. We also show starter templates that generate a deployable Git repo with example pages (e.g. Snowflake access or internal LLM chatbot). Finally, we cover the guardrails needed to operate this safely. **What you will learn:** a cost-effective reference architecture based on AWS that you can adapt to your hyperscaler or platform, practical patterns for balancing the trade-off between central control and decentral freedom, and how templates and CI/CD help teams iterate quickly without compromising security or reliability.
Most hyperparameter optimization (HPO) stops at the model boundary. But what happens when your system relies on a complex chain of steps: a short-horizon model, a long-horizon model, ensembles, post-processing, and more? Tuning one piece in isolation often leads to sub-optimal global results. In this talk, we explore how we used Ray to move beyond simple model tuning. We’ll dive into a "Pipeline-as-a-Trial" architecture where Ray acts as the brain, triggering independent, scalable cloud workflows (SageMaker Pipelines or Databricks Workflows) for every hyperparameter set. We will discuss: * The architectural shift from tuning models to tuning pipelines * How to build the DAG/pipeline on SageMaker/Databricks using declarative configs * How to use Ray to orchestrate heavyweight remote jobs without bottlenecks. Attendees will learn how to optimize entire pipelines (in a scalable manner on cloud) to minimize global metrics like WAPE, rather than just local model loss.
As large language model (LLM)-powered “AI highlights” become the first information people see on the Web, a key question arises: how much variety and perspective do these systems actually deliver for information-seeking queries? Do LLMs offer broader viewpoints than traditional search or Wikipedia pages? Do larger models really produce more diverse answers—or are they all converging on the same language and framing, raising concerns about “knowledge collapse”? Drawing insights from experiments across LLM families, real-world topics, and hundreds of user-style prompts, this talk introduces an open-source framework for benchmarking and tracking epistemic diversity in LLMs. We focus on practical lessons for data scientists building and evaluating LLM-powered search, summaries, and knowledge systems—where diversity of information actually matters.
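A simple lexical proxy makes "converging on the same language" concrete — this distinct-n-gram ratio is a crude stand-in for the richer epistemic-diversity measures the framework tracks, and the example answers are invented:

```python
def distinct_n(texts, n=2):
    """Share of unique n-grams across a set of answers.
    A crude lexical-diversity proxy: 1.0 = no repetition, near 0 = collapse."""
    ngrams = []
    for t in texts:
        tokens = t.lower().split()
        ngrams += [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

converged = ["the policy is widely supported"] * 4
varied = ["critics argue the policy overreaches",
          "supporters credit the policy with growth",
          "economists are split on the policy",
          "the policy is widely supported"]
```

A low score on `converged` flags identical framing across samples; genuine epistemic diversity additionally requires that the *viewpoints*, not just the words, differ — which is where the benchmarking framework goes further.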
Before you ship an AI agent to production, you need to understand how it can be broken. Jailbreaking and prompt injection attacks are not edge cases—they are an inevitable consequence of deploying real-world, action-taking AI systems. This talk is a practical primer on the most common ways agents fail under adversarial pressure. We’ll break down how jailbreaking and prompt injection attacks actually work, including techniques such as excessive agency, prompt leakage, and weaknesses in vector search and embeddings. We’ll examine why popular AI guardrails consistently fail in practice, and offer little more than a false sense of protection. We’ll also address a common misconception: the absence of major AI security incidents does not mean systems are safe. Instead, it reflects limited deployment, constrained agency, and cautious rollout. As organizations adopt browser agents, autonomous tools, and systems that can take real-world actions, these vulnerabilities quickly become critical attack surfaces. This talk focuses on what organizations should do instead: applying proven security principles—least privilege, isolation, monitoring, and abuse modeling—adapted to the unique properties of AI systems. Attendees will leave with a clear understanding of the real risks, why they matter today, and the concrete steps to take before shipping an AI agent into production.
Ever wanted to build a website that can run Python, but you're worried about running user-submitted code on your server? In this talk I'll show how HoloViz Panel can create an interactive coding environment where students can write functions, solve exercises, and experiment safely, all while their code runs locally via WebAssembly.
Discover how to build a GPU shader generator in pure Python, without having to write a compiler. We start by discussing how Pythonic embedded domain-specific languages (EDSLs) can help address the common challenges of shader programming. We then examine the architectural decisions shared by popular frameworks like Warp and Taichi and outline their limitations. In particular, their reliance on introspection means supporting only a subset of Python - a language within a language - while compiler-like backends necessitate complex implementations in languages like C++. The talk introduces an alternative architecture making it possible to overcome these limitations. Instead of introspection, we capture the program's logic by tracing execution with proxy objects at Python runtime, similar to JAX and PyTorch. Instead of building an IR, we emit target code eagerly, line-by-line, similar to how PyTorch Eager Mode launches computations. And because we don't implement a compiler, the implementation remains 100% Python. Attendees will leave with a toolbox of Python metaprogramming patterns empowering them to write a code generator in Python without having to implement a compiler.
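The tracing idea can be sketched in a few lines of plain Python. All names below (`Tracer`, `Var`, the emitted C-like syntax) are illustrative, not taken from any real framework: ordinary Python operators on proxy objects record the program's logic and emit target code eagerly, line by line.

```python
# Minimal sketch of trace-based code generation with proxy objects.
# Names and the emitted target syntax are illustrative only.

class Tracer:
    def __init__(self):
        self.lines = []   # emitted target code, one line per operation
        self.counter = 0

    def fresh(self):
        self.counter += 1
        return f"t{self.counter}"

class Var:
    def __init__(self, tracer, name):
        self.tracer = tracer
        self.name = name

    def _binop(self, op, other):
        rhs = other.name if isinstance(other, Var) else repr(other)
        out = self.tracer.fresh()
        # emit target code eagerly, line by line, as the Python code runs
        self.tracer.lines.append(f"float {out} = {self.name} {op} {rhs};")
        return Var(self.tracer, out)

    def __add__(self, other): return self._binop("+", other)
    def __mul__(self, other): return self._binop("*", other)

def trace(fn, n_args):
    tracer = Tracer()
    args = [Var(tracer, f"a{i}") for i in range(n_args)]
    result = fn(*args)
    tracer.lines.append(f"return {result.name};")
    return "\n".join(tracer.lines)

# An ordinary Python function; running it on proxies records its logic.
code = trace(lambda x, y: x * y + 2.0, 2)
print(code)
```

Because the shader function is executed as normal Python, the full language (loops, helper functions, comprehensions) is available, with no introspection and no intermediate representation.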
Keywords: **Explainable AI, enhanced RAG, GraphRAG, LLMOps, dialog system evaluation.** Designing reliable dialog flows for LLM-based systems remains challenging once conversations require branching, correction, or multi-step reasoning. Dialog graphs often evolve organically and accumulate structural issues: endless correction loops, dead subpaths, redundant validation steps, overly generic catch-all branches, or linear sequences that should be collapsed. Such phenomena raise operational costs, significantly increase time to first token (TTFT), and make system answers less predictable and explainable. Many solutions try to introduce a one-size-fits-all generalized RAG retrieval solution. Contrary to this, we present our empirical learnings on how to enhance system speed, lower overall costs, and offer better dialog graph explainability through enhanced LLM call tracing and iterative enhancements for common dialog paths. We also show that more elaborate knowledge retrieval strategies like GraphRAG may drastically enhance overall response quality and shorten the dialog graph. We evaluate several approaches and give recommendations on how to leverage more complex document indexing phases for inference time benefits. Overall, the session argues that scalable conversational systems require not only better prompts, but explicit graph structures paired with rigorous tracing and data-driven optimization.
Python is the standard solution for many machine learning and data science applications, from large cloud systems to workstations, and even on larger embedded or robotics systems. But as we move down into more constrained environments, regular (C)Python becomes a poorer fit. The MicroPython project provides a Python implementation tailored for such environments, which makes it possible to scale down to microcontrollers with just a few megabytes of RAM (or less!). As a bonus, MicroPython with WebAssembly also makes lightweight browser applications possible. In this talk, we will discuss how to combine Internet of Things (IoT) hardware, MicroPython, and the browser to build stand-alone smart sensor systems and laboratory gear for physical data science.
The rise of time-series foundation models like Chronos-2 and TimesFM has sparked a debate: can a single pre-trained model replace the specialized "local" models we have tuned for years? We moved beyond the hype to test these models in production-like environments, from high-level market trends to granular article-level demand. In this talk, we share a transparent look at our journey: the zero-shot capabilities of these models, the reality of fine-tuning with exogenous business drivers, and a comparison between generative models and state-of-the-art classical methods. We categorize what is currently possible, what remains a challenge, and provide a roadmap for teams looking to integrate foundation models into their forecasting stack without sacrificing reliability.
Python UDFs often become the slowest part of PySpark pipelines because they run row-by-row and pay a high cost crossing the JVM↔Python boundary. Spark’s Arrow-backed execution changes that cost model by moving data in columnar batches, which can reduce overhead and enable efficient, vectorized processing in Python. In this session, we’ll cover practical patterns for writing Arrow-friendly UDF logic and integrating it with fast Python execution engines that operate on Arrow data. We’ll compare common approaches—scalar UDFs, Pandas UDFs, Arrow-native UDFs, and table-shaped Arrow transforms—then translate the results into a decision guide you can apply to production pipelines. Attendees will leave knowing when Arrow helps, when it doesn’t, and how to design UDF-heavy transformations that scale.
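The cost model can be illustrated without Spark at all. The toy sketch below only counts boundary crossings; in real PySpark the batch path corresponds to `pyspark.sql.functions.pandas_udf` or Arrow-native UDFs, where each crossing additionally pays serialization costs.

```python
# Toy model of scalar vs. batch UDF execution: the work is identical,
# but the per-call overhead (here, one log entry standing in for a
# JVM<->Python boundary crossing) is paid per row vs. per batch.

def scalar_udf(x, overhead_log):
    overhead_log.append(1)      # one boundary crossing per row
    return x * 2

def batch_udf(batch, overhead_log):
    overhead_log.append(1)      # one boundary crossing per batch
    return [x * 2 for x in batch]

rows = list(range(10_000))

per_row_cost = []
out1 = [scalar_udf(x, per_row_cost) for x in rows]

per_batch_cost = []
out2 = []
batch_size = 1_000
for i in range(0, len(rows), batch_size):
    out2.extend(batch_udf(rows[i:i + batch_size], per_batch_cost))

assert out1 == out2   # same results, very different overhead
print(len(per_row_cost), "crossings vs.", len(per_batch_cost))
```

With columnar batches, the Python side can also hand the data to vectorized engines instead of looping row by row, which is where the real speedups come from.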
Building inclusive data teams and sustainable career paths is a challenge many organizations struggle with—especially in fast-growing, highly technical environments. Data careers are often portrayed as linear, while diversity initiatives remain abstract or ineffective in practice. This talk shares concrete, experience-based lessons from building an inclusive data organization that supports career growth, fosters an internal data science community, and achieved more than 50% women representation in data roles. Rather than focusing on theory, the session highlights practical decisions, structural changes, and leadership behaviors that made inclusion measurable and sustainable. Attendees will gain actionable insights into designing career paths that support non-linear journeys, creating internal data communities that encourage learning and collaboration, and implementing diversity practices that strengthen—rather than dilute—technical excellence. The talk is relevant for data scientists, engineers, team leads, and managers who want to build better teams and healthier data cultures.
LLM applications frequently pass tests but fail users in production. This talk examines the gap between evaluation metrics and user experience through three lenses: **Expectations** (what "working" means to users), **Functional** (system-level vs. component-level success), and **Operational** (real-world reliability). Drawing from production experience, we'll share scenarios of expectation mismatches, silent failures, and undetected drift—plus practical strategies for bridging the gap. The core message: evaluation should answer whether your system serves users, not whether it passes tests.
How do you teach a computer to understand the nuance of a territorial strategy game? Over the last year, driven by personal interest and a Master’s degree in Computer Science, I set out to build an autonomous agent for Antiyoy—a minimalist yet strategically deep hexagonal game. This talk takes you through the entire journey, from the first line of environment code to a trained model capable of complex strategic play. The talk addresses a fundamental problem in modern AI: how do we translate complex, turn-based rules into a language a neural network can interpret? We will explore the architecture of a custom Reinforcement Learning environment, the mathematical elegance required to handle hexagonal coordinates, and the engineering behind a massive action space of over 17,000 possibilities. You will hear how we guided the agent through the "sparse reward" problem and what happened when a model was finally allowed to play against its own mistakes. Whether you are interested in the mechanics of PyTorch and PettingZoo or simply curious about the "brain" of a strategy bot, this session provides a practical roadmap for tackling high-dimensional problems in Python.
Modern scientific workflows increasingly rely on interactive analysis, reproducibility, and high-quality visualisation. **PyPLUTO** is a Python package designed to explore, analyse, and visualise numerical simulations produced by the **PLUTO** code for computational astrophysics. This talk shows how *PyPLUTO* leverages the Python ecosystem to transform raw simulation outputs into clear, flexible analysis and visualisation workflows. The session demonstrates how domain-specific simulation data can be integrated with tools such as `NumPy` and `Matplotlib` to support efficient post-processing, rapid exploration, and production of publication-quality figures. Attendees will see how structured Python workflows can replace fragmented, ad-hoc scripts, how visualisation accelerates scientific insight, and how Python lowers the barrier between simulation output and interpretation. Although examples are drawn from computational astrophysics, the approach is broadly applicable to any field working with structured simulation data. The talk highlights how lightweight, Python-based post-processing tools can improve clarity, reproducibility, and productivity without imposing heavy frameworks or tightly coupled visualisation pipelines.
The Python Abstract Syntax Tree powers tools like pytest, linters, and automatic refactoring. In this talk, we'll approach syntax trees from first principles and see how Python code can be treated as structured data. We'll then explore how syntax trees can be used to automate refactoring across large codebases. Using a real-world example and the libCST library, we'll build a small refactoring tool and share practical advice for writing and applying automated refactorings. You'll leave with a clear mental model of syntax trees and a solid starting point for writing your own refactoring tools.
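As a taste of treating code as structured data, here is a tiny refactoring sketch. It uses only the standard library's `ast` module rather than libCST (the talk uses libCST because, unlike `ast`, it preserves comments and formatting); the function names are made up for illustration.

```python
import ast

# Toy automated refactoring: rename every call to fetch() into
# fetch_json(), including nested calls, by rewriting the syntax tree.

class RenameCalls(ast.NodeTransformer):
    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            node.func = ast.copy_location(
                ast.Name(self.new, ast.Load()), node.func
            )
        return node

source = "result = fetch(fetch(url), timeout=5)\n"
tree = RenameCalls("fetch", "fetch_json").visit(ast.parse(source))
new_source = ast.unparse(tree)  # Python 3.9+
print(new_source)
```

Run across a whole codebase, the same transformer handles every call site mechanically, which is exactly the kind of change that is tedious and error-prone to do by hand.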
You’ve likely used a tool like black, flake8, or ruff to lint or format your code, or a tool like sphinx to document it, but you probably do not know how they accomplish their tasks. These tools and many more use **Abstract Syntax Trees (ASTs)** to analyze and extract information from Python code. An AST is a representation of your code's structure that enables you to access and manipulate its different components, which is what makes it possible to automate tasks like code migrations, linting, and docstring extraction. In this workshop, you’ll learn how to use the Python standard library’s ast module to parse and analyze code. Using just the standard library, we will implement a couple of common checks from scratch, which will give you an idea of how these tools work and help you build the skills and confidence to use ASTs in your own projects.
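For a flavour of what the workshop builds, here is a small sketch of one classic check, flagging mutable default arguments, written from scratch with the stdlib `ast` module (the check chosen here is illustrative):

```python
import ast

# A minimal linter check: warn on mutable default arguments,
# implemented as an ast.NodeVisitor over the parsed syntax tree.

class MutableDefaultChecker(ast.NodeVisitor):
    def __init__(self):
        self.problems = []

    def visit_FunctionDef(self, node):
        for default in node.args.defaults:
            # list/dict/set literals as defaults are shared across calls
            if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                self.problems.append(
                    f"line {default.lineno}: mutable default in {node.name}()"
                )
        self.generic_visit(node)

code = """
def ok(x, y=0):
    pass

def risky(items=[]):
    items.append(1)
"""

checker = MutableDefaultChecker()
checker.visit(ast.parse(code))
print(checker.problems)   # flags risky() only
```

Tools like flake8 and ruff are, at their core, large collections of visitors like this one running over the same tree.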
This talk is an exciting journey that revisits the past decade of Production Machine Learning from 2015 until now, and provides a pragmatic outlook on the next decade towards 2035. We’ll revisit some of the cornerstone Python projects that served as the foundation of the "messy innovation" boom (feature stores, orchestration, model serving, monitoring), as well as how the field transitioned into the LLMOps era, shifting the stack from training-centric to inference-centric. We will also provide a pragmatic set of predictions for the next decade of MLOps, including trends in ML monitoring, agentic systems and beyond - this will provide actionable guidance to all practitioners to ensure we stay ahead of the curve on the skills and domains required to thrive in the near future.
On Kubernetes, your Python app runs in a hostile environment, fighting for resources in a straitjacket, bombarded with signals, and being killed and ruthlessly dragged back to life time and again. This is in stark contrast to the wonderful weather of a Linux web server or the blissful utopia of localhost. If not hardened properly, your Python app will find the burden of being containerized too hard to bear. And the result? Zombies! Whether you are a Kubernetes expert, or you just deployed your first containerized Hello World, we will together explore how the Python Interpreter, the Linux Kernel and Kubernetes interact with each other. We will uncover why Python struggles as an init process, how Kubernetes CPU limits fight the Global Interpreter Lock (GIL) and why Python’s Garbage Collector cannot save you from sudden OOM kills. Most importantly, we will see how to identify, debug, and avoid containerized Python pitfalls. The goal of this talk is to help you stop treating your container like a server and learn to write Cloud-Native Python that knows exactly where it lives.
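To make the PID 1 problem concrete, here is a minimal sketch (function names are illustrative) of a container entrypoint that handles the SIGTERM Kubernetes sends before killing a pod, and reaps finished children so they don't linger as zombies:

```python
import os
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    # As PID 1, a Python process gets no default SIGTERM behaviour;
    # without an explicit handler it simply ignores the signal and
    # gets SIGKILLed when the grace period expires.
    global shutting_down
    shutting_down = True

def reap_children(signum, frame):
    # Collect exit statuses of orphaned children adopted by PID 1,
    # so they don't accumulate as zombies in the process table.
    while True:
        try:
            pid, _ = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break   # no children left to reap
        if pid == 0:
            break   # children exist but none have exited yet

signal.signal(signal.SIGTERM, handle_sigterm)
signal.signal(signal.SIGCHLD, reap_children)

# Simulate Kubernetes terminating the pod:
os.kill(os.getpid(), signal.SIGTERM)
print("graceful shutdown requested:", shutting_down)
```

In practice many teams delegate this to a tiny init like `tini`, but understanding why it is needed is half the battle.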
Python is especially powerful due to its deep metaprogramming capabilities. In this talk, I give an overview of one example: metaclasses. I show how you can use them to customize class creation, ensure data integrity, or define your own syntactic sugar for classes.
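As a small taste of the data-integrity use case, a metaclass can reject incomplete classes at definition time rather than at first use. This sketch is illustrative, not taken from the talk:

```python
# A metaclass runs when the *class* is created, so it can validate
# the class body before any instance exists.

class RequireFields(type):
    required = ("name", "price")

    def __new__(mcls, clsname, bases, namespace):
        cls = super().__new__(mcls, clsname, bases, namespace)
        if bases:  # skip the abstract base itself
            missing = [f for f in mcls.required if not hasattr(cls, f)]
            if missing:
                raise TypeError(f"{clsname} is missing fields: {missing}")
        return cls

class Product(metaclass=RequireFields):
    pass

class Book(Product):          # accepted: defines both required fields
    name = "Fluent Python"
    price = 50

try:
    class Mystery(Product):   # rejected at definition time, not at use
        pass
except TypeError as e:
    print(e)
```

The error fires at import time, which is exactly the point: the mistake cannot travel any further into the program.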
Simplicity scales better than complexity. In this talk we share what we learned from a year-long refactor of our Python-based infrastructure in which we dramatically improved developer velocity and overall developer happiness with two choices: moving everything into a monorepo and replacing our microservices architecture with a Django monolith. Instead of going deep on any single technology, we offer a holistic view of how these decisions enabled a multi-disciplinary team to move faster on a shared codebase. We'll introduce a blueprint for a uv-based Python monorepo, discuss why we chose "boring" tools over custom solutions, and share the metrics we used to measure success. The metrics dashboard will be open-sourced as part of this talk.
What if you could run real data/ML workflows right in the browser - sandboxed, with no installation and without sending your data anywhere? Such an approach would have tons of benefits: it is easy to distribute, safer by default, and can scale almost infinitely with virtually no infrastructure costs. This talk is a pragmatic overview of the current in-browser ML stack. We’ll cover what workflows are realistic today (from training of traditional ML models to on-device LLM inference), how packaging/loading works, and the constraints one should be aware of. By the end of the talk you will have a clear sense of when in-browser ML is a good fit, and when it isn’t.
You deploy an agent to automatically route incoming customer support tickets. At first, it is a clear win: response times improve, customers are happier, and support teams finally get some rest. Then time passes. Nothing crashes. Dashboards stay green. No alerts fire. Yet the agent’s decisions slowly degrade: first slightly, then inconsistently, until they are confidently wrong. This is data drift. LLM-based agents in production operate in constantly changing environments. Products launch, outages happen, terminology evolves, and priorities shift. Unlike traditional ML models, LLMs can produce plausible, well-phrased outputs even when they are incorrect, making these failures difficult to detect. In this talk, we focus on practical techniques for continuously evaluating and monitoring LLM-based agents after deployment. Using a support-ticket routing agent as an example, we examine drift signals such as increasing classification uncertainty, spikes in fallback categories, shifts in embedding distributions, and growing disagreement with historical or human decisions. The emphasis is not on training or prompt tuning, but on operating agents safely over time: detecting silent failures early and knowing when intervention, retraining, or retirement is required before users notice.
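One of the cheapest drift signals to operate is the share of tickets routed to a fallback category, tracked over a rolling window. A sustained rise is a red flag even when every individual answer still looks plausible. A minimal sketch (window size, threshold, and category names are illustrative):

```python
from collections import deque

class FallbackRateMonitor:
    """Track the fraction of recent decisions landing in 'other'."""

    def __init__(self, window=500, alert_threshold=0.15):
        self.decisions = deque(maxlen=window)  # rolling window of labels
        self.alert_threshold = alert_threshold

    def record(self, category):
        self.decisions.append(category)

    @property
    def fallback_rate(self):
        if not self.decisions:
            return 0.0
        return sum(c == "other" for c in self.decisions) / len(self.decisions)

    def drifting(self):
        return self.fallback_rate > self.alert_threshold

monitor = FallbackRateMonitor(window=100)
for _ in range(95):
    monitor.record("billing")
for _ in range(5):
    monitor.record("other")
print(round(monitor.fallback_rate, 2), monitor.drifting())
```

The other signals mentioned above (embedding-distribution shifts, disagreement with human decisions) follow the same pattern: a cheap statistic over a rolling window, compared against a baseline.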
Many investment strategies look convincing because they performed well in the past, but these results are often easy to misread and do not always say much about how the strategy would work in the future. In many cases, strong backtest results come not from real skill or insight, but from hidden rules, unclear data choices, or unrealistic assumptions. In this talk, I show how Tidy Finance principles help make these issues visible and easier to examine. Using clear examples from Tidy Finance with Python, I demonstrate that once assumptions are made explicit, many impressive results no longer hold up.
One of the hardest parts of applied NLP has always been breaking down complex business problems into machine learning components. It's so hard because it requires domain expertise and reasoning about the specific use case, and it's the one thing technology couldn't fix. But what if we could take some of the learnings from AI-powered coding assistants and apply them to solving real-world NLP problems? In this talk, I'll show how we've built powerful assistants and tools to help developers solve NLP tasks using open-source software, and create modular solutions that are small, fast and fully data-private.
Building machine learning models for audio deepfake detection seems straightforward until datasets span multiple languages, such as Hindi, Korean, Mandarin, and German. In practice, multilingual Automatic Speech Recognition (ASR) systems often fail in production because language-specific acoustic variations and assumptions about the processing pipeline break down at scale. This talk examines the engineering challenges of building a multilingual deepfake detection system using a Python-centric pipeline. It covers practical issues encountered during large-scale audio preprocessing, including memory-efficient data loading, resumable feature-extraction workflows, and validation strategies designed to prevent cross-lingual leakage. The session also shares lessons from deploying a multilingual ASR-based system, with a focus on pipeline structure, evaluation correctness, and operational robustness in real-world settings.
AI is fundamentally changing how quickly business and domain teams can create new logic, validations, and insights. In regulated environments, this new speed collides head-on with legacy systems, monolithic architectures and IT landscapes that were never designed for continuous AI-driven change. This talk presents an open, Python-based platform architecture that turns AI-driven pressure into an architectural advantage. Instead of embedding AI into existing monoliths, the platform introduces a central control layer that orchestrates independent, stateless apps—ranging from classical algorithms to AI agents—without binding them to specific infrastructure or legacy constraints. The control layer, implemented using Python and optionally Django, provides workflow orchestration, security, tenant management, and self-service registration of new components. This allows domain teams to deploy AI agents—such as anomaly detection for regulatory reporting—within days, while IT retains governance, auditability, and operational stability. The talk argues that AI will amplify architectural weaknesses—and shows why modular orchestration layers will become essential for AI-ready systems far beyond finance.
In RAG-based systems, the main challenge is often not tuning the LLM itself, but making documents available in a form that can be retrieved reliably. In enterprise settings, the dominant input format is still PDF, ranging from text-heavy reports to slide decks, scanned documents, and visually dense presentations. Traditional document processing pipelines rely on OCR and layout analysis to extract text, followed by chunking and embedding. While this works well for text-heavy documents, much of the original structure is often lost—especially for presentations, multi-column layouts, and visually driven content. Images, charts, and diagrams typically require separate processing, increasing pipeline complexity and fragility. Recent multi-modal embedding models enable a different approach: embedding entire PDF pages directly as images. This preserves layout, visual hierarchy, and embedded graphics in a single representation and significantly simplifies document ingestion. This talk compares classical OCR-based document processing pipelines with multi-modal page embeddings, drawing on benchmarks conducted on real-world enterprise documents across different models. It highlights where this approach performs well, where its limitations lie, and how to design practical, cost-aware retrieval systems in Python.
This talk provides a beginner-friendly overview of Python’s parallel programming ecosystem. You’ll discover the key libraries and techniques—JIT compilation, multithreading, multiprocessing, distributed computing, HPC/grid computing, and even a first look at quantum programming—to help you write faster, more efficient code, regardless of your hardware.
AI systems face six dimensions of scale: data, model, user, operational, infrastructure, and cost. While most talks focus on infrastructure and user scale, this session tackles the hardest dimension: Cost Scale, maintaining predictable and optimized compute costs as usage grows. Drawing from production experience, I'll demonstrate how I achieved 70% cost reduction through three architectural patterns: semantic caching that eliminates repeat LLM calls, model cascading using hybrid architecture (open-source SLMs + API-based LLMs), and optimized conversation state management. You'll see real production metrics, honest failure post-mortems, and the critical trade-offs between batch vs. real-time inference, monolithic vs. microservices, and open-source vs. API models. Learn how architectural choices directly impact unit economics, and why scaling AI is fundamentally different from scaling traditional software. Walk away with reusable Python patterns (FastAPI, Pydantic, Redis, DynamoDB) and decision frameworks for building economically sustainable agentic AI systems.
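To illustrate the first pattern, a semantic cache reuses a stored answer whenever a new query's embedding is close enough to a cached one. In the sketch below, `embed()` is a toy stand-in for a real embedding model and the in-memory list stands in for a Redis-backed store; the threshold is illustrative:

```python
import math

def embed(text):
    # Toy bag-of-letters "embedding", only to make the sketch runnable;
    # a real system would call an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.95):
        self.entries = []       # (embedding, cached answer)
        self.threshold = threshold

    def get(self, query):
        q = embed(query)
        for emb, answer in self.entries:
            if cosine(q, emb) >= self.threshold:
                return answer   # cache hit: the LLM call is skipped
        return None             # cache miss: caller invokes the LLM

    def put(self, query, answer):
        self.entries.append((embed(query), answer))

cache = SemanticCache()
cache.put("How do I reset my password?", "Use the 'Forgot password' link.")
print(cache.get("how do i reset my password"))   # near-duplicate: hit
print(cache.get("What is your refund policy?"))  # unrelated: miss
```

Every hit is an LLM call that is never billed, which is why semantic caching shows up first in the cost-reduction story.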
Synthetic data is often presented as an easy fix for missing or sensitive datasets, but in practice, it can silently introduce bias, leakage, and misleading evaluation results. This talk presents a practical, end-to-end pipeline for creating synthetic datasets that are reproducible, task-aligned, and bias-aware. We will walk through design decisions that matter: template-based generation vs. free-form generation, entity balancing, controlling distributional skew, filtering failure cases, and validating dataset quality before training any model. The session emphasizes what actually works in real pipelines, common failure modes that look fine at first glance, and concrete best practices for Python developers to apply when building synthetic datasets for machine learning, NLP, or evaluation.
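The template-based approach with entity balancing can be sketched in a few lines. Sampling entities independently would skew their distribution by chance; cycling through them guarantees balance while the templates still vary freely. Templates and entities below are illustrative:

```python
import itertools
import random
from collections import Counter

templates = [
    "I want to return my {product}.",
    "The {product} stopped working yesterday.",
    "Where can I buy a {product}?",
]
products = ["laptop", "headset", "monitor"]

random.seed(42)                            # reproducibility is part of the pipeline
balanced_products = itertools.cycle(products)

dataset = []
for _ in range(30):
    template = random.choice(templates)    # free variation in phrasing
    product = next(balanced_products)      # strict balance in entities
    dataset.append({"text": template.format(product=product),
                    "entity": product})

print(Counter(ex["entity"] for ex in dataset))  # each entity appears 10 times
```

The same separation, random where variety helps, controlled where balance matters, extends to labels, languages, and difficulty levels.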
After nearly 20 years in data science, from MLPs, SVMs, and random forests to deep learning, I’ve seen many “revolutions” come and go. The current tectonic shift around GenAI and LLMs feels different from previous hype cycles. Even with some understanding of how these things work, I am still blown away by the stream of stunning new capabilities. But they also introduce new kinds of risks that go far beyond technical performance. This talk offers a pragmatic, experience-driven perspective on GenAI in industrial settings, including supply chains and the emerging wave of AI agents. We’ll disentangle real opportunities from snake oil, especially where hype-driven promises meet senior management expectations. An anti-bullshit take on the possibilities ahead, with honesty, anecdotes, and (for those who know me, of course) a bit of humor.
Most systems are built under constraints: legacy code, regulation, organizational boundaries, and long-term accountability. This talk explores how Staff+ engineers and tech leads can make sound architectural decisions when “perfect” isn’t an option. Focusing on platforms and tooling, it presents practical ways to identify real constraints, preserve flexibility, avoid over-engineering, and communicate trade-offs that hold up over time - technically and organizationally.
AsyncIO vs threads isn't about "which is faster" - it's about scheduling, memory, and the kind of load you run. We'll unpack what threads and asyncio do under the hood (OS scheduler vs event loop + epoll), run practical benchmarks, and show why many "async" libraries still rely on thread pools (aiofiles, Motor, Django bridges). Then we'll repeat the same tests on Python 3.14's free-threaded (no-GIL) build and discuss when an interpreter upgrade can beat an async rewrite.
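A toy version of the kind of benchmark we will run: the same I/O-bound workload (50 concurrent "requests" of 50 ms each) on a thread pool and on the asyncio event loop. Both finish in roughly one request's latency; on a toy like this the interesting differences are in scheduling and memory, not raw speed.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

N, DELAY = 50, 0.05

def blocking_request():
    time.sleep(DELAY)            # stands in for a blocking socket read

async def async_request():
    await asyncio.sleep(DELAY)   # yields to the event loop instead

# Threads: the OS scheduler interleaves 50 blocked threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=N) as pool:
    list(pool.map(lambda _: blocking_request(), range(N)))
threads_elapsed = time.perf_counter() - start

# AsyncIO: one thread, the event loop multiplexes 50 pending timers.
async def main():
    await asyncio.gather(*(async_request() for _ in range(N)))

start = time.perf_counter()
asyncio.run(main())
asyncio_elapsed = time.perf_counter() - start

print(f"threads: {threads_elapsed:.2f}s  asyncio: {asyncio_elapsed:.2f}s")
```

A sequential run would need about 2.5 s; both concurrent versions come in far under that, which is why "which is faster" is the wrong first question.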
AI agents are increasingly deployed with autonomy: calling tools, accessing data, modifying systems, and making decisions without human supervision. While prompts and guardrails are often presented as safety solutions, they break down quickly in real-world agentic systems. In this talk, we explore how to enforce safety constraints in AI agents beyond prompting, using engineering techniques familiar to Python developers and data engineers. We will examine common failure modes in agentic systems such as tool misuse, goal drift, and over-permissioning, and show how to mitigate them using policy layers, capability boundaries, and execution-time validation.
AI code agents like Claude Code are powerful but require careful isolation. Learn how to run them in secure containers with persistent credentials, API logging, and complete filesystem isolation—protecting your host system while maintaining full functionality.
Developers don’t need to become business analysts, but they do need business skills. This talk shows how learning to communicate with stakeholders, uncover real business needs, and bridge gaps between tech and business can dramatically increase your impact. Learn practical techniques to become a trusted technical partner and deliver solutions that truly matter.
The AI world is buzzing with claims about “agentic intelligence” and autonomous reasoning. Behind the hype, however, a quieter shift is taking place: Small Language Models (SLMs) are proving capable of many reasoning tasks once assumed to require massive LLMs. When paired with fresh business data from modern lakehouses and accessed through tool calling, these models can power surprisingly capable agents. In this talk, we cut through the noise around “agents” and examine what actually works today. You’ll see how compact models such as Phi-2 or xLAM-2 can reason and invoke tools effectively, and how to run them on development laptops or modest clusters for fast iteration. By grounding agents in business facts stored in Iceberg tables, hallucinations are reduced, while Iceberg’s read scalability enables thousands of agents to operate in parallel on a shared source of truth. Attendees will leave with a practical understanding of data agent architectures, SLM capabilities, Iceberg integration, and a realistic path to deploying useful data agents - without a GPU farm.
Python 3.13's free-threaded mode opens new territory for Python concurrency. We embarked on an experiment: could a trading algorithm benefit from true parallelism, and what would it take to get there? This talk documents our research journey from async/await to free threading—the hypotheses we tested, the benchmarks we designed, the unexpected behaviours we discovered, and the systematic approach we took to validating whether GIL-free Python could handle real-time market data. You'll see our experimental methodology, the data we collected, surprising findings about thread scheduling and memory patterns, and what our results suggest about Python's concurrent future.
Python’s scientific stack (NumPy/SciPy) is often confined to single-node execution. When datasets exceed local memory, researchers face a steep learning curve, typically choosing between complex manual distribution or the overhead of task-parallel frameworks. In this talk, we introduce [Heat](https://github.com/helmholtz-analytics/heat), an open-source distributed tensor framework designed to bring high-performance computing (HPC) capabilities to the scientific Python ecosystem. Built on PyTorch and mpi4py, Heat implements a data-parallel model that allows users to process massive datasets across multi-node, multi-GPU clusters (including AMD GPUs) with minimal code changes. We will discuss the design and architecture enabling "transparent distribution":

- Heat’s distributed n-dimensional array for data partitioning and communication under the hood;
- The synergy of PyTorch as a high-performance compute engine and MPI for efficient, low-latency communication;
- Scaling efficiency, encompassing both strong and weak scaling for memory-intensive operations;
- Fundamental building blocks—from linear algebra to machine learning—re-implemented for distributed memory space.

Attendees will learn how to leverage the cumulative RAM of supercomputers without leaving the familiar NumPy-like interface, effectively removing the "memory wall" for large-scale scientific analytics.
RAG-based AI agents fail in production because retrieval without memory is like a conversation with someone who forgets everything you've said. This talk introduces a memory architecture that transforms how you build AI applications. Using an open-source Python SDK, I'll demonstrate how to replace fragile RAG pipelines with a unified memory layer combining knowledge graphs and vector search. You'll see live code showing how 6 lines of Python can give your agents persistent, queryable memory that survives restarts and improves with every interaction. We'll build a working agent memory system using cognee, Kuzu, LanceDB, and your choice of LLM, all optionally running locally. No cloud dependencies required. By the end, you'll understand why the future of AI agents isn't better RAG but better memory.
"Come for the language, stay for the community." If you've been around Python long enough, you've heard this before. I don't know when I first heard it, but I know exactly when I understood it. This talk is a personal reflection on thirteen years within the Python community—from my first tentative steps as a volunteer to organising conferences myself. It's a story about discovering that Python was always about more than code. It's about the people, the values, and the unexpected ways a community can shape a career and a life. This isn't just my story. It's a story I've seen repeated in countless faces at registration desks, in hallway conversations, in first-time speakers finding their voice. I want to talk about what I've learned about kindness, mentorship, and the quiet power of feeling like you belong somewhere. I'll end with an open question: as the ways we connect continue to evolve, how do we preserve what matters while welcoming a new generation? If you're new to this community and wondering what all the fuss is about, this talk is especially for you.
Designing a Python library that scales over time requires more than clean code. In this talk, we present ScanAPI, an open-source Python library for automated API integration testing and live documentation, as a case study in sustainable library design. We explore how architectural decisions, Python features, and automation pipelines help reduce maintenance costs while improving developer experience. We also share how open collaboration and community practices turn a Python library into a long-term, scalable project. Attendees will leave with practical patterns to apply when building or evolving Python libraries in the open.
Ever mixed conda and pip and ended up with a broken conda environment, yet you'd swear it worked before? This talk explains why! Learn the difference between pip and conda, what happens when you mix them, and how to combine them safely using the latest community-developed tools and updates in conda.
CPU–GPU synchronizations are a subtle performance killer in PyTorch: they block the host, prevent the CPU from running ahead, and create GPU idle gaps. This talk explains what host-device synchronization is, how it’s triggered by subtle code patterns (such as dynamic shapes), and how to diagnose it with NVIDIA Nsight Systems by correlating utilization gaps with long CUDA API calls. We’ll end with practical mitigation patterns, including unit testing for syncs via `torch.cuda.set_sync_debug_mode()` and when a small Triton kernel can help avoid syncs and fuse ops.
Contributing to open source can feel intimidating, even for experienced Python developers. In this hands-on tutorial, participants will make their first real open source contribution to a Python project, learning the complete workflow from fork to pull request. Using a real-world Python library, attendees will practice reading an unfamiliar codebase, making a small but meaningful change, running tests, and opening a pull request following community standards. The focus is on practical skills, tooling, and confidence — not theory. By the end of the session, participants will understand how to start contributing to Python open source projects and feel prepared to continue contributing beyond the workshop.
### Agent-Based Hyperparameter Optimization for Gradient Boosted Trees

Hyperparameter optimization for gradient boosted tree models is a repetitive yet cognitively demanding task. Practitioners must combine statistical intuition with detailed, library-specific knowledge—often buried across hundreds of pages of documentation for tools such as XGBoost, LightGBM, or CatBoost. As models and configurations grow in complexity, traditional approaches like grid search, random search, or even Bayesian optimization struggle to incorporate semantic understanding of model behavior. Using XGBoost as a concrete case study, I demonstrate how RAG-powered agents, orchestrated in a structured workflow, can analyze model behavior via SHAP values, diagnose failure modes (e.g. overfitting, feature dominance, interaction leakage), and propose targeted hyperparameter adjustments grounded in both theory and library-specific constraints. The system combines open-source tools including XGBoost, SHAP, Optuna, and LangGraph/LangChain, where agents specialize in tasks such as model diagnostics, documentation-aware parameter reasoning, and experiment orchestration. Rather than replacing existing optimization frameworks, agents operate on top of them, injecting domain knowledge and interpretability signals into the optimization loop.
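The "diagnose, then propose" loop can be sketched in a few lines. The rule below is a toy illustration of my own (not the abstract's actual agent logic): a large train/validation gap is read as overfitting, and the real XGBoost parameters `max_depth` and `reg_lambda` are adjusted accordingly.

```python
def diagnose(train_auc, valid_auc, params):
    """Toy version of an agent's diagnostic rule (illustrative only):
    a large train/validation gap suggests overfitting, so propose
    stronger regularisation for the next trial."""
    proposal = dict(params)
    if train_auc - valid_auc > 0.05:        # overfitting heuristic
        proposal["max_depth"] = max(2, params["max_depth"] - 1)
        proposal["reg_lambda"] = params["reg_lambda"] * 2.0
    return proposal

params = {"max_depth": 8, "reg_lambda": 1.0}
print(diagnose(0.99, 0.85, params))  # {'max_depth': 7, 'reg_lambda': 2.0}
```

In the system described above, such rules would be grounded in SHAP diagnostics and documentation retrieval rather than hard-coded thresholds.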
Python has become the dominant language for scientific computing and data science, largely due to powerful array libraries that enable high-performance numerical computation. This tutorial introduces array-oriented programming as a paradigm and surveys the modern Python array ecosystem. We'll explore when and how to use different array libraries: NumPy for general-purpose array operations; JAX for automatic differentiation, just-in-time compilation of array-oriented code, and GPU acceleration; Numba for just-in-time compilation of imperative code; and Awkward Array for nested and irregular data structures. Through live demos, we'll show how to think in arrays, discuss the limitations of array-oriented programming, and demonstrate how JIT compilation addresses these challenges. Whether you're analyzing data, building machine learning models, or doing scientific simulations, understanding the strengths and trade-offs of each library will help you choose the right tool for your problem.
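As a taste of the paradigm shift, here is a minimal, hypothetical NumPy example (not taken from the tutorial materials) contrasting an imperative loop with the equivalent array-oriented expression:

```python
import numpy as np

def normalize_loop(values):
    # Imperative style: explicit iteration over elements
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def normalize_array(values):
    # Array-oriented style: the whole computation is one expression
    # over the array, which NumPy evaluates in fast compiled code
    arr = np.asarray(values, dtype=float)
    return arr - arr.mean()

data = [1.0, 2.0, 3.0, 4.0]
print(normalize_loop(data))   # [-1.5, -0.5, 0.5, 1.5]
print(normalize_array(data))  # [-1.5 -0.5  0.5  1.5]
```

Both produce the same result, but only the array form scales to millions of elements without a Python-level loop.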
Moving real-time bytestreams between systems in different organizations or secured environments, whether for batch dataset delivery or continuous streaming, is surprisingly hard. Traditional solutions fall short: message brokers like Kafka use discrete messages, file storage like S3 works for batch exchange but lacks streaming and coordination, while HTTP client-server approaches require one side to host and expose server endpoints, introducing security and operational overhead. This talk introduces the ZebraStream Protocol: an open, HTTP-based bytestream protocol with coordination mechanisms that let you stream data—Parquet files, compressed archives, encrypted content—directly between decoupled systems using Python's file-like interface. No message framing, no server hosting, no exposed endpoints. We'll explore the design of a bytestream protocol for data sharing and integration that crosses the file-stream boundary, enabling seamless integration with pandas, DuckDB, and any Python library expecting file-like objects, supporting use cases from ETL pipelines to IoT data delivery, cross-org collaboration to home network automation.
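The file-like-interface idea can be sketched with the standard library alone. The snippet below is a generic illustration of producing and consuming a compressed bytestream through file-like objects, not ZebraStream's actual API; an in-memory buffer stands in for the network stream:

```python
import gzip
import io

# Any consumer that accepts a file-like object can read from a
# stream without knowing where the bytes come from. Here an
# in-memory buffer stands in for a network bytestream.
buffer = io.BytesIO()

# Producer side: write compressed records into the stream
with gzip.GzipFile(fileobj=buffer, mode="wb") as stream:
    for record in [b"sensor=1,temp=20.5\n", b"sensor=2,temp=21.0\n"]:
        stream.write(record)

# Consumer side: rewind and read through the same file-like API
buffer.seek(0)
with gzip.GzipFile(fileobj=buffer, mode="rb") as stream:
    lines = stream.read().splitlines()

print(lines)  # [b'sensor=1,temp=20.5', b'sensor=2,temp=21.0']
```

Libraries like pandas or DuckDB accept exactly this kind of file-like object, which is what makes the pattern compose so well.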
Feeling unsure about the next step in your career? The data & AI field is evolving faster than ever. New tools, new roles, and constant “next big things” can make even experienced professionals feel unsure about where they are heading, and how to make intentional career decisions in the middle of all this change. You might be doing well, feeling comfortable. Interesting work, steady progress, recognition. And still, there’s that question in the background: Where is this actually going? This interactive workshop helps you explore different future paths, understand trade-offs, and gain clarity about what kind of work and influence you want next.
When I joined my current company as a software engineer, I encountered a blank slate: no CI/CD pipelines, no deployment infrastructure, barely any monitoring—in short, no software infrastructure at all. This talk shares the key learnings from building a DevOps environment from the ground up. I’ll walk through the essentials: which foundations were laid first, what tools and practices made the difference, and how automation became a daily habit. Through real-world examples, I will demonstrate how pragmatic and incremental steps can jump-start productivity, reduce manual toil, and help teams avoid common pitfalls.
Large Language Models are rapidly changing how we think about recommendation systems. Traditional pipelines based on collaborative filtering or matrix factorization are being complemented and sometimes replaced by embedding-based and LLM-driven approaches. In this talk, we explore how modern recommendation systems can be built using LLM embeddings, vector databases, and hybrid architectures that combine classical ML with generative models. We will discuss practical design patterns for personalization, retrieval, ranking, and user modeling, focusing on real-world constraints such as latency, cost, and evaluation. The session emphasizes hands-on insights from production systems and highlights where LLMs add real value and where they don’t. Attendees will leave with a clear mental model for designing scalable, LLM-powered recommendation systems beyond toy examples.
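As a minimal illustration of embedding-based retrieval (a toy sketch with made-up three-dimensional vectors, not a production pattern from the talk), cosine similarity over item embeddings is already enough to rank candidates:

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy item "embeddings"; in practice these come from an LLM encoder
items = {
    "sci-fi movie": [0.9, 0.1, 0.0],
    "cooking show": [0.0, 0.2, 0.9],
    "space documentary": [0.8, 0.3, 0.1],
}

def recommend(user_vector, k=2):
    # Rank all items by similarity to the user's preference vector
    ranked = sorted(items, key=lambda name: cosine(user_vector, items[name]),
                    reverse=True)
    return ranked[:k]

print(recommend([1.0, 0.0, 0.1]))  # ['sci-fi movie', 'space documentary']
```

Real systems replace the dictionary with a vector database and add ranking, filtering, and business constraints on top, but the retrieval core is this simple.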
Command Line Interfaces (CLIs) offer an efficient and powerful way to interact with software, but poorly designed interfaces can be incredibly frustrating. Complicated parameter names and unconventional formats can turn using a great tool into a burdensome experience. Large Language Models (LLMs) seem like a great solution to this problem as they can easily add a natural-language interface to any CLI. However, LLMs can introduce their own challenges, such as requiring API keys or high-performance GPUs. In this talk, I'll demonstrate a method for creating natural-language interfaces for any CLI using fine-tuned Small Language Models. These models are lightweight enough to be run directly on laptops or even smartphones. We'll explore the process of generating synthetic data, fine-tuning models, and evaluating their performance using both an in-house CLI and a well-known open-source package as examples.
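The synthetic-data step can be sketched with templates. The `convert` CLI below is entirely hypothetical (invented for this illustration, not one of the talk's examples): each generated pair maps a natural-language request to the exact command the fine-tuned model should emit.

```python
import random

# Hypothetical CLI for illustration only: `convert --input FILE --format FMT`
TEMPLATES = [
    "please turn {file} into {fmt}",
    "convert {file} to {fmt} format",
    "i need {file} as a {fmt}",
]
FILES = ["report.docx", "slides.pptx", "notes.md"]
FORMATS = ["pdf", "html"]

def synthesize(n, seed=0):
    """Generate (natural-language request, target command) fine-tuning pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        template = rng.choice(TEMPLATES)
        file, fmt = rng.choice(FILES), rng.choice(FORMATS)
        prompt = template.format(file=file, fmt=fmt)
        target = f"convert --input {file} --format {fmt}"
        pairs.append((prompt, target))
    return pairs

for prompt, target in synthesize(2):
    print(f"{prompt!r} -> {target!r}")
```

Because the target commands are generated from the same slots as the prompts, every training pair is correct by construction, which is the key property of this kind of synthetic dataset.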
First steps in programming can be notoriously difficult, and famously require students to learn much more than their first programming language - they need to adapt to an entirely different approach to problem-solving. To some students this comes naturally, while others struggle to develop a "programmatic mindset", sometimes to the point of giving up. Why does this happen, and what can we do about it? In this talk I'll give my answer to these questions, centered around the idea of a **mental model of programming** - the ability to simulate, in your head, how code is executed. I'll explain why this is a critical component of learning to program, and offer pedagogical methods that nurture the creation of a mental model.
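A typical exercise for probing a mental model (my own illustrative example, not necessarily one from the talk): predict what this snippet prints before running it. Students without an execution model often guess `[1, 2, 3]`, because they have no mental picture of two names pointing at one list.

```python
a = [1, 2, 3]
b = a           # b is a second name for the SAME list, not a copy
b.append(4)
print(a)        # [1, 2, 3, 4] - the change is visible through both names
```

Simulating this in your head, step by step, is exactly the skill the talk argues we must deliberately teach.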
Fairness is fundamentally not amenable to classic optimisation techniques. It's not a state of the world; it's an experience of it. No technology is fair in a vacuum - fairness can only be understood when a technical system collides with humans. We're seeing a wave of off-the-shelf libraries measuring bad behaviours in LLM outputs, often simplifications of older fairness metrics. They can catch obvious failure modes like slurs. But this is one failure mode among many. Installing a library and calling the job done is fairness washing. The harder, more fruitful approach is to explore the space of failure modes, consider what an ideal world would look like, and design measures, mitigations, and feedback loops accordingly. This is a talk for people who suspect we can't optimise our way to human dignity.
If I do X instead of Y, will I get the outcome I want? What about in a new, unseen situation? Making predictions alone is pointless; one wants to act in the world. Furthermore, one must act in situations that are similar to, but different from, all past experience. The underlying goal of all decision-making is interventional generalisation: the ability to evaluate hypothetical choices in new, unseen situations. Unfortunately, data science and statistics have an inordinate focus on observation and statistical significance instead of intervention, counterfactuals, and generalisation. Improve your modelling both practically and conceptually with the mental tools presented in this talk.
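The observation-versus-intervention gap can be demonstrated in a few lines. The simulation below is a hypothetical illustration (not from the talk): a hidden confounder creates a strong observed association between x and y even though intervening on x has no effect at all.

```python
import random

random.seed(0)

def sample(intervene=None):
    z = random.gauss(0, 1)                  # hidden confounder
    if intervene is None:
        x = z + random.gauss(0, 0.1)        # observed x tracks the confounder
    else:
        x = intervene                       # do(x): we set x ourselves
    y = z + random.gauss(0, 0.1)            # y depends only on z, never on x
    return x, y

def effect_estimate(data):
    """Difference in mean y between the high-x and low-x groups."""
    highs = [y for x, y in data if x > 0]
    lows = [y for x, y in data if x <= 0]
    return sum(highs) / len(highs) - sum(lows) / len(lows)

observed = [sample() for _ in range(10_000)]
intervened = [sample(intervene=random.choice([-1.0, 1.0])) for _ in range(10_000)]

# Observation suggests a strong effect of x on y; intervention reveals none.
print(effect_estimate(observed))    # substantially positive
print(effect_estimate(intervened))  # close to zero
```

Randomising x severs its dependence on the confounder, which is exactly why randomised experiments answer "what if I do X?" while passive observation often cannot.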
This talk will detail how we used Rust to solve a number of resource utilization inefficiencies while scaling data pre-processing to a petabyte scale and enabling next-generation model training at DeepL. Among other things, this was done by developing an internal library for interacting with Parquet files in a memory-efficient manner. Topics include:

- Convincing you to love Rust for its memory safety
- Comparing the C++ and Rust ecosystems for Python library development
- Diving into Python-Rust interoperability
- Convincing you to love Rust for its user-friendly (yes, actually!) language features
- Providing a high-level overview of the continuously growing impact that Rust is having on the Arrow and data engineering ecosystem
Is it still worth learning SQL in 2026, or can we just "chat" with our data? This hands-on tutorial explores that exact question by pushing Text-to-SQL to its absolute limits. This won't be just happy paths; we will deliberately expose where LLMs fail: ambiguity, hallucinations, and "dirty" data, and then build the engineering stack required to fix them! You will build a local data agent from scratch using DuckDB, MCP, and a minimalist semantic layer. By the end, you will understand the hard boundaries of AI reasoning, how a semantic layer acts as a safety net, and why knowing SQL is still (since 1974) the most critical skill for building reliable analytics agents.
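To illustrate the semantic-layer idea, here is a hypothetical sketch using sqlite3 as a stand-in for DuckDB (which the tutorial actually uses): the layer whitelists the tables and columns an agent may reference, so hallucinated identifiers are rejected before any SQL runs.

```python
import sqlite3

# A minimal "semantic layer": the only tables/columns the agent may touch,
# with human-friendly descriptions the LLM can be prompted with.
SEMANTIC_LAYER = {
    "orders": {"id": "order id", "amount": "order total in EUR",
               "country": "ISO country code"},
}

def validate(table, columns):
    """Reject hallucinated tables/columns before any SQL executes."""
    if table not in SEMANTIC_LAYER:
        raise ValueError(f"unknown table: {table}")
    unknown = set(columns) - set(SEMANTIC_LAYER[table])
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, country TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 10.0, "DE"), (2, 25.0, "FR"), (3, 5.0, "DE")])

# A query the LLM might generate, checked against the layer first
validate("orders", ["amount", "country"])
total_de = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE country = ?", ("DE",)
).fetchone()[0]
print(total_de)  # 15.0
```

A real semantic layer also carries metrics, joins, and business definitions, but even this tiny whitelist already catches a whole class of LLM hallucinations.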
Do you find yourself weighing up the pros and cons of using nested types in the Polars library - pondering whether you should encode your variables in structures using lists or arrays, or opt for a flat format without a complex hierarchy? This talk focuses on the crucial design choices available, the performance implications, and how this impacts the logic of your queries, as well as code readability, when deciding how to implement your big data pipeline in Polars. The methods available for nested types in Polars have seen significant additions over the last year, with powerful functionality, such as filtering and aggregation, released in the latest versions of the library. These provide much-needed shortcuts for queries interrogating complex nested structures that previously required sophisticated user-defined functions. This makes nested types much easier and more intuitive to use, but does this mean you should nest your data? Through practical examples, you’ll learn guidelines to help you decide.
Python’s static typing ecosystem has long been shaped by mypy, but a new contender has entered the space: ty, a high-performance type checker from Astral that has recently exited alpha. With a focus on speed, modern ergonomics, and tight tooling integration, ty represents a new direction for Python type checking. In this talk, we’ll explore what ty looks like in practice. We’ll cover its core features, how it behaves on real-world codebases, and what changes when type checking becomes fast enough to run constantly. We’ll also compare ty directly with mypy, highlighting strengths, limitations, and trade-offs teams should understand before adopting it. This session will help Python developers evaluate whether ty is ready for production use today—and what it suggests about the future of Python typing tools.
Data analysis and machine learning often involve sensitive information. But how can we ensure that our analyses and releases do not inadvertently reveal information about the individuals in our data? Traditional approaches such as anonymization or releasing only aggregate statistics have repeatedly proven insufficient. Differential privacy is a mathematical framework that offers provable privacy guarantees while still enabling useful data analysis. In this tutorial, we provide a hands-on introduction to differential privacy, covering key concepts relevant to understanding and applying it in practice. The focus will be on practical implementation rather than underlying theory. Using interactive examples in Python, we will explore the core ideas of differential privacy, highlight its attractive properties and limitations, and demonstrate how to build privacy-preserving analyses using OpenDP, an open-source Python library for differential privacy. Participants will leave equipped to continue exploring differential privacy on their own. Familiarity with the basics of Python programming is helpful, but no prior knowledge of differential privacy is required.
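The core primitive behind many differentially private releases is the Laplace mechanism. The sketch below is a plain-Python illustration of the idea only; in practice you should use a vetted library such as OpenDP, since naive floating-point implementations have known weaknesses.

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling from the Laplace distribution
    u = rng.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def dp_count(values, predicate, epsilon, rng):
    """Differentially private count via the Laplace mechanism.

    A counting query has sensitivity 1 (one individual changes the
    count by at most 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)
ages = [23, 35, 41, 29, 52, 61, 38]
released = dp_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(released)  # true count is 3; the released value is 3 plus Laplace noise
```

Smaller epsilon means more noise and stronger privacy; the tutorial explores exactly this trade-off with OpenDP's carefully engineered versions of such mechanisms.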
What only a few years ago started out as smart tab completion has turned into a way of working in which a growing number of programmers don't even bother to open an IDE anymore. Let's take a moment to contemplate the changing nature of software engineering as a profession, and to explore how we might avert looming disaster.
Multimodal learning - systems that combine vision, language, audio, and other sensory inputs - has moved from a niche research topic to a central paradigm in modern machine learning. Today’s most influential models no longer operate on a single modality but instead learn rich representations by combining language with images, video, and sound. This shift has fundamentally changed how we build, train, and evaluate machine learning systems. Python has played a decisive role in this transformation. Acting as a unifying layer across modalities, Python enabled researchers and practitioners to seamlessly combine computer vision, natural language processing, and speech within a single ecosystem. Python-based frameworks lowered the barriers between research communities and accelerated the rise of large-scale, weakly supervised, and foundation models. However, this success has also introduced new challenges. The ease of experimentation masks growing issues around scalability, reproducibility, and evaluation. Multimodal systems increasingly depend on complex Python-based stacks whose abstractions can obscure underlying assumptions and costs. ...
While The Cloud is just someone else's computer, those computers come together from many places and many, many someone elses. The constituent parts that connect, power, house, and ultimately operate those computers come from many more places and someones still! We explore explicitly what these infrastructure pieces of The Cloud are, and how the many definitions of digital sovereignty can be viewed from a viewpoint high up in The Cloud.
Python has been at the center of my work in machine learning and AI for more than a decade. It is where I start from scratch, experiment with ideas, and build systems that help me understand how large language models really work. In this keynote, we will explore how Python enables this entire journey, from defining model architectures and training loops to scaling data and computation across devices. I will also reflect on how Python continues to support both the large models of today and the evolving systems of tomorrow, even as new backends take over the heavy lifting.
**Code-generating LLMs have matured** to the point where they can reliably scaffold **data pipelines and data agents**, when used in a **supervised, engineering-first workflow**. This tutorial demonstrates how to combine modern **AI coding assistants** with a **production-ready Python deployment platform (Tower.dev)** to build and operate **real data systems**. Participants will learn how to structure **collaborative Human/AI Assistant development loops**, where engineers provide **architecture, domain knowledge, and review**, while AI accelerates implementation. We will build a **data pipeline** and a **lightweight data agent**, iterating with an AI assistant to **generate, test, and improve code**. The session also covers critical **operational concerns** such as:

- **Security**
- **Scaling**
- **Observability**
- **Debugging**

You will also see how **production feedback can be looped back into the assistant** to continuously improve generated code. This is **not about “vibe coding”** a website. It is about **disciplined, review-driven AI collaboration** that meaningfully improves productivity for **data practitioners at all levels**.