The Cost Scale Problem

AI systems must scale across six dimensions: data, model, user, operational, infrastructure, and cost. While traditional software scales by adding servers, AI systems have non-linear cost curves that can destroy unit economics overnight. This talk focuses on Cost Scale, maintaining predictable compute costs as usage grows.

What You'll Learn (40 min + Q&A)

Why AI Scaling Is Different (5 min)

Traditional vs. AI systems: learning, degradation, token-based pricing
The six dimensions of the scale framework
Real economics: €50/month → €15,000/month → €4,500/month optimized

Architecture Foundation (4 min)

Monolithic vs. microservices for AI
Event-driven architecture choice
Batch vs. real-time inference trade-offs

Pattern 1: Semantic Caching (12 min)

Vector-based query matching with Redis
Architecture walkthrough and code examples
Production metrics: 72% hit rate, $0.001 vs. $0.015 per query
Failures: threshold tuning (0.95 → 0.85), semantic false positives
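
To make the semantic caching idea concrete, here is a minimal sketch, not the speaker's implementation. It assumes the threshold in the outline is a cosine-similarity cutoff, replaces Redis with an in-memory list, and replaces a sentence-transformers model with a toy letter-count embedding (`toy_embed`). The names `SemanticCache`, `toy_embed`, and `cosine_similarity` are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Optional

Vector = list[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (assumption): counts of the letters a-z.
    A real system would call an embedding model here."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


@dataclass
class SemanticCache:
    embed: Callable[[str], Vector]
    threshold: float = 0.85  # assumed to be a cosine-similarity cutoff
    entries: list[tuple[Vector, str]] = field(default_factory=list)

    def get(self, query: str) -> Optional[str]:
        """Return the cached answer of the most similar stored query,
        or None when nothing reaches the threshold (a cache miss)."""
        q = self.embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine_similarity(q, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, query: str, answer: str) -> None:
        """Store the answer under the query's embedding."""
        self.entries.append((self.embed(query), answer))


cache = SemanticCache(embed=toy_embed)
cache.put("how do I reset my password", "Use the reset link.")
hit = cache.get("How do I reset my password?")   # same letters, so similarity is ~1.0
miss = cache.get("zzz qqq xxx")                  # no shared letters, so similarity is 0.0
```

Lowering the threshold raises the hit rate but also the chance of returning an answer for a query that only looks similar, which is the false-positive risk the outline mentions.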

Pattern 2: Model Cascading - Hybrid Architecture (12 min)

Open-source (Llama 3.2) + API (Claude/GPT-4) strategy
Semantic router implementation with Pydantic
Code examples: routing logic, confidence scoring, fallbacks
Production metrics: 40% cost reduction through intelligent routing
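
The routing, confidence scoring, and fallback steps listed above can be sketched as a two-stage cascade. This is an illustrative sketch under stated assumptions: the models are plain callables rather than real Llama or API clients, the confidence score is supplied by the model stub, plain dataclasses are used instead of Pydantic to keep the example dependency-free, and the `0.7` cutoff is arbitrary. All names (`ModelReply`, `cascade`, `cheap_model`, `strong_model`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelReply:
    text: str
    confidence: float  # 0.0 to 1.0; how it is computed is left open (assumption)


@dataclass
class CascadeResult:
    text: str
    served_by: str  # "cheap" or "strong"


ModelFn = Callable[[str], ModelReply]


def cascade(prompt: str, cheap: ModelFn, strong: ModelFn,
            min_confidence: float = 0.7) -> CascadeResult:
    """Try the cheap model first; escalate to the strong model when the
    cheap reply is below min_confidence or the cheap call raises."""
    try:
        reply = cheap(prompt)
        if reply.confidence >= min_confidence:
            return CascadeResult(reply.text, "cheap")
    except Exception:
        pass  # fallback: treat a failed cheap call like a low-confidence reply
    return CascadeResult(strong(prompt).text, "strong")


# Hypothetical stand-ins for a self-hosted model and a paid API.
def cheap_model(prompt: str) -> ModelReply:
    if "crash" in prompt:
        raise RuntimeError("cheap model unavailable")
    confident = len(prompt.split()) <= 5  # pretend short prompts are easy
    return ModelReply(f"cheap: {prompt}", 0.9 if confident else 0.3)


def strong_model(prompt: str) -> ModelReply:
    return ModelReply(f"strong: {prompt}", 0.95)


easy = cascade("What is 2 + 2?", cheap_model, strong_model)
hard = cascade("Compare these two contracts clause by clause please", cheap_model, strong_model)
failed = cascade("crash", cheap_model, strong_model)
```

The saving comes from the share of traffic that stops at the first stage, so the cutoff is the main tuning knob: too low and quality drops, too high and most requests pay for both models.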

Pattern 3: Conversation State (8 min)

Multi-tier storage: Redis (hot) + DynamoDB (warm/cold)
Context window management and summarisation
Production metrics: storage vs. token cost trade-offs
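
As one possible reading of "context window management and summarisation", the sketch below keeps the newest turns that fit a token budget and collapses everything older into a single summary line. It is an assumption-laden illustration: tokens are approximated by whitespace-separated words, the `summarise` callable stands in for an LLM call, and the summary itself is not counted against the budget. `count_tokens` and `build_context` are hypothetical names.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    """Rough token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def build_context(turns: list[str], budget: int,
                  summarise: Callable[[list[str]], str]) -> list[str]:
    """Keep the most recent turns whose combined size fits the budget and
    replace all older turns with one summary produced by `summarise`."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):  # walk from newest to oldest
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    older = turns[: len(turns) - len(kept)]
    if older:
        return [summarise(older)] + kept
    return kept


def fake_summarise(old_turns: list[str]) -> str:
    """Placeholder for a model-generated summary."""
    return f"[summary of {len(old_turns)} earlier turns]"


history = ["a b c", "d e", "f g h i", "j"]
trimmed = build_context(history, budget=5, summarise=fake_summarise)
untouched = build_context(history, budget=100, summarise=fake_summarise)
```

In a tiered setup like the one outlined above, the kept turns would plausibly live in the hot store and the older turns or their summary in the warm or cold store, though the abstract does not specify that split.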

Integration & Monitoring (3 min)

Full system architecture
FinOps dashboards and cost alerting
Final results: 70% reduction, sub-2s latency, 0.3% error rate
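
Cost alerting can be as simple as accumulating per-request token spend and comparing it to a budget. The sketch below is a hypothetical illustration, not the dashboard from the talk: `CostTracker`, the 80% alert fraction, and the per-1,000-token prices in the usage lines are made-up values chosen for easy arithmetic.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    monthly_budget: float
    alert_fraction: float = 0.8  # alert at 80% of budget (arbitrary choice)
    spent: float = 0.0

    def record(self, input_tokens: int, output_tokens: int,
               price_in_per_1k: float, price_out_per_1k: float) -> float:
        """Add one request's cost to the running total and return that cost."""
        cost = (input_tokens / 1000) * price_in_per_1k \
             + (output_tokens / 1000) * price_out_per_1k
        self.spent += cost
        return cost

    def should_alert(self) -> bool:
        """True once spend reaches the alert fraction of the monthly budget."""
        return self.spent >= self.alert_fraction * self.monthly_budget


tracker = CostTracker(monthly_budget=10.0)
first_cost = tracker.record(1000, 1000, price_in_per_1k=1.0, price_out_per_1k=3.0)
alert_after_first = tracker.should_alert()
tracker.record(1000, 1000, price_in_per_1k=1.0, price_out_per_1k=3.0)
alert_after_second = tracker.should_alert()
```

A production version would reset the total each billing period and tag spend by feature or tenant so that an alert points at the source of the increase.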

Takeaways (3 min)

Decision frameworks for each pattern
Open-source vs. API vs. hybrid criteria
GitHub repo with reference implementations

Who Should Attend

Engineers moving AI to production, platform teams managing costs, managers evaluating AI feasibility. Basic Python and LLM knowledge helpful.

Technologies

Python, FastAPI, Pydantic, Redis, DynamoDB, sentence-transformers, Llama 3.2, Claude/GPT-4.

Why It Matters

Production-tested patterns with real metrics, honest failures, and actionable frameworks for building economically sustainable AI systems.

Daniel Akhabue

Daniel is an AI/ML Engineer and Cloud Solutions Architect with hands-on experience in building AI-driven solutions across diverse domains, including data science, machine learning and generative AI. He has led and contributed to impactful projects across a range of industries such as EdTech, Assistive Technology, HealthTech, FinTech, and Supply Chain Technology, consistently delivering solutions that address real-world challenges.

As a community champion at Data Scientists Network (DSN), Daniel has won multiple Data Science and AI hackathons, and has led AI communities in Nigeria, driving innovation and knowledge sharing. He is also a technical writer, sharing insights and expertise through articles published on leading online platforms in the Data Science and AI space.

In his spare time, Daniel enjoys reading and reviewing research papers, as well as playing chess. He is driven by a deep passion to build solutions that create meaningful societal impact and foster socio-economic development.

The Frugal AI Architect: Building Cost-Efficient Agentic Systems in Python
