Agent-Based Hyperparameter Optimization for Gradient Boosted Trees

Huijo Kim

Machine Learning & Deep Learning & Statistics
Python Skill: Intermediate
Domain Expertise: Intermediate

Why This Problem Matters in Practice

In the everyday work of a data scientist or ML engineer, hyperparameter tuning often consumes a disproportionate share of experimentation time, yet many tuning failures stem from recurring structural issues rather than random chance. Experienced practitioners can usually identify these issues, but automated optimization systems cannot, because they observe only a scalar objective value.

What Is New or Different

This work reframes hyperparameter optimization as an iterative reasoning process rather than a pure search problem. The key insight is that intermediate explanation artifacts—specifically SHAP value distributions—can be treated as first-class signals that guide subsequent optimization decisions. Encoding this reasoning explicitly via agents enables systematic reuse of expert heuristics that are otherwise applied informally.
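As a rough illustration of treating a SHAP distribution as a signal, the sketch below measures how much of the total mean absolute SHAP value falls on a single feature and maps that to a tuning proposal. It assumes the SHAP values were already computed elsewhere and are passed in as one row per sample. The function names, the 0.6 threshold, and the proposed parameter directions are hypothetical choices for this example, not the rules used in the talk; `colsample_bytree` and `max_depth` are existing XGBoost parameters.

```python
from statistics import fmean


def mean_abs_shap(shap_rows):
    """Mean absolute SHAP value per feature (one input row per sample)."""
    n_features = len(shap_rows[0])
    return [fmean(abs(row[j]) for row in shap_rows) for j in range(n_features)]


def top_feature_share(shap_rows):
    """Fraction of total mean |SHAP| carried by the most important feature."""
    importance = mean_abs_shap(shap_rows)
    total = sum(importance)
    return max(importance) / total if total > 0 else 0.0


def suggest_action(shap_rows, dominance_threshold=0.6):
    """Map the SHAP concentration to a (hypothetical) tuning proposal."""
    share = top_feature_share(shap_rows)
    if share >= dominance_threshold:
        # Assumed heuristic: one feature dominates, so encourage the
        # model to spread splits across other features.
        return {
            "diagnosis": "single_feature_dominance",
            "share": share,
            "proposal": {"colsample_bytree": "decrease", "max_depth": "decrease"},
        }
    return {"diagnosis": "no_issue_detected", "share": share, "proposal": {}}
```

A scalar objective would report only the validation score for this run; the diagnosis above adds a reason that a later step can act on.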

System Architecture and Workflow

The proposed system decomposes the optimization process into agent roles with clearly defined responsibilities, such as diagnostic reasoning, parameter constraint validation, and experiment coordination. These agents interact through a controlled workflow that preserves reproducibility and auditability while leveraging Retrieval-Augmented Generation to reason over model documentation and prior experiment context.
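The division of responsibilities described above might be sketched as follows. This is a simplified, assumption-laden outline: the class names, the rule inside `DiagnosticAgent`, and the bounds in `PARAM_BOUNDS` are all hypothetical, and the rule-based diagnostic stands in for the retrieval-augmented reasoning step, which is omitted here.

```python
from dataclasses import dataclass, field

# Illustrative bounds only (assumption); real limits come from the model docs.
PARAM_BOUNDS = {
    "max_depth": (1, 16),
    "learning_rate": (0.001, 1.0),
    "subsample": (0.1, 1.0),
}


@dataclass
class Proposal:
    params: dict
    rationale: str


class DiagnosticAgent:
    """Turns diagnostics into a parameter proposal (rule-based stand-in)."""

    def propose(self, params, diagnostics):
        new_params = dict(params)
        gap = diagnostics.get("train_score", 0.0) - diagnostics.get("valid_score", 0.0)
        if gap > 0.1 and "max_depth" in params:
            new_params["max_depth"] = params["max_depth"] - 1
            return Proposal(new_params, "large train/validation gap: reduce depth")
        return Proposal(new_params, "no change")


class ConstraintAgent:
    """Rejects proposals whose values fall outside the allowed bounds."""

    def validate(self, proposal):
        errors = []
        for name, value in proposal.params.items():
            if name in PARAM_BOUNDS:
                low, high = PARAM_BOUNDS[name]
                if not low <= value <= high:
                    errors.append(f"{name}={value} outside [{low}, {high}]")
        return errors


@dataclass
class Coordinator:
    """Runs one propose/validate step and records it for auditability."""

    diagnostic: DiagnosticAgent
    constraint: ConstraintAgent
    trace: list = field(default_factory=list)

    def step(self, params, diagnostics):
        proposal = self.diagnostic.propose(params, diagnostics)
        errors = self.constraint.validate(proposal)
        accepted = not errors
        self.trace.append({
            "proposed": proposal.params,
            "rationale": proposal.rationale,
            "errors": errors,
            "accepted": accepted,
        })
        # Fall back to the previous parameters when validation fails.
        return proposal.params if accepted else params
```

Keeping validation separate from diagnosis means an invalid suggestion is logged and discarded rather than silently run, which is one way to preserve the audit trail.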

Scope and Limitations

The case study focuses on gradient boosted tree models, with XGBoost used as the primary example. While the approach generalizes conceptually, it is most effective in settings where model interpretability and parameter interactions dominate performance outcomes. The talk explicitly discusses scenarios where agent-based optimization adds limited value or introduces unnecessary complexity.

Audience Takeaways

Attendees will gain:

  • Practical heuristics for translating explanation outputs into tuning actions
  • Design patterns for building agent-based optimization workflows
  • Guidance on integrating agent reasoning into existing experimentation and MLOps setups
  • A realistic understanding of the trade-offs and failure modes of agent-driven systems

Format and Reproducibility

The presentation is a technical case study supported by architecture diagrams and experiment traces. All code, configurations, and artifacts will be made available as open source to ensure reproducibility and facilitate adaptation.
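One possible shape for a reproducible experiment trace is sketched below: each step is stored as a JSON record keyed by a fingerprint of its configuration, so identical configurations can be recognized across runs. The function names and record fields are assumptions made for this example and do not describe the released artifacts.

```python
import hashlib
import json


def config_fingerprint(config):
    """Short, order-independent hash of a JSON-serializable configuration."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def trace_record(step, config, metrics, rationale):
    """Serialize one optimization step as a single JSON line."""
    return json.dumps(
        {
            "step": step,
            "config_id": config_fingerprint(config),
            "config": config,
            "metrics": metrics,
            "rationale": rationale,
        },
        sort_keys=True,
    )
```

Appending one such line per step to a log file gives a trace that can be replayed or diffed without any extra tooling.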

Huijo Kim

I am a machine learning practitioner and former founder working across predictive modeling, computer vision, MLOps, and autonomous systems. After studying mechanical engineering, I worked in the electric vehicle development sector at Hyundai Motor Group, contributing to large-scale, safety-critical automotive systems.

I later founded and scaled an agtech startup from zero to a six-figure ARR business. This experience shaped my focus on building technology that delivers measurable, real-world value rather than chasing technical hype. After exiting, I transitioned into the e-commerce domain, applying machine learning to large-scale experimentation and operational optimization.

My background includes graduate research in robotics, published work in applied machine learning, and hands-on experience deploying end-to-end ML systems. I am particularly interested in explainability-driven optimization, agent-based workflows, and cross-disciplinary system design. I believe polymath practitioners—those who can bridge domains—will be especially valuable in the era of AI.