Chapter 1
Mathematical Foundations of CatBoost
Unlock the theoretical engine behind CatBoost and discover what sets it apart from classic boosting models. This chapter illuminates the core mathematical principles behind CatBoost's strong performance: its unique ordered boosting, its strategies for handling categorical variables, and its state-of-the-art leak prevention and regularization. Dive into the formal structures and see how CatBoost tames overfitting while achieving world-class accuracy on both familiar and complex datasets.
1.1 Principles of Gradient Boosted Decision Trees
Gradient boosting is a powerful ensemble technique founded on the idea of building an additive model by sequentially fitting weak learners to the residuals (more generally, the negative gradients of the loss) of the current ensemble. The theoretical framework of gradient boosted decision trees (GBDTs) integrates concepts from function approximation, numerical optimization, and statistical learning, producing robust predictive models from simple base learners.
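To make this loop concrete, the following sketch (not CatBoost itself, and not any library's internal implementation) fits a plain gradient-boosted regressor for squared-error loss: each shallow scikit-learn decision tree is trained on the residuals of the ensemble built so far, and its scaled prediction is added to the running model. The function names fit_gbdt and predict_gbdt, as well as the hyperparameter values, are illustrative choices introduced here for the example.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbdt(X, y, n_estimators=100, learning_rate=0.1, max_depth=3):
    """Minimal gradient boosting for squared-error loss: fit trees to residuals."""
    y = np.asarray(y, dtype=float)
    f0 = y.mean()                              # constant initial model F_0
    prediction = np.full(y.shape, f0)
    trees = []
    for _ in range(n_estimators):
        residuals = y - prediction             # negative gradient of (1/2)(y - F)^2
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                 # weak learner h_m fit to residuals
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees, learning_rate

def predict_gbdt(model, X):
    f0, trees, learning_rate = model
    return f0 + learning_rate * sum(tree.predict(X) for tree in trees)

For losses other than squared error, the same loop applies with the residuals replaced by the negative gradients (and, in many modern implementations, second-order information) of the chosen loss; this generalization is the subject of the remainder of this section.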
At its core, the boosting procedure constructs an additive model of the form
$$F_M(x) = \sum_{m=1}^{M} \gamma_m\, h_m(x),$$
where each $h_m(x)$ is a weak learner, typically a decision tree of limited depth, that contributes a small improvement to the overall prediction, and the $\gamma_m$ are corresponding weights or step sizes. The data $x \in$