
OpenBoost

A GPU-native, all-Python platform for tree-based machine learning.


Why OpenBoost?

For standard GBDT, use XGBoost or LightGBM; they're highly optimized C++.

For GBDT variants (probabilistic predictions, interpretable GAMs, custom algorithms), OpenBoost brings GPU acceleration to methods that were previously CPU-only and slow:

  • NaturalBoost: 1.3-2x faster than NGBoost
  • OpenBoostGAM: 10-40x faster than InterpretML EBM

Plus: ~20K lines of readable Python. Modify, extend, and build on it, with no C++ required.

Quick Example

import openboost as ob
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic regression data so the example runs end to end
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standard gradient boosting
model = ob.GradientBoosting(n_trees=100, max_depth=6)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Probabilistic predictions with uncertainty
prob_model = ob.NaturalBoostNormal(n_trees=100)
prob_model.fit(X_train, y_train)
mean = prob_model.predict(X_test)
lower, upper = prob_model.predict_interval(X_test, alpha=0.1)  # 90% interval
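
Since the quickstart produces a 90% interval, you can sanity-check its empirical coverage on the held-out data:

# Fraction of test targets falling inside the 90% interval; should be near 0.90
coverage = ((y_test >= lower) & (y_test <= upper)).mean()
print(f"Empirical coverage: {coverage:.2f}")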

Features

🚀 GPU Accelerated

Numba CUDA kernels for histogram building and tree construction. 10-40x faster GAM training compared to CPU alternatives.
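
To give a flavor of the approach, here is a minimal sketch of a gradient-histogram kernel in Numba CUDA. It illustrates the general technique (one thread per sample, atomic accumulation into bins), not OpenBoost's actual kernel; all names here are invented for the example.

import numpy as np
from numba import cuda

@cuda.jit
def grad_histogram_kernel(bin_idx, grad, hist):
    # One thread per sample: add its gradient to the sample's feature bin.
    i = cuda.grid(1)
    if i < bin_idx.size:
        cuda.atomic.add(hist, bin_idx[i], grad[i])

# Example launch: 1M samples, 256 histogram bins
n, n_bins = 1_000_000, 256
bin_idx = cuda.to_device(np.random.randint(0, n_bins, n).astype(np.int32))
grad = cuda.to_device(np.random.randn(n).astype(np.float32))
hist = cuda.to_device(np.zeros(n_bins, dtype=np.float32))

threads = 256
blocks = (n + threads - 1) // threads
grad_histogram_kernel[blocks, threads](bin_idx, grad, hist)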

🧠 Probabilistic Predictions

NaturalBoost provides full probability distributions with uncertainty quantification. 8 built-in distributions including Normal, Gamma, Tweedie, and Negative Binomial.
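
The core idea, shared with NGBoost, is to precondition the gradient of the negative log-likelihood with the inverse Fisher information. Here is a minimal NumPy sketch for a Normal distribution parameterized as (mu, log sigma); this illustrates the math, not OpenBoost's internal code:

import numpy as np

def natural_gradient_normal(y, mu, log_sigma):
    """Natural gradient of the Normal NLL w.r.t. (mu, log_sigma)."""
    var = np.exp(2 * log_sigma)
    # Ordinary gradients of the negative log-likelihood
    g_mu = (mu - y) / var
    g_log_sigma = 1 - (y - mu) ** 2 / var
    # Fisher information is diag(1/var, 2); multiply by its inverse
    return var * g_mu, g_log_sigma / 2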

🐍 All Python

~20K lines of readable, hackable code. No C++ compilation needed. Understand and modify the algorithms.

⚙️ sklearn Compatible

Drop-in replacement for scikit-learn pipelines. Works with GridSearchCV, cross_val_score, and Pipeline.
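
For example, here is a sketch that reuses the quickstart data and assumes the estimator follows the full sklearn API, as the compatibility claim suggests; the hyperparameter names n_trees and max_depth come from the quickstart above:

from sklearn.model_selection import GridSearchCV

import openboost as ob

search = GridSearchCV(
    ob.GradientBoosting(),
    param_grid={"n_trees": [100, 300], "max_depth": [4, 6]},
    cv=3,
)
search.fit(X_train, y_train)
print(search.best_params_)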

Installation

pip install openboost

# With GPU support
pip install "openboost[cuda]"

What's Included

Category        Models
-------------   ------------------------------------------------------------------------
Standard GBDT   GradientBoosting, MultiClassGradientBoosting, DART
Interpretable   OpenBoostGAM, LinearLeafGBDT
Probabilistic   NaturalBoostNormal, LogNormal, Gamma, Poisson, StudentT, Tweedie, NegBin

Performance

OpenBoost GPU-accelerates GBDT variants that were previously slow:

Benchmark                         Result
-------------------------------   -------------
NaturalBoost vs NGBoost           1.3-2x faster
OpenBoostGAM vs InterpretML EBM   10-40x faster

For standard GBDT, XGBoost/LightGBM are faster. OpenBoost's value is in the variants.

Who Is OpenBoost For?

  • Kaggle Competitors - Probabilistic predictions XGBoost can't provide
  • ML Researchers - Prototype new algorithms in Python
  • Startups - Ship interpretable models fast
  • Students - Actually understand how gradient boosting works

Roadmap

Train-many optimization: industry workloads often train many models (hyperparameter tuning, cross-validation, per-segment models). XGBoost is optimized for training one model fast; OpenBoost plans native support for training many models efficiently.

License

Apache 2.0