Models

All OpenBoost model classes.

Standard GBDT

GradientBoosting

GradientBoosting dataclass

GradientBoosting(
    n_trees=100,
    max_depth=6,
    learning_rate=0.1,
    loss="mse",
    min_child_weight=1.0,
    reg_lambda=1.0,
    reg_alpha=0.0,
    gamma=0.0,
    subsample=1.0,
    colsample_bytree=1.0,
    n_bins=256,
    quantile_alpha=0.5,
    tweedie_rho=1.5,
    distributed=False,
    n_workers=None,
    subsample_strategy="none",
    goss_top_rate=0.2,
    goss_other_rate=0.1,
    batch_size=None,
    n_gpus=None,
    devices=None,
)

Bases: PersistenceMixin

Gradient Boosting ensemble model.

A gradient boosting model that supports both built-in loss functions and custom loss functions. When using built-in losses with GPU, training is fully batched for maximum performance.

PARAMETER DESCRIPTION
n_trees

Number of trees to train.

TYPE: int DEFAULT: 100

max_depth

Maximum depth of each tree.

TYPE: int DEFAULT: 6

learning_rate

Shrinkage factor applied to each tree.

TYPE: float DEFAULT: 0.1

loss

Loss function. Can be:

- 'mse': Mean Squared Error (regression)
- 'logloss': Binary cross-entropy (classification)
- 'huber': Huber loss (robust regression)
- 'mae': Mean Absolute Error (L1 regression)
- 'quantile': Quantile regression (use with quantile_alpha)
- 'tweedie': Tweedie loss (use with tweedie_rho)
- Callable: Custom function(pred, y) -> (grad, hess)

TYPE: str | LossFunction | Callable[..., tuple] DEFAULT: 'mse'
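The callable form can be satisfied in a few lines. A minimal sketch of a hand-rolled squared-error loss (`custom_mse` is a hypothetical name; only the `(pred, y) -> (grad, hess)` signature comes from the docs above):

```python
import numpy as np

def custom_mse(pred, y):
    """Custom loss matching the documented contract: returns (grad, hess)."""
    grad = pred - y            # derivative of 0.5 * (pred - y)**2 w.r.t. pred
    hess = np.ones_like(pred)  # second derivative is constant 1
    return grad, hess

# Hypothetical usage, per the callable contract above:
# model = ob.GradientBoosting(loss=custom_mse)
```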

min_child_weight

Minimum sum of hessian in a leaf.

TYPE: float DEFAULT: 1.0

reg_lambda

L2 regularization on leaf values.

TYPE: float DEFAULT: 1.0

n_bins

Number of bins for histogram building.

TYPE: int DEFAULT: 256

quantile_alpha

Quantile level for 'quantile' loss (0 < alpha < 1).

- 0.5: Median regression (default)
- 0.9: 90th percentile
- 0.1: 10th percentile

TYPE: float DEFAULT: 0.5

tweedie_rho

Variance power for 'tweedie' loss (1 < rho < 2).

- 1.5: Default (compound Poisson-Gamma)

TYPE: float DEFAULT: 1.5

subsample_strategy

Sampling strategy for large-scale training (Phase 17):

- 'none': No sampling (default)
- 'random': Random subsampling
- 'goss': Gradient-based One-Side Sampling (LightGBM-style)

TYPE: Literal['none', 'random', 'goss'] DEFAULT: 'none'

goss_top_rate

Fraction of top-gradient samples to keep (for GOSS).

TYPE: float DEFAULT: 0.2

goss_other_rate

Fraction of remaining samples to sample (for GOSS).

TYPE: float DEFAULT: 0.1

batch_size

Mini-batch size for large datasets. If None, process all at once.

TYPE: int | None DEFAULT: None

Examples:

Basic regression:

import openboost as ob

model = ob.GradientBoosting(n_trees=100, loss='mse')
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Quantile regression (90th percentile):

model = ob.GradientBoosting(loss='quantile', quantile_alpha=0.9)
model.fit(X_train, y_train)

GOSS for faster training:

model = ob.GradientBoosting(
    n_trees=100,
    subsample_strategy='goss',
    goss_top_rate=0.2,
    goss_other_rate=0.1,
)

Multi-GPU training:

model = ob.GradientBoosting(n_trees=100, n_gpus=4)
model.fit(X, y)  # Data parallel across 4 GPUs

fit

fit(
    X, y, callbacks=None, eval_set=None, sample_weight=None
)

Fit the gradient boosting model.

PARAMETER DESCRIPTION
X

Training features, shape (n_samples, n_features).

TYPE: NDArray

y

Training targets, shape (n_samples,).

TYPE: NDArray

callbacks

List of Callback instances for training hooks. Use EarlyStopping for early stopping, Logger for progress.

TYPE: list[Callback] | None DEFAULT: None

eval_set

List of (X, y) tuples for validation (used with callbacks).

TYPE: list[tuple[NDArray, NDArray]] | None DEFAULT: None

sample_weight

Sample weights, shape (n_samples,).

TYPE: NDArray | None DEFAULT: None

RETURNS DESCRIPTION
self

The fitted model.

TYPE: GradientBoosting

Example
from openboost import GradientBoosting, EarlyStopping, Logger

model = GradientBoosting(n_trees=1000)
model.fit(
    X, y,
    callbacks=[EarlyStopping(patience=50), Logger(period=10)],
    eval_set=[(X_val, y_val)]
)

predict

predict(X)

Generate predictions for X.

PARAMETER DESCRIPTION
X

Features to predict on, shape (n_samples, n_features). Can be raw numpy array or pre-binned BinnedArray.

TYPE: NDArray | BinnedArray

RETURNS DESCRIPTION
predictions

Shape (n_samples,).

TYPE: NDArray

RAISES DESCRIPTION
ValueError

If model is not fitted or X has wrong shape.

MultiClassGradientBoosting

MultiClassGradientBoosting dataclass

MultiClassGradientBoosting(
    n_classes,
    n_trees=100,
    max_depth=6,
    learning_rate=0.1,
    min_child_weight=1.0,
    reg_lambda=1.0,
    reg_alpha=0.0,
    gamma=0.0,
    subsample=1.0,
    colsample_bytree=1.0,
    n_bins=256,
    subsample_strategy="none",
    goss_top_rate=0.2,
    goss_other_rate=0.1,
    batch_size=None,
)

Bases: PersistenceMixin

Multi-class Gradient Boosting classifier.

Uses softmax loss and trains K trees per round (one per class), following the XGBoost/LightGBM approach.

PARAMETER DESCRIPTION
n_classes

Number of classes.

TYPE: int

n_trees

Number of boosting rounds (total trees = n_trees * n_classes).

TYPE: int DEFAULT: 100

max_depth

Maximum depth of each tree.

TYPE: int DEFAULT: 6

learning_rate

Shrinkage factor applied to each tree.

TYPE: float DEFAULT: 0.1

min_child_weight

Minimum sum of hessian in a leaf.

TYPE: float DEFAULT: 1.0

reg_lambda

L2 regularization on leaf values.

TYPE: float DEFAULT: 1.0

n_bins

Number of bins for histogram building.

TYPE: int DEFAULT: 256

subsample_strategy

Sampling strategy (Phase 17): 'none', 'random', 'goss'.

TYPE: Literal['none', 'random', 'goss'] DEFAULT: 'none'

goss_top_rate

Fraction of top-gradient samples to keep (for GOSS).

TYPE: float DEFAULT: 0.2

goss_other_rate

Fraction of remaining samples to sample (for GOSS).

TYPE: float DEFAULT: 0.1

Example
import openboost as ob

model = ob.MultiClassGradientBoosting(n_classes=10, n_trees=100)
model.fit(X_train, y_train)  # y_train: 0 to 9
predictions = model.predict(X_test)  # Returns class labels
proba = model.predict_proba(X_test)  # Returns probabilities

With GOSS sampling:

model = ob.MultiClassGradientBoosting(
    n_classes=10, n_trees=100,
    subsample_strategy='goss',
    goss_top_rate=0.2,
    goss_other_rate=0.1
)

fit

fit(X, y)

Fit the multi-class gradient boosting model.

PARAMETER DESCRIPTION
X

Training features, shape (n_samples, n_features).

TYPE: NDArray

y

Training labels, shape (n_samples,). Integer class labels 0 to n_classes-1.

TYPE: NDArray

RETURNS DESCRIPTION
self

The fitted model.

TYPE: MultiClassGradientBoosting

predict

predict(X)

Predict class labels.

PARAMETER DESCRIPTION
X

Features to predict on.

TYPE: NDArray | BinnedArray

RETURNS DESCRIPTION
labels

Shape (n_samples,). Integer class labels.

TYPE: NDArray

predict_proba

predict_proba(X)

Predict class probabilities.

PARAMETER DESCRIPTION
X

Features to predict on.

TYPE: NDArray | BinnedArray

RETURNS DESCRIPTION
probabilities

Shape (n_samples, n_classes).

TYPE: NDArray

DART

DART dataclass

DART(
    n_trees=100,
    max_depth=6,
    learning_rate=0.1,
    loss="mse",
    dropout_rate=0.1,
    skip_drop=0.0,
    normalize=True,
    sample_type="uniform",
    min_child_weight=1.0,
    reg_lambda=1.0,
    n_bins=256,
    seed=None,
)

Bases: PersistenceMixin

DART: Gradient Boosting with Dropout.

Implements DART (Dropouts meet Multiple Additive Regression Trees), which randomly drops trees during training to prevent overfitting.

PARAMETER DESCRIPTION
n_trees

Number of trees to train.

TYPE: int DEFAULT: 100

max_depth

Maximum depth of each tree.

TYPE: int DEFAULT: 6

learning_rate

Base learning rate (shrinkage factor).

TYPE: float DEFAULT: 0.1

loss

Loss function ('mse', 'logloss', 'huber', or callable).

TYPE: str | LossFunction DEFAULT: 'mse'

dropout_rate

Fraction of trees to drop each round (0 to 1).

TYPE: float DEFAULT: 0.1

skip_drop

Probability of skipping dropout for a round.

TYPE: float DEFAULT: 0.0

normalize

If True, normalize dropped tree contributions.

TYPE: bool DEFAULT: True

sample_type

How to sample dropped trees ('uniform' or 'weighted').

TYPE: str DEFAULT: 'uniform'

min_child_weight

Minimum sum of hessian in a leaf.

TYPE: float DEFAULT: 1.0

reg_lambda

L2 regularization on leaf values.

TYPE: float DEFAULT: 1.0

n_bins

Number of bins for histogram building.

TYPE: int DEFAULT: 256

seed

Random seed for reproducibility.

TYPE: int | None DEFAULT: None

Example
import openboost as ob

# DART with 10% dropout
model = ob.DART(n_trees=100, dropout_rate=0.1)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# DART with higher dropout for more regularization
model = ob.DART(n_trees=200, dropout_rate=0.3, skip_drop=0.5)
model.fit(X_train, y_train)

fit

fit(X, y)

Fit the DART model.

PARAMETER DESCRIPTION
X

Training features, shape (n_samples, n_features).

TYPE: NDArray

y

Training targets, shape (n_samples,).

TYPE: NDArray

RETURNS DESCRIPTION
self

The fitted model.

TYPE: DART

predict

predict(X)

Generate predictions for X.

PARAMETER DESCRIPTION
X

Features to predict on, shape (n_samples, n_features). Can be raw numpy array or pre-binned BinnedArray.

TYPE: NDArray | BinnedArray

RETURNS DESCRIPTION
predictions

Shape (n_samples,).

TYPE: NDArray

Interpretable Models

OpenBoostGAM

OpenBoostGAM dataclass

OpenBoostGAM(
    n_rounds=1000,
    learning_rate=0.01,
    reg_lambda=1.0,
    loss="mse",
    n_bins=256,
)

Bases: PersistenceMixin

GPU-accelerated Generalized Additive Model.

An interpretable model where

prediction = sum(shape_function_i(x_i) for all features i)

Each shape function is a lookup table mapping binned feature values to contribution scores. Trained via parallel gradient boosting.
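The lookup-table structure described above can be sketched directly: each feature contributes its table entry at that sample's bin index, and the entries are summed. A simplified illustration with toy tables, not OpenBoost's internals:

```python
import numpy as np

# Toy shape functions: 2 features, 4 bins each (values are illustrative)
shape_tables = np.array([
    [-1.0, -0.5, 0.5, 1.0],   # contribution of feature 0, per bin
    [0.2, 0.1, -0.1, -0.2],   # contribution of feature 1, per bin
])

def gam_predict(X_binned):
    """Sum each feature's shape-function entry at that sample's bin index."""
    n_features = X_binned.shape[1]
    # contrib[i, j] = shape_tables[j, X_binned[i, j]]
    contrib = shape_tables[np.arange(n_features), X_binned]
    return contrib.sum(axis=1)

preds = gam_predict(np.array([[0, 3], [3, 0]]))  # bin indices per feature
```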

PARAMETER DESCRIPTION
n_rounds

Number of boosting rounds.

TYPE: int DEFAULT: 1000

learning_rate

Shrinkage factor (smaller = more stable, needs more rounds).

TYPE: float DEFAULT: 0.01

reg_lambda

L2 regularization on leaf values.

TYPE: float DEFAULT: 1.0

loss

Loss function ('mse', 'logloss', or callable).

TYPE: str | LossFunction DEFAULT: 'mse'

n_bins

Number of bins for histogram building.

TYPE: int DEFAULT: 256

Example
import openboost as ob

gam = ob.OpenBoostGAM(n_rounds=1000, learning_rate=0.01)
gam.fit(X_train, y_train)
predictions = gam.predict(X_test)

# Interpret: plot shape function for feature 0
gam.plot_shape_function(0, feature_name="age")

fit

fit(X, y)

Fit the GAM model.

PARAMETER DESCRIPTION
X

Training features, shape (n_samples, n_features).

TYPE: NDArray

y

Training targets, shape (n_samples,).

TYPE: NDArray

RETURNS DESCRIPTION
self

The fitted model.

TYPE: OpenBoostGAM

predict

predict(X)

Generate predictions.

PARAMETER DESCRIPTION
X

Features, shape (n_samples, n_features).

TYPE: NDArray | BinnedArray

RETURNS DESCRIPTION
predictions

Shape (n_samples,).

TYPE: NDArray

get_feature_importance

get_feature_importance()

Get feature importance based on shape function variance.

RETURNS DESCRIPTION
importance

Shape (n_features,), higher = more important.

TYPE: NDArray
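One plausible reading of "shape function variance" is the variance of each feature's lookup table across bins: a flat shape function contributes nothing, a widely varying one dominates predictions. A sketch under that assumption (the library's exact computation may weight bins by sample counts):

```python
import numpy as np

# Toy shape tables: one row per feature, one column per bin
shape_tables = np.array([
    [-1.0, -0.5, 0.5, 1.0],  # large swings across bins -> important
    [0.0, 0.0, 0.0, 0.0],    # flat shape function -> contributes nothing
])

# Importance as the variance of each shape function across its bins
importance = shape_tables.var(axis=1)  # higher = more important
```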

plot_shape_function

plot_shape_function(feature_idx, feature_name=None)

Plot the shape function for a feature.

PARAMETER DESCRIPTION
feature_idx

Index of the feature to plot.

TYPE: int

feature_name

Optional name for the x-axis label.

TYPE: str | None DEFAULT: None

LinearLeafGBDT

LinearLeafGBDT dataclass

LinearLeafGBDT(
    n_trees=100,
    max_depth=4,
    learning_rate=0.1,
    loss="mse",
    min_samples_leaf=20,
    reg_lambda_tree=1.0,
    reg_lambda_linear=0.1,
    max_features_linear="sqrt",
    n_bins=256,
)

Bases: PersistenceMixin

Gradient Boosting with Linear Leaf Trees.

Each tree has linear models in its leaves instead of constant values. This enables:

- Better extrapolation beyond the training data range
- Smoother decision boundaries
- Shallower trees (the linear models add complexity)

Recommended settings:

- Use max_depth=3-4 (shallower than standard GBDT)
- Use a larger min_samples_leaf (the linear model needs enough samples to fit)

PARAMETER DESCRIPTION
n_trees

Number of boosting rounds

TYPE: int DEFAULT: 100

max_depth

Maximum tree depth (typically 3-4, shallower than standard)

TYPE: int DEFAULT: 4

learning_rate

Shrinkage factor

TYPE: float DEFAULT: 0.1

loss

Loss function ('mse', 'mae', 'huber', or callable)

TYPE: str | LossFunction DEFAULT: 'mse'

min_samples_leaf

Minimum samples to fit linear model in leaf

TYPE: int DEFAULT: 20

reg_lambda_tree

L2 regularization for tree splits

TYPE: float DEFAULT: 1.0

reg_lambda_linear

L2 regularization for linear models (ridge)

TYPE: float DEFAULT: 0.1

max_features_linear

Max features per leaf's linear model:

- None: Use all features
- 'sqrt': Use sqrt(n_features) features
- 'log2': Use log2(n_features) features
- int: Use exactly this many features

TYPE: int | str | None DEFAULT: 'sqrt'

n_bins

Number of bins for histogram building

TYPE: int DEFAULT: 256

Example
model = LinearLeafGBDT(n_trees=100, max_depth=4)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Compare extrapolation with standard GBDT
from openboost import GradientBoosting
standard = GradientBoosting(n_trees=100, max_depth=6)
standard.fit(X_train, y_train)
# LinearLeafGBDT typically extrapolates better on linear trends

fit

fit(X, y)

Fit the linear leaf GBDT model.

PARAMETER DESCRIPTION
X

Training features, shape (n_samples, n_features)

TYPE: NDArray

y

Training targets, shape (n_samples,)

TYPE: NDArray

RETURNS DESCRIPTION
self

Fitted model

TYPE: LinearLeafGBDT

predict

predict(X)

Generate predictions.

PARAMETER DESCRIPTION
X

Features, shape (n_samples, n_features)

TYPE: NDArray

RETURNS DESCRIPTION
predictions

Shape (n_samples,)

TYPE: NDArray

Probabilistic Models (NaturalBoost)

NaturalBoost

NaturalBoost dataclass

NaturalBoost(
    distribution="normal",
    n_trees=100,
    max_depth=4,
    learning_rate=0.1,
    min_child_weight=1.0,
    reg_lambda=1.0,
    reg_alpha=0.0,
    subsample=1.0,
    colsample_bytree=1.0,
    n_bins=256,
)

Bases: DistributionalGBDT

Natural Gradient Boosting for probabilistic prediction.

OpenBoost's implementation of natural gradient boosting, inspired by NGBoost. Uses natural gradient instead of ordinary gradient, leading to faster convergence by accounting for the geometry of the parameter space.

Natural gradient: F^{-1} @ ordinary_gradient where F is the Fisher information matrix.
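In code, the transform is a per-sample linear solve against the Fisher matrix. A hedged numpy sketch of the F^{-1} @ gradient step (illustrative, not the library's implementation):

```python
import numpy as np

def natural_gradient(grad, fisher):
    """Convert ordinary gradients to natural gradients: F^{-1} @ grad.

    grad:   (n_samples, n_params) ordinary gradients
    fisher: (n_samples, n_params, n_params) Fisher information matrices
    """
    # Solve F @ ng = grad per sample rather than inverting F explicitly
    return np.linalg.solve(fisher, grad[..., None])[..., 0]

# Example: Normal distribution parameterized as (mu, log_sigma).
# With sigma = 1, the Fisher information matrix is diag(1, 2),
# so the natural gradient halves the log_sigma component.
grad = np.array([[0.5, -1.0]])
fisher = np.array([[[1.0, 0.0], [0.0, 2.0]]])
ng = natural_gradient(grad, fisher)
```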

Key advantages over standard GBDT:

- Full probability distributions, not just point estimates
- Prediction intervals and uncertainty quantification
- Faster convergence than ordinary gradient descent

Key advantages over official NGBoost:

- GPU acceleration via histogram-based trees
- Faster on large datasets (>10k samples)
- Custom distributions with autodiff support

Reference

Duan et al. "NGBoost: Natural Gradient Boosting for Probabilistic Prediction." ICML 2020.

PARAMETER DESCRIPTION
distribution

Distribution name or instance

TYPE: Literal['normal', 'lognormal', 'gamma', 'poisson', 'studentt', 'tweedie', 'negbin'] | Distribution DEFAULT: 'normal'

n_trees

Number of boosting rounds (natural gradient often needs fewer rounds than ordinary gradient boosting)

TYPE: int DEFAULT: 100

max_depth

Maximum depth of each tree (default 4, often smaller is better)

TYPE: int DEFAULT: 4

learning_rate

Shrinkage factor

TYPE: float DEFAULT: 0.1

min_child_weight

Minimum sum of hessian in a leaf

TYPE: float DEFAULT: 1.0

reg_lambda

L2 regularization

TYPE: float DEFAULT: 1.0

n_bins

Number of bins for histogram building

TYPE: int DEFAULT: 256

Example
model = NaturalBoost(distribution='normal', n_trees=500)
model.fit(X_train, y_train)

# Get prediction intervals
lower, upper = model.predict_interval(X_test, alpha=0.1)

# Get full distribution
output = model.predict_distribution(X_test)
samples = output.sample(n_samples=1000)

NaturalBoostNormal

NaturalBoostNormal

NaturalBoostNormal(**kwargs)

NaturalBoost with Normal distribution.

NaturalBoostLogNormal

NaturalBoostLogNormal

NaturalBoostLogNormal(**kwargs)

NaturalBoost with LogNormal distribution (for positive data).

NaturalBoostGamma

NaturalBoostGamma

NaturalBoostGamma(**kwargs)

NaturalBoost with Gamma distribution (for positive data).

NaturalBoostPoisson

NaturalBoostPoisson

NaturalBoostPoisson(**kwargs)

NaturalBoost with Poisson distribution (for count data).

NaturalBoostStudentT

NaturalBoostStudentT

NaturalBoostStudentT(**kwargs)

NaturalBoost with Student-t distribution (for heavy-tailed data).

NaturalBoostTweedie

NaturalBoostTweedie

NaturalBoostTweedie(power=1.5, **kwargs)

NaturalBoost with Tweedie distribution (for insurance claims, zero-inflated data).

Kaggle Use Cases:

- Porto Seguro Safe Driver Prediction
- Allstate Claims Severity
- Any zero-inflated positive target

PARAMETER DESCRIPTION
power

Tweedie power parameter (1 < power < 2). 1.5 is the default for insurance claims.

TYPE: float DEFAULT: 1.5

**kwargs

Other NaturalBoost parameters (n_trees, learning_rate, etc.)

DEFAULT: {}

Example
model = NaturalBoostTweedie(power=1.5, n_trees=500)
model.fit(X_train, y_train)  # y has zeros and positive values

# Get prediction intervals (XGBoost can't do this!)
lower, upper = model.predict_interval(X_test, alpha=0.1)

NaturalBoostNegBin

NaturalBoostNegBin

NaturalBoostNegBin(**kwargs)

NaturalBoost with Negative Binomial distribution (for overdispersed count data).

Kaggle Use Cases:

- Rossmann Store Sales
- Bike Sharing Demand
- Grupo Bimbo Inventory Demand
- Any count prediction where variance > mean

PARAMETER DESCRIPTION
**kwargs

NaturalBoost parameters (n_trees, learning_rate, etc.)

DEFAULT: {}

Example
model = NaturalBoostNegBin(n_trees=500)
model.fit(X_train, y_train)  # y is count data

# Probability of exceeding threshold (demand planning!)
output = model.predict_distribution(X_test)
prob_high_demand = output.distribution.prob_exceed(output.params, 100)

sklearn Wrappers

OpenBoostRegressor

OpenBoostRegressor

OpenBoostRegressor(
    n_estimators=100,
    max_depth=6,
    learning_rate=0.1,
    loss="squared_error",
    min_child_weight=1.0,
    reg_lambda=1.0,
    reg_alpha=0.0,
    gamma=0.0,
    subsample=1.0,
    colsample_bytree=1.0,
    n_bins=256,
    quantile_alpha=0.5,
    subsample_strategy="none",
    goss_top_rate=0.2,
    goss_other_rate=0.1,
    batch_size=None,
    early_stopping_rounds=None,
    verbose=0,
    random_state=None,
)

Bases: BaseEstimator, RegressorMixin

Gradient Boosting Regressor with sklearn-compatible interface.

This is a thin wrapper around OpenBoost's GradientBoosting that provides full compatibility with sklearn's ecosystem (GridSearchCV, Pipeline, etc.).

Parameters

n_estimators : int, default=100
    Number of boosting rounds (trees).
max_depth : int, default=6
    Maximum depth of each tree.
learning_rate : float, default=0.1
    Shrinkage factor applied to each tree's contribution.
loss : {'squared_error', 'absolute_error', 'huber', 'quantile'}, default='squared_error'
    Loss function to optimize.
min_child_weight : float, default=1.0
    Minimum sum of hessian in a leaf node.
reg_lambda : float, default=1.0
    L2 regularization on leaf values.
reg_alpha : float, default=0.0
    L1 regularization on leaf values.
gamma : float, default=0.0
    Minimum gain required to make a split.
subsample : float, default=1.0
    Fraction of samples to use for each tree.
colsample_bytree : float, default=1.0
    Fraction of features to use for each tree.
n_bins : int, default=256
    Number of bins for histogram building.
quantile_alpha : float, default=0.5
    Quantile level for 'quantile' loss.
subsample_strategy : {'none', 'random', 'goss'}, default='none'
    Sampling strategy for large-scale training (Phase 17).
    - 'none': No sampling (default)
    - 'random': Random subsampling
    - 'goss': Gradient-based One-Side Sampling (LightGBM-style)
goss_top_rate : float, default=0.2
    Fraction of top-gradient samples to keep (for GOSS).
goss_other_rate : float, default=0.1
    Fraction of remaining samples to sample (for GOSS).
batch_size : int, optional
    Mini-batch size for large datasets. If None, process all at once.
early_stopping_rounds : int, optional
    Stop training if validation score doesn't improve for this many rounds.
    Requires eval_set to be passed to fit().
verbose : int, default=0
    Verbosity level (0=silent, N=log every N rounds).
random_state : int, optional
    Random seed for reproducibility.

Attributes

n_features_in_ : int
    Number of features seen during fit.
feature_names_in_ : ndarray of shape (n_features_in_,)
    Names of features seen during fit (if X is a DataFrame).
feature_importances_ : ndarray of shape (n_features_in_,)
    Feature importances (based on split frequency).
booster_ : GradientBoosting
    The underlying fitted OpenBoost model.
best_iteration_ : int
    Iteration with best validation score (if early stopping used).
best_score_ : float
    Best validation score achieved (if early stopping used).

Examples

from openboost import OpenBoostRegressor

reg = OpenBoostRegressor(n_estimators=100, max_depth=6)
reg.fit(X_train, y_train)
reg.predict(X_test)
reg.score(X_test, y_test)  # R² score

With early stopping

reg = OpenBoostRegressor(n_estimators=1000, early_stopping_rounds=50)
reg.fit(X_train, y_train, eval_set=[(X_val, y_val)])
print(f"Best iteration: {reg.best_iteration_}")

GridSearchCV

from sklearn.model_selection import GridSearchCV

param_grid = {'n_estimators': [50, 100], 'max_depth': [3, 5]}
search = GridSearchCV(OpenBoostRegressor(), param_grid, cv=5)
search.fit(X, y)

fit

fit(X, y, sample_weight=None, eval_set=None)

Fit the gradient boosting regressor.

Parameters

X : array-like of shape (n_samples, n_features)
    Training features.
y : array-like of shape (n_samples,)
    Target values.
sample_weight : array-like of shape (n_samples,), optional
    Sample weights.
eval_set : list of (X, y) tuples, optional
    Validation sets for early stopping.

Returns

self : OpenBoostRegressor
    Fitted estimator.

predict

predict(X)

Predict target values.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.

Returns

y_pred : ndarray of shape (n_samples,)
    Predicted values.

OpenBoostClassifier

OpenBoostClassifier

OpenBoostClassifier(
    n_estimators=100,
    max_depth=6,
    learning_rate=0.1,
    min_child_weight=1.0,
    reg_lambda=1.0,
    reg_alpha=0.0,
    gamma=0.0,
    subsample=1.0,
    colsample_bytree=1.0,
    n_bins=256,
    subsample_strategy="none",
    goss_top_rate=0.2,
    goss_other_rate=0.1,
    batch_size=None,
    early_stopping_rounds=None,
    verbose=0,
    random_state=None,
)

Bases: BaseEstimator, ClassifierMixin

Gradient Boosting Classifier with sklearn-compatible interface.

Automatically handles binary and multi-class classification. Uses logloss for binary, softmax for multi-class.

Parameters

n_estimators : int, default=100
    Number of boosting rounds.
max_depth : int, default=6
    Maximum depth of each tree.
learning_rate : float, default=0.1
    Shrinkage factor.
min_child_weight : float, default=1.0
    Minimum sum of hessian in a leaf.
reg_lambda : float, default=1.0
    L2 regularization on leaf values.
reg_alpha : float, default=0.0
    L1 regularization on leaf values.
gamma : float, default=0.0
    Minimum gain required to make a split.
subsample : float, default=1.0
    Fraction of samples per tree.
colsample_bytree : float, default=1.0
    Fraction of features per tree.
n_bins : int, default=256
    Number of bins for histogram building.
subsample_strategy : {'none', 'random', 'goss'}, default='none'
    Sampling strategy for large-scale training (Phase 17).
goss_top_rate : float, default=0.2
    Fraction of top-gradient samples to keep (for GOSS).
goss_other_rate : float, default=0.1
    Fraction of remaining samples to sample (for GOSS).
batch_size : int, optional
    Mini-batch size for large datasets.
early_stopping_rounds : int, optional
    Stop if validation doesn't improve.
verbose : int, default=0
    Verbosity level.
random_state : int, optional
    Random seed.

Attributes

classes_ : ndarray
    Unique class labels.
n_classes_ : int
    Number of classes.
n_features_in_ : int
    Number of features.
feature_importances_ : ndarray
    Feature importances.
booster_ : GradientBoosting or MultiClassGradientBoosting
    Underlying model.

Examples

from openboost import OpenBoostClassifier

clf = OpenBoostClassifier(n_estimators=100)
clf.fit(X_train, y_train)
clf.predict(X_test)
clf.predict_proba(X_test)
clf.classes_  # array([0, 1])

Multi-class

clf.fit(X_train, y_train)  # y_train has 3+ classes
clf.predict_proba(X_test).shape  # (n_samples, n_classes)

fit

fit(X, y, sample_weight=None, eval_set=None)

Fit the gradient boosting classifier.

Parameters

X : array-like of shape (n_samples, n_features)
    Training features.
y : array-like of shape (n_samples,)
    Target class labels.
sample_weight : array-like of shape (n_samples,), optional
    Sample weights.
eval_set : list of (X, y) tuples, optional
    Validation sets for early stopping.

Returns

self : OpenBoostClassifier
    Fitted estimator.

predict

predict(X)

Predict class labels.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.

Returns

y_pred : ndarray of shape (n_samples,)
    Predicted class labels.

predict_proba

predict_proba(X)

Predict class probabilities.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.

Returns

proba : ndarray of shape (n_samples, n_classes)
    Class probabilities.

OpenBoostDistributionalRegressor

OpenBoostDistributionalRegressor

OpenBoostDistributionalRegressor(
    distribution="normal",
    n_estimators=100,
    max_depth=4,
    learning_rate=0.1,
    min_child_weight=1.0,
    reg_lambda=1.0,
    n_bins=256,
    use_natural_gradient=True,
    verbose=0,
)

Bases: BaseEstimator, RegressorMixin

Distributional regression with sklearn-compatible interface.

Predicts full probability distributions instead of point estimates. Uses natural gradient boosting (NGBoost) by default for faster convergence.

Parameters

distribution : str, default='normal'
    Distribution family. Options: 'normal', 'lognormal', 'gamma', 'poisson', 'studentt'.
n_estimators : int, default=100
    Number of boosting rounds.
max_depth : int, default=4
    Maximum depth of each tree. Typically shallower than standard GBDT.
learning_rate : float, default=0.1
    Shrinkage factor.
min_child_weight : float, default=1.0
    Minimum sum of hessian in a leaf.
reg_lambda : float, default=1.0
    L2 regularization on leaf values.
n_bins : int, default=256
    Number of bins for histogram building.
use_natural_gradient : bool, default=True
    If True, use NGBoost (natural gradient). Recommended for faster
    convergence and better uncertainty calibration.
verbose : int, default=0
    Verbosity level.

Attributes

n_features_in_ : int
    Number of features seen during fit.
booster_ : NGBoost or DistributionalGBDT
    The underlying fitted model.

Examples

from openboost import OpenBoostDistributionalRegressor

model = OpenBoostDistributionalRegressor(distribution='normal')
model.fit(X_train, y_train)

Point prediction (mean)

y_pred = model.predict(X_test)

Prediction intervals (90%)

lower, upper = model.predict_interval(X_test, alpha=0.1)

Full distribution parameters

params = model.predict_distribution(X_test)
mu, sigma = params['loc'], params['scale']

Sample from predicted distribution

samples = model.sample(X_test, n_samples=100)

fit

fit(X, y, **kwargs)

Fit the distributional regressor.

Parameters

X : array-like of shape (n_samples, n_features)
    Training features.
y : array-like of shape (n_samples,)
    Target values.

Returns

self : OpenBoostDistributionalRegressor
    Fitted estimator.

predict

predict(X)

Predict mean (expected value).

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.

Returns

y_pred : ndarray of shape (n_samples,)
    Predicted mean values.

predict_interval

predict_interval(X, alpha=0.1)

Predict (1-alpha) prediction interval.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.
alpha : float, default=0.1
    Significance level. 0.1 gives a 90% prediction interval.

Returns

lower : ndarray of shape (n_samples,)
    Lower bounds of the interval.
upper : ndarray of shape (n_samples,)
    Upper bounds of the interval.
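For a Normal predictive distribution, the bounds are simply the alpha/2 and 1 - alpha/2 quantiles of each predicted distribution. A stdlib sketch of that relationship (illustrative; not the wrapper's internals):

```python
from statistics import NormalDist

def normal_interval(mu, sigma, alpha=0.1):
    """Central (1 - alpha) interval from per-sample Normal parameters."""
    lower = [NormalDist(m, s).inv_cdf(alpha / 2) for m, s in zip(mu, sigma)]
    upper = [NormalDist(m, s).inv_cdf(1 - alpha / 2) for m, s in zip(mu, sigma)]
    return lower, upper

# 90% interval: quantiles at 0.05 and 0.95 of each predicted Normal
lo, hi = normal_interval(mu=[0.0, 10.0], sigma=[1.0, 2.0], alpha=0.1)
```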

predict_distribution

predict_distribution(X)

Predict all distribution parameters.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.

Returns

params : dict
    Dictionary mapping parameter names to predicted values.
    For Normal: {'loc': mean, 'scale': std}

predict_quantile

predict_quantile(X, q)

Predict q-th quantile.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.
q : float
    Quantile level (0 < q < 1).

Returns

quantiles : ndarray of shape (n_samples,)
    Predicted quantiles.

sample

sample(X, n_samples=1, seed=None)

Sample from predicted distribution.

Parameters

X : array-like of shape (n_obs, n_features)
    Features to predict on.
n_samples : int, default=1
    Number of samples per observation.
seed : int, optional
    Random seed for reproducibility.

Returns

samples : ndarray of shape (n_obs, n_samples)
    Samples from the predicted distribution.

nll_score

nll_score(X, y)

Compute negative log-likelihood (lower is better).

Parameters

X : array-like of shape (n_samples, n_features)
    Features.
y : array-like of shape (n_samples,)
    True target values.

Returns

nll : float
    Mean negative log-likelihood.

OpenBoostLinearLeafRegressor

OpenBoostLinearLeafRegressor

OpenBoostLinearLeafRegressor(
    n_estimators=100,
    max_depth=4,
    learning_rate=0.1,
    loss="squared_error",
    min_samples_leaf=20,
    reg_lambda=1.0,
    reg_lambda_linear=0.1,
    max_features_linear="sqrt",
    n_bins=256,
    verbose=0,
)

Bases: BaseEstimator, RegressorMixin

Linear Leaf Gradient Boosting with sklearn-compatible interface.

Uses trees with linear models in leaves instead of constant values. This provides better extrapolation beyond the training data range.

Parameters

n_estimators : int, default=100
    Number of boosting rounds.
max_depth : int, default=4
    Maximum tree depth. Typically shallower than standard GBDT since linear
    models in leaves add flexibility.
learning_rate : float, default=0.1
    Shrinkage factor.
loss : str, default='squared_error'
    Loss function: 'squared_error', 'absolute_error', 'huber'.
min_samples_leaf : int, default=20
    Minimum samples in a leaf to fit linear model.
reg_lambda : float, default=1.0
    L2 regularization for tree splits.
reg_lambda_linear : float, default=0.1
    L2 regularization for linear models in leaves (ridge).
max_features_linear : int, str, or None, default='sqrt'
    Max features for linear model in each leaf:
    - None: Use all features
    - 'sqrt': Use sqrt(n_features)
    - 'log2': Use log2(n_features)
    - int: Use exactly this many features
n_bins : int, default=256
    Number of bins for histogram building.
verbose : int, default=0
    Verbosity level.

Attributes

n_features_in_ : int
    Number of features seen during fit.
booster_ : LinearLeafGBDT
    The underlying fitted model.

Examples

from openboost import OpenBoostLinearLeafRegressor

model = OpenBoostLinearLeafRegressor(n_estimators=100, max_depth=4)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Compare with standard GBDT on extrapolation tasks:
# LinearLeafRegressor typically performs better when the
# underlying relationship has linear components.

fit

fit(X, y, **kwargs)

Fit the linear leaf regressor.

Parameters

X : array-like of shape (n_samples, n_features)
    Training features.
y : array-like of shape (n_samples,)
    Target values.

Returns

self : OpenBoostLinearLeafRegressor
    Fitted estimator.

predict

predict(X)

Predict target values.

Parameters

X : array-like of shape (n_samples, n_features)
    Features to predict on.

Returns

y_pred : ndarray of shape (n_samples,)
    Predicted values.