Utilities¶
Helper functions for tuning, cross-validation, and evaluation.
Parameter Tuning¶
suggest_params¶
Suggest hyperparameters based on dataset characteristics.
This provides reasonable starting points based on heuristics. For best results, use these as initial values and tune with cross-validation.

| PARAMETER | DESCRIPTION |
|---|---|
| `X` | Feature matrix, shape (n_samples, n_features). |
| `y` | Target values, shape (n_samples,). |
| `task` | Type of task: 'regression', 'classification', or 'distributional'. |
| `n_estimators_cap` | Maximum number of estimators to suggest. |

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, Any]` | Dictionary of suggested hyperparameters suitable for passing to OpenBoostRegressor, OpenBoostClassifier, etc. |

Example

```python
params = suggest_params(X_train, y_train, task='regression')
model = OpenBoostRegressor(**params)
model.fit(X_train, y_train)
```
Notes
- For small datasets (< 1000 samples): Fewer trees, more regularization
- For large datasets (> 100k samples): More trees, lower learning rate
- For high-dimensional data: More column sampling, shallower trees
- For noisy data: Consider distributional models for uncertainty
get_param_grid¶
Get a suggested parameter grid for hyperparameter tuning.

| PARAMETER | DESCRIPTION |
|---|---|
| `task` | Type of task: 'regression', 'classification', or 'distributional'. |

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, list]` | Dictionary of parameter names to lists of values, suitable for sklearn's GridSearchCV or RandomizedSearchCV. |

Example

```python
from sklearn.model_selection import GridSearchCV
from openboost import OpenBoostRegressor
from openboost.utils import get_param_grid

param_grid = get_param_grid('regression')
search = GridSearchCV(OpenBoostRegressor(), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```
Cross-Validation¶
cross_val_predict¶
Generate out-of-fold predictions using cross-validation.
Each sample gets a prediction from a model that was not trained on it. Useful for stacking/blending in competitions and for honest evaluation.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | An OpenBoost model instance (will be cloned for each fold). |
| `X` | Feature matrix, shape (n_samples, n_features). |
| `y` | Target values, shape (n_samples,). |
| `cv` | Number of cross-validation folds. |
| `random_state` | Random seed for reproducible fold splits. |

| RETURNS | DESCRIPTION |
|---|---|
| `NDArray` | Out-of-fold predictions, shape (n_samples,) for regression or shape (n_samples, n_classes) for classification probabilities. |

Example

```python
from openboost import OpenBoostRegressor
from openboost.utils import cross_val_predict

model = OpenBoostRegressor(n_estimators=100)
oof_pred = cross_val_predict(model, X, y, cv=5)

# Use OOF predictions for stacking
from sklearn.linear_model import Ridge
meta_model = Ridge()
meta_model.fit(oof_pred.reshape(-1, 1), y)
```
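The out-of-fold mechanics can be sketched with NumPy alone. This is a conceptual sketch, not openboost's implementation; `fit_predict` and the mean-predicting toy model are hypothetical stand-ins for any train-then-predict routine:

```python
import numpy as np

def oof_predictions(fit_predict, X, y, cv=5, seed=0):
    """Out-of-fold predictions: each sample is predicted by a model
    trained only on the other folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, cv)
    oof = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(idx, test_idx)
        oof[test_idx] = fit_predict(X[train_idx], y[train_idx], X[test_idx])
    return oof

# Toy "model": always predicts the training-set mean
mean_model = lambda X_tr, y_tr, X_te: np.full(len(X_te), y_tr.mean())

X = np.arange(10, dtype=float).reshape(-1, 1)
y = X.ravel()
oof = oof_predictions(mean_model, X, y, cv=5)
```

Because each fold's prediction is the mean of the *other* samples, the OOF estimates are honest: no sample influences its own prediction.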
cross_val_predict_proba¶
Generate out-of-fold probability predictions using cross-validation.
Similar to cross_val_predict but returns class probabilities instead of class labels. Only works with classifiers.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | An OpenBoost classifier instance (must have predict_proba). |
| `X` | Feature matrix, shape (n_samples, n_features). |
| `y` | Target labels, shape (n_samples,). |
| `cv` | Number of cross-validation folds. |
| `random_state` | Random seed for reproducible fold splits. |

| RETURNS | DESCRIPTION |
|---|---|
| `NDArray` | Out-of-fold probability predictions, shape (n_samples, n_classes). |

Example

```python
from openboost import OpenBoostClassifier
from openboost.utils import cross_val_predict_proba

model = OpenBoostClassifier(n_estimators=100)
oof_proba = cross_val_predict_proba(model, X, y, cv=5)

# Use probabilities for stacking
meta_features = oof_proba[:, 1]  # P(class=1)
```

| RAISES | DESCRIPTION |
|---|---|
| `AttributeError` | If model doesn't have a predict_proba method. |
cross_val_predict_interval¶
Generate out-of-fold prediction intervals using cross-validation.
For distributional models that support uncertainty quantification. Returns lower and upper bounds of the prediction interval.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | An OpenBoost distributional model (must have predict_interval). |
| `X` | Feature matrix, shape (n_samples, n_features). |
| `y` | Target values, shape (n_samples,). |
| `alpha` | Significance level (0.1 = 90% interval). |
| `cv` | Number of cross-validation folds. |
| `random_state` | Random seed for reproducible fold splits. |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[NDArray, NDArray]` | Tuple of (lower_bounds, upper_bounds), each shape (n_samples,). |

Example

```python
from openboost import OpenBoostDistributionalRegressor
from openboost.utils import cross_val_predict_interval

model = OpenBoostDistributionalRegressor(distribution='normal')
lower, upper = cross_val_predict_interval(model, X, y, alpha=0.1)

# Check coverage
coverage = np.mean((y >= lower) & (y <= upper))
print(f"90% interval coverage: {coverage:.2%}")
```

| RAISES | DESCRIPTION |
|---|---|
| `AttributeError` | If model doesn't have a predict_interval method. |
evaluate_coverage¶
Evaluate prediction interval coverage and width.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True target values, shape (n_samples,). |
| `lower` | Lower bounds of intervals, shape (n_samples,). |
| `upper` | Upper bounds of intervals, shape (n_samples,). |
| `alpha` | Expected significance level (for reporting). |

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, float]` | Dictionary of coverage metrics, including 'coverage' and 'mean_width'. |

Example

```python
lower, upper = model.predict_interval(X_test, alpha=0.1)
metrics = evaluate_coverage(y_test, lower, upper, alpha=0.1)
print(f"Coverage: {metrics['coverage']:.2%}")
print(f"Mean width: {metrics['mean_width']:.4f}")
```
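The coverage and width computations are simple to reproduce. A minimal sketch (a hypothetical stand-in for evaluate_coverage; key names beyond 'coverage' and 'mean_width' are illustrative):

```python
import numpy as np

def coverage_metrics(y_true, lower, upper, alpha=0.1):
    """Empirical coverage and mean width of prediction intervals."""
    inside = (y_true >= lower) & (y_true <= upper)
    return {
        "coverage": float(np.mean(inside)),           # fraction of y inside interval
        "expected_coverage": 1.0 - alpha,             # nominal target
        "mean_width": float(np.mean(upper - lower)),  # average interval width
    }

y = np.array([1.0, 2.0, 3.0, 10.0])
lo = np.array([0.5, 1.5, 2.5, 11.0])
hi = np.array([1.5, 2.5, 3.5, 12.0])
m = coverage_metrics(y, lo, hi, alpha=0.1)
# coverage is 0.75 (the last point falls below its interval); mean width is 1.0
```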
Feature Importance¶
compute_feature_importances¶
Compute feature importances from any tree-based model.
Works with GradientBoosting, DART, OpenBoostGAM, MultiClassGradientBoosting, and any model with a trees_ attribute.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | A fitted model with a trees_ attribute. |
| `importance_type` | Type of importance calculation: 'frequency' (number of times a feature is used for splits; default), 'gain' (sum of gain from splits on each feature, if available), or 'cover' (sum of samples covered by splits, if available). |
| `normalize` | If True, normalize importances to sum to 1. |

| RETURNS | DESCRIPTION |
|---|---|
| `importances` | Array of shape (n_features,) with importance scores. |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | If model has no trees or unknown importance_type. |

Example

```python
model = ob.GradientBoosting(n_trees=100).fit(X, y)
importances = compute_feature_importances(model)

# Use with sklearn-style attribute
model.feature_importances_ = compute_feature_importances(model)
```
get_feature_importance_dict¶
Get feature importances as a sorted dictionary.
Convenience function that returns importances as a dict, optionally with feature names and limited to top N features.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | Fitted tree-based model. |
| `feature_names` | Optional list of feature names. |
| `importance_type` | Type of importance ('frequency', 'gain', 'cover'). |
| `top_n` | If provided, return only top N features. |

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, float]` | Dict mapping feature name/index to importance, sorted by importance. |

Example

```python
importance_dict = get_feature_importance_dict(
    model,
    feature_names=['age', 'income', 'score'],
    top_n=2,
)
# {'income': 0.45, 'age': 0.32}
```
plot_feature_importances¶

```python
plot_feature_importances(
    model,
    feature_names=None,
    importance_type="frequency",
    top_n=20,
    ax=None,
    **kwargs,
)
```

Plot feature importances as a horizontal bar chart.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | Fitted tree-based model. |
| `feature_names` | Optional list of feature names. Default: None. |
| `importance_type` | Type of importance ('frequency', 'gain', 'cover'). Default: "frequency". |
| `top_n` | Number of top features to show. Default: 20. |
| `ax` | Matplotlib axes to plot on (creates new if None). Default: None. |
| `**kwargs` | Additional arguments passed to barh(). |

| RETURNS | DESCRIPTION |
|---|---|
| `Axes` | Matplotlib axes object. |

Example

```python
plot_feature_importances(model, top_n=10)
plt.show()
```
Evaluation Metrics¶
Regression¶
mse_score¶
Compute Mean Squared Error.
Thin wrapper around sklearn.metrics.mean_squared_error with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `y_pred` | Predicted values, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | MSE (lower is better). Perfect predictions = 0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> y_pred = np.array([1.1, 2.0, 2.8])
>>> ob.mse_score(y_true, y_pred)
0.016666...
```
mae_score¶
Compute Mean Absolute Error.
Thin wrapper around sklearn.metrics.mean_absolute_error with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `y_pred` | Predicted values, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | MAE (lower is better). Perfect predictions = 0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> y_pred = np.array([1.1, 2.0, 2.8])
>>> ob.mae_score(y_true, y_pred)
0.1
```
rmse_score¶
Compute Root Mean Squared Error.
Thin wrapper around sklearn.metrics.mean_squared_error with squared=False.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `y_pred` | Predicted values, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | RMSE (lower is better). Perfect predictions = 0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> y_pred = np.array([1.1, 2.0, 2.8])
>>> ob.rmse_score(y_true, y_pred)
0.1291...
```
r2_score¶
Compute R² (coefficient of determination).
Thin wrapper around sklearn.metrics.r2_score with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `y_pred` | Predicted values, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | R² score. Perfect predictions = 1.0, baseline (mean) = 0.0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0, 4.0])
>>> y_pred = np.array([1.1, 1.9, 3.1, 3.9])
>>> ob.r2_score(y_true, y_pred)
0.992
```
Classification¶
accuracy_score¶
Compute classification accuracy.
Thin wrapper around sklearn.metrics.accuracy_score with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True labels, shape (n_samples,). |
| `y_pred` | Predicted labels, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Accuracy score between 0 and 1. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 1, 1, 0])
>>> y_pred = np.array([0, 1, 0, 0])
>>> ob.accuracy_score(y_true, y_pred)
0.75
```
roc_auc_score¶
Compute Area Under the ROC Curve (AUC).
Thin wrapper around sklearn.metrics.roc_auc_score with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True binary labels, shape (n_samples,). Values should be 0 or 1. |
| `y_score` | Predicted scores/probabilities, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | AUC score between 0 and 1. Random classifier = 0.5, perfect = 1.0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 0, 1, 1])
>>> y_score = np.array([0.1, 0.4, 0.35, 0.8])
>>> ob.roc_auc_score(y_true, y_score)
0.75
```
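AUC has a useful pairwise interpretation: it is the probability that a randomly chosen positive outscores a randomly chosen negative. A small sketch of that view (not openboost's implementation, which delegates to sklearn):

```python
import numpy as np

def auc_pairwise(y_true, y_score):
    """AUC as a pair count: P(score of random positive > score of random
    negative), with ties counted as one half. O(n_pos * n_neg), fine for small n."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

auc = auc_pairwise(np.array([0, 0, 1, 1]), np.array([0.1, 0.4, 0.35, 0.8]))
# → 0.75, matching the example above (3 of the 4 positive/negative pairs are ordered correctly)
```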
log_loss_score¶
Compute log loss (cross-entropy loss) for binary classification.
Thin wrapper around sklearn.metrics.log_loss with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True binary labels, shape (n_samples,). Values should be 0 or 1. |
| `y_pred` | Predicted probabilities for positive class, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Log loss (lower is better). Perfect predictions = 0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 0, 1, 1])
>>> y_pred = np.array([0.1, 0.2, 0.7, 0.9])
>>> ob.log_loss_score(y_true, y_pred)
0.1976...
```
f1_score¶
Compute F1 score (harmonic mean of precision and recall).
Thin wrapper around sklearn.metrics.f1_score with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True labels, shape (n_samples,). |
| `y_pred` | Predicted labels, shape (n_samples,). |
| `average` | Averaging method for multi-class: 'binary' (binary classification only), 'micro' (global TP, FP, FN counts), 'macro' (unweighted mean of per-class F1), 'weighted' (mean weighted by support). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | F1 score between 0 and 1. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 1, 1, 0, 1])
>>> y_pred = np.array([0, 1, 0, 0, 1])
>>> ob.f1_score(y_true, y_pred)
0.8
```
precision_score¶
Compute precision (positive predictive value).
Thin wrapper around sklearn.metrics.precision_score with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True labels, shape (n_samples,). |
| `y_pred` | Predicted labels, shape (n_samples,). |
| `average` | Averaging method for multi-class: 'binary' (binary classification only), 'micro' (global TP, FP counts), 'macro' (unweighted mean of per-class precision), 'weighted' (mean weighted by support). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Precision score between 0 and 1. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 1, 1, 0, 1])
>>> y_pred = np.array([0, 1, 0, 1, 1])
>>> ob.precision_score(y_true, y_pred)
0.666...
```
recall_score¶
Compute recall (sensitivity, true positive rate).
Thin wrapper around sklearn.metrics.recall_score with sample weight support.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True labels, shape (n_samples,). |
| `y_pred` | Predicted labels, shape (n_samples,). |
| `average` | Averaging method for multi-class: 'binary' (binary classification only), 'micro' (global TP, FN counts), 'macro' (unweighted mean of per-class recall), 'weighted' (mean weighted by support). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Recall score between 0 and 1. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 1, 1, 0, 1])
>>> y_pred = np.array([0, 1, 0, 0, 1])
>>> ob.recall_score(y_true, y_pred)
0.666...
```
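Precision, recall, and F1 all derive from the same confusion counts. A minimal sketch tying the three together (standard definitions, not openboost's code):

```python
import numpy as np

def binary_prf(y_true, y_pred):
    """Precision, recall, and F1 computed from raw TP/FP/FN counts."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f = binary_prf(np.array([0, 1, 1, 0, 1]), np.array([0, 1, 0, 0, 1]))
# p = 1.0, r = 2/3, f = 0.8 — consistent with the f1_score and recall_score examples above
```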
Probabilistic Metrics¶
crps_gaussian¶
Compute Continuous Ranked Probability Score for Gaussian predictions.
CRPS is a strictly proper scoring rule that measures the quality of probabilistic predictions. Lower is better. For Gaussian distributions, there is a closed-form solution.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `mean` | Predicted mean, shape (n_samples,). |
| `std` | Predicted standard deviation, shape (n_samples,). Must be > 0. |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Mean CRPS (lower is better). Perfect calibration minimizes CRPS. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> mean = np.array([1.1, 2.0, 2.8])
>>> std = np.array([0.5, 0.5, 0.5])
>>> ob.crps_gaussian(y_true, mean, std)
0.130...
```

Notes

CRPS formula for Gaussian:

    CRPS(N(μ, σ²), y) = σ * [z * (2Φ(z) - 1) + 2φ(z) - 1/√π]

where z = (y - μ) / σ, and Φ and φ are the CDF and PDF of the standard normal.

For NaturalBoost models, use:

```python
output = model.predict_distribution(X)
mean, std = output.params[:, 0], np.sqrt(output.params[:, 1])
crps = ob.crps_gaussian(y_true, mean, std)
```
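The closed-form expression above is easy to implement directly. A NumPy/stdlib-only sketch of what such a function computes (a standalone illustration, not openboost's implementation):

```python
import math
import numpy as np

def crps_gaussian_np(y, mu, sigma):
    """Closed-form Gaussian CRPS, following the formula in the Notes."""
    z = (y - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2)))  # Φ(z)
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2 * math.pi)            # φ(z)
    return np.mean(sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi)))

# Sanity check: at y == mu with sigma == 1, CRPS = (sqrt(2) - 1) / sqrt(pi) ≈ 0.2337
val = crps_gaussian_np(np.array([0.0]), np.array([0.0]), np.array([1.0]))
```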
crps_empirical¶
Compute CRPS using the empirical distribution from Monte Carlo samples.
For non-Gaussian distributions, CRPS can be estimated from samples. This is useful for NaturalBoost models with non-Normal distributions.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `samples` | Monte Carlo samples, shape (n_samples, n_mc_samples). Each row contains samples from the predictive distribution. |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Mean CRPS estimated from samples (lower is better). |

Example

```python
model = ob.NaturalBoostGamma(n_trees=100)
model.fit(X_train, y_train)
samples = model.sample(X_test, n_samples=1000)  # (n_test, 1000)
crps = ob.crps_empirical(y_test, samples)
```

Notes

Uses the formula CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X' are independent samples from the predictive distribution.
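The sample-based formula in the Notes can be sketched directly in NumPy (a hypothetical standalone version; the pairwise term is O(n_mc²) per row, so real implementations often use a sort-based estimator instead):

```python
import numpy as np

def crps_from_samples(y, samples):
    """Empirical CRPS: E|X - y| - 0.5 * E|X - X'|, per row of (n_samples, n_mc)."""
    term1 = np.mean(np.abs(samples - y[:, None]), axis=1)
    # Pairwise |X - X'| over the Monte Carlo draws of each row
    term2 = np.mean(np.abs(samples[:, :, None] - samples[:, None, :]), axis=(1, 2))
    return float(np.mean(term1 - 0.5 * term2))

# Degenerate check: if every draw equals c, CRPS reduces to |c - y|
val = crps_from_samples(np.array([1.0]), np.full((1, 100), 2.0))
# → 1.0
```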
brier_score¶
Compute Brier score for probabilistic binary classification.
Brier score measures the mean squared error of probability predictions. It is a strictly proper scoring rule for binary outcomes.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True binary labels, shape (n_samples,). Values should be 0 or 1. |
| `y_prob` | Predicted probabilities for positive class, shape (n_samples,). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Brier score (lower is better). Perfect predictions = 0, random = 0.25. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 0, 1, 1])
>>> y_prob = np.array([0.1, 0.2, 0.8, 0.9])
>>> ob.brier_score(y_true, y_prob)
0.025
```

Notes

Brier score = mean((y_prob - y_true)²)

Decomposition: Brier = Reliability - Resolution + Uncertainty

- Reliability: calibration error (how well probabilities match frequencies)
- Resolution: how different predictions are from the base rate
- Uncertainty: entropy of the outcome distribution
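The formula in the Notes is a one-liner, which makes it easy to verify the example by hand (standard definition, not openboost's code):

```python
import numpy as np

def brier(y_true, y_prob):
    """Brier score: mean squared error of probability predictions."""
    return float(np.mean((y_prob - y_true) ** 2))

score = brier(np.array([0, 0, 1, 1]), np.array([0.1, 0.2, 0.8, 0.9]))
# → 0.025, matching the example above: (0.01 + 0.04 + 0.04 + 0.01) / 4
```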
pinball_loss¶
Compute pinball loss (quantile loss) for quantile regression.
Pinball loss is the proper scoring rule for quantile estimation. At quantile=0.5, it equals MAE.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `y_pred` | Predicted quantile values, shape (n_samples,). |
| `quantile` | The quantile being predicted, in (0, 1). Default 0.5 (median). |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Pinball loss (lower is better). |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> y_pred_median = np.array([1.1, 2.0, 2.8])
>>> ob.pinball_loss(y_true, y_pred_median, quantile=0.5)
0.1

>>> # Lower quantile (e.g., 10th percentile)
>>> y_pred_q10 = np.array([0.5, 1.5, 2.0])
>>> ob.pinball_loss(y_true, y_pred_q10, quantile=0.1)
```

Notes

Pinball loss: L(y, q) = (y - q) * τ if y >= q else (q - y) * (1 - τ), where τ is the quantile.

For prediction intervals from NaturalBoost:

```python
lower, upper = model.predict_interval(X, alpha=0.1)  # 90% interval
loss_lower = ob.pinball_loss(y, lower, quantile=0.05)
loss_upper = ob.pinball_loss(y, upper, quantile=0.95)
```
interval_score¶
Compute interval score for prediction intervals.
Interval score is a strictly proper scoring rule for prediction intervals. It rewards narrow intervals while penalizing miscoverage.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `lower` | Lower bound of prediction interval, shape (n_samples,). |
| `upper` | Upper bound of prediction interval, shape (n_samples,). |
| `alpha` | Nominal miscoverage rate (0.1 for 90% interval). Default 0.1. |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Interval score (lower is better). |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> lower = np.array([0.5, 1.5, 2.5])
>>> upper = np.array([1.5, 2.5, 3.5])
>>> ob.interval_score(y_true, lower, upper, alpha=0.1)
1.0
```

Notes

    Interval Score = (upper - lower)
                   + (2/α) * (lower - y) * I(y < lower)
                   + (2/α) * (y - upper) * I(y > upper)

The score combines:
1. Interval width (prefer narrow intervals)
2. Penalty for observations below the lower bound
3. Penalty for observations above the upper bound

Use with NaturalBoost:

```python
lower, upper = model.predict_interval(X_test, alpha=0.1)
score = ob.interval_score(y_test, lower, upper, alpha=0.1)
```
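The three-term formula above translates directly into NumPy. A sketch reproducing the example's result (a standalone illustration, not openboost's implementation):

```python
import numpy as np

def interval_score_np(y, lower, upper, alpha=0.1):
    """Winkler/interval score: width plus 2/alpha penalties for misses."""
    width = upper - lower
    below = (2 / alpha) * (lower - y) * (y < lower)  # penalty when y under-covers
    above = (2 / alpha) * (y - upper) * (y > upper)  # penalty when y over-covers
    return float(np.mean(width + below + above))

s = interval_score_np(
    np.array([1.0, 2.0, 3.0]),
    np.array([0.5, 1.5, 2.5]),
    np.array([1.5, 2.5, 3.5]),
    alpha=0.1,
)
# → 1.0: every observation is covered, so the score is just the mean width
```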
expected_calibration_error¶
Compute Expected Calibration Error (ECE) for probability predictions.
ECE measures the miscalibration of predicted probabilities. A well-calibrated model has ECE close to 0.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True binary labels, shape (n_samples,). Values should be 0 or 1. |
| `y_prob` | Predicted probabilities for positive class, shape (n_samples,). |
| `n_bins` | Number of bins to use. Default 10. |
| `strategy` | Binning strategy: 'uniform' (bins of equal width in [0, 1]) or 'quantile' (bins with equal number of samples). |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | ECE (lower is better). Perfect calibration = 0. |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([0, 0, 1, 1, 1])
>>> y_prob = np.array([0.1, 0.3, 0.6, 0.8, 0.9])
>>> ob.expected_calibration_error(y_true, y_prob)
0.06
```

Notes

ECE = Σ (|bin_size| / n) * |accuracy_in_bin - mean_confidence_in_bin|

For reliability diagrams, use calibration_curve to get bin data.
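The binned formula in the Notes can be sketched with uniform-width bins (a hypothetical standalone version; openboost's binning details, such as edge handling, may differ):

```python
import numpy as np

def ece_uniform(y_true, y_prob, n_bins=10):
    """ECE with uniform-width bins: sample-weighted mean gap between
    per-bin accuracy and per-bin mean confidence."""
    bins = np.clip((y_prob * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap  # weight by bin's share of samples
    return ece

# A perfectly calibrated toy case: confidence 0.5 everywhere, empirical rate 0.5
val = ece_uniform(np.array([0, 1]), np.array([0.5, 0.5]))
# → 0.0
```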
calibration_curve¶
Compute calibration curve data for reliability diagrams.
Returns the fraction of positives and mean predicted probability for each bin. This data can be used to create reliability diagrams.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True binary labels, shape (n_samples,). Values should be 0 or 1. |
| `y_prob` | Predicted probabilities for positive class, shape (n_samples,). |
| `n_bins` | Number of bins to use. Default 10. |
| `strategy` | Binning strategy: 'uniform' (bins of equal width in [0, 1]) or 'quantile' (bins with equal number of samples). |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[NDArray, NDArray, NDArray]` | Tuple of (fraction_of_positives, mean_predicted_value, bin_counts). |

Example

```python
import openboost as ob
import matplotlib.pyplot as plt

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 1])
y_prob = np.array([0.1, 0.2, 0.3, 0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.9])
frac_pos, mean_pred, counts = ob.calibration_curve(y_true, y_prob, n_bins=5)

# Plot reliability diagram
plt.plot([0, 1], [0, 1], 'k--', label='Perfectly calibrated')
plt.plot(mean_pred, frac_pos, 's-', label='Model')
plt.xlabel('Mean predicted probability')
plt.ylabel('Fraction of positives')
plt.legend()
```
negative_log_likelihood¶
Compute negative log-likelihood for Gaussian predictions.
NLL is a proper scoring rule for probabilistic predictions. Lower is better.

| PARAMETER | DESCRIPTION |
|---|---|
| `y_true` | True values, shape (n_samples,). |
| `mean` | Predicted mean, shape (n_samples,). |
| `std` | Predicted standard deviation, shape (n_samples,). Must be > 0. |
| `sample_weight` | Sample weights, shape (n_samples,). If None, uniform weights. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Mean negative log-likelihood (lower is better). |

Example

```python
>>> import openboost as ob
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> mean = np.array([1.1, 2.0, 2.8])
>>> std = np.array([0.5, 0.5, 0.5])
>>> ob.negative_log_likelihood(y_true, mean, std)
0.259...
```

Notes

NLL = 0.5 * log(2π) + log(σ) + (y - μ)² / (2σ²)

For NaturalBoost Normal models:

```python
output = model.predict_distribution(X)
mean, var = output.params[:, 0], output.params[:, 1]
nll = ob.negative_log_likelihood(y, mean, np.sqrt(var))
```
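The NLL formula in the Notes is straightforward to reproduce (standard Gaussian NLL, not openboost's code):

```python
import math
import numpy as np

def gaussian_nll(y, mu, sigma):
    """Mean Gaussian NLL: 0.5*log(2*pi) + log(sigma) + (y - mu)^2 / (2*sigma^2)."""
    return float(np.mean(
        0.5 * math.log(2 * math.pi) + np.log(sigma) + (y - mu) ** 2 / (2 * sigma**2)
    ))

# Sanity check: at y == mu with sigma == 1, NLL = 0.5 * log(2*pi) ≈ 0.9189
val = gaussian_nll(np.array([0.0]), np.array([0.0]), np.array([1.0]))
```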
Sampling¶
goss_sample¶
Gradient-based One-Side Sampling (GOSS).
GOSS keeps all samples with large gradient magnitudes (important for learning) and randomly samples from the rest. Small-gradient samples are upweighted to maintain unbiased gradient estimates.
This gives ~3x speedup with minimal accuracy loss compared to random subsampling.

Algorithm

1. Sort samples by |gradient|
2. Keep the top `top_rate` fraction by gradient magnitude
3. Randomly sample an `other_rate` fraction from the rest
4. Upweight small-gradient samples by (1 - top_rate) / other_rate

| PARAMETER | DESCRIPTION |
|---|---|
| `grad` | Gradient array, shape (n_samples,) or (n_samples, n_params). For multi-parameter distributions, uses sum of absolute gradients. |
| `hess` | Hessian array (unused, for API compatibility). |
| `top_rate` | Fraction of high-gradient samples to keep (default 0.2). |
| `other_rate` | Fraction of low-gradient samples to sample (default 0.1). |
| `seed` | Random seed for reproducibility. |

| RETURNS | DESCRIPTION |
|---|---|
| `SamplingResult` | SamplingResult with selected indices and weights. |

Example

```python
grad = compute_gradients(pred, y)
result = goss_sample(grad, top_rate=0.2, other_rate=0.1)

# Use selected samples for histogram building
hist = build_histogram(X[result.indices],
                       grad[result.indices] * result.weights,
                       hess[result.indices] * result.weights)
```

References

- LightGBM paper: https://papers.nips.cc/paper/6907-lightgbm
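The four-step algorithm above can be sketched in a few lines of NumPy. This is a standalone sketch of the technique, not openboost's implementation (which returns a SamplingResult object):

```python
import numpy as np

def goss_indices(grad, top_rate=0.2, other_rate=0.1, seed=0):
    """GOSS: keep the top_rate highest-|grad| samples, randomly sample an
    other_rate fraction of the rest, and upweight the sampled ones by
    (1 - top_rate) / other_rate to keep gradient sums unbiased."""
    n = len(grad)
    n_top = int(n * top_rate)
    n_other = int(n * other_rate)
    order = np.argsort(-np.abs(grad))  # indices sorted by descending |gradient|
    top = order[:n_top]
    rng = np.random.default_rng(seed)
    rest = rng.choice(order[n_top:], size=n_other, replace=False)
    indices = np.concatenate([top, rest])
    weights = np.ones(len(indices))
    weights[n_top:] = (1 - top_rate) / other_rate  # unbias small-gradient samples
    return indices, weights

grad = np.linspace(-1, 1, 100)
idx, w = goss_indices(grad, top_rate=0.2, other_rate=0.1)
# 30 indices total: 20 kept with weight 1.0, 10 sampled with weight 8.0
```

With top_rate=0.2 and other_rate=0.1, only 30% of samples enter histogram building, which is where the speedup comes from.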
MiniBatchIterator¶
Iterator for mini-batch training.
Yields chunks of sample indices for processing datasets larger than memory.

| PARAMETER | DESCRIPTION |
|---|---|
| `n_samples` | Total number of samples. |
| `batch_size` | Number of samples per batch. |
| `shuffle` | Whether to shuffle indices before iteration. |
| `seed` | Random seed for shuffling. |

Example

```python
iterator = MiniBatchIterator(n_samples=10_000_000, batch_size=100_000)
for batch_indices in iterator:
    # Load batch data
    X_batch = load_batch(X_mmap, batch_indices)
    grad_batch = grad[batch_indices]
    hess_batch = hess[batch_indices]

    # Build and accumulate histogram
    batch_hist = build_histogram(X_batch, grad_batch, hess_batch)
    total_hist += batch_hist
```
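The iterator's behavior can be sketched in a few lines. A minimal stand-in (not openboost's class; the name `BatchIterator` is hypothetical):

```python
import numpy as np

class BatchIterator:
    """Minimal mini-batch index iterator: yields index chunks of at most
    batch_size, together covering every sample exactly once."""
    def __init__(self, n_samples, batch_size, shuffle=False, seed=0):
        self.idx = np.arange(n_samples)
        if shuffle:
            np.random.default_rng(seed).shuffle(self.idx)
        self.batch_size = batch_size

    def __iter__(self):
        for start in range(0, len(self.idx), self.batch_size):
            yield self.idx[start:start + self.batch_size]

batches = list(BatchIterator(n_samples=10, batch_size=4))
# 3 batches of sizes 4, 4, 2 — the last batch is simply smaller
```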
create_memmap_binned¶
Create a memory-mapped binned array for large datasets.
Bins the data and saves it to disk as a memory-mapped file, enabling training on datasets larger than RAM.

| PARAMETER | DESCRIPTION |
|---|---|
| `path` | Path to save the memory-mapped file. |
| `X` | Input features, shape (n_samples, n_features). |
| `n_bins` | Number of bins for quantile binning. |

| RETURNS | DESCRIPTION |
|---|---|
| `memmap` | Memory-mapped binned array, shape (n_features, n_samples). |
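The quantile-bin-then-memmap idea can be sketched with np.memmap. This is a sketch under the assumptions that bin codes fit in uint8 and the on-disk layout is feature-major (n_features, n_samples), as the RETURNS section describes; it is not openboost's implementation:

```python
import os
import tempfile
import numpy as np

def make_binned_memmap(path, X, n_bins=256):
    """Quantile-bin each feature into integer codes and store them
    feature-major in a disk-backed memmap."""
    n_samples, n_features = X.shape
    mm = np.memmap(path, dtype=np.uint8, mode="w+", shape=(n_features, n_samples))
    for j in range(n_features):
        # Interior quantile edges; searchsorted maps each value to its bin code
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        mm[j] = np.searchsorted(edges, X[:, j]).astype(np.uint8)
    mm.flush()
    return mm

X = np.random.default_rng(0).normal(size=(1000, 3))
path = os.path.join(tempfile.mkdtemp(), "binned.dat")
binned = make_binned_memmap(path, X, n_bins=16)
```

The feature-major layout keeps each feature's bin codes contiguous on disk, which is the access pattern histogram building needs.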
load_memmap_binned¶
Load a memory-mapped binned array.

| PARAMETER | DESCRIPTION |
|---|---|
| `path` | Path to the memory-mapped file. |
| `n_features` | Number of features. |
| `n_samples` | Number of samples. |

| RETURNS | DESCRIPTION |
|---|---|
| `memmap` | Memory-mapped binned array, shape (n_features, n_samples). |