Metrics

Evaluation metrics for XGBoost/LightGBM that work with custom objectives.

Important: Disable Default Metrics

When using custom objectives, XGBoost's default evaluation metrics may not be meaningful. Always set 'disable_default_eval_metric': 1 in params for XGBoost, or 'metric': 'None' for LightGBM.


Quick Start

import xgboost as xgb

from jaxboost.objective import ordinal_logit
from jaxboost.metric import qwk_metric

# Create ordinal objective
ordinal = ordinal_logit(n_classes=6)
ordinal.init_thresholds_from_data(y_train)

# Build the QWK metric, decoding raw scores with the objective's predict
qwk = qwk_metric(n_classes=6, transform=ordinal.predict)

# Train with the custom objective and metric
model = xgb.train(
    {'disable_default_eval_metric': 1, 'max_depth': 4},
    dtrain,
    obj=ordinal.xgb_objective,
    custom_metric=qwk.xgb_metric,
    evals=[(dtest, 'test')]
)

Base Classes

Metric

Metric(name: str, fn: Callable[[ndarray, ndarray], float], transform: Callable[[ndarray], ndarray] | None = None, higher_is_better: bool = True)

Base class for XGBoost/LightGBM evaluation metrics.

Provides both XGBoost and LightGBM compatible interfaces.

Parameters:

- name (str): Metric name displayed during training. Required.
- fn (Callable[[ndarray, ndarray], float]): Metric function (y_true, y_pred) -> float. Required.
- transform (Callable[[ndarray], ndarray] | None): Optional prediction transform (e.g., sigmoid for binary). Default: None.
- higher_is_better (bool): Whether higher metric values are better. Default: True.
Example

metric = Metric(
    name='accuracy',
    fn=lambda y, p: (y == p).mean(),
    transform=lambda p: (p > 0.5).astype(int),
    higher_is_better=True
)

# Use with XGBoost
model = xgb.train(params, dtrain, custom_metric=metric.xgb_metric)

# Use with LightGBM
model = lgb.train(params, train_data, feval=metric.lgb_metric)

Source code in src/jaxboost/metric/base.py
def __init__(
    self,
    name: str,
    fn: Callable[[np.ndarray, np.ndarray], float],
    transform: Callable[[np.ndarray], np.ndarray] | None = None,
    higher_is_better: bool = True,
):
    self.name = name
    self.fn = fn
    self.transform = transform
    self.higher_is_better = higher_is_better

__call__

__call__(y_true: ndarray, y_pred: ndarray) -> float

Compute metric value.

Source code in src/jaxboost/metric/base.py
def __call__(self, y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Compute metric value."""
    if self.transform is not None:
        y_pred = self.transform(y_pred)
    return float(self.fn(y_true, y_pred))
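
Calling a Metric directly applies the transform (if any) before the metric function, so the same object works on raw scores. A toy sketch (arrays are illustrative, and it assumes Metric is importable from jaxboost.metric):

import numpy as np
from jaxboost.metric import Metric  # assumed export; Metric lives in src/jaxboost/metric/base.py

acc = Metric(
    name='accuracy',
    fn=lambda y, p: (y == p).mean(),
    transform=lambda p: (p > 0.5).astype(int),
)
raw = np.array([0.9, 0.2, 0.7])  # raw model scores
y = np.array([1, 0, 0])
print(acc(y, raw))  # transform gives [1, 0, 1]; accuracy = 2/3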

xgb_metric

xgb_metric(predt: ndarray, dtrain: Any) -> tuple[str, float]

XGBoost-compatible metric function.

Parameters:

- predt (ndarray): Raw predictions from model. Required.
- dtrain (Any): XGBoost DMatrix with labels. Required.

Returns:

- tuple[str, float]: (metric_name, metric_value)

Source code in src/jaxboost/metric/base.py
def xgb_metric(self, predt: np.ndarray, dtrain: Any) -> tuple[str, float]:
    """
    XGBoost-compatible metric function.

    Args:
        predt: Raw predictions from model
        dtrain: XGBoost DMatrix with labels

    Returns:
        (metric_name, metric_value)
    """
    y_true = dtrain.get_label()
    y_pred = self.transform(predt) if self.transform else predt
    value = self.fn(y_true, y_pred)
    return self.name, float(value)

lgb_metric

lgb_metric(preds: ndarray, eval_data: Any) -> tuple[str, float, bool]

LightGBM-compatible metric function.

Parameters:

- preds (ndarray): Raw predictions from model. Required.
- eval_data (Any): LightGBM Dataset with labels. Required.

Returns:

- tuple[str, float, bool]: (metric_name, metric_value, is_higher_better)

Source code in src/jaxboost/metric/base.py
def lgb_metric(self, preds: np.ndarray, eval_data: Any) -> tuple[str, float, bool]:
    """
    LightGBM-compatible metric function.

    Args:
        preds: Raw predictions from model
        eval_data: LightGBM Dataset with labels

    Returns:
        (metric_name, metric_value, is_higher_better)
    """
    y_true = eval_data.get_label()
    y_pred = self.transform(preds) if self.transform else preds
    value = self.fn(y_true, y_pred)
    return self.name, float(value), self.higher_is_better
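
The two adapters differ only in how labels are fetched and in the return tuple. A sketch of calling both directly, where acc, raw_scores, dtrain, and lgb_dataset are hypothetical stand-ins for a Metric, raw model output, an xgb.DMatrix, and a lgb.Dataset:

# XGBoost passes a DMatrix; the adapter returns (name, value)
name, value = acc.xgb_metric(raw_scores, dtrain)

# LightGBM passes a Dataset; the adapter also reports metric direction
name, value, higher_is_better = acc.lgb_metric(raw_scores, lgb_dataset)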

make_metric

make_metric(name: str, transform: Callable[[ndarray], ndarray] | None = None, higher_is_better: bool = True) -> Callable

Decorator to create XGBoost/LightGBM compatible metrics.

Parameters:

- name (str): Metric name shown during training. Required.
- transform (Callable[[ndarray], ndarray] | None): Optional function to transform raw predictions. Default: None.
- higher_is_better (bool): Whether higher values are better. Default: True.
Example

@make_metric('my_accuracy', transform=lambda p: (p > 0.5).astype(int))
def my_accuracy(y_true, y_pred):
    return (y_true == y_pred).mean()

# Use with XGBoost
model = xgb.train(params, dtrain, custom_metric=my_accuracy.xgb_metric)

Returns:

- Callable: Decorated function that has .xgb_metric and .lgb_metric attributes

Source code in src/jaxboost/metric/base.py
def make_metric(
    name: str,
    transform: Callable[[np.ndarray], np.ndarray] | None = None,
    higher_is_better: bool = True,
) -> Callable:
    """
    Decorator to create XGBoost/LightGBM compatible metrics.

    Args:
        name: Metric name shown during training
        transform: Optional function to transform raw predictions
        higher_is_better: Whether higher values are better

    Example:
        >>> @make_metric('my_accuracy', transform=lambda p: (p > 0.5).astype(int))
        ... def my_accuracy(y_true, y_pred):
        ...     return (y_true == y_pred).mean()
        >>>
        >>> # Use with XGBoost
        >>> model = xgb.train(params, dtrain, custom_metric=my_accuracy.xgb_metric)

    Returns:
        Decorated function that has .xgb_metric and .lgb_metric attributes
    """

    def decorator(fn: Callable[[np.ndarray, np.ndarray], float]) -> Metric:
        return Metric(
            name=name,
            fn=fn,
            transform=transform,
            higher_is_better=higher_is_better,
        )

    return decorator
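
The decorated object is a full Metric, so the LightGBM adapter is available as well; a one-line sketch (params and train_data as in the examples above):

model = lgb.train({'metric': 'None'}, train_data, feval=my_accuracy.lgb_metric)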

Ordinal Metrics

For ordered categorical outcomes (ratings, grades, severity levels).

qwk_metric

qwk_metric(n_classes: int, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create Quadratic Weighted Kappa metric for ordinal regression.

QWK penalizes errors quadratically by their distance from the true class. Perfect agreement = 1, random agreement ≈ 0, worse than random < 0.

Parameters:

- n_classes (int): Number of ordinal classes. Required.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to class labels. If None, predictions are rounded and clipped. Default: None.

Returns:

- Metric: Metric object with .xgb_metric and .lgb_metric methods

Example

# With ordinal objective
ordinal = ordinal_logit(n_classes=6)
qwk = qwk_metric(n_classes=6, transform=ordinal.predict)

model = xgb.train(
    {'disable_default_eval_metric': 1},
    dtrain, obj=ordinal.xgb_objective,
    custom_metric=qwk.xgb_metric
)

Source code in src/jaxboost/metric/ordinal.py
def qwk_metric(
    n_classes: int, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create Quadratic Weighted Kappa metric for ordinal regression.

    QWK penalizes predictions quadratically by distance from truth.
    Perfect agreement = 1, random agreement ≈ 0, worse than random < 0.

    Args:
        n_classes: Number of ordinal classes
        transform: Optional function to convert raw predictions to class labels.
                   If None, predictions are rounded and clipped.

    Returns:
        Metric object with .xgb_metric and .lgb_metric methods

    Example:
        >>> # With ordinal objective
        >>> ordinal = ordinal_logit(n_classes=6)
        >>> qwk = qwk_metric(n_classes=6, transform=ordinal.predict)
        >>>
        >>> model = xgb.train(
        ...     {'disable_default_eval_metric': 1},
        ...     dtrain, obj=ordinal.xgb_objective,
        ...     custom_metric=qwk.xgb_metric
        ... )
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        # Default: round and clip
        return np.clip(np.round(predt), 0, n_classes - 1).astype(int)

    def _qwk(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return _compute_qwk(y_true, y_pred, n_classes)

    return Metric(
        name="qwk",
        fn=_qwk,
        transform=_transform,
        higher_is_better=True,
    )
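
The _compute_qwk helper is internal and not shown on this page. For reference, quadratic weighted kappa follows the standard definition QWK = 1 - sum(w * O) / sum(w * E), with weights w[i, j] = (i - j)^2 / (K - 1)^2, observed confusion matrix O, and expected matrix E from the marginals. A minimal numpy sketch of that standard formula (not necessarily the library's exact implementation):

import numpy as np

def qwk_reference(y_true, y_pred, n_classes):
    # Observed confusion matrix O
    O = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true.astype(int), y_pred.astype(int)):
        O[t, p] += 1
    # Quadratic disagreement weights
    i, j = np.indices((n_classes, n_classes))
    w = (i - j) ** 2 / (n_classes - 1) ** 2
    # Expected matrix under independence of the marginals
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (w * O).sum() / (w * E).sum()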

ordinal_mae_metric

ordinal_mae_metric(n_classes: int, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create Mean Absolute Error metric for ordinal regression.

Measures average distance between predicted and true class.

Parameters:

- n_classes (int): Number of ordinal classes. Required.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to class labels. Default: None.

Returns:

- Metric: Metric object

Example

mae = ordinal_mae_metric(n_classes=6)
model = xgb.train(params, dtrain, custom_metric=mae.xgb_metric)

Source code in src/jaxboost/metric/ordinal.py
def ordinal_mae_metric(
    n_classes: int, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create Mean Absolute Error metric for ordinal regression.

    Measures average distance between predicted and true class.

    Args:
        n_classes: Number of ordinal classes
        transform: Optional function to convert raw predictions to class labels

    Returns:
        Metric object

    Example:
        >>> mae = ordinal_mae_metric(n_classes=6)
        >>> model = xgb.train(params, dtrain, custom_metric=mae.xgb_metric)
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        return np.clip(np.round(predt), 0, n_classes - 1).astype(int)

    def _mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean(np.abs(y_true - y_pred))

    return Metric(
        name="ordinal_mae",
        fn=_mae,
        transform=_transform,
        higher_is_better=False,
    )

ordinal_accuracy_metric

ordinal_accuracy_metric(n_classes: int, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create exact accuracy metric for ordinal regression.

Measures proportion of exactly correct predictions.

Parameters:

- n_classes (int): Number of ordinal classes. Required.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to class labels. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/ordinal.py
def ordinal_accuracy_metric(
    n_classes: int, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create exact accuracy metric for ordinal regression.

    Measures proportion of exactly correct predictions.

    Args:
        n_classes: Number of ordinal classes
        transform: Optional function to convert raw predictions to class labels

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        return np.clip(np.round(predt), 0, n_classes - 1).astype(int)

    def _acc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean(y_true.astype(int) == y_pred.astype(int))

    return Metric(
        name="ordinal_acc",
        fn=_acc,
        transform=_transform,
        higher_is_better=True,
    )

adjacent_accuracy_metric

adjacent_accuracy_metric(n_classes: int, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create adjacent accuracy metric (within ±1) for ordinal regression.

Measures proportion of predictions within 1 class of truth. Useful when exact prediction is difficult but close is acceptable.

Parameters:

- n_classes (int): Number of ordinal classes. Required.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to class labels. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/ordinal.py
def adjacent_accuracy_metric(
    n_classes: int, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create adjacent accuracy metric (within ±1) for ordinal regression.

    Measures proportion of predictions within 1 class of truth.
    Useful when exact prediction is difficult but close is acceptable.

    Args:
        n_classes: Number of ordinal classes
        transform: Optional function to convert raw predictions to class labels

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        return np.clip(np.round(predt), 0, n_classes - 1).astype(int)

    def _adj_acc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean(np.abs(y_true - y_pred) <= 1)

    return Metric(
        name="adj_acc",
        fn=_adj_acc,
        transform=_transform,
        higher_is_better=True,
    )
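
Because all four ordinal metrics share the same transform hook, they are easy to compare on a held-out set after training. A sketch reusing model, ordinal, and dtest from the Quick Start; y_test stands in for the test labels (hypothetical here), and output_margin=True requests the raw scores the transforms expect:

from jaxboost.metric import (
    qwk_metric, ordinal_mae_metric, ordinal_accuracy_metric, adjacent_accuracy_metric,
)

raw = model.predict(dtest, output_margin=True)  # raw scores for the transforms

for m in (
    qwk_metric(n_classes=6, transform=ordinal.predict),
    ordinal_mae_metric(n_classes=6, transform=ordinal.predict),
    ordinal_accuracy_metric(n_classes=6, transform=ordinal.predict),
    adjacent_accuracy_metric(n_classes=6, transform=ordinal.predict),
):
    print(m.name, m(y_test, raw))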

Classification Metrics

For binary classification problems.

auc_metric

auc_metric(transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create Area Under ROC Curve metric for binary classification.

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to probabilities. If None, sigmoid is applied. Default: None.

Returns:

- Metric: Metric object with .xgb_metric and .lgb_metric methods

Example

from jaxboost.metric import auc_metric

model = xgb.train(
    {'disable_default_eval_metric': 1},
    dtrain, obj=focal_loss.xgb_objective,
    custom_metric=auc_metric().xgb_metric
)

Source code in src/jaxboost/metric/classification.py
def auc_metric(transform: Callable[[np.ndarray], np.ndarray] | None = None) -> Metric:
    """
    Create Area Under ROC Curve metric for binary classification.

    Args:
        transform: Optional function to convert raw predictions to probabilities.
                   If None, sigmoid is applied.

    Returns:
        Metric object with .xgb_metric and .lgb_metric methods

    Example:
        >>> from jaxboost.metric import auc_metric
        >>>
        >>> model = xgb.train(
        ...     {'disable_default_eval_metric': 1},
        ...     dtrain, obj=focal_loss.xgb_objective,
        ...     custom_metric=auc_metric().xgb_metric
        ... )
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        return _sigmoid(predt)

    return Metric(
        name="auc",
        fn=_compute_auc,
        transform=_transform,
        higher_is_better=True,
    )
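
The _compute_auc helper is internal and not shown here. For reference, ROC AUC for binary labels equals the Mann-Whitney U statistic; a minimal numpy sketch of that standard formula (ties are broken by sort order rather than averaged, unlike a production implementation):

import numpy as np

def auc_reference(y_true, y_score):
    # Rank every score, then average the ranks of the positives
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = int(np.sum(y_true == 1))
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)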

log_loss_metric

log_loss_metric(transform: Callable[[ndarray], ndarray] | None = None, eps: float = 1e-07) -> Metric

Create Log Loss (binary cross-entropy) metric.

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to probabilities. Default: None.
- eps (float): Small value for numerical stability. Default: 1e-07.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/classification.py
def log_loss_metric(
    transform: Callable[[np.ndarray], np.ndarray] | None = None, eps: float = 1e-7
) -> Metric:
    """
    Create Log Loss (binary cross-entropy) metric.

    Args:
        transform: Optional function to convert raw predictions to probabilities
        eps: Small value for numerical stability

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        return _sigmoid(predt)

    def _log_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        y_pred = np.clip(y_pred, eps, 1 - eps)
        return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

    return Metric(
        name="logloss",
        fn=_log_loss,
        transform=_transform,
        higher_is_better=False,
    )

accuracy_metric

accuracy_metric(threshold: float = 0.5, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create accuracy metric for binary classification.

Parameters:

- threshold (float): Classification threshold. Default: 0.5.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to probabilities. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/classification.py
def accuracy_metric(
    threshold: float = 0.5, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create accuracy metric for binary classification.

    Args:
        threshold: Classification threshold (default 0.5)
        transform: Optional function to convert raw predictions to probabilities

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        probs = transform(predt) if transform is not None else _sigmoid(predt)
        return (probs >= threshold).astype(int)

    def _acc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean(y_true.astype(int) == y_pred.astype(int))

    return Metric(
        name="accuracy",
        fn=_acc,
        transform=_transform,
        higher_is_better=True,
    )

f1_metric

f1_metric(threshold: float = 0.5, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create F1 score metric for binary classification.

F1 = 2 * (precision * recall) / (precision + recall)

Parameters:

- threshold (float): Classification threshold. Default: 0.5.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to probabilities. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/classification.py
def f1_metric(
    threshold: float = 0.5, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create F1 score metric for binary classification.

    F1 = 2 * (precision * recall) / (precision + recall)

    Args:
        threshold: Classification threshold
        transform: Optional function to convert raw predictions to probabilities

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        probs = transform(predt) if transform is not None else _sigmoid(predt)
        return (probs >= threshold).astype(int)

    def _f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        y_true = y_true.astype(int)
        y_pred = y_pred.astype(int)

        tp = np.sum((y_true == 1) & (y_pred == 1))
        fp = np.sum((y_true == 0) & (y_pred == 1))
        fn = np.sum((y_true == 1) & (y_pred == 0))

        precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0

        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    return Metric(
        name="f1",
        fn=_f1,
        transform=_transform,
        higher_is_better=True,
    )

precision_metric

precision_metric(threshold: float = 0.5, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create precision metric for binary classification.

Precision = TP / (TP + FP)

Parameters:

- threshold (float): Classification threshold. Default: 0.5.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to probabilities. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/classification.py
def precision_metric(
    threshold: float = 0.5, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create precision metric for binary classification.

    Precision = TP / (TP + FP)

    Args:
        threshold: Classification threshold
        transform: Optional function to convert raw predictions to probabilities

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        probs = transform(predt) if transform is not None else _sigmoid(predt)
        return (probs >= threshold).astype(int)

    def _precision(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        y_true = y_true.astype(int)
        y_pred = y_pred.astype(int)

        tp = np.sum((y_true == 1) & (y_pred == 1))
        fp = np.sum((y_true == 0) & (y_pred == 1))

        return tp / (tp + fp) if (tp + fp) > 0 else 0.0

    return Metric(
        name="precision",
        fn=_precision,
        transform=_transform,
        higher_is_better=True,
    )

recall_metric

recall_metric(threshold: float = 0.5, transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create recall metric for binary classification.

Recall = TP / (TP + FN)

Parameters:

- threshold (float): Classification threshold. Default: 0.5.
- transform (Callable[[ndarray], ndarray] | None): Optional function to convert raw predictions to probabilities. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/classification.py
def recall_metric(
    threshold: float = 0.5, transform: Callable[[np.ndarray], np.ndarray] | None = None
) -> Metric:
    """
    Create recall metric for binary classification.

    Recall = TP / (TP + FN)

    Args:
        threshold: Classification threshold
        transform: Optional function to convert raw predictions to probabilities

    Returns:
        Metric object
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        probs = transform(predt) if transform is not None else _sigmoid(predt)
        return (probs >= threshold).astype(int)

    def _recall(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        y_true = y_true.astype(int)
        y_pred = y_pred.astype(int)

        tp = np.sum((y_true == 1) & (y_pred == 1))
        fn = np.sum((y_true == 1) & (y_pred == 0))

        return tp / (tp + fn) if (tp + fn) > 0 else 0.0

    return Metric(
        name="recall",
        fn=_recall,
        transform=_transform,
        higher_is_better=True,
    )
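
Precision, recall, and F1 share the same thresholding transform, so they can be built once and compared directly on raw scores. A toy sketch (arrays are illustrative; raw holds pre-sigmoid margins, so threshold=0.5 on the probabilities corresponds to raw > 0):

import numpy as np
from jaxboost.metric import precision_metric, recall_metric, f1_metric

y_true = np.array([1, 0, 1, 1, 0, 1])
raw = np.array([2.0, -1.0, 0.5, -0.2, -2.0, 1.5])

for make in (precision_metric, recall_metric, f1_metric):
    m = make(threshold=0.5)
    print(m.name, round(m(y_true, raw), 3))
# precision 1.0, recall 0.75, f1 0.857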

Regression Metrics

For continuous target prediction.

mse_metric

mse_metric(transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create Mean Squared Error metric.

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to transform raw predictions. Default: None.

Returns:

- Metric: Metric object

Example

model = xgb.train(
    {'disable_default_eval_metric': 1},
    dtrain, obj=my_objective.xgb_objective,
    custom_metric=mse_metric().xgb_metric
)

Source code in src/jaxboost/metric/regression.py
def mse_metric(transform: Callable[[np.ndarray], np.ndarray] | None = None) -> Metric:
    """
    Create Mean Squared Error metric.

    Args:
        transform: Optional function to transform raw predictions

    Returns:
        Metric object

    Example:
        >>> model = xgb.train(
        ...     {'disable_default_eval_metric': 1},
        ...     dtrain, obj=my_objective.xgb_objective,
        ...     custom_metric=mse_metric().xgb_metric
        ... )
    """

    def _mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean((y_true - y_pred) ** 2)

    return Metric(
        name="mse",
        fn=_mse,
        transform=transform,
        higher_is_better=False,
    )

rmse_metric

rmse_metric(transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create Root Mean Squared Error metric.

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to transform raw predictions. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/regression.py
def rmse_metric(transform: Callable[[np.ndarray], np.ndarray] | None = None) -> Metric:
    """
    Create Root Mean Squared Error metric.

    Args:
        transform: Optional function to transform raw predictions

    Returns:
        Metric object
    """

    def _rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.sqrt(np.mean((y_true - y_pred) ** 2))

    return Metric(
        name="rmse",
        fn=_rmse,
        transform=transform,
        higher_is_better=False,
    )

mae_metric

mae_metric(transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create Mean Absolute Error metric.

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to transform raw predictions. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/regression.py
def mae_metric(transform: Callable[[np.ndarray], np.ndarray] | None = None) -> Metric:
    """
    Create Mean Absolute Error metric.

    Args:
        transform: Optional function to transform raw predictions

    Returns:
        Metric object
    """

    def _mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean(np.abs(y_true - y_pred))

    return Metric(
        name="mae",
        fn=_mae,
        transform=transform,
        higher_is_better=False,
    )

r2_metric

r2_metric(transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create R² (coefficient of determination) metric.

R² = 1 - SS_res / SS_tot

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to transform raw predictions. Default: None.

Returns:

- Metric: Metric object

Source code in src/jaxboost/metric/regression.py
def r2_metric(transform: Callable[[np.ndarray], np.ndarray] | None = None) -> Metric:
    """
    Create R² (coefficient of determination) metric.

    R² = 1 - SS_res / SS_tot

    Args:
        transform: Optional function to transform raw predictions

    Returns:
        Metric object
    """

    def _r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        ss_res = np.sum((y_true - y_pred) ** 2)
        ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)

        if ss_tot < 1e-10:
            return 0.0
        return 1 - ss_res / ss_tot

    return Metric(
        name="r2",
        fn=_r2,
        transform=transform,
        higher_is_better=True,
    )
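
Since each factory returns a Metric, the regression metrics can also be called directly outside training. A toy sketch with illustrative arrays:

import numpy as np
from jaxboost.metric import mse_metric, rmse_metric, mae_metric, r2_metric

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

for make in (mse_metric, rmse_metric, mae_metric, r2_metric):
    m = make()
    print(m.name, round(m(y_true, y_pred), 4))
# mse 0.375, rmse 0.6124, mae 0.5, r2 0.9486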

Bounded Regression Metrics

For proportion/rate prediction in [0, 1].

bounded_mse_metric

bounded_mse_metric(transform: Callable[[ndarray], ndarray] | None = None) -> Metric

Create MSE metric for bounded regression.

By default, applies sigmoid to transform logits to [0, 1].

Parameters:

- transform (Callable[[ndarray], ndarray] | None): Optional function to transform raw predictions to [0, 1]. If None, sigmoid is applied. Default: None.

Returns:

- Metric: Metric object

Example

# For bounded regression with sigmoid link
model = xgb.train(
    {'disable_default_eval_metric': 1},
    dtrain, obj=soft_ce.xgb_objective,
    custom_metric=bounded_mse_metric().xgb_metric
)

Source code in src/jaxboost/metric/bounded.py
def bounded_mse_metric(transform: Callable[[np.ndarray], np.ndarray] | None = None) -> Metric:
    """
    Create MSE metric for bounded regression.

    By default, applies sigmoid to transform logits to [0, 1].

    Args:
        transform: Optional function to transform raw predictions to [0, 1].
                   If None, sigmoid is applied.

    Returns:
        Metric object

    Example:
        >>> # For bounded regression with sigmoid link
        >>> model = xgb.train(
        ...     {'disable_default_eval_metric': 1},
        ...     dtrain, obj=soft_ce.xgb_objective,
        ...     custom_metric=bounded_mse_metric().xgb_metric
        ... )
    """

    def _transform(predt: np.ndarray) -> np.ndarray:
        if transform is not None:
            return transform(predt)
        return _sigmoid(predt)

    def _mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean((y_true - y_pred) ** 2)

    return Metric(
        name="bounded_mse",
        fn=_mse,
        transform=_transform,
        higher_is_better=False,
    )
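
bounded_mse_metric compares the sigmoid of the raw score to the target, so it pairs with objectives that model the logit of a rate. A toy sketch (values are illustrative):

import numpy as np
from jaxboost.metric import bounded_mse_metric

m = bounded_mse_metric()
y = np.array([0.2, 0.8])
raw = np.array([-1.3863, 1.3863])  # sigmoid of these is approximately [0.2, 0.8]
print(m(y, raw))  # close to 0.0, since sigmoid(raw) matches the targets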

out_of_bounds_metric

out_of_bounds_metric(lower: float = 0.0, upper: float = 1.0) -> Metric

Create metric to measure proportion of predictions outside valid bounds.

Useful for comparing bounded vs unbounded regression approaches.

Parameters:

- lower (float): Lower bound. Default: 0.0.
- upper (float): Upper bound. Default: 1.0.

Returns:

- Metric: Metric object

Example

# Check how many predictions fall outside [0, 1]
model = xgb.train(
    {'disable_default_eval_metric': 1},
    dtrain, obj=mse.xgb_objective,
    custom_metric=out_of_bounds_metric().xgb_metric
)

Source code in src/jaxboost/metric/bounded.py
def out_of_bounds_metric(lower: float = 0.0, upper: float = 1.0) -> Metric:
    """
    Create metric to measure proportion of predictions outside valid bounds.

    Useful for comparing bounded vs unbounded regression approaches.

    Args:
        lower: Lower bound (default 0.0)
        upper: Upper bound (default 1.0)

    Returns:
        Metric object

    Example:
        >>> # Check how many predictions fall outside [0, 1]
        >>> model = xgb.train(
        ...     {'disable_default_eval_metric': 1},
        ...     dtrain, obj=mse.xgb_objective,
        ...     custom_metric=out_of_bounds_metric().xgb_metric
        ... )
    """

    def _oob(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        return np.mean((y_pred < lower) | (y_pred > upper))

    return Metric(
        name="oob_rate",
        fn=_oob,
        transform=None,  # Don't transform - we want to see raw predictions
        higher_is_better=False,
    )
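
The metric deliberately leaves predictions untransformed, so it reports on the model's raw output. A toy sketch (labels do not affect this metric's value):

import numpy as np
from jaxboost.metric import out_of_bounds_metric

oob = out_of_bounds_metric()
y = np.zeros(4)
preds = np.array([-0.1, 0.3, 0.7, 1.2])
print(oob(y, preds))  # 0.5: two of the four predictions fall outside [0, 1]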

XGBoost vs LightGBM Interface

Aspect            XGBoost                            LightGBM
Metric param      custom_metric=                     feval=
Metric method     .xgb_metric                        .lgb_metric
Return value      (name, value)                      (name, value, is_higher_better)
Disable default   'disable_default_eval_metric': 1   'metric': 'None'

XGBoost Example

model = xgb.train(
    {'disable_default_eval_metric': 1},
    dtrain,
    obj=objective.xgb_objective,
    custom_metric=metric.xgb_metric,
    evals=[(dtest, 'test')]
)

LightGBM Example

# LightGBM >= 4.0 takes the custom objective via params;
# older versions passed it with the fobj= argument of lgb.train.
model = lgb.train(
    {'metric': 'None', 'objective': objective.lgb_objective},
    train_data,
    feval=metric.lgb_metric,
    valid_sets=[valid_data]
)