diagnostics

Ivan Svetunkov edited this page Jun 24, 2026 · 3 revisions

Diagnostics — Outlier Detection

Function signatures

R

outlierdummy(object, level=0.999, type=c("rstandard","rstudent"), ...)

Python

def outlier_dummy(
    model,
    level: float = 0.999,
    type: Literal["rstandard", "rstudent"] = "rstandard",
) -> OutlierResult: ...

Overview

The diagnostics module provides functions for detecting outliers in fitted regression models. The outlier_dummy() function (outlierdummy() in R) identifies observations whose residuals lie outside the bounds expected under the model's distribution, and creates one dummy variable per flagged observation. Adding these dummies to the regressors and re-estimating the model removes the outliers' influence on the remaining parameters.

Import

# R — function is in the greybox namespace
library(greybox)
# Python
from greybox.diagnostics import outlier_dummy, OutlierResult

Algorithm

  1. Extract residuals from the fitted model
  2. Compute standardised (rstandard) or studentised (rstudent) residuals using leverage (hat values)
  3. Determine critical bounds based on the model's distribution family and the specified confidence level
  4. Flag observations whose standardised or studentised residuals fall outside the bounds
  5. Return a matrix of dummy variables (one column per outlier)
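
The five steps above can be sketched in plain Python for the simplest case: a regression with a single regressor and normally distributed errors. This is a minimal illustration, not the greybox implementation. The function name flag_outliers is hypothetical, the bounds are assumed to be symmetric normal quantiles at (1 ± level) / 2, and the closed-form leverage only holds for a model with an intercept and one regressor.

```python
import math
from statistics import NormalDist, fmean


def flag_outliers(x, y, level=0.999):
    """Hypothetical sketch of steps 1-5 for y = a + b*x with normal errors."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Step 1: residuals of the fitted line
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Step 2: leverage (hat values) and internally standardised residuals
    hat = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    sigma = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    std_res = [e / (sigma * math.sqrt(1 - h)) for e, h in zip(residuals, hat)]

    # Step 3: two-sided critical bounds (assumed: standard normal quantiles)
    upper = NormalDist().inv_cdf((1 + level) / 2)
    lower = -upper

    # Step 4: indices of residuals outside the bounds
    ids = [i for i, r in enumerate(std_res) if r < lower or r > upper]

    # Step 5: one 0/1 dummy column per flagged observation, or None if none
    dummies = [[1 if row == i else 0 for i in ids] for row in range(n)] if ids else None
    return ids, (lower, upper), dummies


x = [float(i) for i in range(20)]
y = [2.0 * xi for xi in x]
y[10] = 100.0  # inject outlier
ids, bounds, dummies = flag_outliers(x, y)
print(ids, bounds)
```

Each dummy column equals 1 in the row of its outlier and 0 elsewhere, so appending the columns to the design matrix lets the re-estimated model absorb each flagged observation with its own coefficient.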

outlier_dummy — Detect Outliers and Create Dummy Variables

Parameters

| Parameter | R | Python | Type | Default | Description |
|---|---|---|---|---|---|
| model/object | object | model | alm / ALM | | Fitted alm model |
| level | level | level | numeric / float | 0.999 | Confidence level for outlier detection |
| type | type | type | character / str | "rstandard" | Residual type: "rstandard" or "rstudent" |
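
To give a feel for how level controls the width of the detection bounds, the sketch below converts a few confidence levels into two-sided critical values. It assumes a normal distribution with symmetric bounds at the (1 ± level) / 2 quantiles. The helper normal_bounds is hypothetical, and other distribution families would use their own quantile functions.

```python
from statistics import NormalDist


def normal_bounds(level: float) -> tuple[float, float]:
    """Hypothetical helper: symmetric standard-normal bounds for a given level."""
    upper = NormalDist().inv_cdf((1 + level) / 2)
    return -upper, upper


for level in (0.95, 0.99, 0.999):
    lower, upper = normal_bounds(level)
    print(f"level={level}: [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions the bounds are roughly ±1.96, ±2.58 and ±3.29, so the default level of 0.999 flags only residuals more than about 3.29 standard deviations from zero.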

Return Value

| Field | R | Python | Type | Description |
|---|---|---|---|---|
| outliers | $outliers | .outliers | matrix / np.ndarray or None | Matrix of dummy variables, one column per outlier (NULL in R, None in Python if no outliers) |
| statistic | $statistic | .statistic | numeric / np.ndarray | Critical values used for detection |
| id | $id | .id | vector / np.ndarray | Indices of outlier observations |
| level | $level | .level | numeric / float | Confidence level used |
| type | $type | .type | character / str | Residual type used |
| errors | $errors | .errors | vector / np.ndarray | Standardised/studentised residuals |
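
The shape of the returned object can be pictured as a small record with the six fields listed above. The dataclass below is a hypothetical stand-in written for illustration only. It is not the OutlierResult class from greybox, it uses plain lists instead of NumPy arrays, and the n_outliers property is an addition that the real class may not have.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OutlierResultSketch:
    """Hypothetical mirror of the documented fields; not the greybox class."""

    outliers: Optional[List[List[int]]]  # n x k dummy matrix, None if no outliers
    statistic: List[float]               # lower and upper critical values
    id: List[int]                        # indices of flagged observations
    level: float                         # confidence level used
    type: str                            # "rstandard" or "rstudent"
    errors: List[float]                  # standardised or studentised residuals

    @property
    def n_outliers(self) -> int:
        # Illustrative convenience only: number of flagged observations
        return len(self.id)


empty = OutlierResultSketch(
    outliers=None, statistic=[-3.29, 3.29], id=[],
    level=0.999, type="rstandard", errors=[0.1, -0.2],
)
print(empty.n_outliers, empty.outliers is None)
```

Because outliers is None when nothing is flagged, calling code should check for None before stacking the dummies onto a design matrix.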

Examples

Basic Outlier Detection

# R
library(greybox)

set.seed(42)  # for reproducibility
x <- rnorm(100)
y <- 2 * x + rnorm(100)
y[50] <- 100  # inject outlier

model <- alm(y ~ x, distribution="dnorm")
result <- outlierdummy(model, level=0.999)
print(result$id)       # Which observations are outliers
print(result$outliers)  # Dummy variable matrix
# Python
import numpy as np
from greybox.alm import ALM
from greybox.formula import formula
from greybox.diagnostics import outlier_dummy

np.random.seed(42)
x = np.random.randn(100)
y = 2 * x + np.random.randn(100)
y[50] = 100  # inject outlier

data = {"y": y, "x": x}
y_vec, X = formula("y ~ x", data)

model = ALM(distribution="dnorm")
model.fit(X, y_vec)

result = outlier_dummy(model, level=0.999)
print(f"Outlier indices: {result.id}")
print(f"Critical bounds: {result.statistic}")

Re-estimating with Outlier Dummies

# R — add outlier dummies to the model
if (!is.null(result$outliers)) {
  # x, y and the dummies live in the workspace, so no data argument is needed
  model2 <- alm(y ~ x + result$outliers, distribution="dnorm")
  print(paste("Original scale:", model$scale))
  print(paste("With dummies:", model2$scale))
}
# Python — add outlier dummies to the model
if result.outliers is not None:
    X_with_dummies = np.column_stack([X, result.outliers])
    model2 = ALM(distribution="dnorm")
    model2.fit(X_with_dummies, y_vec)
    print(f"Original scale: {model.scale:.4f}")
    print(f"With dummies:   {model2.scale:.4f}")

Using Studentised Residuals

# R — rstudent is more sensitive to single outliers
result_student <- outlierdummy(model, level=0.999, type="rstudent")
print(result_student$id)  # Outliers
# Python — rstudent is more sensitive to single outliers
result_student = outlier_dummy(model, level=0.999, type="rstudent")
print(f"Outliers (rstudent): {result_student.id}")
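
One way to see why studentised residuals react more strongly to a single outlier is the textbook identity that links the two residual types in an ordinary least squares regression with n observations and p estimated coefficients. The sketch below applies that identity. It is an illustration under the OLS assumption, the function name is hypothetical, and greybox may compute rstudent differently, particularly for non-normal distributions.

```python
import math


def rstudent_from_rstandard(r: float, n: int, p: int) -> float:
    """Hypothetical helper: externally studentised residual from an
    internally standardised one, valid for OLS with n obs and p coefficients."""
    dof = n - p
    if r * r >= dof:
        raise ValueError("standardised residual must satisfy r^2 < n - p")
    # Equivalent to re-estimating the scale without the observation itself
    return r * math.sqrt((dof - 1) / (dof - r * r))


for r in (0.5, 2.0, 4.0):
    print(r, round(rstudent_from_rstandard(r, n=20, p=2), 3))
```

Residuals below 1 in absolute value shrink slightly, while large ones are inflated, because the outlier no longer contaminates the scale estimate it is compared against.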

R vs Python Function Names

| R Function | Python Function |
|---|---|
| outlierdummy() | outlier_dummy() |

References

  • Cook, R.D. and Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall.
  • Svetunkov, I. (2023). Statistics for Business Analytics. https://openforecast.org/sba/
