robpy.utils

Distance

robpy.utils.distance.mahalanobis_distance(data: ndarray | DataFrame, location: ndarray, covariance: ndarray)

Calculate the Mahalanobis distance for multiple data vectors.

Parameters:

data (np.ndarray or pd.DataFrame) – An array-like object where each row is a data vector.
location (np.ndarray) – The center (location estimate) of the data.
covariance (np.ndarray) – The scatter (covariance) estimate of the data.

Returns:

An array containing the Mahalanobis distance of each data vector.

Return type:

np.ndarray
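The computation described above can be sketched with plain NumPy. This is a minimal illustration, not robpy's implementation: the helper name mahalanobis_sketch is hypothetical, and it assumes the unsquared distance sqrt((x - location)^T covariance^{-1} (x - location)) is what is returned, which the documentation does not state explicitly.

```python
import numpy as np


def mahalanobis_sketch(data, location, covariance):
    """Mahalanobis distance of each row of `data` from `location` (hypothetical helper)."""
    # np.asarray also accepts a pandas DataFrame and yields its values.
    centered = np.asarray(data, dtype=float) - location
    # Solve covariance @ z = centered.T rather than forming an explicit inverse.
    solved = np.linalg.solve(covariance, centered.T).T
    # Row-wise quadratic form (x - mu)^T Sigma^{-1} (x - mu), then the square root.
    return np.sqrt(np.einsum("ij,ij->i", centered, solved))


points = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
# With an identity covariance the result reduces to the Euclidean distance.
distances = mahalanobis_sketch(points, np.zeros(2), np.eye(2))
```

Solving a linear system instead of inverting the covariance matrix is a numerical-stability choice made for this sketch only.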

class robpy.utils.rho.BaseRho

Bases: object

class robpy.utils.rho.Huber(b: float = 1.5)

class robpy.utils.rho.TukeyBisquare(c: float = 1.56)
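For orientation, the textbook forms of the two rho functions are sketched below. The function names huber_rho and tukey_bisquare_rho are hypothetical, and robpy's classes may scale or normalize these functions differently; only the default tuning constants b = 1.5 and c = 1.56 are taken from the signatures above.

```python
import numpy as np


def huber_rho(x, b=1.5):
    """Textbook Huber rho: quadratic near zero, linear in the tails."""
    x = np.asarray(x, dtype=float)
    absx = np.abs(x)
    return np.where(absx <= b, 0.5 * x**2, b * absx - 0.5 * b**2)


def tukey_bisquare_rho(x, c=1.56):
    """Textbook Tukey bisquare rho: smooth near zero, constant beyond |x| = c."""
    x = np.asarray(x, dtype=float)
    inside = (c**2 / 6.0) * (1.0 - (1.0 - (x / c) ** 2) ** 3)
    return np.where(np.abs(x) <= c, inside, c**2 / 6.0)


residuals = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
huber_values = huber_rho(residuals)
tukey_values = tukey_bisquare_rho(residuals)
```

The Huber function keeps growing linearly for large residuals, whereas the bisquare function is bounded, so extreme residuals contribute at most c^2 / 6 in this parameterization.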

robpy.utils.general.inverse_submatrix(A: ndarray, A_inv: ndarray, indices: array) → ndarray

Given a matrix A and its inverse A_inv, this function calculates the inverse of the submatrix of A consisting of the rows and columns in indices.

Parameters:

A (np.ndarray) – the matrix of interest
A_inv (np.ndarray) – the inverse of the matrix of interest
indices (np.array) – the indices corresponding to the submatrix of interest
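One standard way to obtain such an inverse without inverting the submatrix from scratch is the block-inverse (Schur complement) identity. Writing B = A_inv, K for the kept indices, and D for the dropped ones, inv(A[K, K]) = B[K, K] - B[K, D] inv(B[D, D]) B[D, K]. The sketch below illustrates that identity; the name inverse_submatrix_sketch is hypothetical, and whether robpy uses this exact formula is an assumption.

```python
import numpy as np


def inverse_submatrix_sketch(A_inv, indices):
    """Inverse of A[indices][:, indices], computed from A_inv only (hypothetical helper)."""
    n = A_inv.shape[0]
    keep = np.asarray(indices)
    drop = np.setdiff1d(np.arange(n), keep)
    B_kk = A_inv[np.ix_(keep, keep)]
    if drop.size == 0:
        # Nothing is dropped, so the submatrix is A itself.
        return B_kk
    B_kd = A_inv[np.ix_(keep, drop)]
    B_dk = A_inv[np.ix_(drop, keep)]
    B_dd = A_inv[np.ix_(drop, drop)]
    # Schur complement of the dropped block inside A_inv.
    return B_kk - B_kd @ np.linalg.solve(B_dd, B_dk)


rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5.0 * np.eye(5)  # symmetric positive definite test matrix
A_inv = np.linalg.inv(A)
idx = np.array([0, 2, 3])
sub_inv = inverse_submatrix_sketch(A_inv, idx)
```

When only a few indices are dropped, the linear solve involves a small block, which is cheaper than inverting the larger submatrix directly.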

robpy.utils.median.l1median(X: ndarray) → float

Compute the L1-median (also known as the spatial or geometric median) of the data.

References

Fritz, H., Filzmoser, P. and Croux, C. (2012). A comparison of algorithms for the multivariate L1-median. Computational Statistics, 27, 393–410.
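The L1-median is the point that minimizes the sum of Euclidean distances to the observations. A classical way to approximate it is the Weiszfeld iteration, sketched below. This is an illustration only: the name l1median_weiszfeld and its tol and max_iter parameters are hypothetical, the sketch returns the full coordinate vector, and robpy may use a different algorithm.

```python
import numpy as np


def l1median_weiszfeld(X, tol=1e-10, max_iter=1000):
    """Approximate the L1-median of the rows of X by Weiszfeld iterations."""
    X = np.asarray(X, dtype=float)
    m = X.mean(axis=0)  # start from the coordinate-wise mean
    for _ in range(max_iter):
        dist = np.linalg.norm(X - m, axis=1)
        # Guard against division by zero when the iterate hits a data point.
        dist = np.maximum(dist, 1e-12)
        w = 1.0 / dist
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < tol:
            return m_new
        m = m_new
    return m


corners = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
center = l1median_weiszfeld(corners)
```

Each step is a weighted mean in which points far from the current iterate receive small weights, which is what makes the estimate resistant to outliers.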

robpy.utils.median.weighted_median(X: ndarray, weights: ndarray) → float

Computes a weighted median.

References

Croux, C. and Rousseeuw, P.J. (1992). Time-efficient algorithms for two highly robust estimators of scale.
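A weighted median can be illustrated with a simple sort-based sketch: order the values, accumulate their weights, and return the first value at which the cumulative weight reaches half of the total. The name weighted_median_sketch is hypothetical, and the tie-breaking convention used here (the lower weighted median) is an assumption that may differ from robpy's.

```python
import numpy as np


def weighted_median_sketch(x, weights):
    """Lower weighted median of `x` with non-negative `weights` (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(x)
    cum = np.cumsum(weights[order])
    # First position where the cumulative weight reaches half of the total weight.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(x[order][idx])


equal = weighted_median_sketch([3.0, 1.0, 2.0], [1.0, 1.0, 1.0])
skewed = weighted_median_sketch([1.0, 2.0, 3.0], [1.0, 1.0, 10.0])
```

With equal weights this reduces to the ordinary (lower) median, while a dominant weight pulls the result to the corresponding value.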

robpy.utils.outlyingness.stahel_donoho(X: ndarray, n_points: int = 2, n_dir: int = 250) → ndarray

Calculate the Stahel-Donoho outlyingness of multivariate points, based on the algorithm proposed by Stahel (1981) and Donoho (1982).

Parameters:

X (np.ndarray) – data matrix of shape (n_obs, n_features)
n_points (int, optional) – number of points to determine the hyperplane. Defaults to 2.
n_dir (int, optional) – number of random directions to consider. Defaults to 250.

Returns:

single column of outlyingness values

Return type:

np.ndarray

References

Stahel W.A. (1981). Robuste Schätzungen: infinitesimale Optimalität und Schätzungen von Kovarianzmatrizen. PhD thesis, ETH Zürich.

Donoho D.L. (1982). Breakdown properties of multivariate location estimators. Ph.D. Qualifying paper, Dept. Statistics, Harvard University, Boston.
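The Stahel-Donoho outlyingness of a point is the largest robustly standardized deviation |a^T x - median(a^T X)| / MAD(a^T X) over a set of directions a. The sketch below approximates that maximum with random directions, each passing through two randomly chosen observations. The name stahel_donoho_sketch and its seed parameter are hypothetical, and the way directions are generated is an assumption that may not match how robpy uses n_points.

```python
import numpy as np


def stahel_donoho_sketch(X, n_dir=250, seed=0):
    """Approximate Stahel-Donoho outlyingness of each row of X (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    outlyingness = np.zeros(n)
    for _ in range(n_dir):
        # Direction through two distinct, randomly chosen observations.
        i, j = rng.choice(n, size=2, replace=False)
        direction = X[i] - X[j]
        norm = np.linalg.norm(direction)
        if norm == 0.0:
            continue  # identical points give no usable direction
        proj = X @ (direction / norm)
        med = np.median(proj)
        mad = np.median(np.abs(proj - med))
        if mad == 0.0:
            continue  # degenerate projection, skip it
        # Keep the largest standardized deviation seen so far for each point.
        outlyingness = np.maximum(outlyingness, np.abs(proj - med) / mad)
    return outlyingness


rng = np.random.default_rng(1)
sample = rng.normal(size=(100, 3))
sample[0] = [20.0, 20.0, 20.0]  # one planted outlier
scores = stahel_donoho_sketch(sample)
```

Because only finitely many directions are tried, the result is a lower bound on the exact outlyingness; more directions tighten the approximation at a higher cost.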