{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CellMCD" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook we analyze the covariance of the TopGear dataset using the cellwise minimum covariance determinant (CellMCD) estimator, which can handle missing values (NAs)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "from robpy.datasets import load_topgear\n", "from robpy.preprocessing import DataCleaner\n", "from robpy.covariance.cellmcd import CellMCD\n", "\n", "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load and preprocess the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first preprocess the data with `DataCleaner`, a step that should always precede a cellwise analysis. We additionally remove the variable `Verdict` and log-transform the skewed variables." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | Price | Displacement | BHP | Torque | Acceleration | TopSpeed | MPG | Weight | Length | Width | Height |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.056357 | 7.376508 | 4.653960 | 5.463832 | 11.3 | 4.744932 | 64.0 | 1385.0 | 4351.0 | 1798.0 | 1465.0 |
| 1 | 2.718331 | 7.221105 | 4.653960 | 4.553877 | 10.7 | 4.753590 | 49.0 | 1090.0 | 4063.0 | 1720.0 | 1446.0 |
| 2 | 3.433826 | 7.192182 | 4.584967 | 4.521789 | 11.8 | 4.663439 | 56.0 | 988.0 | 3078.0 | 1680.0 | 1500.0 |
| 3 | 4.882764 | 8.688622 | 6.248043 | 6.124683 | 4.6 | 5.209486 | 19.0 | 1785.0 | 4720.0 | NaN | 1282.0 |
| 4 | 4.955792 | 8.688622 | 6.248043 | 6.124683 | 4.6 | 5.209486 | 19.0 | 1890.0 | 4720.0 | NaN | 1282.0 |
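
The preprocessing described above (dropping `Verdict` and log-transforming the skewed variables) can be sketched with plain pandas and NumPy. This is only an illustration under stated assumptions: the toy values are made up, the helper `preprocess` is hypothetical, and the choice of which columns count as skewed is a guess. The `DataCleaner` step is left out because its call signature is not shown in this notebook excerpt.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the TopGear frame; these values are invented for illustration.
toy = pd.DataFrame(
    {
        "Price": [21250.0, 15155.0, 30995.0],
        "BHP": [105.0, 105.0, 98.0],
        "MPG": [64.0, 49.0, np.nan],  # missing cells are kept as NaN
        "Verdict": [6.0, 5.0, 7.0],
    }
)

# Assumption: which columns are treated as skewed is not stated in the text.
skewed = ["Price", "BHP"]


def preprocess(df, skewed_cols, drop_cols=("Verdict",)):
    """Hypothetical helper: drop unwanted columns, then log-transform skewed ones."""
    out = df.drop(columns=list(drop_cols))  # returns a new frame, input is untouched
    out[skewed_cols] = np.log(out[skewed_cols])  # natural log; NaN stays NaN
    return out


clean = preprocess(toy, skewed)
print(clean)
```

Columns that are not listed as skewed, and any missing cells, pass through unchanged, which matters here because the cellwise estimator is meant to handle the NaN entries itself.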
CellMCD()