📜  numpy compute mad - Python (1)

📅  Last modified: 2023-12-03 15:33:13.928000             🧑  Author: Mango

Introduction to Computing Median Absolute Deviation (MAD) using NumPy in Python

NumPy is a powerful Python library for scientific computing that provides support for multidimensional arrays and matrices, a large collection of mathematical functions, and tools for working with these arrays in a high-level language. One of the useful functions provided by NumPy is numpy.median, which computes the median of a given array along a specified axis.

NumPy has no dedicated function for the median absolute deviation (MAD), but it supplies the building blocks needed to compute it. The MAD is a measure of statistical dispersion that is more robust to outliers than the standard deviation. It is defined as the median of the absolute deviations from the median of the array: MAD = median(|x_i - median(x)|).
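To make the robustness claim concrete, the following minimal sketch compares the standard deviation and the MAD on a small array before and after one value is replaced by an outlier. The helper name mad_1d and the sample values are illustrative assumptions, not part of NumPy.

```python
import numpy as np

def mad_1d(values):
    """Median of the absolute deviations from the median (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    return np.median(np.abs(values - np.median(values)))

# Illustrative data: the second array replaces the last value with an outlier
clean = np.array([1, 2, 3, 4, 5])
with_outlier = np.array([1, 2, 3, 4, 100])

# The standard deviation grows from about 1.41 to about 39.0
print("std:", np.std(clean), np.std(with_outlier))

# The MAD is 1.0 in both cases, because the median and the middle
# absolute deviation are unaffected by the single extreme value
print("MAD:", mad_1d(clean), mad_1d(with_outlier))
```

A single extreme value shifts the mean and dominates the squared deviations, while the median-based MAD stays where it was.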

To compute the MAD using NumPy, we can first compute the median of the array using numpy.median, and then compute the absolute deviations from this median using the numpy.abs function. Finally, we compute the median of these absolute deviations using numpy.median again.

Here's an example of how to compute the MAD of a one-dimensional array using NumPy:

import numpy as np

# Generate a random one-dimensional array of size 10
arr = np.random.randn(10)

# Compute the median of the array
med = np.median(arr)

# Compute the absolute deviations from the median
absdev = np.abs(arr - med)

# Compute the median of these absolute deviations
mad = np.median(absdev)

print("MAD:", mad)

This prints the MAD of the array as a single number. Because the input is drawn at random, the exact value changes from run to run.

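We can also compute the MAD along a specified axis of a multidimensional array using the numpy.median and numpy.expand_dims functions. Taking the median along an axis removes that axis from the result, so the medians must be given it back as a length-1 axis before they can be broadcast against the original array. Here's an example: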

import numpy as np

# Generate a random two-dimensional array of size 5x10
arr = np.random.randn(5, 10)

# Compute the median of the array along the second axis
med = np.median(arr, axis=1)

# Add a length-1 axis so the medians have shape (5, 1) and broadcast against the (5, 10) array
med_expanded = np.expand_dims(med, axis=1)

# Compute the absolute deviations from the median
absdev = np.abs(arr - med_expanded)

# Compute the median of these absolute deviations along the second axis
mad = np.median(absdev, axis=1)

print("MAD:", mad)

This will output a one-dimensional array of size 5 containing the MAD of each row of the original array.
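As an alternative to numpy.expand_dims, numpy.median accepts keepdims=True, which keeps the reduced axis with length 1 so the result broadcasts directly. The sketch below wraps this in a small function that works for any axis; the function name mad and the seeded generator are assumptions made for illustration, not part of NumPy.

```python
import numpy as np

def mad(a, axis=None):
    """MAD along the given axis (hypothetical helper; axis=None uses all elements)."""
    a = np.asarray(a, dtype=float)
    # keepdims=True leaves the reduced axis in place with length 1,
    # so the subtraction broadcasts without an explicit expand_dims
    med = np.median(a, axis=axis, keepdims=True)
    return np.median(np.abs(a - med), axis=axis)

# Seeded generator so the example is reproducible
rng = np.random.default_rng(0)
arr = rng.standard_normal((5, 10))

row_mad = mad(arr, axis=1)   # one value per row, shape (5,)
col_mad = mad(arr, axis=0)   # one value per column, shape (10,)
total_mad = mad(arr)         # a single value over all 50 elements

print("MAD per row:", row_mad)
print("MAD per column:", col_mad)
print("MAD overall:", total_mad)
```

For axis=1 this gives the same result as the numpy.expand_dims version, and the same function covers the one-dimensional case when axis is left at None.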

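In conclusion, although NumPy has no dedicated MAD function, numpy.median and numpy.abs are all that is needed to compute the MAD of an array in Python, either over the whole array or along a chosen axis. This measure of statistical dispersion is more robust to outliers than the standard deviation and can be used in various statistical analyses.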