📜  count nan pandas - Python (1)

📅  Last modified: 2023-12-03 15:00:02.546000             🧑  Author: Mango

Introduction to Counting NaN Values in Pandas - Python

When working with data in Pandas, it is common to come across missing or NaN values. NaN stands for "Not a Number" and represents the absence of a value. It is important to identify and handle NaN values properly as they can impact the accuracy of your analysis.

In this article, we will explore several methods to count the NaN values in a Pandas DataFrame.

The isna() Method

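The isna() method in Pandas returns a boolean DataFrame of the same shape as the original, with True wherever a value is missing (NaN, None, or NaT). Because sum() treats True as 1, calling it on that result counts the NaN values per column (the default) or per row (axis=1).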

Here is an example:

import numpy as np
import pandas as pd

# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# count the number of NaN values per column
print(df.isna().sum())

# count the number of NaN values per row
print(df.isna().sum(axis=1))

Output:

A    1
B    1
C    1
dtype: int64

0    0
1    1
2    1
3    1
dtype: int64

Notice that the first sum() call counts the number of NaN values in each column, while the second call, with axis=1, counts the number of NaN values in each row. Row 0 contains no NaN values, so its count is 0.
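Building on the per-column and per-row counts, the total number of NaN values in the whole DataFrame can be obtained by summing a second time. The following is a minimal sketch, assuming the same example DataFrame; the names per_column and total_nan are illustrative only.

```python
import numpy as np
import pandas as pd

# same example DataFrame as in the article
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# per-column NaN counts, then summed into a single total
per_column = df.isna().sum()
total_nan = int(per_column.sum())
print(total_nan)  # 3

# the same pattern works on a single column (a Series)
nan_in_a = int(df['A'].isna().sum())
print(nan_in_a)  # 1
```

The same isna().sum() pattern applies to a Series, which is convenient when only one column matters.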

The isnull() Method

The isnull() method is an alias for isna(): the two behave identically, and both return a boolean DataFrame with True where the values are NaN.
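One way to confirm that the two methods are interchangeable is to compare their results directly. This is a small sketch, assuming the same example DataFrame; the variable name same_mask is illustrative only.

```python
import numpy as np
import pandas as pd

# same example DataFrame as in the article
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# compare the boolean masks produced by isnull() and isna()
same_mask = df.isnull().equals(df.isna())
print(same_mask)  # True
```

Since the results are identical, the choice between the two names is a matter of style.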

import numpy as np
import pandas as pd

# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# count the number of NaN values per column
print(df.isnull().sum())

# count the number of NaN values per row
print(df.isnull().sum(axis=1))

Output:

A    1
B    1
C    1
dtype: int64

0    0
1    1
2    1
3    1
dtype: int64

The notna() Method

The notna() method is the opposite of isna(). It returns a boolean DataFrame with True where the values are not NaN.

import numpy as np
import pandas as pd

# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# count the number of non-NaN values per column
print(df.notna().sum())

# count the number of non-NaN values per row
print(df.notna().sum(axis=1))

Output:

A    3
B    3
C    3
dtype: int64

0    3
1    2
2    2
3    2
dtype: int64

Notice that the first sum() call counts the number of non-NaN values in each column, while the second call counts the number of non-NaN values in each row.
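The boolean result of notna() can also be used to find rows that contain no NaN values at all. The sketch below assumes the same example DataFrame; the name complete_rows is illustrative only.

```python
import numpy as np
import pandas as pd

# same example DataFrame as in the article
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# True for rows in which every value is present
complete_rows = df.notna().all(axis=1)

# number of rows without any NaN
n_complete = int(complete_rows.sum())
print(n_complete)  # 1

# select only the complete rows
print(df[complete_rows])
```

Here only row 0 has a value in every column, so it is the only row selected.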

The count() Method

The count() method returns the number of non-NaN values in each column or row.

import numpy as np
import pandas as pd

# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# count the number of non-NaN values per column
print(df.count())

# count the number of non-NaN values per row
print(df.count(axis=1))

Output:

A    3
B    3
C    3
dtype: int64

0    3
1    2
2    2
3    2
dtype: int64

Notice that count() returns the same results as the notna().sum() calls in the previous section, since both count only the non-NaN values; count() simply does it in a single call.
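Because count() reports non-NaN values, the NaN counts can be recovered by subtracting it from the total number of rows or columns. This is a minimal sketch, assuming the same example DataFrame; the names nan_per_column, nan_per_row, and matches are illustrative only.

```python
import numpy as np
import pandas as pd

# same example DataFrame as in the article
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# NaN values per column = number of rows minus non-NaN values
nan_per_column = len(df) - df.count()
print(nan_per_column)

# NaN values per row = number of columns minus non-NaN values
nan_per_row = df.shape[1] - df.count(axis=1)
print(nan_per_row)

# check that this agrees with the isna().sum() approach
matches = bool((nan_per_column == df.isna().sum()).all())
print(matches)  # True
```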

Conclusion

Counting NaN values in Pandas is an important task when working with data. In this article, we explored several methods to count the NaN values in a Pandas DataFrame, including isna(), isnull(), notna(), and count(). isna() and isnull() are interchangeable and, combined with sum(), count the missing values; notna().sum() and count() both count the non-missing values, with count() being the more concise of the two.