📜  pandas df where row has na - Python (1)

📅  Last modified: 2023-12-03 15:18:13.749000             🧑  Author: Mango

Pandas DataFrame: How to Select Rows with Missing Values

When working with data in pandas, it's often necessary to identify and handle missing values. In this tutorial, we'll explore how to select rows from a DataFrame that contain missing values.

The df.where() Function

A natural first tool to look at is the df.where() function. Note that it does not drop or select rows: it returns a DataFrame of the same shape as the original, keeping each value where a given condition is True and replacing it with NaN where the condition is False. Here's what that looks like:

import pandas as pd

# create sample dataframe
df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})
print(df)
     A    B
0  1.0  NaN
1  2.0  6.0
2  NaN  NaN
3  4.0  8.0
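
Before filtering, it can help to inspect where the missing values actually are. The following is a minimal sketch that rebuilds the same sample DataFrame; the variable names mask, missing_per_column, and missing_per_row are illustrative and not part of the original example.

```python
import pandas as pd

# same sample dataframe as above
df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})

# boolean DataFrame: True wherever a cell is missing
mask = df.isna()

# count missing values per column and per row
missing_per_column = mask.sum()
missing_per_row = mask.sum(axis=1)

print(missing_per_column.tolist())  # [1, 2]
print(missing_per_row.tolist())     # [1, 0, 2, 0]
```

A row count greater than zero marks a row that contains at least one missing value.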

If we pass pd.isna(df) as the condition, df.where() keeps only the cells that are already missing and masks every other value, so the result is entirely NaN:

df_missing = df.where(pd.isna(df))
print(df_missing)
    A   B
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN

In this example, pd.isna() returns a boolean DataFrame that marks the NaN cells. The df.where() function then returns a DataFrame with the same shape as df, keeping values where that mask is True (the cells that were already NaN) and replacing every other value with NaN. The result is therefore all NaN, which shows that df.where() masks individual values rather than selecting rows.
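
To make the behavior of df.where() easier to see, here is a small sketch with a condition that is True for some cells and False for others. The threshold of 2 and the names masked and filled are arbitrary choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})

# keep values greater than 2; all other cells (including NaN) become NaN
masked = df.where(df > 2)

# same condition, but failing cells are replaced with 0 instead of NaN
filled = df.where(df > 2, other=0)

print(masked.shape == df.shape)  # True
print(filled["B"].tolist())      # [0.0, 6.0, 0.0, 8.0]
```

In both cases the number of rows is unchanged, which is why df.where() on its own cannot be used to pick out a subset of rows.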

Selecting Rows with Any Missing Values

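To select only the rows that contain missing values, we can build a boolean mask with df.isna().any(axis=1) and use it to index the DataFrame. Here's how that looks:

df_missing = df[df.isna().any(axis=1)]
print(df_missing)
     A   B
0  1.0 NaN
2  NaN NaN

In this example, df.isna() marks every missing cell, .any(axis=1) flags the rows that have at least one, and indexing df with that mask returns only those rows. By contrast, df.dropna() does the opposite: it removes any row that contains a missing value in any column, leaving only rows 1 and 3.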
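
Row selection by missing values can be sketched with boolean indexing, and the same pattern adapts to a few common variations. The snippet below is one possible approach using the sample DataFrame; the names rows_any, rows_all, rows_a, and complete are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})

# rows with at least one missing value
rows_any = df[df.isna().any(axis=1)]

# rows where every column is missing
rows_all = df[df.isna().all(axis=1)]

# rows where one specific column ("A") is missing
rows_a = df[df["A"].isna()]

# dropna() keeps the complement of rows_any: rows with no missing values
complete = df.dropna()

print(rows_any.index.tolist())  # [0, 2]
print(rows_all.index.tolist())  # [2]
print(rows_a.index.tolist())    # [2]
print(complete.index.tolist())  # [1, 3]
```

Together, rows_any and complete account for every row of df exactly once, which is a quick way to check that a filter did what was intended.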

Conclusion

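In this tutorial, we explored how to work with rows that contain missing values in a pandas DataFrame. df.where() masks values that fail a condition with NaN while keeping the DataFrame's shape, df.dropna() removes rows that contain missing values, and a boolean mask such as df[df.isna().any(axis=1)] returns only the rows that have them.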