Last modified: 2023-12-03 15:20:56.549000             Author: Mango

value_counts() in pandas - Python

When working with data, it is often useful to know the frequency distribution of a categorical variable. This is where the value_counts() method in pandas comes in handy.

What is value_counts()?

value_counts() is a pandas method that returns a Series holding the count of each unique value, sorted by default from most to least frequent. It is most often called on a single DataFrame column, which is itself a Series.

Syntax
DataFrame['Column_name'].value_counts(dropna=True)

where:

  • DataFrame['Column_name'] selects the column (a Series) whose values you want to count.
  • dropna is a boolean parameter that controls whether NaN values are excluded from the counts. The default is True, so missing values are not counted.
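As a minimal sketch of the dropna parameter described above, the snippet below counts the same Series twice. The Series s and its contents are hypothetical sample data, not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing one missing value
s = pd.Series(['apple', 'orange', np.nan, 'orange'], name='fruit')

with_default = s.value_counts()          # dropna=True: NaN is left out of the counts
with_nan = s.value_counts(dropna=False)  # NaN is counted as its own entry

print(with_default)
print(with_nan)
```

With the default, the counts add up to 3 (the missing value is ignored). With dropna=False they add up to 4, because NaN gets its own row.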
Returning the Result as a DataFrame

You can also convert the result to a DataFrame with the to_frame() method. to_frame() can be called on any Series and returns a new one-column DataFrame that keeps the index of the original Series as its row labels.

DataFrame['Column_name'].value_counts(dropna=True).to_frame()
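The sketch below illustrates the point about row labels, and also shows that to_frame() accepts a name argument to label the resulting column in one step. The Series fruits is hypothetical sample data chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration
fruits = pd.Series(['apple', 'orange', 'orange'], name='fruit')

counts = fruits.value_counts()

# name= sets the label of the single column in the resulting DataFrame
counts_df = counts.to_frame(name='Frequency')

print(list(counts_df.columns))               # ['Frequency']
print(counts_df.index.equals(counts.index))  # True: row labels are preserved
```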
Example

Suppose we have a DataFrame df with a column fruit that holds the names of different fruits.

import pandas as pd

df = pd.DataFrame({'fruit': ['apple', 'orange', 'banana', 'orange', 'orange', 'apple', 'banana']})

To get the frequency distribution of the fruit column, we can use the value_counts() method as follows:

freq_dist = df['fruit'].value_counts()
print(freq_dist)

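Output (as printed by pandas versions before 2.0; from 2.0 onward the result is named count and the index is labeled fruit, so the last line reads Name: count and an extra fruit line appears at the top):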

orange    3
apple     2
banana    2
Name: fruit, dtype: int64

We can also convert the result to a DataFrame and give its column a descriptive name:

freq_dist_df = df['fruit'].value_counts().to_frame()
freq_dist_df.columns = ['Frequency']
print(freq_dist_df)

Output (in pandas 2.0 and later, an extra line showing the index name fruit appears under the header):

        Frequency
orange          3
apple           2
banana          2
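If the fruit names are needed as an ordinary column rather than as the index, one possible approach (a sketch, not part of the original example) is to label the index and then call reset_index(). The column names fruit and Frequency are chosen here for illustration; rename_axis() is used so the result does not depend on which pandas version names the index.

```python
import pandas as pd

df = pd.DataFrame({'fruit': ['apple', 'orange', 'banana', 'orange', 'orange', 'apple', 'banana']})

flat = (
    df['fruit']
    .value_counts()
    .rename_axis('fruit')           # label the index so the new column has a stable name
    .reset_index(name='Frequency')  # move the labels into a column and name the counts
)

print(flat)
```

The result is a two-column DataFrame with a default integer index, which is often easier to merge or export than a Series indexed by category.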
Conclusion

In summary, the value_counts() method in pandas gives the frequency distribution of a categorical variable in a single call, and to_frame() converts that result into a DataFrame when a tabular form is needed.