
Last modified: 2023-12-03 15:33:23.602000 | Author: Mango

Pandas GroupBy - Python

Pandas GroupBy groups the rows of a DataFrame by the values in one or more columns and applies a function, such as a sum or a mean, to each group. This split-apply-combine pattern is a core technique in data analysis and is widely used in data science.

Syntax
grouped = dataframe.groupby(column_name)
  • dataframe: The Pandas DataFrame object.
  • column_name: The name of the column to group the data by, or a list of column names to group by several columns at once.
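
As a minimal sketch of the syntax above, the following compares grouping by a single column with grouping by a list of columns. The sales DataFrame and its Region, Product, and Units columns are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
sales = pd.DataFrame({
    'Region': ['East', 'East', 'West', 'West', 'East'],
    'Product': ['A', 'B', 'A', 'A', 'A'],
    'Units': [10, 5, 7, 3, 2],
})

# A single column name produces one group per distinct value in that column.
by_region = sales.groupby('Region')['Units'].sum()

# A list of column names produces one group per distinct combination of values.
by_region_product = sales.groupby(['Region', 'Product'])['Units'].sum()

print(by_region)
print(by_region_product)
```

Grouping by a list returns a result indexed by a MultiIndex, so an individual entry is looked up with a tuple such as ('East', 'A').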
Example

Suppose we have a Pandas DataFrame containing the following data:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35, 25, 30, 35],
        'Salary': [50000, 60000, 70000, 55000, 65000, 75000]}

df = pd.DataFrame(data)

We can group the data by the Name column and calculate the mean salary for each group using the mean method:

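# Split the rows into one group per distinct value in the Name column.
grouped = df.groupby('Name')

# Select the Salary column and compute its mean within each group.
mean_salary = grouped['Salary'].mean()

print(mean_salary)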

Output:

Name
Alice      52500.0
Bob        62500.0
Charlie    72500.0
Name: Salary, dtype: float64

In the above example, we grouped the data by the Name column and applied the mean method to the Salary column. The result is a new Series, indexed by the group keys from Name, that holds the mean salary for each group.
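
The same grouping can feed more than one aggregation. The sketch below rebuilds the example DataFrame so that it runs on its own, then uses agg to compute several statistics at once and iterates over the groups. The variable names salary_stats and group_sizes are chosen here for illustration.

```python
import pandas as pd

# Same data as the example above, repeated so this block is self-contained.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35, 25, 30, 35],
    'Salary': [50000, 60000, 70000, 55000, 65000, 75000],
})

# agg() accepts a list of aggregation names and returns one column per function.
salary_stats = df.groupby('Name')['Salary'].agg(['mean', 'min', 'max'])

# Iterating over a GroupBy object yields (group key, sub-DataFrame) pairs.
group_sizes = {name: len(group) for name, group in df.groupby('Name')}

print(salary_stats)
print(group_sizes)
```

Here salary_stats is a DataFrame with one row per name and one column per statistic, while group_sizes maps each name to the number of rows in its group.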

Conclusion

Pandas GroupBy splits a DataFrame into groups, applies a function to each group, and combines the results. The same pattern covers sums, counts, means, and custom functions, so it is worth knowing well if you work with data in Python.