Last modified: 2023-12-03 15:03:28.562000             Author: Mango

Pandas Split Groupby - Python

Pandas is a powerful data manipulation library for Python that provides extensive tools for working with data frames. This article discusses groupby(), the method that performs the split step of a grouped operation: it divides a data frame into groups based on one or more attributes.

Groupby method

The groupby() method splits a data frame into groups based on one or more attributes. It returns a DataFrameGroupBy object, on which operations are then applied group by group. Common follow-up operations include sum, mean, aggregate, and apply.
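As a minimal sketch of how those follow-up operations behave, the snippet below groups a small data frame and applies sum, mean, agg, and apply to one column. The sample data and the variable names (totals, averages, summary, spread) are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({
    "Group": ["A", "A", "B", "B", "C", "C"],
    "Sales": [100, 150, 200, 250, 300, 350],
})

grouped = df.groupby("Group")

# Built-in reductions return one value per group.
totals = grouped["Sales"].sum()
averages = grouped["Sales"].mean()

# agg() computes several statistics in a single call.
summary = grouped["Sales"].agg(["min", "max"])

# apply() runs an arbitrary function on each group's column.
spread = grouped["Sales"].apply(lambda s: s.max() - s.min())

print(totals)
print(averages)
print(summary)
print(spread)
```

Each result is indexed by the group key, so a value such as totals["A"] can be looked up directly.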

Syntax
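data_frame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)
Note: this is the signature from older pandas releases. squeeze was deprecated in pandas 1.1 and removed in pandas 2.0, axis is deprecated as of pandas 2.1, and newer releases add the observed and dropna parameters.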
Parameters
  • by – mapping, function, label, or list of labels. This parameter defines the criteria on which to group the data frame. It can take multiple forms.

  • axis – integer or string, the axis to split along: 0 or 'index' for rows; 1 or 'columns' for columns. Deprecated as of pandas 2.1.

  • level – if the axis is a multi-index (hierarchical), group by a particular level or levels.

  • as_index – boolean type, applies to aggregated output. If True, the group keys become the index of the result; if False, they are returned as ordinary columns.

  • sort – boolean type, sort group keys.

  • group_keys – boolean type, when calling apply, add the group keys to the index of the result to identify each piece.

  • squeeze – boolean type, reduce the dimensionality of the returned result where possible. Deprecated in pandas 1.1 and removed in pandas 2.0.
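The parameters above can be illustrated with a short sketch. It exercises by (with a list of labels), as_index, sort, and level; the Region column and all variable names are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical sample data; the Region column is an assumption for illustration.
df = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Group": ["B", "A", "B", "A"],
    "Sales": [200, 100, 250, 150],
})

# by: a list of labels groups on several columns at once.
by_both = df.groupby(["Region", "Group"])["Sales"].sum()

# as_index=False keeps the group key as an ordinary column.
flat = df.groupby("Group", as_index=False)["Sales"].sum()

# sort=False keeps groups in order of first appearance (B before A here).
unsorted = df.groupby("Group", sort=False)["Sales"].sum()

# level: group on one level of a MultiIndex instead of a column.
indexed = df.set_index(["Region", "Group"])
by_level = indexed.groupby(level="Region")["Sales"].sum()

print(by_both)
print(flat)
print(unsorted)
print(by_level)
```

Passing as_index=False is often convenient when the result will be merged or exported, since the keys stay available as regular columns.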

Example Use Case
import pandas as pd

# create data frame
data = {'Group': ['A', 'A', 'B', 'B', 'C', 'C'], 'Sales': [100, 150, 200, 250, 300, 350]}

df = pd.DataFrame(data)

# group by group label
grouped = df.groupby('Group')

# display group data frames
for group, df_group in grouped:
    print(group)
    print(df_group)

Output:

A
  Group  Sales
0     A    100
1     A    150
B
  Group  Sales
2     B    200
3     B    250
C
  Group  Sales
4     C    300
5     C    350
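Besides iterating, the groups can be pulled out as standalone data frames. The following is a minimal sketch, assuming the same sample data as the example above; the names group_b and frames are hypothetical.

```python
import pandas as pd

# Same sample data as the example above.
df = pd.DataFrame({
    "Group": ["A", "A", "B", "B", "C", "C"],
    "Sales": [100, 150, 200, 250, 300, 350],
})

grouped = df.groupby("Group")

# Fetch a single group by its key.
group_b = grouped.get_group("B")

# Collect every group into a dict of independent data frames,
# resetting each index so it starts again at 0.
frames = {key: frame.reset_index(drop=True) for key, frame in grouped}

print(group_b)
print(frames["C"])
```

A dict keyed by group label is one simple way to hand each split piece to separate downstream code; whether to reset the index depends on whether the original row positions still matter.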

Conclusion

The groupby() method is a powerful tool for splitting data frames into groups based on specific criteria. Combined with functions such as sum, mean, aggregate, and apply, it supports concise per-group data manipulation.