📜  pandas groupby and display specific columns - Python (1)

📅  Last modified: 2023-12-03 15:03:28.360000             🧑  Author: Mango

Pandas Groupby and Display Specific Columns - Python

Introduction

In data analysis, it is often necessary to group data by a certain attribute and then perform some operations on each group. The pandas groupby() method is an efficient way to do this. However, sometimes we only want to show specific columns of the grouped data.

In this tutorial, we will learn how to use the groupby() function in pandas to group data and then display only specific columns of the grouped data.

Required Libraries

  • pandas

Sample Data

We will use the following sample data for the demonstration:

| Country | Year | GDP   | Population |
|---------|------|-------|------------|
| USA     | 2010 | 14624 | 309        |
| USA     | 2011 | 14964 | 312        |
| USA     | 2012 | 15497 | 314        |
| China   | 2010 | 6070  | 1339       |
| China   | 2011 | 7319  | 1347       |
| China   | 2012 | 8560  | 1355       |

Pandas Groupby and Display Specific Columns

First, we need to import the pandas library and load our sample data into a pandas DataFrame:

import pandas as pd

data = {
    'Country': ['USA', 'USA', 'USA', 'China', 'China', 'China'],
    'Year': [2010, 2011, 2012, 2010, 2011, 2012],
    'GDP': [14624, 14964, 15497, 6070, 7319, 8560],
    'Population': [309, 312, 314, 1339, 1347, 1355]
}

df = pd.DataFrame(data)

Next, we can group the data by a certain attribute, such as 'Country', using the groupby() function:

grouped = df.groupby('Country')
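
Before selecting columns, it can help to look at what the grouped object contains. The sketch below rebuilds the sample data so that it runs on its own, then counts the groups, lists their keys, and prints each group in turn. The variable names are only illustrative.

```python
import pandas as pd

# Same sample data as above, repeated so this snippet is self-contained
df = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'China', 'China', 'China'],
    'Year': [2010, 2011, 2012, 2010, 2011, 2012],
    'GDP': [14624, 14964, 15497, 6070, 7319, 8560],
    'Population': [309, 312, 314, 1339, 1347, 1355]
})

grouped = df.groupby('Country')

# Number of distinct groups and their keys
print(grouped.ngroups)
print(sorted(grouped.groups))

# Iterating yields (key, sub-DataFrame) pairs, one per group
for country, group in grouped:
    print(country)
    print(group)
```

Each group is an ordinary DataFrame that keeps the row labels it had in df.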

This will group our data by the 'Country' attribute. Now, to display only specific columns of the grouped data, we can use the apply() function along with a lambda function. The lambda function will select the columns we want to display. For example, if we want to display only the 'Year' and 'GDP' columns of the grouped data, we can do the following:

result = grouped.apply(lambda x: x[['Year', 'GDP']])

The apply() function will apply the lambda function to each group and the result will be a new dataframe with only the 'Year' and 'GDP' columns from each group.
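
To make this step more concrete, here is a rough manual equivalent: loop over the groups, keep the two columns from each one, and stitch the pieces back together with pd.concat(). This is only a sketch of the idea, not how pandas implements apply() internally, and the names pieces and manual are made up for illustration.

```python
import pandas as pd

# Same sample data as above, repeated so this snippet is self-contained
df = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'China', 'China', 'China'],
    'Year': [2010, 2011, 2012, 2010, 2011, 2012],
    'GDP': [14624, 14964, 15497, 6070, 7319, 8560],
    'Population': [309, 312, 314, 1339, 1347, 1355]
})

# Select the two columns from every group, keyed by the group name
pieces = {}
for country, group in df.groupby('Country'):
    pieces[country] = group[['Year', 'GDP']]

# The dict keys become the outer index level, named 'Country'
manual = pd.concat(pieces, names=['Country'])
print(manual)
```

The inner index level of manual holds the row labels that each row had in df.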

To display the resulting dataframe, we can simply print it:

print(result)

With pandas 2.x, the output will be as follows (the inner index level holds each row's original label from df):

           Year    GDP
Country               
China   3  2010   6070
        4  2011   7319
        5  2012   8560
USA     0  2010  14624
        1  2011  14964
        2  2012  15497

This dataframe shows only the 'Year' and 'GDP' columns of the grouped data.
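
As an alternative to apply(), columns can also be selected directly on the grouped object with square brackets, which is often simpler. The sketch below assumes the same sample data. It shows one group restricted to 'Year' and 'GDP', then a per-country mean of 'GDP' and 'Population' as an example of an operation on selected columns. The variable names are illustrative.

```python
import pandas as pd

# Same sample data as above, repeated so this snippet is self-contained
df = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'China', 'China', 'China'],
    'Year': [2010, 2011, 2012, 2010, 2011, 2012],
    'GDP': [14624, 14964, 15497, 6070, 7319, 8560],
    'Population': [309, 312, 314, 1339, 1347, 1355]
})

# Restrict the grouped object to the columns of interest
selected = df.groupby('Country')[['Year', 'GDP']]

# Show a single group with only the selected columns
usa = selected.get_group('USA')
print(usa)

# Aggregate specific columns per group
means = df.groupby('Country')[['GDP', 'Population']].mean()
print(means)
```

Selecting the columns first also means that any later aggregation only touches those columns.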

Conclusion

The pandas groupby() method is a powerful tool for grouping data. In this tutorial, we learned how to use groupby() to group data and then display only specific columns of the grouped data. We used the apply() method with a lambda function to select the columns we wanted to display.