Python | Pandas DataFrame.transform

Last modified: 2023-12-03 15:19:15.230000    Author: Mango

Introduction

Pandas is a Python library for data manipulation and analysis. Its central data structure is the DataFrame, a two-dimensional, labeled table whose columns can hold different data types. In this article, we will discuss the transform method of the Pandas DataFrame.

DataFrame.transform

The transform method applies a function to a DataFrame and returns a new DataFrame with the same axis length as the original. By default (axis=0), the function is applied to each column in turn. Because the result must line up with the input, the function has to return one value per input element; a function that reduces a column to a single value, such as a sum, is rejected with a ValueError.
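As a rough illustration of how transform treats its input, the sketch below uses a small made-up DataFrame (the column names a and b and their values are assumptions chosen for this example, not part of the dataset used later). It applies one function per column, one function per row with axis=1, and then tries a reducing function to show that transform refuses output whose length differs from the input.

```python
import pandas as pd

# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 60.0]})

# axis=0 (default): the function receives each column as a Series.
centered = df.transform(lambda col: col - col.mean())
print(centered)

# axis=1: the function receives each row instead.
row_share = df.transform(lambda row: row / row.sum(), axis=1)
print(row_share)

# A reducing function returns one value per column, so the result no longer
# lines up with the original index and transform raises ValueError.
try:
    df.transform(lambda col: col.sum())
    reduction_rejected = False
except ValueError as exc:
    reduction_rejected = True
    print("transform rejected the reduction:", exc)
```

In each successful call the result has the same index and columns as df, which is what distinguishes transform from aggregating methods such as agg.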

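Syntax

DataFrame.transform(func, axis=0, *args, **kwargs)

Parameters

  • func: The function used to transform the data. It can be a callable, a string naming a function (for example 'sqrt'), a list-like of functions or function names, or a dict-like mapping axis labels to any of these.
  • axis: The axis along which the function is applied. 0 or 'index' (the default) applies the function to each column; 1 or 'columns' applies it to each row.
  • args and kwargs: Additional positional and keyword arguments passed on to func.

Example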

Let's consider an example of the following dataset:

import pandas as pd

data = {'country': ['Brazil', 'Russia', 'India', 'China', 'South Africa'],
        'population': [207847528, 144409278, 1339180127, 1387160730, 57398421],
        'area': [8515767, 17098242, 3287263, 9596961, 1221037]}

df = pd.DataFrame(data)
print(df)

Output:

        country  population      area
0        Brazil   207847528   8515767
1        Russia   144409278  17098242
2         India  1339180127   3287263
3         China  1387160730   9596961
4  South Africa    57398421   1221037

Now we can use the transform method to apply a different function to the population and area columns of the DataFrame.

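import numpy as np

# Base-10 logarithm of every value in the population column
df_population = df[['population']].transform(np.log10)
print(df_population)

# Square root of every value in the area column
df_area = df[['area']].transform(np.sqrt)
print(df_area)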

Output:

   population
0    8.317745
1    8.159595
2    9.126839
3    9.142127
4    7.758900

          area
0  2918.178713
1  4135.002056
2  1813.081079
3  3097.896222
4  1105.005430

In the above example, we used NumPy's log10 and sqrt functions to transform the population and area columns. Both functions work element-wise, so each call returns a new DataFrame with the same index and number of rows as the input, and the original df is left unchanged.
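The func parameter also accepts a dict, a string, or a list, as listed under Parameters. The sketch below reuses the article's population and area figures to show one plausible use of each form; the variable names per_column, by_name, and several are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "population": [207847528, 144409278, 1339180127, 1387160730, 57398421],
    "area": [8515767, 17098242, 3287263, 9596961, 1221037],
})

# Dict: map each column label to its own function and transform both in one call.
per_column = df.transform({"population": np.log10, "area": np.sqrt})

# String: a function name that pandas resolves for us (here it ends up as NumPy's sqrt).
by_name = df[["area"]].transform("sqrt")

# List: every function is applied to every selected column; the result
# carries two-level column labels of the form (column, function name).
several = df[["area"]].transform([np.sqrt, np.log10])

print(per_column)
print(several.columns.tolist())
```

The dict form is convenient when each column needs a different function, since it avoids building one intermediate DataFrame per column as in the example above.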

Conclusion

The transform method applies one or more functions to the data in a DataFrame and returns a new DataFrame with the same index as the original. This makes it a good fit for element-wise or column-wise operations such as scaling, taking logarithms, or computing square roots, where every input value should map to exactly one output value.