📜  pandas .nlargest - Python (1)

📅  Last modified: 2023-12-03 15:33:23.393000             🧑  Author: Mango

Pandas is a popular Python library for data manipulation and analysis. The .nlargest() method returns the n largest elements of a Series, or the n rows of a DataFrame with the largest values in the given column(s), ordered from largest to smallest. It gives the same result as sorting in descending order and taking the first n rows, and the pandas documentation describes it as the more performant option, which makes it useful for finding the top entries in a large dataset.

Syntax

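The .nlargest() method has slightly different signatures for a DataFrame and a Series:

df.nlargest(n, columns, keep='first')
s.nlargest(n=5, keep='first')

where:

  • n: the number of rows (DataFrame) or elements (Series) to return. For a Series, n defaults to 5.
  • columns (DataFrame only, required): the column label, or list of column labels, to order by. A Series has no columns argument.
  • keep (optional): how to handle tied values; one of 'first' (the default), 'last', or 'all'.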
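
The keep argument of .nlargest() controls what happens when several rows tie for a place in the result, and passing a list of columns lets later columns break ties in earlier ones. The following is a minimal sketch of both behaviours; the player, score, and bonus data are hypothetical and chosen only so that two rows tie on score.

```python
import pandas as pd

# Hypothetical data: Ben and Cat tie on "score"
df = pd.DataFrame({
    "player": ["Ann", "Ben", "Cat", "Dan"],
    "score": [95, 90, 90, 80],
    "bonus": [1, 2, 5, 9],
})

# keep='first' (the default) takes tied rows in order of appearance
first = df.nlargest(2, "score")
# keep='last' prefers the tied row that appears last
last = df.nlargest(2, "score", keep="last")
# keep='all' keeps every row tied for the final place, so more than n rows can come back
all_ties = df.nlargest(2, "score", keep="all")
# With a list of columns, ties on "score" are broken by "bonus"
by_two = df.nlargest(2, ["score", "bonus"])

print(first["player"].tolist())     # ['Ann', 'Ben']
print(last["player"].tolist())      # ['Ann', 'Cat']
print(all_ties["player"].tolist())  # ['Ann', 'Ben', 'Cat']
print(by_two["player"].tolist())    # ['Ann', 'Cat']
```

Note that keep='all' can return more than n rows, so code that relies on an exact result size should use 'first' or 'last'.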

Example 1 - Retrieving the N largest elements from a DataFrame

Suppose we have the following DataFrame:

import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
        'age': [25, 30, 35, 40, 45],
        'salary': [50000, 60000, 75000, 90000, 120000]}

df = pd.DataFrame(data)

To retrieve the top 2 salaries in the DataFrame, we can use the .nlargest() method:

top_salaries = df.nlargest(2, 'salary')
print(top_salaries)

Output:

      name  age  salary
4    Emily   45  120000
3    David   40   90000

Because we passed n=2 and ordered by the "salary" column, the result contains the two rows with the highest salaries, listed from highest to lowest. Each row keeps its original index label (4 and 3).
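
To see how .nlargest() relates to an ordinary sort, the sketch below rebuilds the same DataFrame and compares the result with sort_values() followed by head(). It assumes there are no tied salaries, as in this data; with ties the two approaches may order the tied rows differently.

```python
import pandas as pd

# Same data as in Example 1
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David", "Emily"],
    "age": [25, 30, 35, 40, 45],
    "salary": [50000, 60000, 75000, 90000, 120000],
})

top_nlargest = df.nlargest(2, "salary")
# Equivalent approach: sort the whole frame in descending order, then take the first 2 rows
top_sorted = df.sort_values("salary", ascending=False).head(2)

# Both results contain the same rows, in the same order, with the same index labels
print(top_nlargest.equals(top_sorted))  # True
```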

Example 2 - Retrieving the N largest elements from a Series

To retrieve the top 3 values in a Series, we can use the .nlargest() method as follows:

s = pd.Series([10, 20, 30, 40, 50])
top_values = s.nlargest(3)
print(top_values)

Output:

4    50
3    40
2    30
dtype: int64

Because we passed n=3, the three largest values in the Series are returned in descending order. The left-hand column shows each value's original index label.
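
Since the result keeps the original index, .nlargest() on a labelled Series also tells you which labels hold the largest values. A small sketch, using a hypothetical Series of monthly sales figures:

```python
import pandas as pd

# Hypothetical monthly sales, indexed by month name
sales = pd.Series({"Jan": 120, "Feb": 340, "Mar": 90, "Apr": 410, "May": 275})

top3 = sales.nlargest(3)
print(top3.index.tolist())  # ['Apr', 'Feb', 'May']
print(top3.tolist())        # [410, 340, 275]

# For a Series, n defaults to 5, so calling nlargest() with no arguments
# returns up to five values in descending order
print(len(sales.nlargest()))  # 5
```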

Conclusion

The .nlargest() method provides a convenient way to retrieve the N largest elements from a pandas DataFrame or Series. For a DataFrame you specify the number of rows and the column(s) to order by; for a Series you only specify the number of elements. In both cases the result is ordered from largest to smallest and keeps the original index labels.