📜  R mutate function (1)

📅  Last modified: 2023-12-03 14:46:51.843000             🧑  Author: Mango

Introducing the R mutate() Function

The mutate() function in R's dplyr package creates new variables or modifies existing variables in a dataframe. It is one of the core dplyr functions for data manipulation and transformation. This introduction covers its basic syntax, the kinds of expressions it accepts, and its use with grouped data.

The general syntax of mutate() is as follows:

new_dataframe <- mutate(dataframe, new_variable = expression)

Parameters:

  • dataframe: The dataframe on which you want to perform the mutation.
  • new_variable: The name of the new variable you want to create.
  • expression: The expression that defines the value of the new variable.

mutate() does not modify dataframe in place. It returns a new dataframe, here assigned to new_dataframe, that contains all existing variables plus new_variable. The expression is evaluated using the existing variables in the dataframe. If new_variable is the name of an existing variable, that variable is overwritten in the result. Several name = expression pairs can be given in one call, separated by commas.

For example, consider a dataframe df with variables x and y. We can create a new variable z using mutate() as follows:

library(dplyr)
new_df <- mutate(df, z = x + y)

This will create a new dataframe new_df with an additional variable z, which is the sum of variables x and y from the original dataframe.
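
As a minimal runnable sketch of the example above, the following builds a small dataframe and applies the same call. The values in df and the name doubled_df are made up for illustration. It also shows that the original df is left unchanged and that reusing an existing name overwrites that column in the result.

```r
library(dplyr)

# Hypothetical sample data; the values are illustrative only
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Add a new column z; df itself is not modified
new_df <- mutate(df, z = x + y)

# Reusing an existing name overwrites that column in the returned dataframe
doubled_df <- mutate(df, x = x * 2)

print(new_df)
print(names(df))  # df still has only the columns x and y
```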

Advanced Usage:

The expression passed to mutate() can be any R expression that returns either one value per row or a single value. These include:

  • Mathematical operations: +, -, *, /, ^
  • Functions: vectorized functions such as log() and sqrt(), and summary functions such as mean(), sum(), min(), max(), whose single result is repeated on every row.
  • Comparisons and logical operations: <, >, ==, !=, &, |, and conditional helpers such as ifelse().
  • String manipulation: stringr package functions like str_extract(), str_replace(), etc.
  • Group-wise mutations using group_by() in combination with mutate().
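
The non-grouped items in this list can be sketched in a single call. The dataframe products, its columns, and its values are hypothetical, and the example assumes the stringr package is installed alongside dplyr.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; column names and values are illustrative only
products <- data.frame(
  code  = c("A-100", "B-250", "C-075"),
  price = c(4, 9, 16),
  qty   = c(3, 0, 5),
  stringsAsFactors = FALSE
)

result <- products %>%
  mutate(
    revenue    = price * qty,                   # arithmetic on two columns
    root_price = sqrt(price),                   # vectorized function, one value per row
    avg_price  = mean(price),                   # summary value, repeated on every row
    in_stock   = qty > 0,                       # logical comparison
    status     = ifelse(in_stock, "available", "sold out"),
    prefix     = str_extract(code, "^[A-Z]"),   # first capital letter of the code
    code_clean = str_replace(code, "-", "")     # remove the first hyphen
  )

print(result)
```

Note that in_stock is defined before status, so status can refer to it within the same call.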

Here's an example demonstrating some of these advanced features:

library(dplyr)

new_df <- df %>%
  group_by(category) %>%
  mutate(total_sales = sum(sales),                # total sales within each category
         sales_percentage = sales / total_sales,  # row's share of its category (0 to 1)
         discount_price = ifelse(sales_percentage > 0.5, price * 0.9, price))  # 10% off

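In this example, we calculate the total sales for each category, each row's share of its category's sales (a proportion between 0 and 1, despite the name sales_percentage), and a price reduced by 10% for rows that account for more than half of their category's sales. Because total_sales is defined first, later expressions in the same mutate() call can refer to it. The result is still grouped by category; call ungroup() if later steps should not operate group-wise.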
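
To make the grouped example concrete, here is a small sketch with made-up data. The column names match the example, but the values are assumptions for illustration. The closing ungroup() is an addition here, not part of the example above; it removes the grouping from the result.

```r
library(dplyr)

# Hypothetical sample data; values are illustrative only
df <- data.frame(
  category = c("a", "a", "b", "b"),
  sales    = c(30, 10, 5, 5),
  price    = c(100, 100, 50, 50),
  stringsAsFactors = FALSE
)

new_df <- df %>%
  group_by(category) %>%
  mutate(
    total_sales      = sum(sales),           # computed once per category
    sales_percentage = sales / total_sales,  # proportion between 0 and 1
    discount_price   = ifelse(sales_percentage > 0.5, price * 0.9, price)
  ) %>%
  ungroup()  # drop the grouping so later steps are not group-wise

print(new_df)
```

With this data, only the first row exceeds half of its category's sales (30 out of 40), so it is the only row whose price is discounted. The two rows in category b each hold exactly half, which does not satisfy the strict > 0.5 comparison.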

Conclusion:

The mutate() function in the dplyr package creates new variables or modifies existing variables in a dataframe and returns the result as a new dataframe. Its expressions can combine mathematical operations, functions, logical operations, and string manipulation, and it works group-wise when combined with group_by(). This flexibility makes it a core function for data manipulation tasks in R.