Replacing Blanks with NA in an R Data Frame

Last modified: 2023-12-03 15:04:45.941000             Author: Mango

Introduction to Replacing Blanks with NA in R Data Frames

In R, a data frame is a two-dimensional, table-like structure in which rows represent observations (or cases) and columns represent variables (or attributes). Real data often contains missing entries, which in text columns typically show up as empty strings or whitespace. It is usually better to replace these blanks with NA (Not Available), R's standard marker for missing data.

Several base R tools can replace blanks with NA in a data frame, including gsub, ifelse, and replace. Here, we show a direct approach: logical indexing, which selects the blank cells and assigns NA to them.

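# Create a sample data frame with blanks.
# A numeric vector cannot hold a blank, so age and salary are stored as text,
# as they would be after importing a file with empty cells.
df <- data.frame(names  = c("John", "Mary", "", "Peter", "Rose"),
                 age    = c("23", "33", " ", "45", ""),
                 salary = c("2000", "3000", "", "5000", "6000"),
                 stringsAsFactors = FALSE)

# Replace empty strings and single spaces with NA
df[df == "" | df == " "] <- NA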

# View the updated data frame
df

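In the above code, we create a sample data frame df with blanks in some cells and then use logical indexing to replace them with NA. A comparison such as df == " " returns a logical matrix that is TRUE wherever a cell holds exactly that string, and assigning NA through that matrix overwrites only those cells. Note that "" (an empty string) and " " (a single space) are different values, so each one has to be matched explicitly.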
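Exact comparisons miss cells that contain several spaces or a tab. The following is a minimal sketch of one way to treat any whitespace-only string as blank, using a regular expression with grepl. The helper name blank_to_na and the data frames raw and clean are illustrative assumptions, not part of the original example.

```r
# Hypothetical helper: sets empty or whitespace-only strings to NA
# in every character column and leaves other columns untouched.
blank_to_na <- function(data) {
  data[] <- lapply(data, function(x) {
    if (is.character(x)) {
      x[grepl("^\\s*$", x)] <- NA  # matches "", " ", "   ", tabs, ...
    }
    x
  })
  data
}

# Sample data stored as text, as it might look after importing a file
raw <- data.frame(names  = c("John", "Mary", "", "Peter", "Rose"),
                  age    = c("23", "33", "   ", "45", ""),
                  salary = c("2000", "3000", "", "5000", "6000"),
                  stringsAsFactors = FALSE)

clean <- blank_to_na(raw)

# Convert columns that now hold only numbers (and NA) to numeric types
clean[] <- lapply(clean, type.convert, as.is = TRUE)
str(clean)
```

After the conversion step, age and salary can be used in arithmetic, while names remains a character column.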

Once blanks are stored as NA, operations such as filtering, sorting, and summarizing treat them as missing instead of silently handling them as ordinary values. Base R functions such as is.na, na.omit, and complete.cases then provide convenient ways to detect and remove missing data in a data frame.
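As a short illustration of the functions named above, the sketch below applies them to a small data frame that already contains NA values. The variable names are chosen for this example only.

```r
# Sample data frame in which blanks have already been replaced with NA
df <- data.frame(names  = c("John", "Mary", NA, "Peter", "Rose"),
                 age    = c(23, 33, NA, 45, NA),
                 salary = c(2000, 3000, NA, 5000, 6000))

na_per_column <- colSums(is.na(df))  # number of NA cells in each column
full_rows     <- complete.cases(df)  # TRUE for rows without any NA
df_complete   <- na.omit(df)         # keeps only the complete rows

# Many summary functions skip NA only when asked to
mean_age <- mean(df$age, na.rm = TRUE)
```

Without na.rm = TRUE, mean(df$age) would return NA, which makes the presence of missing values hard to overlook.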