Dplyr Case When in R

case_when is a function that lets you vectorise multiple if/else statements in R. It works like the SQL CASE WHEN statement. It takes a sequence of two-sided formulas as arguments: the left-hand side of each formula is the condition, and the right-hand side is the replacement value. The result is as long as the longest left-hand side and has the type of the first right-hand side. It is useful for creating new variables that depend on complex combinations of existing variables, for example inside dplyr's mutate().

Uploaded by loshude

case_when {dplyr} R Documentation

A general vectorised if
Description

This function allows you to vectorise multiple if and else if statements. It is an R equivalent
of the SQL CASE WHEN statement.
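
As a rough illustration of what "vectorising multiple if and else if statements" means, the sketch below writes the same three-way classification twice: once with nested base R ifelse() calls and once with case_when(). The vector x and the labels are made up for this example.

```r
library(dplyr)

# Hypothetical input, chosen only for illustration.
x <- c(-2, 0, 3)

# Nested ifelse(): each additional condition adds a level of nesting.
sign_nested <- ifelse(x < 0, "negative",
                      ifelse(x == 0, "zero", "positive"))

# case_when(): one formula per condition, checked from top to bottom.
sign_cases <- case_when(
  x < 0  ~ "negative",
  x == 0 ~ "zero",
  TRUE   ~ "positive"
)

# Both approaches should classify every element the same way.
identical(sign_nested, sign_cases)
```

The case_when() form tends to stay readable as the number of conditions grows, because the conditions are listed side by side rather than nested.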

Usage
case_when(...)

Arguments

... A sequence of two-sided formulas. The left hand side (LHS) determines which values match
this case. The right hand side (RHS) provides the replacement value.

The LHS must evaluate to a logical vector. Each logical vector can either have length 1 or a
common length. All RHSs must evaluate to the same type of vector.

These dots support explicit splicing: a list of formulas can be passed with !!!.
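
The type rule for the right-hand sides can be sketched as follows. The vector x and the labels are invented for this example. Using NA_character_ rather than a bare NA keeps every RHS a character vector, which is the safe choice across dplyr versions. The second call deliberately mixes a character RHS with a numeric one and is wrapped in tryCatch() so the failure can be inspected without stopping the script.

```r
library(dplyr)

# Hypothetical input, chosen only for illustration.
x <- c(1, 5, NA)

# Every RHS is a character vector, so the cases can be combined.
# NA_character_ is the missing value of type character.
labels <- case_when(
  is.na(x) ~ NA_character_,
  x > 3    ~ "high",
  TRUE     ~ "low"
)

# A character RHS and a numeric RHS have no common type, so this call
# is expected to signal an error; tryCatch() turns it into a flag.
mixed <- tryCatch(
  case_when(x > 3 ~ "high", TRUE ~ x),
  error = function(e) "error"
)
```

If a numeric value really has to appear next to text labels, converting it explicitly, for example with as.character(x), keeps the RHS types consistent.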

Value

A vector as long as the longest LHS, with the type (and attributes) of the first RHS. Inconsistent
lengths or types will generate an error.
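
A small sketch of the length rule, with made-up data: each RHS below has length 1 and is recycled to the length of the LHS. The example also relies on behaviour described in the dplyr documentation rather than on this page, namely that elements matched by no case come back as NA. A comparison that evaluates to NA does not count as a match.

```r
library(dplyr)

# Hypothetical input, chosen only for illustration.
x <- c(1, 10, 100, NA)

# No catch-all TRUE case: 100 matches neither condition, and the
# comparisons on NA evaluate to NA rather than TRUE.
size <- case_when(
  x < 5  ~ "small",
  x < 50 ~ "medium"
)

length(size)  # same length as x
is.na(size)   # TRUE where no case matched
```

Adding a final TRUE ~ "..." case is the usual way to give unmatched elements an explicit value instead of NA.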

Examples
library(dplyr)

x <- 1:50
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)

# Like an if statement, the arguments are evaluated in order, so you must
# proceed from the most specific to the most general. This won't work:
case_when(
  TRUE ~ as.character(x),
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  x %% 35 == 0 ~ "fizz buzz"
)

# case_when is particularly useful inside mutate when you want to
# create a new variable that relies on a complex combination of existing
# variables
starwars %>%
  select(name:mass, gender, species) %>%
  mutate(
    type = case_when(
      height > 200 | mass > 200 ~ "large",
      species == "Droid" ~ "robot",
      TRUE ~ "other"
    )
  )

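# Dots support splicing. The formulas are listed from the most specific
# to the most general so that the TRUE case acts only as a fallback:
patterns <- list(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)
case_when(!!!patterns)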
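
The ordering rule noted in the examples can be checked on a short vector. This is only a sketch with made-up values: the first call puts the catch-all TRUE case first, so it matches every element and the later cases are never reached; the second call goes from the most specific case to the most general.

```r
library(dplyr)

# Hypothetical input, chosen only for illustration.
x <- c(5, 7, 35)

# Catch-all first: every element is matched by TRUE immediately.
general_first <- case_when(
  TRUE ~ as.character(x),
  x %% 5 == 0 ~ "fizz"
)

# Most specific first: the TRUE case is only a fallback.
specific_first <- case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0  ~ "fizz",
  x %% 7 == 0  ~ "buzz",
  TRUE         ~ as.character(x)
)
```

Comparing general_first with specific_first shows that only the second ordering produces the fizz/buzz labels.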
