2.6 The combine function: c()

The combine function, as its name implies, combines several values into a single vector. It is especially useful with the %in% operator to filter for multiple values. Remember that the cause column of gbd_short contains three different causes:

gbd_short$cause %>% unique()
## [1] "Communicable diseases"     "Injuries"                 
## [3] "Non-communicable diseases"

Say we wanted to filter for communicable and non-communicable diseases.[7] We could use the OR operator | like this:

gbd_short %>% 
  # also filtering for a single year to keep the result concise
  filter(year == 1990) %>% 
  filter(cause == "Communicable diseases" | cause == "Non-communicable diseases")
## # A tibble: 2 x 3
##    year cause                     deaths_millions
##   <dbl> <chr>                               <dbl>
## 1  1990 Communicable diseases                15.4
## 2  1990 Non-communicable diseases            26.7

But that means we have to type cause twice (and more often if we had other values we wanted to include). This is where the %in% operator, together with the c() function, comes in handy:

gbd_short %>% 
  filter(year == 1990) %>% 
  filter(cause %in% c("Communicable diseases", "Non-communicable diseases"))
## # A tibble: 2 x 3
##    year cause                     deaths_millions
##   <dbl> <chr>                               <dbl>
## 1  1990 Communicable diseases                15.4
## 2  1990 Non-communicable diseases            26.7
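To see what %in% does inside filter(), it can help to try it on a plain vector first. The following is a minimal sketch in base R that needs no packages; the object names causes, causes_to_keep and keep are made up for illustration and do not appear elsewhere in this chapter:

```r
# The three cause names shown earlier, combined into one character vector
causes <- c("Communicable diseases", "Injuries", "Non-communicable diseases")

# The values we want to keep, stored in a second vector (hypothetical name)
causes_to_keep <- c("Communicable diseases", "Non-communicable diseases")

# %in% checks each element on the left against all values on the right
# and returns one TRUE or FALSE per element of the left-hand vector
keep <- causes %in% causes_to_keep
keep          # TRUE FALSE TRUE

# The logical vector can then be used to subset the original vector
causes[keep]
```

Storing the values in a named vector like this means the filter line could be written as filter(cause %in% causes_to_keep), which is easier to read and to update when the list of values grows.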

  [7] In this example, it would be easier to use the "not equal" operator, filter(cause != "Injuries"), but imagine your column had more than just three different values in it.