8.5 Recode the data

It is really important that variables are correctly coded for all plotting and analysis functions. Using the data dictionary, we will convert the categorical variables to factors.

In the section below, we convert variables that R has read in as numbers to factors (e.g., sex %>% factor() %>%), then use the forcats package to recode the factor levels. Modern databases (REDCap, for example) can give you an R script to recode your specific dataset, so you don’t always have to recode your factors from numbers to names manually. You will still need to recode variables during the exploration and analysis stages, though, so it is important to follow what is happening here.
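The two steps described above (convert to a factor, then rename the levels) can be sketched on a small vector. This is a minimal illustration only: the vector and the 0/1-to-label mapping are made up for the example and should be replaced by whatever your own data dictionary specifies.

```r
library(forcats)

# Hypothetical numeric codes, not taken from the real dataset
sex <- c(0, 1, 1, 0, 1)

# Step 1: convert to a factor; the levels become the strings "0" and "1"
sex_factor <- factor(sex)
levels(sex_factor)

# Step 2: rename the levels; new name on the left, old level on the right
# (the Female/Male mapping here is an assumption for illustration)
sex_factor <- fct_recode(sex_factor,
  "Female" = "0",
  "Male"   = "1"
)
levels(sex_factor)

# Check the recode by counting each level
table(sex_factor)
```

Note that fct_recode() matches on the old level as a character string, which is why the codes are quoted.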

We have formatted the recode of the sex variable across multiple lines to make it easier for you to see each step. We have condensed the other recodes (e.g., ulcer.factor = factor(ulcer) %>%), but they do exactly the same thing as the first one.
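As a rough sketch of how the multi-line and condensed layouts compare, the example below applies both inside a single mutate() call. The data frame toy, its values, and the level labels are hypothetical and stand in for the real dataset and its data dictionary.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data; codes and labels are assumptions for illustration
toy <- data.frame(
  sex   = c(0, 1, 1, 0),
  ulcer = c(1, 0, 0, 1)
)

toy <- toy %>%
  mutate(
    sex.factor =                  # make a new variable
      sex %>%                     # from the existing variable
      factor() %>%                # convert to a factor
      fct_recode(                 # rename the levels
        "Female" = "0",           # new on the left, old on the right
        "Male"   = "1"),
    # Condensed layout: the same two steps on fewer lines
    ulcer.factor = factor(ulcer) %>%
      fct_recode("Absent" = "0", "Present" = "1")
  )

# Cross-check each original code against its new label
count(toy, sex, sex.factor)
count(toy, ulcer, ulcer.factor)
```

Keeping the original numeric columns alongside the new .factor columns makes this kind of cross-check possible before the originals are dropped or ignored.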