8.1 Factors

We said earlier that continuous data can be measured and categorical data can be counted, which is useful to remember. Categorical data can be stored as a:

  • Factor
    • a fixed set of names/strings or numbers
    • these may have an inherent order (1st, 2nd, 3rd), making an ordinal factor
    • or may not (female, male)
  • Character
    • sequences of letters, numbers, or symbols
  • Logical
    • containing only TRUE or FALSE
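
The three types above can be sketched in base R as follows. This is a minimal illustration only: the variable names (`sex`, `stage`, `note`, `died`) and their values are made up for the example and do not come from any dataset used in this book.

```r
# Hypothetical example values, invented for illustration

# Factor without an inherent order: a fixed set of names
sex <- factor(c("female", "male", "female"))

# Ordinal factor: the levels are given explicitly, in their natural order
stage <- factor(c("2nd", "1st", "3rd", "1st"),
                levels  = c("1st", "2nd", "3rd"),
                ordered = TRUE)

# Character: free text, no fixed set of values
note <- c("no issues", "readmitted on day 3", "lost to follow-up")

# Logical: only TRUE or FALSE
died <- c(TRUE, FALSE, FALSE)

levels(sex)        # the fixed set of values, sorted alphabetically by default
is.ordered(sex)    # FALSE
is.ordered(stage)  # TRUE
stage < "3rd"      # comparisons are meaningful only for ordered factors
table(stage)       # categorical data can be counted
class(note)        # "character"
sum(died)          # TRUE counts as 1, so this gives the number of deaths
```

Setting `levels` explicitly matters for ordinal data: without it, R sorts the levels alphabetically, which may not match the intended order.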

Health data is awash with factors, whether outcomes like death, recurrence, or readmission, or predictors like cancer stage, deprivation quintile, or smoking status. It is therefore essential to be comfortable manipulating factors and dealing with categorical outcomes.