8.1 Factors

We said earlier that continuous data can be measured and categorical data can be counted, which is useful to remember. Categorical data can be stored as a:

  • Factor
    • a fixed set of names/strings or numbers
    • these may have an inherent order (1st, 2nd, 3rd), giving an ordinal factor
    • or may not (female, male)
  • Character
    • sequences of letters, numbers, and symbols
  • Logical
    • containing only TRUE or FALSE
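
The three types listed above can be sketched in a few lines of R. This is a minimal illustration only: the variable names (sex, stage, notes, died) and their values are made up for the example and do not come from any dataset.

```r
# Hypothetical example values, invented for illustration only

# Factor with no inherent order
sex <- factor(c("female", "male", "female"))

# Ordinal factor: the levels are given explicitly, in order
stage <- factor(c("2nd", "1st", "3rd"),
                levels = c("1st", "2nd", "3rd"),
                ordered = TRUE)

# Character: free text, not restricted to a fixed set of values
notes <- c("seen in clinic", "ID 42b", "n/a")

# Logical: only TRUE or FALSE
died <- c(TRUE, FALSE, FALSE)

class(sex)     # "factor"
levels(sex)    # "female" "male"
class(stage)   # "ordered" "factor"
stage < "3rd"  # TRUE TRUE FALSE (comparisons are allowed on ordered factors)
class(notes)   # "character"
class(died)    # "logical"

# Categorical data can be counted
table(sex)
```

Note that when levels are not specified, factor() sorts them alphabetically, which is why the ordinal factor sets them by hand.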

Health data is awash with factors, whether outcomes like death, recurrence, or readmission, or predictors like cancer stage, deprivation quintile, or smoking status (yes/no). It is therefore essential to be comfortable manipulating factors and dealing with categorical outcomes.