13.3 Demographics table

First, let’s look at associations between our explanatory variable of interest (exposure) and other explanatory variables.

library(tidyverse)
library(finalfit)

# Specify explanatory variables of interest
explanatory <- c("age", "sex.factor", 
                "extent.factor", "obstruct.factor", 
                "nodes")

colon_s %>% 
  summary_factorlist("differ.factor", explanatory,
                     p=TRUE, na_include=TRUE)
TABLE 13.1: Exporting ‘table 1’: Tumour differentiation by patient and disease factors.

| label | levels | Well | Moderate | Poor | p |
|---|---|---|---|---|---|
| Age (years) | Mean (SD) | 60.2 (12.8) | 59.9 (11.7) | 59.0 (12.8) | 0.644 |
| Sex | Female | 51 (54.8) | 314 (47.4) | 73 (48.7) | 0.400 |
| | Male | 42 (45.2) | 349 (52.6) | 77 (51.3) | |
| Extent of spread | Submucosa | 5 (5.4) | 12 (1.8) | 3 (2.0) | 0.081 |
| | Muscle | 12 (12.9) | 78 (11.8) | 12 (8.0) | |
| | Serosa | 76 (81.7) | 542 (81.7) | 127 (84.7) | |
| | Adjacent structures | 0 (0.0) | 31 (4.7) | 8 (5.3) | |
| Obstruction | No | 69 (74.2) | 531 (80.1) | 114 (76.0) | 0.655 |
| | Yes | 19 (20.4) | 122 (18.4) | 31 (20.7) | |
| | (Missing) | 5 (5.4) | 10 (1.5) | 5 (3.3) | |
| nodes | Mean (SD) | 2.7 (2.2) | 3.6 (3.4) | 4.7 (4.4) | <0.001 |

Note that we include missing data in this table (see Chapter 11).

Also note that nodes has not been labelled properly.

In addition, small numbers in some variables generate chisq.test() warnings (the expected count is fewer than 5 in at least one cell).
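To see where such a warning comes from, we can rebuild one of the cross-tabulations by hand and inspect the expected counts. The sketch below is an illustration only: it uses base R and re-enters the extent of spread counts printed in the table above, rather than reproducing what summary_factorlist() does internally. The object names (extent_by_differ, expected_counts, low_cells) are our own.

```r
# Counts of extent of spread (rows) by differentiation (columns),
# typed in from the table above.
extent_by_differ <- matrix(
  c( 5,  12,   3,
    12,  78,  12,
    76, 542, 127,
     0,  31,   8),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    extent = c("Submucosa", "Muscle", "Serosa", "Adjacent structures"),
    differ = c("Well", "Moderate", "Poor")
  )
)

# chisq.test() warns for this table; suppress the warning here so that
# we can look at the expected counts ourselves.
test_result <- suppressWarnings(chisq.test(extent_by_differ))
expected_counts <- test_result$expected

# Expected count = row total * column total / grand total
round(expected_counts, 1)

# Which cells have an expected count below 5?
low_cells <- which(expected_counts < 5, arr.ind = TRUE)
low_cells
```

The cells flagged are those involving the rarest categories. If this is a concern, an exact test is one option; we believe summary_factorlist() accepts p_cat = "fisher" for this purpose, but check the documentation for your installed version of finalfit.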

Now generate a final table.[16]

colon_s <- colon_s %>% 
  mutate(
    nodes = ff_label(nodes, "Lymph nodes involved")
    )

table1 <- colon_s %>%  
  summary_factorlist("differ.factor", explanatory, 
                     p=TRUE, na_include=TRUE, 
                     add_dependent_label=TRUE,
                     dependent_label_prefix = "Exposure: "
                     )
table1
TABLE 13.2: Exporting table 1: Adjusting labels and output.

| Exposure: Differentiation | | Well | Moderate | Poor | p |
|---|---|---|---|---|---|
| Age (years) | Mean (SD) | 60.2 (12.8) | 59.9 (11.7) | 59.0 (12.8) | 0.644 |
| Sex | Female | 51 (54.8) | 314 (47.4) | 73 (48.7) | 0.400 |
| | Male | 42 (45.2) | 349 (52.6) | 77 (51.3) | |
| Extent of spread | Submucosa | 5 (5.4) | 12 (1.8) | 3 (2.0) | 0.081 |
| | Muscle | 12 (12.9) | 78 (11.8) | 12 (8.0) | |
| | Serosa | 76 (81.7) | 542 (81.7) | 127 (84.7) | |
| | Adjacent structures | 0 (0.0) | 31 (4.7) | 8 (5.3) | |
| Obstruction | No | 69 (74.2) | 531 (80.1) | 114 (76.0) | 0.655 |
| | Yes | 19 (20.4) | 122 (18.4) | 31 (20.7) | |
| | (Missing) | 5 (5.4) | 10 (1.5) | 5 (3.3) | |
| Lymph nodes involved | Mean (SD) | 2.7 (2.2) | 3.6 (3.4) | 4.7 (4.4) | <0.001 |
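The relabelling step above changes only how the variable is displayed, not its values. As a minimal sketch, and assuming that ff_label() stores its text in the variable's "label" attribute, the same idea can be shown in base R on a small made-up data frame (toy and its values are hypothetical and stand in for colon_s):

```r
# A toy data frame standing in for colon_s (made-up values)
toy <- data.frame(nodes = c(1, 4, 2, 7))

# Before labelling, there is no "label" attribute
attr(toy$nodes, "label")   # NULL

# Assumed base R equivalent of ff_label(nodes, "Lymph nodes involved")
attr(toy$nodes, "label") <- "Lymph nodes involved"

# The label is now attached to the column...
attr(toy$nodes, "label")

# ...and the underlying values are untouched
mean(toy$nodes)
```

Checking attr(colon_s$nodes, "label") in the same way is a quick test of whether a variable will appear in the table with a readable name or with its raw column name.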

  16. The finalfit functions used here, summary_factorlist() and finalfit(), were introduced in Part II - Data Analysis. We will therefore not describe their arguments again; we use them here to demonstrate R’s ability to export fully formatted output documents.