8.13 Chi-squared / Fisher’s exact test using finalfit

Hypothesis tests are easier to run with the summary_factorlist() function from the finalfit package. Setting p = TRUE in summary_factorlist() adds a hypothesis test to each included comparison. For categorical variables, the default is a chi-squared test, with a continuity correction applied to two-by-two tables.

library(finalfit)
meldata %>% 
  summary_factorlist(dependent   = "status_dss", 
                     explanatory = "ulcer.factor",
                     p = TRUE)

TABLE 8.1: Two-by-two table with chi-squared test using finalfit: Outcome after surgery for melanoma by tumour ulceration status.
label             levels   Alive      Died melanoma  p
Ulcerated tumour  Absent   99 (66.9)  16 (28.1)      <0.001
                  Present  49 (33.1)  41 (71.9)
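
As a cross-check, the same two-by-two test can be sketched in base R by typing in the counts from the table above. The object names (ulcer_counts, res_corrected, res_uncorrected) are illustrative and not part of finalfit. chisq.test() applies the continuity correction to two-by-two tables by default (correct = TRUE).

```r
# Counts copied by hand from the table above
# (rows: ulceration status, columns: outcome)
ulcer_counts <- matrix(
  c(99, 16,
    49, 41),
  nrow = 2, byrow = TRUE,
  dimnames = list(ulcer  = c("Absent", "Present"),
                  status = c("Alive", "Died melanoma"))
)

# Default behaviour: chi-squared test with continuity correction (2x2 only)
res_corrected <- chisq.test(ulcer_counts)

# Same test without the correction, for comparison
res_uncorrected <- chisq.test(ulcer_counts, correct = FALSE)

res_corrected$p.value       # far below 0.001, in line with the table
res_uncorrected$statistic   # larger than the corrected statistic
```

The correction shrinks the test statistic slightly, so the corrected p-value is always the more conservative of the two.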

Adding further variables:

meldata %>% 
  summary_factorlist(dependent = "status_dss", 
                     explanatory = 
                       c("ulcer.factor", "age.factor", 
                         "sex.factor", "thickness"),
                     p = TRUE)
## Warning in chisq.test(age.factor, status_dss): Chi-squared approximation may be
## incorrect
TABLE 8.2: Multiple variables by outcome with hypothesis tests: Outcome after surgery for melanoma by patient and disease factors (chi-squared test).
label             levels     Alive      Died melanoma  p
Ulcerated tumour  Absent     99 (66.9)  16 (28.1)      <0.001
                  Present    49 (33.1)  41 (71.9)
Age (years)       ≤20        6 (4.1)    3 (5.3)        0.568
                  21 to 40   30 (20.3)  7 (12.3)
                  41 to 60   66 (44.6)  26 (45.6)
                  >60        46 (31.1)  21 (36.8)
Sex               Female     98 (66.2)  28 (49.1)      0.036
                  Male       50 (33.8)  29 (50.9)
thickness         Mean (SD)  2.4 (2.5)  4.3 (3.6)      <0.001
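
The warning printed above comes from base R's chisq.test(), which raises it when any expected cell count is below 5. A minimal sketch, using the age counts typed in from the table (object names are illustrative), shows which cell is responsible:

```r
# Age group by outcome, copied by hand from the table above
age_counts <- matrix(
  c( 6,  3,
    30,  7,
    66, 26,
    46, 21),
  nrow = 4, byrow = TRUE,
  dimnames = list(age    = c("<=20", "21 to 40", "41 to 60", ">60"),
                  status = c("Alive", "Died melanoma"))
)

# The warning is suppressed here only because the expected counts
# are inspected directly on the next lines
res_age <- suppressWarnings(chisq.test(age_counts))

round(res_age$expected, 1)   # expected counts under independence
any(res_age$expected < 5)    # TRUE: the <=20 / Died melanoma cell is about 2.5
```

When expected counts are this small, the chi-squared approximation is less reliable, which is one common reason to prefer an exact test.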

Note that for continuous explanatory variables, an F-test (ANOVA) is performed by default. If variables are considered non-parametric (cont = "median"), then a Kruskal-Wallis test is used.
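
The two tests named above can also be run directly in base R. This is a sketch under stated assumptions: it assumes meldata is derived from boot::melanoma, with status == 1 recoded as "Died melanoma" and every other patient as "Alive"; the names mel, anova_p and kw_p are illustrative.

```r
# Assumption: meldata comes from boot::melanoma, where status == 1
# means death from melanoma; all other patients are treated as alive.
mel <- boot::melanoma
mel$status_dss <- factor(ifelse(mel$status == 1, "Died melanoma", "Alive"))

# Parametric comparison: F-test from a one-way ANOVA
anova_fit <- aov(thickness ~ status_dss, data = mel)
anova_p   <- summary(anova_fit)[[1]][["Pr(>F)"]][1]

# Non-parametric comparison: Kruskal-Wallis rank sum test
kw_p <- kruskal.test(thickness ~ status_dss, data = mel)$p.value

c(anova = anova_p, kruskal = kw_p)
```

With only two outcome groups, the F-test is equivalent to an equal-variance two-sample t-test, and the Kruskal-Wallis test to a Wilcoxon rank sum test without continuity correction.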

To use Fisher’s exact test for categorical variables instead, set p_cat = "fisher":

meldata %>% 
  summary_factorlist(dependent = "status_dss", 
                     explanatory = 
                       c("ulcer.factor", "age.factor", 
                         "sex.factor", "thickness"),
                     p = TRUE,
                     p_cat = "fisher")
TABLE 8.3: Multiple variables by outcome with hypothesis tests: Outcome after surgery for melanoma by patient and disease factors (Fisher’s exact test).
label             levels     Alive      Died melanoma  p
Ulcerated tumour  Absent     99 (66.9)  16 (28.1)      <0.001
                  Present    49 (33.1)  41 (71.9)
Age (years)       ≤20        6 (4.1)    3 (5.3)        0.544
                  21 to 40   30 (20.3)  7 (12.3)
                  41 to 60   66 (44.6)  26 (45.6)
                  >60        46 (31.1)  21 (36.8)
Sex               Female     98 (66.2)  28 (49.1)      0.026
                  Male       50 (33.8)  29 (50.9)
thickness         Mean (SD)  2.4 (2.5)  4.3 (3.6)      <0.001
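
To see how the two categorical tests differ on the same data, the sex-by-outcome counts from the table above can be passed to base R's chisq.test() and fisher.test(). This is a sketch with hand-entered counts; the object names are illustrative.

```r
# Sex by outcome, copied by hand from the table above
sex_counts <- matrix(
  c(98, 28,
    50, 29),
  nrow = 2, byrow = TRUE,
  dimnames = list(sex    = c("Female", "Male"),
                  status = c("Alive", "Died melanoma"))
)

chisq_p  <- chisq.test(sex_counts)$p.value    # continuity-corrected chi-squared
fisher_p <- fisher.test(sex_counts)$p.value   # Fisher's exact test

c(chi_squared = chisq_p, fisher = fisher_p)
```

Both p-values fall below 0.05 here, so the choice of test does not change the conclusion for this variable.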

Further options can be included:

meldata %>% 
  summary_factorlist(dependent = "status_dss", 
                     explanatory = 
                       c("ulcer.factor", "age.factor", 
                         "sex.factor", "thickness"),
                     p = TRUE,
                     p_cat = "fisher",
                     digits = 
                       c(1, 1, 4, 2), # 1: mean/median, 2: SD/IQR,
                                      # 3: p-value, 4: count percentage
                     na_include = TRUE, # include missing in results/test
                     add_dependent_label = TRUE
  )
TABLE 8.4: Multiple variables by outcome with hypothesis tests: Options including missing data, rounding, and labels.
Dependent: Status             Alive       Died melanoma  p
Ulcerated tumour   Absent     99 (66.89)  16 (28.07)     <0.0001
                   Present    49 (33.11)  41 (71.93)
Age (years)        ≤20        6 (4.05)    3 (5.26)       0.5437
                   21 to 40   30 (20.27)  7 (12.28)
                   41 to 60   66 (44.59)  26 (45.61)
                   >60        46 (31.08)  21 (36.84)
Sex                Female     98 (66.22)  28 (49.12)     0.0263
                   Male       50 (33.78)  29 (50.88)
thickness          Mean (SD)  2.4 (2.5)   4.3 (3.6)      <0.0001
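
The percentages in these tables are column percentages: each count is divided by the total for its outcome group. A short base R sketch (object names are illustrative) reproduces the ulceration cells at two decimal places, matching the fourth digits value used above:

```r
# Ulceration by outcome, copied by hand from the table above
ulcer_counts <- matrix(
  c(99L, 16L,
    49L, 41L),
  nrow = 2, byrow = TRUE,
  dimnames = list(ulcer  = c("Absent", "Present"),
                  status = c("Alive", "Died melanoma"))
)

# margin = 2 divides each cell by its column (outcome group) total
col_pct <- 100 * prop.table(ulcer_counts, margin = 2)

# Format each cell as "count (percentage)" with two decimal places
cell_text <- matrix(
  sprintf("%d (%.2f)", ulcer_counts, col_pct),
  nrow = 2, dimnames = dimnames(ulcer_counts)
)
cell_text
```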