9.7 Fitting logistic regression models with finalfit

Our preference in model fitting is now to use our own finalfit package. It gets us to our results more quickly and easily, and produces our final model tables, which go directly into manuscripts for publication (we hope).

The approach is the same as in linear regression. If the outcome variable is correctly specified as a factor, the finalfit() function will run a logistic regression model directly.

library(finalfit)
dependent <- "mort_5yr"
explanatory <- "ulcer.factor"
melanoma %>% 
  finalfit(dependent, explanatory, metrics = TRUE)
TABLE 9.2: Univariable logistic regression: 5-year survival from malignant melanoma by tumour ulceration (fit 1).

Dependent: 5-year survival            No           Yes        OR (univariable)             OR (multivariable)
Ulcerated tumour   Absent     105 (91.3)     10 (8.7)   -                            -
                   Present     55 (61.1)    35 (38.9)   6.68 (3.18-15.18, p<0.001)   6.68 (3.18-15.18, p<0.001)

TABLE 9.3: Model metrics: 5-year survival from malignant melanoma by tumour ulceration (fit 1).

Number in dataframe = 205, Number in model = 205, Missing = 0, AIC = 192.2, C-statistic = 0.717, H&L = Chi-sq(8) 0.00 (p=1.000)
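
Under the hood, finalfit() fits a standard generalised linear model with a binomial family for a factor outcome. As a check, here is a minimal base R sketch, assuming the melanoma data with the derived mort_5yr and ulcer.factor variables from earlier in the chapter; it reproduces the odds ratio in Table 9.2:

# Equivalent base R model
fit1 <- glm(mort_5yr ~ ulcer.factor, data = melanoma, family = binomial)

# Exponentiate the log-odds coefficients to get odds ratios and 95% CIs
exp(coef(fit1))
exp(confint(fit1))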

9.7.1 Criterion-based model fitting

Passing metrics = TRUE to finalfit() gives us a useful list of model fitting parameters.

We recommend looking at three metrics:

  • Akaike information criterion (AIC), which should be minimised;
  • C-statistic (area under the receiver operating characteristic curve), which should be maximised;
  • Hosmer–Lemeshow test, which should be non-significant.

AIC

The AIC has been previously described (Section 7.3.3). It provides a measure of model goodness-of-fit, that is, how well the model fits the available data. It includes a penalty for each additional variable, so model selection by AIC is somewhat robust against over-fitting (when the model starts to describe noise rather than the underlying relationship).
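
For a fitted glm object, the AIC can be extracted directly and candidate models compared. A short sketch, assuming the fit1 object from the glm() call above; fit2, which adds age, is purely illustrative:

# AIC of the single model; should match the 192.2 in the metrics table
AIC(fit1)

# Comparing candidate models: prefer the one with the lower AIC
fit2 <- glm(mort_5yr ~ ulcer.factor + age, data = melanoma, family = binomial)
AIC(fit1, fit2)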

C-statistic

The c-statistic, or area under the receiver operating characteristic (ROC) curve, provides a measure of model ‘discrimination’. It runs from 0.5 to 1.0, with 0.5 being no better than chance and 1.0 being perfect discrimination. What the number actually represents can be thought of like this. Take our example of death from melanoma. If you take a random patient who died and a random patient who did not die, the c-statistic is the probability that the model assigns a higher predicted risk of death to the patient who died. In our example above, the model should get that right about 72% of the time.
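
This pairwise definition can be computed directly. A sketch, again assuming fit1 from above: for every pair of one patient who died and one who did not, we check whether the model gave the patient who died the higher predicted risk (ties count as half):

pred <- predict(fit1, type = "response")  # predicted probability of death
died <- fit1$y == 1                       # observed outcome as TRUE/FALSE

# Difference in predicted risk for every (died, survived) pair of patients
comp <- outer(pred[died], pred[!died], "-")

# Proportion of concordant pairs = c-statistic (about 0.717 here)
mean((comp > 0) + 0.5 * (comp == 0))

In practice, packages such as pROC compute the same quantity as the area under the ROC curve.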

Hosmer–Lemeshow test

If you are interested in using your model for prediction, it is important that it is calibrated correctly. Using our example, calibration means that the model accurately predicts the risk of death from melanoma when that risk is low and also when it is high; the model should work well across the whole range of predicted probabilities. The Hosmer–Lemeshow test assesses this. By default, it compares observed with predicted deaths in deciles of predicted risk. If the model is similarly well (or badly) calibrated at low and high probabilities, the null hypothesis of no difference between observed and predicted outcomes is not rejected, meaning you get a non-significant p-value.
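
One common implementation is hoslem.test() from the ResourceSelection package. A sketch, assuming fit1 from above; g = 10 requests the default deciles of risk:

library(ResourceSelection)

# Observed outcomes (0/1) versus fitted probabilities in 10 risk groups
hoslem.test(fit1$y, fitted(fit1), g = 10)
# A non-significant p-value, as in the metrics table above,
# gives no evidence of miscalibration

Note that with a single binary predictor there are only two distinct predicted probabilities, each equal to the observed proportion in its group, which is why the metrics table reports Chi-sq(8) 0.00 (p=1.000).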