Chapter 15 Encryption

Encryption matters, and it is not just for spies and philanderers.
Glenn Greenwald

Health data is precious and often sensitive. Datasets may contain patient-identifiable information, some of it clearly disclosive, such as a patient’s date of birth, post/zip code, or social security number.

Other datasets may have been processed to remove the most obviously confidential information. These still require great care, as the data is usually only ‘pseudonymised’. The record of an individual patient may still be disclosive when considered as a whole, for example because the patient had a particularly rare diagnosis. Or the data may be combined with other datasets so that, in combination, individual patients can be identified.

The governance around safe data handling is one of the greatest challenges facing health data scientists today. It needs to be taken very seriously, and robust practices must be developed to maintain public confidence.