15.1 Safe practice

Storing sensitive information as raw values leaves the data vulnerable to confidentiality breaches. This is true even when you are working in a ‘safe’ environment, such as a secure server.

It is best to simply remove as much confidential information from records whenever possible. If the data is not present, then it cannot be compromised.

This might not be a good idea if the data might need to be linked back to an individual at some unspecified point in the future. This may be a problem if, for example, auditors of a clinical trial need to re-identify an individual from the trial data. A study ID can be used, but that still requires the confidential data to be stored and available in a lookup table in another file.

This chapter is not a replacement for an information governance course. These are essential and the reader should follow their institution’s guidelines on this. The chapter does introduce a useful R package and encryption functions that you may need to incorporate into your data analysis workflow.