15.6 Encrypt columns of data

Once the keys are created, one or more columns of a data frame or tibble can be encrypted using the public key. Because the RSA encryption step includes random padding, it produces a different output every time, even when the same information is encrypted more than once. It is therefore not possible to match two encrypted values.
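The point about non-matching outputs can be checked with a minimal sketch. It assumes the openssl package, which is not named in this section, and uses a throwaway in-memory key pair and a made-up postcode rather than the key files created earlier.

```r
library(openssl)

# Throwaway key pair for illustration only (assumption: not the
# key files generated earlier in the chapter)
key = rsa_keygen(bits = 2048)
pubkey = key$pubkey

# Made-up value standing in for a sensitive field
msg = charToRaw("AB1 2CD")

# Encrypt the same value twice with the same public key
c1 = rsa_encrypt(msg, pubkey)
c2 = rsa_encrypt(msg, pubkey)

# The two ciphertexts should differ because the padding is random
identical(c1, c2)

# Either ciphertext decrypts to the original value with the private key
rawToChar(rsa_decrypt(c1, key))
rawToChar(rsa_decrypt(c2, key))
```

Since equal inputs do not give equal ciphertexts, an encrypted column cannot be used as a join or matching key. Any linkage has to be done before encryption or after decryption.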

The encrypted values cannot be decrypted without the private key. This may allow data to be shared within or between research teams without exposing confidential information.

Encrypting columns to ciphertext is straightforward. However, as stated above, an important principle is to drop sensitive data that will never be required. Do not hoard more data than you need to answer your question.

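library(dplyr)
library(encryptr)  # provides encrypt() and the example gp data

# Drop identifying columns that are not needed, then encrypt postcode
gp_encrypt = gp %>%
  select(-c(name, address1, address2, address3)) %>%
  encrypt(postcode)
gp_encrypt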

#> A tibble: 1,212 x 8
#>   organisation_code city        county     postcode 
#>   <chr>             <chr>       <chr>      <chr>    
#> 1 S10002            DUNDEE      ANGUS      796284eb46ca…  
#> 2 S10017            CRIEFF      PERTHSHIRE 639dfc076ae3…