Data security plays a major role in current age. Privacy matters a lot. Data masking is one of the Key Features when comes to security.
- In MySQL community versions if you want to mask your data, You can go with a Maxscale load balancer.
- They introduced a new masking filter on the Maxscale 2.1 version.
- My colleague Prasanth written about column-level data masking. For reference click here Column Level Data Masking.
- In MySQL 8.0 Enterprise they have introduced a new security feature called data masking.
- This feature was introduced in MySQL 8.0.13 Enterprise Edition and it provides data masking and de-identification capabilities.
What is Masking ?
- Transformation of existing data to mask it and remove identifying characteristics, for example, changing all digits of a credit card number but the last four to ‘X’ characters.
- Generation of random data, such as email addresses and payment card numbers.
In MySQL 8.0 they have introduced 14 functions,
Plugin Installation :
- In MySQL 8.0 there were different types of masking functions are available.
- I have categorised those functions based on their usage.
Architecture Diagram :
General-Purpose Functions:
- mask_inner
- mask_outer
mask_inner :
- It will mask the interior part of a string, We can specify the number of strings that needs to be masked.
Example :
- In the above example, I have masked the (aadhar_number & status) column using mask_inner function. I have passed the 2 arguments.
- The first argument indicates margin1. It will mask the inner part of the data.
- The second argument indicates margin2. It will mask part of the data.
Note :
- For using this mask_inner function you need minimum “3“ characters.
mask_outer :
- It will mask the left and right ends of a string. We can specify the number of strings that needs to be masked.
Example :
In the above example, I have masked the (aadhar_number & status) column using mask_outer function.
I have passed the 2 arguments.
- The first argument indicates margin1. It will mask the data from the beginning.
- The second argument indicates margin2. It will mask the data from the end of the row.
Special-Purpose :
- mask_pan
- mask_pan_relaxed
- mask_ssn
mask_pan :
- The mask span function it will mask all the values from passed argument except the last four-digit.
Example :
Note :
- If the argument is shorter than required, it is returned unchanged value.
- For using this (mask_pan) function, We need a minimum of 5 characters.
Example :
mask_pan_relaxed :
- The mask_pan_relaxed function will mask other characters except for first six digits and last 4 digits.
Example :
Note :
- The string must be a suitable length,(ie.. string must be 11 characters long). Otherwise, this mask_pan_relaxed function did not check the string.
mask_ssn :
- The mask_ssn is a Social Security number that returns the number except for last 4 digits. Other numbers are replaced by “X” characters.
Example :
Generating Random Data with Specific Characteristics :
- One of the nice functions of MySQL Data Masking is the ability to generate business-relevant datasets.
- Because it is not always possible to test/simulate your application on your real dataset (indeed playing with the customer credit card or security social numbers is a very bad practice) this feature is very convenient.
Several functions are available,
gen_range() : returns a random integer selected from a given range.
gen_rnd_email() : returns a random email address in the example.com domain.
gen_rnd_pan() : returns a random payment card Primary Account Number.
gen_rnd_us_phone() : returns a random U.S. phone number.
Masking Data for Customer Identification :
- This type of masking function is generating random data from the dictionary file.
- Here the main advantage is we can load the dictionary files manually.
- When a dictionary is loaded, it becomes part of the dictionary registry and is assigned a name to be used by other dictionary functions.
- Dictionaries are loaded from plain text files containing one term per line. Empty lines are ignored. To be valid, a dictionary file must contain at least one non-empty line.
gen_dictionary :
- It returns some random data from a dictionary.
Example :
gen_dictionary_load :
- Loads a file into the dictionary registry and assigns the dictionary a name to be used with other functions that require a dictionary name argument.
- In order to use data from a dictionary, we must first load the data.
Sample data :
- We can load the dictionary files using (gen_dictionary_load) function.
Example :
Note :
- To loading the dictionary file to MySQL you need to put the file under (secure_file_priv) directory.Otherwise, MySQL will not allow loading the dictionary file.
How to use these functions in a production use case?
- We can achieve this by creating views. While creating the views we can specify what are the columns needs to mask.
Sample Table:
For the above table, I am masking a few columns, ie…
- aadhar_number – The gen_range function will generate some random numbers.
- mobile – The gen_rnd_us_phone function will generate random numbers in U.S standard format.
- name & address – The gen_dictionary function will load random dictionary data.
- email – The gen_rnd_email function will generate random email id’s.
- photo – The mask_inner function will mask the inner part of the selected column.
Creating View
Now the data looks like below,
- This is a really awesome feature in MySQL 8.0. But this feature is only available for Enterprise MySQL.
- In Percona MySQL 8.0 they have implemented this same feature in open source. And they are expanding the Data-Masking plugin to implementing new features.
Don't compromise on data security. Partner with Mydbops, your MySQL security experts! Contact us today for a free consultation and unlock a new level of data protection for your MySQL environment.
{{cta}}