Privacy Preserving Sensitive Data Publishing using (k,n,m) Anonymity ApproachPublished online: Mar 20, 2020
Open Science movement has enabled extensive knowledge sharing by making research publications, software, data and samples available to the society and researchers. The demand for data sharing is increasing day by day due to the tremendous knowledge hidden in the digital data that is generated by humans and machines. However, data cannot be published as such due to the information leaks that can occur by linking the published data with other publically available datasets or with the help of some background knowledge. Various anonymization techniques have been proposed by researchers for privacy preserving sensitive data publishing. This paper proposes a (k,n,m) anonymity approach for sensitive data publishing by making use of the traditional k-anonymity technique. The selection of quasi identifiers is automated in this approach using graph theoretic algorithms and is further enhanced by choosing similar quasi identifiers based on the derived and composite attributes. The usual method of choosing a single value of ‘k’ is modified in this technique by selecting different values of ‘k’ for the same dataset based on the risk of exposure and sensitivity rank of the sensitive attributes. The proposed anonymity approach can be used for sensitive big data publishing after applying few extension mechanisms. Experimental results show that the proposed technique is practical and can be implemented efficiently on a plethora of datasets.
Keywordsanonymization, data publishing, k anonymity, privacy, quasi identifier
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.