Preventing range disclosure in k-anonymised data
k-Anonymisation is an approach to preventing sensitive information about individuals from being identified or inferred from a dataset. Existing work achieves this by ensuring that each individual is linked to multiple sensitive values, but it has not adequately considered how the range formed by these sensitive values may affect privacy protection. When such a range is small, sensitive information about individuals may still be inferred quite accurately, thereby breaching privacy. In this paper, we study the problem of range disclosure (i.e. estimating sensitive information through ranges) in k-anonymisation, and propose Range Diversity, a measure for quantifying the effect of range disclosure on privacy protection. Our measure considers several possible attacks and allows anonymisers to specify the required level of protection in a flexible manner. Extensive experiments show that Range Diversity provides better protection against range disclosure and a higher level of data utility than existing methods.
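To make the notion of range disclosure concrete, the following minimal Python sketch compares two anonymised groups that each link an individual to four distinct sensitive values. It is an illustration only, not the Range Diversity measure proposed in the paper: the salary figures, the group names and the helper functions `sensitive_range` and `max_abs_error` are all hypothetical, and the attacker is assumed simply to guess the midpoint of the range.

```python
def sensitive_range(values):
    """Return (low, high, width) of the sensitive values in one anonymised group."""
    low, high = min(values), max(values)
    return low, high, high - low


def max_abs_error(values):
    """Largest absolute error of an attacker who guesses the midpoint of the range.

    Assumption: the attacker knows only the group's sensitive values and
    estimates the target's value as (low + high) / 2.
    """
    _, _, width = sensitive_range(values)
    return width / 2


# Hypothetical groups: each holds k = 4 records with 4 distinct salaries.
narrow_group = [50000, 50500, 51000, 51500]
wide_group = [20000, 45000, 70000, 95000]

for name, group in [("narrow", narrow_group), ("wide", wide_group)]:
    low, high, width = sensitive_range(group)
    print(f"{name}: range [{low}, {high}], width {width}, "
          f"midpoint guess is off by at most {max_abs_error(group)}")
```

Both groups contain the same number of distinct sensitive values, yet a midpoint guess on the narrow group is never off by more than 750, whereas on the wide group it can be off by 37500. Counting distinct values alone therefore does not capture this difference.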