Entry: Quasi-identifier
Definition

Quasi-identifiers are pieces of information that are not unique identifiers in themselves, but are sufficiently well correlated with an entity that they can be combined with other quasi-identifiers to create a unique identifier.[1]

When combined, quasi-identifiers can thus become personally identifying information. Using them to single out individuals in supposedly anonymous data is called re-identification. For example, Latanya Sweeney has shown that although neither gender, birth date, nor postal code alone uniquely identifies an individual, the combination of all three is sufficient to identify 87% of individuals in the United States.[2]
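The effect of combining quasi-identifiers can be illustrated with a minimal sketch. The records below are synthetic and invented purely for illustration (they are not taken from Sweeney's study), and the helper name `unique_fraction` is hypothetical. The function measures what fraction of records is singled out by a chosen set of attributes.

```python
from collections import Counter

# Synthetic, made-up records: (gender, birth_date, postal_code).
# These values are assumptions for illustration only.
records = [
    ("F", "1960-03-14", "02138"),
    ("M", "1960-03-14", "02138"),
    ("F", "1971-11-02", "02139"),
    ("F", "1971-11-02", "02139"),
    ("M", "1985-07-30", "02139"),
    ("M", "1990-01-01", "02138"),
]

def unique_fraction(rows, columns):
    """Return the fraction of rows whose values in `columns` occur exactly once."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)

if __name__ == "__main__":
    # Compare each attribute on its own with the combination of all three.
    for label, columns in [
        ("gender", (0,)),
        ("birth_date", (1,)),
        ("postal_code", (2,)),
        ("all three", (0, 1, 2)),
    ]:
        print(label, unique_fraction(records, columns))
```

In this toy data set no single attribute singles out more than a third of the records, while the combination of all three singles out most of them. The proportions are an artifact of the invented data and say nothing about real populations.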

The term was introduced by Tore Dalenius in 1986.[3] Since then, quasi-identifiers have been the basis of several attacks on released data. For instance, Sweeney linked health records to publicly available information and, using a uniquely identifying combination of quasi-identifiers, located the hospital records of the then-governor of Massachusetts.[4][5] Sweeney, Abu and Winn used public voter records to re-identify participants in the Personal Genome Project.[6] Additionally, Arvind Narayanan and Vitaly Shmatikov made use of quasi-identifiers to de-anonymize data released by Netflix.[7]
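The general shape of such a linkage attack can be sketched as a join on shared quasi-identifiers. This is a simplified illustration under stated assumptions, not the method used in any of the cited studies: the field names, the placeholder names, and all records are invented, and the hypothetical helper `link` accepts a match only when exactly one named person shares the quasi-identifier values.

```python
# Quasi-identifier fields assumed to appear in both data sets.
QUASI_IDENTIFIERS = ("gender", "birth_date", "postal_code")

# Synthetic "de-identified" release: names removed, sensitive field kept.
released = [
    {"gender": "M", "birth_date": "1950-05-05", "postal_code": "02138", "diagnosis": "A"},
    {"gender": "F", "birth_date": "1971-11-02", "postal_code": "02139", "diagnosis": "B"},
]

# Synthetic public list (for example, a voter roll) that includes names.
public = [
    {"name": "Person One", "gender": "M", "birth_date": "1950-05-05", "postal_code": "02138"},
    {"name": "Person Two", "gender": "F", "birth_date": "1971-11-02", "postal_code": "02139"},
    {"name": "Person Three", "gender": "F", "birth_date": "1971-11-02", "postal_code": "02139"},
]

def link(released_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Pair each released row with a name when exactly one public row matches."""
    index = {}
    for person in public_rows:
        key = tuple(person[k] for k in keys)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for row in released_rows:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # ambiguous or missing matches are skipped
            matches.append((candidates[0], row))
    return matches

if __name__ == "__main__":
    for name, row in link(released, public):
        print(name, "->", row["diagnosis"])
```

In this toy data the first released record is re-identified because only one named person shares its quasi-identifier values. The second is not, because two people share the same combination.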

Motwani and Xu warn that publishing large volumes of government and business data containing quasi-identifiers can enable privacy breaches.[8]

See also

  • De-identification
  • Differential privacy
  • Personally identifying information

References

1. OECD. Glossary of Statistical Terms: Quasi-identifier. November 10, 2005. http://stats.oecd.org/glossary/detail.asp?ID=6961 (accessed 29 September 2013).
2. Sweeney, Latanya. Simple demographics often identify people uniquely. Carnegie Mellon University, 2000. http://dataprivacylab.org/projects/identifiability/paper1.pdf
3. Dalenius, Tore. Finding a Needle In a Haystack or Identifying Anonymous Census Records. Journal of Official Statistics, Vol. 2, No. 3, 1986, pp. 329–336. http://www.jos.nu/Articles/abstract.asp?article=23329
4. Anderson, Nate. Anonymized data really isn’t—and here’s why not. Ars Technica, 2009. https://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/
5. Barth-Jones, Daniel C. The 're-identification' of Governor William Weld's medical information: a critical re-examination of health data identification risks and privacy protections, then and now. June 4, 2012.
6. Sweeney, Latanya, Akua Abu, and Julia Winn. Identifying participants in the personal genome project by name. Available at SSRN 2257732, 2013.
7. Narayanan, Arvind and Shmatikov, Vitaly. Robust De-anonymization of Large Sparse Datasets. The University of Texas at Austin, 2008. https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf
8. Motwani, Rajeev and Xu, Ying. Efficient Algorithms for Masking and Finding Quasi-Identifiers. Proceedings of SDM’08 International Workshop on Practical Privacy-Preserving Data Mining, 2008. https://www.csee.umbc.edu/~kunliu1/p3dm08/proceedings/2.pdf