Data protection on the web is an illusion
Two years ago, the Belgian computer science lecturer Yves-Alexandre de Montjoye moved from Boston to London. While looking for a new doctor, he was handed a form at the practice. It asked him to agree that his health data could be passed on to research institutions and companies, anonymized of course. De Montjoye still remembers one sentence to this day: some believed, the form said, that individual patients could be identified from this data.
What his doctor put in the subjunctive, de Montjoye himself has now put in the indicative. Together with two Belgian colleagues, he investigated whether supposedly anonymous data sets actually allow conclusions to be drawn about individuals. "Our results suggest that even heavily anonymized samples do not meet the standards of data protection laws," the researchers write in the journal Nature.
Their study shows that in eight out of ten cases, gender, date of birth and zip code are enough to identify a person unambiguously: one then knows that the record belongs to Barbara Miller or John Smith from such-and-such a place. And with just 15 pieces of information, the scientists can determine the identity of an American with a probability of 99.98 percent.
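The idea that a few harmless-looking attributes can single someone out can be illustrated with a small sketch. It is not the method used in the Nature study, only a simple count of how many records are unique on a given combination of attributes. The records and the helper name `unique_fraction` are invented for illustration.

```python
from collections import Counter

# Toy records, entirely made up: (gender, date_of_birth, zip_code)
records = [
    ("F", "1980-03-14", "02138"),
    ("M", "1975-07-02", "02138"),
    ("F", "1980-03-14", "02139"),
    ("M", "1990-11-23", "02139"),
    ("M", "1990-11-23", "02139"),
]

def unique_fraction(rows):
    """Share of rows whose attribute combination occurs exactly once."""
    counts = Counter(rows)
    return sum(1 for row in rows if counts[row] == 1) / len(rows)

# All three attributes together
print(f"{unique_fraction(records):.0%} unique on gender, birth date and zip code")

# Gender alone (hypothetical comparison to show the effect of fewer attributes)
gender_only = [(row[0],) for row in records]
print(f"{unique_fraction(gender_only):.0%} unique on gender alone")
```

In this toy data nobody is unique on gender alone, but most records become unique once birth date and zip code are added. Anyone holding a second list that pairs those same attributes with names could then attach a name to each unique record.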
Companies sometimes market data sets that each contain hundreds of characteristics of millions of people. Data trading is mostly legal because a large part of the data is not considered to be personal. Regulations such as the EU General Data Protection Regulation therefore do not apply.
Billions of people's data are floating around on the Internet
As early as the mid-1990s, the then governor of Massachusetts, William Weld, involuntarily showed that anonymous data is often an illusion. His state released a database that contained patient records of civil servants. Weld gave assurances that no one needed to be afraid: all personal characteristics such as name, address and social security number had been removed.
Shortly afterwards he found his own patient file in his mailbox: Latanya Sweeney, then a computer science student, now a Harvard professor, had been able to identify Weld in the data set and show that he had promised too much. In 2006, AOL published searches from 650,000 users; in 2007, Netflix released video recommendations from 500,000 users; in 2016, the Australian government published health data on 2.9 million people. In each case the data was supposedly completely anonymous. In all cases, researchers were able to link the information to specific people.
In addition, unsuspecting users of apps and browser extensions can be monitored without realizing it. Criminals steal huge data sets. Billions of users' data is traded on the Internet, legally and illegally. Individuals can hardly prevent this; only stricter laws and better anonymization procedures could. But perhaps researchers like de Montjoye can keep people from voluntarily adding to the treasure trove of data. It can be enough not to tick the box or not to sign a form.