Google Uses Your Data to Predict What You're Going To Do Next - One Day So Might the Government

Companies such as Google, Facebook and Amazon have long been using consumer data to work out who you are, how you behave and consequently what you might buy next. This is why Amazon might bombard you with adverts for iPad cases just after you've purchased an iPad ("Customers who bought an iPad also bought..."), or why liking World of Warcraft on Facebook might conjure up adverts for personal hygiene products.
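
To make the "customers who bought" idea concrete, here is a minimal sketch in Python of how such a suggestion might be produced by doing nothing cleverer than counting which items turn up in the same basket. The baskets, the product names and the function names are all invented for illustration; real retailers' systems are far more elaborate and are not described in this piece.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories: every basket and product is made up.
baskets = [
    {"iPad", "iPad case", "stylus"},
    {"iPad", "iPad case"},
    {"iPad", "screen protector"},
    {"toaster", "kettle"},
]


def build_co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        # sorted() only makes the iteration order deterministic.
        for a, b in combinations(sorted(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_bought(item, counts, top_n=3):
    """Return up to top_n items most often bought alongside `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


counts = build_co_purchase_counts(baskets)
print(also_bought("iPad", counts))     # "iPad case" ranks first (2 shared baskets)
print(also_bought("toaster", counts))
```

Nothing here knows what an iPad is. The suggestion falls out of the co-occurrence counts alone, which is why the same approach transfers so easily from products to any other kind of recorded behaviour.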

The technology has been highly successful and is employed by almost every major online retailer. Even publishers are in on the game, automatically personalising pages to present content that they think you're most likely to click on. But what if this kind of technology starts to be used for more than just flogging stuff? What if it was the government that wanted to work out what you might do next?

The furore over the government's recent proposal to change internet surveillance legislation has once again thrust the issue of online privacy into the spotlight. The details of the proposal are not yet clear but it is likely that it will seek an extension of the government's power to monitor which sites you're visiting, who you're speaking to, how often and for how long. What you're actually saying is likely to remain private unless the police are granted a warrant to access it - for now. The end result is that everything else will be accessible for monitoring in real time, without the need for a warrant. What before would have to be justified to a judge would now be a commonplace tool of security surveillance.

Currently there is nothing to stop the government running this data through an algorithm to make assumptions about the individual or individuals in question. At its most basic level this would be as simple as observing regular communications between an individual and several people suspected of links to al-Qaeda. Conclusion? The individual in question is not necessarily a terrorist but his choice of online correspondents suggests that he might at least be interested in terrorism ("Customers who bought al-Qaeda also bought...").

This is basic detective work, but what happens when the analysis gets more complex? For instance, what is there to stop the authorities analysing your communications data to see how likely you are to attend a protest? If this sounds unrealistic, consider that Target, a US discount retailer, managed to figure out that one of its teenage customers was pregnant before her father did. It would only take a fairly basic algorithm to extrapolate a 'protest likelihood' score from the sites that you visit, the people that you talk to and the times at which you do so.

You may not be aware of it but the police already keep a database of individuals who are known to have taken part in protests, so-called 'domestic extremists'. Data about these people, such as a license plate number, is then cross-referenced with other systems such as the UK's nationwide automatic number plate recognition (ANPR) system, which scans and recognises any plate that it sees. Whatever flag has been assigned to that number plate will then activate and alert police, who can pull the car over if they so choose.

Fine when the action is retrospective, such as catching a fugitive - not so fine when stopping someone because their name is on a list of people who were at a protest, as happened to John Catt. This is a real example of state data being used to make predictions about people's behaviour and disrupt their lives, often because they've been to a single peaceful protest and despite their never having committed a crime. How much further do we want this kind of thing to go?

So far companies have learned what the public find commercially acceptable through trial and error (see Facebook's various travails with privacy policy), but we cannot allow trial and error to dominate what is acceptable behaviour for those who govern our lives. The public need a clear definition of what is and isn't acceptable use of our data and, if anything is to be collected, a firm provision in Freedom of Information law to find out what the state knows about us and why it feels it needs to know it.

The answer is clear: personal data should not be monitored - and should certainly not be harvested and stored - by the state without a warrant. By all means allow the authorities access to data as quickly and efficiently as possible, but only if the cause can be justified. Real time observation treats everybody as a suspect and opens too many tempting doors to purposes far more Machiavellian than flogging a toaster.

The age-old argument that people with nothing to hide have nothing to fear simply does not wash. Plenty of people throughout history have had something to hide from the state and with very good reason - from Oskar Schindler to Liu Xiaobo. The regime that introduces illiberal powers will not necessarily be the one that uses them for nefarious ends, but once the power is there it cannot easily be removed. To imagine that liberal democracy and governmental probity are invulnerable and eternal in Western society is hubris that begs to be punished. Every government scandal, from Watergate to MPs' expenses, shows that power is never entirely free from corruption.

The internet will continue to reveal and define more about humanity than any invention before it, from the cataloging of our collected knowledge to the spread of ideas which revolutionise the way we think and act. This reliance on the internet, a single system that can theoretically be brought under centralised control, makes new surveillance legislation different from any of the analog age. Our lives were never lived entirely through the telephone or the post but much of our life, indeed much of our personal life, is now lived online. It is for this reason that we should be proposing new and ever more effective safeguards to protect our digital liberty, not illiberal measures that curtail it.