The Enormous Consequences of Inconsequential Data

I have recently come across this excellent TED Talk by Kevin Slavin. It looks at algorithms and the powerful, but also hidden and unsupervised, way they shape our world. It is a trifle scary. Kevin focuses on areas such as financial trading, but makes the point that algorithms are everywhere, either created by a network owner, such as Netflix, or exploiting transaction data that can be extracted from systems such as Amazon. Basically, wherever there are large quantities of useful data, there are algorithms.

One of the consequences of the growth of social media is the generation of large quantities of data. I don't mean the sort of data that goes hand-in-hand with words like 'privacy' or 'protection'. When we talk about data privacy, we mean things like addresses, ages and bank account details: things we know it is a good idea to keep private. But of course the whole point of social media, especially tools such as Facebook, is not to keep things private. We share a huge amount of data, but it is data we are happy to share: what we like, what we are thinking, who our 'friends' are, what we had for lunch.

We think of this data as inconsequential. Or we believe that, insofar as sharing it has consequences, they are positive ones or ones with little impact. However, when we use social media services we are, in most instances, making our data available to algorithms, and that has tremendous consequences.

One of the powerful characteristics of algorithms is that they can derive information about something without having to rely on information that relates specifically to the thing they are investigating. Or to put it another way, they are masters of using data in ways for which it was not originally intended (thereby, of course, trampling upon one of the principal pillars of traditional data protection). For example, if you were to construct an algorithm designed to identify potential terrorists, this algorithm would not waste its time looking for behaviour that is obviously associated with terrorist activity (Google searches for "how to make a bomb", for example).

The algorithm would look at the digital behaviour of known terrorists in order to find common patterns, and these patterns could lie in areas as 'inconsequential' as what colour trainers they buy online, or even the classic "what I had for lunch" tweet. It would then look for individuals with similar patterns of 'inconsequential' digital behaviour and search for more data to confirm the similarities. The technical term for this last step is confirmation bias, but a less technical way of putting it is that you could be seen as a potential terrorist based on what you had for lunch.
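The pattern-matching idea can be sketched in a few lines. This is a toy illustration, not a description of any real system: the behavioural signals (trainer purchases, lunch tweets) are purely hypothetical feature names, and the matching is done with simple cosine similarity between counts of those signals.

```python
# Toy sketch (hypothetical): scoring individuals by how closely their
# 'inconsequential' digital behaviour matches a reference pattern
# aggregated from a known group.
from collections import Counter


def similarity(a, b):
    """Cosine similarity between two behaviour-count vectors (dicts)."""
    common = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in common)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical reference pattern derived from a known group.
reference = Counter({"buys_red_trainers": 3, "tweets_lunch": 5, "late_night_posts": 2})

# Two individuals described only by innocuous signals.
person_a = Counter({"buys_red_trainers": 1, "tweets_lunch": 2})
person_b = Counter({"posts_cat_photos": 4})

print(round(similarity(reference, person_a), 2))  # high score: flagged
print(round(similarity(reference, person_b), 2))  # no overlap: 0.0
```

The point of the sketch is that none of the inputs is sensitive in itself; the score emerges entirely from the pattern, which is exactly why 'inconsequential' data is so valuable to the algorithm.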

The enormous social data set that we are creating is a gold-mine for algorithms, and the owners of these gold-mines are organisations such as Google and Facebook. Google and Facebook, in essence, make their money by granting licences to organisations to mine this data, either with their own algorithms or with algorithms those customers create to analyse the data they buy. Should we worry about this? Well, if you surrender data to a financial services company, it can use it to target its advertising more effectively, or to customise its product offers, often described as 'allowing us to better meet our customers' needs'. Nice. But of course such customisation might also include 'customising' an offer to you at a higher price because it deems you a greater credit risk, on the grounds that your network of friends includes someone who has defaulted on their mortgage payments. Not so nice.

Once you let the algorithm in, or the data out, there is no way of controlling what the outcome is going to be. That outcome can be anything from deciding to target a digital ad at you to deciding that you are a potential terrorist. The only control we have, as Kevin explains, is to say 'stop', or not to let the algorithm in (or the data out) in the first place. The problem for Google and Facebook is that there is no point in having a gold mine if you can't mine the gold. Selling data is how they make their money.

At the moment, this isn't a serious problem for the owners of social tools and networks, but only because their users have not yet woken up to the enormous consequences of sharing inconsequential data. When they do, they are going to start demanding that the social tools they use build solid walls to stop algorithms getting in and data getting out. When this happens, a seismic shift will occur: it will expose the huge gap between the revenue Facebook or Google need to sustain their services and the revenue they need to sustain their valuations, and it has the potential to bring the business models of these services tumbling down.

Providing a service like Facebook doesn't cost much money: all you need are some geeks and some server space, neither of which is a scarce commodity. However, Facebook needs much more money than this if it is to justify a valuation that currently sits somewhere north of $60 billion. Hence its focus is not on providing a service to its users, but on persuading marketing directors that it is a platform they can use to "integrate their brands into consumers' stories".

That is fine until such point as users start to say "hey, I don't really want my social life invaded by brands, and I especially don't want information about me sold to them". And it becomes a critical issue at the point when users realise that this isn't the inevitable and unavoidable cost of using a service such as Facebook. That realisation will come when someone shows up who can provide a social network that doesn't have to exact this price, because it doesn't need the revenue, because it doesn't have to justify a stratospheric valuation.

Facebook's current valuation is based on the idea that it is a form of media or content platform. Never before in history has there been such a platform with as many as 800 million viewers or subscribers, which is why it is assumed to be so valuable. However, Facebook is not a form of media; it is an infrastructure. It is much better understood as a mobile phone network than as a media platform. But mobile phone networks have to invest huge sums of money in creating their infrastructure, whereas Facebook's infrastructure is the Internet, so people are not going to pay much for it beyond the subscription they already pay to their internet service provider. Facebook therefore can't generate anywhere near sufficient revenue from subscriptions to justify its valuation, and it knows that if it tried, this would simply hasten the arrival of a rival who would undercut it. Hence its desperation to chase the marketing dollar.

Fortunately for Facebook, the clever chaps on Wall Street who are likely to determine the success of its anticipated $100 billion IPO haven't woken up to this yet. They are running the Facebook numbers through their existing media models, with a few tweaks here and there; they haven't yet realised they need a whole new model. At some point this will dawn on them, but from Facebook's perspective, if you are sitting on $100 billion by that point, I guess you don't have to sweat it too much.
