30/08/2016 08:32 BST | Updated 30/08/2017 06:12 BST

The Contradictions Of Big Data In Healthcare

The lady sitting in my clinic room had breathing problems likely to be related to a couple of severe chest infections in the last year. Nevertheless, after seeing an advert, she had proceeded with genetic testing from 23andMe, to find out if she had "anything wrong with her heart or lungs".

Anne Wojcicki, who co-founded 23andMe in 2006, has previously said, "Our generation is used to getting information they need when they want it, and the idea that we can't have access to our own health data is bewildering. If you can be proactive with your health, you can be more in control."

The test report documented a small risk of developing cardiac disease and asked her to "seek the advice of her physician". This passing of the buck from health tech to the public health service is not uncommon. When I explained that the test did not necessarily give useful or relevant information, anxiety and uncertainty replaced the empowerment of organising her own investigation.

The vast majority of diseases are caused by many genes and their interactions with each other. These genes also have complex relationships with lifestyle and environment, which the science of epigenetics attempts to decipher, but cannot yet capture. At present, these genetic tests cannot provide the information this lady wants or expects. There have been concerns about the test, the interpretation of the results and post-test support for customers. Above all, it is unclear whether 23andMe bases its business model on provision of a health-related service or sale of its customers' data. These considerations led to a ban by the US Food and Drug Administration in 2013, which was eventually lifted in 2015. The UK's Medicines and Healthcare Products Regulatory Agency approved the test in December 2014.

The five Vs of big data in healthcare (velocity, veracity, volume, variety and value), coupled with direct-to-consumer technology, make the promise of better access to better healthcare compelling but, sometimes, too good to be true.

Theranos, the company which heralded the age of mass direct-to-consumer blood diagnostics, rode on a wave of hype for several years before the lack of an evidence base for its tests was exposed. The big data agenda has to avoid fuelling public expectations of technology in healthcare by grounding itself in evidence rather than hype.

At the International Population Data Linkage Network Conference in Swansea last week, a major topic of debate was the way in which privacy, anonymity and security are maintained when public data is used for research in order to improve healthcare. As a revamped national healthcare IT strategy is planned after the catastrophe of, it is clear that the public's major concerns are the ability of the NHS and universities to adequately protect sensitive data, as well as suspicion that data will be sold to private companies, including insurance and pharma.

We want the benefits of big data research, with improvements in diagnostics, disease prediction, treatment and prognosis, but without giving access to our own data. The major complaint of any researcher trying to use public data for research, whether it is national GP data (e.g. the Clinical Practice Research Datalink, CPRD), hospital data (Hospital Episode Statistics, HES) or administrative data (from the Administrative Data Research Network, ADRN), is the delay between requesting data and being granted access to it, which can range from months to years.

This delay inevitably slows down the process of science and, just as importantly, the ability of the system to learn lessons from its current practice. It is a huge patient safety issue. In any other sector, including aviation and retail, which are often highlighted by our politicians as examples for quality improvement in the NHS, this time lag in the ability of the system to monitor, research and change its practice would be unacceptable. Interestingly, the failure of may have enhanced the scope for private companies (e.g. Google DeepMind) to pursue projects which involve the health records of millions of patients.

Whether in terms of regulation, research or public expectation, there appear to be differences in the way the commercial, research and clinical sectors are operating with health data. This unequal management and perception of health data is not good for research or clinical practice. Without adequate thought and action, profit to industry will be offset by attrition of trust in all directions. If researchers, health professionals and technology companies continue to work in silos, then the application of big data to healthcare will remain potential rather than reality.