Big Data Tamps Down HIV Outbreaks

One of the best ways to prevent the spread of HIV is to treat those at high risk with a daily prophylactic pill. Unfortunately, this week Stanford University health researchers concluded that it’s simply too expensive to pre-treat even a fraction of people at increased risk for HIV.

But what if healthcare providers could track a brewing outbreak in real time, and quickly help those at highest risk of infection? Thanks to big data and crackerjack new software, Canada’s westernmost province is doing just that.

In June 2014, a monitoring system operated by the British Columbia Centre for Excellence in HIV/AIDS (BC-CfE) detected a cluster of 11 new HIV cases in a town just outside Vancouver. The system, designed by bioinformatician Art Poon, analyzes massive amounts of HIV genetic data to detect outbreaks.

Such data is surprisingly easy to come by. In many developed countries, it is now routine for a doctor to sequence viral DNA from the blood of an HIV-positive patient. By doing so, the physician can identify which drugs, if any, the virus is resistant to and prescribe an optimal treatment.

In Canada, that DNA sequence data is regularly uploaded to BC-CfE’s secure Oracle database, home to more than 30,000 anonymized HIV genotypes. Every time new sequences are added, which happens almost every day, the entire database is downloaded to a secure workstation, where Poon’s software works its magic. During the download, all patient information is de-identified. “The system is designed to maintain patient privacy,” says Poon.
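
For readers curious what a de-identification step might look like in practice, here is a minimal sketch in Python. It is not BC-CfE's code: the field names, the `deidentify` function and the keyed-hash approach are all assumptions made for illustration, since the article does not describe how the real system strips patient information.

```python
import hashlib
import hmac

# Hypothetical field names; the real database schema is not public here.
IDENTIFYING_FIELDS = {"name", "health_number", "date_of_birth", "address"}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` without identifying fields.

    The patient ID is replaced by a keyed hash (a pseudonym), so that
    sequences from the same patient can still be linked to each other
    without revealing who the patient is.
    """
    # Keep only the fields that are not identifying.
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in IDENTIFYING_FIELDS and key != "patient_id"
    }
    # HMAC-SHA256 of the patient ID, truncated to 16 hex characters.
    digest = hmac.new(
        secret_key,
        str(record["patient_id"]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    cleaned["pseudonym"] = digest[:16]
    return cleaned
```

In this sketch, the same patient ID and key always yield the same pseudonym, which is what would let a monitoring system group sequences by patient while keeping names and health numbers off the analysis workstation.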