To reconstruct dates of HIV infection by the coalescent analysis of longitudinal next-generation sequencing (NGS) data.
The coalescent predicts the time that has elapsed since the most recent common ancestor (MRCA) of a population. Because HIV tends to undergo severe bottlenecks upon transmission, the time to the MRCA may be a good predictor of the time of infection. NGS provides an efficient means of performing large-scale clonal sequencing of HIV populations within patients, and thus ideal raw material for coalescent analysis.
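As a rough illustration of what the coalescent predicts, the sketch below computes the expected time to the MRCA under the simplest textbook model: a Kingman coalescent in a haploid population of constant effective size. This model, the function name `expected_tmrca`, and the example numbers are assumptions made for illustration only. They are not the model fitted in this study, which used BEAST on serially sampled sequences.

```python
def expected_tmrca(n_lineages: int, pop_size: float) -> float:
    """Expected time to the MRCA, in generations, for n sampled lineages.

    Assumes a Kingman coalescent in a haploid population of constant
    effective size pop_size (an illustrative simplification).
    While k lineages remain, the waiting time to the next coalescence
    has mean 2N / (k (k - 1)) generations; summing over k = 2..n gives
    the closed form 2N (1 - 1/n).
    """
    if n_lineages < 2:
        raise ValueError("need at least two lineages")
    if pop_size <= 0:
        raise ValueError("population size must be positive")
    return sum(2.0 * pop_size / (k * (k - 1)) for k in range(2, n_lineages + 1))


# Hypothetical values: 10 sampled lineages, effective size 1000.
tmrca_generations = expected_tmrca(10, 1000.0)
```

Under these assumptions the expectation depends only weakly on the number of sampled lineages once that number is moderately large, since the sum converges to twice the effective population size.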
Baseline and follow-up plasma samples were obtained from 19 individuals enrolled in the Montréal Primary HIV Infection cohort. Dates of infection were initially estimated at baseline from nongenetic data (clinical and serological markers and patient questionnaires). HIV RNA was extracted, and seven regions of the genome were amplified, subjected to parallel-tagged 454 pyrosequencing, and analyzed using the software package BEAST.
Mean estimates of the time to the MRCA per patient were significantly correlated with nongenetic estimates (Spearman's ρ = 0.65, P = 4.4 × 10⁻³). The median absolute difference between coalescent and nongenetic date estimates was smallest (29.4 days) for highly variable regions of the HIV genome such as env V3, and greater (114.9 days) for more conserved regions such as pol.
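The two summary statistics reported here, Spearman's rank correlation and the median absolute difference between paired estimates, can be sketched as follows. The per-patient values are invented for illustration and are not data from this study. The helper names (`ranks`, `spearman_rho`, `median_abs_diff`) are hypothetical, and the sketch does not compute a P value.

```python
from statistics import median


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def median_abs_diff(x, y):
    """Median of the absolute paired differences between x and y."""
    return median([abs(a - b) for a, b in zip(x, y)])


# Hypothetical estimates of days since infection for five patients.
nongenetic_days = [30, 45, 60, 90, 120]
coalescent_days = [25, 70, 50, 100, 200]

rho = spearman_rho(nongenetic_days, coalescent_days)
mad_days = median_abs_diff(nongenetic_days, coalescent_days)
```

A rank-based correlation suits this kind of comparison because it is insensitive to a few patients whose estimates differ by a large margin, and the median absolute difference is robust to the same outliers.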
This application of NGS represents an important advancement, not only because accurate estimates of dates of infection can be derived retrospectively from archived specimens, but also because each analysis is patient-specific and, therefore, robust to variation in rates of HIV evolution.