New, more transmissible variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have continued to emerge since the onset of the coronavirus disease 2019 (COVID-19) pandemic. It is therefore crucial to identify mutations in the virus quickly, as this could greatly aid epidemic control efforts and highlight new variants that should be monitored more closely.
A new study, published on the medRxiv* preprint server, builds an analytical epidemiological model based on the susceptible-infected-recovered (SIR) framework to infer the effects of mutations on SARS-CoV-2 transmission from genomic surveillance data.
Study: Inferring effects of mutations on SARS-CoV-2 transmission from genomic surveillance data. Image Credit: NIAID
Understanding viral mutations is essential because it could reveal how efficiently the virus infects hosts and inform public health policies to control its spread. However, it is not easy to estimate how individual mutations affect viral transmission.
Currently, changes in viral transmission are estimated either through phylogenetic analyses or by fitting changes in variant frequencies to a simple growth model. Phylogenetic analyses can be difficult because SARS-CoV-2 sequences are highly similar to one another. They also rely heavily on Markov chain Monte Carlo sampling, which makes them difficult to scale to large datasets. Simple growth models, for their part, cannot account for competition between multiple variants. Neither approach takes into account superspreading or the travel of infected individuals.
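To make the "simple growth model" concrete, here is a minimal sketch of one common form: logistic growth of a single variant's frequency against a fixed background. This is a generic illustration, not the procedure of any particular study, and the function name `fit_logistic_growth` is hypothetical. It also shows the limitation mentioned above, since it assumes only two competing groups (the variant and everything else).

```python
import numpy as np

def fit_logistic_growth(times, freqs):
    """Fit logit(x(t)) = intercept + s * t by least squares.

    times: observation times (e.g. in serial intervals)
    freqs: variant frequencies strictly between 0 and 1
    Returns (s, intercept), where s is the relative growth rate
    of the variant per unit time.
    """
    t = np.asarray(times, dtype=float)
    x = np.asarray(freqs, dtype=float)
    # Under logistic growth the log-odds of the variant grow linearly in time.
    log_odds = np.log(x / (1.0 - x))
    s, intercept = np.polyfit(t, log_odds, 1)
    return s, intercept
```

Applied to frequencies that follow an exact logistic curve, the fit recovers the growth rate used to generate them. With three or more variants rising and falling at once, a single background is no longer a good description, which is the gap the new method aims to close.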
A new study
To overcome these drawbacks, scientists developed a new SIR-based method to better estimate the effects of single nucleotide variants (SNVs) on viral transmission from genomic surveillance data. The method accounts for factors such as competition between viral lineages, travel, and superspreading.
Simulations showed that this approach can reliably estimate the transmission effects of SNVs even from limited data, which is a major advantage. The method was applied to over 1.6 million SARS-CoV-2 sequences from 87 geographic regions. The aim was to understand the effects of mutations on viral transmission throughout the pandemic.
The researchers also quantified the influence of travel and of competition between several variants. They found that travel only slightly affected the estimated changes in transmission, whereas competition between variants had significant effects.
The approach accurately estimates the transmission effects of mutations in simulations. Epidemiological dynamics were simulated starting from a mixed population containing variants with beneficial, neutral, and deleterious mutations. a, Selection coefficients for individual SNVs, shown as mean values ± one theoretical standard deviation, can be accurately inferred from stochastic dynamics in a typical simulation (see Methods). b, Extensive tests on 1,000 replicate simulations with identical parameters show that the inferred selection coefficients are centered around their true values. Deleterious coefficients are slightly more difficult to infer precisely because the corresponding variants are at low frequency in the data. Simulation parameters: the initial population is a mixture of two variants with beneficial SNVs (s = 0.03), two with neutral SNVs (s = 0), and two with deleterious SNVs (s = −0.03). The number of newly infected individuals per serial interval rises rapidly from 6,000 to about 10,000 and remains almost constant thereafter. The dispersion parameter k is fixed at 0.1.
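A simulation of this kind can be sketched as follows. This is a simplified illustration rather than the authors' code. It assumes that each infected person's number of secondary cases follows a negative binomial distribution with dispersion k (a standard way to model superspreading), that a variant with selection coefficient s transmits (1 + s) times as much as a neutral one, and that the baseline reproduction number is rescaled every serial interval so the expected number of new infections equals a fixed target. The function name `simulate_variants` and the rescaling rule are assumptions made for this example.

```python
import numpy as np

def simulate_variants(s, n0, target, k=0.1, generations=100, seed=0):
    """Simulate infected counts per variant over successive serial intervals.

    s: selection coefficient of each variant (must be > -1)
    n0: initial number of infected individuals per variant
    target: expected total of new infections per serial interval (assumption)
    k: negative binomial dispersion; smaller k means more superspreading
    Returns an array of shape (generations + 1, number of variants).
    """
    rng = np.random.default_rng(seed)
    s = np.asarray(s, dtype=float)
    counts = np.asarray(n0, dtype=np.int64)
    history = [counts.copy()]
    for _ in range(generations):
        new = np.zeros_like(counts)
        weight = (counts * (1.0 + s)).sum()
        if weight > 0:
            # Mean secondary cases per infected person, for each variant.
            mean_per_case = (target / weight) * (1.0 + s)
            p = k / (k + mean_per_case)
            alive = counts > 0
            # The sum of n negative binomials with dispersion k is a
            # negative binomial with dispersion n * k and the same p.
            new[alive] = rng.negative_binomial(counts[alive] * k, p[alive])
        counts = new
        history.append(counts.copy())
    return np.array(history)

# Parameters taken from the caption: six variants, 6,000 initial infections,
# about 10,000 new infections per serial interval, k = 0.1.
history = simulate_variants(
    s=[0.03, 0.03, 0.0, 0.0, -0.03, -0.03],
    n0=[1000] * 6,
    target=10_000,
)
frequencies = history / history.sum(axis=1, keepdims=True)
```

With k = 0.1 the counts fluctuate strongly from one serial interval to the next, which is why inferring small selection coefficients from such trajectories is a nontrivial test of a method.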
The scientists applied the method to SARS-CoV-2 data from many regions and identified multiple mutations that strongly affect the rate of transmission. These mutations were found both inside and outside the Spike protein. In the present study, the researchers also focused on travel and on competition between variants, using the history of 20E (EU1) as an example. These factors were not well accounted for in previously used methods. The researchers quantified the impact of travel and of competition between different lineages on the inferred transmission effects of mutations.
One encouraging observation was that the model was able to detect lineages with increased transmission as they emerged. Significant transmission advantages were inferred for the Alpha and Delta variants within a week of their appearance in regional data. At that time, the regional frequencies of the Alpha and Delta variants were 1%. Although the study data extended only through August 6, 2021, the researchers estimated a selection coefficient of 55.2% for the newly emerged Omicron variant, based on the mutations it shares with previous variants. The model therefore allows rapid identification of variants and mutations that might affect transmission from genomic surveillance data, thus providing an "early warning" for more transmissible variants.
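Estimating a coefficient for a variant that is absent from the data is possible if the effects of individual mutations are combined. The sketch below assumes the simplest such rule, an additive one, in which a variant's selection coefficient is the sum of the inferred coefficients of the mutations it carries and mutations with no estimate contribute nothing. The additive rule, the function name, and the mutation labels and values in the usage lines are all assumptions for illustration, not figures from the study.

```python
def variant_selection_coefficient(variant_mutations, inferred_effects):
    """Sum the inferred effects of the mutations a variant carries.

    variant_mutations: iterable of mutation labels found in the variant
    inferred_effects: dict mapping mutation label to selection coefficient
    Mutations without an inferred effect are treated as neutral (0.0).
    """
    return sum(inferred_effects.get(m, 0.0) for m in variant_mutations)

# Hypothetical labels and values, for illustration only.
effects = {"mutation_A": 0.10, "mutation_B": 0.05}
new_variant = ["mutation_A", "mutation_B", "mutation_C"]  # mutation_C is unseen
estimate = variant_selection_coefficient(new_variant, effects)
```

Under this rule, an estimate for a new variant is a lower-information guess: it reflects only the mutations already seen in earlier lineages and says nothing about the effects of mutations that are new.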
The scientists said that sustained research is essential to identify and characterize new variants as they emerge. One example is the Omicron variant, which newly appeared in South Africa. The model developed in this study focused exclusively on SARS-CoV-2; however, it could be applied to study the transmission of other pathogens such as influenza. Coupled with extensive genomic surveillance data, the model is a powerful method for rapidly identifying more transmissible viral variants and, subsequently, quantifying the contributions of individual mutations to changes in transmission rates.
*medRxiv publishes preliminary scientific reports that are not peer reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.