What can genetic analysis tell us about the origins of SARS-CoV-2?

By Heather O’Donoghue

Since December 2019, a novel respiratory infection has been spreading in humans. It started in Wuhan, a city in Hubei province, China, when patients began presenting to hospitals with severe respiratory infections. The first case was identified in early December, and the infection has been spreading ever since. In late January 2020, China started locking down cities and provinces. The World Health Organisation (WHO) stepped in to support containment efforts. International flights were cancelled, foreign governments evacuated their citizens, and large events were suspended. But still, the virus continued to spread to other regions of China and beyond.

By the end of January 2020, the WHO had declared the outbreak a Public Health Emergency of International Concern (PHEIC). Throughout February the virus continued to spread globally and was soon causing infections on every continent except Antarctica. Containment is still the top priority, but it seems to be failing, with community spread now a reality in many places. At the time of writing (8th March 2020), there have been 105,568 confirmed cases and 3,584 deaths worldwide. The overwhelming majority of these (77% of cases; 86% of deaths) have been in China.

The cause of these infections? The virus now named SARS-CoV-2 (originally 2019-nCoV). It is a member of the viral family Coronaviridae and causes a disease called COVID-19. Before the identification of SARS-CoV-2, six coronaviruses were known to infect humans. Two of those, SARS-CoV-1 and MERS-CoV, were known to cause severe respiratory illnesses. In the global outbreak of 2002 to 2003, SARS-CoV-1 infected a little over 8,000 people in 26 countries, with 774 deaths. Since 2012, MERS-CoV has infected 2,540 people, with 867 deaths. The remaining four human coronaviruses (HKU1, NL63, OC43, and 229E) cause only mild respiratory symptoms and have not posed a significant threat to humans.
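To put those counts side by side, here is a minimal sketch that turns deaths and confirmed cases into a crude case fatality ratio. It uses only the figures quoted in this article; "a little over 8,000" SARS-CoV-1 cases is rounded to 8,000, which is an assumption, and the function name is purely illustrative. The SARS-CoV-2 figure comes from an outbreak that is still unfolding, so it should be read as a snapshot rather than a final value.

```python
def case_fatality_ratio(deaths: int, cases: int) -> float:
    """Return deaths as a percentage of confirmed cases (a crude ratio)."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


# (deaths, confirmed cases) as quoted in the article.
# The SARS-CoV-1 case count is rounded to 8,000 (assumption).
outbreaks = {
    "SARS-CoV-1": (774, 8_000),
    "MERS-CoV": (867, 2_540),
    "SARS-CoV-2 (to 8th March 2020)": (3_584, 105_568),
}

for name, (deaths, cases) in outbreaks.items():
    print(f"{name}: {case_fatality_ratio(deaths, cases):.1f}%")
```

A ratio like this depends heavily on how many mild cases are detected, so it is best treated as a rough comparison and not as a measure of how deadly each virus is.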

To combat this new infectious disease, scientists need to understand it: how it binds with human cells, how it replicates, and what existing medicines might be able to combat it. The most effective way of discovering this information is genomic sequencing. Members of the Coronaviridae family of viruses have single-stranded RNA as their genetic material. Samples were taken from early victims of the outbreak and researchers worked quickly to sequence the genome of the virus. By January 10th, the full genome had been sequenced by researchers in China and was ready to be used to help virus sleuths identify where it had come from.

The virus’ genetic code revealed a number of interesting things. Analysis showed that the genetic material was highly conserved (99%) across all samples sequenced to date. This suggests that the viral genome has had very little time to mutate. If the virus had been circulating in human populations for any significant period of time before being identified, a number of mutations would be expected to have accumulated. As significant mutations have not been identified, scientists believe there was a single spillover event sometime between 30th October and 29th November 2019, in line with the first reported human case of COVID-19 on 8th December.
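The reasoning above is a form of "molecular clock" argument: if mutations accumulate at a roughly steady rate, the number of differences between samples hints at how long the virus has been circulating. The sketch below is a back-of-envelope illustration only. The genome length and the substitution rate are assumptions chosen for the example, not figures from this article or its sources, and the function names are hypothetical. Real estimates come from fitting evolutionary models to many dated genomes.

```python
# Back-of-envelope molecular clock. Both constants are assumptions for
# illustration; neither comes from the article or its sources.
GENOME_LENGTH = 29_903           # approximate genome length, in bases
SUBS_PER_SITE_PER_YEAR = 1e-3    # assumed substitution rate for an RNA virus


def expected_substitutions(days: float,
                           rate: float = SUBS_PER_SITE_PER_YEAR,
                           length: int = GENOME_LENGTH) -> float:
    """Expected number of substitutions along one lineage after `days`."""
    return rate * length * days / 365.0


def days_of_circulation(substitutions: float,
                        rate: float = SUBS_PER_SITE_PER_YEAR,
                        length: int = GENOME_LENGTH) -> float:
    """Invert the clock: days needed to accumulate `substitutions`."""
    return substitutions * 365.0 / (rate * length)


for days in (30, 60, 365):
    print(f"{days:>3} days -> about "
          f"{expected_substitutions(days):.1f} substitutions")
```

Under these assumed values, a lineage picks up only a handful of changes across the whole genome in a month or two, which is consistent with samples that look almost identical to each other.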

This raises the question: what was the spillover event? Spillover happens when an infectious disease is passed from its usual (endemic) host, to which it is not deadly, to a different species. This new species can be the end of the line if the disease kills a large proportion of those it infects or if it cannot replicate and spread successfully. However, if the disease finds success, the new species can become what is known as an amplifier species, providing the perfect conditions for the disease to replicate extensively and be passed to more species. Bats have been implicated as the endemic host for many infectious diseases, including SARS-CoV-1 and MERS-CoV. Disease patterns in MERS-CoV have suggested camels are the amplifier species. The amplifier species for SARS-CoV-1 has long been suspected to be the civet cat. Civets were present at the wet markets to which SARS was traced in Guangdong, China, in 2003 and were shown to be vulnerable to coronavirus infections. Wet markets are found all over China. They sell animal and fish products, including food and traditional medicines. Live animals of many different species are kept in close contact with each other in these markets, making them the ideal breeding ground for spillover events.

Unfortunately, genetic analysis alone cannot definitively say which species is the endemic host or the amplifier species in an outbreak. Field testing is required to find the virus in the wild and confirm this. Genetic analysis can only provide clues to the source. The genome of SARS-CoV-2 was found to be 96% genetically identical to a bat coronavirus identified in 2013 by scientists who were searching for the animal reservoir of SARS-CoV-1. This makes the bat virus a potential candidate for spillover. However, the analysis did not provide evidence that the virus could pass directly from bats to humans, so an amplifier species is highly likely to have been involved.
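A figure such as "96% genetically identical" is, at its simplest, the share of positions at which two aligned sequences carry the same base. The sketch below shows that calculation on two made-up 25-base fragments. The sequences, variable names and the function are hypothetical, and the code assumes the sequences have already been aligned to the same length (real genomes need an alignment step first, which is not shown).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences match.

    Assumes both sequences are already aligned to the same length.
    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Made-up RNA fragments that differ only at the final base.
human_virus = "AUGGCUAGCUAGGCUAACGUUAGCA"
bat_virus = "AUGGCUAGCUAGGCUAACGUUAGCG"

print(f"{percent_identity(human_virus, bat_virus):.0f}% identical")  # 96%
```

On a genome of roughly 30,000 bases, 96% identity still leaves on the order of a thousand differences, which is one reason a close match points to a related virus and not necessarily to the direct source.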

Several amplifier species have been suspected since SARS-CoV-2 was identified. Initially the virus was thought to have passed from snakes to humans. This theory has since been dismissed by expert virologists, as no coronavirus is known to infect snakes or any other reptile. The current suspicion lies with pangolins. These scaly mammals are the most illegally trafficked animals in the world. Their scales are used in traditional Chinese medicine and they are known to be sold at wet markets. Two different studies have shown that there is 99% genetic similarity between a coronavirus identified in pangolins and SARS-CoV-2.

It could take years to identify exactly when and how spillover of SARS-CoV-2 happened. In fact, that puzzle may never be solved. Other zoonotic diseases (infectious diseases passed from animals to humans) have never been traced to their endemic hosts. For now, one of the containment measures in China has been to shut wet markets and restrict live animal trading. A similar ban was introduced during the SARS epidemic, but was lifted once the disease was seen to be under control. Perhaps this time the ban needs to stay in place. Even if this outbreak can’t be traced back to a specific market, it is clear that housing live animals in these environments poses a significant risk of spillover events and thus to human health the world over.


Public Health Emergency of International Concern: https://www.statnews.com/2020/01/30/who-declares-coronavirus-outbreak-a-global-health-emergency/

Number of Infections, March 02nd: https://www.who.int/docs/default-source/coronaviruse/20200302-sitrep-42-covid-19.pdf?sfvrsn=d863e045_2

Stats from SARS outbreak: https://www.cdc.gov/sars/about/fs-sars.html

Stats from MERS outbreak: https://www.who.int/csr/don/24-february-2020-mers-saudi-arabia/en/

The other coronaviruses: https://www.nejm.org/doi/full/10.1056/NEJMoa2001017?query=featured_home

Genetic make-up of coronaviruses: https://viralzone.expasy.org/30?outline=all_by_species

Results of genetic analysis: https://www.who.int/docs/default-source/coronaviruse/who-china-joint-mission-on-covid-19-final-report.pdf

When did spillover happen?: https://www.statnews.com/2020/01/24/dna-sleuths-read-coronavirus-genome-tracing-origins-and-mutations/

Bat coronavirus: https://www.wired.co.uk/article/coronavirus-bats-snakes-pangolins

Why snakes are not the amplifier species: https://www.nature.com/articles/d41586-020-00180-8

Did SARS-CoV-2 come from pangolins?: https://www.wired.co.uk/article/coronavirus-bats-snakes-pangolins

Pangolin Image: By A. J. T. Johnsingh, WWF-India and NCF - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=79205863
