An expert explains how to track coronavirus variants

New Covid-19 variants could potentially jeopardise a lot of the work that has been done so far to contain the pandemic. Sonia Gonçalves explains how genomic surveillance can help us track and contain them.

Two scientists at the Sanger Institute prepare the sequencing machines.

Sonia Gonçalves

10-minute read

Before the pandemic, Sonia Gonçalves was leading genomic surveillance for the Malaria Genomic Epidemiology Network. Since spring 2020, her team at the Wellcome Sanger Institute has been expanded to support Covid-19 work, and is now looking to see where new variants appear as the virus mutates.

1. How do you track new coronavirus mutations? 

We do that by continuously analysing lots of virus samples, looking at how the genetic code of the virus is changing and what mutations are being acquired. 

The technology that allows us to obtain the genetic information of a particular sample is called genome sequencing.

Genomic surveillance is how we translate that data into knowledge about what’s happening and where. We combine the genomic data generated through sequencing with epidemiological data (about how the virus is spreading in communities). And that provides insights on how the virus populations are changing, and how the virus is evolving.  
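As a rough illustration of what "looking at what mutations are being acquired" can mean computationally, the minimal Python sketch below compares one sample's sequence with a reference sequence and lists the positions where they differ. It assumes the two sequences are already aligned and of equal length. The function name `find_substitutions`, the toy sequences and the `T4A`-style labels are illustrative assumptions, not a description of the Sanger Institute's actual pipeline.

```python
def find_substitutions(reference: str, sample: str) -> list[str]:
    """List single-base differences between an aligned sample and a reference.

    Each difference is reported as '<reference base><1-based position><sample base>'.
    Positions where the sample has 'N' (unknown base) or '-' (gap) are skipped,
    because they do not tell us what the base actually is.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    changes = []
    for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1):
        if sample_base in ("N", "-"):
            continue  # no confident base call at this position
        if ref_base != sample_base:
            changes.append(f"{ref_base}{position}{sample_base}")
    return changes


if __name__ == "__main__":
    # Toy sequences for illustration only; a real SARS-CoV-2 genome is about 30,000 bases.
    toy_reference = "ACGTACGT"
    toy_sample = "ACGAACNT"
    print(find_substitutions(toy_reference, toy_sample))
```

In practice the list of differences for each sample would be joined with where and when the sample was collected, which is the step that turns sequencing data into surveillance.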

2. Why is genomic surveillance important? 

Genomic surveillance is important because it helps us understand what is happening in the population of a virus.

For example, in the Covid-19 pandemic we may see cases going up, or cases going down. But we don't know what is causing that increase or decrease. With genomic surveillance, we are able to understand what's happening with the virus, how it’s spreading and if it’s changing in a significant way.

As an example, it was genomic surveillance that detected a rapid increase in the frequency of a new variant (Alpha, previously known as B.1.1.7) in South East England, where Public Health England was investigating why infections were rising despite restrictions.

So genomic surveillance is a tool that can support public health agencies in how they control the pandemic. That’s why it is important that we work closely together with public health agencies to provide them with the genomic data that can be then integrated with the epidemiological data. 

That information is crucial for public health agencies to support decision-making on which interventions to use on the ground.

Two scientists are wheeling a stack of boxes with Covid-19 tests.

3. How does genomic surveillance work in practice?  

At Sanger we receive hundreds of thousands of samples every week, depending on the stage of the pandemic; at the peak we were receiving around 400,000 to 500,000 samples a week. These are both positive and negative samples which arrive from coronavirus testing laboratories from across the UK.

A lot of activity goes into coordinating the reception of all those samples, to make sure that we get the samples and all the associated data required for processing at the right time of day, so that the turnaround time for the data is as short as possible.

First, the positive samples are automatically selected, then processed in the lab and sequenced. The sequencing process takes around 24 hours. The sequencing data generated is then combined, analysed and reported daily to public health agencies and the Department of Health and Social Care in the UK.

So our daily activities are focused on making sure we are getting all the samples flowing through the pipelines, and that we are producing the data and sharing it on a daily basis.

It’s a massive operation, with a huge number of teams involved. To date, the Sanger Institute has generated a massive amount of data – around a quarter of all the genomic data on SARS-CoV-2 in the world. Our work is part of the Covid-19 Genomics UK Consortium (COG-UK) which has generated almost half of all the genomic data on SARS-CoV-2 in the world.

4. What happens with all the genomic data you produce?  

As soon as the sequencing is complete and the data analysed, the genomic data is immediately shared at a national and international level, and it becomes accessible to people around the world. We share it via different databases, such as GISAID, an international database where all the Covid-19 data is currently held.

Sharing the data is fundamental for managing the pandemic. We can only control the pandemic if it's a global effort, because if a variant of concern appears in one part of the world, it affects the whole world.

We can have a really good programme nationally, but if there is a new variant in a different country and we are not detecting it, or we don't know about it, it can jeopardise all the work that is being done.

A man walks in a room full of data servers.

5. How is genomic surveillance data used in the Covid-19 pandemic? 

There are different ways in which we use genomic surveillance data during a pandemic.

At the beginning of Covid-19, we were looking at how the virus was being introduced in the country, and how it was spreading within the UK.

We were also using genomic data to control outbreaks. Looking at the data, we could see if a surge of cases in one place was caused by one variant or by different ones. So we knew if it was an outbreak caused by a specific variant that was transmitting very fast, or if there were different variants in the surroundings that were causing the increase of cases. 
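The distinction between a surge driven by one variant and a surge made up of several can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: each sequenced case from the area has already been assigned a lineage label, and the function name `summarise_outbreak` and the 80% `dominance_threshold` are hypothetical choices made for this example, not values used by any public health agency.

```python
from collections import Counter


def summarise_outbreak(lineages: list[str], dominance_threshold: float = 0.8) -> dict:
    """Summarise which lineage dominates a set of sequenced cases from one area.

    Returns the most common lineage, its share of the sequenced cases, and a flag
    saying whether that share reaches the (assumed) dominance threshold.
    """
    if not lineages:
        raise ValueError("at least one sequenced case is required")

    counts = Counter(lineages)
    top_lineage, top_count = counts.most_common(1)[0]
    share = top_count / len(lineages)
    return {
        "dominant_lineage": top_lineage,
        "share": share,
        "single_variant_outbreak": share >= dominance_threshold,
    }


if __name__ == "__main__":
    # Illustrative labels only: nine cases of one lineage and one case of another.
    cases = ["B.1.1.7"] * 9 + ["B.1.177"]
    print(summarise_outbreak(cases))
```

A single threshold is a deliberately crude rule. It only shows the kind of question the data can answer, and a real investigation would also weigh how many cases were sequenced and how they are linked.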

More recently, the focus has shifted to the use of genomic data to identify and track 'variants of concern'. The virus has developed new mutations, and there are several variants of concern circulating in the population. Some of them are associated with high transmissibility, which is in itself a concern, as it will cause an increase in cases. 

There are also concerns that some of those variants may affect the efficacy of some vaccines. It's very important that, as soon as a new variant is identified, the data is shared publicly, and that the vaccine programmes and the researchers working on developing vaccines can immediately use it: first of all, to assess whether those variants have any impact on the vaccines; and if that is the case, to quickly make changes to adapt the vaccines.

Two scientists wearing facemasks work in a lab, surrounded by lab equipment.

6. How are ‘variants of concern’ identified? 

The virus is always evolving. This is a natural process – as the virus replicates, mutations occur. Viruses with new mutations are called variants.

A ‘variant under investigation’ may have one or a series of mutations, and it's usually associated with an epidemiological context. For example, if we see a specific variant or a specific group of mutations, but we are also seeing that the number of cases of that specific variant is increasing, that could be declared a ‘variant under investigation’.

It becomes a ‘variant of concern’ when a specific mutation, or a specific group of mutations, is also associated with a difference in how the virus behaves; for example, that specific variant has an increased transmissibility, or an impact on the immune response.

The variant of concern is declared by the public health agencies. We work with them to provide the supporting genomic data. When a variant of concern is detected, it's very important to monitor how that variant is spreading, and how that variant is being introduced in the country.

From the genomic surveillance point of view, as soon as a variant of concern is detected, we focus on identifying which of the samples going through the sequencing process are variants of concern or variants under investigation.

Our data analysis pipelines are currently screening for 4 variants of concern and 6 variants under investigation, but this list is constantly being reviewed so that new ones can be added. This data is then provided to the public health agencies, which use that information to contain the spread of those variants.
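Screening of this kind can be pictured as checking each sample's mutations against a list of variant definitions. The Python sketch below is only an illustration of that idea. The variant names and mutation labels are placeholders invented for the example (they are not real variant definitions), and `screen_sample` and its `min_fraction` parameter are hypothetical, not part of any pipeline described here.

```python
def screen_sample(
    sample_mutations: set[str],
    definitions: dict[str, set[str]],
    min_fraction: float = 1.0,
) -> list[str]:
    """Return the names of variant definitions that a sample matches.

    A sample matches a definition when at least `min_fraction` of the
    definition's mutations are present in the sample. With the default of 1.0,
    every defining mutation must be found.
    """
    matches = []
    for name, defining_mutations in definitions.items():
        if not defining_mutations:
            continue  # an empty definition cannot be matched meaningfully
        found = len(defining_mutations & sample_mutations)
        if found / len(defining_mutations) >= min_fraction:
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    # Placeholder definitions for illustration only; these are NOT real variants.
    example_definitions = {
        "example-variant-A": {"S:X100Y", "S:X200Z"},
        "example-variant-B": {"N:X50Y"},
    }
    sample = {"S:X100Y", "S:X200Z", "ORF1a:X10Y"}
    print(screen_sample(sample, example_definitions))
```

Keeping the definitions in a simple lookup table, separate from the matching logic, reflects the point made above: the list under review can grow without the screening step itself having to change.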

7. How can genomic data help to contain the pandemic?  

The best example is through the detection of variants of concern. That’s where the greatest value of genomic surveillance is, in this phase of the pandemic.

The existence of these variants is worrying because they can jeopardise all the work that has been done so far with the vaccine programmes. So having the ability to know what's happening, where these variants are spreading and how to contain them, is very important, and genomic data needs to be there to support those activities.

And there's absolutely no doubt that more variants will come, because the virus is continually evolving. The problem is that case numbers are so high at the global level that there are many opportunities for the virus to evolve and develop new mutations. So we will have new variants, and we need to keep the ability to detect them as soon as they appear.

It's very important that all countries increase their genomic surveillance efforts because knowing what is happening and how the virus is evolving is absolutely fundamental to containing the pandemic.

And even after the pandemic is controlled, it is very important to continue genomic surveillance as a baseline activity, because it will still tell us whether, and how, the virus is changing.

8. What’s the role of genomic surveillance in the future?  

The work we have been doing during Covid-19 has shown the value that genomic surveillance can bring to controlling a pandemic and how important it is to work closely with public health agencies. This work has provided the resources, the analytical methods and the foundations for responding better to future pandemics.

It has been an incredible learning experience. And we are still learning as we go. And we will still be learning even after the pandemic ends. But I truly believe we will be better prepared to respond to future pandemics.