The quest to know everything: 25 years of the Sanger Institute

Since 1993, the Wellcome Sanger Institute has been uncovering the secrets of life through its ground-breaking genomics projects. In this special feature to mark its first 25 years, Rob Reddick explores how its unique work is changing the world.

Detail from stained glass designed by Kathy Shaw Urlich, showing DNA and tree of life
The information Sanger provides about the structure and function of DNA lets others make great improvements to human health.
Credit: Rob Reddick

"You know, a lot of people, when Sanger was set up, couldn’t see the value in it," says Julian Parkhill, Group Leader of Pathogen Genomics at the Wellcome Sanger Institute. "They thought it was stamp collecting. 'Fishing expedition' was the phrase often thrown at us."

Today, it might be more accurate to describe this fishing as trawling. Genomics, the branch of biology that maps and unpicks the function of DNA, has exploded. Since human DNA was first sequenced in full, hundreds of thousands of genomes have been read, recorded, analysed and compared. Genomics has become biology’s first big data science.

And leading the way into life’s genetic oceans is the Sanger Institute, which turns 25 this month.

Not that you’d necessarily guess, looking around. Located on a country estate nine miles from Cambridge, UK, its campus is home to both a 17th-century hall and numerous brick-and-glass labs that could have been built yesterday.

It’s the sort of dualism that, spending time there, you realise characterises the place. History and the cutting edge. Competitiveness and openness. Ambition and modesty.

"We like to do the unthinkable experiments," says Sam Behjati, Group Leader of Cancer Genomics and Single Cell Transcriptomics. "But a lot of the stuff we do could be regarded as descriptive. That’s the beauty of what we do here. It’s quite fundamental: basic observing of the environment at nucleotide resolution."

This is Sanger’s remit: basic science – understanding DNA by analysing it at scale. The institute is a field leader, yes, but principally as an enabler. Whether it’s providing reference genomes for pathogens, being the first to sequence cancers, or developing model organisms – or new biomedical techniques, or new ways of analysing data – Sanger breaks new ground. The information it provides about the structure and function of DNA then lets others make great improvements to human health.

It generates this knowledge by sequencing more than 5,400 billion letters of DNA every day, a scale of work that makes it unique.

And yet, even after 25 years, there’s no end in sight. "In the world of genomics, we’re only in the foothills of discovery," says Michael Dunn, Head of Genomics and Molecular Sciences at Wellcome.

For Sanger, Dunn says, the big questions still lie ahead. Getting to the bottom of how DNA works – understanding what actually makes a collection of genes or variants come together to differentiate, say, a dog from a cat, or a sick human from a well one – "is probably a 100-year mission".

Yet, it hasn’t always seemed this way. When Sanger first opened its doors, the scope of this vision would have been unimaginable.

Becoming a world leader

The Sanger Institute began as a single-project initiative, set up by Wellcome and the UK’s Medical Research Council (MRC) to contribute to the Human Genome Project, the international collaboration to sequence the human genome.

John Sulston, from MRC’s Laboratory of Molecular Biology in Cambridge – whose prior work had focused on sequencing the genome of Caenorhabditis elegans, a nematode worm – was to be in charge.

Wellcome needed somewhere to house Sanger – and the EMBL-European Bioinformatics Institute, which was to be hosted in the UK – and so bought the Hinxton Hall Estate in 1992. The Sanger Institute opened its doors the following year. Its main sequencing centre was set up in an old factory building formerly used to manufacture metal tubes.

The sequencing lab was rustic but functional. "Looking back, there were no airs or graces," says Cordelia Langford, Sanger’s Director of Scientific Operations. "There was such huge pride to contribute to the project. We didn’t think about the surroundings, if the paint was peeling or the floor was uneven."

The central lab in the building soon became known as the Goldfish Bowl. In the beginning it housed a small, closely knit team of 15. The DNA sequencing machines were arranged on the old factory floor, with gantries above for observation. Sulston’s focus was on getting data released as soon as possible.

"John was very good at articulating this vision," says Langford. "But what made him such a good director was he was a regular guy. He was very much part of the team."

When the sequencing lab – by then in a new, purpose-built building – flooded in 1996, everyone mucked in to save the equipment and samples, including Sulston.

His approach, combined with the hard work of his team, paid off. The Human Genome Project published its full, openly available sequence in April 2003.

Sanger alone had sequenced a third of the genome, and was the project’s largest single contributor. By this point, the institute and its partners had also sequenced the genomes of a number of human pathogens.

With its initial research aim completed, thoughts turned to what Sanger should do next. The obvious follow-on was to bring function to the genome, and to do this, the institute needed to set itself up as a proper research faculty.

The somewhat romantic years of the Human Genome Project came to an end, giving way to a more modern structure; management frameworks were created and specific projects launched that could be reviewed over set periods. With a number of key genomes now completed, Wellcome called for a drastic reduction in sequencing. Biological research was to be Sanger’s focus.

But then technologies changed. High-throughput DNA sequencing arrived, reducing the time it takes to sequence a single human genome from over a decade to under 48 hours. Seeing the vast possibilities this opened up, Wellcome and Sanger decided to reverse the previous decision – the institute was to return to the forefront of sequencing.


And there it has remained.

Obviously, the scale of operations has changed, says Julia Wilson, one of Sanger’s Associate Directors. The breadth of diseases worked on and sizes of datasets involved are much larger today. New technologies and techniques have been brought on board too, such as single cell sequencing, RNA sequencing and using cancer organoids. "But the central theme of our work hasn’t changed," she says. "It’s still genomes and genome variation."

From mapping human life – to changing it

Behind a floor-to-ceiling glass wall there’s a large laboratory. Spaced out across its benches are 27 unassuming white boxes, blue LED strips on their fronts illuminating left to right and back again. Each looks like a cross between a fridge-freezer and KITT from Knight Rider.

These are the high-throughput sequencers, the heart of Sanger’s operations. What’s visible through the glass is just part of one of the largest DNA sequencing facilities in the world – there are more downstairs, busy working on the 100,000 Genomes Project for Genomics England. It’s not an overstatement to say that these machines and their predecessors are world changers.

The Human Genome Project often hogs the limelight, but Sanger has been home to countless other fascinating and game-changing projects. Other genomes sequenced at Sanger were published well before 2003. One was Sir John Sulston’s worm, C. elegans. Another was for Mycobacterium tuberculosis, which causes TB. Both were released in 1998.

What the TB genome taught us first was the fundamentals, says Julian Parkhill – how TB emerges, how it’s transmitted, that it originated in Africa, and so on. But 20 years on, it’s having an impact clinically.

"We’re at a point where whole genome sequencing is being rolled out by Public Health England for every person in the UK who has TB," says Parkhill.

That means faster, cheaper and more accurate diagnoses, and crucial knowledge about how exactly the disease moves through the population. It also means drug-resistant TB can be quickly treated with the right antibiotics, stopping it from spreading and potentially killing.

Genomics, sometimes decried for not living up to its potential, or else downplayed as descriptive or speculative, is today being applied in real-world, potentially life-saving ways.

Point-of-care testing – sequencing in hospitals, in GP surgeries – will follow. Sanger’s pathogens team is already working with a local hospital to set up a Staphylococcus aureus genomic testing pathway, to get even more of the institute’s knowledge out there and applied. It’s been a success – the pathway can turn around a genomic test to see whether the bacteria are drug-resistant in 24 hours, and with more accuracy than conventional methods.  

"Within a few years, we’ll have hand-held sequencers that will do this," says Parkhill. "I don’t think that’s ambitious at all."

Pathogens are just one of Sanger’s areas of focus. Cancer is another – not surprising given that it’s a disease caused by errant, mutated DNA.

From being the first to sequence cancer genomes, the institute’s Cancer Genome Project is now working to catalogue the specific DNA mutations that cause different types of cancer. It’s already worked out the exact breakdown of almost every single fault that can happen in breast cancer and published these online. Several hundred other cancer types await the same treatment.

The theory is that this information can then be used by drug companies to inform more effective treatments that target specific mutations, known as precision medicine.

One target is BRAF V600E – a mutation in melanoma that was discovered at Sanger. Find it, treat it with a drug called vemurafenib, and it completely disappears, as does the melanoma.

"It’s absolutely amazing, incredible," says Sam Behjati. "It’s a proof of principle that if we have a specific mutation and have a drug to target it, we can make a cancer better," he says.

But there’s a hitch: the cancer can evolve to evade the drug. "There was that sort of hope that maybe all of human cancer is like this, and we just need to define or find the Achilles heels," says Behjati. The reality is that precision medicine hasn’t fundamentally changed how we treat cancer, at least not yet.

Nevertheless, the discovery has spurred his lab on. "I think the way forward is to get away from individual mutations and be more holistic about the whole thing, to try to target the cell," Behjati says.

The aim is to use drugs to modify rather than destroy: "The idea now is to give drugs to cells and see what they do to the function and identity of those cells, which we can read off from the DNA. Perhaps we might find ways of changing cells into something non-cancerous, rather than killing them."

The data from such experiments has been published, and the approach could be a plausible new way to treat childhood cancer, which is Behjati’s focus. What happens next, though, will depend on how others outside Sanger drive the knowledge on. "We need to develop an enthusiasm around us," says Behjati. "We can’t crack childhood cancer on our own."

In the meantime, prevention could be a valid alternative – at least for adult cancers, most of which are caused by cells interacting with carcinogens of some sort.

And on this, the institute is working with Cancer Research UK as part of its Grand Challenge. In this project, it’s sequencing 5,000 cancers from across the world and cross-referencing them with patients’ environments and lifestyles to find out what factors give rise to certain mutations.

It’s still early days, but, by matching up carcinogens with tumours, the hope is that it can reveal definitively what risk factors we should avoid, and then apply public health measures to encourage us to do so.

The A–Z of biology

All of this research comes at a cost. Wellcome has spent over £2.5 billion on Sanger and the wider Genome Campus since its inception. Each high-throughput sequencer alone costs over £600,000.

But it will be worth it, because DNA plays a role in nearly every disease. Genetics lie at the heart of antimicrobial resistance and cancer; they play a role in everything from obesity to depression.

"Genomics is influencing every field of biology, from archaeology to zoology," says Julian Rayner, Director of Wellcome Genome Campus Connecting Science, which connects the campus’s work with the wider world.

As if to prove this point, there are two skeletons on display in the campus’s Conference Centre, discovered during some building work. They’ve been carefully preserved, of course – but they’ve also had their genomes sequenced to reveal their ancestry.

"Ten years ago, I would never have predicted that we’d be rethinking the evolution of humans on this planet based on genomics, because you can’t sequence genomes from old bones," says Rayner. It turns out you can.

"One of the things about genomics," says Julian Parkhill, "is that the more you do it, the more demand for it there is." That in turn drives demand for faster, cheaper, more revealing techniques, and is the reason why the field moves so quickly. It’s worth pursuing because it shines a light deeply, as well as broadly.

"It also gives you information in a completely unbiased way," says Parkhill. Like many others, he speaks about modern genomics – hunting in huge datasets for genetic clues – as being hypothesis-forming rather than hypothesis-answering. "Certainly, it does allow you to answer hypotheses. But your collection of information is not constrained by the questions you could think of asking."

As an example, Parkhill mentions Salmonella. One form, Salmonella typhi, causes typhoid fever, and can only infect humans; another, the similarly named Salmonella typhimurium, only causes mild diarrhoea, but can have a range of hosts.

For a long time, microbiologists questioned what S. typhi had that made it so harmful to humans. They devised all sorts of tools, but the question continued to plague biology for years. But once sequencing allowed the genomes of the two bacteria to be compared, they discovered S. typhi didn’t seem to have adapted to humans by acquiring extra genetic functions. Instead it had lost some of the functions of S. typhimurium. It wasn’t specialised, it was restricted.

"Genomics tells you things you weren’t even thinking about," Parkhill says.

A place like no other

Aside from the amazing science, there’s something special about Sanger itself that justifies its funding.

Partly it’s the size and the scale of operations, but the set-up is important too. San Francisco and Boston probably house more scientists, argues Julia Wilson, but with over 2,500 people on campus, she reckons nowhere else on Earth has such a density of genomic expertise – or the ability to progress science so rapidly.

This density makes the place incredibly collaborative. Talking about collaboration sounds like PR – but every single person I speak to while at Sanger mentions it.

"There’s a general consensus that you’re better off always telling other people about your ideas," says Behjati. The risk of having your ideas scooped, he says, is far outweighed by what you can learn from others. It’s a philosophy that extends off campus too. "Everything is freely available – all of our data, all of our resources – so we’re underpinning science nationally and globally," says Wilson.

And walking around Sanger, there’s also a certain… je ne sais quoi.

"I think there’s something about the history of the place," says Rayner. "Scientists coming here have that sense of, 'I’m touching history in some way – something really big happened here.'"

While the institute’s work is undoubtedly serious, it doesn’t feel earnest. Walking to the main building, I pass a brigade of babies in four-seater pushchairs, the on-site crèche taking them on a tour of the grounds. In the toilets, a poster for a microbiome study declares, "Be cool. Donate your stool."    

Then there’s the setting. The wetlands. The gardens. But also the laboratories and data centre. The way they rub shoulders makes everything seem very balanced. "Sympathetic" is the term Wilson uses.

Old elements of the country estate have been preserved. And not just Hinxton Hall: the Conference Centre is housed in the old kitchen garden’s walls, but with a new glass roof, not dissimilar to the British Library’s.

Behind it is the estate’s orchard. Apples are piled in the grass beneath squat, gnarled trees. The Sulston years and the Human Genome Project may be over, but the rustic element hasn’t quite disappeared. It could be a scene from a cider advert, were it not for the wasps.

Thinking of the unknown

What does the future hold for Sanger? It depends who you ask.

Work is beginning on the Human Cell Atlas, which is essentially a project to create a Google Map of the human body, detailing what each of its 37 trillion cells is and does.

Then there are plans to sequence the genomes of every species on the planet, the Earth BioGenome Project. Sanger is set to play a key role in both.

And beyond that? "All the questions we have now, we have routes to answering," says Parkhill. In future, he thinks we’ll be looking into things we haven’t yet thought of.

What seems certain is that the relentless pace of change won’t slacken. "Technologies change so quickly now," says Michael Dunn. "Things you wouldn’t have even thought doable five years ago are now doable." For him, the next big question is functional: working out what all of the genes and non-coding DNA actually do.

But perhaps the biggest changes will be away from Sanger.

"For many people in the UK currently, genomics only comes in a health-threatening environment," says Julia Wilson. When we’re celebrating Sanger’s 50th anniversary, she hopes genomics will be a "daily occurrence for everyone". It’s plausible that everyone in the UK will have had their genome sequenced within the next couple of decades.

Before that, there’s still the small matter of marking Sanger’s 25th birthday. As you’d expect, the campus has held a staff party, but it’s also marking its milestone in other ways.

One is the 25 Genomes Project, which is sequencing 25 species that span the UK’s biodiversity, including a bat, a bee and the blackberry. The project has just released the genome for the golden eagle. This will help conservationists monitor a new breeding programme up in Scotland.

Closer to home, the institute is planting a new garden. At its centre will be a sapling, grown with seeds donated by Woolsthorpe Manor in Lincolnshire, Newton’s birthplace, where his apple tree still grows in the orchard.

By the time Sanger turns 50, the new tree should be bearing fruit, just like its cousins in the orchard. As for what fruit the institute will bear, we’re limited only by our imaginations.