What would the perfect mental health dataset look like?

Suzanne Gage, Research Lead for Metrics in our Mental Health team, shares her thoughts on how longitudinal data could help us better understand depression, anxiety and psychosis.

A dark and light blue illustration of a woman's face across seven disjointed puzzle pieces.


Licence: Attribution CC BY

Suzanne Gage


I love cohort studies. I did my PhD using a British-based pregnancy cohort (the Avon Longitudinal Study of Parents and Children or ALSPAC) and have more recently conducted cross-cohort comparison studies using ALSPAC and another cohort, the Millennium Cohort Study. 

When I was a researcher interested in understanding the links between cannabis use and psychosis, longitudinal datasets were the bread and butter for exploring this relationship. They’re a hugely important way to explore patterns in health and how these are related to potential risk or protective factors, even if they’re not the best way to get at causality. 

However, these studies are often designed to explore many and varied health behaviours, illnesses and other outcomes. While mental health measures are often included in these studies, they are usually very brief. And that’s not surprising.

All the researchers involved in designing a cohort are fighting for space – every measure added to investigate one outcome means another one can’t be. Participants only have so much time they’re willing or able to devote to answering surveys or attending a clinic, so difficult priority decisions have to be made. 

Mental health is harder to measure than something like blood pressure, or a diagnosis of cancer or heart disease. 

There are no biomarkers (yet), so we rely on asking someone about their symptoms and the impact those symptoms have on them. Often this takes the form of a self-report questionnaire. These questionnaires take up valuable participant time and space, so they are often kept short and perfunctory.

What if we could do mental health datasets differently? 

In February 2022, I started working for Wellcome within the Mental Health team.  

As part of my work, I’ve been tasked with thinking about how longitudinal data (not just limited to cohort studies) can help us achieve our mental health strategy.

We want to gain a better understanding of how the brain, body and environment interact in depression, anxiety and psychosis, so we can spot potential points for early intervention, find better ways of identifying groups for intervention, and find new and improved ways of intervening.

That’s why I’ve been thinking about what a perfect mental health dataset would look like. 

If we weren’t constrained at all, what variables would we collect, and how often? What would our sample size look like and who would be our underlying population? Would a perfect dataset for depression look the same as a perfect dataset for psychosis research?

How do we make a ‘perfect’ mental health dataset? 

There are a couple of options for helping to make a perfect (caveat, nothing is perfect) dataset. Do we start from scratch – which would mean waiting for years before longitudinal data would be available – or do we enrich existing studies? 

To help with this work, we are scoping out the landscape of existing longitudinal datasets with the potential for transformative mental health research.

I’m also really keen to hear from lived experience experts and from mental health researchers who work with longitudinal data:

  • What are the variables you wish were in the datasets you use? 
  • Are there occasions where you wish concepts had been measured using different techniques or at different frequencies? 
  • What would your perfect dataset look like?