Next Generation Data Formats For 21st Century Biology

Grantholders

  • Prof Jason Swedlow

    University of Dundee, United Kingdom

  • Prof Madeline Parsons

    King's College London, United Kingdom

  • Prof Peter Lee

    University College London, United Kingdom

  • Prof Muzlifah Haniffa

    Newcastle University, United Kingdom

  • Dr Claire Walsh

    University College London, United Kingdom

  • Dr Matthew Hartley

    European Bioinformatics Institute, United Kingdom

  • Prof Stefanie Weidtkamp-Peters

    Center for Advanced Imaging, Germany

Project summary

Bioimaging is one of the most innovative fields in modern biology. New probes, detectors, and modalities have together driven numerous discoveries in the life sciences. In particular, these advances have enabled multiscale and multimodal analyses that provide a holistic view of molecular, cellular, and tissue architecture and dynamics. Moreover, the processing and analysis of bioimaging data has grown into a field in its own right, with many advanced analytic and AI-based tools emerging that provide essential biological insights.

Our project was inspired by the AI revolution, in which computed models can transform all aspects of our society. In the biosciences, models that identify the boundaries and functions of objects in imaging data will transform the pace of discovery and the development of new diagnostics and drugs. A key part of this revolution is the reference data used to train and validate AI models. For these methods to work, these data must be stored in open, efficient, flexible data structures that are designed for cloud-based computing resources.

Our project leverages more than 20 years of work by the Open Microscopy Environment (OME) and includes teams from leading, cutting-edge data generators in the UK and Europe. We have developed new concepts in high-performance, cloud-based data formats that target modern applications such as data streaming, federation across diverse geographic regions, and AI. This project, called OME Next Generation File Formats (NGFF), pairs new work in data storage technology with community engagement to ensure we build tools that can be used by the global bioimaging community. It brings together software developers, experimental biologists, imaging scientists using many different modalities, and data repository experts, who will collaborate to design and deliver these new data format technologies.