LinkML: an open data modeling framework

Year of award: 2024

Grantholders

  • Matthew Brush

    University of North Carolina at Chapel Hill, United States

Project summary

LinkML (Linked data Modeling Language) is an open, extensible modeling framework that allows computers and people to work cooperatively to model, validate, and distribute data that is reusable and interoperable, without the overhead normally required to do this. Collecting and organizing biomedical data in a way that allows for later reanalysis and reuse across projects is challenging. Many data standards are not machine-actionable, or are defined in isolation, which leads to data silos. AI and ML increasingly enable large-scale data analysis, but a lack of data harmonization limits cross-disciplinary applications. LinkML addresses these issues by weaving together elements of the Semantic Web with aspects of conventional modeling languages, providing a pragmatic way to work with a broad range of data types and maximizing interoperability and computability across sources and domains.

Despite its young age, LinkML has seen rapid uptake by users and developers thanks to its expressive data modeling, easy integration of ontologies, and developer-friendliness. However, LinkML lacks dedicated funding to support ongoing maintenance, new community-requested extensions, and community engagement. With this funding, we will:

  • Support our growing open-source developer community

  • Lower the bar to creating LinkML models via user-friendly web interfaces

  • Extend model transformation