Open access drug discovery database launches with half a million compounds

ChEMBLdb, a vast online database of information on the properties and activities of drugs and drug-like small molecules and their targets, launches today with information on over half a million compounds. The data lie at the heart of translating information from the human genome into successful new drugs in the clinic.


The database is hosted by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). It was transferred from biotech firm Galapagos NV in July 2008 through a £4.7 million Strategic Award from the Wellcome Trust.

ChEMBLdb is a unique resource because of its focus on drug discovery and its size: information on an additional 100,000 compounds has been added to the database for its launch, taking the number of small molecules to over 520,000, and it now contains over 2.4 million records of their effects on biological systems. The data include information about how small molecules bind to their targets, how these compounds affect cells and whole organisms, and information on the molecules' absorption, distribution, metabolism, excretion and toxicity.

Dr John Overington, leader of the ChEMBL team at EMBL-EBI, said: "We hope ChEMBLdb will assist the translation of genomic-based insights into innovative drug therapies. We are pleased that there has already been big demand for ChEMBLdb data - not only from large pharmaceutical companies but also from academic institutions and small companies who will particularly benefit from free access to the data."

The human genome sequence provided a molecular 'parts list' for a human being, comprising all the genes and proteins that are encoded by our genetic blueprint. In order to develop new medicines, it is important to catalogue how each of these 'parts' interacts with drugs and drug-like molecules. ChEMBLdb brings together information from the interface of the genome with chemistry into a set of 'chemogenomic' databases that can be used to help determine whether a particular molecule has the right properties to make an effective drug.

Professor Janet Thornton, Director of EMBL-EBI, said: "We are delighted to augment the biological data archived and served from EMBL-EBI with the ChEMBLdb resource. The database adds an important new capability to address the needs of the pharmaceutical and biotechnology industries, and provide the academic chemical biology communities with previously inaccessible data."

Dr Alan Schafer, Director of Science Funding at the Wellcome Trust, said: "This unprecedented transfer of pharmaceutical data resources from the private sector to the public domain should have the greatest impact on researchers in academia and in small companies on limited budgets. ChEMBLdb will be a major resource of information for driving forward medicinal chemistry and drug development in the UK and internationally."

The launch of ChEMBLdb is accompanied by the release of Kinase SARfari, an integrated resource of sequence, compound and screening data from a variety of sources for the protein kinases, a key family for drug discovery.