About the CypComp Database
What is the CypComp Database?
The CypComp Database is a freely available electronic database containing the compounds used to train and test the in silico metabolism prediction tools CypReact and CypBoM. Both tools use machine learning models to predict the cytochrome P450-mediated metabolism of chemical compounds. The database houses CypReact's training set of 1631 compounds* and testing set of 169 compounds, each labelled as a reactant or non-reactant for each of the nine most important human CYP enzymes. CypBoM's training set of 679 compounds and testing set of 73 compounds are also available, each compound listed with its bonds of metabolism (BoMs), i.e. the exact bonds at which each CYP enzyme metabolizes the compound. All CypComp data is supported by scientific literature, is downloadable, and is intended for applications in pharmaceutics, toxicology, environmental monitoring, metabolomics, food science, and personalized medicine. Users can download the CypBoM tool from its repository.
All of the compounds found in the database were sourced from Zaretzki's dataset of known CYP450 reactants and non-reactants, various online chemical databases (e.g. HMDB, KEGG, DrugBank, and PubChem), and the scientific literature. Each CypCompound record (CypComp Card) contains the compound's name and structure along with either its reactant statuses (CypReact set) or bonds of metabolism (BoMs) (CypBoM set) for each of the nine most important human cytochrome P450 (CYP) enzymes (CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4). Users can download these compounds to use in their own research and to build their own models.
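For readers who want to load downloaded records into their own code, the two record types described above could be represented roughly as follows. This is only a sketch under stated assumptions: the class names, field names, the use of a SMILES string for the structure, and the encoding of a bond as a pair of atom indices are all hypothetical and do not reflect the database's actual download format. The example values are placeholders, not database content.

```python
from dataclasses import dataclass, field

# The nine human CYP enzymes covered by the database.
CYP_ENZYMES = (
    "CYP1A2", "CYP2A6", "CYP2B6", "CYP2C8", "CYP2C9",
    "CYP2C19", "CYP2D6", "CYP2E1", "CYP3A4",
)


def _check_enzymes(keys):
    """Reject enzyme names outside the nine listed above."""
    unknown = set(keys) - set(CYP_ENZYMES)
    if unknown:
        raise ValueError(f"unknown enzyme(s): {sorted(unknown)}")


@dataclass
class ReactRecord:
    """Hypothetical in-memory form of a CypReact-set record."""
    name: str
    structure: str  # assumed to be a SMILES string
    # enzyme name -> True (reactant) / False (non-reactant)
    reactant: dict[str, bool] = field(default_factory=dict)

    def __post_init__(self):
        _check_enzymes(self.reactant)

    def reacting_enzymes(self) -> list[str]:
        """Enzymes for which the compound is labelled a reactant."""
        return [e for e in CYP_ENZYMES if self.reactant.get(e) is True]


@dataclass
class BoMRecord:
    """Hypothetical in-memory form of a CypBoM-set record."""
    name: str
    structure: str  # assumed to be a SMILES string
    # enzyme name -> bonds of metabolism, each bond assumed to be
    # a pair of atom indices
    boms: dict[str, list[tuple[int, int]]] = field(default_factory=dict)

    def __post_init__(self):
        _check_enzymes(self.boms)


# Placeholder values for illustration only.
example = ReactRecord(
    name="compound-X",
    structure="CCO",
    reactant={"CYP2A6": True, "CYP3A4": False},
)
print(example.reacting_enzymes())
```

Keeping the enzyme list in one tuple means both record types validate against the same nine names, and results come back in a stable order.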
CypReact and CypBoM were created to help predict the metabolism of chemicals absorbed into the human body. CypReact predicts whether a given small molecule will react with a specific CYP450 isozyme, and CypBoM predicts the precise location, in terms of bonds, at which a given small molecule will be metabolized. On a daily basis, humans are exposed to many chemicals through routine interactions with the environment. These exposures can occur as a result of food or drug consumption, household or workplace activities, industrial or transportation activities, and even common environmental processes. Once absorbed, these chemicals usually undergo further biologically mediated transformations. These biotransformations can be beneficial or detrimental, and understanding how a molecule can be transformed (i.e. metabolized) is crucial for assessing its bioavailability, bioactivity, and toxicity. In humans, many chemicals are extensively metabolized by CYP450 enzymes in the liver and kidney. Among the more than 50 known CYP450 variants, nine are the most highly expressed and are responsible for most of the known phase I metabolism of drugs, food compounds, environmental pollutants, and other xenobiotic molecules. As a result, identifying the products of CYP450 metabolism, both through chemical experiments and through in silico metabolite prediction, has become increasingly important.
*Please note that the compound (+)-neomenthol has been removed from CypReact's training set, making its total of 1631 compounds one less than the 1632 compounds reported in the CypReact paper. No scientific literature was found reporting that (+)-neomenthol is metabolized by any of the nine major CYP450 enzymes. Only its stereoisomers (+)-menthol and (-)-menthol have been reported to be metabolized, by at least CYP2A6.
Citing the CypComp Database
The CypComp Database is offered to the public as a freely available resource. Use and redistribution of the data, in whole or in part, for commercial purposes require explicit permission from the authors and explicit acknowledgment of the source material (CypComp Database) and the original publication (see below). We ask that users who download significant portions of the database cite the following paper in any resulting publications.
- Tian S, Djoumbou-Feunang Y, Greiner R, Wishart DS. CypReact: A Software Tool for in Silico Reactant Prediction for Human Cytochrome P450 Enzymes. J Chem Inf Model. 2018 Jun 25;58(6):1282-1291. PMID: 29738669