Introduction to the CypComp Database
CypComp is a detailed database containing the compounds used to train and test the in silico metabolism prediction tools CypReact and CypBoM. All compounds were sourced from Zaretzki's dataset of known CYP450 reactants and non-reactants, various online chemical databases (e.g. HMDB, KEGG, DrugBank, and PubChem), and the scientific literature. Each CypCompound record (CypComp Card) contains the compound's name and structure along with either its reactant statuses (CypReact set) or bonds of metabolism (BoMs) (CypBoM set) for each of the nine most important human cytochrome P450 (CYP) enzymes (CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4).
- CypReact
- An in silico metabolism prediction tool that predicts whether a given small molecule will react with a specific CYP450 isozyme.
- CypBoM
- An in silico metabolism prediction tool that predicts where (in terms of bonds) a given small molecule will be metabolized by a specific CYP450 isozyme.
- Training Set
- The set of compounds used to train models implemented in CypReact and CypBoM that can predict the CYP450-mediated metabolism of chemical compounds.
- Testing Set
- The set of compounds used to test the trained models implemented in CypReact and CypBoM that can predict the CYP450-mediated metabolism of chemical compounds.
- CypCompound
- A CypComp Database record, uniquely identified by a CypComp ID and displayed in a CypComp Card, that describes a compound belonging to the training or testing set of either CypReact or CypBoM. All of a CypCompound's structure data and metadata (e.g. reactant statuses, BoMs, references to the scientific literature) can be downloaded as an SDF chemical table file.
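Because each record's metadata is distributed in an SDF file, the data items can be read with a few lines of standard-library Python. The sketch below relies only on the general SDF layout (a `> <TAG>` header line, value lines, a blank line, and `$$$$` closing each record). The tag names `CYPCOMP_ID` and `NAME`, the ID value, and the one-atom placeholder structure are made up for illustration; the tags actually used in CypComp downloads may differ.

```python
import re

# Hypothetical single-record SDF text; tag names and values are illustrative only.
SAMPLE_SDF = """example compound
  hypothetical

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <CYPCOMP_ID>
CC00001

> <NAME>
example compound

$$$$
"""

# A data header starts with ">" and carries the tag name in angle brackets.
DATA_HEADER = re.compile(r"^>.*?<([^<>]+)>")


def read_sdf_data_items(sdf_text):
    """Return one dict of {tag: value} per record found in an SDF string."""
    records = []
    # Records are terminated by a line containing only "$$$$".
    for block in re.split(r"^\$\$\$\$\s*$", sdf_text, flags=re.MULTILINE):
        if not block.strip():
            continue
        items = {}
        current_tag = None
        for line in block.splitlines():
            header = DATA_HEADER.match(line)
            if header:
                current_tag = header.group(1)
                items[current_tag] = []
            elif current_tag is not None:
                if line.strip():
                    items[current_tag].append(line)
                else:
                    # A blank line ends the current data item.
                    current_tag = None
        records.append({tag: "\n".join(lines) for tag, lines in items.items()})
    return records


records = read_sdf_data_items(SAMPLE_SDF)
print(records[0]["CYPCOMP_ID"])
```

For work that needs the structure itself (atoms and bonds), a cheminformatics toolkit is a better fit than this text-only reader.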
|The date/time the record was created.
|The date/time the record was last updated.
|A unique CypComp accession number consisting of a two-letter prefix (CC) and a five-digit suffix. This ID is used to access the CypCompound entry (i.e. its CypComp Card) via its URL. If an entry is deleted, its CypComp ID will not be reused.
|The name of the compound.
|The compound's 2D chemical structure, rendered as a PNG image.
|The compound's standard 27-character IUPAC International Chemical Identifier Key (InChIKey), a hashed version of the full InChI designed to allow for easy web searches of chemical compounds.
|The identifier assigned by PubChem corresponding to the compound's InChIKey. Clicking on the identifier link takes the user to the compound's PubChem page.
|The tool (CypReact or CypBoM) and the set (Training or Testing) to which the CypCompound belongs.
|Classifies a given compound as a reactant (R) or non-reactant (N) for a given CYP enzyme, or as unknown when no evidence is available.
Reactant: a compound reported in the scientific literature to be a substrate of a given CYP enzyme
Non-Reactant: a compound reported in the scientific literature not to be metabolized by a given CYP enzyme
Unknown: no scientific literature was found indicating whether the compound is or is not a substrate of a given CYP enzyme
|Bond of Metabolism (BoM)
|Describes the location (i.e. the exact bonds) where a CYP-mediated chemical reaction occurs for a given compound. Each BoM is specified by a 4-tuple in the format: <X;Y;ReactionType;ReactionID>.
X and Y: the pair of atoms joined by the bond, which either already exists in the molecule or is formed in the reaction. X and Y can be either atom numbers (corresponding to the atom numbers in the connection table) or element symbols.
ReactionType: records the type of reaction occurring on the bond (e.g. Oxidation, Cleavage, Reduction, Hydroxylation).
ReactionID: groups all of the bonds affected (i.e. BoMs) in a single reaction. For example, the BoMs that share the ReactionID R1 are the individual bonds changed in one reaction.
|A list of scientific literature references that report the compound's reactant statuses or its CYP450 Phase I metabolism.
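The field formats described in the table above lend themselves to simple mechanical checks. The following is a minimal sketch, not part of CypComp itself: it tests the shape of a CypComp ID (CC plus five digits) and of a standard InChIKey (27 characters in 14-10-1 blocks), parses a BoM 4-tuple, and groups BoMs by ReactionID. All function and class names are hypothetical, and the BoM strings are made-up examples rather than records from the database.

```python
import re
from collections import defaultdict
from typing import NamedTuple

CYPCOMP_ID = re.compile(r"CC[0-9]{5}")
# Shape check only: 14 letters, 10 letters, 1 letter, separated by hyphens.
INCHIKEY_SHAPE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


class BoM(NamedTuple):
    x: str              # atom number or element symbol
    y: str              # atom number or element symbol
    reaction_type: str  # e.g. "Hydroxylation"
    reaction_id: str    # e.g. "R1"


def is_cypcomp_id(text):
    """True if text is 'CC' followed by exactly five digits."""
    return CYPCOMP_ID.fullmatch(text) is not None


def has_inchikey_shape(text):
    """True if text has the 27-character layout of a standard InChIKey."""
    return INCHIKEY_SHAPE.fullmatch(text) is not None


def parse_bom(text):
    """Parse '<X;Y;ReactionType;ReactionID>' into a BoM tuple."""
    body = text.strip()
    if body.startswith("<") and body.endswith(">"):
        body = body[1:-1]
    parts = [part.strip() for part in body.split(";")]
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected 4 non-empty fields, got: {text!r}")
    return BoM(*parts)


def group_by_reaction(boms):
    """Collect BoMs that share a ReactionID, i.e. belong to one reaction."""
    grouped = defaultdict(list)
    for bom in boms:
        grouped[bom.reaction_id].append(bom)
    return dict(grouped)


# Made-up BoM strings for illustration.
example_boms = [
    parse_bom("<1;2;Hydroxylation;R1>"),
    parse_bom("<2;O;Oxidation;R1>"),
    parse_bom("<5;6;Cleavage;R2>"),
]
for reaction_id, bonds in group_by_reaction(example_boms).items():
    print(reaction_id, [(b.x, b.y, b.reaction_type) for b in bonds])
```

X and Y are kept as strings because the format allows either atom numbers or element symbols; a caller that needs integer atom indices can convert the values that are purely numeric.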