Adding data into HIPPO
In theory the HIPPO class makes it simple to insert new material into the database, but there are nuances, especially to improve performance.
In Getting started with HIPPO we saw the HIPPO.add_hits() method which loads all the data from a Fragalysis download or XChemAlign alignment. In this documentation page we’ll show how to insert data using other means.
Registering compounds
The HIPPO.register_compound() method takes as a minimum a smiles string and adds a compound to the database:
c = animal.register_compound(smiles='Clc1c[nH]cn1')
N.B. to prevent accidents, positional arguments are disabled for many methods in HIPPO.
In the case that a compound with this structure already exists, the compound is instead retrieved from the database (duplicates will never be created).
Registering compounds in bulk
The HIPPO.register_compounds() method can insert many compounds at once from a list of smiles
smiles = [...]
values = animal.register_compounds(smiles=smiles)
Part of the HIPPO compound registration process involves sanitising and flattening of smiles and generating InChI-keys. These returned values is a list of (inchikey, new_smiles) tuples which are in the same order as the input smiles.
Registering a Target and associated Poses
The Target class currently only stores a name, but it is a requirement that each Pose is associated with a single Target.
To create (or retrieve) a target:
a71 = animal.register_target(name="A71EV2A")
Then a pose can be registered:
animal.register_pose(
compound=c,
target=a71,
path=/path/to/bound.pdb, # protein ligand complex PDB
)
Optional arguments include:
tags
metadata
inspirations (other poses that inspired this one)
reference (in case your path is to a file containing only a ligand .mol, the reference points to another pose who’s protein will be used)
See also the API reference HIPPO.register_target() and HIPPO.register_pose().
Registering poses in bulk
The Database.register_poses() method can insert many poses at once from a list of dictionaries:
dicts = [
dict(
alias=..., # string can be None
reference_id=..., # reference pose id
inchikey=..., # pre-computed inchikey
smiles=..., # SMILEs
path=..., # path to mol-file on disk, used for uniqueness check, can be a fake path
compound_id=..., # Compound database ID
target_id=..., # Target database ID
mol=..., # rdkit.Chem.Mol
energy_score=..., # float, can be None
distance_score=..., # float, can be None
metadata=..., # dictionary, can be empty
),
...
]
pose_ids = animal.db.register_poses(dicts=dicts)
poses = animal.poses[pose_ids]
Loading data from an SDF
HIPPO can load compounds, poses, and their related information from a formatted SDF using the HIPPO.load_sdf() method:
animal.load_sdf(
target=..., # name of target
path=..., # path to SDF
)
Refer to the full API reference for more details: HIPPO.load_sdf()
Registering a reaction
To add a reaction:
animal.register_reaction(
type="Amidation",
product=c,
reactants=[a,b],
)
The above will register an amidation reaction combining compounds a and b into c.
See also the API reference HIPPO.register_reaction().