Sampling

Random recipe/selection generators

Class to create randomly sampled Recipe from a HIPPO Database

RandomRecipeGenerator initialisation

str(self) → str[source]: Unformatted string representation

dump_data()[source]: Dump data to JSON

classmethod from_json(db: Database, path: Path | str)[source]: Construct the RandomRecipeGenerator from a JSON file

generate(budget: float = 10000, currency: str = 'EUR', max_products: int = 1000, max_reactions: int = 1000, debug: bool = False, max_iter: int | None = None, shuffle: bool = True, balance_clusters: bool = False, permitted_clusters: None | set = None)[source]

Generate random recipe

Parameters:

budget – maximum budget (Default value = 10000)
currency – currency (Default value = ‘EUR’)
max_products – maximum number of products (Default value = 1000)
max_reactions – maximum number of reactions (Default value = 1000)
debug – increase verbosity for debugging (Default value = False)
max_iter – maximum number of iterations (Default value = None)
shuffle – randomly shuffle recipe pool (Default value = True)
balance_clusters – balance selection across scaffold clusters (Default value = False)
permitted_clusters – restrict selection to provided set of clusters (Default value = None)
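The interaction between budget, balance_clusters and permitted_clusters can be illustrated with a minimal, self-contained sketch. It does not use hippo at all: the function sample_routes and the (product, cluster, price) tuples are hypothetical stand-ins, and the round-robin interleaving is only one plausible reading of "balance selection across scaffold clusters", not necessarily what RandomRecipeGenerator does internally.

```python
import random
from collections import defaultdict


def sample_routes(routes, budget, max_products=1000,
                  balance_clusters=False, permitted_clusters=None, seed=None):
    """Pick routes at random until the budget or product limit is reached.

    routes: list of (product, cluster, price) tuples (hypothetical stand-in
    for the route pool; the real generator works on hippo objects).
    """
    rng = random.Random(seed)

    # Restrict the pool to the permitted clusters, if any were given
    pool = [r for r in routes
            if permitted_clusters is None or r[1] in permitted_clusters]
    rng.shuffle(pool)

    if balance_clusters:
        # Assumption: balancing is modelled as a round-robin over clusters
        by_cluster = defaultdict(list)
        for route in pool:
            by_cluster[route[1]].append(route)
        queues = list(by_cluster.values())
        pool = []
        while any(queues):
            for queue in queues:
                if queue:
                    pool.append(queue.pop())

    selected, spent = [], 0.0
    for product, cluster, price in pool:
        if len(selected) >= max_products:
            break
        if spent + price > budget:
            continue  # too expensive, try the next candidate
        selected.append((product, cluster, price))
        spent += price
    return selected, spent


routes = [("p1", "A", 10.0), ("p2", "A", 10.0), ("p3", "A", 10.0),
          ("p4", "A", 10.0), ("p5", "B", 10.0)]
balanced, balanced_cost = sample_routes(routes, budget=20, balance_clusters=True, seed=0)
only_a, only_a_cost = sample_routes(routes, budget=1000, permitted_clusters={"A"}, seed=0)
```

With balancing enabled, the small cluster B is represented even though cluster A dominates the pool; without it, a purely random draw would usually favour A.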

get_route_pool(mini_test=False)[source]

Construct the pool of routes that will be randomly sampled from

Parameters:

mini_test – (Default value = False)

property route_pool: Get the RouteSet of all product reaction routes considered by this generator

class hippo.rgen.RandomSelectionGenerator(db, *, suppliers: list | None = None, amount: float = 1.0, start_with: Recipe | CompoundSet | IngredientSet = None, compounds: CompoundSet | None = None, quoted_only: bool = True)[source]

Class to create randomly sampled (no-chemistry) Recipe from a HIPPO Database

RandomSelectionGenerator initialisation

str(self) → str[source]: Unformatted string representation

property amount: float: Amount to quote each compound for

property compound_pool: CompoundTable | CompoundSet: The pool of compounds that will be chosen from

dump_data()[source]: Dump data to JSON

classmethod from_json(db: Database, path: Path | str) → RandomSelectionGenerator[source]: Construct the RandomSelectionGenerator from a JSON file

generate(budget: float = 10000, currency: str = 'EUR', max_iter: int | None = None, max_compounds: int = 1000, debug: bool = False, shuffle: bool = True)[source]

Generate random selection

Parameters:

budget – maximum budget
currency – currency
max_iter – maximum number of iterations
max_compounds – maximum number of compounds
debug – Increase verbosity for debugging
shuffle – Randomise order of compound pool
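As a rough illustration of how these parameters might bound the selection, the sketch below walks a (optionally shuffled) compound pool and adds compounds while the budget allows. It is a stand-in only: random_selection and the price dictionary are hypothetical, and the real RandomSelectionGenerator works with quotes from the HIPPO database rather than a plain mapping.

```python
import random


def random_selection(prices, budget=10000, max_iter=None,
                     max_compounds=1000, shuffle=True, seed=None):
    """Select compounds from a pool without exceeding the budget.

    prices: dict mapping a compound identifier to its quoted price
    (hypothetical stand-in for the compound pool).
    """
    pool = list(prices.items())
    if shuffle:
        random.Random(seed).shuffle(pool)

    selected, spent = [], 0.0
    for i, (compound, price) in enumerate(pool):
        if max_iter is not None and i >= max_iter:
            break  # iteration limit reached
        if len(selected) >= max_compounds:
            break  # compound limit reached
        if spent + price <= budget:
            selected.append(compound)
            spent += price
    return selected, spent


prices = {1: 40.0, 2: 70.0, 3: 30.0}
within_budget = random_selection(prices, budget=100, shuffle=False)
one_iteration = random_selection(prices, budget=100, max_iter=1, shuffle=False)
one_compound = random_selection(prices, budget=100, max_compounds=1, shuffle=False)
```

Disabling shuffle makes the walk deterministic, which is convenient for checking the limits; with shuffle=True each call can yield a different selection.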

get_compound_pool(compounds: CompoundSet | None) → CompoundTable | CompoundSet[source]: Get pool of compounds to select from

get_starting_recipe(start_with: Recipe | CompoundSet | IngredientSet) → Recipe[source]: Process start_with into Recipe object

property quoted_only: bool: Only consider compounds with quotes

Scoring recipes

class hippo.scoring.Scorer(db: Database, directory: Path | str, pattern: str = '*.json', attributes: list[str] = None, populate: bool = True, load_cache: bool = True, allowed_poses: PoseSet | list[int] | None = None, out_key: str = 'scorer')[source]

Create a scorer object to score sets of recipes

Parameters:

db – Database
directory – path to directory containing recipe JSONs
pattern – glob pattern for Recipe JSON, default: “*.json”
attributes – attributes of Recipe objects to use for scoring
populate – Pre-populate query caches and child objects in memory (don’t disable unless you have a good reason)
load_cache – Load cache from existing JSON
allowed_poses – Restrict interaction and subsite calculations to these poses (PoseSet or list of Pose IDs)

Scorer initialisation

repr(self) → str[source]: ANSI Formatted string representation

str(self) → str[source]: Unformatted string representation

add_custom_attribute(key: str, function: Callable, weight_reset_warning: bool = True) → CustomAttribute[source]

Add a custom scoring attribute

Parameters:

key – name/key for the attribute
function – function called to get the attribute value, will be passed a Recipe object
weight_reset_warning – write a warning to indicate weights have been reset
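A custom attribute function only needs to accept a recipe and return a number. The sketch below shows that shape using a hypothetical StandInRecipe dataclass; its fields (num_products, price) are assumptions made for illustration and are not the real Recipe interface.

```python
from dataclasses import dataclass


@dataclass
class StandInRecipe:
    """Hypothetical stand-in; the real hippo Recipe has a different interface."""
    num_products: int
    price: float


def price_per_product(recipe) -> float:
    """Example attribute function: receives a recipe, returns a float."""
    if recipe.num_products == 0:
        return float("inf")
    return recipe.price / recipe.num_products


# With a real Scorer, registration would follow the documented signature:
#   scorer.add_custom_attribute("price_per_product", price_per_product)
value = price_per_product(StandInRecipe(num_products=4, price=200.0))
```

Since the weights are reset when an attribute is added (hence weight_reset_warning), any custom weighting would presumably need to be reapplied afterwards.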

add_recipes(json_paths: list, debug: bool = False) → None[source]

Add more serialised Recipe objects to be scored

Parameters:

json_paths – list of JSON paths
debug – increase verbosity for debugging

property attribute_keys: list[str]: Return list of Attribute / CustomAttribute names/keys

property attributes: list[Attribute | CustomAttribute]: Return list of Attribute / CustomAttribute objects

property best: Recipe: Return highest scoring Recipe

compare(recipes: list[Recipe] | list[str]) → None[source]

Compare attribute values and scores for recipes

Parameters:

recipes – list of Recipe objects or hashes

property db: Database: Database

classmethod default(db: Database, directory: Path | str, pattern: str = '*.json', skip: list[str] | None = None, load_cache: bool = True, subsites: bool = True, allowed_poses: PoseSet | list[int] | None = None, out_key: str = 'scorer') → Scorer[source]: Create a Scorer instance with Default attributes

get_sorted_df() → DataFrame[source]: Get DataFrame sorted by descending score

property json_path: Path: Path where cache will be written

property num_attributes: int: Count of attributes

property num_recipes: int: Number of recipes being evaluated

plot(keys: list[str], budget: float | None = None) → plotly.graph_objects.Figure[source]

Plot any two attributes as a scatter plot

Parameters:

keys – list of two attribute keys to plot
budget – limit Recipe objects to below this budget value

Returns:

plotly Figure object containing a scatter trace

property poses: PoseSet: Return all associated poses as PoseSet

property recipes: RecipeSet: Return RecipeSet of recipes being scored

score(recipe: Recipe, *, debug: bool = False) → float[source]

Score a Recipe object

Parameters:

recipe – Recipe to be scored
debug – increase verbosity for debugging

Returns:

float score from 0 to 1
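One way a score in the range 0 to 1 can arise from several weighted attributes is a weighted average of per-attribute percentile scores. The sketch below is an assumption about the combination rule, not a description of the Scorer's actual implementation; combine_scores is a hypothetical helper.

```python
def combine_scores(unweighted, weights):
    """Weighted average of per-attribute scores, each assumed to lie in [0, 1].

    Assumption: weights are normalised by their sum, so the result also
    stays within [0, 1].
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * u for w, u in zip(weights, unweighted)) / total


# Two attributes: the first scores perfectly and carries three times the weight
overall = combine_scores([1.0, 0.0], [3.0, 1.0])
```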

property score_dict: dict[str, float]: Dictionary of scores keyed by Recipe.hash()

property scores: list[float]: List of Recipe scores

summary() → None[source]: Print some summary statistics of the scorer’s attributes

top(n: int, budget: float | None = None) → list[Recipe][source]

Return the top n scoring Recipe objects

Parameters:

n – number of Recipe objects to return
budget – limit Recipe objects to below this budget value

Returns:

list of Recipe objects

top_keys(n: int, budget: float | None = None) → list[str][source]

Return keys of the top n scoring Recipe objects

Parameters:

n – number of keys to return
budget – limit Recipe objects to below this budget value

Returns:

list of Recipe hashes
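The behaviour of top and top_keys with a budget can be pictured as a filter followed by a descending sort. The sketch below uses plain dictionaries keyed by hash; top_n_keys is a hypothetical helper, and whether the real budget bound is inclusive is an assumption.

```python
def top_n_keys(scores, prices, n, budget=None):
    """Return the keys of the n highest-scoring entries within budget.

    scores, prices: dicts keyed by recipe hash (hypothetical stand-ins).
    Assumption: "below this budget" is treated as strictly less than.
    """
    keys = [k for k in scores if budget is None or prices[k] < budget]
    return sorted(keys, key=lambda k: scores[k], reverse=True)[:n]


scores = {"a": 0.9, "b": 0.7, "c": 0.8}
prices = {"a": 5000.0, "b": 1000.0, "c": 2000.0}
best_two = top_n_keys(scores, prices, n=2)
best_two_affordable = top_n_keys(scores, prices, n=2, budget=3000)
```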

property weights: list[float]: List of attribute weights

class hippo.scoring.Attribute(scorer: Scorer, key: str, *, inverse: bool = False, weight: float = 1.0, bins: int = 100)[source]

Scoring Attribute to be used with a Scorer object

Parameters:

scorer – associated Scorer
key – key/name for the attribute
inverse – if true, lower values score higher
weight – adjust scores by this weight
bins – number of scoring bins
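To make inverse and weight concrete, the sketch below computes an empirical percentile score against a list of observed values. It is a simplified stand-in: percentile_score is hypothetical, it uses exact ranks rather than the binned interpolation implied by bins, and applying the weight as a plain multiplier is an assumption.

```python
from bisect import bisect_left, bisect_right


def percentile_score(value, values, inverse=False, weight=1.0):
    """Fraction of observed values that this value matches or beats.

    inverse=False: higher values score higher (fraction of values <= value).
    inverse=True:  lower values score higher (fraction of values >= value).
    """
    ordered = sorted(values)
    if inverse:
        count = len(ordered) - bisect_left(ordered, value)
    else:
        count = bisect_right(ordered, value)
    return weight * count / len(ordered)


observed = [10, 20, 30, 40]
regular = percentile_score(30, observed)
inverted = percentile_score(30, observed, inverse=True)
```

An inverted attribute suits quantities such as price, where the cheapest recipe should receive the highest score.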

Attribute initialisation

self(recipe: Recipe) → float[source]: return the weighted score of a given Recipe

repr(self) → str[source]: ANSI Formatted string representation

str(self) → str[source]: Unformatted string representation

property bins: int: Number of bins

get_value(recipe: Recipe, serialise_price: bool = True, force: bool = False) → float[source]

Get value for a Recipe

Parameters:

serialise_price – serialise Price objects to their amount
force – force calculation? (don’t use cache)

histogram() → plotly.graph_objects.Figure[source]: Plot histogram of attribute values

property inverse: bool: Whether this attribute is inverted; if true, lower values score higher

property key: str: Get name/key

property max: float: Return maximum of values

property mean: float: Return mean of values

property min: float: Return minimum of values

property percentile_interpolator: Interpolator function

property scorer: Scorer: Get associated Scorer

property std: float: Return standard deviation of values

unweighted(recipe: Recipe) → float[source]: Return unweighted percentile score for a given Recipe

property value_dict: dict[str, float]: Dictionary of attribute values keyed by Recipe hash

property values: list[float]: Return list of values

property weight: float: Return weight

class hippo.scoring.CustomAttribute(scorer: Scorer, key: str, function: Callable)[source]

Scoring attribute with a custom function

CustomAttribute initialisation

get_value(recipe: Recipe, serialise_price: bool = True, force: bool = False) → float[source]

Compute custom attribute value for provided Recipe

Parameters:

serialise_price – serialise Price objects to their amount
force – force calculation? (don’t use cache)