Sampling

Random recipe/selection generators

class hippo.rgen.RandomRecipeGenerator(db, *, max_lead_time=None, suppliers: list | None = None, start_with: Recipe | CompoundSet | IngredientSet | None = None, route_pool: RouteSet | None = None, out_key: str | None = None)

Class to create a randomly sampled Recipe from a HIPPO Database

RandomRecipeGenerator initialisation

str(self) → str

Unformatted string representation

dump_data()

Dump data to JSON

classmethod from_json(db: Database, path: Path | str)

Construct the RandomRecipeGenerator from a JSON file

generate(budget: float = 10000, currency: str = 'EUR', max_products: int = 1000, max_reactions: int = 1000, debug: bool = False, max_iter: int | None = None, shuffle: bool = True, balance_clusters: bool = False, permitted_clusters: None | set = None)

Generate random recipe

Parameters:
  • budget – maximum budget (Default value = 10000)

  • currency – currency (Default value = ‘EUR’)

  • max_products – maximum number of products (Default value = 1000)

  • max_reactions – maximum number of reactions (Default value = 1000)

  • debug – increase verbosity for debugging (Default value = False)

  • max_iter – maximum number of iterations (Default value = None)

  • shuffle – randomly shuffle recipe pool (Default value = True)

  • balance_clusters – balance selection across scaffold clusters (Default value = False)

  • permitted_clusters – restrict selection to provided set of clusters (Default value = None)
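
To illustrate what balance_clusters and permitted_clusters might mean in practice, here is a minimal sketch on toy data. It does not use or reproduce HIPPO's implementation: the `sample_balanced` function, the (item, cluster) pair format, and the round-robin strategy are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_balanced(pool, max_products, permitted_clusters=None, seed=0):
    """Pick items round-robin across clusters (hypothetical helper).

    pool is a list of (item, cluster) pairs standing in for a route pool.
    """
    rng = random.Random(seed)

    # Group items by cluster, dropping clusters that are not permitted
    by_cluster = defaultdict(list)
    for item, cluster in pool:
        if permitted_clusters is not None and cluster not in permitted_clusters:
            continue
        by_cluster[cluster].append(item)

    # Shuffle within each cluster so the pick inside a cluster is random
    for items in by_cluster.values():
        rng.shuffle(items)

    # Take one item per cluster per round until the cap is reached
    chosen = []
    clusters = sorted(by_cluster)
    while len(chosen) < max_products and any(by_cluster[c] for c in clusters):
        for c in clusters:
            if by_cluster[c] and len(chosen) < max_products:
                chosen.append((by_cluster[c].pop(), c))
    return chosen


# Toy pool: cluster "A" is heavily over-represented
pool = [(f"route{i}", "A") for i in range(8)] + [("route8", "B"), ("route9", "C")]
picked = sample_balanced(pool, max_products=4, permitted_clusters={"A", "B"})
```

In this toy run cluster "C" is excluded entirely, and the single route in cluster "B" is always picked even though "A" dominates the pool, which is the effect balancing is meant to have.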

get_route_pool(mini_test=False)

Construct the pool of routes that will be randomly sampled from

Parameters:

mini_test – (Default value = False)

property route_pool

Get the RouteSet of all product reaction routes considered by this generator

class hippo.rgen.RandomSelectionGenerator(db, *, suppliers: list | None = None, amount: float = 1.0, start_with: Recipe | CompoundSet | IngredientSet = None, compounds: CompoundSet | None = None, quoted_only: bool = True)

Class to create a randomly sampled (no-chemistry) Recipe from a HIPPO Database

RandomSelectionGenerator initialisation

str(self) → str

Unformatted string representation

property amount: float

Amount to quote each compound for

property compound_pool: CompoundTable | CompoundSet

The pool of compounds that will be chosen from

dump_data()

Dump data to JSON

classmethod from_json(db: Database, path: Path | str) → RandomSelectionGenerator

Construct the RandomSelectionGenerator from a JSON file

generate(budget: float = 10000, currency: str = 'EUR', max_iter: int | None = None, max_compounds: int = 1000, debug: bool = False, shuffle: bool = True)

Generate random selection

Parameters:
  • budget – maximum budget

  • currency – currency

  • max_iter – maximum number of iterations

  • max_compounds – maximum number of compounds

  • debug – Increase verbosity for debugging

  • shuffle – Randomise order of compound pool
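
The interplay of budget, max_compounds, max_iter and shuffle can be sketched with a simple greedy loop over toy prices. This is an illustration under stated assumptions, not HIPPO's algorithm: the `random_selection` function, the price dictionary, and the skip-if-too-expensive rule are hypothetical.

```python
import random


def random_selection(prices, budget=10000, max_compounds=1000,
                     max_iter=None, shuffle=True, seed=0):
    """Randomly add compounds while staying within budget (hypothetical helper).

    prices maps a compound name to its quoted price (toy data).
    """
    names = list(prices)
    if shuffle:
        # Randomise the order in which the pool is visited
        random.Random(seed).shuffle(names)

    chosen, total = [], 0.0
    for i, name in enumerate(names):
        if max_iter is not None and i >= max_iter:
            break  # iteration cap reached
        if len(chosen) >= max_compounds:
            break  # compound cap reached
        if total + prices[name] > budget:
            continue  # this compound would exceed the budget, try the next
        chosen.append(name)
        total += prices[name]
    return chosen, total


prices = {"c1": 40.0, "c2": 70.0, "c3": 25.0, "c4": 90.0, "c5": 10.0}
chosen, total = random_selection(prices, budget=100, max_compounds=3)
```

Whatever order the shuffle produces, the result never exceeds the budget or the compound cap; with shuffle=False the pool is visited in its given order, which makes runs reproducible without a seed.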

get_compound_pool(compounds: CompoundSet | None) → CompoundTable | CompoundSet

Get pool of compounds to select from

get_starting_recipe(start_with: Recipe | CompoundSet | IngredientSet) → Recipe

Process start_with into Recipe object

property quoted_only: bool

Only consider compounds with quotes

Scoring recipes

class hippo.scoring.Scorer(db: Database, directory: Path | str, pattern: str = '*.json', attributes: list[str] = None, populate: bool = True, load_cache: bool = True, allowed_poses: PoseSet | list[int] | None = None, out_key: str = 'scorer')

Create a scorer object to score sets of recipes

Parameters:
  • db – Database

  • directory – path to directory containing recipe JSONs

  • pattern – glob pattern for Recipe JSON, default: “*.json”

  • attributes – attributes of Recipe objects to use for scoring

  • populate – Pre-populate query caches and child objects in memory (don’t disable unless you have a good reason)

  • load_cache – Load cache from existing JSON

  • allowed_poses – Restrict interaction and subsite calculations to these poses (PoseSet or list of Pose IDs)
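
As an illustration of how directory and pattern select recipe files, the sketch below uses standard pathlib globbing on a temporary directory. It assumes the Scorer applies ordinary glob semantics; the `find_recipe_jsons` helper, the file names, and the file contents are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def find_recipe_jsons(directory, pattern="*.json"):
    """Return paths in directory matching the glob pattern, sorted by name."""
    return sorted(Path(directory).glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Two toy "recipe" files and one unrelated file
    (tmp / "recipe_a.json").write_text(json.dumps({"id": "a"}))
    (tmp / "recipe_b.json").write_text(json.dumps({"id": "b"}))
    (tmp / "notes.txt").write_text("not a recipe")

    all_json = [p.name for p in find_recipe_jsons(tmp)]
    only_a = [p.name for p in find_recipe_jsons(tmp, "*_a.json")]
```

Here the default pattern picks up both JSON files and ignores the text file, while the narrower pattern "*_a.json" selects only recipe_a.json.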

Scorer initialisation

repr(self) → str

ANSI Formatted string representation

str(self) → str

Unformatted string representation

add_custom_attribute(key: str, function: Callable, weight_reset_warning: bool = True) → CustomAttribute

Add a custom scoring attribute

Parameters:
  • key – name/key for the attribute

  • function – function called to get the attribute value; it will be passed a Recipe object

  • weight_reset_warning – write a warning to indicate weights have been reset
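
A custom attribute function takes a Recipe and returns a number. The sketch below shows the shape of such a function using a stand-in class; `ToyRecipe`, its `num_products` and `num_reactions` fields, and the plain dictionary registry are hypothetical and are not part of HIPPO.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyRecipe:
    """Hypothetical stand-in for a Recipe; field names are illustrative only."""
    num_products: int
    num_reactions: int


def products_per_reaction(recipe) -> float:
    """Return one number per recipe, guarding against division by zero."""
    if recipe.num_reactions == 0:
        return 0.0
    return recipe.num_products / recipe.num_reactions


# A plain dict stands in for the scorer's attribute registry
custom_attributes: dict[str, Callable] = {}
custom_attributes["products_per_reaction"] = products_per_reaction

value = custom_attributes["products_per_reaction"](ToyRecipe(12, 4))
```

With a real Scorer, the same function would be registered by passing it as the function argument of add_custom_attribute together with a key of your choice.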

add_recipes(json_paths: list, debug: bool = False) → None

Add more serialised Recipe objects to be scored

Parameters:
  • json_paths – list of JSON paths

  • debug – increase verbosity for debugging

property attribute_keys: list[str]

Return list of Attribute / CustomAttribute names/keys

property attributes: list[Attribute | CustomAttribute]

Return list of Attribute / CustomAttribute objects

property best: Recipe

Return highest scoring Recipe

compare(recipes: list[Recipe] | list[str]) → None

Compare attribute values and scores for recipes

Parameters:

recipes – list of Recipe objects or hashes

property db: Database

Database

classmethod default(db: Database, directory: Path | str, pattern: str = '*.json', skip: list[str] | None = None, load_cache: bool = True, subsites: bool = True, allowed_poses: PoseSet | list[int] | None = None, out_key: str = 'scorer') → Scorer

Create a Scorer instance with default attributes

get_sorted_df() → DataFrame

Get DataFrame sorted by descending score

property json_path: Path

Path where cache will be written

property num_attributes: int

Count of attributes

property num_recipes: int

Number of recipes being evaluated

plot(keys: list[str], budget: float | None = None) → plotly.graph_objects.Figure

Plot any two attributes as a scatter plot

Parameters:
  • keys – list of two attribute keys to plot

  • budget – limit Recipe objects to below this budget value

Returns:

plotly Figure object containing a scatter trace

property poses: PoseSet

Return all associated poses as PoseSet

property recipes: RecipeSet

Return RecipeSet of recipes being scored

score(recipe: Recipe, *, debug: bool = False) → float

Score a Recipe object

Parameters:
  • recipe – Recipe to be scored

  • debug – increase verbosity for debugging

Returns:

float score from 0 to 1
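
One way a set of per-attribute scores and weights can be combined into a single value between 0 and 1 is a weighted mean. The sketch below assumes that approach purely for illustration; the source does not state how HIPPO combines attribute scores, and the `combine_scores` function and attribute keys are hypothetical.

```python
def combine_scores(unweighted, weights):
    """Weighted mean of per-attribute scores (assumed combination rule).

    unweighted maps attribute key to a score in [0, 1];
    weights maps the same keys to non-negative weights.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    weighted_sum = sum(weights[k] * unweighted[k] for k in weights)
    return weighted_sum / total_weight


# Hypothetical attribute keys with toy values
score = combine_scores(
    {"attribute_a": 0.75, "attribute_b": 0.25},
    {"attribute_a": 3.0, "attribute_b": 1.0},
)
# score is 0.625: attribute_a counts three times as much as attribute_b
```

Dividing by the total weight keeps the result in the 0 to 1 range whenever every per-attribute score is in that range and the weights are non-negative.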

property score_dict: dict[str, float]

Dictionary of scores keyed by Recipe.hash()

property scores: list[float]

List of Recipe scores

summary() → None

Print some summary statistics of the scorer’s attributes

top(n: int, budget: float | None = None) → list[Recipe]

Return the top n scoring Recipe objects

Parameters:
  • n – number of Recipe objects to return

  • budget – limit Recipe objects to below this budget value

Returns:

list of Recipe objects

top_keys(n: int, budget: float | None = None) → list[str]

Return the keys of the top n scoring Recipe objects

Parameters:
  • n – number of keys to return

  • budget – limit Recipe objects to below this budget value

Returns:

list of Recipe hashes
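
The behaviour described for top and top_keys (rank by score, optionally keeping only recipes below a budget) can be sketched on toy dictionaries. The `top_n_keys` function, the recipe keys, and the price values are hypothetical; this is not HIPPO's implementation.

```python
def top_n_keys(scores, prices, n, budget=None):
    """Return the keys of the n highest-scoring entries (hypothetical helper).

    scores and prices are dicts keyed by a recipe identifier.
    If budget is given, only entries priced strictly below it are kept.
    """
    keys = [k for k in scores if budget is None or prices[k] < budget]
    keys.sort(key=lambda k: scores[k], reverse=True)
    return keys[:n]


scores = {"r1": 0.9, "r2": 0.7, "r3": 0.8, "r4": 0.4}
prices = {"r1": 12000.0, "r2": 8000.0, "r3": 9500.0, "r4": 3000.0}

unrestricted = top_n_keys(scores, prices, 2)            # ["r1", "r3"]
within_budget = top_n_keys(scores, prices, 2, budget=10000)  # ["r3", "r2"]
```

The best-scoring entry "r1" drops out once the budget applies because its price is above the limit, so the ranking is taken over the remaining entries only.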

property weights: list[float]

List of attribute weights

class hippo.scoring.Attribute(scorer: Scorer, key: str, *, inverse: bool = False, weight: float = 1.0, bins: int = 100)

Scoring Attribute to be used with a Scorer object

Parameters:
  • scorer – associated Scorer

  • key – key/name for the attribute

  • inverse – if true, lower values score higher

  • weight – adjust scores by this weight

  • bins – number of scoring bins

Attribute initialisation

self(recipe: Recipe) → float

Return the weighted score of a given Recipe

repr(self) → str

ANSI Formatted string representation

str(self) → str

Unformatted string representation

property bins: int

Number of bins

get_value(recipe: Recipe, serialise_price: bool = True, force: bool = False) → float

Get value for a Recipe

Parameters:
  • serialise_price – serialise Price objects to their amount

  • force – force calculation? (don’t use cache)

histogram() → plotly.graph_objects.Figure

Plot histogram of attribute values

property inverse: bool

Whether this attribute is inverted; if true, lower values score higher

property key: str

Get name/key

property max: float

Return maximum of values

property mean: float

Return mean of values

property min: float

Return minimum of values

property percentile_interpolator

Interpolator function

property scorer: Scorer

Get associated Scorer

property std: float

Return standard deviation of values

unweighted(recipe: Recipe) → float

Return unweighted percentile score for a given Recipe
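
A percentile score places one recipe's value relative to the values of all recipes being scored. The sketch below computes an exact rank-based percentile with the standard library, and shows how inverse and weight could modify it. It is an illustration only: the `percentile_score` function is hypothetical, and it skips the binning and interpolation that the bins parameter and percentile_interpolator property suggest.

```python
from bisect import bisect_left, bisect_right


def percentile_score(value, population, inverse=False):
    """Fraction of the population that value matches or beats (hypothetical).

    With inverse=False, higher values score higher.
    With inverse=True, lower values score higher.
    """
    ordered = sorted(population)
    n = len(ordered)
    if inverse:
        # Fraction of values greater than or equal to this one
        return (n - bisect_left(ordered, value)) / n
    # Fraction of values less than or equal to this one
    return bisect_right(ordered, value) / n


population = [10, 20, 30, 40]

normal = percentile_score(30, population)                  # 0.75
inverted = percentile_score(30, population, inverse=True)  # 0.5
cheapest = percentile_score(10, population, inverse=True)  # 1.0

# A weighted score is assumed here to be weight times the unweighted score
weight = 2.0
weighted = weight * normal  # 1.5
```

An inverted attribute suits quantities such as price, where the smallest value in the population should receive the top score.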

property value_dict: dict[str, float]

Dictionary of attribute values keyed by Recipe hash

property values: list[float]

Return list of values

property weight: float

Return weight

class hippo.scoring.CustomAttribute(scorer: Scorer, key: str, function: Callable)

Scoring attribute with a custom function

CustomAttribute initialisation

get_value(recipe: Recipe, serialise_price: bool = True, force: bool = False) → float

Compute custom attribute value for provided Recipe

Parameters:
  • serialise_price – serialise Price objects to their amount

  • force – force calculation? (don’t use cache)