GAINS package
gains.engine module
-
gains.engine.get_best(get_fitness, optimalFitness, geneSet, display, show_ion, target, parent_candidates)
The primary public function of the engine.
Parameters: - get_fitness (function) – the fitness function. Usually based on a molecular property. An example can be found in the salt_generator module
- optimalFitness (float) – a value between 0 and 1 specifying how close the engine should get to the target (1 = exact match)
- geneSet (object) – consists of atom types (by atomic number), RDKit molecular fragments, and custom fragments (currently hard-coded into the engine). These are the building blocks the engine uses to mutate the molecular candidate via the _mutate() function
- display (function) – for printing results to the screen. display is called for every accepted mutation
- show_ion (function) – for printing results to the screen. show_ion is called when a candidate has achieved the desired fitness score and is returned by the engine
- target (array, float, or int) – the desired property value to be achieved by the engine. If an array, a model containing multi-output targets must be supplied to the engine
- parent_candidates (array) – an array of SMILES strings from which the engine chooses a starting atomic configuration
Returns: child – the accepted molecular configuration. See Chromosome class for details
Return type: Chromosome object
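The call pattern described above can be illustrated with a toy hill climber that needs no dependencies. This is only a sketch under stated assumptions, not the GAINS implementation: `ToyChromosome`, `toy_get_best`, `make_fitness`, `mutate`, the sum-of-atomic-numbers "property", the `parent_genes` argument, and the `max_iterations` and `seed` guards are all hypothetical stand-ins. It assumes only what the parameter descriptions state: `display` is called for every accepted mutation, and `show_ion` is called once a candidate reaches `optimalFitness`.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyChromosome:
    """Hypothetical stand-in for Chromosome: genes plus a 0-1 fitness."""
    genes: list
    fitness: float


def make_fitness(target):
    """Build a toy fitness function; 1.0 means the target is matched exactly."""
    def get_fitness(genes):
        predicted = sum(genes)  # toy "property": sum of atomic numbers
        return max(0.0, 1.0 - abs(predicted - target) / abs(target))
    return get_fitness


def mutate(genes, gene_set, rng):
    """Return a copy of genes with one building block added, removed, or swapped."""
    child = list(genes)
    move = rng.choice(["add", "remove", "swap"])
    if move == "add" or len(child) < 2:
        child.append(rng.choice(gene_set))
    elif move == "remove":
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.choice(gene_set)
    return child


def toy_get_best(get_fitness, optimalFitness, geneSet, display, show_ion,
                 parent_genes, max_iterations=10_000, seed=0):
    rng = random.Random(seed)
    best = ToyChromosome(list(parent_genes), get_fitness(parent_genes))
    for _ in range(max_iterations):
        if best.fitness >= optimalFitness:
            show_ion(best)      # the candidate reached the desired fitness
            break
        genes = mutate(best.genes, geneSet, rng)
        child = ToyChromosome(genes, get_fitness(genes))
        if child.fitness >= best.fitness:
            display(child)      # called for every accepted mutation
            best = child
    return best


accepted, finished = [], []
start = [6, 6]                  # two carbons as the starting configuration
fitness_fn = make_fitness(target=30)
best = toy_get_best(fitness_fn, 0.95, [6, 7, 8, 9],
                    accepted.append, finished.append, start)
```

Passing `accepted.append` and `finished.append` as the two callbacks records every accepted mutation and the final candidate, which is a convenient way to inspect a run without printing to the screen.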
-
gains.engine.molecular_similarity(best, parent_candidates, all=False)
Returns a similarity score (0-1) between best and its closest molecular relative in parent_candidates.
Parameters: - best (object) – Chromosome object, the current mutated candidate
- parent_candidates (array) – parent pool of molecules to compare with best. These are represented by SMILES
- all (boolean, optional, default = False) – if False (the default), only the Tanimoto similarity score is returned. If True, the Tanimoto, Dice, cosine, Sokal, Kulczynski, and McConnaughey similarities are returned
Returns: - similarity_score (float)
- similarity_index (int) – if all=False, the best Tanimoto similarity score and the index of the closest molecular relative are returned; if all=True, an array of best scores and the indices of the closest molecular relatives are returned
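To make the returned score and index concrete, the following minimal sketch computes three of the listed metrics on toy fingerprints represented as Python sets of "on" bit positions. It is an illustration of the metrics themselves, not of how GAINS or RDKit computes them; `closest_relative` and the toy bit sets are hypothetical.

```python
import math


def tanimoto(a, b):
    """Shared bits divided by the bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a, b):
    """Twice the shared bits divided by the total bits set in both."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a, b):
    """Shared bits divided by the geometric mean of the two bit counts."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def closest_relative(best_bits, parent_bits, metric=tanimoto):
    """Return (best score, index of the closest parent) under one metric."""
    scores = [metric(best_bits, bits) for bits in parent_bits]
    index = max(range(len(scores)), key=scores.__getitem__)
    return scores[index], index


candidate = {1, 2, 3, 4}
parents = [{1, 2}, {1, 2, 3, 5}, {7}]

# Single-metric result, analogous to all=False
score, index = closest_relative(candidate, parents)

# One (score, index) pair per metric, analogous to all=True
all_scores = {m.__name__: closest_relative(candidate, parents, m)
              for m in (tanimoto, dice, cosine)}
```

Here the second parent shares three of five distinct bits with the candidate, so it is the closest relative under all three metrics.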
-
class gains.engine.suppress_rdkit_sanity
Bases: object
Context manager for doing a “deep suppression” of stdout and stderr during certain calls to RDKit.
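A common way to achieve this kind of deep suppression is to redirect the process-level file descriptors 1 and 2 to the null device, which also silences output written by compiled extensions that bypass `sys.stdout`. The sketch below shows that general technique on a POSIX-like system; `suppress_output` is a hypothetical name, and the source does not confirm that GAINS implements its context manager this way.

```python
import os
import sys


class suppress_output:
    """Temporarily redirect file descriptors 1 and 2 to the null device."""

    def __enter__(self):
        # Flush Python-level buffers so earlier output is not lost or delayed
        sys.stdout.flush()
        sys.stderr.flush()
        self._null_fds = [os.open(os.devnull, os.O_RDWR) for _ in range(2)]
        self._saved_fds = [os.dup(1), os.dup(2)]   # remember the real targets
        os.dup2(self._null_fds[0], 1)
        os.dup2(self._null_fds[1], 2)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore the original descriptors, then release the temporary ones
        os.dup2(self._saved_fds[0], 1)
        os.dup2(self._saved_fds[1], 2)
        for fd in self._null_fds + self._saved_fds:
            os.close(fd)
        return False   # do not swallow exceptions


with suppress_output():
    os.write(2, b"this low-level write to stderr is discarded\n")
print("output is visible again")
```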
-
gains.engine.generate_geneset()
Populates the GeneSet class with atoms and fragments to be used by the engine. As it stands, these are hard-coded into the engine but will probably be adapted in future versions.
Parameters: None
Returns: GeneSet – an instance of the GeneSet class containing atoms, RDKit fragments, and custom fragments
Return type: object
-
gains.engine.load_data(data_file_name, pickleFile=False, simpleList=False)
Loads data from module_path/data/data_file_name.
Parameters: - data_file_name (string) – name of the CSV file to be loaded from module_path/data/
- pickleFile (boolean, optional, default = False) – if True, opens a pickled file
- simpleList (boolean, optional, default = False) – if True, opens the saved list and properly handles split lines
Returns: data
Return type: Pandas DataFrame
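The three loading modes can be sketched with the standard library alone. This is a hedged stand-in rather than the GAINS function: `load_data_sketch` and its explicit `data_dir` argument are hypothetical, it returns plain Python objects instead of a pandas DataFrame so that it runs without pandas, and it assumes that "handling split lines" means stripping line endings and skipping blank lines.

```python
import csv
import os
import pickle
import tempfile


def load_data_sketch(data_dir, data_file_name, pickleFile=False, simpleList=False):
    """Load a file from data_dir as a pickle, a plain list, or CSV rows."""
    path = os.path.join(data_dir, data_file_name)
    if pickleFile:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    if simpleList:
        with open(path) as handle:
            # Assumed behavior: one entry per line, blank lines ignored
            return [line.strip() for line in handle if line.strip()]
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Build a throwaway data directory to exercise the CSV and list modes
with tempfile.TemporaryDirectory() as data_dir:
    with open(os.path.join(data_dir, "toy.csv"), "w", newline="") as handle:
        handle.write("smiles,density\nCCO,0.789\n")
    with open(os.path.join(data_dir, "toy.txt"), "w") as handle:
        handle.write("CCO\n\nCCN\n")
    rows = load_data_sketch(data_dir, "toy.csv")
    names = load_data_sketch(data_dir, "toy.txt", simpleList=True)
```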
-
class gains.engine.Chromosome(genes, fitness)
Bases: rdkit.Chem.rdchem.Mol
The main object handled by the engine. The Chromosome object inherits the RWMol and Mol attributes from RDKit and adds two attributes: genes and fitness. genes is the SMILES encoding of the molecule; fitness is the score (0-1) returned by the fitness function.
-
class gains.engine.GeneSet(atoms, rdkitFrags, customFrags)
Bases: object
Consists of atom types (by atomic number), RDKit molecular fragments, and custom fragments (currently hard-coded into the engine). These are the building blocks the engine uses to mutate the molecular candidate via the _mutate() function.
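As a rough picture of such a container, the sketch below groups the three kinds of building blocks and draws one at random, which is the sort of choice a mutation step would make. `ToyGeneSet`, its attribute names, the `random_block` helper, and the example fragment labels are hypothetical and are not taken from the GAINS source.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyGeneSet:
    """Hypothetical container mirroring the three building-block categories."""
    atoms: list = field(default_factory=list)         # atomic numbers
    rdkit_frags: list = field(default_factory=list)   # fragment labels
    custom_frags: list = field(default_factory=list)  # custom fragments as SMILES

    def random_block(self, rng):
        """Pick a non-empty category at random, then one block from it."""
        pools = [("atom", self.atoms),
                 ("rdkit_frag", self.rdkit_frags),
                 ("custom_frag", self.custom_frags)]
        pools = [(kind, pool) for kind, pool in pools if pool]
        kind, pool = rng.choice(pools)
        return kind, rng.choice(pool)


gene_set = ToyGeneSet(atoms=[6, 7, 8, 9],
                      rdkit_frags=["fragment_a", "fragment_b"],
                      custom_frags=["C(F)(F)F", "C#N"])
kind, block = gene_set.random_block(random.Random(0))
```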
gains.salt_generator module
-
gains.salt_generator.generate_solvent(target, model_ID, heavy_atom_limit=50, sim_bounds=[0.4, 1.0], hits=1, write_file=False)
The primary public function of the salt_generator module.
Parameters: - target (array, float, or int) – the desired property value to be achieved by the engine. If an array, a multi-output model must be supplied to the engine
- model_ID (str) – the name of the model to be used by the engine. GAINS has several built-in models to choose from
- heavy_atom_limit (int, optional) – the maximum number of heavy atoms allowed in the returned candidate
- sim_bounds (array, optional) – the lower and upper bounds on the Tanimoto similarity score between the returned candidate and its closest molecular relative in parent_candidates
- hits (int, optional) – the number of desired solutions
- write_file (boolean, optional) – defaults to False. If True, returns the solutions and a CSV log file
Returns: new – by default, a pandas DataFrame that serves as a log of the solution(s). If write_file = True, the function also returns PDB files of the solutions
Return type:
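How heavy_atom_limit, sim_bounds, and hits might interact can be sketched as a simple acceptance filter. This is an assumption-laden illustration, not the GAINS logic: `is_hit`, `collect_hits`, and the candidate tuples are hypothetical, and both the heavy-atom limit and the similarity bounds are assumed to be inclusive.

```python
def is_hit(heavy_atoms, similarity, heavy_atom_limit=50, sim_bounds=(0.4, 1.0)):
    """Accept a candidate that is small enough and within the similarity window."""
    low, high = sim_bounds
    return heavy_atoms <= heavy_atom_limit and low <= similarity <= high


def collect_hits(candidates, hits=1, **limits):
    """Gather candidate names until the requested number of hits is reached."""
    accepted = []
    for name, heavy_atoms, similarity in candidates:
        if is_hit(heavy_atoms, similarity, **limits):
            accepted.append(name)
        if len(accepted) == hits:
            break
    return accepted


# (name, heavy atom count, similarity to the closest parent) - made-up values
candidates = [
    ("too_large", 62, 0.70),
    ("too_different", 20, 0.25),
    ("ok_1", 18, 0.55),
    ("ok_2", 30, 1.00),
]
first = collect_hits(candidates)
both = collect_hits(candidates, hits=2)
stricter = collect_hits(candidates, hits=2, sim_bounds=(0.6, 1.0))
```

Raising the lower similarity bound keeps generated candidates closer to known molecules, while lowering it allows more novel structures through the filter.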