GAINS package

gains.engine module

gains.engine.get_best(get_fitness, optimalFitness, geneSet, display, show_ion, target, parent_candidates)

The primary public function of the engine.

Parameters:

get_fitness (function) – the fitness function, usually based on a molecular property. An example can be found in the salt_generator module
optimalFitness (float) – a value from 0 to 1 specifying how close the engine should get to the target (1 = exact)
geneSet (object) – consists of atom types (by atomic number), rdkit molecular fragments, and custom fragments (currently hard coded into the engine). These are the building blocks that the engine can use to mutate the molecular candidate via the _mutate() function
display (function) – prints results to the screen. display is called for every accepted mutation
show_ion (function) – prints results to the screen. show_ion is called when a candidate has achieved the desired fitness score and is returned by the engine
target (array, float, or int) – the desired property value to be achieved by the engine. If an array, a model containing multi-output targets must be supplied to the engine
parent_candidates (array) – an array of SMILES strings that the engine uses to choose a starting atomic configuration
Returns:	child – the accepted molecular configuration. See Chromosome class for details
Return type:	Chromosome object
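
How the arguments above interact can be illustrated with a toy loop. This is a minimal sketch, not the engine's code: it mutates plain strings rather than molecules, and toy_get_best, Candidate, match_fraction, and the rng argument are hypothetical names introduced only for this illustration. It assumes the engine accepts a mutation only when fitness improves, which the documentation above does not state.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for a Chromosome: genes plus a 0-1 fitness score.
Candidate = namedtuple("Candidate", ["genes", "fitness"])


def toy_get_best(get_fitness, optimal_fitness, gene_set, display, show_result,
                 target, parent_candidates, rng):
    """Hill-climb from a random parent until optimal_fitness is reached."""
    genes = rng.choice(parent_candidates)
    best = Candidate(genes, get_fitness(genes, target))
    display(best)
    while best.fitness < optimal_fitness:
        # Replace one position with a building block drawn from gene_set.
        index = rng.randrange(len(best.genes))
        new_gene = rng.choice(gene_set)
        genes = best.genes[:index] + new_gene + best.genes[index + 1:]
        child = Candidate(genes, get_fitness(genes, target))
        if child.fitness > best.fitness:
            best = child
            display(best)  # called for every accepted mutation
    show_result(best)  # called once, when the desired fitness is reached
    return best


def match_fraction(genes, target):
    """Toy fitness: fraction of positions that match the target string."""
    return sum(a == b for a, b in zip(genes, target)) / len(target)


accepted = []
best = toy_get_best(
    get_fitness=match_fraction,
    optimal_fitness=1.0,
    gene_set="CNO",
    display=accepted.append,
    show_result=lambda candidate: None,
    target="CCNOC",
    parent_candidates=["OOOOO", "NNNNN"],
    rng=random.Random(0),
)
```

Lowering optimal_fitness below 1.0 makes the loop stop earlier, at the first candidate whose score reaches that threshold.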

gains.engine.molecular_similarity(best, parent_candidates, all=False)

Returns a similarity score (0-1) between best and its closest molecular relative in parent_candidates.

Parameters:

best (object) – Chromosome object, the current mutated candidate
parent_candidates (array) – parent pool of molecules to compare with best. These are represented by SMILES
all (boolean, optional, default = False) – if False, only the Tanimoto similarity score is returned. If True, the Tanimoto, Dice, cosine, Sokal, Kulczynski, and McConnaughey similarities are returned

Returns:

similarity_score (float)
similarity_index (int) – if all=False, the best Tanimoto similarity score and the index of the closest molecular relative are returned. If all=True, an array of the best scores and the indices of the closest molecular relatives are returned
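
The default (all=False) behavior can be sketched without RDKit. This is an illustration under stated assumptions: fingerprints are represented as plain Python sets of "on" bit positions, and tanimoto and closest_relative are hypothetical helper names, not functions of the gains package. The Tanimoto score of two bit sets is the size of their intersection divided by the size of their union.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def closest_relative(best_bits, parent_bits):
    """Return (best score, index of the closest parent)."""
    scores = [tanimoto(best_bits, parent) for parent in parent_bits]
    index = max(range(len(scores)), key=scores.__getitem__)
    return scores[index], index


# Made-up fingerprints, purely for illustration.
best_bits = {1, 4, 7, 9}
parents = [{1, 2, 3}, {1, 4, 7, 8}, {9}]
score, index = closest_relative(best_bits, parents)
```

Here the second parent shares three of five distinct bits with the candidate, so it is the closest relative.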

class gains.engine.suppress_rdkit_sanity

Bases: object

Context manager for doing a “deep suppression” of stdout and stderr during certain calls to RDKit.
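
A common way to achieve this kind of suppression is to redirect file descriptors 1 and 2 at the operating-system level, so that output written by compiled extensions is silenced as well. The sketch below shows that general technique; SuppressOutput is a hypothetical name and this is not necessarily how suppress_rdkit_sanity is implemented.

```python
import os


class SuppressOutput:
    """Redirect OS-level stdout and stderr to the null device."""

    def __enter__(self):
        # Open the null device twice and keep copies of the real descriptors.
        self._null_fds = [os.open(os.devnull, os.O_RDWR) for _ in range(2)]
        self._saved_fds = [os.dup(1), os.dup(2)]
        os.dup2(self._null_fds[0], 1)
        os.dup2(self._null_fds[1], 2)
        return self

    def __exit__(self, *exc_info):
        # Restore the real descriptors, then close every temporary one.
        os.dup2(self._saved_fds[0], 1)
        os.dup2(self._saved_fds[1], 2)
        for fd in self._null_fds + self._saved_fds:
            os.close(fd)
        return False  # never swallow exceptions raised inside the block


with SuppressOutput():
    # Written straight to descriptor 2, so it is discarded.
    os.write(2, b"this warning is discarded\n")
```

Text buffered by Python's own sys.stdout may be flushed only after the block exits, so flush it beforehand if Python-level prints must be silenced too.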

gains.engine.generate_geneset()

Populates the GeneSet class with atoms and fragments to be used by the engine. As it stands, these are hardcoded into the engine, but this will probably be adapted in future versions.

Parameters:	None
Returns:	GeneSet – returns an instance of the GeneSet class containing atoms, rdkit fragments, and custom fragments
Return type:	object
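
The shape of such a container can be sketched as follows. Everything here is a placeholder for illustration: ToyGeneSet, toy_generate_geneset, the attribute names, the pick method, and the listed atoms and fragments are assumptions, and fragments are written as SMILES strings rather than RDKit objects.

```python
import random


class ToyGeneSet:
    """Hypothetical stand-in for GeneSet: three pools of building blocks."""

    def __init__(self, atoms, rdkit_frags, custom_frags):
        self.atoms = atoms
        self.rdkit_frags = rdkit_frags
        self.custom_frags = custom_frags

    def pick(self, rng):
        """Choose a pool at random, then a building block from that pool."""
        pool = rng.choice([self.atoms, self.rdkit_frags, self.custom_frags])
        return rng.choice(pool)


def toy_generate_geneset():
    """Return a gene set with hard-coded placeholder contents."""
    atoms = [6, 7, 8, 9]            # C, N, O, F by atomic number
    rdkit_frags = ["C(=O)O", "CO"]  # placeholder fragments as SMILES
    custom_frags = ["C#N"]          # placeholder custom fragment
    return ToyGeneSet(atoms, rdkit_frags, custom_frags)


gene_set = toy_generate_geneset()
block = gene_set.pick(random.Random(0))
```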

gains.engine.load_data(data_file_name, pickleFile=False, simpleList=False)

Loads data from module_path/data/data_file_name.

Parameters:

data_file_name (string) – name of the csv file to be loaded from module_path/data/data_file_name
pickleFile (boolean, optional, default = False) – if True, opens a pickled file
simpleList (boolean, optional, default = False) – if True, opens the saved list and properly handles split lines
Returns:	data
Return type:	Pandas DataFrame

class gains.engine.Chromosome(genes, fitness)

Bases: rdkit.Chem.rdchem.Mol

The main object handled by the engine. The Chromosome object inherits the RWMol and Mol attributes from rdkit. Two additional attributes are added: genes and fitness. genes is the SMILES encoding of the molecule; fitness is the score (0-1) returned by the fitness function.

class gains.engine.GeneSet(atoms, rdkitFrags, customFrags)

Bases: object

Consists of atom types (by atomic number), rdkit molecular fragments, and custom fragments (currently hard coded into the engine). These are the building blocks that the engine can use to mutate the molecular candidate via the _mutate() function.

class gains.engine.Benchmark

Bases: object

Benchmark method used by the unit tests.

static run()

gains.salt_generator module

gains.salt_generator.generate_solvent(target, model_ID, heavy_atom_limit=50, sim_bounds=[0.4, 1.0], hits=1, write_file=False)

The primary public function of the salt_generator module.

Parameters:

target (array, float, or int) – the desired property value to be achieved by the engine. If an array, a multi-output model must be supplied to the engine
model_ID (str) – the name of the model to be used by the engine. GAINS has several built-in models to choose from
heavy_atom_limit (int, optional) – the upper limit on the number of heavy atoms allowed in the returned candidate
sim_bounds (array, optional) – the bounds on the Tanimoto similarity score between the returned candidate and its closest molecular relative in parent_candidates
hits (int, optional) – the number of desired solutions
write_file (boolean, optional) – defaults to False. If True, returns the solutions and a csv log file
Returns:	new – by default, a pandas DataFrame that serves as a log of the solution(s). If write_file = True, the function also returns pdb files of the solutions
Return type:	object
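
How heavy_atom_limit, sim_bounds, and hits might jointly filter candidates can be sketched as below. This is an illustration only: accept_candidate and collect_hits are hypothetical helpers, the candidate tuples are made-up data, and treating both similarity bounds and the heavy atom limit as inclusive is an assumption.

```python
def accept_candidate(similarity, heavy_atoms,
                     sim_bounds=(0.4, 1.0), heavy_atom_limit=50):
    """Check a candidate against the similarity window and the size limit."""
    lower, upper = sim_bounds
    return lower <= similarity <= upper and heavy_atoms <= heavy_atom_limit


def collect_hits(candidates, hits=1, **constraints):
    """Keep accepted candidates until the requested number of hits is found."""
    accepted = []
    for name, similarity, heavy_atoms in candidates:
        if accept_candidate(similarity, heavy_atoms, **constraints):
            accepted.append(name)
            if len(accepted) == hits:
                break
    return accepted


# (name, Tanimoto similarity to closest parent, heavy atom count)
proposals = [
    ("cand_a", 0.25, 20),  # rejected: similarity below the lower bound
    ("cand_b", 0.62, 75),  # rejected: too many heavy atoms
    ("cand_c", 0.55, 31),
    ("cand_d", 0.90, 12),
]
solutions = collect_hits(proposals, hits=2)
```

Narrowing sim_bounds toward 1.0 keeps solutions close to known molecules, while widening it allows more novel structures.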