Salty

salty.core module

salty.core.load_data(data_file_name, dillFile=False)

Loads data from module_path/data/data_file_name.

Parameters:	data_file_name (str) – name of the csv or dill file to be loaded from module_path/data/, for example ‘salt_info.csv’.

Returns:	data – a data frame, for example with each row representing one salt and each column representing one feature of that salt.
Return type:	Pandas DataFrame

class salty.core.Benchmark

Bases: object

static run(function)

salty.core.check_name(user_query, index=False)

check_name uses a database to return either SMILES or IUPAC names of cations/anions.

Default behavior is to return the SMILES encoding of an ion given the ion name as input.

Parameters:	user_query (str) – string that will be used to query the database.
Returns:	output – either the name of the salt, cation, or anion; or the SMILES of the salt, cation, or anion (the SMILES for a salt is written as the cation and anion SMILES strings separated by a comma)
Return type:	str

class salty.core.dev_model(coef_data, data_summary, data)

Bases: object

The dev_model is the properly formatted object to be passed to the machine learning engine. The input features are all scaled and centered; the data summary describes the distribution of the data (in terms of state variables and output values).
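As an illustration of what "scaled and centered" can mean, here is a minimal sketch that standardizes one feature column to zero mean and unit standard deviation. The source does not say which scaler salty uses, so z-score scaling is an assumption, and `scale_and_center` is a hypothetical helper, not part of salty. The returned mean and standard deviation are the kind of scale/center information that would be needed to undo the transformation later.

```python
import statistics


def scale_and_center(column):
    """Return (scaled_values, mean, std) for one feature column.

    Hypothetical helper: z-score scaling is assumed, not confirmed by the docs.
    """
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        # A constant column has no spread; map it to zeros instead of dividing by 0.
        return [0.0 for _ in column], mean, std
    return [(x - mean) / std for x in column], mean, std


# Made-up temperatures in kelvin, for illustration only.
temperatures = [298.15, 308.15, 318.15]
scaled, mean, std = scale_and_center(temperatures)

# Undo the transformation with the stored mean and std.
restored = [s * std + mean for s in scaled]
```

After scaling, the column has mean 0 and population standard deviation 1, and `restored` matches the original values up to floating-point rounding.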

salty.core.aggregate_data(data, T=[0, inf], P=[0, inf], data_ranges=None, merge='overlap', feature_type=None, impute=False)

Aggregates molecular data for model training

Parameters:

data (list) – density, cpt, and/or viscosity
T (array) – desired min and max of temperature distribution
P (array) – desired min and max of pressure distribution
data_ranges (array) – desired min and max of property distribution(s)
merge (str) – overlap or union, defaults to overlap. Merge type of property sets
feature_type (str) – desired feature set, defaults to RDKit’s 2D descriptor set
Returns:	devmodel – returns dev_model object containing scale/center information, data summary, and the data frame
Return type:	dev_model obj
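The T, P, and merge parameters can be pictured with a small stdlib sketch. It is not salty's implementation: the helper names, the row layout, the salt labels, and all numbers are hypothetical. It also assumes that the range bounds are inclusive, that 'overlap' keeps only salts present in every property set, and that 'union' keeps salts present in any of them; the source does not spell out these details.

```python
import math


def filter_by_state(rows, T=(0, math.inf), P=(0, math.inf)):
    """Keep rows whose temperature and pressure fall inside the given ranges.

    Bounds are treated as inclusive (an assumption).
    """
    return [r for r in rows
            if T[0] <= r["T"] <= T[1] and P[0] <= r["P"] <= P[1]]


def merge_salts(property_sets, merge="overlap"):
    """Return the salts shared by all sets ('overlap') or found in any ('union')."""
    salt_sets = [{r["salt"] for r in rows} for rows in property_sets]
    if not salt_sets:
        return set()
    if merge == "overlap":
        return set.intersection(*salt_sets)
    if merge == "union":
        return set.union(*salt_sets)
    raise ValueError("merge must be 'overlap' or 'union'")


# Made-up measurements; salt_A, salt_B, salt_C are placeholder names.
density = [
    {"salt": "salt_A", "T": 298.15, "P": 101.3, "value": 1.20},
    {"salt": "salt_B", "T": 350.0, "P": 101.3, "value": 1.10},
]
viscosity = [
    {"salt": "salt_A", "T": 298.15, "P": 101.3, "value": 0.05},
    {"salt": "salt_C", "T": 298.15, "P": 101.3, "value": 0.09},
]

in_range = filter_by_state(density, T=(290, 320))
shared = merge_salts([density, viscosity], merge="overlap")
combined = merge_salts([density, viscosity], merge="union")
```

Here `in_range` keeps only the salt_A density row, `shared` contains salt_A alone, and `combined` contains all three placeholder salts.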

salty.core.devmodel_to_array(model_name, train_fraction=1)

A standardized method of turning a dev_model object into training and testing arrays.

Parameters:

model_name (dev_model) – the dev_model object to be interrogated
train_fraction (int) – the fraction to be reserved for training

Returns:

X_train (array) – the input training array
X_test (array) – the input testing array
Y_train (array) – the output training array
Y_test (array) – the output testing array
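The role of train_fraction can be sketched with plain lists. This is a hypothetical helper, not salty's code: whether salty shuffles the rows before splitting and how it rounds the training count are assumptions here. The return order follows the one documented above.

```python
import random


def split_by_fraction(X, Y, train_fraction=1.0, seed=0):
    """Shuffle rows, then reserve the first train_fraction of them for training.

    Returns X_train, X_test, Y_train, Y_test.
    """
    if not 0 <= train_fraction <= 1:
        raise ValueError("train_fraction must be between 0 and 1")
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of rows")
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)  # seeded so the split is reproducible
    n_train = int(len(order) * train_fraction)  # rounds down (an assumption)
    train, test = order[:n_train], order[n_train:]
    return ([X[i] for i in train], [X[i] for i in test],
            [Y[i] for i in train], [Y[i] for i in test])


# Made-up inputs and outputs: ten rows, two features, one output each.
X = [[i, i * 2] for i in range(10)]
Y = [[i * 0.5] for i in range(10)]

X_train, X_test, Y_train, Y_test = split_by_fraction(X, Y, train_fraction=0.8)
```

With ten rows and train_fraction=0.8, eight rows go to training and two to testing. With the default of 1, every row is used for training and the test arrays are empty.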

salty.core.merge_duplicates(model_name)

Identifies repeated experimental values and returns mean values for those data along with their standard deviation. Only aggregates experimental values that have been acquired at the same temperature and pressure.

Parameters:	model_name (dev_model) – the dev_model object to be interrogated
Returns:

output_val (array) – array of the means of experimental measurements
output_xtd (array) – array of the standard deviations of repeated experimental measurements
running_size (int) – number of unique experiments
salts (list) – names of salts included in the dataset
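The aggregation described above can be sketched by grouping measurements on salt, temperature, and pressure. This is a stdlib illustration, not salty's implementation: `merge_repeats`, the row layout, and the values are hypothetical, and the use of the population standard deviation (which gives 0 for a single measurement) is an assumption.

```python
import statistics
from collections import defaultdict


def merge_repeats(rows):
    """Average measurements that share the same salt, temperature, and pressure.

    Returns (means, stds, number_of_unique_experiments, salts).
    """
    groups = defaultdict(list)
    for r in rows:
        # Exact equality on T and P: only truly repeated conditions are merged.
        groups[(r["salt"], r["T"], r["P"])].append(r["value"])
    means = [statistics.fmean(values) for values in groups.values()]
    stds = [statistics.pstdev(values) for values in groups.values()]
    salts = [key[0] for key in groups]
    return means, stds, len(groups), salts


# Made-up measurements; the first two rows repeat the same experiment.
rows = [
    {"salt": "salt_A", "T": 298.15, "P": 101.3, "value": 1.0},
    {"salt": "salt_A", "T": 298.15, "P": 101.3, "value": 3.0},
    {"salt": "salt_A", "T": 308.15, "P": 101.3, "value": 5.0},
    {"salt": "salt_B", "T": 298.15, "P": 101.3, "value": 4.0},
]

means, stds, n_unique, salts = merge_repeats(rows)
```

The four rows collapse to three unique experiments. The repeated salt_A measurement at 298.15 is reported as its mean with a nonzero standard deviation, while the unrepeated experiments keep their value with a standard deviation of 0.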

salty.adaptive_learn module

salty.adaptive_learn.expand_convex_hull(data, expansion_target=[1, 1.005], target_number=10)

A method of identifying property target values based on historical experimental data. The function returns targets outside the convex hull formed by the experimental data.

Parameters:

data (pandas dataframe) – the data frame to be interrogated
expansion_target (float) – the relative area/volume expansion desired (compared to the initial area/volume described by the historical data)
target_number (int) – the number of targets desired
Returns:	target_list – the list of property targets
Return type:	list of floats
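To make the idea of targets outside the convex hull concrete, here is a two-dimensional stdlib sketch. It is not salty's algorithm: it builds the hull with Andrew's monotone chain, then pushes each hull vertex away from the vertex centroid so that the enclosed area grows by a single relative factor. The helper names and data are hypothetical, the handling of the two-element expansion_target default and of target_number is not modeled, and the sketch returns one target per hull vertex.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def expand_hull(points, expansion=1.005):
    """Scale hull vertices about their centroid so the area grows by `expansion`."""
    hull = convex_hull(points)
    cx = sum(x for x, _ in hull) / len(hull)
    cy = sum(y for _, y in hull) / len(hull)
    s = math.sqrt(expansion)  # area scales with the square of the length factor
    return [(cx + s * (x - cx), cy + s * (y - cy)) for x, y in hull]


# Made-up pairs of two property values; (2, 1) lies inside the hull.
history = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(history)
targets = expand_hull(history, expansion=1.5)
```

For this rectangle of area 12, an expansion of 1.5 gives targets that enclose an area of 18, and every target lies outside the original hull.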

salty.visualization module

salty.visualization.parity_plot(X, Y, model, devmodel, axes_labels=None)

A standard method of creating parity plots between predicted and experimental values for trained models.

Parameters:

X (array) – experimental input data
Y (array) – experimental output data
model (model object) – either sklearn or keras ML model
devmodel (dev_model object) – salty dev_model
axes_labels – optional. Default behavior is to use the labels in the dev_model object.

Returns:	plt – parity plot of predicted vs experimental values
Return type:	matplotlib object