Salty
salty.core module
-
salty.core.load_data(data_file_name, dillFile=False)
Loads data from module_path/data/data_file_name.
Parameters: - data_file_name (str) – name of the csv or dill file to be loaded from module_path/data/data_file_name, for example 'salt_info.csv'
Returns: data – a data frame, for example with each row representing one salt and each column representing a feature of that salt
Return type: pandas DataFrame
-
salty.core.check_name(user_query, index=False)
check_name uses a database to return either the SMILES or the IUPAC name of a cation or anion.
Default behavior is to return the SMILES encoding of an ion given the ion name as input.
Parameters: - user_query (str) – string that will be used to query the database
Returns: output – either the name of the salt, cation, or anion; or the SMILES of the salt, cation, or anion (the SMILES for a salt is written as the cation and anion SMILES strings separated by a comma)
Return type: str
-
class salty.core.dev_model(coef_data, data_summary, data)
Bases: object
The dev_model is the properly formatted object to be passed to the machine learning engine. The input features are all scaled and centered, and the data summary describes the distribution of the data (in terms of state variables and output values).
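To illustrate what "scaled and centered" means for an input feature, here is a minimal sketch using only the Python standard library. The helper name `scale_and_center`, the choice of the population standard deviation, and the sample temperatures are assumptions made for illustration; they are not taken from salty and may differ from what dev_model stores.

```python
from statistics import mean, pstdev


def scale_and_center(column):
    """Return (scaled values, mean, standard deviation) for one feature column.

    Hypothetical helper: centers on the mean and divides by the population
    standard deviation. This is an assumption, not salty's implementation.
    """
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column has no spread; map it to zeros instead of dividing by 0.
        return [0.0 for _ in column], mu, sigma
    return [(x - mu) / sigma for x in column], mu, sigma


# Made-up temperatures in kelvin, for illustration only.
temperatures = [298.15, 308.15, 318.15]
scaled, mu, sigma = scale_and_center(temperatures)
print(scaled, mu, sigma)
```

Keeping the mean and standard deviation next to the scaled values is what lets a prediction be mapped back to physical units later, which is presumably the role of the scale/center information carried by the dev_model.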
-
salty.core.aggregate_data(data, T=[0, inf], P=[0, inf], data_ranges=None, merge='overlap', feature_type=None, impute=False)
Aggregates molecular data for model training.
Parameters: - data (list) – density, cpt, and/or viscosity
- T (array) – desired min and max of temperature distribution
- P (array) – desired min and max of pressure distribution
- data_ranges (array) – desired min and max of property distribution(s)
- merge (str) – merge type of the property sets, either 'overlap' or 'union'; defaults to 'overlap'
- feature_type (str) – desired feature set, defaults to RDKit’s 2D descriptor set
Returns: devmodel – returns dev_model object containing scale/center information, data summary, and the data frame
Return type: dev_model obj
-
salty.core.devmodel_to_array(model_name, train_fraction=1)
A standardized method of turning a dev_model object into training and testing arrays.
Parameters: - model_name (dev_model) – the dev_model object to be interrogated
- train_fraction (int) – the fraction to be reserved for training
Returns: - X_train (array) – the input training array
- X_test (array) – the input testing array
- Y_train (array) – the output training array
- Y_test (array) – the output testing array
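The role of train_fraction can be sketched with a small stand-alone function. This is not salty's code: the name `split_by_fraction`, the seeded shuffle, and the use of plain lists instead of arrays are all assumptions made for illustration. Only the return order (X_train, X_test, Y_train, Y_test) follows the description above.

```python
import random


def split_by_fraction(X, Y, train_fraction=1.0, seed=0):
    """Split paired inputs/outputs into training and testing parts.

    Hypothetical helper, not part of salty. Rows are shuffled with a fixed
    seed (an assumption) so the split is reproducible.
    """
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of rows")
    if not 0 < train_fraction <= 1:
        raise ValueError("train_fraction must be in the interval (0, 1]")

    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)

    # Everything before n_train is used for training, the rest for testing.
    n_train = int(len(indices) * train_fraction)
    train_idx, test_idx = indices[:n_train], indices[n_train:]

    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    Y_train = [Y[i] for i in train_idx]
    Y_test = [Y[i] for i in test_idx]
    return X_train, X_test, Y_train, Y_test


# Ten made-up rows: one input feature and one output per row.
X = [[float(i)] for i in range(10)]
Y = [2.0 * i for i in range(10)]
X_train, X_test, Y_train, Y_test = split_by_fraction(X, Y, train_fraction=0.8)
print(len(X_train), len(X_test))
```

With the default train_fraction of 1, every row lands in the training part and the testing part is empty.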
-
salty.core.merge_duplicates(model_name)
Identifies repeated experimental values and returns mean values for those data along with their standard deviation. Only aggregates experimental values that have been acquired at the same temperature and pressure.
Parameters: - model_name (dev_model) – the dev_model object to be interrogated
Returns: - output_val (array) – array of the means of experimental measurements
- output_xtd (array) – array of the standard deviations of repeated experimental measurements
- running_size (int) – number of unique experiments
- salts (list) – names of salts included in the dataset
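The aggregation rule described above (average only those measurements that share a salt, temperature, and pressure) can be sketched as follows. The function `merge_repeated`, the record layout, and the placeholder salt names and values are hypothetical; the sketch works on plain tuples and does not reproduce how salty reads a dev_model.

```python
from collections import defaultdict
from statistics import mean, stdev


def merge_repeated(records):
    """Group (salt, T, P, value) records and average repeated measurements.

    Hypothetical helper, not salty's implementation. Returns one tuple
    (salt, T, P, mean, standard deviation) per unique (salt, T, P) condition.
    """
    groups = defaultdict(list)
    for salt, temperature, pressure, value in records:
        groups[(salt, temperature, pressure)].append(value)

    merged = []
    for (salt, temperature, pressure), values in groups.items():
        # A single measurement has no spread; report 0.0 rather than failing.
        spread = stdev(values) if len(values) > 1 else 0.0
        merged.append((salt, temperature, pressure, mean(values), spread))
    return merged


# Placeholder salts and made-up values, for illustration only.
records = [
    ("salt_A", 298.15, 101.325, 1.0),
    ("salt_A", 298.15, 101.325, 3.0),  # repeat of the first condition
    ("salt_A", 308.15, 101.325, 5.0),
    ("salt_B", 298.15, 101.325, 7.0),
]
merged = merge_repeated(records)
print(len(merged), merged[0])
```

Grouping on exact floating-point temperature and pressure values is a simplification; real data may need rounding or a tolerance before two conditions are treated as the same.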
salty.adaptive_learn module
-
salty.adaptive_learn.expand_convex_hull(data, expansion_target=[1, 1.005], target_number=10)
A method of identifying property target values based on historical experimental data. The function returns targets outside the convex hull formed by the experimental data.
Parameters: - data (pandas dataframe) – the data frame to be interrogated
- expansion_target (float) – the relative area/volume expansion desired (compared to the initial area/volume described by the historical data)
- target_number (int) – the number of targets desired
Returns: target_list – the list of property targets
Return type: list of floats
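The geometric idea of a relative area expansion can be illustrated in two dimensions with the standard library alone. The sketch below is not the algorithm used by expand_convex_hull: it builds a convex hull with Andrew's monotone chain, then scales the hull vertices about their mean so that the enclosed area grows by a chosen factor. The function names, the sample points, and the use of 1.005 (the upper default shown in the signature) as a single factor are assumptions.

```python
from math import sqrt


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def expand_hull(points, area_factor):
    """Scale hull vertices about their mean so the area grows by area_factor."""
    hull = convex_hull(points)
    cx = sum(x for x, _ in hull) / len(hull)
    cy = sum(y for _, y in hull) / len(hull)
    k = sqrt(area_factor)  # in 2D, area scales with the square of the length scale
    return [(cx + k * (x - cx), cy + k * (y - cy)) for x, y in hull]


# Made-up property pairs; (1, 1) lies inside the hull of the other four.
history = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
targets = expand_hull(history, area_factor=1.005)
print(polygon_area(convex_hull(history)), polygon_area(targets))
```

Because the scaling center lies inside the hull and the factor is greater than 1, every scaled vertex falls outside the original hull, which matches the stated goal of proposing targets beyond the historical data.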
salty.visualization module
-
salty.visualization.parity_plot(X, Y, model, devmodel, axes_labels=None)
A standard method of creating parity plots between predicted and experimental values for trained models.
Parameters: - X (array) – experimental input data
- Y (array) – experimental output data
- model (model object) – either sklearn or keras ML model
- devmodel (dev_model object) – salty dev_model
- axes_labels – optional. Default behavior is to use the labels in the dev_model object.
Returns: plt – parity plot of predicted vs experimental values
Return type: matplotlib object