readTrainingData

readTrainingData(data_files, calculator, data_classes=None)

Read the training data from file.

Parameters:
Returns:

A list of TrainingSet objects containing the configurations in the files.

Return type:

list

Usage Examples

Read in initial training data from two HDF5 files that are used as input for an active learning simulation.

# Read the initial pre-calculated training data for crystalline and amorphous silicon.
calculator = LCAOCalculator()
training_data = readTrainingData(
    ['Silicon_Crystal_Training_Data.hdf5', 'Silicon_Amorphous_Training_Data.hdf5'],
    calculator,
)

read_training_data.py

Notes

The readTrainingData utility function provides an automated way to read in various forms of training data intended for MTP training. It returns a list of TrainingSet objects that can be passed as a parameter to a MomentTensorPotentialTraining or ActiveLearningSimulation object. The first argument, data_files, takes an HDF5 file name or a list of HDF5 file names from which to read the data. The second argument, calculator, takes the calculator that was used to generate the energies, forces and stresses in the training set. If a calculator is already stored in the HDF5 file, it is used in preference to the one specified in calculator.
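The two argument conventions described above (a single file name or a list of names, and the stored calculator taking precedence over the supplied one) can be illustrated with a small stand-alone sketch. It does not call the QuantumATK API and is not the library's implementation; normalize_data_files and select_calculator are hypothetical helpers introduced only for illustration, and plain strings stand in for calculator objects.

```python
def normalize_data_files(data_files):
    """Return a list of file names from a single name or a list of names.

    Hypothetical helper: mirrors the documented rule that data_files may be
    one HDF5 file name or a list of HDF5 file names.
    """
    if isinstance(data_files, str):
        return [data_files]
    return list(data_files)


def select_calculator(stored_calculator, supplied_calculator):
    """Pick the calculator to attach to a training set.

    Hypothetical helper: a calculator stored with the data is preferred;
    the supplied calculator is used only when none is stored.
    """
    if stored_calculator is not None:
        return stored_calculator
    return supplied_calculator


# Placeholder strings stand in for real calculator objects.
files = normalize_data_files('Silicon_Crystal_Training_Data.hdf5')
chosen_when_stored = select_calculator('stored-calculator', 'supplied-calculator')
chosen_when_missing = select_calculator(None, 'supplied-calculator')
```

In this sketch the supplied calculator acts purely as a fallback, which is why the same call can be used for files with and without a stored calculator.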

When the training data is passed to a MomentTensorPotentialTraining object, the calculator attached to the training data is compared with the training calculator to determine whether the contained energy, force and stress data needs to be recalculated. When the training data is passed to an ActiveLearningSimulation, it is included without checking for calculator consistency. In that case, configurations without energy, force or stress data are not included.
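The exclusion of incomplete configurations can be pictured with the following stand-alone sketch. It is not QuantumATK code: configurations are represented as plain dictionaries, the names REQUIRED_KEYS, has_complete_reference_data and keep_complete are hypothetical, and the sketch assumes that a configuration is dropped when any one of the three quantities is missing, which the text above does not state explicitly.

```python
# Hypothetical representation: each configuration is a dict that may hold
# 'energy', 'forces' and 'stress' entries (None or absent means "missing").
REQUIRED_KEYS = ('energy', 'forces', 'stress')


def has_complete_reference_data(configuration):
    """Return True if energy, forces and stress are all present."""
    return all(configuration.get(key) is not None for key in REQUIRED_KEYS)


def keep_complete(configurations):
    """Keep only configurations with complete reference data.

    Assumption: missing any one of the three quantities excludes the
    configuration.
    """
    return [c for c in configurations if has_complete_reference_data(c)]


configurations = [
    {'name': 'complete', 'energy': -1.0, 'forces': [[0.0, 0.0, 0.0]], 'stress': [0.0] * 6},
    {'name': 'no_stress', 'energy': -1.2, 'forces': [[0.0, 0.0, 0.1]]},
    {'name': 'no_energy', 'energy': None, 'forces': [[0.0, 0.0, 0.0]], 'stress': [0.0] * 6},
]
kept_names = [c['name'] for c in keep_complete(configurations)]
```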