readTrainingData
- readTrainingData(data_files, calculator, data_classes=None)
Read the training data from file.
- Parameters:
data_files (str | list) – Name of the training data file, or a list of such file names.
calculator (Calculator) – Calculator to add to the training sets.
data_classes (list | MomentTensorPotentialTraining | Trajectory | ConfigurationDataContainer | MDTrajectory | OptimizationTrajectory) – The type or types of data objects used to construct training sets.
- Returns:
A list of TrainingSet objects containing the configurations in the files.
- Return type:
list
Usage Examples
Read in initial training data from two HDF5 files that are used as input for an active learning simulation.
# Read the initial pre-calculated training data for crystalline and amorphous silicon.
calculator = LCAOCalculator()
training_data = readTrainingData(
    ['Silicon_Crystal_Training_Data.hdf5', 'Silicon_Amorphous_Training_Data.hdf5'],
    calculator,
)
Notes
The readTrainingData utility function provides an automated way to read various forms of training data intended for MTP training. It returns a list of TrainingSet objects that can be passed as a parameter to a MomentTensorPotentialTraining or ActiveLearningSimulation object. The first argument, data_files, takes an HDF5 file name or a list of HDF5 file names from which to read the data. The second argument, calculator, takes the calculator that was used to generate the energy, forces and stresses in the training set. If a calculator is already set in the HDF5 file, that calculator is added in preference to the one specified in calculator.
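The argument handling described above can be illustrated with a small stand-alone sketch. It does not use QuantumATK and is not the library's implementation: normalize_data_files and choose_calculator are hypothetical helper names, and calculators are represented by plain strings purely for illustration.

```python
def normalize_data_files(data_files):
    """Return a list of file names from a single name or a list of names.

    Hypothetical helper mirroring the documented `str | list` argument type.
    """
    if isinstance(data_files, str):
        return [data_files]
    return list(data_files)


def choose_calculator(stored_calculator, fallback_calculator):
    """Prefer a calculator stored with the data over the one passed in.

    Hypothetical helper: `stored_calculator` stands for a calculator already
    set in the file (None if absent); `fallback_calculator` stands for the
    `calculator` argument.
    """
    if stored_calculator is not None:
        return stored_calculator
    return fallback_calculator
```

Under these assumptions, a single file name and a one-element list are treated identically, and the calculator argument only takes effect for data that carries no calculator of its own.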
When the training data is passed to a MomentTensorPotentialTraining object, the added calculator is checked for consistency with the training calculator to determine whether the contained energy, force and stress data need to be recalculated. When the training data is passed to an ActiveLearningSimulation, it is included without checking for calculator consistency. In this case, configurations without energy, force or stress data are not included.
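The exclusion of configurations that lack reference data can be pictured with the following stand-alone sketch. It is not QuantumATK code: configurations are modelled as plain dictionaries, the function names are hypothetical, and the sketch assumes a configuration is dropped when any one of the three quantities is missing, which the text above does not state explicitly.

```python
REQUIRED_KEYS = ("energy", "forces", "stress")


def has_reference_data(configuration):
    """Return True if the configuration carries energy, forces and stress.

    Assumption: all three quantities must be present (not None).
    """
    return all(configuration.get(key) is not None for key in REQUIRED_KEYS)


def keep_labelled(configurations):
    """Return only the configurations with complete reference data."""
    return [c for c in configurations if has_reference_data(c)]
```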