crystalTrainingRandomDisplacements

crystalTrainingRandomDisplacements(reference_configuration, supercell_repetitions_list=None, target_sample_size=200, base_atomic_rattling_amplitude=PhysicalQuantity(0.15, Ang), max_atomic_rattling_amplitude=PhysicalQuantity(0.4, Ang), max_cell_rattling_amplitude=0.05, random_seed=None, strain_included=True, sample_size_per_stage=None, volume_strain_sample_size=None, system_sizes=None, log_filename_prefix=None, data_tag=None)

Preset protocol that defines a series of random displacements for generating training data for crystalline materials. It contains stages of small displacements, small displacements with anisotropic strain, small displacements with volume strain, larger displacements, and larger displacements with anisotropic strain.

Parameters:
  • reference_configuration (BulkConfiguration | sequence of [BulkConfiguration] | Table) – One or more reference configurations to be used to generate the training set. For bulk configurations, this is the unit cell from which a supercell can be defined and additional strain can be applied.

  • supercell_repetitions_list (sequence (size 3) of int | sequence of sequence (size 3) of int) – The list of supercells to construct for each configuration, given as the number of repetitions of the bulk unit cell along the (a, b, c) directions. Cannot be specified if there are molecules in the list of reference configurations. If None: No repetitions.
    Default: None

  • target_sample_size (int) – The approximate number of training configurations to generate. The actual number of configurations can be slightly larger than the target. If target_sample_size is specified, volume_strain_sample_size is determined and set automatically.
    Default: 200

  • base_atomic_rattling_amplitude (PhysicalQuantity of type length) – The minimum random displacement amplitude of the atomic positions used in the protocol.
    Default: 0.15 * Angstrom

  • max_atomic_rattling_amplitude (PhysicalQuantity of type length) – The maximum random displacement amplitude of the atomic positions used in the protocol.
    Default: 0.4 * Angstrom

  • max_cell_rattling_amplitude (float) – The maximum value of the strain tensor components in the stages that use anisotropic strains. Only applies to bulk configurations.
    Default: 0.05.

  • random_seed (int) – The random seed used for generating the displacements. If None: Generated automatically.
    Default: None.

  • strain_included (bool) – Whether strained configurations are included.
    Default: True.

  • sample_size_per_stage (int) – The number of training configurations to generate for each stage of random displacements. It is recommended to use target_sample_size instead. If sample_size_per_stage is specified, target_sample_size is disabled.
    Default: None

  • volume_strain_sample_size (int) – The number of training configurations to generate for each of the six parts of the volume strain stage of random displacements in the strained configurations. If not provided, it depends on sample_size_per_stage. Volume strain is applied in a single stage that uses six different volume strains. The total number of training configurations will be approximately volume_strain_sample_size times six, multiplied by the number of reference configurations and the number of supercell repetitions.
    Default: None

  • system_sizes (int | sequence of int) – Target number of atoms in the system. If None, supercell_repetitions_list is used to repeat the cell. If set, supercell_repetitions_list is ignored.
    Default: None

  • log_filename_prefix (str) – Filename prefix for the logging output of the tasks associated with this set.
    Default: Defined by the MomentTensorPotentialTraining object.

  • data_tag (str) – Label for this training set to enable selection of different data in MTP fitting.
    Default: None.
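
The estimate quoted in the volume_strain_sample_size description can be written out as plain arithmetic. The following is a minimal sketch in plain Python and is not part of the QuantumATK API; the helper name estimate_configuration_count is hypothetical, and the result is only the approximate count described above.

```python
# Hypothetical helper, not part of QuantumATK. It only reproduces the
# approximate count quoted in the volume_strain_sample_size description.
VOLUME_STRAIN_PARTS = 6  # The volume strain stage uses six different volume strains.


def estimate_configuration_count(
    volume_strain_sample_size,
    number_of_reference_configurations,
    number_of_supercell_repetitions,
):
    """Return volume_strain_sample_size x 6 x configurations x supercells."""
    return (
        volume_strain_sample_size
        * VOLUME_STRAIN_PARTS
        * number_of_reference_configurations
        * number_of_supercell_repetitions
    )


# One reference crystal, two supercells, five samples per volume strain.
print(estimate_configuration_count(5, 1, 2))  # prints 60
```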

Returns:

The RandomDisplacementsParameters objects for generating the training data and a single TrainingSet for the reference configurations. This function therefore does not generate the configurations itself; it returns the specifications from which the training configurations are generated.

Return type:

A list of RandomDisplacementsParameters objects, with a TrainingSet as the last element of the list.
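
Because the returned list ends with the TrainingSet for the reference configurations, the two kinds of entries can be separated with ordinary list unpacking. The following is a minimal sketch that uses placeholder strings in place of real QuantumATK objects; the helper name split_protocol is hypothetical.

```python
def split_protocol(parameters):
    """Split the returned list into (displacement stages, reference training set).

    Assumes the layout described above: every element except the last is a
    RandomDisplacementsParameters object, and the last element is a TrainingSet.
    """
    *displacement_stages, reference_training_set = parameters
    return displacement_stages, reference_training_set


# Placeholder strings stand in for the real objects in this sketch.
example = ['stage_1', 'stage_2', 'stage_3', 'reference_training_set']
stages, reference = split_protocol(example)
print(len(stages), reference)  # prints: 3 reference_training_set
```

Splitting the list this way can be useful for inspecting the individual stages before passing the full list to MomentTensorPotentialTraining.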

Usage Examples

Set up a random displacement protocol and generate training data for a silicon crystal. This generates 181 training configurations (including the undisplaced configuration), which provide a basic training dataset for an MTP that can simulate the main properties of the crystal.

# bulk_configuration and calculator are assumed to be defined earlier in the script.
# Automatically create a protocol of random displacements and strain for different supercell sizes.
parameters = crystalTrainingRandomDisplacements(
    bulk_configuration,
    supercell_repetitions_list=[(2, 2, 2), (3, 3, 3)],
    sample_size_per_stage=10,
)

# Generate the displaced structures and calculate DFT training data.
mtp_training = MomentTensorPotentialTraining(
    filename='Silicon_crystal_mtp_training_testtest.hdf5',
    object_id='mtp_training',
    training_sets=parameters,
    calculator=calculator,
    calculate_stress=True,
)

mtp_training.update()

crystal_training.py

Notes

The crystalTrainingRandomDisplacements function provides a preset list of RandomDisplacementsParameters to set up a basic training protocol for crystal structures. These parameters can be used in the MomentTensorPotentialTraining class to generate a training dataset for the machine-learned Moment Tensor Potential (MTP).

The protocol includes different stages of small atomic displacements with and without volume and anisotropic strain, followed by increasingly larger displacement magnitudes up to max_atomic_rattling_amplitude, with and without anisotropic strain. It is recommended to use various supercell sizes to improve the training of the total energy. When run with MomentTensorPotentialTraining, this protocol typically provides a good basic training dataset that enables stable simulation of the crystal up to moderate temperatures, and a good description of crystal properties such as lattice constants, phonons, and elastic constants.
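
To make the idea of a displaced and strained configuration more concrete, the sketch below rattles atoms in a cell that has been deformed by a small random anisotropic strain. It is plain Python and does not use or reproduce the QuantumATK implementation: the uniform distributions, the function names, and the two-atom cubic example cell are all assumptions made for illustration. Only the amplitudes (0.15 Angstrom and 0.05) are taken from the default values listed above.

```python
import random

# Illustrative only: uniform distributions are an assumption of this sketch,
# not a statement about how QuantumATK draws displacements or strains.


def random_strain_tensor(max_component, rng):
    """Return a symmetric 3x3 strain tensor with entries in [-max_component, max_component]."""
    strain = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            strain[i][j] = strain[j][i] = rng.uniform(-max_component, max_component)
    return strain


def apply_strain(cell, strain):
    """Return cell . (I + strain), with the lattice vectors stored as rows."""
    deformation = [
        [(1.0 if i == j else 0.0) + strain[i][j] for j in range(3)]
        for i in range(3)
    ]
    return [
        [sum(vector[k] * deformation[k][j] for k in range(3)) for j in range(3)]
        for vector in cell
    ]


def to_cartesian(fractional_positions, cell):
    """Convert fractional coordinates to Cartesian coordinates."""
    return [
        [sum(atom[k] * cell[k][j] for k in range(3)) for j in range(3)]
        for atom in fractional_positions
    ]


def rattle(positions, amplitude, rng):
    """Shift every Cartesian coordinate by a random value in [-amplitude, amplitude]."""
    return [[x + rng.uniform(-amplitude, amplitude) for x in atom] for atom in positions]


rng = random.Random(0)  # Fixed seed so that the sketch is reproducible.
a = 5.43  # Edge length of the example cubic cell in Angstrom (an arbitrary choice).
cell = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]
fractional_positions = [[0.0, 0.0, 0.0], [0.25, 0.25, 0.25]]

# Strain the cell first, then rattle the atoms around their strained positions.
strained_cell = apply_strain(cell, random_strain_tensor(0.05, rng))
ideal_positions = to_cartesian(fractional_positions, strained_cell)
rattled_positions = rattle(ideal_positions, 0.15, rng)
```

In the actual protocol these steps are handled by the RandomDisplacementsParameters objects when they are run through MomentTensorPotentialTraining; the sketch only shows what kind of structure each stage produces.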