AlloyTrainingParameters

class AlloyTrainingParameters(reference_configurations, percentages=None, new_element=<class 'NL.CommonConcepts.PeriodicTable.Hydrogen'>, algorithm=None, supercell_repetition=None, rattle=None, atomic_rattling_amplitudes=None, sample_size=None, random_seed=None, optimize=None, optimize_geometry_parameters=None, log_filename_prefix=None, data_tag=None)

Class for storing parameters for generating a set of alloy training configurations.

Parameters:
  • reference_configurations (BulkConfiguration | sequence of [BulkConfiguration]) – One or more bulk reference configurations to be used to generate the training set. This is the unit cell from which a supercell can be defined and additional strain can be applied.

  • percentages (float or sequence of floats) – The percentages of the atoms that should be substituted.
    Default: 25.

  • new_element (PeriodicTableElement or None (vacancy)) – The element to change to.
    Default: Hydrogen.

  • algorithm (FixedFraction | NormalDistribution | None) – The algorithm used to randomly change a percentage of the atoms. NormalDistribution assigns each atom a number in [0, 1] drawn from a normal distribution and picks only the atoms whose value lies below the given percentage. FixedFraction changes exactly the given share of the total number of atoms, so setting 25% changes 25% of the atoms. The size of the random sample is the given percentage of the atoms, rounded to the nearest integer.
    Default: FixedFraction.

  • supercell_repetition (sequence (size 3) of int) – The supercell to construct for each configuration, given as the number of repetitions of the bulk unit cell along the (a, b, c) directions.

  • rattle (bool) – Switch to turn on rattling of the cell to generate more configurations.
    Default: False.

  • atomic_rattling_amplitudes (sequence (size 3) of PhysicalQuantity of type length) – List of three random displacements (small, medium, large) of the atomic positions.
    Default: [0.15, 0.275, 0.4] * Angstrom.

  • sample_size (int) – The number of training configurations to generate. The actual number of returned samples can differ slightly due to rounding. When rattle is not set, the returned number of samples will likely be smaller since duplicates are removed.
    Default: 200.

  • random_seed (int) – The random seed used for generating the displacements.
    Default: Generated automatically.

  • optimize (bool | Calculator) – Switch to turn on optimization of the reference configurations using a fast LCAO calculator with SingleZeta basis, KpointDensity(density_a=1.0*Angstrom) and NumericalAccuracyParameters(density_mesh_cutoff=60.0*Hartree). Alternatively, a calculator object can be supplied to replace the default one.
    Default: False.

  • optimize_geometry_parameters (OptimizeGeometryParameters or None) – Parameters to be passed to an OptimizeGeometry object.
    Default: None.

  • log_filename_prefix (str) – Filename prefix for the logging output of the tasks associated with this set.
    Default: Defined by the MomentTensorPotentialTraining object.

  • data_tag (str) – Label for this training set to enable selection of different data in MTP fitting.
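
The two selection algorithms described for the algorithm parameter can be pictured with a minimal sketch in plain Python. This is not the QuantumATK implementation: the helper names fixed_fraction_indices and per_atom_threshold_indices are hypothetical, and the per-atom variant draws uniform numbers in [0, 1), which is an assumption made here purely for illustration.

```python
import random


def fixed_fraction_indices(n_atoms, percentage, rng):
    """Pick exactly round(percentage% of n_atoms) distinct atom indices."""
    # Hypothetical helper; Python's round() sends exact halves to the even integer.
    n_selected = round(n_atoms * percentage / 100.0)
    return sorted(rng.sample(range(n_atoms), n_selected))


def per_atom_threshold_indices(n_atoms, percentage, rng):
    """Give every atom a random number and keep those below the threshold."""
    # Assumption: uniform numbers in [0, 1) stand in for the per-atom draw.
    threshold = percentage / 100.0
    return [i for i in range(n_atoms) if rng.random() < threshold]


rng = random.Random(667)
fixed = fixed_fraction_indices(108, 25, rng)
fluctuating = per_atom_threshold_indices(108, 25, rng)

print(len(fixed))        # always 27 for 108 atoms at 25 %
print(len(fluctuating))  # varies from draw to draw
```

The practical difference is that a fixed-fraction selection always yields the same composition for a given percentage, while a per-atom draw lets the composition fluctuate from sample to sample.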

algorithm()
Returns:

The algorithm to randomly change a percentage of the atoms.

Return type:

FixedFraction | NormalDistribution

atomicRattlingAmplitudes()
Returns:

Maximum rattling intensities.

Return type:

PhysicalQuantity of type length | None

configurations()
Returns:

The list of configurations to be used to generate the training set.

Return type:

configuration: BulkConfiguration

dataTag()
Returns:

The selection tag added to the data in the training set.

Return type:

str

logFilenameIdentifier()
Returns:

Filename identifier for the logging output of the tasks associated with this set, or None if it hasn’t been set yet.

Return type:

str | None

logFilenamePrefix()
Returns:

Filename prefix for the logging output of the tasks associated with this set, or None if it is to be defined by the MomentTensorPotentialTraining object.

Return type:

str | LogToStdOut | None

newElement()
Returns:

The element to change to.

Return type:

PeriodicTable.Element or None (vacancy)

optimize()
Returns:

Switch to turn on optimization of the reference configurations using the LCAO calculator.

Return type:

bool

optimizeGeometryParameters()
Returns:

The optimize geometry parameters.

Return type:

OptimizeGeometryParameters

percentages()
Returns:

The percentages of the atoms that should be changed.

Return type:

float or sequence of floats

randomSeed()
Returns:

The random seed used for generating the displacements, or None if it should be generated automatically.

Return type:

int | None

rattle()
Returns:

Switch to turn on rattling of the cell to generate more configurations.

Return type:

bool | None

referenceConfigurations()
Returns:

The list of reference configurations containing the new element.

Return type:

list of [BulkConfiguration]

sampleSize()
Returns:

The number of training configurations for each combination of list parameters.

Return type:

int

supercellRepetition()
Returns:

The supercell to construct for each configuration, given as the number of repetitions of the bulk unit cell along the (a, b, c) directions.

Return type:

tuple (size 3) of int

uniqueString()

Return a unique string representing the state of the object.

Usage Examples

Setup of a training set for a copper/silver alloy using AlloyTrainingParameters.

Note

The particular force field used for optimization in the script below is only for demonstration purposes and should be replaced by a higher-quality method in actual training.

# Set up lattice
lattice = FaceCenteredCubic(3.61496*Angstrom)

# Define elements
elements = [Copper]

# Define coordinates
fractional_coordinates = [[ 0.,  0.,  0.]]

# Set up configuration
bulk_configuration_copper = BulkConfiguration(
    bravais_lattice=lattice,
    elements=elements,
    fractional_coordinates=fractional_coordinates
    )

# Define the substituting element
substitute = Silver

# Define calculator for pre-optimization calculations.
potentialSet = EAM_AgCu_2009()
calculator_ff = TremoloXCalculator(parameters=potentialSet)

optimize_geometry_parameters = OptimizeGeometryParameters(
    max_forces=0.1*eV/Ang,
    max_step_length=0.2*Ang,
    trajectory_interval=1,
    optimize_cell=True,
    optimizer_method=FIRE(),
    enable_optimization_stop_file=True,
    restart_strategy=NoRestart,
)

training_set = AlloyTrainingParameters(
    reference_configurations=bulk_configuration_copper,
    percentages=[10, 20, 30],
    new_element=substitute,
    algorithm=FixedFraction,
    supercell_repetition=(3, 3, 3),
    rattle=True,
    atomic_rattling_amplitudes=[0.275, 0.4, 0.15] * Angstrom,
    sample_size=60,
    random_seed=667,
    optimize=calculator_ff,
    optimize_geometry_parameters=optimize_geometry_parameters,
)

# -------------------------------------------------------------
# Calculator
# -------------------------------------------------------------
k_point_sampling = KpointDensity(
    density_a=5.0*Angstrom,
    )
numerical_accuracy_parameters = NumericalAccuracyParameters(
    density_mesh_cutoff=100.0*Hartree,
    k_point_sampling=k_point_sampling,
    occupation_method=MethfesselPaxton(0.2*eV, 1),
    )

iteration_control_parameters = IterationControlParameters(
    tolerance=5e-05,
    damping_factor=0.3,
    number_of_history_steps=12,
    max_steps=1000,
    non_convergence_behavior=StopCalculation(),
    )

checkpoint_handler = CheckpointHandler(
    time_interval=1000000.0*Hour,
    )
algorithm_parameters = AlgorithmParameters(
    scf_restart_step_length=0.3*Angstrom,
    )

calculator = LCAOCalculator(
    numerical_accuracy_parameters=numerical_accuracy_parameters,
    iteration_control_parameters=iteration_control_parameters,
    checkpoint_handler=checkpoint_handler,
    algorithm_parameters=algorithm_parameters,
    )

# Set up non-linear coefficients with optimization.
non_linear_coefficients_parameters = NonLinearCoefficientsParameters(
   perform_optimization=True,
   energy_only=False,
)

# Set up parameters to use in the MTP fitting.
fitting_parameters = MomentTensorPotentialFittingParameters(
   basis_size=1000,
   outer_cutoff_radii=4.5*Angstrom,
   mtp_filename='mtp_Cu-Ag_alloy.mtp',
   non_linear_coefficients_parameters=non_linear_coefficients_parameters,
)

# Set up MTP training.
moment_tensor_potential_training = MomentTensorPotentialTraining(
    filename='mtp_study',
    object_id='training',
    training_sets=training_set,
    calculator=calculator,
    calculate_stress=True,
    fitting_parameters_list=fitting_parameters,
    train_test_split=0.8,
    random_seed=13345,
    number_of_processes_per_task=8,
    log_filename_prefix='fit_mtp_Cu-Ag_alloy',
)
moment_tensor_potential_training.update()

alloy-training-sets.py

Notes

The AlloyTrainingParameters class can be used to generate training configurations by creating random alloys from a crystalline host material and a replacement element using the substitutionalAlloy API.
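
As a rough check on what the settings in the usage example imply, the bookkeeping below counts the atoms in the supercell and the number of substituted atoms per percentage. It is plain Python arithmetic, not a QuantumATK call, and it assumes the count is simply the percentage of atoms rounded to the nearest integer, as described for the algorithm parameter.

```python
# Hypothetical bookkeeping for the copper/silver usage example.
atoms_per_unit_cell = 1            # primitive FCC copper cell with one atom
supercell_repetition = (3, 3, 3)
percentages = [10, 20, 30]

n_atoms = atoms_per_unit_cell
for repetitions in supercell_repetition:
    n_atoms *= repetitions

# Assumption: the count is the percentage of atoms rounded to the nearest integer.
substituted = {p: round(n_atoms * p / 100) for p in percentages}

print(n_atoms)      # 27
print(substituted)  # {10: 3, 20: 5, 30: 8}
```

Under this assumption the realized silver fractions in a 27-atom cell would be about 11.1 %, 18.5 % and 29.6 %, so a larger supercell is needed when the requested percentages must be matched closely.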

The sample_size parameter (approximately) determines the total number of generated configurations.
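
The reason the returned number can fall short of sample_size without rattling is that a small cell only admits a limited number of distinct substitution patterns. The sketch below illustrates this with the Python standard library only; treating two samples as duplicates when they replace the same sites is an assumption, and symmetry-equivalent patterns are not merged here.

```python
import math
import random

rng = random.Random(0)
n_atoms, n_substituted, requested = 8, 2, 200

# Draw the requested number of substitution patterns and keep only distinct ones.
# Assumption: a pattern is identified by the set of substituted site indices.
patterns = {
    frozenset(rng.sample(range(n_atoms), n_substituted))
    for _ in range(requested)
}

print(math.comb(n_atoms, n_substituted))  # 28 possible patterns
print(len(patterns) <= 28)                # True
```

Requesting 200 samples from this 8-atom cell can therefore never return more than 28 unique unrattled configurations.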

It is recommended to pre-optimize the automatically generated alloy configurations by specifying a fast calculator. The optimization parameters can be adjusted by passing an OptimizeGeometryParameters object. By default the LCAO method with a single zeta basis set is used.

To achieve a more broadly applicable training set, the generated alloy configurations can be randomly rattled. A set of three displacement amplitudes has to be provided; they are sorted from smallest to largest. In addition to the atomic positions, the cell shape and volume are rattled at the same time, using the smallest atomic displacement amplitude. The unrattled configurations are always part of the final set of configurations.
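
The rattling step can be sketched in plain Python as follows. The helper rattle_positions is hypothetical, and the uniform displacement distribution is an assumption, since the distribution used by QuantumATK is not stated here. Cell rattling is left out of the sketch.

```python
import random


def rattle_positions(positions, amplitude, rng):
    """Displace every Cartesian coordinate by at most `amplitude`."""
    # Assumption: uniform displacements in [-amplitude, +amplitude].
    return [
        [x + rng.uniform(-amplitude, amplitude) for x in position]
        for position in positions
    ]


rng = random.Random(667)
positions = [[0.0, 0.0, 0.0], [1.8, 1.8, 0.0]]  # Angstrom, illustrative values
amplitudes = sorted([0.275, 0.4, 0.15])          # ordered smallest to largest

# The unrattled structure is kept alongside one rattled copy per amplitude.
rattled_sets = [positions] + [
    rattle_positions(positions, amplitude, rng) for amplitude in amplitudes
]

print(amplitudes)         # [0.15, 0.275, 0.4]
print(len(rattled_sets))  # 4
```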

Alternatively, the crystalTrainingRandomDisplacements protocol can be used with the trajectory of optimized alloy configurations.

For the actual training data generation, an AlloyTrainingParameters object needs to be passed to a MomentTensorPotentialTraining object.