scanOverNonLinearCoefficients

scanOverNonLinearCoefficients(number_of_initial_guesses=30, perform_optimization=None, max_force_rmse_change=None, max_steps=None, random_seed=None, regularization=None, energy_only=None, radial_function_order=None, zero_interaction_pairs=None, basis_size=None, inner_cutoff_radii=None, tapering_cutoff_radii=None, outer_cutoff_radii=None, mtp_filename_suffix=None, cutoff_function=None, load_energy=None, load_forces=None, load_stress=None, ridge_regression_regularization=None, ridge_regression_cross_validation=None, constant_terms=None, weights=None, forces_cap=None, data_tags=None, use_element_specific_coefficients=None, solver=None, use_gpu_acceleration=None)

Preset protocol for generating a series of MomentTensorPotentialFittingParameters to scan over different initial non-linear coefficients.

Parameters:

number_of_initial_guesses (int) – The number of initial guesses used to fit an MTP.
perform_optimization (bool) – Whether to optimize the non-linear coefficients during the fitting.
Default: False.
max_force_rmse_change (PhysicalQuantity of type energy / length) – The convergence criterion for the change in the RMSE of the force values between subsequent steps in the optimization of the non-linear coefficients. Ignored if perform_optimization is False.
Default: 0.05 * eV / Angstrom.
max_steps (int) – The maximum number of optimization steps to take. Ignored if perform_optimization is False.
Default: 100.
random_seed (non-negative int or sequence of non-negative ints) – The random seed used for generating the initial non-linear coefficients.
Default: Generated automatically.
regularization (float) – The regularization strength for the optimization. Ignored if perform_optimization is False.
Default: 1.0e-2.
energy_only (bool) – Whether to only optimize on the energy values. Ignored if perform_optimization is False.
Default: True.
radial_function_order (int) – The order of the Chebyshev polynomials in the expansion of each radial function. Only used when CutoffFunctionChebyshevExpansion is selected as the cutoff function.
Default: 5.
zero_interaction_pairs (None | Sequence of tuples of PeriodicTableElement) – If not None, the interactions between the given element pairs will be set to zero.
basis_size (PredefinedBasisSmall | PredefinedBasisBig | int | str) – The basis set size to use. If a filename is given, the basis is read from this file.
Default: PredefinedBasisSmall.
inner_cutoff_radii (PhysicalQuantity of type length | dict of{tuple (size 2) of Element: PhysicalQuantity of type length}) – The inner cutoff radius for each element pair. If a single value is given, this will be used for all pairs.
Default: 0.5 * Angstrom.
tapering_cutoff_radii (PhysicalQuantity of type length | dict of{tuple (size 2) of Element: PhysicalQuantity of type length}) – The tapering cutoff radius for each element pair. If a single value is given, this will be used for all pairs.
Default: 0.7 * Angstrom.
outer_cutoff_radii (PhysicalQuantity of type length | dict of{tuple (size 2) of Element: PhysicalQuantity of type length}) – The outer cutoff radius for each element pair. If a single value is given, this will be used for all pairs.
Default: 5.0 * Angstrom.
mtp_filename_suffix (str) – The filename suffix for the fitted MTP parameter set.
Default: Unique name set by MomentTensorPotentialTraining.
cutoff_function (CutoffFunctionChebyshevOriginal | CutoffFunctionChebyshevExpansion) – The cutoff function to use.
Default: CutoffFunctionChebyshevOriginal.
load_energy (LoadQuantityAlways | LoadQuantityNever | LoadQuantityWhereAvailable) – Whether to use the energy during fitting.
Default: LoadQuantityWhereAvailable.
load_forces (LoadQuantityAlways | LoadQuantityNever | LoadQuantityWhereAvailable) – Whether to use the forces during fitting.
Default: LoadQuantityWhereAvailable.
load_stress (LoadQuantityAlways | LoadQuantityNever | LoadQuantityWhereAvailable) – Whether to use the stress during fitting.
Default: LoadQuantityWhereAvailable.
ridge_regression_regularization (float) – The regularization strength for the ridge regression model.
Default: 1.0e-3.
ridge_regression_cross_validation (int) – The number of folds for enabling cross-validation in the ridge regression model.
Default: No cross-validation.
constant_terms (dict of {Element: PhysicalQuantity of type energy}) – The energy of an isolated atom for each species.
Default: Calculated with the fitting calculator by MomentTensorPotentialTraining.
weights (Sequence of PhysicalQuantity of type (1/energy, length/energy, length**3/energy)) – The weights used for weighting energy, forces and stress [E, F, S] of every training configuration. If the quantity is not used for fitting or absent from the data, the corresponding weight is ignored.
Default: No weighting.
forces_cap (PhysicalQuantity of type energy / length) – Configurations with max. force magnitudes on an atom larger than this value will be discarded from the training data.
Default: 100.0 * eV / Angstrom.
data_tags (None | str | list) – A single tag or a list of tags identifying the training sets to be used in the fitting. None results in all available training sets being included.
Default: None.
use_element_specific_coefficients (bool) – True, if element-specific coefficients should be used, otherwise False.
Default: True.
solver (SVDSolver | ProjectionSolver) – The linear solver algorithm to determine the linear coefficients.
Default: ProjectionSolver.
use_gpu_acceleration (bool) – Switch GPU acceleration using CUDA on or off. This accelerates the linear solver (SVD or Projection) when determining the linear coefficients. It requires running the training on a CUDA-enabled GPU; otherwise it has no effect.
Default: False.
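
To illustrate the forces_cap criterion described above, here is a minimal standalone sketch in plain Python. It is not the QuantumATK implementation: the helper names max_force_magnitude and apply_forces_cap are hypothetical, forces are represented as plain floats assumed to be in eV / Angstrom, and the numbers are made up.

```python
import math


def max_force_magnitude(forces):
    """Return the largest force magnitude on any single atom.

    `forces` is a list of (fx, fy, fz) tuples, one per atom.
    """
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def apply_forces_cap(configurations, forces_cap):
    """Hypothetical helper: keep configurations whose largest atomic force
    does not exceed `forces_cap`. Only values larger than the cap are dropped.
    """
    return [
        forces for forces in configurations
        if max_force_magnitude(forces) <= forces_cap
    ]


# Made-up force data for three configurations (assumed unit: eV / Angstrom).
configurations = [
    [(0.1, 0.0, 0.0), (0.0, -0.2, 0.0)],    # small forces, kept
    [(150.0, 0.0, 0.0), (0.0, 0.0, 0.0)],   # one atom above the cap, discarded
    [(60.0, 80.0, 0.0), (1.0, 1.0, 1.0)],   # magnitude equal to the cap, kept
]

kept = apply_forces_cap(configurations, forces_cap=100.0)
print(len(kept))
```

In an actual script the cap is passed as a PhysicalQuantity, for example forces_cap=100.0 * eV / Angstrom, and the filtering is handled during training.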

Returns:

The fitting parameters, one set for each initial guess of the non-linear coefficients.

Return type:

list of MomentTensorPotentialFittingParameters

Usage Examples

Generate training data for a silicon crystal and set up a random scan over the non-linear coefficients. This produces 30 separate fits, each starting from a different initial guess for the non-linear coefficients.

# -*- coding: utf-8 -*-
setVerbosity(MinimalLog)

# -------------------------------------------------------------
# Bulk Configuration
# -------------------------------------------------------------

# Set up lattice
lattice = FaceCenteredCubic(5.4306*Angstrom)

# Define elements
elements = [Silicon, Silicon]

# Define coordinates
fractional_coordinates = [[ 0.  ,  0.  ,  0.  ],
                          [ 0.25,  0.25,  0.25]]

# Set up configuration
bulk_configuration = BulkConfiguration(
    bravais_lattice=lattice,
    elements=elements,
    fractional_coordinates=fractional_coordinates
    )

# -------------------------------------------------------------
# Calculator
# -------------------------------------------------------------
k_point_sampling = KpointDensity(
    density_a=7.0*Angstrom,
    )
numerical_accuracy_parameters = NumericalAccuracyParameters(
    density_mesh_cutoff=30.0*Hartree,
    k_point_sampling=k_point_sampling,
    occupation_method=FermiDirac(25.0*meV),
    )

iteration_control_parameters = IterationControlParameters(
    tolerance=5e-05,
    )

calculator = LCAOCalculator(
    numerical_accuracy_parameters=numerical_accuracy_parameters,
    iteration_control_parameters=iteration_control_parameters,
    )

# Generate a list of initial guesses for non-linear coefficients.
fitting_parameters_list = scanOverNonLinearCoefficients(
    number_of_initial_guesses=30,
    basis_size=PredefinedBasisSmall,
    mtp_filename_suffix='MTP_fit.mtp',
    random_seed=42,
    perform_optimization=False,
)

# Define RandomDisplacementsParameters used to generate the training sets.
training_sets = RandomDisplacementsParameters(
    reference_configurations=bulk_configuration,
    supercell_repetitions_list=[(1, 1, 1), (2, 2, 2)],
    sample_size=10,
    atomic_rattling_amplitudes=0.15*Angstrom,
    cell_rattling_amplitudes=0.07,
)

# Generate the displaced structures and calculate DFT training data.
mtp_training = MomentTensorPotentialTraining(
    filename='Silicon_crystal_mtp_training_testtest.hdf5',
    object_id='mtp_training',
    training_sets=training_sets,
    calculator=calculator,
    calculate_stress=True,
    fitting_parameters_list=fitting_parameters_list,
)
mtp_training.update()

# Determine the best fit and extract its parameters.
best_fit_index = mtp_training.rankFits(
    data_tags=None,
    weights=[[1, 1, 1], [1, 1, 1]],
    statistical_measure=R2Score
)[0][0]

best_fitting_parameters = mtp_training.fittingParametersList()[best_fit_index]

random_scan.py

Notes

The scanOverNonLinearCoefficients function provides a list of MomentTensorPotentialFittingParameters to set up a sampling of initial guesses. These parameters can be used in the MomentTensorPotentialTraining class to generate different fits within the machine-learned Moment Tensor Potential (MTP) framework.

The function combines the APIs of NonLinearCoefficientsParameters and MomentTensorPotentialFittingParameters, and provides a simple way of generating a list of fitting parameters without scripting a loop. The name of each MTP file is prefixed with an index, starting at 0.
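
As a rough illustration of this naming scheme, the sketch below lists the file names one would expect for a given suffix. The helper expected_mtp_filenames is hypothetical and not part of QuantumATK. The pattern index, underscore, suffix is an assumption inferred from the '0_MTP_fit.mtp' name used further down on this page.

```python
def expected_mtp_filenames(number_of_initial_guesses, mtp_filename_suffix):
    """Hypothetical helper: build the expected MTP file names.

    Assumes each name is the fit index (starting at 0), an underscore,
    and the suffix passed as mtp_filename_suffix.
    """
    return [
        f"{index}_{mtp_filename_suffix}"
        for index in range(number_of_initial_guesses)
    ]


# Same values as in the usage example: 30 guesses, suffix 'MTP_fit.mtp'.
filenames = expected_mtp_filenames(30, 'MTP_fit.mtp')
print(filenames[0], filenames[-1])
```

Such a list can be handy for checking which MTP files are present after the training has finished.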

After the training has concluded, the fits can be ranked by calling the rankFits() method on the MomentTensorPotentialTraining instance. By default, the R2 score between the reference data and the predicted data is used for ranking. Optionally, weights can be supplied for the energies, forces and stresses of the training and testing sets. The fitting parameters can be retrieved by calling fittingParametersList() and selecting the entry at the desired fit index.
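
For readers unfamiliar with the R2 score, the following standalone sketch shows how a set of fits could be ranked by it. This is not the rankFits() implementation: the functions r2_score and rank_by_r2 are hypothetical, the data are made up, and the list of (fit index, score) pairs is only an assumed layout chosen to mirror the [0][0] indexing in the usage example. The actual return structure of rankFits() may differ.

```python
def r2_score(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the reference values are not all identical (SS_tot > 0).
    """
    mean_reference = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_reference) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rank_by_r2(reference, predictions_per_fit):
    """Return (fit_index, score) pairs sorted with the best fit first."""
    scores = [
        (index, r2_score(reference, predicted))
        for index, predicted in enumerate(predictions_per_fit)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up reference values and predictions from three hypothetical fits.
reference = [1.0, 2.0, 3.0, 4.0]
predictions_per_fit = [
    [1.5, 2.5, 2.5, 3.5],  # fit 0
    [1.0, 2.0, 3.0, 4.0],  # fit 1 reproduces the reference exactly
    [2.0, 2.0, 3.0, 3.0],  # fit 2
]

ranking = rank_by_r2(reference, predictions_per_fit)
best_fit_index = ranking[0][0]
print(best_fit_index)
```

An R2 score of 1 corresponds to a perfect match with the reference data, and lower values indicate a poorer fit.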

The parameters can also be optimized further by using the corresponding MTP file as input for a second run, or they can be passed to an ActiveLearningSimulation. For example, to continue from the fit with index 0:

nl_parameters = NonLinearCoefficientsParameters(
    perform_optimization=True,
    energy_only=True,
    regularization=1.0e-3,
    initial_coefficients='0_MTP_fit.mtp',
)

fitting_parameters = MomentTensorPotentialFittingParameters(
    basis_size=PredefinedBasisSmall,
    mtp_filename='MTP_0_optimization.mtp',
    non_linear_coefficients_parameters=nl_parameters
)

The fitting of the linear coefficients can be accelerated by running on a GPU. This is particularly beneficial when the non-linear optimization is switched off by setting perform_optimization=False. To enable GPU acceleration, set use_gpu_acceleration to True. The calculation must be run on a GPU that supports CUDA 11.8; otherwise it falls back to the default CPU mode.