MolecularConfigurationsParameters¶
- class MolecularConfigurationsParameters(reference_configurations, sample_size=None, monte_carlo_calculator=None, monte_carlo_temperature=None, max_rotated_bonds=None, overlap_ratio=None, max_displacement=None, sample_cis_trans=None, generate_bulk_configurations=None, random_seed=None, log_filename_prefix=None, data_tag=None)¶
Class for storing parameters for generating a set of training configurations using a combination of sampling different torsion angles and atomic displacements.
- Parameters:
reference_configurations (
MoleculeConfiguration
|BulkConfiguration
|DeviceConfiguration
|SurfaceConfiguration
| sequence of configurations) – One or more reference configurations to be used to generate the training set.sample_size (int) – The total number of configurations to return. Default: 20
monte_carlo_calculator (Calculator | None) – The calculator to use for the molecular dynamics simulations. Giving None disables Monte Carlo structure selection. Default: None.
monte_carlo_temperature (
PhysicalQuantity
of type temperature) – Temperature used in the Monte Carlo selection. Default: 300K.max_rotated_bonds (int) – The maximum number of bonds rotated in each new configuration. Default: Rotate all available bonds in each configuration.
overlap_ratio (float) – The acceptable VDW radius overlap between two atoms. Default: 0.7.
max_displacement (
PhysicalQuantity
of type length) – The maximum displacement in the atomic random displacement. Default: 0.1 Angstrom.sample_cis_trans (bool) – Whether or not cis and trans orientations of SP2 bonds are sampled. Default: True.
generate_bulk_configurations (bool) – Whether or not the returned dataset contains bulk configurations. Default: False.
random_seed (int) – The random seed used for generating the displacements. Default: Generated automatically.
log_filename_prefix (str) – Filename prefix for the logging output of the tasks associated with this set. Default: Defined by the
MomentTensorPotentialTraining
object.data_tag (str) – Label for this training set to enable selection of different data in MTP fitting.
- dataTag()¶
- Returns:
The selection tag added to the data in the training set.
- Return type:
str
- generateBulkConfigurations()¶
- Returns:
Whether or not the returned configurations are Bulk or Molecules.
- Return type:
bool
- logFilenameIdentifier()¶
- Returns:
Filename identifier for the logging output of the tasks associated with this set, or None if it hasn’t been set yet.
- Return type:
str | None
- logFilenamePrefix()¶
- Returns:
Filename prefix for the logging output of the tasks associated with this set, or None if it is to be defined by the
MomentTensorPotentialTraining
object.- Return type:
str |
LogToStdOut
| None
- maxDisplacement()¶
- Returns:
The maximum random displacement for each atom.
- Return type:
- maxRotatedBonds()¶
- Returns:
The maximum number of bonds rotated in each trial configuration.
- Return type:
int
- monteCarloCalculator()¶
- Returns:
The calculator used to calculate Monte Carlo energies.
- Return type:
Calculator | None
- monteCarloTemperature()¶
- Returns:
The temperature used in the Monte Carlo selection
- Return type:
PhysicalQuantity of type temperature
- overlapRatio()¶
- Returns:
The minimum acceptable VDW radii overlap ratio.
- Return type:
float
- randomSeed()¶
- Returns:
The random seed used in the random number generator.
- Return type:
int | None
- referenceConfigurations()¶
- Returns:
The list of reference configurations to be used to generate the training set.
- Return type:
list of [
MoleculeConfiguration
|BulkConfiguration
]
- sampleCisTrans()¶
- Returns:
Whether or not cis and trans orientations of SP2 bonds are sampled.
- Return type:
bool
- sampleSize()¶
- Returns:
The number of training configurations for each combination of list parameters.
- Return type:
int
- strainTensors()¶
- Returns:
The list of strains to apply to each configuration (None for no strain).
- Return type:
list of [numpy.ndarray | None]
- supercellRepetitionsList()¶
- Returns:
The list of supercells to construct for each configuration, given as the number of repetitions of the bulk unit cell along the (a, b, c) directions.
- Return type:
list of tuple (size 3) of int
- systemSizes()¶
- Parameters:
system_sizes (int | sequence of int) – Minimum number of atoms in the system. If None then supercell_repetitions_list will be used to do the repetition of the cell. If set, supercell_repetitions_list will be ignored. Default:
None
- uniqueString()¶
Return a unique string representing the state of the object.
Usage Examples¶
Setup of a training set of C2F 4 and C2F 6 using MolecularConfigurationsParameters.
setVerbosity(MinimalLog)
# -------------------------------------------------------------
# Fluoroethane configuration
# -------------------------------------------------------------
# Define elements
elements = [Carbon, Carbon, Fluorine, Fluorine, Fluorine, Fluorine, Fluorine, Fluorine]
# Define coordinates
cartesian_coordinates = [[-0.003263937084, -0. , 0. ],
[ 1.516263937084, -0. , -0. ],
[-0.469715381813, 0. , -1.301600235398],
[-0.469715381813, -1.127218869426, 0.650800117699],
[-0.469715381813, 1.127218869426, 0.650800117699],
[ 1.982715381813, 1.127218869426, -0.650800117699],
[ 1.982715381813, 0. , 1.301600235398],
[ 1.982715381813, -1.127218869426, -0.650800117699]]*Angstrom
# Set up configuration
C2F6_configuration = MoleculeConfiguration(
elements=elements,
cartesian_coordinates=cartesian_coordinates
)
C2F6_configuration.findBonds()
# -------------------------------------------------------------
# Fluoroethene configuration
# -------------------------------------------------------------
# Define elements
elements = [Carbon, Carbon, Fluorine, Fluorine, Fluorine, Fluorine]
# Define coordinates
cartesian_coordinates = [[ 0.089971642147, 0. , -0. ],
[ 1.423028357853, 0. , 0. ],
[-0.594324695679, 0. , -1.173915650421],
[-0.594324695679, 0. , 1.173915650421],
[ 2.107324695679, 0. , 1.173915650421],
[ 2.107324695679, 0. , -1.173915650421]]*Angstrom
# Set up configuration
C2F4_configuration = MoleculeConfiguration(
elements=elements,
cartesian_coordinates=cartesian_coordinates
)
C2F4_configuration.findBonds()
# -------------------------------------------------------------
# Specify molecular training set
# -------------------------------------------------------------
training_set = MolecularConfigurationsParameters(
[C2F6_configuration, C2F4_configuration],
sample_size=100,
generate_bulk_configurations=True
)
# -------------------------------------------------------------
# Specify fitting parameters
# -------------------------------------------------------------
nl_parameters = NonLinearCoefficientsParameters(
perform_optimization=True,
energy_only=False,
)
fitting_parameters = MomentTensorPotentialFittingParameters(
inner_cutoff_radii=1.0*Angstrom,
tapering_cutoff_radii=1.1*Angstrom,
outer_cutoff_radii=5*Angstrom,
mtp_filename='Carbon_Fluorine_MTP.mtp',
non_linear_coefficients_parameters=nl_parameters,
)
# -------------------------------------------------------------
# Generate the MTP
# -------------------------------------------------------------
mtp = MomentTensorPotentialTraining(
'Carbon_Fluorine_MTP.hdf5',
'Carbon_Fluorine_MTP',
[training_set],
calculator=LCAOCalculator(),
fitting_parameters_list=[fitting_parameters],
train_test_split=0.9,
)
mtp.update()
This script generates 100 configurations each of C2F 4 and C2F 6. The energy, force and stress
of each configuration is then calculated using density functional theory. This data is then used
to fit an MTP potential with settings specified with
MomentTensorPotentialFittingParameters. The MomentTensorPotentialTraining study
object containing the data and fitting statistics is in the file Carbon_Fluorine_MTP.hdf5
.
The fitted MTP potential is in the file Carbon_Fluorine_MTP.mtp
. A calculator using this MTP
potential can be returned from the MomentTensorPotentialTraining object using the method
momentTensorPotentialCalculators
.
Notes¶
The MolecularConfigurationsParameters
generates a trajectory of configurations that
can be used to train a moment tensor potential. Starting with one or more configurations as
input, new configurations are generated by rotating available bond torsions and adding random
displacements to the atoms. Bonds that are part of cyclic structures, such as the
carbon-carbon bond in cyclohexane are not freely rotated so as to not break the ring structure.
Double and triple bonds are also not freely rotated because of their typically high rotational
barriers. Double bonds are rotated over a smaller range centered on the planar geometry.
To avoid larger molecules being twisted onto itself while sampling the torsions, distances
between non-bonded atoms are checked for each configuration. Configurations where inter-atomic
distances are shorter than the factor of the van der Waals radius given by the parameter
overlap_ratio
are rejected. An additional Monte Carlo selection can also be performed on
configurations with rotated torsions. A calculator for evaluating the configuration energy and a
Monte Carlo selection temperature must be specified. This is done using the arguments
monte_carlo_calculator
and monte_carlo_temperature
respectively. If a calculator is not
given then then Monte Carlo selection is not performed.
The number of bonds rotated in each trial structure can be set using the argument
max_rotated_bonds
. Limiting the number of bonds rotated at each step can make it easier
to find non-overlapping structures when fitting an MTP for a polymer or other large molecule. For
configurations with double bonds, both cis and trans orientations of the bond are sampled by default.
Setting the argument sample_cis_trans
disables this sampling and turns off bond rotation for
double bonds.
In some cases, such as fitting an MTP for a surface process simulations molecular and bulk training
sets may be combined together. In these cases it is more convenient for the molecular training
sets to also contain bulk configurations. Setting the argument generate_bulk_configurations
to True
generates a training set of molecules that are placed in periodic boundary conditions
with a large buffer.
After defining the MolecularConfigurationsParameters
it can be used with
MomentTensorPotentialTraining for fitting an MTP potential.