TorchXPotential¶
Included in QATK.Calculators.ForceField
- class TorchXPotential(dtype=None, device=None, file=None, mapping=None, enforceLTX=None, task_str=None, suffix_cueq_fp32=None, suffix_cueq_fp64=None)¶
Constructor of the potential.
- Parameters:
dtype (str) – The dtype of the torch tensors involved, either TorchXPotential.float32 or TorchXPotential.float64. Default is TorchXPotential.float32.
device (str) – The device on which to evaluate the torchscript model, e.g. cpu, cuda, cuda:0, cuda:1. By default, available cuda devices are chosen.
file (str) – The name of the file that contains the torchscript model parameters.
mapping (dict with (ParticleIdentifier : Number)) – A mapping from ParticleIdentifiers to TorchXModel Particle Numbers.
enforceLTX (bool) – Enforce use of the LTX version (if available) also in the sequential case. LTX versions support local stress calculation. Note that in parallel MPI decomposition mode, LTX (if available) is used by default anyway.
task_str (str) – The name of a specific task for the model, if the model contains multiple tasks.
suffix_cueq_fp32 (str) – The suffix for the cueq file for float32 dtype.
suffix_cueq_fp64 (str) – The suffix for the cueq file for float64 dtype.
- classmethod getAllParameterNames()¶
Return the names of all used parameters as a list.
- getAllParameters()¶
Return all parameters of this potential and their current values as a <parameterName / parameterValue> dictionary.
- static getDefaults()¶
Get the default parameters of this potential and return them in form of a dictionary of <parameter name, default value> key-value pairs.
- getParameter(parameterName)¶
Get the current value of the parameter parameterName.
- static get_cue_file(file, device, dtype, suffix)¶
Return cue-accelerated TorchScript file name if present.
- Search order:
1. Directory of the provided file path (if it exists)
2. The TorchX potential files directory (same logic as get_symbols)
- File name pattern (appended to base without .pt):
float32 -> _cue_fp32.pt
float64 -> _cue_fp64.pt
- Conditions:
Only performed if ‘device’ refers to a CUDA device (starts with ‘cuda’)
File must exist
Returns the discovered cue filename (basename) or None if not found / conditions unmet.
- static get_mace_cue_file(file, device, dtype)¶
Return cue-accelerated TorchScript file name if present.
- Search order:
1. Directory of the provided file path (if it exists)
2. The TorchX potential files directory (same logic as get_symbols)
- File name pattern (appended to base without .pt):
float32 -> _cue_fp32.pt
float64 -> _cue_fp64.pt
- Conditions:
Only performed if ‘device’ refers to a CUDA device (starts with ‘cuda’)
File must exist
TorchScript extra file cuequivariance_version.txt must be present.
Returns the discovered cue filename (basename) or None if not found / conditions unmet.
- static get_sevennet_cue_file(file, device, dtype)¶
Return cue-accelerated TorchScript file name if present.
- Search order:
1. Directory of the provided file path (if it exists)
2. The TorchX potential files directory (same logic as get_symbols)
- File name pattern (appended to base without .pt):
float32, float64 -> _cue.pt
- Conditions:
Only performed if ‘device’ refers to a CUDA device (starts with ‘cuda’)
File must exist
TorchScript extra file cuequivariance_version.txt must be present.
Returns the discovered cue filename (basename) or None if not found / conditions unmet.
- setParameter(parameterName, value)¶
Set the parameter parameterName to the given value.
- Parameters:
parameterName (str) – The name of the parameter that will be modified.
value – The new value that will be assigned to the parameter parameterName.
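The lookup rules documented above for get_cue_file can be pictured with a small stand-alone sketch. This is not the QuantumATK implementation: find_cue_file, CUE_SUFFIX, and the potential_dir argument are hypothetical names introduced for illustration, and only the search order, the file name pattern, and the CUDA condition are taken from the description above.

```python
import tempfile
from pathlib import Path

# Suffixes from the documented file name pattern (assumed fixed here).
CUE_SUFFIX = {"float32": "_cue_fp32", "float64": "_cue_fp64"}


def find_cue_file(file, device, dtype, potential_dir):
    """Return the basename of a cue-accelerated model file, or None.

    Hypothetical re-creation of the documented rules, for illustration only.
    """
    # Condition: the search is only performed for CUDA devices.
    if not str(device).startswith("cuda"):
        return None
    suffix = CUE_SUFFIX.get(dtype)
    if suffix is None:
        return None

    path = Path(file)
    name = path.name
    # The suffix is appended to the base name without the ".pt" extension.
    base = name[:-3] if name.endswith(".pt") else name
    cue_name = base + suffix + ".pt"

    # Search order: 1. directory of the given file, 2. potential directory.
    directories = []
    if path.parent.is_dir():
        directories.append(path.parent)
    directories.append(Path(potential_dir))
    for directory in directories:
        # Condition: the file must exist.
        if (directory / cue_name).is_file():
            return cue_name
    return None


with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "my_model.pt"
    model.touch()
    (Path(tmp) / "my_model_cue_fp64.pt").touch()
    found_fp64 = find_cue_file(model, "cuda:0", "float64", tmp)
    found_fp32 = find_cue_file(model, "cuda:0", "float32", tmp)
    found_cpu = find_cue_file(model, "cpu", "float64", tmp)

print(found_fp64, found_fp32, found_cpu)
```

In this sketch only the float64 request on a CUDA device finds a file; the float32 request and the CPU request both return None, mirroring the "not found / conditions unmet" cases.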
Usage Examples¶
Set up a TorchXPotential and add it to the TremoloXPotentialSet.
In this example the dtype is switched from float32 to float64.
potential_set = TremoloXPotentialSet(name='TorchX-example')
# Add particle types for the needed elements, in this case Si and H.
potential_set.addParticleType(
    ParticleType(symbol='H')
)
potential_set.addParticleType(
    ParticleType(symbol='Si')
)
# Add the TorchXPotential with the torch-script file trained via QuantumATK.
_potential = TorchXPotential(
    file='MACE_training_example.qatkpt',
    dtype='float64',
)
potential_set.addPotential(_potential)
calculator = TremoloXCalculator(parameters=potential_set)
Set up a pre-trained MACE model and switch the dtype to float64. Enforce LTX-mode (even on single GPU) to enable calculation of local stress.
# Set up a pre-trained MACE potential with float64 precision.
# Enable LTX-mode to calculate local stress.
potential_set = TorchX_MACE_MP_0_L0_2023(dtype='float64', enforceLTX=True)
calculator = TremoloXCalculator(parameters=potential_set)
bulk_configuration.setCalculator(calculator)
# Calculate and print the local stress with the MACE ML-FF.
local_stress = LocalStress(bulk_configuration)
nlprint(local_stress)
Notes¶
The TorchXPotential class can be used to include torch-based
machine-learned FF models, such as MACE [1]. The
primary use case is to run simulations with user-trained ML-FF models, after
training the model using the MachineLearnedForceFieldTrainer. The
.qatkpt file that results from the training workflow can be loaded in
the TorchXPotential, either in a script or using
the MachineLearnedForceField block in the Workflow Builder. The path
to the torch-script file should be passed via the file argument when
constructing the TorchXPotential object.
For external user-trained models, currently only models based on the MACE architecture are supported. A given model can be converted using the function convertMACEModelToQATKFormat.
The dtype keyword can be used to switch the floating point precision that
is used for calculating the energy, forces, and stress. The default
float32 value is normally sufficiently accurate for MD simulations and
provides better performance. However, for calculations that require higher accuracy,
e.g. DynamicalMatrix or OptimizeNudgedElasticBand, it is
recommended to use float64 instead.
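Why precision matters for some analyses can be seen in a toy calculation. The sketch below is not QuantumATK code. It assumes a force constant obtained by a central finite difference of forces, as in finite-displacement approaches, and it mimics float32 forces simply by rounding the output of a one-dimensional model force. The potential, the displacement h, and all function names are illustrative assumptions.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def force(x):
    """Force of the toy potential E(x) = x**2/2 + x**4/4."""
    return -(x + x**3)


def force_constant(x, h, rounding=lambda v: v):
    """Central finite difference of the (optionally rounded) force, -dF/dx."""
    return -(rounding(force(x + h)) - rounding(force(x - h))) / (2.0 * h)


h = 1.0e-4  # finite displacement (arbitrary units)
points = [0.5 + 0.1 * i for i in range(11)]
exact = [1.0 + 3.0 * x**2 for x in points]  # analytic d2E/dx2

# Largest deviation from the analytic force constant over all points.
err64 = max(abs(force_constant(x, h) - k) for x, k in zip(points, exact))
err32 = max(
    abs(force_constant(x, h, to_float32) - k) for x, k in zip(points, exact)
)

print(f"max error with float64 forces: {err64:.1e}")
print(f"max error with float32 forces: {err32:.1e}")
```

The subtraction of two nearly equal forces amplifies the rounding error of each force by roughly 1/h, so the float32 result is far less accurate than the float64 one even though the individual forces agree to about seven digits.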
The enforceLTX keyword can be used to manually switch between standard
serial mode (enforceLTX=False) and multi-GPU mode
(enforceLTX=True), which enables MPI communication of node features
between message-passing layers of the ML-FF (currently supported only for
MACE and MatterSim models). By default, the mode is selected automatically,
depending on whether or not the calculation is run on GPU with multiple MPI processes.
Setting enforceLTX=True enforces the multi-GPU mode even for a serial
run. This enables calculation
of LocalStress, which is otherwise not supported
for TorchXPotential.
The mapping keyword can be used to specify a dict that maps atomic symbols
to the atomic numbers used to encode the elements in most ML-FF backends. The
mapping is not needed for models trained with QuantumATK or converted
using convertMACEModelToQATKFormat.
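As a minimal illustration of such a dictionary, the sketch below builds a mapping for the Si/H system used in the usage examples. It assumes that plain element symbols are accepted as keys, as the description above suggests. The helper build_mapping, the table ATOMIC_NUMBERS, and the file name in the commented-out call are hypothetical.

```python
# Standard atomic numbers for the elements of interest.
ATOMIC_NUMBERS = {"H": 1, "Si": 14}


def build_mapping(symbols):
    """Map each element symbol to the atomic number used by the model."""
    return {symbol: ATOMIC_NUMBERS[symbol] for symbol in symbols}


mapping = build_mapping(["H", "Si"])
print(mapping)  # {'H': 1, 'Si': 14}

# Inside a QuantumATK script the dictionary would then be passed on, e.g.
# (file name is a placeholder, not a shipped model):
# potential = TorchXPotential(file="external_model.pt", mapping=mapping)
```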
GPU and CuEquivariance Support¶
The device keyword can be used to manually disable or enable GPU
acceleration. However, the recommended approach is to leave this parameter at the
default value None, and enable the GPU acceleration as described in
the technical notes.
When run on GPU, TorchXPotentials can use acceleration via cuequivariance if
the architecture supports it.
This requires a separate model file, as explained in convertMACEModelToQATKFormat.
When using user-trained models, the user has to make sure that the model filename is consistent
with the chosen dtype and cuequivariance settings. For example, when specifying a cuequivariance
model file for float32, the dtype parameter should be set to float32. Furthermore, the
simulation needs to be run in GPU-enabled mode so that cuequivariance can be used; otherwise,
errors may be encountered.
For the pre-trained ML-FF models shipped with QuantumATK that support cuequivariance,
it is sufficient to pass the base model filename without any cuequivariance suffix, as
it is automatically detected at runtime whether running with cuequivariance is appropriate.
If cuequivariance is used during runtime, the dtype parameter controls which
cuequivariance model is used.
Therefore, these models are conveniently handled via ready-made TorchXPotentials such as
TorchX_MACE_MPA_0_medium.
The following example sets up the MACE-MPA-0 model with 64-bit floating point precision.
Depending on whether a GPU is available and enabled, it automatically chooses between
the standard model file and the appropriate cuequivariance model file
(here the float64 version):
potential = TorchX_MACE_MPA_0_medium(dtype="float64")
calculator = TremoloXCalculator(potential)
For user-trained models, it is recommended to pass the specific model file for the model
type to be used to ensure model file availability, reliability, and consistency across
job submissions and clusters. This includes ensuring that the model file matches the chosen
dtype and that GPU will be available and enabled if cuequivariance is desired. If
any of these settings are inconsistent, errors may be encountered during runtime.
An example of how to set up a user-trained model with cuequivariance support is shown here:
potential = TorchXPotential(
    file="custom_trained_mace_model_cue_fp64.qatkpt",
    dtype="float64"
)
calculator = TremoloXCalculator(potential)
Note that the generated script is compatible with only one of CPU or GPU execution, depending on whether or not the cuequivariance model file is used. To run on the other device, the model file that is read in the potential setup has to be changed accordingly.
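The consistency requirements for user-trained models can be checked before a job is submitted. The following is a minimal sketch and not part of QuantumATK: check_model_settings is a hypothetical helper, and it assumes that cuequivariance model files carry the _cue_fp32 or _cue_fp64 tag in their names, as in the file names shown above.

```python
def check_model_settings(file, dtype, device):
    """Return a list of inconsistencies between file name, dtype and device.

    Hypothetical pre-flight check; an empty list means nothing was flagged.
    """
    problems = []
    # Detect which cuequivariance variant, if any, the file name refers to.
    cue_dtype = None
    for name, tag in (("float32", "_cue_fp32"), ("float64", "_cue_fp64")):
        if tag in file:
            cue_dtype = name
    if cue_dtype is None:
        return problems  # Not a cue model file; nothing to check here.
    if dtype != cue_dtype:
        problems.append(f"file is a {cue_dtype} cue model but dtype is {dtype}")
    if not device.startswith("cuda"):
        problems.append(f"cue model requires a CUDA device, got {device}")
    return problems


model_file = "custom_trained_mace_model_cue_fp64.qatkpt"
ok = check_model_settings(model_file, "float64", "cuda:0")
wrong_dtype = check_model_settings(model_file, "float32", "cuda:0")
wrong_device = check_model_settings(model_file, "float64", "cpu")

print(ok)
print(wrong_dtype)
print(wrong_device)
```

Running such a check in the job script before constructing the potential turns a runtime failure on the cluster into an early, readable message.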