TorchXPotential

Included in QATK.Calculators.ForceField

class TorchXPotential(dtype=None, device=None, file=None, mapping=None, enforceLTX=None, task_str=None, suffix_cueq_fp32=None, suffix_cueq_fp64=None)

Constructor of the potential.

Parameters:
  • dtype (str) – The dtype of the involved torch tensors [TorchXPotential.float32, TorchXPotential.float64]. Default is TorchXPotential.float32.

  • device (str) – The device to evaluate the torchscript model. E.g.: cpu, cuda, cuda:0, cuda:1. Default is to choose available cuda devices.

  • file (str) – The name of the file that contains the torchscript model parameters.

  • mapping (dict with (ParticleIdentifier : Number)) – A mapping from ParticleIdentifiers to TorchXModel particle numbers.

  • enforceLTX (bool) – Use the LTX version of the model, if available, also in the sequential case. LTX versions support local stress calculation. Note that in parallel MPI decomposition mode, the LTX version (if available) is used by default anyway.

  • task_str (str) – The name of a specific task for the model, if the model contains multiple tasks.

  • suffix_cueq_fp32 (str) – The suffix for the cueq file for float32 dtype.

  • suffix_cueq_fp64 (str) – The suffix for the cueq file for float64 dtype.

classmethod getAllParameterNames()

Return the names of all used parameters as a list.

getAllParameters()

Return all parameters of this potential and their current values as a <parameterName / parameterValue> dictionary.

static getDefaults()

Get the default parameters of this potential and return them in form of a dictionary of <parameter name, default value> key-value pairs.

getParameter(parameterName)

Get the current value of the parameter parameterName.

static get_cue_file(file, device, dtype, suffix)

Return cue-accelerated TorchScript file name if present.

Search order:
  1. Directory of the provided file path (if it exists)

  2. The TorchX potential files directory (same logic as get_symbols)

File name pattern (appended to base without .pt):
  • float32 -> _cue_fp32.pt

  • float64 -> _cue_fp64.pt

Conditions:
  • Only performed if ‘device’ refers to a CUDA device (starts with ‘cuda’)

  • File must exist

Returns the discovered cue filename (basename) or None if not found / conditions unmet.
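The lookup rules above can be sketched in plain Python. This is only an illustration of the documented behavior, not the actual QuantumATK implementation: the function name find_cue_file and the potential_dir argument (standing in for the TorchX potential files directory) are hypothetical, and skipping the first search step for a bare file name is an assumption.

```python
import os

# Suffixes taken from the file name pattern documented above.
CUE_SUFFIX = {"float32": "_cue_fp32.pt", "float64": "_cue_fp64.pt"}


def find_cue_file(file, device, dtype, potential_dir=None):
    """Hypothetical sketch of the documented cue-file lookup rules."""
    # Condition: the lookup is only performed for CUDA devices.
    if device is None or not str(device).startswith("cuda"):
        return None
    suffix = CUE_SUFFIX.get(dtype)
    if suffix is None:
        return None

    # The suffix is appended to the base name without the '.pt' extension.
    base = os.path.basename(file)
    if base.endswith(".pt"):
        base = base[:-3]
    cue_name = base + suffix

    # Search order: directory of the given path first, then the
    # potential files directory (passed in explicitly in this sketch).
    for directory in (os.path.dirname(file), potential_dir):
        if directory and os.path.isfile(os.path.join(directory, cue_name)):
            return cue_name  # basename only, as documented
    return None
```

As in the description above, the sketch returns None both when the conditions are not met (for example, a CPU device) and when no matching file exists.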

static get_mace_cue_file(file, device, dtype)

Return cue-accelerated TorchScript file name if present.

Search order:
  1. Directory of the provided file path (if it exists)

  2. The TorchX potential files directory (same logic as get_symbols)

File name pattern (appended to base without .pt):
  • float32 -> _cue_fp32.pt

  • float64 -> _cue_fp64.pt

Conditions:
  • Only performed if ‘device’ refers to a CUDA device (starts with ‘cuda’)

  • File must exist

  • TorchScript extra file cuequivariance_version.txt must be present.

Returns the discovered cue filename (basename) or None if not found / conditions unmet.

static get_sevennet_cue_file(file, device, dtype)

Return cue-accelerated TorchScript file name if present.

Search order:
  1. Directory of the provided file path (if it exists)

  2. The TorchX potential files directory (same logic as get_symbols)

File name pattern (appended to base without .pt):
  • float32, float64 -> _cue.pt

Conditions:
  • Only performed if ‘device’ refers to a CUDA device (starts with ‘cuda’)

  • File must exist

  • TorchScript extra file cuequivariance_version.txt must be present.

Returns the discovered cue filename (basename) or None if not found / conditions unmet.

setParameter(parameterName, value)

Set the parameter parameterName to the given value.

Parameters:
  • parameterName (str) – The name of the parameter that will be modified.

  • value – The new value that will be assigned to the parameter parameterName.

Usage Examples

Set up a TorchXPotential and add it to the TremoloXPotentialSet. In this example the dtype is switched from float32 to float64.

potential_set = TremoloXPotentialSet(name='TorchX-example')

# Add particle types for the needed elements, in this case Si and H.
potential_set.addParticleType(
    ParticleType(symbol='H')
)
potential_set.addParticleType(
    ParticleType(symbol='Si')
)

# Add the TorchXPotential with the torch-script file trained via QuantumATK.
_potential = TorchXPotential(
    file='MACE_training_example.qatkpt',
    dtype='float64',
)
potential_set.addPotential(_potential)
calculator = TremoloXCalculator(parameters=potential_set)

Set up a pre-trained MACE model and switch the dtype to float64. Enforce LTX-mode (even on single GPU) to enable calculation of local stress.

# Set up a pre-trained MACE potential with float64 precision.
# Enable LTX-mode to calculate local stress.
potential_set = TorchX_MACE_MP_0_L0_2023(dtype='float64', enforceLTX=True)
calculator = TremoloXCalculator(parameters=potential_set)

# bulk_configuration is assumed to be an existing BulkConfiguration.
bulk_configuration.setCalculator(calculator)

# Calculate and print the local stress with the MACE ML-FF.
local_stress = LocalStress(bulk_configuration)
nlprint(local_stress)

Notes

The TorchXPotential class can be used to include torch-based machine-learned force field (ML-FF) models, such as MACE [1]. The primary use case is to run simulations with user-trained ML-FF models, after training the model using the MachineLearnedForceFieldTrainer. The .qatkpt file that results from the training workflow can be loaded in the TorchXPotential, either in a script or using the MachineLearnedForceField block in the Workflow Builder. The path to the TorchScript file should be passed via the file argument when constructing the TorchXPotential object.

For external user-trained models, currently only models based on the MACE architecture are supported. A given model can be converted using the function convertMACEModelToQATKFormat.

The dtype keyword can be used to switch the floating point precision that is used for calculating the energy, forces, and stress. The default, float32, is normally sufficiently accurate for MD simulations and provides better performance. However, for calculations that require higher accuracy, e.g. DynamicalMatrix or OptimizeNudgedElasticBand, it is recommended to use float64 instead.
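To illustrate why finite-difference-type calculations are sensitive to the floating point precision, the following toy sketch compares a central-difference curvature estimate computed from energies stored in float64 and in float32. It uses only the Python standard library and does not call QuantumATK. The harmonic energy function, the offset of -1000, and the displacement of 0.01 are arbitrary assumptions chosen for illustration, and this is not how DynamicalMatrix is evaluated internally.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def toy_energy(x, k=1.0, e0=-1000.0):
    """Harmonic toy energy with a large constant offset (arbitrary units)."""
    return e0 + 0.5 * k * x * x


def curvature(energy, h=0.01):
    """Central finite-difference estimate of the second derivative at x = 0."""
    return (energy(h) - 2.0 * energy(0.0) + energy(-h)) / (h * h)


# Exact curvature of the toy model is k = 1.0.
k64 = curvature(toy_energy)
k32 = curvature(lambda x: to_float32(toy_energy(x)))

print(f"float64 energies: k = {k64:.6f}")
print(f"float32 energies: k = {k32:.6f}")
```

In this toy setup the float64 estimate reproduces the exact curvature closely, whereas the float32 estimate deviates noticeably (by more than 10%), because the small energy differences are lost in the rounding of the large total energy.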

The enforceLTX keyword can be used to manually switch between standard serial mode (enforceLTX=False) and multi-GPU mode (enforceLTX=True). The multi-GPU mode enables MPI communication of node features between the message-passing layers of the ML-FF and is currently supported only for MACE and MatterSim models. By default, the mode is selected automatically, depending on whether the calculation runs on GPU with multiple MPI processes. Setting enforceLTX=True enforces the multi-GPU mode even when running in serial. This enables the calculation of LocalStress, which is otherwise not supported for TorchXPotential.

The mapping keyword can be used to specify a dict that maps atomic symbols to the atomic numbers that most ML-FF backends use to encode the elements. The mapping is not needed for models trained with QuantumATK or converted using convertMACEModelToQATKFormat.

GPU and CuEquivariance Support

The device keyword can be used to manually disable or enable GPU acceleration. However, the recommended approach is to leave this parameter at the default value None, and enable the GPU acceleration as described in the technical notes.

When run on GPU, TorchXPotentials can use acceleration via cuequivariance if the architecture supports it. This requires a separate model file, as explained in convertMACEModelToQATKFormat. When using user-trained models, the user has to make sure that the model filename is consistent with the chosen dtype and cuequivariance settings. For example, when specifying a cuequivariance model file for float32, the dtype parameter should be set to float32. Furthermore, the simulation needs to be run in GPU-enabled mode so that cuequivariance can be used; otherwise, errors can be encountered.

For the pre-trained ML-FF models shipped with QuantumATK that support cuequivariance, it is sufficient to pass the base model filename without any cuequivariance suffix, because whether cuequivariance can be used is detected automatically at runtime. If cuequivariance is used, the dtype parameter controls which cuequivariance model file is loaded. These models are therefore most conveniently handled via ready-made TorchXPotentials such as TorchX_MACE_MPA_0_medium. The following example sets up the MACE-MPA-0 model with 64-bit floating point precision. It automatically chooses between the standard model file and the appropriate cuequivariance model file (here the float64 version), depending on whether a GPU is available and enabled:

potential = TorchX_MACE_MPA_0_medium(dtype="float64")
calculator = TremoloXCalculator(potential)

For user-trained models, it is recommended to pass the specific model file for the model type to be used to ensure model file availability, reliability, and consistency across job submissions and clusters. This includes ensuring that the model file matches the chosen dtype and that GPU will be available and enabled if cuequivariance is desired. If any of these settings are inconsistent, errors may be encountered during runtime. An example of how to set up a user-trained model with cuequivariance support is shown here:

potential = TorchXPotential(
    file="custom_trained_mace_model_cue_fp64.qatkpt",
    dtype="float64"
)
calculator = TremoloXCalculator(potential)

Note that the generated script is compatible with either CPU or GPU execution, but not both, depending on whether the cuequivariance model file is used. To run on the other device, the model file that is read in the potential setup has to be changed accordingly.
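The consistency requirements for user-trained models can be checked before submitting a job. The following is a minimal sketch of such a pre-flight check in plain Python. The helper check_model_settings is hypothetical and not part of QuantumATK. It assumes that cuequivariance model files follow the _cue_fp32 / _cue_fp64 naming used in the examples above, and that the caller knows whether the job will run in GPU-enabled mode.

```python
import os

# Assumed naming convention: '<name>_cue_fp32.<ext>' or '<name>_cue_fp64.<ext>'.
CUE_TAGS = {"_cue_fp32": "float32", "_cue_fp64": "float64"}


def check_model_settings(file, dtype, gpu_enabled):
    """Return a list of detected inconsistencies (empty if none are found)."""
    stem = os.path.splitext(os.path.basename(file))[0]
    problems = []
    for tag, expected_dtype in CUE_TAGS.items():
        if not stem.endswith(tag):
            continue
        # A cuequivariance model file must match the chosen dtype ...
        if dtype != expected_dtype:
            problems.append(
                f"{file} is a {expected_dtype} cuequivariance model, "
                f"but dtype={dtype!r} was requested."
            )
        # ... and requires the calculation to run in GPU-enabled mode.
        if not gpu_enabled:
            problems.append(
                f"{file} is a cuequivariance model and requires GPU-enabled mode."
            )
    return problems


for message in check_model_settings(
    "custom_trained_mace_model_cue_fp64.qatkpt", "float32", gpu_enabled=False
):
    print("WARNING:", message)
```

A file name without a cuequivariance suffix passes the check for any dtype, since the sketch cannot infer anything about such a model from its name alone.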