MachineLearnedModelEvaluator

Included in QATK.MLFF

class MachineLearnedModelEvaluator(calculator, model_identifier, training_set=None, test_set=None, calculate_stress=True, isolated_atom_energies=None)

Initialize the MachineLearnedModelEvaluator.

Parameters:
  • calculator (Calculator) – The machine learning calculator to evaluate.

  • model_identifier (str) – Identifier for the model (e.g., filename or unique model description).

  • training_set (TrainingSet | None) – The training set containing reference configurations and data.
    Default: None.

  • test_set (TrainingSet | None) – The test set containing reference configurations and data.
    Default: None.

  • calculate_stress (bool) – Whether to calculate stress. Only applies to bulk configurations.
    Default: True.

  • isolated_atom_energies (dict of {Element: PhysicalQuantity of type energy} | None) – The energy of an isolated atom for each species. Optional; used to calculate cohesive energy corrections.
    Default: None.
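To illustrate what an isolated-atom correction typically looks like, the sketch below subtracts per-species reference energies from a total energy. It uses plain floats (eV assumed) and string element symbols instead of QuantumATK `PhysicalQuantity` and `Element` objects, and the function name `cohesive_energy` is hypothetical. The exact convention the evaluator applies internally is not stated in this reference, so treat this as an assumption.

```python
from collections import Counter


def cohesive_energy(total_energy, symbols, isolated_atom_energies):
    """Return total_energy minus the summed isolated-atom energies.

    total_energy: energy of the configuration (float, eV assumed).
    symbols: element symbol of every atom, e.g. ["Si", "O", "O"].
    isolated_atom_energies: mapping from symbol to isolated-atom energy.
    """
    counts = Counter(symbols)

    # Every species in the configuration needs a reference energy.
    missing = sorted(set(counts) - set(isolated_atom_energies))
    if missing:
        raise KeyError("No isolated-atom energy for: " + ", ".join(missing))

    # Sum the reference energy of each atom, grouped by species.
    reference = sum(n * isolated_atom_energies[s] for s, n in counts.items())
    return total_energy - reference
```

With this convention, a configuration whose species are all covered by the dictionary yields an energy relative to the separated atoms; a missing species raises an error instead of silently skipping the correction.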

calculateStatistics(statistical_measure=None, use_cohesive_energy=None)

Calculate statistics for the model’s predictions for both training and test sets. If isolated atom energies were provided during initialization, both raw and cohesive energy statistics can be calculated.

Parameters:
  • statistical_measure (RMSE | MAE | R2Score | None) – The statistical measure to use. If None, all measures are calculated.
    Default: None.

  • use_cohesive_energy (bool | None) – Whether to use cohesive energy calculations. If None, determined automatically based on isolated_atom_energies.
    Default: None.

Returns:

Tuple of dictionaries of statistical measures for training and test sets. Each dictionary contains measures for energy, forces, and stress. If a dataset was not provided, its dictionary will be None.

Return type:

tuple (size 2) of dict | None
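For orientation, the three measure names correspond to standard regression metrics. The sketch below writes out their textbook definitions in plain Python for flat sequences of reference and predicted values. The function names are hypothetical, and how QuantumATK flattens or aggregates force components and stress tensors before applying a measure is not covered here.

```python
import math


def _check(reference, predicted):
    # Both sequences must be non-empty and of equal length.
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("reference and predicted must be non-empty and equally long")


def rmse(reference, predicted):
    """Root-mean-square error."""
    _check(reference, predicted)
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(reference, predicted):
    """Mean absolute error."""
    _check(reference, predicted)
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


def r2_score(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    _check(reference, predicted)
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined when all reference values are identical")
    return 1.0 - ss_res / ss_tot
```

RMSE and MAE carry the units of the quantity being compared, while the R2 score is dimensionless and approaches 1 for a good fit.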

calculateStress()
Returns:

Whether stress will be calculated.

Return type:

bool

calculator()
Returns:

The machine learning calculator.

Return type:

Calculator

fittingReport(stream, use_cohesive_energy=None)

Write an ASCII table summarizing the statistical metrics to the given stream. If isolated atom energies were provided during initialization, both raw and cohesive energy statistics can be calculated.

Parameters:
  • stream (file-like) – The stream to write to.

  • use_cohesive_energy (bool | None) – Whether to use cohesive energy calculations. If None, determined automatically based on isolated_atom_energies.
    Default: None.
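Because the report is written to a stream, it can be captured as a string by passing an in-memory buffer. The sketch below assumes that `io.StringIO` qualifies as the "file-like" object the method expects; the helper name `fitting_report_text` is hypothetical, and `evaluator` stands for any object exposing the `fittingReport` signature documented above.

```python
import io


def fitting_report_text(evaluator, use_cohesive_energy=None):
    """Return the fitting report of `evaluator` as a string."""
    buffer = io.StringIO()
    # Signature taken from this reference: fittingReport(stream, use_cohesive_energy=None)
    evaluator.fittingReport(buffer, use_cohesive_energy=use_cohesive_energy)
    return buffer.getvalue()
```

The returned text can then be logged or saved alongside the model file.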

generateStatisticsData(use_cohesive_energy=None, dataset_type='test')

Generate statistics data for model predictions and reference values for a specific dataset. If isolated atom energies were provided during initialization, both raw and cohesive energy statistics can be calculated.

Parameters:
  • use_cohesive_energy (bool | None) – Whether to use cohesive energy calculations. If None, determined automatically based on isolated_atom_energies.
    Default: None.

  • dataset_type (str) – The type of dataset to process (MLParameterOptions.DATASET_TYPE.TRAINING or MLParameterOptions.DATASET_TYPE.TEST).
    Default: MLParameterOptions.DATASET_TYPE.TEST.

Returns:

The statistics data arrays for energy, forces, and stress.

Return type:

tuple (size 3) of numpy.ndarray | None
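A minimal sketch of unpacking this return value follows. The order (energy, forces, stress) is taken from the Returns description above. That either the whole result or individual entries may be `None` (for example when stress is not calculated) is an assumption drawn from the return type, and the helper name is hypothetical.

```python
def statistics_data_by_quantity(evaluator, dataset_type="test", use_cohesive_energy=None):
    """Label the arrays returned by generateStatisticsData and drop missing ones."""
    result = evaluator.generateStatisticsData(
        use_cohesive_energy=use_cohesive_energy,
        dataset_type=dataset_type,
    )
    # Assumption: the result may be None when the requested dataset is absent.
    if result is None:
        return {}

    energy, forces, stress = result
    labelled = {"energy": energy, "forces": forces, "stress": stress}

    # Keep only the quantities for which data is available.
    return {name: data for name, data in labelled.items() if data is not None}
```

The layout of each array is not specified in this reference, so inspect its shape before plotting or post-processing.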

isolatedAtomEnergies()
Returns:

The energy of an isolated atom for each species.

Return type:

dict of {Element: PhysicalQuantity of type energy} | None

modelIdentifier()
Returns:

The model identifier.

Return type:

str

nlprint(stream=None)

Print an ASCII table summarizing the available results. If isolated atom energies were provided during initialization, both raw and cohesive energy statistics can be calculated.

Parameters:

stream (file-like) – The stream to write to.

Returns:

The fitting report.

Return type:

fittingReport

uniqueString()

Return a unique string representing the state of the object.

Notes

  • The MachineLearnedModelEvaluator object contains validation results and metadata for a trained machine-learned force field model. It is automatically generated after training with MachineLearnedForceFieldTrainer completes and provides detailed performance metrics on both training and test datasets.

  • When creating a MachineLearnedModelEvaluator object, the model_identifier parameter uniquely identifies the model and links it to the model file’s filename. In MachineLearnedForceFieldTrainer, this identifier is retrieved automatically from the fitting parameters: for Moment Tensor Potential models it is the mtp_filename, while for MACE models it is built from experiment_name and random_seed in TrainingParameters as <experiment_name>_<random_seed>.qatkpt.

  • The evaluator object is retrieved from a MachineLearnedForceFieldTrainer instance using the modelEvaluator() method after calling train().

  • The MachineLearnedModelEvaluator class can also be used manually to evaluate the performance of a trained model. To do this, load the trained model into a calculator and create a MachineLearnedModelEvaluator object by passing the calculator, a model_identifier string, and labeled TrainingSet data (as training_set and/or test_set). After creation, call the calculateStatistics() method to compute performance metrics.

  • The evaluator can be visually inspected in the GUI using the MLFFAnalyzer tool, which provides interactive plots and detailed performance comparisons between predicted and reference values for energies, forces, and, if relevant, stresses. The analyzer refers to the model using its identifier.

  • Calling nlprint() on the evaluator object will display a comprehensive summary including statistical measures (RMSE, MAE, R2Score) for energies, forces, and stresses on both training and test datasets.

  • The trained calculator can be retrieved from the evaluator using the calculator() method, which returns the fitted machine-learned force field calculator ready for use in simulations.
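The manual evaluation workflow described in the notes can be sketched as follows. To keep the example independent of a QuantumATK installation, the evaluator class is passed in as an argument: in a QuantumATK script it would be `MachineLearnedModelEvaluator`, with `calculator` being the loaded model. The helper name `evaluate_model` is hypothetical; only the constructor and `calculateStatistics` signatures documented above are used.

```python
def evaluate_model(evaluator_class, calculator, model_identifier,
                   training_set=None, test_set=None, statistical_measure=None):
    """Build an evaluator and collect statistics for the datasets provided."""
    evaluator = evaluator_class(
        calculator,
        model_identifier,
        training_set=training_set,
        test_set=test_set,
    )

    # calculateStatistics returns a (training, test) pair; an entry is None
    # when the corresponding dataset was not provided.
    training_stats, test_stats = evaluator.calculateStatistics(
        statistical_measure=statistical_measure
    )

    results = {}
    if training_stats is not None:
        results["training"] = training_stats
    if test_stats is not None:
        results["test"] = test_stats
    return evaluator, results
```

Returning the evaluator together with the statistics keeps it available for further calls such as `nlprint()` or `fittingReport()`.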