GridValuesModelEvaluation

Included in QATK.MLDFT

class GridValuesModelEvaluation(grid_values_dataset, evaluation_samples=None, evaluation_interval=None, probe_count=None, gpu_acceleration=None)

Initialize the evaluation class with samples for periodic evaluation during training.

The evaluation samples can be specified in several flexible ways:

  • None (default): Randomly selects 10% of the validation samples using the dataset’s seed for reproducibility.

  • Integer: Randomly selects the specified number of samples from the validation set using the dataset’s seed.

  • Float: A fraction (between 0 and 1) of the validation samples to randomly select.

  • List of tuples: Each tuple should be (file_path, object_id) specifying exact samples from the dataset.

  • List of integers: Direct indices into the dataset’s member list.

  • Mixed list: A combination of tuples and integers in the same list.

When specifying samples as a list, both training and validation samples can be used.
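
The selection rules above can be sketched as follows. This is an illustrative reimplementation of the documented behavior, not QuantumATK's actual code; the names `validation_indices`, `member_index`, and the exact rounding of fractional counts are assumptions.

```python
import random

def resolve_evaluation_samples(spec, validation_indices, member_index, seed):
    """Illustrative sketch: map an evaluation_samples specification to a
    list of integer indices into the dataset's member list.

    spec               -- None, int, float, or a list of ints and/or
                          (file_path, object_id) tuples
    validation_indices -- indices of the validation samples (hypothetical name)
    member_index       -- dict mapping (file_path, object_id) -> index (hypothetical)
    seed               -- the dataset's seed, for reproducible random selection
    """
    rng = random.Random(seed)
    if spec is None:
        # Default: randomly select 10% of the validation samples (at least one).
        count = max(1, int(0.1 * len(validation_indices)))
        return sorted(rng.sample(validation_indices, count))
    if isinstance(spec, bool):
        raise TypeError("evaluation_samples must not be a bool")
    if isinstance(spec, int):
        if spec > len(validation_indices):
            raise ValueError("more samples requested than validation samples available")
        return sorted(rng.sample(validation_indices, spec))
    if isinstance(spec, float):
        if not 0.0 < spec < 1.0:
            raise ValueError("fraction must lie strictly between 0 and 1")
        count = max(1, int(spec * len(validation_indices)))
        return sorted(rng.sample(validation_indices, count))
    # A list may mix integer indices and (file_path, object_id) tuples,
    # and may refer to training as well as validation samples.
    indices = []
    for item in spec:
        if isinstance(item, tuple):
            indices.append(member_index[item])
        else:
            indices.append(item)
    return indices
```

Because the random choices are driven by the dataset's seed, repeated runs on the same dataset select the same samples.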

Parameters:
  • grid_values_dataset – The dataset object used for evaluation. Must be of type GridValuesDataset.

  • evaluation_samples

    Specification of which samples to use for evaluation. Can be:

    • None: Randomly select 10% of validation samples (default behavior).

    • int: Number of random validation samples to select (must be ≤ validation set size).

    • float: Fraction (0 < fraction < 1) of validation samples to select randomly.

    • List[Tuple[str, str]]: List of (file_path, object_id) tuples identifying specific samples.

    • List[int]: List of integer indices into the dataset’s member list (must be valid indices).

    Random selection uses the dataset’s seed to ensure reproducibility across runs with the same dataset.
    Default: None (selects 10% of validation samples randomly).

  • evaluation_interval – Interval (in training steps) at which evaluation is performed during training. Evaluation results are logged and written to the evaluation log file at these intervals.
    Default: 100000.

  • probe_count – Number of probe points to use for model prediction when constructing graphs. This should not affect the results, only the computational cost.
    Default: 5000.

  • gpu_acceleration

    Whether to use GPU acceleration for model inference:

    • Enabled: Force GPU usage (raises error if GPU unavailable).

    • Disabled: Use CPU only.

    • Automatic: Use GPU if available, fall back to CPU otherwise.

    Default: Automatic.

evaluate(model)

Evaluate the model on the specified evaluation samples from the dataset.

This method performs inference using the provided model on the evaluation samples and computes various error metrics by comparing predictions to reference values. The evaluation behavior depends on the type of grid values being analyzed:

For ElectronDifferenceDensity (EDD):
  • Predicted density is grounded to ensure charge neutrality (zero mean).

  • For device configurations, grounding is performed only in the central region.

  • Computes multiple error metrics:
    • MAE on electron difference density, normalized by reference values.

    • MAE on Hartree difference potential.

    • Errors in charge, dipole moments (x, y, z), and spheropole moment.

  • Always sums spin components to get total charge density for evaluation.
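
The grounding step can be illustrated with a minimal NumPy sketch. This is an assumption about the mechanics, not QuantumATK's implementation; the `central_mask` argument is a hypothetical stand-in for how the central region of a device configuration might be identified.

```python
import numpy as np

def ground_density(density, central_mask=None):
    """Sketch of grounding: shift the predicted electron difference density
    so it has zero mean, enforcing charge neutrality.

    For device configurations, only the central region (here a hypothetical
    boolean mask) is grounded; the electrode regions are left untouched.
    """
    density = np.asarray(density, dtype=float)
    if central_mask is None:
        return density - density.mean()
    grounded = density.copy()
    grounded[central_mask] -= density[central_mask].mean()
    return grounded
```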

For other grid values (e.g., ElectronDensity, EffectivePotential):
  • Computes mean absolute error (MAE) normalized by reference values.

  • For unpolarized models with multi-component predictions, averages the components.

  • For polarized/noncollinear calculations, evaluates spin-resolved quantities.
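
One plausible reading of "MAE normalized by reference values" is the mean absolute error divided by the mean absolute reference value; the sketch below uses that reading, and the component-averaging axis is likewise an assumption:

```python
import numpy as np

def normalized_mae(prediction, reference):
    """Mean absolute error divided by the mean absolute reference value.
    One plausible reading of 'MAE normalized by reference values'."""
    prediction = np.asarray(prediction, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.abs(prediction - reference).mean() / np.abs(reference).mean()

def unpolarized_mae(prediction_components, reference):
    """For an unpolarized model with multi-component predictions, average
    the components before computing the error (the averaging axis is an
    assumption for illustration)."""
    averaged = np.asarray(prediction_components, dtype=float).mean(axis=0)
    return normalized_mae(averaged, reference)
```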

Parameters:

model – The trained model to evaluate.

Returns:

List of dictionaries containing various metric data, one per evaluation sample.

evaluationInterval()

Return the evaluation interval in training steps.

Returns:

The evaluation interval in training steps.

evaluationSamples()

Return the list of evaluation sample indices referring to the members in the GridValuesDataset.

Returns:

List of integer indices into the dataset.

gpuAcceleration()

Return the GPU acceleration setting for model inference during evaluation.

Returns:

The NLFlag indicating GPU acceleration setting.

gridValuesDataset()

Return the grid values dataset that is used for evaluation.

Returns:

The GridValuesDataset object.

static logEvaluationResults(step, evaluation_results, evaluation_log_file, logger=None, terminal_output=None)

Log the evaluation results obtained from calling the “evaluate” method to a CSV file and optionally print a formatted string summarizing the mean of each metric across all samples.

Parameters:
  • step – The training step at which the evaluation was performed.

  • evaluation_results – List of dictionaries containing evaluation metrics for each sample.

  • evaluation_log_file – Path to the CSV file where evaluation results will be logged.

  • logger – Optional NLLogger for logging summary messages.

  • terminal_output – Optional TerminalOutput for printing summary messages.
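
The documented logging behavior (append one CSV row per evaluation, containing the mean of each metric across all samples) can be sketched as follows. The column layout and header are assumptions for illustration, not QuantumATK's actual log format, and the logger/terminal-output arguments are omitted.

```python
import csv
import os
import statistics

def log_evaluation_results(step, evaluation_results, evaluation_log_file):
    """Illustrative sketch: average each metric over the per-sample result
    dictionaries and append one row to a CSV log file.

    The 'step' column and alphabetical metric ordering are assumptions."""
    metrics = sorted(evaluation_results[0])
    means = {m: statistics.fmean(r[m] for r in evaluation_results) for m in metrics}
    write_header = not os.path.exists(evaluation_log_file)
    with open(evaluation_log_file, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["step"] + metrics)
        writer.writerow([step] + [means[m] for m in metrics])
    return means
```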

probeCount()

Return the number of probe points used for model prediction during evaluation.

Returns:

The number of probe points.