Automated Precursor Molecule Generation for Vapor Deposition

This application demonstrates the automated generation and optimization of precursor molecules for vapor deposition processes such as chemical vapor deposition (CVD) and atomic layer deposition (ALD). The method is general and can construct both homoleptic (identical ligands) and heteroleptic (mixed ligands) precursor structures by systematically attaching ligands around a central atom. It is not limited to metal-organic molecules and can be applied to a wide range of central atoms and ligand types.

Important

QuantumATK Version: This application is designed for QuantumATK X-2025.06.

This application requires the PrecursorGenerator module for automated molecular construction. The module is provided as a compiled Python file (.pye) and must be available in your Python path. The required files are provided as downloads with this application.

Key Features

  • Combinatorial generation: Automatically creates all unique ligand combinations

  • Homoleptic and heteroleptic: Generates precursors with identical or mixed ligands

  • Geometric optimization: Smart placement with non-overlapping geometry constraints

  • Anchor atom recognition: Identifies ligand attachment points via atom tagging

  • Moment of inertia alignment: Orients ligands naturally around the central atom

  • Automated optimization: Uses machine learning force fields for structure relaxation

System Overview

Precursor molecules consist of a central atom (metal or non-metal) surrounded by organic or inorganic ligands. In thin film deposition, the parts of the molecule play the following roles:

  • Central atom: The element to be deposited (e.g., Ir, Si)

  • Ligands: Organic or inorganic groups that stabilize the central atom and control reactivity

  • Anchor atoms: Specific atoms in each ligand (e.g., C, O, N) that bond to the central atom

Example System: Iridium Alkyl Precursors

This example demonstrates Ir-alkyl precursors with three different alkyl ligands:

  • Methyl (-CH₃): Smallest alkyl ligand

  • Ethyl (-C₂H₅): Medium-sized alkyl ligand

  • Propyl (-C₃H₇): Larger alkyl ligand with extended chain

Types of Precursors Generated

  • Homoleptic: All ligands are identical

  • Heteroleptic: Mixed ligands

The tool generates all mathematically unique combinations for the central Ir atom with coordination number 3, avoiding permutational duplicates.

Simulation Workflow

Step 0: Ligand Library Preparation

Each ligand must be prepared as a separate molecular configuration with an anchor atom tagged to indicate the attachment point. In this example, three alkyl ligands are defined:

Methyl Ligand (-CH₃)

# Import the QuantumATK scripting interface
from QuantumATK import *

# Define elements
elements = [Carbon, Hydrogen, Hydrogen, Hydrogen]

# Define coordinates
cartesian_coordinates = [[ 0.0,    0.0,    0.0],
                         [ 1.09,   0.0,    0.0],
                         [-0.363,  0.0,   -1.028],
                         [-0.363, -0.890,  0.514]] * Angstrom

configuration_0 = MoleculeConfiguration(
    elements=elements,
    cartesian_coordinates=cartesian_coordinates
)

# Tag atoms: carbon (index 0) is the anchor, hydrogens (indices 1-3) are ligand structure
configuration_0.addTags('H_C', [1, 2, 3])
configuration_0.addTags('anchor', [0])

configuration_name_0 = "Methyl"

Ethyl Ligand (-C₂H₅)

# Define elements
elements = [Carbon, Carbon, Hydrogen, Hydrogen, Hydrogen, Hydrogen, Hydrogen]

# Define coordinates
cartesian_coordinates = [[ 0.0,    0.0,    0.0],
                         [ 1.513,  0.0,    0.0],
                         [-0.363,  0.0,   -1.028],
                         [-0.363, -0.890,  0.514],
                         [-0.363,  0.890,  0.514],
                         [ 1.876,  0.890, -0.514],
                         [ 1.876, -0.890, -0.514]] * Angstrom

configuration_2 = MoleculeConfiguration(
    elements=elements,
    cartesian_coordinates=cartesian_coordinates
)

# Tag the hydrogens (indices 2-6), then tag the CH2 carbon (index 1) as anchor
configuration_2.addTags('H_C', [2, 3, 4, 5, 6])
configuration_2.addTags('anchor', [1])

configuration_name_2 = "Ethyl"

Propyl Ligand (-C₃H₇)

# Define elements
elements = [Carbon, Carbon, Hydrogen, Hydrogen, Hydrogen, Hydrogen,
            Carbon, Hydrogen, Hydrogen, Hydrogen]

# Define coordinates (extended chain)
cartesian_coordinates = [[ 0.0,    0.0,    0.0],
                         [ 1.513,  0.0,    0.0],
                         [-0.363,  0.0,   -1.028],
                         [-0.363, -0.890,  0.514],
                         [-0.363,  0.890,  0.514],
                         [ 1.876,  0.890, -0.514],
                         [ 2.017,  0.0,    1.426],
                         [ 1.876, -0.890, -0.514],
                         [ 3.107,  0.0,    1.426],
                         [ 1.654, -0.890,  1.940]] * Angstrom

configuration_1 = MoleculeConfiguration(
    elements=elements,
    cartesian_coordinates=cartesian_coordinates
)

# Tag the hydrogens (all atoms except carbons 0, 1, and 6), then tag the terminal CH2 carbon (index 6) as anchor
configuration_1.addTags('H_C', [2, 3, 4, 5, 7, 8, 9])
configuration_1.addTags('anchor', [6])

configuration_name_1 = "Propyl"

Key parameters for ligand library:

  • Each ligand configuration must have exactly one atom tagged as 'anchor'

  • Anchor atoms (typically C for alkyl, O for alkoxy, N for amine ligands) bond to the central atom

  • Additional tags like 'H_C' are optional and help identify ligand structure

  • All ligands are stored in a Table for systematic processing
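The tagging rules above can be checked before the generator is run. The following is a minimal sketch in plain Python and is not part of the PrecursorGenerator module: the helper name `check_ligand_tags` and the dictionary layout for tags are assumptions made for illustration, and the element symbols and indices are copied by hand from the methyl definition.

```python
def check_ligand_tags(elements, tags):
    """Return a list of problems found in a ligand's tag assignment.

    elements: list of element symbols, one per atom.
    tags: dict mapping tag name -> list of atom indices (hypothetical layout).
    """
    problems = []

    # Rule from the text: exactly one atom carries the 'anchor' tag.
    anchors = tags.get('anchor', [])
    if len(anchors) != 1:
        problems.append(f"expected exactly one anchor atom, found {len(anchors)}")

    # Every tagged index must refer to an atom that exists.
    for name, indices in tags.items():
        for index in indices:
            if not 0 <= index < len(elements):
                problems.append(f"tag '{name}' refers to missing atom {index}")

    return problems


# Methyl ligand from Step 0: carbon 0 is the anchor, atoms 1-3 are hydrogens.
methyl_elements = ['C', 'H', 'H', 'H']
methyl_tags = {'anchor': [0], 'H_C': [1, 2, 3]}
print(check_ligand_tags(methyl_elements, methyl_tags))
```

An empty list means the ligand satisfies both checks. The same function can be called once per ligand before the library is assembled.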

Step 1: Define Central Atom and Coordination

Specify the central atom and the desired coordination number (number of ligands):

host_atom = 'Ir'           # Central metal atom (Iridium)
num_ligands = 3            # Coordination number
distance_to_host = 2.0     # Initial M-ligand distance in Angstroms

Step 2: Combinatorial Precursor Generation

The generator creates all unique ligand combinations using the precursor_generator function from the PrecursorGenerator module. The process involves:

Algorithm Features

  • Unique combinations: Uses permutation-invariant enumeration (e.g., ABC = ACB = BAC counted once)

  • Fibonacci sphere distribution: Evenly distributes ligands in 3D space around central atom

  • Overlap detection: Checks minimum interatomic distances (≥ 2.0 Å)

  • Anchor validation: Ensures anchor atoms (C in alkyl ligands) are closest to central metal

  • Automatic adjustment: Increases radius if overlaps detected (up to 1000 attempts)

  • Systematic naming: Each molecule named as “Ir_ligand1_ligand2_ligand3” (alphabetically sorted)
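The placement and retry logic listed above can be illustrated with a short, self-contained sketch. It is not the PrecursorGenerator implementation: it places only one point per ligand (the anchor position) rather than whole ligands, and the function names, the `radius_step` increment, and the use of the 2.0 Å threshold between anchor points are assumptions chosen for illustration.

```python
import math


def fibonacci_sphere(num_points, radius):
    """Return num_points roughly evenly spaced points on a sphere of the given radius."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(num_points):
        # z runs from near +1 to near -1 in equal steps (offset by half a step).
        z = 1.0 - (2.0 * i + 1.0) / num_points
        ring_radius = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        points.append((radius * ring_radius * math.cos(phi),
                       radius * ring_radius * math.sin(phi),
                       radius * z))
    return points


def min_distance(points):
    """Smallest distance between any two points."""
    return min(math.dist(a, b)
               for i, a in enumerate(points)
               for b in points[i + 1:])


def place_anchors(num_ligands, start_radius, min_separation=2.0,
                  radius_step=0.1, max_attempts=1000):
    """Grow the sphere radius until all anchor points are far enough apart."""
    radius = start_radius
    for _ in range(max_attempts):
        points = fibonacci_sphere(num_ligands, radius)
        if num_ligands < 2 or min_distance(points) >= min_separation:
            return radius, points
        radius += radius_step  # assumed increment; the module's value is not documented here
    raise RuntimeError("no overlap-free placement found")


radius, points = place_anchors(num_ligands=3, start_radius=2.0)
print(radius, points)
```

For three ligands at 2.0 Å the points are already well separated, so the starting radius is kept. With more ligands or a smaller starting radius, the loop enlarges the sphere until the separation criterion is met.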

Output

For the example with 3 different ligands (Methyl, Ethyl, Propyl) and coordination number 3, the generator creates 10 unique combinations:

  • 3 homoleptic: Ir(CH₃)₃, Ir(C₂H₅)₃, Ir(C₃H₇)₃

  • 7 heteroleptic: Ir(CH₃)₂(C₂H₅), Ir(CH₃)₂(C₃H₇), Ir(CH₃)(C₂H₅)₂, Ir(CH₃)(C₂H₅)(C₃H₇), Ir(CH₃)(C₃H₇)₂, Ir(C₂H₅)₂(C₃H₇), Ir(C₂H₅)(C₃H₇)₂

All generated molecules are saved to an output table in an HDF5 file for subsequent optimization.
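The count of 10 can be reproduced with the Python standard library. This sketch only enumerates ligand names and does not build any structures. The naming pattern follows the "Ir_ligand1_ligand2_ligand3" scheme described above, but the exact strings produced by the PrecursorGenerator module may differ.

```python
from itertools import combinations_with_replacement

ligand_names = ["Methyl", "Ethyl", "Propyl"]
host_atom = "Ir"
num_ligands = 3

# Sorting the names first makes every combination come out in alphabetical order,
# so permutations such as (Methyl, Ethyl) and (Ethyl, Methyl) collapse into one entry.
combos = list(combinations_with_replacement(sorted(ligand_names), num_ligands))
molecule_names = ["_".join((host_atom,) + combo) for combo in combos]

# A combination is homoleptic when it contains a single distinct ligand.
homoleptic = [name for name, combo in zip(molecule_names, combos)
              if len(set(combo)) == 1]
heteroleptic = [name for name in molecule_names if name not in homoleptic]

for name in molecule_names:
    print(name)
print(len(homoleptic), "homoleptic,", len(heteroleptic), "heteroleptic")
```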

Step 3: Calculator Selection

Before optimizing the generated precursor structures, a calculator must be assigned to compute energies and forces. Universal machine learning force fields offer a good balance between accuracy and computational speed, so they are used at this stage. A DFT calculator can be used instead for higher accuracy, at greater computational cost.

MACE Machine Learning Force Field

The example uses the TorchX_MACE_MP_0b3_medium potential, the medium-sized MACE-MP-0b3 (Materials Project) model:

# Set up MACE machine learning potential
potentialSet = TorchX_MACE_MP_0b3_medium(
    dtype='float32',
    enforceLTX=False
)
calculator = TremoloXCalculator(parameters=potentialSet)

# Assign calculator to configuration
configuration.setCalculator(calculator)

Calculator Parameters

  • Force field: TorchX_MACE_MP_0b3_medium

    • Trained on Materials Project database (diverse chemical space)

    • Handles metals, organics, and metal-organic systems

    • Medium-sized model balances accuracy and speed

  • Data type: float32

    • Faster computation than float64

    • Sufficient precision for geometry optimization

    • Reduces memory requirements

  • enforceLTX: False

    • Disables long-range electrostatics enforcement

    • Suitable for molecular systems (non-periodic)

    • Faster evaluation for isolated molecules

Alternative Calculator Options

Depending on system size, accuracy requirements, and available computational resources:

  • TorchX_MACE_MP_0b3_small: Faster, slightly lower accuracy

  • TorchX_MACE_MP_0b3_large: Higher accuracy, slower, more memory

  • Classical force fields: UFF, Dreiding for quick screening (less accurate for metal systems)

  • DFT calculators: For example with the B3LYP functional, for high-accuracy refinement (much slower, typically used after MACE optimization)

Why MACE for Precursor Molecules?

  • Metal-ligand bonding: Trained on transition metal compounds

  • Organic accuracy: Handles C-H, C-C bonds and alkyl chains accurately

  • Fast optimization: 1-3 minutes per molecule vs hours with DFT

  • No parameterization needed: Universal force field (no manual parameter fitting)

  • Good initial structures: Optimized geometries suitable for subsequent DFT refinement

Step 4: Molecular Geometry Optimization

Generated precursor molecules undergo geometry optimization using the assigned machine learning force field calculator. The optimization is performed iteratively for each generated molecule using a table iteration loop.

Optimization parameters:

  • Max steps: 1000 optimization steps per molecule

  • Trajectory interval: Save geometry every 100 steps

  • Convergence criteria: Default force tolerances (0.05 eV/Å)

The optimization refines:

  • Ir-C bond lengths: Optimal metal-ligand distances (~2.0-2.2 Å)

  • Bond angles: Coordination geometry (trigonal planar for 3 ligands)

  • Intra-ligand geometry: C-C and C-H bond lengths, angles, dihedrals

  • Overall symmetry: Natural molecular shape based on steric effects
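The interplay of the three optimization parameters (step limit, trajectory interval, force tolerance) can be seen in a toy example. The sketch below relaxes a single harmonic bond with steepest descent. It is not the optimizer used by QuantumATK, and the force constant, equilibrium length, and step size are made-up values for illustration only.

```python
def relax_bond(length, equilibrium=2.1, force_constant=10.0,
               max_forces=0.05, max_steps=1000, step_size=0.02,
               trajectory_interval=100):
    """Relax one harmonic bond; returns (length, steps_taken, converged, trajectory).

    Units are Angstrom for lengths and eV/Angstrom for forces.
    """
    trajectory = [length]
    for step in range(max_steps + 1):
        force = -force_constant * (length - equilibrium)
        # Converged once the force magnitude drops below the tolerance.
        if abs(force) <= max_forces:
            return length, step, True, trajectory
        if step == max_steps:
            break
        length += step_size * force  # steepest-descent update
        # Store a snapshot every trajectory_interval steps.
        if (step + 1) % trajectory_interval == 0:
            trajectory.append(length)
    return length, max_steps, False, trajectory


final_length, steps, converged, trajectory = relax_bond(2.0)
print(final_length, steps, converged)
```

The run stops as soon as the force criterion is met, so the step limit only matters for structures that fail to converge. Those are worth inspecting by hand.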

Step 5: Suggested Follow-up Analysis for Precursor Selection

After optimization, analyze the generated precursors for properties relevant to vapor deposition:

Geometric Properties

  • Molecular size: Overall diameter affects vaporization

  • Symmetry: High symmetry can indicate stability

  • Steric hindrance: Bulky ligands may prevent close packing
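As a simple size descriptor, the largest distance between any two atoms can be computed directly from the Cartesian coordinates. This is a minimal sketch using plain tuples. It measures distances between atomic centers only and ignores van der Waals radii. The function name is an illustrative choice, not part of QuantumATK.

```python
import math
from itertools import combinations


def molecular_diameter(coordinates):
    """Largest distance between any two atomic centers, in the input units."""
    return max(math.dist(a, b) for a, b in combinations(coordinates, 2))


# Methyl coordinates from Step 0, in Angstrom.
methyl = [(0.0, 0.0, 0.0),
          (1.09, 0.0, 0.0),
          (-0.363, 0.0, -1.028),
          (-0.363, -0.890, 0.514)]

print(f"{molecular_diameter(methyl):.2f} Angstrom")
```

Applied to the optimized precursors, the same function gives a quick way to rank candidates by overall extent.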

Electronic Properties (with DFT follow-up)

  • HOMO-LUMO gap: Affects reactivity and stability

  • Partial charges: Indicates bond polarity

  • Dipole moment: Affects molecular interactions

Stability Indicators

  • Total energy: Among isomers with the same composition, lower energy indicates the more stable structure

  • No imaginary frequencies: Confirms true minimum

  • Bond strain: Unusually short or long bonds can indicate instability

Customization Options

Distance Control

Adjust initial metal-ligand distance for different coordination environments:

distance_to_host = 1.8  # Shorter for small ligands
distance_to_host = 2.5  # Longer for bulky ligands

Ligand Types

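Ligand classes and their current support status:

  • Monodentate ligands: Single anchor atom (current implementation)

  • Bidentate ligands: Require two anchor atoms (future extension)

  • Neutral ligands: CO, PR₃, etc.

  • Anionic ligands: Alkyls, alkoxides, etc. Multidentate anions such as acac⁻ and Cp⁻ need more than one anchor atom and are therefore not covered by the current implementation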

Central Atom Options

Any element can be specified:

host_atom = 'Ti'   # Titanium precursors for TiO₂
host_atom = 'Hf'   # Hafnium precursors for HfO₂
host_atom = 'W'    # Tungsten precursors for W films
host_atom = 'Ir'   # Iridium precursors (example shown)

Coordination Numbers

Common coordination geometries:

num_ligands = 2    # Linear or bent
num_ligands = 3    # Trigonal planar
num_ligands = 4    # Tetrahedral or square planar
num_ligands = 5    # Trigonal bipyramidal
num_ligands = 6    # Octahedral

Running the Script

Using NanoLab Workflow (Recommended)

  1. Open Precursor_Generator_Example.hdf5 in NanoLab

  2. Prepare ligand library with anchor atom tags as shown in Step 0

  3. Configure the ligand generator block with:

    • host_atom: Central metal element (e.g., ‘Ir’)

    • num_ligands: Coordination number (e.g., 3)

    • distance_to_host: Initial distance (e.g., 2.0 Å)

  4. Run the workflow to generate and optimize all precursors

  5. Analyze results in the output tables

Using Python Script

For command-line execution:

atkpython Precursor_Generator_Example_results.py > output.log

The script will:

  • Generate all 10 unique Ir-alkyl combinations (for 3 ligands, coordination 3)

  • Save unoptimized structures to Precursor_Generator_Example_results.hdf5

  • Optimize each structure sequentially using MACE force field

  • Save optimized structures to Optimized_Molecules.hdf5

  • Log progress for each molecule optimization

Viewing Results

Open the HDF5 files in NanoLab to:

  • View the initial generated geometries (unoptimized)

  • View the optimized molecular structures

  • Compare homoleptic vs heteroleptic precursors

  • Analyze coordination geometry and bond lengths

Performance Considerations

Computational Scaling

  • Number of combinations: Grows combinatorially with ligand types and coordination number

    • 1 ligand type, coordination 3: 1 combination (homoleptic only)

    • 2 ligand types, coordination 3: 4 combinations (2 homoleptic + 2 heteroleptic)

    • 3 ligand types, coordination 3: 10 combinations (3 homoleptic + 7 heteroleptic)

    • 3 ligand types, coordination 4: 15 combinations

    • 4 ligand types, coordination 4: 35 combinations

  • Generation time: Typically < 1 second per molecule

  • Optimization time:

    • 1-3 minutes per molecule with MACE force field (small alkyl ligands)

    • 5-10 minutes for larger ligands or coordination numbers

    • Depends on initial geometry quality and force field accuracy
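The combination counts listed above follow the multiset formula C(n + k - 1, k), where n is the number of ligand types and k the coordination number. A short sketch that reproduces them (the function name is an illustrative choice):

```python
from math import comb


def count_combinations(num_ligand_types, coordination_number):
    """Number of unique ligand multisets: C(n + k - 1, k)."""
    return comb(num_ligand_types + coordination_number - 1, coordination_number)


for n, k in [(1, 3), (2, 3), (3, 3), (3, 4), (4, 4)]:
    total = count_combinations(n, k)
    homoleptic = n  # one homoleptic precursor per ligand type
    print(f"{n} ligand types, coordination {k}: {total} total, "
          f"{homoleptic} homoleptic, {total - homoleptic} heteroleptic")
```

Multiplying the total by the per-molecule optimization time gives a rough estimate of the sequential run time for a planned ligand library.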

Memory Requirements

  • Minimal for molecule generation (< 100 MB)

  • MACE force field: ~2-4 GB RAM per molecule during optimization

  • GPU acceleration recommended for MACE (faster by 5-10×)

Parallelization

The table iteration is sequential by default. For large numbers of combinations, consider updating the workflow with array table iteration for parallel execution across multiple CPU cores or compute nodes.

Best Practices

Ligand Preparation

  • Pre-optimize ligands before use (e.g., using MACE or DFT)

  • Tag anchor atoms correctly using addTags('anchor', [atom_index]) - critical for bonding

  • Use consistent orientation with anchor pointing outward from ligand center

  • For alkyl ligands, terminal carbon is typically the anchor

  • For O- or N-donor ligands (e.g., alkoxides, amides), use the donor atom as anchor

Parameter Selection

  • Start with distance_to_host around 2.0 Å for transition metals with alkyl ligands

  • Adjust based on ligand type:

    • Shorter (1.8-1.9 Å) for small ligands (H, methyl)

    • Longer (2.1-2.5 Å) for bulky ligands (propyl, isopropyl, tert-butyl)

    • Match experimental M-ligand bond lengths when available

  • Check initial structures for overlaps; the algorithm retries with a larger radius if any are detected

  • Coordination numbers > 4 may need larger initial distances

Validation

  • Verify optimized geometries visually in NanoLab

  • Check for unrealistic bond lengths or angles

  • Compare M-ligand distances with experimental data

  • Calculate vibrational frequencies to confirm minima (no imaginary modes)
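The bond-length check can be automated once the coordinates of the central atom and the anchor atoms have been extracted. The sketch below uses plain tuples and a hypothetical helper name. The default window of 2.0-2.2 Å is the Ir-C range quoted in Step 4 and should be replaced by values appropriate for other metal-ligand pairs. The coordinates are invented for illustration and do not come from an actual optimization.

```python
import math


def flag_bond_lengths(center, anchors, lower=2.0, upper=2.2):
    """Return (index, length) for anchors whose distance to center is outside [lower, upper]."""
    flagged = []
    for index, position in enumerate(anchors):
        length = math.dist(center, position)
        if not lower <= length <= upper:
            flagged.append((index, round(length, 3)))
    return flagged


# Hypothetical coordinates in Angstrom, not taken from a real optimization.
center = (0.0, 0.0, 0.0)
anchors = [(2.1, 0.0, 0.0), (0.0, 2.05, 0.0), (0.0, 0.0, 2.6)]
print(flag_bond_lengths(center, anchors))
```

Here the third anchor lies 2.6 Å from the center and is reported, which would prompt a visual check of that structure in NanoLab.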

DFT Refinement (Optional)

For critical precursors, perform DFT calculations:

  1. Use optimized MACE structure as starting point

  2. Apply DFT with appropriate functional (e.g., B3LYP)

  3. Re-optimize geometry with tighter convergence

  4. Calculate thermochemical properties (enthalpy, entropy, free energy)

  5. Evaluate thermal stability and decomposition pathways

Summary

The Precursor Generator application provides an automated workflow for creating precursor molecules used in chemical vapor deposition and related thin film processes. Key capabilities demonstrated in this Ir-alkyl example include:

  • Combinatorial generation of all 10 unique combinations from 3 alkyl ligands

  • Intelligent geometry construction with Fibonacci sphere ligand placement and overlap avoidance

  • High-throughput optimization using MACE machine learning force fields

  • Systematic exploration of homoleptic (Ir(CH₃)₃, Ir(C₂H₅)₃, Ir(C₃H₇)₃) and heteroleptic precursors

This tool accelerates precursor design for CVD/ALD applications by:

  • Eliminating manual construction: No need to build each molecule by hand

  • Ensuring completeness: All mathematically unique combinations are generated

  • Providing optimized geometries: Ready for further DFT refinement or property calculations

  • Enabling rapid screening: Quickly evaluate many precursor candidates

Demonstrated Example: Iridium Alkyl Precursors

The example generates and optimizes 10 Ir-alkyl precursors:

  • 3 homoleptic: Ir(Methyl)₃, Ir(Ethyl)₃, Ir(Propyl)₃

  • 7 heteroleptic: Mixed methyl/ethyl/propyl combinations

These can be used to study:

  • Steric effects: How ligand size affects molecular structure

  • Deposition selectivity: Which precursors favor specific reaction pathways

  • Thermal stability: Comparative decomposition energies

  • Volatility: Molecular size/mass effects on vapor pressure

The automated workflow enables materials scientists to systematically explore precursor chemical space and identify optimal candidates for experimental synthesis and deposition testing.