Automated Precursor Molecule Generation for Vapor Deposition¶
This application demonstrates the automated generation and optimization of precursor molecules for vapor deposition processes such as chemical vapor deposition (CVD) and atomic layer deposition (ALD). The method is general and can construct both homoleptic (identical ligands) and heteroleptic (mixed ligands) precursor structures by systematically attaching ligands around a central atom. It is not limited to metal-organic molecules and can be applied to a wide range of central atoms and ligand types.
Important
QuantumATK Version: This application is designed for QuantumATK X-2025.06.
This application requires the PrecursorGenerator module for automated molecular construction.
The module is provided as a compiled Python file (.pye) and must be available in your
Python path. You can download the required files below:
Precursor_Generator_Example_results.py - Complete QuantumATK script
PrecursorGenerator.pye - Compiled precursor generation module
Precursor_Generator_Example.hdf5 - NanoLab workflow
Key Features
Combinatorial generation: Automatically creates all unique ligand combinations
Homoleptic and heteroleptic: Generates precursors with identical or mixed ligands
Geometric optimization: Smart placement with non-overlapping geometry constraints
Anchor atom recognition: Identifies ligand attachment points via atom tagging
Moment of inertia alignment: Orients ligands naturally around the central atom
Automated optimization: Uses machine learning force fields for structure relaxation
System Overview¶
Precursor molecules consist of a central atom (metal or non-metal) surrounded by organic or inorganic ligands. These molecules are used in thin film deposition where:
Central atom: The element to be deposited (e.g., Ir, Si)
Ligands: Organic or inorganic groups that stabilize the central atom and control reactivity
Anchor atoms: Specific atoms in each ligand (e.g., C, O, N) that bond to the central atom
Example System: Iridium Alkyl Precursors
This example demonstrates Ir-alkyl precursors with three different alkyl ligands:
Methyl (-CH₃): Smallest alkyl ligand
Ethyl (-C₂H₅): Medium-sized alkyl ligand
Propyl (-C₃H₇): Larger alkyl ligand with extended chain
Types of Precursors Generated
Homoleptic: All ligands are identical
Heteroleptic: Mixed ligands
The tool generates all mathematically unique combinations for the central Ir atom with coordination number 3, avoiding permutational duplicates.
Simulation Workflow¶
Step 0: Ligand Library Preparation¶
Each ligand must be prepared as a separate molecular configuration with an anchor atom tagged to indicate the attachment point. In this example, three alkyl ligands are defined:
Methyl Ligand (-CH₃)
# Define elements
elements = [Carbon, Hydrogen, Hydrogen, Hydrogen]
# Define coordinates
cartesian_coordinates = [[ 0.0,    0.0,    0.0  ],
                         [ 1.09,   0.0,    0.0  ],
                         [-0.363,  0.0,   -1.028],
                         [-0.363, -0.890,  0.514]] * Angstrom
configuration_0 = MoleculeConfiguration(
    elements=elements,
    cartesian_coordinates=cartesian_coordinates
)
# Tag atoms: carbon is anchor, hydrogens are ligand structure
configuration_0.addTags('H_C', [1, 2, 3])
configuration_0.addTags('anchor', [0])
configuration_name_0 = "Methyl"
Ethyl Ligand (-C₂H₅)
# Define elements
elements = [Carbon, Carbon, Hydrogen, Hydrogen, Hydrogen, Hydrogen, Hydrogen]
# Define coordinates
cartesian_coordinates = [[ 0.0,    0.0,    0.0  ],
                         [ 1.513,  0.0,    0.0  ],
                         [-0.363,  0.0,   -1.028],
                         [-0.363, -0.890,  0.514],
                         [-0.363,  0.890,  0.514],
                         [ 1.876,  0.890, -0.514],
                         [ 1.876, -0.890, -0.514]] * Angstrom
configuration_2 = MoleculeConfiguration(
    elements=elements,
    cartesian_coordinates=cartesian_coordinates
)
# Tag the terminal carbon as anchor
configuration_2.addTags('H_C', [1, 2, 3, 4, 5, 6])
configuration_2.addTags('anchor', [1])
configuration_name_2 = "Ethyl"
Propyl Ligand (-C₃H₇)
# Define elements
elements = [Carbon, Carbon, Hydrogen, Hydrogen, Hydrogen, Hydrogen,
            Carbon, Hydrogen, Hydrogen, Hydrogen]
# Define coordinates (extended chain)
cartesian_coordinates = [[ 0.0,    0.0,    0.0  ],
                         [ 1.513,  0.0,    0.0  ],
                         [-0.363,  0.0,   -1.028],
                         [-0.363, -0.890,  0.514],
                         [-0.363,  0.890,  0.514],
                         [ 1.876,  0.890, -0.514],
                         [ 2.017,  0.0,    1.426],
                         [ 1.876, -0.890, -0.514],
                         [ 3.107,  0.0,    1.426],
                         [ 1.654, -0.890,  1.940]] * Angstrom
configuration_1 = MoleculeConfiguration(
    elements=elements,
    cartesian_coordinates=cartesian_coordinates
)
# Tag the terminal carbon as anchor
configuration_1.addTags('H_C', [1, 2, 3, 4, 5, 6, 7, 8, 9])
configuration_1.addTags('anchor', [6])
configuration_name_1 = "Propyl"
Key parameters for ligand library:
Each ligand configuration must have exactly one atom tagged as 'anchor'
Anchor atoms (typically C for alkyl, O for alkoxy, N for amine ligands) bond to the central atom
Additional tags like 'H_C' are optional and help identify ligand structure
All ligands are stored in a Table for systematic processing
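The "exactly one anchor" rule can be checked before the generator runs. The sketch below is a minimal pure-Python illustration: it represents each ligand's tags as a plain dictionary, which is a hypothetical stand-in for the tags stored on a MoleculeConfiguration and not the QuantumATK tag API. The indices are copied from the three ligand definitions above, and the helper name anchor_index is an assumption.

```python
# Hypothetical stand-in: tag name -> atom indices, copied from the ligand
# definitions above. This is NOT the QuantumATK tag API.
ligand_tags = {
    "Methyl": {"H_C": [1, 2, 3], "anchor": [0]},
    "Ethyl": {"H_C": [1, 2, 3, 4, 5, 6], "anchor": [1]},
    "Propyl": {"H_C": [1, 2, 3, 4, 5, 6, 7, 8, 9], "anchor": [6]},
}


def anchor_index(name, tags):
    """Return the single anchor index, or raise if the rule is violated."""
    anchors = tags.get("anchor", [])
    if len(anchors) != 1:
        raise ValueError(
            f"{name}: expected exactly one 'anchor' atom, found {len(anchors)}"
        )
    return anchors[0]


# Validate every ligand in the library and collect the anchor indices.
anchors = {name: anchor_index(name, tags) for name, tags in ligand_tags.items()}
print(anchors)
```

A ligand with no anchor tag, or with two, raises a ValueError naming the offending ligand, which is easier to diagnose than a misplaced ligand in the generated structure.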
Step 1: Define Central Atom and Coordination¶
Specify the central atom and the desired coordination number (number of ligands):
host_atom = 'Ir' # Central metal atom (Iridium)
num_ligands = 3 # Coordination number
distance_to_host = 2.0 # Initial M-ligand distance in Angstroms
Step 2: Combinatorial Precursor Generation¶
The generator creates all unique ligand combinations using the precursor_generator function from the PrecursorGenerator module. The process involves:
Algorithm Features
Unique combinations: Uses permutation-invariant enumeration (e.g., ABC = ACB = BAC counted once)
Fibonacci sphere distribution: Evenly distributes ligands in 3D space around central atom
Overlap detection: Checks minimum interatomic distances (≥ 2.0 Å)
Anchor validation: Ensures anchor atoms (C in alkyl ligands) are closest to central metal
Automatic adjustment: Increases radius if overlaps detected (up to 1000 attempts)
Systematic naming: Each molecule named as “Ir_ligand1_ligand2_ligand3” (alphabetically sorted)
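The placement and retry logic listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the PrecursorGenerator implementation: it places only one point per ligand (the anchor position), and the function names, the 0.1 Å radius increment, and the half-step offset in the Fibonacci lattice are all choices made for this sketch. The 2.0 Å minimum distance and the 1000-attempt limit are taken from the list above.

```python
import math


def fibonacci_sphere(n, radius=1.0):
    """Return n roughly evenly spaced points on a sphere of the given radius."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # z runs from near +1 to near -1 in equal steps (half-step offset).
        z = 1.0 - (2.0 * i + 1.0) / n
        r_xy = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((radius * r_xy * math.cos(theta),
                       radius * r_xy * math.sin(theta),
                       radius * z))
    return points


def min_pair_distance(points):
    """Smallest distance between any two points."""
    return min(math.dist(p, q)
               for i, p in enumerate(points)
               for q in points[i + 1:])


def place_anchors(n, start_radius=2.0, min_distance=2.0,
                  step=0.1, max_attempts=1000):
    """Grow the sphere radius until all points are at least min_distance apart."""
    radius = start_radius
    for _ in range(max_attempts):
        points = fibonacci_sphere(n, radius)
        if n < 2 or min_pair_distance(points) >= min_distance:
            return radius, points
        radius += step  # assumed increment; the real module may differ
    raise RuntimeError("no overlap-free placement found")


radius, points = place_anchors(3)
print(f"radius used: {radius:.1f} Angstrom")
for p in points:
    print(tuple(round(c, 3) for c in p))
```

For three ligands at 2.0 Å the anchor points are already more than 2.0 Å apart, so no radius increase is needed. Starting six ligands at 1.0 Å forces the loop to expand the sphere before the distance check passes.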
Output
For the example with 3 different ligands (Methyl, Ethyl, Propyl) and coordination number 3, the generator creates 10 unique combinations:
3 homoleptic: Ir(CH₃)₃, Ir(C₂H₅)₃, Ir(C₃H₇)₃
7 heteroleptic: Ir(CH₃)₂(C₂H₅), Ir(CH₃)₂(C₃H₇), Ir(CH₃)(C₂H₅)₂, Ir(CH₃)(C₂H₅)(C₃H₇), Ir(CH₃)(C₃H₇)₂, Ir(C₂H₅)₂(C₃H₇), Ir(C₂H₅)(C₃H₇)₂
All generated molecules are saved to an output table in an HDF5 file for subsequent optimization.
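The count of 10 can be reproduced with the standard library. The sketch below is an independent illustration of permutation-invariant enumeration using itertools.combinations_with_replacement; it does not call the PrecursorGenerator module, and the variable names are assumptions. The name format follows the alphabetically sorted "Ir_ligand1_ligand2_ligand3" convention described above.

```python
from itertools import combinations_with_replacement

host_atom = "Ir"
ligands = ["Methyl", "Ethyl", "Propyl"]
num_ligands = 3

# Sorting the ligand names first makes every combination come out in
# alphabetical order, so each multiset of ligands appears exactly once.
combos = list(combinations_with_replacement(sorted(ligands), num_ligands))

names = ["_".join((host_atom,) + combo) for combo in combos]
homoleptic = [c for c in combos if len(set(c)) == 1]
heteroleptic = [c for c in combos if len(set(c)) > 1]

for name in names:
    print(name)
print(len(homoleptic), "homoleptic,", len(heteroleptic), "heteroleptic")
```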
Step 3: Calculator Selection¶
Before optimizing the generated precursor structures, a calculator must be assigned to compute energies and forces. Universal machine learning force fields offer a good balance between accuracy and computational speed, so this example uses one at this stage. A DFT calculator can be used instead for higher accuracy, at greater computational cost.
MACE Machine Learning Force Field
The example uses the medium TorchX MACE-MP-0b3 model (TorchX_MACE_MP_0b3_medium), trained on Materials Project data:
# Set up MACE machine learning potential
potentialSet = TorchX_MACE_MP_0b3_medium(
    dtype='float32',
    enforceLTX=False
)
calculator = TremoloXCalculator(parameters=potentialSet)
# Assign calculator to configuration
configuration.setCalculator(calculator)
Calculator Parameters
Force field: TorchX_MACE_MP_0b3_medium
Trained on Materials Project database (diverse chemical space)
Handles metals, organics, and metal-organic systems
Medium-sized model balances accuracy and speed
Data type: float32
Faster computation than float64
Sufficient precision for geometry optimization
Reduces memory requirements
enforceLTX: False
Disables long-range electrostatics enforcement
Suitable for molecular systems (non-periodic)
Faster evaluation for isolated molecules
Alternative Calculator Options
Depending on system size, accuracy requirements, and available computational resources:
TorchX_MACE_MP_0b3_small: Faster, slightly lower accuracy
TorchX_MACE_MP_0b3_large: Higher accuracy, slower, more memory
Classical force fields: UFF, Dreiding for quick screening (less accurate for metal systems)
DFT calculators: B3LYP for high-accuracy refinement (much slower, typically used after MACE optimization)
Why MACE for Precursor Molecules?
Metal-ligand bonding: Trained on transition metal compounds
Organic accuracy: Handles C-H, C-C bonds and alkyl chains accurately
Fast optimization: 1-3 minutes per molecule vs hours with DFT
No parameterization needed: Universal force field (no manual parameter fitting)
Good initial structures: Optimized geometries suitable for subsequent DFT refinement
Step 4: Molecular Geometry Optimization¶
Generated precursor molecules undergo geometry optimization using the assigned machine learning force field calculator. The optimization is performed iteratively for each generated molecule using a table iteration loop.
Optimization parameters:
Max steps: 1000 optimization steps per molecule
Trajectory interval: Save geometry every 100 steps
Convergence criteria: Default force tolerances (0.05 eV/Å)
The optimization refines:
Ir-C bond lengths: Optimal metal-ligand distances (~2.0-2.2 Å)
Bond angles: Coordination geometry (trigonal planar for 3 ligands)
Intra-ligand geometry: C-C and C-H bond lengths, angles, dihedrals
Overall symmetry: Natural molecular shape based on steric effects
Step 5: Suggested Follow-Up Analysis for Precursor Selection¶
After optimization, analyze the generated precursors for properties relevant to vapor deposition:
Geometric Properties
Molecular size: Overall diameter affects vaporization
Symmetry: High symmetry can indicate stability
Steric hindrance: Bulky ligands may prevent close packing
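One simple size measure is the largest distance between any two atomic centers. The sketch below computes it for the unoptimized methyl coordinates from Step 0 as a stand-in; in practice the optimized precursor coordinates would be used. The function name max_extent is an assumption, and the measure ignores atomic radii, so it underestimates the true molecular diameter.

```python
import math
from itertools import combinations

# Methyl coordinates in Angstrom, taken from Step 0 (unoptimized geometry).
coords = [
    (0.0, 0.0, 0.0),
    (1.09, 0.0, 0.0),
    (-0.363, 0.0, -1.028),
    (-0.363, -0.890, 0.514),
]


def max_extent(points):
    """Largest distance between any two atomic centers."""
    return max(math.dist(p, q) for p, q in combinations(points, 2))


extent = max_extent(coords)
print(f"Largest interatomic distance: {extent:.2f} Angstrom")
```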
Electronic Properties (with DFT follow-up)
HOMO-LUMO gap: Affects reactivity and stability
Partial charges: Indicates bond polarity
Dipole moment: Affects molecular interactions
Stability Indicators
Total energy: Lower energy indicates more stable isomers
No imaginary frequencies: Confirms true minimum
Bond strain: Unusually short or long bonds can indicate strain or instability
Customization Options¶
Distance Control¶
Adjust initial metal-ligand distance for different coordination environments:
distance_to_host = 1.8 # Shorter for small ligands
distance_to_host = 2.5 # Longer for bulky ligands
Ligand Types¶
Ligand classes and their current support:
Monodentate ligands: Single anchor atom (current implementation)
Bidentate ligands: Require two anchor atoms (not yet supported; future extension)
Neutral ligands: CO, PR₃, etc.
Anionic ligands: acac⁻, Cp⁻, alkoxides, etc.
Central Atom Options¶
Any element can be specified:
host_atom = 'Ti' # Titanium precursors for TiO₂
host_atom = 'Hf' # Hafnium precursors for HfO₂
host_atom = 'W' # Tungsten precursors for W films
host_atom = 'Ir' # Iridium precursors (example shown)
Coordination Numbers¶
Common coordination geometries:
num_ligands = 2 # Linear or bent
num_ligands = 3 # Trigonal planar
num_ligands = 4 # Tetrahedral or square planar
num_ligands = 5 # Trigonal bipyramidal
num_ligands = 6 # Octahedral
Running the Script¶
Using NanoLab Workflow (Recommended)
Open Precursor_Generator_Example.hdf5 in NanoLab
Prepare ligand library with anchor atom tags as shown in Step 0
Configure the ligand generator block with:
host_atom: Central metal element (e.g., 'Ir')
num_ligands: Coordination number (e.g., 3)
distance_to_host: Initial distance (e.g., 2.0 Å)
Run the workflow to generate and optimize all precursors
Analyze results in the output tables
Using Python Script
For command-line execution:
atkpython Precursor_Generator_Example_results.py > output.log
The script will:
Generate all 10 unique Ir-alkyl combinations (for 3 ligands, coordination 3)
Save unoptimized structures to Precursor_Generator_Example_results.hdf5
Optimize each structure sequentially using MACE force field
Save optimized structures to Optimized_Molecules.hdf5
Log progress for each molecule optimization
Viewing Results
Open the HDF5 files in NanoLab to visualize:
Initial generated geometries (unoptimized)
Optimized molecular structures
Compare homoleptic vs heteroleptic precursors
Analyze coordination geometry and bond lengths
Performance Considerations¶
Computational Scaling
Number of combinations: Grows combinatorially with ligand types and coordination number
1 ligand type, coordination 3: 1 combination (homoleptic only)
2 ligand types, coordination 3: 4 combinations (2 homoleptic + 2 heteroleptic)
3 ligand types, coordination 3: 10 combinations (3 homoleptic + 7 heteroleptic)
3 ligand types, coordination 4: 15 combinations
4 ligand types, coordination 4: 35 combinations
Generation time: Typically < 1 second per molecule
Optimization time:
1-3 minutes per molecule with MACE force field (small alkyl ligands)
5-10 minutes for larger ligands or coordination numbers
Depends on initial geometry quality and force field accuracy
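The combination counts above follow the multiset formula C(n + k - 1, k) for n ligand types and coordination number k, and exactly n of them are homoleptic. The sketch below reproduces the listed counts and turns the quoted 1-3 minutes per molecule into a rough sequential runtime range; the function name is an assumption, and the timing is only the document's estimate for small alkyl ligands.

```python
from math import comb


def count_combinations(num_ligand_types, coordination_number):
    """Number of multisets of size k drawn from n types: C(n + k - 1, k)."""
    return comb(num_ligand_types + coordination_number - 1, coordination_number)


for n, k in [(1, 3), (2, 3), (3, 3), (3, 4), (4, 4)]:
    total = count_combinations(n, k)
    # Rough sequential estimate from the 1-3 minutes per molecule quoted above.
    print(f"{n} ligand types, coordination {k}: {total} combinations "
          f"({n} homoleptic + {total - n} heteroleptic), "
          f"about {total * 1}-{total * 3} min sequentially")
```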
Memory Requirements
Minimal for molecule generation (< 100 MB)
MACE force field: ~2-4 GB RAM per molecule during optimization
GPU acceleration recommended for MACE (faster by 5-10×)
Parallelization
The table iteration is sequential by default. For large numbers of combinations, consider updating the workflow with array table iteration for parallel execution across multiple CPU cores or compute nodes.
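How the molecules might be divided among parallel tasks can be sketched independently of the workflow tooling. The example below is a generic round-robin split in plain Python, not the NanoLab array table iteration itself; the function name and the choice of four workers are assumptions.

```python
def split_round_robin(items, num_workers):
    """Assign items to workers in round-robin order; one list per worker."""
    return [items[i::num_workers] for i in range(num_workers)]


molecule_indices = list(range(10))  # the 10 Ir-alkyl combinations
chunks = split_round_robin(molecule_indices, 4)
for worker, chunk in enumerate(chunks):
    print(f"worker {worker}: molecules {chunk}")
```

Round-robin assignment keeps the chunk sizes within one molecule of each other, which helps when per-molecule optimization times are similar.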
Best Practices¶
Ligand Preparation
Pre-optimize ligands before use (e.g., using MACE or DFT)
Tag anchor atoms correctly using addTags('anchor', [atom_index]) - critical for bonding
Use consistent orientation with anchor pointing outward from ligand center
For alkyl ligands, terminal carbon is typically the anchor
For O- or N-donor ligands (e.g., alkoxides, amides), use the donor atom as anchor
Parameter Selection
Start with distance_to_host around 2.0 Å for transition metals with alkyl ligands
Adjust based on ligand type:
Shorter (1.8-1.9 Å) for small ligands (H, methyl)
Longer (2.1-2.5 Å) for bulky ligands (propyl, isopropyl, tert-butyl)
Match experimental M-ligand bond lengths when available
Check for overlaps in initial structures - algorithm retries with larger radius if needed
For coordination numbers > 4, may need larger initial distances
Validation
Verify optimized geometries visually in NanoLab
Check for unrealistic bond lengths or angles
Compare M-ligand distances with experimental data
Calculate vibrational frequencies to confirm minima (no imaginary modes)
DFT Refinement (Optional)
For critical precursors, perform DFT calculations:
Use optimized MACE structure as starting point
Apply DFT with appropriate functional (e.g., B3LYP)
Re-optimize geometry with tighter convergence
Calculate thermochemical properties (enthalpy, entropy, free energy)
Evaluate thermal stability and decomposition pathways
Summary¶
The Precursor Generator application provides an automated workflow for creating precursor molecules used in chemical vapor deposition and related thin film processes. Key capabilities demonstrated in this Ir-alkyl example include:
Combinatorial generation of all 10 unique combinations from 3 alkyl ligands
Intelligent geometry construction with Fibonacci sphere ligand placement and overlap avoidance
High-throughput optimization using MACE machine learning force fields
Systematic exploration of homoleptic (Ir(CH₃)₃, Ir(C₂H₅)₃, Ir(C₃H₇)₃) and heteroleptic precursors
This tool accelerates precursor design for CVD/ALD applications by:
Eliminating manual construction: No need to build each molecule by hand
Ensuring completeness: All mathematically unique combinations are generated
Providing optimized geometries: Ready for further DFT refinement or property calculations
Enabling rapid screening: Quickly evaluate many precursor candidates
Demonstrated Example: Iridium Alkyl Precursors
The example generates and optimizes 10 Ir-alkyl precursors:
3 homoleptic: Ir(Methyl)₃, Ir(Ethyl)₃, Ir(Propyl)₃
7 heteroleptic: Mixed methyl/ethyl/propyl combinations
These can be used to study:
Steric effects: How ligand size affects molecular structure
Deposition selectivity: Which precursors favor specific reaction pathways
Thermal stability: Comparative decomposition energies
Volatility: Molecular size/mass effects on vapor pressure
The automated workflow enables materials scientists to systematically explore precursor chemical space and identify optimal candidates for experimental synthesis and deposition testing.