CrystalStructurePrediction(initial_population, number_of_generations, selection_pressure=2.0, number_of_elites=None, number_to_promote=1, heredity_probability=0.6, permutation_probability=0.0, mutation_probability=0.4, sigma_lattice=0.7, max_forces=PhysicalQuantity(0.01, eV/Ang), max_stress=PhysicalQuantity(0.1, GPa), max_steps=1000, max_step_length=PhysicalQuantity(0.2, Ang), external_pressure=None, rng=None, write_population=True, population_format=None, log_filename_prefix='generation_', permutation_operator=<class 'NL.Dynamics.CrystalStructurePrediction.GeneticOperators.PermutationOperator'>, heredity_operator=<class 'NL.Dynamics.CrystalStructurePrediction.GeneticOperators.HeredityOperator'>, mutation_operator=<class 'NL.Dynamics.CrystalStructurePrediction.GeneticOperators.MutationOperator'>, kpoint_density=None, fitness_function=None, fitness_tolerance=None)¶
Function to perform a crystal sturcture search by applying a genetic algorithm on a given initial population.
- initial_population (list of individuals) – A list of initial BulkConfigurations
- number_of_generations (int) – The number of times the population should be evolved.
- selection_pressure (float) – The pressure applied during the selection of the parents from the population. The higher the pressure the less likely it is to pick an individual with lower fitness. Default: 2.0
- number_of_elites (int) – Only consider the
number_of_elitesfittest individuals when making the selection. Default: Consider all individuals during selection.
- number_to_promote (int) – The number of most fittest individuals to promote to the next generation. Default: 1
- heredity_probability (float) – The unnormalized probability of using the heredity operator. Default: 0.6
- permutation_probability (float) – The unnormalized probability of using the permutation operator. The permutation operator will only be used when there is more than one type of an element in the system. Default: 0.0
- mutation_probability (float) – The probability of using the mutation operator. Default: 0.4
- sigma_lattice (float) – The standard deviation of the random strain applied by the mutation operator. Default: 0.7
- max_forces (PhysicalQuantity in units of force) – The maximum forces determining when the optimization should stop. Default: 0.01*eV/Ang
- max_stress (PhysicalQuantity in units of pressure) – The maximum stress determining when the optimization should stop. Default: 0.1*GPa
- max_steps (uint) – The maximum number of steps for local optimization. Default: 1000
- max_step_length (PhysicalQuantity in units of length) – The maximum step length the local optimizer may take. Default: 0.2*Ang
- external_pressure (PhysicalQuantity in units of pressure) – The target external pressure of the system. Default: 0*GPa
- rng (numpy.random.RandomState) – The random number generated used to generate the random number stream. Default: A new random number generator (seeded from OS entropy pool)
- write_population (bool) – Enables saving the configuration of each individual configuration during the crystal structure prediction. Default: True
- population_format (str) – Sets the storage format for populations. Default: hdf5
- log_filename_prefix (str) – The filename prefix to use for logging the individual optimizations.
- kpoint_density (PhysicalQuantity of type length | None) – The target k-point sampling density. If None, a fixed k-point grid is used. Default: Fixed k-point grid
- fitness_function (callable) – By default, the fitness is defined as the negative enthalpy of the system. This
function allows for the fitness to be customized. The
fitness_functionshould accept two arguments, “configuration” and “enthalpy” , and return the fitness (larger values are more fit). The fitness should be given as a floating point number (without units). The configuration argument is the geometry optimized individual to evaluate the fitness for.
- fitness_tolerance (float) – Configurations whose fitness values differ by more than this amount are considered difference structures. This is used as part of the niching step, which removes duplicate configurations from the population. Default: 0.02
Evolves the current population for
Parameters: number_of_generations (int) – The number of times the genetic algorithm is applied to the population.
Returns: Return the fitnesses of the current population sorted by descending fitness. Return type: list of type float
Returns: Return all configurations in the population sorted by descending fitness. Return type: list of BulkConfigurations
Return the international symbol for each individual of the current population sorted by descending fitness.
Returns: a list of international symbols Return type: list of type str
Determine the global minimum energy crystal structure for TiO2.
# ------------------------------------------------------------- # Initial Population # ------------------------------------------------------------- initial_population = generateInitialPopulation( elements=elements, population_size=10, calculator=calculator, volume=206.29*Angstrom**3, max_forces=0.01*eV/Angstrom, max_stress=0.1*GPa, max_steps=1000, max_step_length=0.2*Angstrom, external_pressure=0.0*GPa, log_filename_prefix="initial_population_", ) # ------------------------------------------------------------- # Crystal Structure Prediction # ------------------------------------------------------------- global_optimization = CrystalStructurePrediction( initial_population, number_of_generations=16, heredity_probability=0.85, mutation_probability=0.1, permutation_probability=0.05, number_to_promote=2, number_of_elites=int(0.6*population_size), selection_pressure=3.0, max_forces=0.005 * eV/Angstrom, )
Find the crystal structure that most closely matches a target band gap, that is also stable (low enthalpy).
# Define custom fitness function. def fitnessFunction(configuration, enthalpy): # Calculate the band gap. band_gap = Bandstructure( configuration=configuration, route=['G','G'], points_per_segment=3, ).indirectBandGap() # Target band gap. target_band_gap = 1.34 * eV # High fitness is low enthalpy and a band gap near the target. A parameter # alpha (ranged from zero to one) controls the relative importance of # enthalpy vs band gap error. alpha = 0.5 fitness = -alpha*enthalpy - (1 - alpha) * abs(band_gap - target_band_gap) # The fitness should not have units. return fitness.inUnitsOf(eV) global_optimization = CrystalStructurePrediction( initial_population, number_of_generations=16, heredity_probability=0.85, mutation_probability=0.1, permutation_probability=0.05, number_to_promote=2, number_of_elites=int(0.6*len(initial_population)), selection_pressure=3.0, max_forces=0.01 * eV/Angstrom, fitness_function=fitnessFunction, )
The goal of crystal structure prediction is to explore the state space of a compound and locate the structure with the maximum fitness for a given chemical composition. In QuantumATK, the default fitness function is set to the negative enthalpy, i.e. the algorithm converges to the structure with the minimal enthalpy (or energy if no external pressure is applied). The primary input to the calculation is the chemical formula, which determines the number of atoms in the unit cell. It is important that enough atoms are present in the unit cell to represent the crystal structure. Just because the correct stoichiometric ratio is used, does not ensure that it is possible to represent the minimum energy structure.
Custom Fitness Function¶
The default fitness function is the negative enthalpy. A custom fitness
function may be provided using the
fitness_function argument. The function
takes two arguments: the minimized configuration and the enthalpy. It should
return the fitness of the configuration (larger fitness means more favorable
Custom fitness functions can be used to find structures that yield desired material properties or match known experimental values. Care must be taken to ensure that the resulting structures are physically plausible. This means that the enthalpy should be included in the fitness. Practically, this means that the fitness should be a weighted sum of the desired physical properties and the enthalpy. Otherwise, the evolutionary algorithm will find very high energy structures that are not stable or possible to synthesize.
The crystal structure prediction algorithm in QuantumATK uses a genetic algorithm. The genetic algorithm begins with an initial population of structures provided by the user. The initial population can be generated using the generateInitialPopulation function. Each step of the algorithm involves generating the next population of structures from the previous one. This evolution process is done in the following manner:
- Promote the
number_to_promotefittest individuals to the next generation.
- Loop until there are
population_sizeindividuals in the new generation:
- Pick which genetic operation should be performed. Each operation is given a user defined probability of being selected.
- Select one (mutation and permutation operator) or two parent structures (heredity operator). The selection probability is weighted by the fitness of the parental structures.
- Execute the genetic operation and create a new structure.
- Perform a local optimization of each individual of the population.
There are four types of genetic operators available:
- Promotion: this operation is always performed at the beginning of the
evolution process to ensure that the best
number_to_promoteindividuals are kept.
- Mutation: perturb the lattice vectors and atomic positions of the chosen configuration randomly.
- Permutation: swap the position of atoms of differing elements.
- Heredity: combine two parental structures to a new one. A random cut plane is used to divide the atoms of the two parent structures; all atoms below the plane in one structure are combined with the atoms above the plane in the other structure to create a new “child” structure. Care is taken to ensure that the total number of each element is not changed.
The algorithm stops after evolving the population
times. Given the stochastic nature of the algorithm, this stopping criterion
does not ensure that one has found the global solution. The stability of the
search should be checked at least with respect to the chosen seed of the pseudo
random number generator (
After generating a new population, each individual of the population is away
from a local minimum in enthalpy, hence a geometry optimization must be
performed to converge to a close local minimum. Since the global minimum is
also a local minimum, one is guaranteed to find it for sufficiently large
number_of_generations. These geometry (local) optimizations
comprise nearly all of the computational work during crystal structure
One key difference between a genetic algorithm and a random search is that
the selection of the parent structure(s), on which the genetic operations
are performed, is biased by the fitness of that structure. The higher the
fitness (i.e. the lower the calculated enthalpy in this implementation),
the more likely it is to pick this structure. The skewness of the
probability distribution is altered by
selection_pressure. A large
selection_pressure makes it less likely to select a
configuration with a lower fitness.
A crystal structure prediction run can be restarted / resumed from a previous point in the following manner:
CSP = CrystalStructurePrediction( ... ) # Save the current state of the CSP object. nlsave('my_file.hdf5', CSP, object_id='my_run') ... # Read back a stored CSP object. The last population is restored # and all given properties / attributes are set up. CSP = nlread('my_file.hdf5', object_id='my_run') # Run 10 more iterations of the genetic algorithm. CSP.evolvePopulation(10) # Run 5 more iterations of the genetic algorithm. CSP.evolvePopulation(5)
At this time it is not possible to resume runs that use a custom fitness function.
True then after each step of the genetic
algorithm the current population will be written to disk. The filename will
log_filename_prefix and include the current generation number.
For example, the initial population will be saved as
after the first step a new file
generation_1.hdf5 will be saved. The
crystal structures contained in these files can be viewed in QuantumATK using