Heuristics-Guided Exploration of Reaction Mechanisms
For the investigation of chemical reaction networks, the efficient and accurate determination of all relevant intermediates and elementary reactions is mandatory. The complexity of such a network may grow rapidly, in particular if reactive species are involved that might cause a myriad of side reactions. Without automation, a complete investigation of complex reaction mechanisms is tedious and possibly unfeasible. Therefore, only the expected dominant reaction paths of a chemical reaction network (e.g., a catalytic cycle or an enzymatic cascade) are usually explored in practice. Here, we present a computational protocol that constructs such networks in a parallelized and automated manner. Molecular structures of reactive complexes are generated based on heuristic rules derived from conceptual electronic-structure theory and subsequently optimized by quantum chemical methods to produce stable intermediates of an emerging reaction network. Pairs of intermediates in this network that might be related by an elementary reaction according to some structural similarity measure are then automatically detected and subjected to an automated search for the connecting transition state. The results are visualized as an automatically generated network graph, from which a comprehensive picture of the mechanism of a complex chemical process can be obtained that greatly facilitates the analysis of the whole network. We apply our protocol to the Schrock dinitrogen-fixation catalyst to study alternative pathways of catalytic ammonia production.
ETH Zürich, Laboratory of Physical Chemistry,
Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland.
Complex reaction mechanisms are found in transition-metal catalysis , polymerizations , cell metabolism , flames and environmental processes , and are the objective of systems chemistry . Knowing all chemical compounds and elementary reactions of a specific chemical process is essential for its understanding in atomistic detail. Even though many chemical reactions result in the selective generation of a main product , generally, multiple reaction paths compete with each other leading to a variety of side products. In such cases, a reactive species (such as a radical, a valence-unsaturated species, a charged particle, a strong acid or base) can be involved or the energy deposited into the system may be high (e.g., due to a high reaction temperature).
For a detailed analysis of a chemical system, relevant intermediates and transition states are to be identified according to their relative energies. Manual explorations of complex reaction mechanisms employing well-established electronic-structure methods are slow, tedious, and error-prone. They are therefore limited to the search for expected dominant reaction paths. It is therefore desirable to develop a fully automated protocol for an efficient and accurate exploration of configuration spaces involving both intermediates and transition states.
Existing approaches comprise, for example, global reaction route mapping , and reactive molecular dynamics [8, 9, 10, 11]. Starting from a given structure, the global-reaction-route-mapping procedures evolves along the corresponding potential-energy surface (PES) by exploiting local curvature information. Since the dimension of a PES scales with the number of atomic nuclei, the global-reaction-route-mapping methods, though highly systematic, are not suitable for the exploration of large reactive systems. Contrary to such stationary approaches, in reactive-molecular-dynamics simulations, the nuclear equations of motion are solved to explore and sample configuration spaces. Recently, the capability of reactive ab initio molecular dynamics for studying complex chemical reactions was shown at the example of the prebiotic Urey–Miller experiment [10, 11]. To overcome the high computational demands of first-principles calculations in ab initio molecular dynamics, reactive force fields  can be employed, which accelerate calculations by some orders of magnitude . They are, however, not generally available for any type of system.
A complementary strategy is to exploit the conceptual knowledge of chemistry from quantum mechanics to explore reaction mechanisms . By applying predefined transformation rules to create new chemical species based on reactivity concepts, searches in the chemical configuration space are accelerated without resorting to expert systems as applied for synthesis planning [13, 14, 15, 16, 17, 18, 19, 20, 21]. In this context, the efficiency of heuristics-guided quantum chemistry was explored very recently [22, 23, 24, 25]. The idea behind heuristic guidance in quantum chemistry is to propose a large number of hypothetical molecular structures, which are subsequently optimized by electronic-structure methods.
Here, we describe a fully automated and parallelized exploration protocol in which the synergies of chemical concepts and electronic-structure methods are exploited. We implement a heuristics-guided setup of reactive complexes by placing a reactive species in the vicinity of a target reactive site so that a high-energy structure emerges. Hence, we construct ’reactive complexes’ as supermolecular structures of reactants that are sufficiently high up the Born–Oppenheimer potential energy surface to produce a reaction product upon structure optimization. The reactive complexes are then optimized with quantum-chemical methods to eventually yield stationary points in the vicinity of these high-energy structures. The stationary points finally enter as vertices an automatically generated reaction network. Subsequently, elementary reactions between pairs of stationary points are identified and validated by automatically launched transition-state searches and intrinsic-reaction-path calculations, respectively. Finally, the results are visualized as automatically generated network graphs endowed with thermodynamic and kinetic parameters.
To illustrate the functionality of our protocol, we apply it to an important and still not well understood problem in chemistry, that is catalytic nitrogen fixation under ambient conditions in the homogeneous phase. For this purpose, we investigate the chemical reactivity of the molybdenum complex developed by Schrock and co-workers [26, 27, 28]. This catalyst, like all others developed for this purpose [29, 30, 31], is plagued by a very low turnover number. By applying our protocol to a simplified model system, we aim to better understand the low efficiency of the catalyst.
2 Heuristic Guidance for Quantum-Chemical Structure Explorations
In the context of reaction mechanisms, heuristic rules serve to propose the constituents of a chemical reaction network, which, when optimized, will be the minimum-energy structures and transition states that are energetically accessible at a given temperature. Although this approach cannot guarantee to establish a complete resulting reaction network, heuristic methods allow for a highly efficient and directed search based on empiricism and chemical concepts.
Crucial for the construction of such heuristic rules is the choice of molecular descriptors. For the study of chemical reactions, graph-based descriptors dominate the field [16, 17, 32, 19, 33, 34, 35, 22, 23, 24, 25], which are based on the concept of the chemical bond. Zimmerman [22, 23] developed a set of rules based on the connectivity of atoms to generate molecular structures and to determine elementary reactions. Quantum-chemical structure optimizations and a growing-string transition-state-search method [36, 37, 38] were applied to study several textbook reactions in organic chemistry. Aspuru-Guzik and co-workers [24, 25] developed a methodology for testing hypotheses in prebiotic chemistry. Rules based on formal bond orders and heuristic functions inspired by Hammond’s postulate to estimate activation barriers were applied to model prebiotic scenarios and to determine their uncertainty. Very recently, a new algorithm for the discovery of elementary-reaction steps was published  that uses freezing-string and Berny-optimization methods to explore new reaction pathways of organic single-molecule systems. While graph-based descriptors perform well for many organic molecules, they may fail for transition-metal complexes, where the chemical bond is not always well defined .
Complementary to Zimmerman’s and Aspuru-Guzik’s approaches, we aim at a less context-driven method to be applied to an example of transition-metal catalysis. Clearly, such an approach must be based on information directly extracted from the electronic wave function so that no additional (ad hoc) assumptions on a particular class of molecules are required. In the first step of our heuristics-guided approach, we identify reactive sites in the chemical system. When two reactive sites are brought into close proximity, a chemical bond between the respective atoms is likely to be formed (possibly after slight activation through structural distortion). In addition, we define reactive species which can attack target species at their reactive sites. This concept is illustrated in Fig. 1.
A simple example for the first-principles identification of reactive sites is the localization of Lewis-base centers in a molecule as attractors for a Lewis acid. Lone pairs are an example for such Lewis-base centers and can be detected by inspection of an electron localization measure such as the electron localization function (ELF) by Becke and Edgecombe  or the Laplacian of the electron density as a measure of charge concentration  (see also Ref. \citenumAyers2005). Other quantum chemical reactivity indices can also be employed, such as Fukui functions , partial atomic charges [45, 46, 47], or atomic polarizabilities [48, 49]. With these descriptors, reactive sites can be discriminated, i.e., not every reactive site may be a candidate for every reactive species (indicated by the coloring in Fig. 1). For example, an electron-poor site is more likely to react with a nucleophile rather than with an electrophile. Moreover, reactive species consisting of more than one atom may have distinct reactive sites. Naturally, the spatial orientation of a reactive species toward a reactive site is important.
In the second step, reactive species are added to a target species resulting in a set of candidate structures for reactive complexes. Such compound structures should resemble reactive complexes of high energy (introduced by sufficiently tight structural positioning of the reactants, optionally activated by additional elongation of bonds in the vicinity of reactive sites) which are then optimized employing electronic-structure methods (third step). By means of standard structure-optimization techniques we search for potential reaction products for the reaction network from the high-energy reactive complexes. Several structure optimizations of distinct candidates may result in the same minimum-energy structure. Such duplicate structures must be identified and discarded to ensure the uniqueness of intermediates in the network (fourth step). It should be noted that each intermediate of a reaction network can be considered a reactive species to every other intermediate of that network.
Through a structural comparison [based on a distance criterion such as the root-mean-square deviation (RMSD)], pairs of structures which can be interconverted by an elementary reaction, i.e., a single transition state, are identified (fifth step). If no such pair can be found for a certain structure, the local configuration space in the vicinity of that structure needs to be explored further to ensure that no intermediate will be overlooked.
In the sixth step, the automatically identified elementary reactions are validated by transition-state searches and subsequent intrinsic-reaction-path calculations. Several automated search methods for transition states are available such as interpolation methods , eigenvector following [51, 52, 53, 54, 55], string methods [56, 57, 36], the scaled-hypersphere-search method , constrained optimization techniques , quasi-Newton methods [60, 61], Lanczos-subspace-iteration methods  and related techniques [63, 64], or Davidson-subspace-iteration-based algorithms ).
In the seventh and last step of our heuristics-guided approach, a chemical reaction network comprising all determined intermediates and transition states is automatically generated. The visualization of results as network graphs in which vertices and edges represent molecular structures and elementary reactions, respectively, supports understanding a chemical process in atomistic detail. The readability of a network graph can be enhanced if vertices and edges are supplemented by attributes such as colors or shapes chosen with respect to their relative energy or to other physical properties.
Even though our heuristics-guided approach aims at restricting the number of possible minimum-energy structures, the number of generated intermediates may still be exhaustively large as the following example illustrates. For a protonation reaction, we may assume that the number of different protonated intermediates can be determined from the unprotonated target species by identifying all reactive sites (RS) which a proton, the reactive species, can attack. This number is given by a sum of binomial coefficients,
where is the number of reactive sites and is the number of protons added to the target species. Even for such a simple example, the number of possible intermediates increases exponentially. For example, for a target species with ten reactive sites, intermediates will be generated. Obviously, the transfer of several protons to a single target species is not very likely from a physical point of view as charge will increase so that the acidity of the protonated species might not allow for further protonation. In the presence of a reductant, however, these species can become accessible in reduced form.
3 Construction of Complex Reaction Networks
Of all chemical species generated by the application of heuristic rules, some will be kinetically inaccessible under certain physical conditions. By defining reaction conditions (in general, a temperature ) and a characteristic time scale of the reaction under consideration, one can identify those species that are not important for the evolution of the reactive system under these conditions. Even if these intermediates are thermodynamically favored, they may not be populated on the characteristic time scale at temperature . By removing these species from the network, one can largely reduce its complexity, which in turn simplifies subsequent analyses (such as kinetics simulations as, for instance, presented in Ref. \citenumGlowacki2012).
For the following discussion, we introduce the notation that a chemical reaction network (or network) is to be understood as a connected graph built from a set of intermediates (vertices) and a set of elementary reactions (edges). A path shall denote a directed sequence of alternating vertices and edges, both of which occur only once. A subnetwork is a connected subgraph of a network uniquely representing a single PES defined by the number and type of atomic nuclei, the number of electrons, and by the electronic spin state. Subnetworks can be related to each other according to the heuristic rules which describe addition or removal of reactive species (defined by their nuclear framework, i.e., by their nuclear attraction potential and charge) and electrons. The initial structures are referred to as zeroth-generation structures, and generated structures are referred to as higher-generation structures. Substrates are species that represent the reactants of a complex chemical reaction. The initial population of all other target species is zero.
For the exclusion of non-substrate vertices from the reaction network, we propose a generic energy-cutoff rule: If each path from a substrate vertex to a non-substrate vertex comprises at least one sequence of consecutive vertices with an increase in energy larger than a cutoff , then we remove the non-substrate vertex from the network.
The application of this energy-cutoff rule is illustrated in Fig. 2. Starting from substrate 0, intermediate 1 can only be reached via a transition-state higher than , and therefore, it can be removed from the network. Since intermediates 2 and 3 can only be reached via intermediate 1, they can also be omitted. Despite being similar in energy to substrate 0, intermediate 5 can be discarded, since it can only be formed by a transition state higher than . Even though the transition state between intermediates 6 and 7 is below , the population of intermediate 7 is negligible, since, starting from substrate 0, it can only be formed through intermediate 4.
Note that this energy-cutoff rule is conservative as we compare energy differences of stable intermediates, which are a lower bound for activation energies of reactions from a low-energy intermediate to one that is higher in energy. Therefore, intermediates can be removed prior to the calculation of transition-state structures, which significantly saves computational resources. Once transition states are calculated, this rule can be reapplied to further reduce the complexity of the network in order to arrive at a minimal network of all relevant reaction steps.
The introduced kinetic cutoff depends on temperature and on the characteristic time scale of the reaction. For instance, assuming a reactive system following Eyring’s quasi-equilibrium argument , one can determine the average time for a unimolecular reaction to occur. For this purpose, the general Eyring equation,
is employed, with the rate constant , the Boltzmann constant , the Planck constant , the Gibbs free energy of activation , and the temperature . We understand the half life as the time after which a molecule has reacted with a probability of 50%. For an activation free energy of 25 kcal/mol and a temperature of K, the average time for a unimolecular reaction to occur equals three days. This time may well be considered an upper limit for a practical chemical reaction. If one can afford longer reaction times, the energy cutoff needs to be increased. Similarly, if one is interested in a range of temperatures, , the energy cutoff has to be adapted to the maximum temperature . Otherwise, intermediates would be removed from the network which are accessible at . In a conservative exploration, a reasonable choice for the maximum temperature may be the decomposition temperature of an important compound class studied.
Special attention needs to be paid to the energy differences between intermediates of different subnetworks, since our protocol divides the PES of the chemical system into various subsystem PES’s. For instance, if two intermediate structures differ by one reactive species, say a proton, the energy for supplying that reactive species by a strong acid has to be taken into account. Otherwise, different subnetworks of a network cannot be compared as the total energies to be compared depend on the number of (elementary) particles.
4 The Chatt–Schrock Network
The (generic) Chatt–Schrock cycle [26, 27] (Fig. 3) is the prominent example of catalytic nitrogen fixation . Its intermediates (referred to as Schrock intermediates hereafter) are formed by an alternating sequence of single protonation and single electron-reduction steps of Schrock’s nitrogen-ligated molybdenum complex [26, 27]. The sources of protons and electrons are 2,6-lutidinium (2,6-LutH) and decamethylchromocene (Cr), respectively. This mechanism, however, does not explain the small turnover number of the catalyst. To demonstrate our heuristic network-exploration algorithm described above, we aim at identifying competing reaction paths of the Chatt–Schrock cycle.
For details on the Computational Methodology, see the appendix.
4.1 Heuristics-Guided Structure Search
For the first and second half of the catalytic cycle (Fig. 3) [Mo]–N and [Mo]–N (Fig. 5) are taken as zeroth-generation structures, respectively. Here, [Mo] refers to the Yandulov–Schrock complex [26, 27] where the hexa-iso-propyl terphenyl (HIPT) substituents are replaced by methyl groups to reduce the computational cost. The bulky HIPT substituents can be reintroduced once the network has been established.
In this study, we only consider protons as reactive species since protonations of the amide nitrogen atoms are likely to deactivate the catalyst [28, 68]. Additionally, we take different charges of the protonated complexes into account (single electron-reduction steps from + to neutral, with being the number of protons added). However, for a more extensive exploration, H, N, NH, NH, and intermediates theirselves must also be considered as reactive species.
To determine the reactive sites of the substrates, we exploit knowledge about negative charge concentrations extracted from the electronic wave function. As an example in Fig. 4, we present the isosurface of the ELF colored with the value of the electrostatic potential for the two parent species, [Mo]-N and [Mo]-N, of the two halves of the Chatt–Schrock cycle in Fig. 3. Whereas the ELF highlights regions in space where electron density is localized, the electrostatic potential allows us to pick those regions that can function as a Lewis base (highlighted in blue in Fig. 4 and showing, e.g., lone pairs) by contrast to the other regions that are electron deficient and feature hardly any Lewis basicity (highlighted in orange in Fig. 4 and showing, e.g., C–H -bonds). The blue regions therefore define spatial areas that function as reactive sites to which protons should be added as reactive species.
Note that this procedure is solely based on the first principles of quantum mechanics, but that it is also in line with conventional chemical wisdom that, for carbon and nitrogen atoms, the formation of a tetrahedral surrounding including both bonding neighbors and reactive sites for incoming reactants (protons) represents a valence-saturated electronic situation. At the molybdenum center, three reactive sites are introduced in the plane spanned by the three amide-nitrogen atoms. We refrain from protonating the coordinating amine nitrogen atom in trans-position to the N ligand as this would produce a decomposition pathway of the catalyst that is not likely to lead to an alternative catalytic cycle. Clearly, these decomposition reactions are important to track for a complete understanding of Schrock-type dinitrogen fixation catalysis, but we devote this aspect to future work. Instead, we are less restrictive with respect to the possible protonation sites at the metal center and at the terminal nitrogen atom of N in [Mo]-N — density-functional-theory calculations are fast for the size of system under study and can be carried out in parallel so that one should not limit the number of possible reactive sites too much in order not to risk overlooking of important intermediates. All reactive sites considered as proton-acceptor sites in this work are shown in Fig. 5. Up to four protons are added combinatorially to the zeroth-generation structures. This number may be considered a chemically reasonable upper limit.
Since the Yandulov–Schrock catalyst operates in the presence of a strong reducing agent (Cr), protonated species can be readily reduced. Therefore, for a -fold protonated reactive complex, we consider the charges . The result is a total number of
structures. For [Mo]-N =6762 and for [Mo]-N =3577 structures are obtained. However, for the subsequent structure optimizations these numbers are slightly reduced as the two protonation sites exposed by each amide nitrogen atom are occupied by at most one proton to yield an amine nitrogen atom.
Decomposed reactive complexes such as those from which dihydrogen dissociated or in which the chelating ligand (partially) dissociated from the metal center, are automatically removed from the network, but stored for future investigation (see also Supporting Information).
To ensure the uniqueness of each vertex in the network, duplicate structures are removed. For this, the threefold symmetry of the catalyst was taken into account.
Subsequently, elementary reactions connecting two vertices are identified. We determine two intermediates of the same subnetwork to be constituents of an elementary reaction if they are related by a shift of a single proton, i.e., exactly one proton is required to change its position, while the change of the remaining molecular structure is below a predefined RMSD cutoff of 0.5 Å for the first half and 0.65 Å for the second half of the Chatt–Schrock cycle. For elementary reactions between subnetworks, the same RMSD cutoff was chosen to determine the shared identity of two molecular structures (apart from an added proton in case of a protonation reaction). Clearly, also in general case some criterion for structural similarity of two vertices may be used to assess whether they are connected by an elementary reaction step.
Subsequently, transition-state searches based on electronic-structure methods are performed for the proposed intra-subnetwork elementary reactions. We employed constrained optimizations to generate reasonable guess structures, which are refined by an eigenvector-following algorithm (see the Computational Methodology in the appendix for details) and verified by Mode-Tracking calculations [69, 65]. Finally, intrinsic-reaction-path calculations are performed. For intermolecular reactions (i.e., proton transfers from 2,6-LutH), no transition states were calculated.
In this study, we applied an energy cutoff of = 25 kcal/mol for the relative energy of two stable intermediates connected by an elementary reaction. Note that this threshold does not refer to a Gibbs free reaction energy but to an electronic-energy difference. Nevertheless, since only reactions of the same type are compared, one can expect only small deviations of electronic energies from Gibbs free energies for intranetwork reactions (proton shifts) and reduction steps, and assume that this simplification is also a good approximation for protonation reactions.
4.2 Network Superstructure
In Fig. 6, subnetworks are arranged according to the number of protons and electrons added. Here, the subnetworks are denoted as (H, ), where is the number of hydrogen atoms added to the substrate (1 = [Mo]-N, 2 = [Mo]-N) and is the charge of the subnetwork. An arrow pointing from subnetwork a to subnetwork b will be crossed out if all intermediates of subnetwork b are at least by energetically higher than all intermediates of subnetwork a. In such cases, our cutoff rule can be applied to entire subnetworks since the lowest possible transition barrier from subnetwork a to subnetwork b will be larger than .
Due to the energy-cutoff rule, entire subnetworks can be pruned and excluded from further analysis. For instance, starting from (0H,0), (2H,2+) cannot be reached via any other subnetwork without having to overcome a transition state that is above . Therefore, (2H,2+) can be removed from the network. In both halves of the Chatt–Schrock cycle, all subnetworks with a total charge larger than one can be neglected. The pruning of these networks largely reduces the complexity of the network since now every subnetwork can only be reached from one other subnetwork.
The energy profiles of the first and second half of the catalytic cycle are shown in Fig. 7 a) and b), respectively. In this figure, energy levels of Schrock intermediates are connected by dashed lines. An additional energy level will be shown if an intermediate lower in energy than the Schrock intermediate is part of that subnetwork. Moreover, if intermediates from which H dissociated are observed, the intermediate with the lowest energy will be shown in red.
It can be seen that most reactions of the first half of the Schrock cycle are exothermic. Especially the reductions of the Schrock intermediates of (1H,1+) and (3H,1+) are thermodynamically favorable. In addition, the dissociation of NH from (3H, 0) has a highly negative reaction energy. Note, however, that this dissociation energy holds for a specific choice of acid and reductant so that the assignment of this reaction energy solely to the breaking of the N–N bond would be misleading as discussed in our earlier work [70, 71].
There are also endothermic reactions. For example, the protonation of the Schrock intermediate of (0H,0), was calculated to have a positive reaction energy of +3.1 kcal/mol. A thermodynamically favorable alternative to the protonation of N is the protonation of the amide of the chelate ligand. This intermediate is lower in energy ( kcal/mol) than the Schrock intermediate.
Also most reactions in the second half are exothermic. The protonation of the Schrock intermediate in (1H,0) is particularly exothermic with a reaction energy of 32.2 kcal/mol. Nonetheless, there are subnetworks in which the Schrock intermediate is not the most stable species. In (3H,0), for instance, there is an intermediate which is more stable ( kcal/mol) than the respective Schrock intermediate. Furthermore, it can be seen that the dissociation of H is thermodynamically favorable in several subnetworks. For example, in (2H,0) the dissociation of H even results in the most stable intermediate.
However, we should emphasize that those structures which are very similar in terms of energy may be considered equally stable, especially when viewed in the light of the quantum chemical methodology chosen here. Moreover, one must keep in mind that the network exploration was carried out for a small model complex of the Schrock catalyst with a double-zeta basis set.
While the dissociation of NH is energetically favorable in the first half, it is very unfavorable in the second. Therefore, a four-coordinate [Mo] intermediate appears unlikely, and an associative exchange mechanism might be favored over the dissociative one as has already been discussed in the literature [71, 68, 72]. Since further protonation of (3H,0) results in low-energy intermediates, we can identify this subnetwork as a possible starting point of degradation.
The results for the Schrock intermediates reported here are in qualitative agreement with those reported earlier by us [70, 71, 68, 73, 72] and by Tuczek and coworkers [74, 75]. Numerical deviations for the Schrock intermediates are mostly due to the choice of a small model structure and the smaller basis set employed in this work.
4.3 Reaction Network
To rationalize the low efficiency and stability of Schrock’s nitrogen fixation catalyst, not only all possible intermediates but also the transition states connecting them need to be analyzed. The reaction network of the Chatt–Schrock cycle automatically generated by our program is shown in Fig. 8. Each vertex represents an intermediate, whereby the color encodes the energy difference with respect to the lowest intermediate of the subnetwork. Vertices representing a Schrock intermediate are enlarged. A collection of vertices belonging to the same subnetwork is enclosed by a solid black line. Two vertices of the same subnetwork are connected by an undirected edge if a transition state was found between them. The gray scale of such an edge serves as a visual cue indicating the height of the transition barrier with respect to the lower-energy intermediate. Light-gray edges represent high-energy barriers, dark-gray edges represent low-energy barriers. It is important to note that according to our exclusion rule, many transition states in this network need not to be optimized and can therefore be omitted. However, to illustrate the complexity of such reaction networks, transition states with an energy above were not removed. Vertices of different subnetworks are connected by undirected edges (dashed lines) for which no transition states were calculated. In addition, the molecular structures of selected intermediates are shown in Fig. 8 a) – g).
Starting from the [Mo]-N complex in (0H, 0), a proton is added to reach (1H,1+). As can be seen from the dashed lines, this reaction can result in four different intermediates. The Schrock intermediate, the [Mo]-N complex protonated at the molybdenum atom (Fig. 8 a)), the [Mo]-N complex with a proton at one of the amido groups (Fig. 8 b)), and the enantiomer of that intermediate (Fig. 8 c)). From there, each intermediate can either undergo a reduction to form an intermediate in (1H,0), or—through an intramolecular reaction—transform into another intermediate of the same subnetwork. The subsequent protonation of intermediates in (1H,0) leads to the subnetwork (2H,1+).
The inspection of the first four subnetworks already suggests a feasible alternative to the Chatt–Schrock mechanism: The [Mo]-N complex is protonated at the amido group; this intermediate undergoes reduction, protonation (of the axial N), and finally a proton shift to reach the energetically most favorable intermediate, the Schrock intermediate of (2H,1+).
The reduction of the intermediates in (2H,1+) leads to a subnetwork in which not the Schrock intermediate but intermediate d) is the most stable intermediate. This intermediate can be reached through several different cascades of transformations, which however all contain at least one that is comparatively high in energy. It can be seen in Fig. 8 that once an intermediate of (2H,0) other than the Schrock intermediate is protonated, no rearrangement reaction within (3H,1+) was found which leads to the Schrock intermediate. This also suggests that the Schrock intermediate in (3H,1+), which is relatively high in energy, does not easily transform into a more stable intermediate of the same subnetwork. Likewise, (3H,1+) and (3H,0) (not shown in Fig. 8) can be considered relevant for the process of degradation of this catalyst.
After reduction of the Schrock intermediate of (3H,1+), NH dissociates and the [Mo]-N complex is formed. Similar to the first half, the protonation of the [Mo]-N complex can lead to two different intermediates: the Schrock intermediate and the [Mo]-N complex with a proton at one of the amido groups. These two structures could give rise to two different reaction paths. Furthermore, two other subnetworks appear to be particularly prone to initiating degradation: (3H,1+) and (3H,0). In both subnetworks there are intermediates that are more stable than the Schrock intermediate, which can be reached via low-energy transition states. In (3H,1+), it is the shift of one of the three protons from one of the axial nitrogen atoms to an amido group (see structures e) and f)). Either of these structures can undergo an additional transformation where the proton bound to the amido group shifts to the molybdenum center. After reduction, this structure forms intermediate g)—the most stable conformation of (3H,0). Likewise, the Schrock intermediate of (3H,0) can undergo a proton shift to form intermediate g). As mentioned earlier, the dissociation of NH from [Mo]-NH is highly endothermic [71, 68, 72], and therefore, the exchange of NH and N via a six-coordinated complex is likely to occur. Therefore, the intermediate g) in (3H,0) can be considered particularly relevant to understanding the low turnover number of the catalyst.
It should be evident from the presentation above that our automated visualization strategy generates a presentation of chemical reaction networks that directly unveils its essence to the reader. Thereby, even complex reaction mechanisms involving many side reactions become lucid and, hence, will be comprehensible.
4.4 Alternative Pathways to the Chatt–Schrock Cycle
By applying the energy-cutoff rule together with the cutoff = 25 kcal/mol, many intermediates in the reaction network can be removed. The resulting reaction network allows for the identification of reaction pathways, other than the Chatt–Schrock cycle, that are likely to occur at ambient conditions. These pathways are shown in Fig. 9.
It can be seen that multiple pathways next to the Chatt–Schrock cycle are indeed possible. For example, two pathways running parallel to the Chatt–Schrock cycle can be identified. It is important to note that pathways which do not form a cycle, and thus lead to the degradation of the catalyst, are not shown, but will be investigated in future studies.
In this work, a heuristics-guided protocol for the automatic exploration of chemical reaction spaces is presented. Our heuristics-guided search for chemical species is based on an intuitive chemical construction principle. A target species (e.g., a catalyst scaffold) reacts with a reactive species (e.g., a radical or a charged particle) to yield an intermediate. A collection of all intermediates is arranged in a reaction network. The resulting chemical reaction network can be pruned by defining an energy cutoff, which allows for the exclusion of those intermediates which are inaccessible under a range of reasonable physical reaction conditions and on the time scale of interest.
Our protocol for finding vertices in the reaction network exploits conceptual electronic-structure theory to apply heuristic rules for the search of potential products in complex reaction mechanisms. The heuristic rules guide the construction of reactive complexes that are high-energy structures of supermolecules built from reactants from which reaction products (intermediates) are produced upon structure optimization. The structures of these intermediates enter an emerging reaction network, in which elementary reactions can be identified in an automated way.
There are several advantages of our heuristics-guided exploration protocol in practice. Many calculations of the steps involved — generation and optimization of reactive-complex structures, elementary-reaction proposition, transition-state search, and network pruning — can be carried out independently, which facilitated a trivial parallelization of the whole procedure. The setup of structures that resemble reactive complexes based on concepts from electronic-structure theory make the approach applicable to arbitrary reactions and is not limited to any sort of molecule. Our approach is also efficient for iterative or nested explorations in which the heuristic search restarts from higher-generation structures.
We applied our heuristics-guided exploration protocol to the Chatt–Schrock nitrogen-fixation cycle. Its competing reaction paths were not studied in sufficient detail. We explored a vast number of possible elementary reactions that describe protonation, proton-rearrangement, and reduction steps. The resulting network turned out to be highly complex and alternative routes that still sustain the catalytic cycle emerge. The application of an automated visualization strategy by which thermodynamic and kinetic network properties was crucial to facilitate the interpretation of such complex reaction mechanisms. In future work, we will consider the possible degradation reactions that may deactivate the catalyst and that may eventually explain its low turnover number.
Appendix: Computational Methodology
Restricted and unrestricted density-functional-theory calculations were carried out depending on the lowest spin multiplicity of a given intermediate. For this, BP86/def2-SV(P) [76, 77, 78] structure optimizations of reactive complexes were performed with the program package Turbomole  (version 6.4.0) including the resolution-of-the-identity density-fitting technique. Single-point calculations were considered to be converged when the total electronic-energy difference between two iteration steps was less then 10 Hartree. Structure optimizations were considered converged when the norm of the electronic-energy gradient with respect to the nuclear coordinates dropped below 10 Hartree/Bohr. If a structure optimization failed because a self-consistent-field calculation did not converge, the damping parameters had been changed automatically and the optimization was restarted. In those cases where a structure optimization did not converge within 1200 iterations, the corresponding data was saved and the structure was manually inspected to decide whether it should be part of the chemical reaction network or not. 9607 structure optimizations were carried out in total.
Constrained BP86/def2-SV(P) optimizations were performed wih Gaussian  (version 09, revision C.1) to obtain reasonable starting structures of transition states, which were refined with Turbomole’s trust-radius-image-based eigenvector-following optimization choosing a trust-radius of 0.2 Å. The eigenmode to follow was obtained from a Mode-Tracking calculation [69, 65]. From the converged transition states, intrinsic reaction paths were calculated with Gaussian to determine whether a desired transition state was found. We employed the default convergence criteria for all Gaussian calculations. If the constrained optimization scan with a subsequent eigenvector-following calculation did not converge to the desired transition state, the freezing-string method as implemented in Q-Chem (version 4.0.1)  was employed with subsequent EVF as described above. We identified 2318 elementary reactions for which transition states were optimized.
To shed more light on the success rate of identifying transition states for these reactions by our automated search and therefore of verifying the assumption that two intermediates are truly connected by an elementary reaction, we may add some additional details. 1082 potential elementary reactions were automatically identified for the first and 1236 for the second half of the Chatt–Schrock cycle. The transition-state search was then conducted with three different strategies yielding a total of 6954 transition-state searches. This number, however, is only an upper bound as a search was stopped once one of these strategies was successful. In the first half of the cycle, our automated protocol identified 329 out of the 1082 potential elementary reactions by optimizing the transition state. In the second half, it identified 613 out of the 1236 potential elementary reactions. For some steps, for which our implementation was not able to find a transition state, we verified by manual inspection that a transition state is not likely to exist or to be of sufficiently low energy. Hence, our algorithm produces more potential pairs that could be connected by an elementary reaction than there are. Of course, this number is determined by the structural similarity measure that we employ to relate the two structures. Obviously, our RMSD criterion produces many false positive results. However, this is actually desired as one cannot be certain to have found all relevant vertices and edges of a reaction network so that all criteria and measures should be set and selected in a conservative and therefore not too restrictive way.
The calculation of the ELF and of the electrostatic potential were performed with Molden 5.4 . In the color range of the electrostatic potential mapped onto the ELF isosurface, the most positive charge was omitted as otherwise the color differences between all other atoms would have been very small. The presentation of the data in Fig. 4 was generated with Jmol 14.0.7  from a cube file produced with Molden.
The visualization of all (sub)networks was automated in the programming language Python. For their representation, the Python software package NetworkX  was applied.
This work was financially supported by ETH Zürich (Grant ETH-20 15-1) and by the Schweizerischer Nationalfonds (SNF project 200020_156598). MB and GNS gratefully acknowledge support by PhD fellowships of the Fonds der Chemischen Industrie.
Supporting Information Available
Additional figures of the reaction network with further details on species and structures are collected in the supporting information. This information is available free of charge via the Internet at http://pubs.acs.org/.
- Masters, C. Homogeneous Transition-metal Catalysis; Springer, Dordrecht, Netherlands, 1980.
- Vinu, R.; Broadbelt, L. J. Annu. Rev. Chem. Biomol. Eng. 2012, 3, 29–54.
- Ross, J. J. Phys. Chem. A 2008, 112, 2134–2143.
- Vereecken, L.; Glowacki, D. R.; Pilling, M. J. Chem. Rev. 2015, 115, 4063–4114.
- Ludlow, R. F.; Otto, S. Chem. Soc. Rev. 2007, 37, 101–108.
- Clayden, J.; Greeves, N.; Warren, S.; Wothers, P. Organic Chemistry; Oxford University Press, Oxford, United Kingdom, 2012.
- Maeda, S.; Ohno, K.; Morokuma, K. Phys. Chem. Chem. Phys. 2013, 15, 3683–3701.
- van Duin, A. C. T.; Dasgupta, S.; Lorant, F.; Goddard, W. A. J. Phys. Chem. A 2001, 105, 9396–9409.
- Döntgen, M.; Przybylski-Freund, M.-D.; Kröger, L. C.; Kopp, W. A.; Ismail, A. E.; Leonhard, K. J. Chem. Theory Comput. 2015, 11, 2517–2524.
- Saitta, A. M.; Saija, F. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 13768–13773.
- Wang, L.-P.; Titov, A.; McGibbon, R.; Liu, F.; Pande, V. S.; Martínez, T. J. Nat. Chem. 2014, 6, 1044–1048.
- Earis, P.; Faraday Discuss., Vol. 135; Royal Society of Chemistry, 2007.
- Corey, E. J.; Jorgensen, W. L. J. Am. Chem. Soc. 1976, 98, 189–203.
- Pensak, D. A.; Corey, E. J. In Computer-Assisted Organic Synthesis; ACS Symposium Series, Vol. 61; American Chemical Society, 1977; pp 1–32.
- Gasteiger, J.; Ihlenfeldt, W. D. In Software Development in Chemistry 4; Gasteiger, P. D. J., Ed.; Springer Berlin Heidelberg, 1990; pp 57–65.
- Rücker, C.; Rücker, G.; Bertz, S. H. J. Chem. Inf. Comput. Sci. 2004, 44, 378–386.
- Todd, M. H. Chem. Soc. Rev. 2005, 34, 247–266.
- Grzybowski, B. A.; Bishop, K. J. M.; Kowalczyk, B.; Wilmer, C. E. Nat. Chem. 2009, 1, 31–36.
- Chen, J. H.; Baldi, P. J. Chem. Inf. Model. 2009, 49, 2034–2043.
- Fuller, P. E.; Gothard, C. M.; Gothard, N. A.; Weckiewicz, A.; Grzybowski, B. A. Angew. Chem. Int. Ed. 2012, 51, 7933–7937.
- Kowalik, M.; Gothard, C. M.; Drews, A. M.; Gothard, N. A.; Weckiewicz, A.; Fuller, P. E.; Grzybowski, B. A.; Bishop, K. J. M. Angew. Chem. Int. Ed. 2012, 51, 7928–7932.
- Zimmerman, P. M. J. Comput. Chem. 2013, 34, 1385–1392.
- Zimmerman, P. M. Mol. Simul. 2015, 41, 43–54.
- Rappoport, D.; Galvin, C. J.; Zubarev, D. Y.; Aspuru-Guzik, A. J. Chem. Theory Comput. 2014, 10, 897–907.
- Zubarev, D. Y.; Rappoport, D.; Aspuru-Guzik, A. Sci. Rep. 2015, 5, 8009.
- Yandulov, D. V.; Schrock, R. R. J. Am. Chem. Soc. 2002, 124, 6252–6253.
- Yandulov, D. V.; Schrock, R. R. Science 2003, 301, 76–78.
- Schrock, R. Angew. Chem. Int. Ed. 2008, 47, 5512–5522.
- Hinrichsen, S.; Broda, H.; Gradert, C.; Söncksen, L.; Tuczek, F. Annu. Rep. Prog. Chem., Sect. A: Inorg. Chem. 2012, 108, 17–47.
- Arashiba, K.; Miyake, Y.; Nishibayashi, Y. Nat. Chem. 2011, 3, 120–125.
- Lee, Y.; Mankad, N. P.; Peters, J. C. Nat. Chem. 2010, 2, 558–565.
- Chen, J. H.; Baldi, P. J. Chem. Educ. 2008, 85, 1699.
- Graulich, N.; Hopf, H.; Schreiner, P. R. Chem. Soc. Rev. 2010, 39, 1503–1512.
- Kayala, M. A.; Azencott, C.-A.; Chen, J. H.; Baldi, P. J. Chem. Inf. Model. 2011, 51, 2209–2222.
- Kayala, M. A.; Baldi, P. J. Chem. Inf. Model. 2012, 52, 2526–2540.
- Zimmerman, P. J. Chem. Theory Comput. 2013, 9, 3043–3050.
- Zimmerman, P. M. J. Chem. Phys. 2013, 138, 184102.
- Zimmerman, P. M. J. Comput. Chem. 2015, 36, 601–611.
- Suleimanov, Y. V.; Green, W. H. J. Chem. Theory Comput. 2015, DOI: 10.1021/acs.jctc.5b00407.
- Frenking, G.; Shaik, S. The Chemical Bond; Vol. 1 & 2, Wiley-VCH, Weinheim, Germany, 2014.
- Becke, A.; Edgecombe, K. E. J. Chem. Phys. 1990, 92, 5397–5403.
- Bader, R. F. W. Atoms in Molecules: A Quantum Theory; Clarendon Press, Oxford, United Kingdom, 1994; pp. 262–264.
- Ayers, P. W. J. Chem. Sci. 2005, 117, 441–454.
- Fukui, K. Science 1982, 218, 747–754.
- Mulliken, R. S. J. Chem. Phys. 1955, 23, 1833–1840.
- Mulliken, R. S. J. Chem. Phys. 1955, 23, 1841–1846.
- Löwdin, P.-O. Adv. Quant. Chem. 1970, 5, 185–199.
- Brown, R. E.; Simas, A. M. Theor. Chim. Acta 1982, 62, 1–16.
- Kang, Y. K.; Jhon, M. S. Theor. Chim. Acta 1982, 61, 41–48.
- Halgren, T. A.; Lipscomb, W. N. Chem. Phys. Lett. 1977, 49, 225–232.
- Cerjan, C. J.; Miller, W. H. J. Chem. Phys. 1981, 75, 2800–2806.
- Simons, J.; Jørgensen, P.; Taylor, H.; Ozment, J. J. Phys. Chem. 1983, 87, 2745–2753.
- Wales, D. J. J. Chem. Soc., Faraday Trans. 1992, 88, 653–657.
- Wales, D. J. J. Chem. Soc., Faraday Trans. 1993, 89, 1305–1313.
- Jensen, F. J. Chem. Phys. 1995, 102, 6706–6718.
- Henkelman, G.; Uberuaga, B. P.; Jónsson, H. J. Chem. Phys. 2000, 113, 9901–9904.
- Behn, A.; Zimmerman, P. M.; Bell, A. T.; Head-Gordon, M. J. Chem. Phys. 2011, 135, 224108.
- Ohno, K.; Maeda, S. Chem. Phys. Lett. 2004, 384, 277–282.
- Gonzalez, C.; Schlegel, H. B. J. Phys. Chem. 1990, 94, 5523–5527.
- Broyden, C. G. Math. Comput. 1967, 21, 368–381.
- Fletcher, R. Practical Methods of Optimization: Unconstrained Optimization, Vol. 1; Wiley-Blackwell, Chichester, United Kingdom, 1980; p. 107.
- Malek, R.; Mousseau, N. Phys. Rev. E 2000, 62, 7723–7728.
- Henkelman, G.; Jónsson, H. J. Chem. Phys. 1999, 111, 7010–7022.
- Munro, L. J.; Wales, D. J. Phys. Rev. B 1999, 59, 3969–3980.
- Bergeler, M.; Herrmann, C.; Reiher, M. J. Comput. Chem. 2015, 36, 1429–1438.
- Glowacki, D. R.; Liang, C.-H.; Morley, C.; Pilling, M. J.; Robertson, S. H. J. Phys. Chem. A 2012, 116, 9545–9560.
- Eyring, H. J. Chem. Phys. 1935, 3, 107–115.
- Schenk, S.; Le Guennic, B.; Kirchner, B.; Reiher, M. Inorg. Chem. 2008, 47, 3634–3650.
- Reiher, M.; Neugebauer, J. J. Chem. Phys. 2003, 118, 1634–1641.
- Reiher, M.; Le Guennic, B.; Kirchner, B. Inorg. Chem. 2005, 44, 9640–9642.
- Le Guennic, B.; Kirchner, B.; Reiher, M. Chem. Eur. J. 2005, 11, 7448–7460.
- Schenk, S.; Kirchner, B.; Reiher, M. Chem. Eur. J. 2009, 15, 5073–5082.
- Schenk, S.; Reiher, M. Inorg. Chem. 2009, 48, 1638–1648.
- Studt, F.; Tuczek, F. Angew. Chem. Int. Ed. 2005, 44, 5639–5642.
- Thimm, W.; Gradert, C.; Broda, H.; Wennmohs, F.; Neese, F.; Tuczek, F. Inorg. Chem. 2015, 54, 9248–9255.
- Becke, A. D. Phys. Rev. A 1988, 38, 3098–3100.
- Perdew, J. P. Phys. Rev. B 1986, 33, 8822–8824.
- Weigend, F.; Ahlrichs, R. Phys. Chem. Chem. Phys. 2005, 7, 3297–3305.
- Ahlrichs, R.; Bär, M.; Häser, M.; Horn, H.; Kölmel, C. Chem. Phys. Lett. 1989, 162, 165–169.
- Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, J. M.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09, Revision D.01,; Gaussian Inc. Wallingford CT, 2009.
- Shao, Y.; Molnar, L. F.; Jung, Y.; Kussmann, J.; Ochsenfeld, C.; Brown, S. T.; Gilbert, A. T. B.; Slipchenko, L. V.; Levchenko, S. V.; O’Neill, D. P.; Jr, R. A. D.; Lochan, R. C.; Wang, T.; Beran, G. J. O.; Besley, N. A.; Herbert, J. M.; Lin, C. Y.; Voorhis, T. V.; Chien, S. H.; Sodt, A.; Steele, R. P.; Rassolov, V. A.; Maslen, P. E.; Korambath, P. P.; Adamson, R. D.; Austin, B.; Baker, J.; Byrd, E. F. C.; Dachsel, H.; Doerksen, R. J.; Dreuw, A.; Dunietz, B. D.; Dutoi, A. D.; Furlani, T. R.; Gwaltney, S. R.; Heyden, A.; Hirata, S.; Hsu, C.-P.; Kedziora, G.; Khalliulin, R. Z.; Klunzinger, P.; Lee, A. M.; Lee, M. S.; Liang, W.; Lotan, I.; Nair, N.; Peters, B.; Proynov, E. I.; Pieniazek, P. A.; Rhee, Y. M.; Ritchie, J.; Rosta, E.; Sherrill, C. D.; Simmonett, A. C.; Subotnik, J. E.; Iii, H. L. W.; Zhang, W.; Bell, A. T.; Chakraborty, A. K.; Chipman, D. M.; Keil, F. J.; Warshel, A.; Hehre, W. J.; Iii, H. F. S.; Kong, J.; Krylov, A. I.; Gill, P. M. W.; Head-Gordon, M. Phys. Chem. Chem. Phys. 2006, 8, 3172–3191.
- Schaftenaar, G.; Noordik, J. H. J. Comput.-Aided Mol. Design 2000, 14, 123–134.
- Jmol: an open-source Java viewer for chemical structures in 3D. http://www.jmol.org/ (accessed on 21.10.15).
- Hagberg, A. A.; Schult, D. A.; Swart, P. J. In Proceedings of the 7th Python in Science Conference (SciPy2008); Pasadena, CA USA, 2008; pp 11–15.