###### Abstract

We explain the construction of Forcer, a Form program for the reduction of four-loop massless propagator-type integrals to master integrals. The resulting program performs parametric IBP reductions similar to the three-loop Mincer program. We show how one can solve many systems of IBP identities parametrically in a computer-assisted manner. Next, we discuss the structure of the Forcer program, which involves recognizing reduction actions for each topology, applying symmetries, and transitioning between topologies after edges have been removed. This part is entirely precomputed and automatically generated. We give examples of recent applications of Forcer, and study the performance of the program. Finally we demonstrate how to use the Forcer package and sketch how to prepare physical diagrams for evaluation by Forcer.

Forcer, a Form program for the parametric reduction of four-loop massless propagator diagrams

B. Ruijl, T. Ueda and J.A.M. Vermaseren

Nikhef Theory Group

Science Park 105, 1098 XG Amsterdam, The Netherlands

Leiden University

Niels Bohrweg 1, 2333 CA Leiden, The Netherlands

## 1 Introduction

Over the years particle physics experiments have become more and more precise. This creates the need for more accurate calculations of the underlying processes. In particular for QCD with its large coupling constant, three-loop calculations for processes such as the production of Higgs particles prove to be important for high precision predictions [1, 2]. This in turn necessitates the evaluation of the four-loop splitting functions to determine the parton distributions inside the proton. For a variety of reasons a complete calculation at the four-loop level is currently out of the question. The next best solution is to evaluate a number of Mellin moments, as was done at the three-loop level over the past 25 years [3, 4, 5, 6, 7]. One way to obtain such moments is by converting them to massless propagator integrals by expanding in terms of the parton momentum. The computer program that could deal with the resulting three-loop integrals is called Mincer [8, 9] and its algorithms are based on Integration-By-Parts (IBP) identities [10]. To obtain higher moments, Mincer has been heavily optimized and was essential in a moment calculation for polarized scattering [11].

The construction of a similar program for four-loop propagator integrals is a far more formidable task. This has led to the exploration of different techniques, such as the expansions of Baikov [12, 13, 14]. Instead of solving the systems of IBP equations parametrically as was done in Mincer, Laporta developed a method to solve the system by substituting values for its parameters [15]. This method has been used to create generic programs that can handle integrals in a very flexible way [16, 17, 18, 19, 20]. The drawback of these programs is that they are in essence a brute-force Gaussian reduction, which needs to reduce many subsystems that drop out of the final answer. An extra complication is the fact that the system is riddled with ‘spurious poles’, which are powers in ϵ that only cancel by the time all contributions to the coefficient of a master integral have been added. If it is not known in advance how many of these spurious poles will occur, one cannot safely perform a fixed expansion in ϵ. In the three-loop Mincer program the spurious poles could be avoided thanks to the resolved triangle formula by Tkachov [21], but for the all-N calculation these spurious poles caused significant issues [22, 23, 24]. In general, spurious pole avoidance is considered too complicated, and one resorts to the very slow but exact arithmetic of rational polynomials.

Similar to our work, a method capable of a parametric reduction of massless four-loop propagator integrals has been developed by R. Lee in a series of papers [25, 26]. It resulted in the LiteRed program, which is a Mathematica package that constructs reduction programs (also in Mathematica code). Although it is extremely elegant and as a method very powerful, the resulting four-loop propagator programs are too slow for most practical applications.

In this work we describe Forcer, a Form [27, 28] program that is a hybrid between various approaches. We discuss the construction of a precomputed reduction graph that describes how to reduce each topology and how to map topologies with missing propagators into each other. Most topologies have straightforward reduction rules due to known reducible substructures, such as triangles or insertions. However, 21 special cases that could not be derived automatically in an efficient way have been constructed by means of hand-guided computer programs. We provide several heuristics and give an overview of how we solved these complicated cases. During some manual derivations we ran into the previously mentioned spurious pole problem. For Forcer, this means that it can work in either of two modes: with rational polynomials in ϵ, or with a fixed expansion in ϵ whose depth has to be selected with great care.

Forcer has already been used in some large scale computations, such as new results for splitting functions at four loops [29, 30], four-loop propagator and vertex computations [31, 32, 33], and the five-loop beta function [34]. We show some benchmarks for complicated cases.

The Forcer source code can be found at https://github.com/benruijl/forcer.

The layout of the paper is as follows: In section 2 we discuss the reduction of simple substructures. Then in section 3 we discuss IBP identities and the way we manipulate them. Section 4 presents all topologies that are considered to be irreducible by lacking simple substructures. These involve the master integrals and a few more topologies that cannot be resolved in a single reduction step. In section 5 the framework of the program and its derivation are described. The usage of the program is discussed in section 6 and in section 7 we describe how to transform physical diagrams to input for Forcer. In section 8 we show examples and we study the performance. Finally, section 9 presents the conclusions and applicability. Some technical details are discussed in the appendices.

## 2 Reducing known substructures

For topologies with three specific substructures, there is no need to solve IBP equations, as efficient solutions are known. They are shown in figure 1. On the left we have a one-loop subgraph. If both propagators are massless, indicated by green (triple) lines, the loop can be integrated out and the result is a single line with a non-integer power. Reduction programs that follow other approaches do not include such an operation and consider the integral to be a master integral if it cannot be reduced by means of IBP equations. For the rule to be simple, the dot products in the numerator must only depend on one of the two lines in the loop. In appendix A an efficient method is described to integrate the one-loop subtopology.

In the middle of figure 1, we have the ‘carpet’ rule. It was first introduced in [10] and later used in Mincer [9]. In diagrams where there is a propagator connecting one external line to the other, the loop momentum associated to that line can be integrated out. We generalised the carpet rule to the four-loop case in appendix B.

The two aforementioned rules were not derived from IBP identities. The last reduction rule is the diamond rule (the right one in figure 1). This recently found generic ‘diamond’ shape is an extensible generalisation of the triangle rule [35]. It can reduce a green (triple) line or a blue (external) line. The dot products should be chosen in such a way that they do not depend on red (dashed) lines. As with the triangle rule, the diamond rule can be explicitly summed, so spurious poles can be avoided.

The combination of the above rules allows us to reduce 417 out of 438 topologies in our classification at the four-loop level (see section 5). Only 21 topologies do not have these substructures, out of which 17 contain master integrals. The other 4 can be reduced using IBP relations, but also require hand-guided solving of the IBP system. One is a four-loop diagram, two are three-loop diagrams with one line having a non-integer power, and one is a two-loop diagram with two non-integer power lines.

The integrals that require a custom reduction are discussed in sections 3 and 4. Solving IBPs for diagrams with non-integer propagator powers is discussed in subsection 3.3. The values of the master integrals are given in appendix C.

## 3 IBP identities

In an L-loop propagator graph we have L+1 independent vectors: the external vector Q and the loop momenta p_i, with i = 1,…,L. Together there are (L+1)(L+2)/2 independent variables. One of them, Q^2, can be used to set the scale. Hence there are 14 variables at the four-loop level. Because there are at most 11 propagators, the remaining 3 variables will be in the numerator and there is often quite some freedom as to which variables to choose. In topologies in which there are fewer propagators there will correspondingly be more variables in the numerator. The efficiency of the reduction depends critically on the selected numerators. In the Mincer program the numerators were chosen to be dot products (with different choices for, e.g., the ladder topology and the Benz topology). Alternatively, one could use extra squared momenta, i.e. invariants that do not correspond to propagators of the topology. The advantage of the invariant method is that when rewriting the numerators to a new basis after a line removal, more invariants of the old basis can be a part of the new basis. The advantage of using dot products is that integration of one-loop subintegrals and the use of the rule of the triangle/diamond generates fewer terms compared to using invariants. Especially the simpler structure for integrating one-loop two-point subgraphs is important, since we apply this rule as early as possible to reduce the number of loops (and thus the number of parameters). Hence we choose to use dot products for the variables in the numerator in Forcer.
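As a cross-check of this counting, a small helper can reproduce the numbers for any loop order (an illustrative Python sketch; the function name is ours):

```python
def variable_counts(L):
    """Bookkeeping for an L-loop propagator graph: L+1 independent
    vectors (Q and the loop momenta) give (L+1)(L+2)/2 dot products,
    Q^2 sets the scale, and a maximal topology has 3L-1 propagators;
    the remaining variables end up in the numerator."""
    invariants = (L + 1) * (L + 2) // 2 - 1   # minus the scale Q^2
    propagators = 3 * L - 1
    numerators = invariants - propagators
    return invariants, propagators, numerators

print(variable_counts(4))  # -> (14, 11, 3)
```

For L = 4 this gives 14 variables, of which at most 11 are propagators, matching the 14 arguments of the Z-function introduced below.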

In the reduction routines we represent the integrals by a function Z with 14 variables (for fewer than four loops there will naturally be fewer variables) in which powers of variables in the denominators are given by positive numbers and powers in the numerator by negative numbers, as is commonly used. For example:

 Z(1,1,1,1,1,1,1,1,1,1,−1,−2,−2,−2) (1)

is a four-loop integral with four dot products. One dot product has power one, and the other three have power two. Each of the 10 denominators has power 1. Note that all information about the topology or the choice of dot products is erased in this notation. Once the IBP relations are constructed, such information should be kept by different means. We note that some indices may be associated with propagators that have non-integer powers if insertions are involved (see section 3.3).

We define the integral in which all denominators have power one (possibly with an extra multiple of ϵ) and all numerators have power zero to have complexity zero. Each extra power of a denominator or of a numerator increases the complexity by one. When we construct the IBPs parametrically, the variables are represented by parameters, of which at least three represent numerators. Now we define the integral with just these parameters as arguments to have complexity zero; again, raising the value of a denominator by one, or subtracting one from a numerator, raises the complexity by one. To improve readability, we represent denominators by parameters n (n1, n2, …) and numerators by parameters k (k1, k2, …) in some examples.
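This counting can be made explicit with a small helper (an illustrative Python sketch; the list representation is our own, not Forcer's):

```python
def complexity(indices, n_denominators):
    """Complexity of an integral in the Z-notation: denominators start
    at power 1, numerators start at power 0 and are stored as negative
    numbers; every extra power adds one unit."""
    c = 0
    for i, power in enumerate(indices):
        if i < n_denominators:
            c += power - 1   # raised denominator
        else:
            c += -power      # raised numerator (stored negative)
    return c

# the integral of eq. (1): 10 denominators at power 1, numerator
# powers -1, -2, -2, -2 give complexity 1 + 2 + 2 + 2 = 7
print(complexity([1]*10 + [-1, -2, -2, -2], 10))  # -> 7
```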

We redefine Z by adding minus the complexity as the first argument (we use minus the complexity, so that Form prints the integrals with the highest complexity first). For example:

 Z(−3;2,1,1,1,1,1,1,1,1,1,0,−1,−1), (2)
 Z(−1;n1,n2,n3+1,n4,n5,n6,n7,n8,n9,n10,n11−1,k12+1,k13−1,k14−1). (3)

An example of a well-known solved parametric IBP system is the triangle rule for the triangle subtopology displayed in figure 2. We will show the triangle rule below in our notation. We choose n3 as the line from the triangle that can be reduced to 0, and n4 and n5 as the dot products with the momentum of that line:

 Z(0;n1,n2,n3,n4,n5,…,k1,…,kl,…) = 1/(D+k1+…+kl−n1−n2−2n3) [
  +n1 Z(0;n1+1,n2,n3−1,n4,n5,…,k1,…,kl,…)
  −n1 Z(0;n1+1,n2,n3,n4−1,n5,…,k1,…,kl,…)
  +n2 Z(0;n1,n2+1,n3−1,n4,n5,…,k1,…,kl,…)
  −n2 Z(0;n1,n2+1,n3,n4,n5−1,…,k1,…,kl,…) ]. (4)

Repeated application of the rule will bring one of n3, n4, or n5 to 0. The advantage of the triangle rule is that it generates few terms and only depends on a local substructure. It is however not complexity reducing: for each decreased propagator power, another is always raised.
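To see the mechanics of rule (4) in action, the following sketch applies it recursively until every term has lost one of the lines n3, n4, n5. The tuple representation and the numeric value chosen for D are illustrative assumptions, not the Forcer implementation:

```python
from fractions import Fraction

def triangle_reduce(indices, k_sum, D):
    """Repeatedly apply the triangle rule of eq. (4) to Z(n1..n5,...)
    until every term has n3, n4, or n5 equal to zero.
    `indices` = (n1..n5); `k_sum` = k1+...+kl (numerators are
    untouched by the rule, so their sum is a constant).
    Returns a dict mapping final index tuples to rational coefficients."""
    result = {}

    def recurse(ns, coeff):
        n1, n2, n3, n4, n5 = ns
        if n3 == 0 or n4 == 0 or n5 == 0:      # a line is gone: done
            result[ns] = result.get(ns, 0) + coeff
            return
        pref = coeff / (D + k_sum - n1 - n2 - 2 * n3)
        recurse((n1 + 1, n2, n3 - 1, n4, n5),  pref * n1)
        recurse((n1 + 1, n2, n3, n4 - 1, n5), -pref * n1)
        recurse((n1, n2 + 1, n3 - 1, n4, n5),  pref * n2)
        recurse((n1, n2 + 1, n3, n4, n5 - 1), -pref * n2)

    recurse(tuple(indices), Fraction(1))
    return result

# generic rational value for D, only for illustration
terms = triangle_reduce((1, 1, 1, 1, 1), 0, Fraction(9, 2))
assert all(0 in t[2:5] for t in terms)  # every term lost a line
```

With all indices at 1 the rule terminates after a single application; for higher powers the recursion keeps raising n1 or n2 while lowering n3, n4, or n5, exactly as described above.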

In general, the goal is to construct a rule under which the basic complexity 0 integral is expressed in terms of other complexity 0 integrals or in terms of integrals with negative complexity. In the next section we present several heuristics to find such rules.

### 3.1 Solving parametric IBP identities

For four-loop diagrams there are at first instance 20 unique IBP relations, formed from the operation (∂/∂p_i)·p_j, where p_i is one of the four loop momenta and p_j is one of the four loop momenta or the external momentum Q. This set of equations can often be simplified by a Gaussian elimination of the more complex integrals. We call the simplified set of equations S0. The most complex terms in S0 have complexity 2 and have one raised denominator and one raised irreducible numerator. This is a direct consequence of the IBP structure.

In the set S0 one can distinguish several types of reduction identities. The nicest identities are the ones that lower the complexity, sometimes even by more than one unit. An example is

 0= Z(−2;…,n+1,…,k−1,…)⋅n + Z(0;…,n,…,k,…)+…, (5)

where both a propagator and a numerator are raised in the complexity 2 term. By shifting n and k, we find the reduction rule:

 Z(0;…,n,…,k,…) = −1/(n−1) [Z(2;…,n−1,…,k+1,…)+…]. (6)

Such identities are used for the simultaneous reduction of two variables. Since the equation will vanish once n = 1, or ‘overshoot’ when k = 0, it can only be used to speed up a reduction. Consequently, rules for the individual reduction of n and k are still required.

In what follows we will omit the last step of shifting the equation such that the highest complexity term becomes complexity 0. We also omit the coefficients of the functions when they are deemed irrelevant, and we do not consider integrals with missing lines to be Z-integrals.

We now study relations that raise only one index:

 0= Z(−1;…,n+1,…)⋅n + Z(0;…)+…. (7)

Repeated application of this relation will either take the variable n down to one, or eventually create integrals in which one or more of the other lines are missing. For non-master topologies, at some point we find an equation that looks like

 0= Z(−1;…,n+1,…)⋅(ϵ+…) + Z(0;…)+…. (8)

This equation has an ϵ in the coefficient, which means it does not vanish if n = 1. As a result, it can be used to reduce n to 0.

If after the Gaussian elimination there is an equation in which the highest complexity is zero and in which the integral for which none of the parameters has been raised or lowered is present, there is a good chance that one can eliminate at least one line in that topology by repeated application of this identity, provided that there are no lines with a non-integer power. One example of such an equation is the rule of the triangle. The finding of more such equations while investigating the IBP systems of five-loop propagator diagrams led to the discovery of the diamond rule [35].

### 3.2 Reduction rules beyond S0

Even though the triangle and diamond rules can be derived from equations in the set S0, this set generally does not contain enough equations to reduce a topology straight away. Therefore, we expand our system by taking the set S0 and constructing all equations in which either one denominator has been raised by one or one numerator has been lowered by one (which means that there is one more power of that variable, because the numerators ‘count negative’). This set is called S1, since its IBPs are constructed from a complexity one integral. In total, we now have 20 + 20·14 = 300 equations. Similarly we could construct the set S2 by raising the complexity of one of the variables in the set S1 in all possible ways, generating an additional 3920 equations. Usually S2 is not needed. In some cases we may need a set in which the complexity of one of the variables has been lowered instead, or even one in which one variable has been raised and another has been lowered.

The essence of our method is to construct the combined sets S0 and S1 and use Gaussian elimination to remove all objects of complexities 3 and 2 from the equations. The remaining equations only have objects of complexity one or lower. Out of these equations we construct an elimination scheme by defining an order of the variables, and we select for each variable an equation to eliminate it. For a denominator variable n this is ideally an equation with a single complexity-one term, in which only n has been raised and all other parameters are at their default values. Once we have such an equation we can lower n+m, with m a positive integer, in all other equations to n+1. Since we know that after this either one of the other lines will be 0 (meaning the reduction is done) or n = 1, we can assume from this point on that n = 1. Thus, in all other equations we now set n to 1, lowering the number of parameters by one. Similarly the numerator variables are worked up from k−m to k, after which the variable is given the value zero. The order of elimination and the selection of the equations is critical: one of our early carefully selected schemes resulted in a benchmark run of 53000 seconds, whereas a scheme with a different variable order and a more sophisticated combination of the equations performed the same test in 555 seconds.
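The Gaussian elimination itself is ordinary linear algebra on equations whose ‘unknowns’ are integrals. A toy sketch (the labels, equations and coefficients are made up; Forcer performs this step inside Form):

```python
from fractions import Fraction

def eliminate(equations, key):
    """Pick one equation containing `key` as the pivot and remove `key`
    from all other equations: plain Gaussian elimination over the
    rationals, as used to clear the most complex objects from the
    combined IBP sets (toy version, dicts map labels to coefficients)."""
    pivot = next(eq for eq in equations if key in eq)
    reduced = [pivot]
    for eq in equations:
        if eq is pivot:
            continue
        if key not in eq:
            reduced.append(eq)
            continue
        factor = eq[key] / pivot[key]
        new = dict(eq)
        for k, c in pivot.items():
            new[k] = new.get(k, Fraction(0)) - factor * c
        reduced.append({k: c for k, c in new.items() if c != 0})
    return reduced

# toy system: eliminate the 'hardest' object A from the second equation
eqs = [{"A": Fraction(1), "B": Fraction(1)},
       {"A": Fraction(2), "B": Fraction(1), "C": Fraction(1)}]
print(eliminate(eqs, "A")[1])  # -> {'B': Fraction(-1, 1), 'C': Fraction(1, 1)}
```

In the real system the pivots are chosen by complexity, which is why the first argument of Z carries minus the complexity: the hardest objects are eliminated first.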

Above we gave an example of a simple, useful equation. However, sometimes such equations are not available. Below we discuss several other types of equations one may encounter. One example is when there are more integrals of complexity one:

 0= Z(−1;…) + Z(−1;…) + Z(−1;…) + Z(−1;…,n+1,…)⋅n + Z(0;…)+…, (9)

where the last complexity one term can be used to eliminate the variable n (provided that all other parameters in this term have not been raised or lowered), but this goes at the cost of increasing the number of terms with the same complexity. When the scheme is not carefully selected, the number of terms in the intermediate stages may become very large and the rational polynomials could become complicated.

A convenient subclass of the type shown in eq. (9) is one that increases an index in only a single term in the equation, independent of the complexity:

 0= Z(−1;n1+1,…)⋅n1 + Z(−1;n1,…) + Z(0;n1,…) + Z(1;n1,…) + …. (10)

As a result, the equation can be used to lower the value of this variable at any level of complexity c:

 0= Z(−c;n1+1,…)⋅n1 + Z(−c;n1,…) + Z(−c+1;n1,…) + Z(−c+2;n1,…) + …. (11)

We emphasize that we apply these equations for any value of c, so also to terms in which other variables have been raised or lowered. In Form this can be done with a pattern match:

```
id Z(-c?,n1?,...,n14?) = Z(-c,n1-1,...)/(n1-1)
                       + Z(-c+1,n1-1,...)/(n1-1)
                       + Z(-c+2,n1-1,...)/(n1-1)
                       + ...;
```


These equations are convenient because, after applying them and setting the variable to 1, not a single term will remain in the other equations with a number greater than 1 in that position. We will later see why this is desirable.

The next type of equations also has more than one term at complexity one, but there is no clean reduction of a given variable:

 0= Z(−1;n1+1,n2−1,n3+1,…)⋅n1 + Z(−1;n1+1,…)⋅n1 + Z(−1;n1,…) + Z(−1;n1,…) + Z(0;…)+…. (12)

In the numerical case, one just moves the second term to the left, and either n1 will be reduced to one or n2 will eventually become zero. However, in the derivation of the scheme one needs to apply this equation inside other parametric equations and more care is called for. One should apply the equation as many times as needed until each term either has n1 in the first position (or n1−1, etc.) or has n2−1 in the second position. This means that for the integral

 Z(−1;n1+1,n2+2,…) (13)

equation (12) will have to be used up to three times. Once n1 has been set equal to one, one may end up with terms such as

 Z(−1;2,n2−1,…), (14)

which are undesirable. The solution to this problem is to deal with n2 immediately after n1. Once we can put n2 equal to one, n2−1 becomes zero and hence it is an integral with a missing line. If one waits with this reduction and does another variable first, one risks that n2 is raised, because the equation for the other variable could have a term with n2+1, and then one would end up with an integral of the type

 Z(−1;2,n2,…). (15)

This introduces either unresolved integrals or loops in the reduction scheme. It is also possible that one has reductions with two such conditions as we saw above. This requires great care in the selection of the next equation. We have not run into impossible situations at this stage.

Another case is one where the coefficient limits its application. For example

 0= Z(−1;n1+1,…)⋅(n1−n2) + Z(0;…)+… (16)

cannot be applied when n1 = n2. Such rules can be very compact and are therefore used as a special case, while the case in which n1 is equal to n2 is handled by a more general rule with less favourable properties but effectively one parameter fewer.

By far the most difficult equations are of the type

 0= Z(−1;n1+1,n2−1,n3+1,…)⋅a(n1,n2,n3) + Z(−1;n1+1,n2+1,n3−1,…)⋅b(n1,n2,n3) + Z(−1;n1+1,…) + Z(−1;n1,…) + Z(0;…)+…, (17)

where a and b are coefficient functions. We call this type a yoyo. As a recursion it will never end, because the values of n2 and n3 will keep going up and down. There are various ways to resolve this. The first is to construct a new type of recursion. This is done by applying the equation twice:

 Z(n1+1,n2,n3,...) →
  +a(n1,n2,n3) Z(n1+1,n2−1,n3+1,...)
  +b(n1,n2,n3) Z(n1+1,n2+1,n3−1,...) + ...
 →
  +a(n1,n2,n3) a(n1,n2−1,n3+1) Z(n1+1,n2−2,n3+2,...)
  +b(n1,n2,n3) b(n1,n2+1,n3−1) Z(n1+1,n2+2,n3−2,...)
  +[a(n1,n2,n3) b(n1,n2−1,n3+1) + b(n1,n2,n3) a(n1,n2+1,n3−1)] Z(n1+1,n2,n3,...) + ... (18)

By moving the third term to the left one has a new recursion with a shift of two units. This procedure can be repeated until both n2 and n3 have come down to their minimal values. The price to pay for this solution is high: fractions become enormously complicated and the number of terms could become very large.
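The bookkeeping behind this doubling can be checked mechanically. Below, a distribution over the offset of n2 (with n3 moving in the opposite direction) is pushed through a yoyo relation of the type of eq. (17) twice; the coefficient functions are arbitrary stand-ins, not the ones of any actual topology:

```python
def yoyo_step(dist, a, b):
    """One application of a yoyo relation: a term at offset m of n2
    goes to offsets m-1 (weight a) and m+1 (weight b), with n3
    shifted oppositely so that the total power stays fixed."""
    out = {}
    for m, c in dist.items():
        out[m - 1] = out.get(m - 1, 0) + c * a(m)
        out[m + 1] = out.get(m + 1, 0) + c * b(m)
    return out

# hypothetical stand-ins for the coefficient functions a(...), b(...)
a = lambda m: m + 2
b = lambda m: m + 5

twice = yoyo_step(yoyo_step({0: 1}, a, b), a, b)
# the diagonal term returns with weight a(0)*b(-1) + b(0)*a(1);
# moving it to the left gives the shift-two recursion
assert twice[0] == a(0) * b(-1) + b(0) * a(1)
```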

An improved solution is to find another equation with a similar yoyo and combine the two in such a way that one of the yoyo terms is eliminated. After this, one has a regular recursion. We call this ‘breaking the yoyo’. There is another way to break the yoyo that will be introduced below. We had to apply both methods of breaking the yoyo several times in the creation of the reduction schemes for the master topologies.

A final consideration is the structure of the coefficients of the integrals. In principle it is not very difficult to construct a reduction scheme from the available equations. The problem is that most schemes will end up with rational coefficients that take many megabytes to store because there are still quite a few variables in them. During the derivation this may cause problems with the limitations of the computer algebra system that is used (in our case Form). More importantly, the evaluation of such rational polynomials in the application of the reduction scheme to millions of integrals will render the reductions impossibly slow and hence useless for all practical purposes. Thus, if the coefficients are too large, an alternative reduction has to be found.

### 3.3 Identities for topologies with insertions

When a topology contains a line that does not have an integer power, the method of the previous section has to be slightly extended. Such cases occur either when the input diagram(s) can be written with a higher-order propagator in it, or when during the reduction a two-point function can be integrated out. If the resulting topology needs a custom reduction, we not only have to lower powers of denominators and numerators, but we also have to bring the powers of the non-integer lines to a canonical value, which we take to be 1+mϵ for some positive integer m. As an example, we consider the two-loop t1star05 topology (see also refs. [8, 9])

[Figure: the two-loop t1star05 topology, with external momentum Q; the single cross on the central line marks the insertion.]

This topology has an ϵ in index 5, indicated by a single cross. We call such a cross an insertion. We have the relation:

 Zt1star05(n1,n2,n3,n4,n5) = Zt1(n1,n2,n3,n4,n5+ϵ), (19)

where the topology t1 is the same two-loop topology but without any implicit non-integer powers. Since the ϵ can never be removed from index 5 during the reduction, we suppress it in our notation for t1star05. The IBPs for t1star05 are generated from those of t1 by the substitution n5 → n5+ϵ. Typically, one tries to first reduce the integer indices to 1. During these reductions, the contribution to the integral complexity from a shift m5 of the fifth index could be taken as the absolute value of the shift, to penalize any change, or be ignored, to allow any change:

 Complexity(n5+m5) = |m5|, or Complexity(n5+m5) = 0. (20)

After all other indices are 1, we reduce the remaining index n5 to 1; its value may be positive or negative at this point. To derive a rule for the positive case, the complexity of the shift can be defined as usual for a propagator:

 Complexity(n5+m5) = m5, for a rule with n5 > 1. (21)

On the other hand, for the negative case, the complexity of the shift can be defined as usual for a numerator:

 Complexity(n5+m5) = −m5, for a rule with n5 < 1. (22)

In this way, all integrals belonging to t1star05 can be reduced to the master integral and integrals with simpler topologies.
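The three complexity conventions of eqs. (20)-(22) can be collected in one small helper (the mode names are our own; this merely restates the definitions above):

```python
def insertion_complexity(m5, mode):
    """Complexity assigned to a shift m5 of the non-integer index n5,
    following eqs. (20)-(22); `mode` selects the reduction stage."""
    if mode == "penalize":     # eq. (20), first option
        return abs(m5)
    if mode == "ignore":       # eq. (20), second option
        return 0
    if mode == "propagator":   # eq. (21): rule for n5 > 1
        return m5
    if mode == "numerator":    # eq. (22): rule for n5 < 1
        return -m5
    raise ValueError(f"unknown mode: {mode}")

assert insertion_complexity(3, "propagator") == 3
assert insertion_complexity(-2, "numerator") == 2
```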

### 3.4 Solving strategy

The heuristics for ‘solving’ a topology can now be outlined.

1. Select a numerator basis. The quality of the IBPs will depend on this choice.

2. Construct the IBP identities.

3. (Important) Use a type of Gaussian elimination to simplify the IBP identities, minimizing the number of terms with the highest complexity. We call this set S0. Most of the time this simplification can be done in an automated way. Only for the most difficult cases have we applied manual interference to obtain better results.

4. Construct the set S1 by generating all possible options of raising an index in S0. This gives terms of complexity 2 and 3. Use Gaussian elimination to eliminate all those terms. The remaining set of equations has terms of at most complexity 1.

5. (Important) Use the equations of the type of eq. (10) (applying them to any complexity and configuration as in eq. (11)) to eliminate as many complexity one terms as possible. This can simplify the following task and results in simpler formulas in the final reduction program. It also breaks up some difficult yoyos.

6. Determine an order of elimination of the variables. Often the first variables are rather obvious from the presence of simple reduction equations. Some variables may not be so obvious and one may have to experiment. The resulting programs may differ by orders of magnitude in their efficiency. Here is where either human intelligence or a cleverly written AI program can help.

7. In many cases, one cannot find a decent equation for the last variable. This can be because either the results have become extremely lengthy, or one has discarded some long equations that seemed of no further relevance. In that case, the almost complete reduction scheme is applied to the original set of equations. This will give a number of varieties of the final reduction(s). One can select the shortest one.

8. (Checking) Now apply the custom reduction scheme, with numbers substituted for the variables, and make sure that master integrals are indeed irreducible, and that the program does not get caught in loops. There may be equations remaining which only consist of integrals with missing lines. We did not take relations between those into account.

9. Combine all reductions and useful double reduction equations (equations that need at least two variables above their minimal complexity), based on the sets above or on substitutions made during the Gaussian elimination. Together this forms the reduction procedure for the given topology.

In some cases the resulting schemes were still deemed too slow and more exhaustive methods were called for. In such cases larger sets (such as S2) were also constructed and many different ways of combining the equations were tried automatically. Such programs could take much time because of the very complicated rational polynomials in the parameters of the integrals, but they eventually did result in a number of shorter reductions.

A number of Form procedures have been constructed to execute the above steps. The most laborious step is to determine a proper order for the elimination sequence, and which equation to use for each variable. Furthermore, we had a case (the bebe topology of section 4) in which there were no good reductions for two of the variables, unless we used two of the equations in the set to eliminate them with a complexity-raising operation. This also reduced the number of remaining equations to 18 and hence left fewer options during the remaining parts of the derivation.

There are two major reasons why some reduction rules perform faster than others. The first reason is that even though a rule may have only one term, it could be that the sub-leading terms increase the value of a variable that was set to 1 in one of the early steps of the scheme (see eq. (14)). This forces the program back to an earlier reduction rule of the scheme, even though now at a lower complexity. The second reason is coefficient growth: if a rule has a particularly complicated overall coefficient, it multiplies every term in the RHS and all subsequent terms will have rather lengthy rational polynomials in ϵ. Expanding in ϵ (see appendix D) can alleviate some of these problems, provided one expands deep enough to avoid issues with spurious poles.

Determining the order of elimination seems suited for AI techniques, such as Monte Carlo Tree Search (see e.g. [36]). One could use the number of top complexity terms, the number of lower complexity terms, the number of spectators and the size of the most complicated rational polynomial as parameters for an evaluation function for a given scheme, and then use this in an MCTS to find an optimal scheme. This is currently under investigation. It should be noted that this type of use of AI for precisely this purpose was already hinted at in ref. [37].

## 4 The topologies that need custom reductions

Because we integrate over one-loop two-point functions, our classification of the master integrals differs from that of refs. [38, 39]. In general, we do not consider any diagram that factorizes to be a master topology. The master diagrams that contain one-loop two-point functions that cannot be factorized will have slightly different values, since we integrate out the bubble. The full list of master integrals, with their values in our convention, is given in appendix C.

There are eight four-loop master integrals, excluding the diagrams in which a 2-point function can be integrated out. For these master integrals we have to design a custom scheme in which the parameters are reduced, one by one, to the value they have in the master integral. In addition (and perhaps surprisingly) there are four non-master topologies that need such a custom reduction: only when all but a few parameters are set to 1 do we find a relation that reduces an edge to 0. In this category there is one topology at the four-loop level, two at the three-loop level (with one non-integer edge) and one at the two-loop level (with two non-integer edges). In total we need 21 custom reduction schemes. All other topologies can be dealt with using generic formulas that can either eliminate a line or integrate out a loop (see section 2). All topologies that need a custom reduction scheme are listed in table 1.

In order to choose the best reduction schemes for the topologies in table 1, we measure the performance of a complete calculation of the integrals with all indices raised by 1 (a complexity-14 integral at four loops). By performing a complete calculation, we confirm that the number of terms with a simpler topology created by the reduction rules does not cause bottlenecks. Additionally, we confirm that for the case where all indices are raised by 2 (a complexity-28 integral at four loops), the reduction still performs well.

We note that the ordering of variables in the reduction scheme is not the only relevant parameter. The choice of numerators can influence the presence of non-leading terms, which after the Gaussian elimination become leading terms. Such terms can spoil the efficiency of certain reduction rules. In particular, the three complicated topologies nono, bebe, and no2 are sensitive to the choice of dot products.

Most schemes could be derived using the heuristics introduced in section 3, by selecting the reduction variable that corresponds to the shortest reduction rule. However, a few derivations need more care. For nono, one needs to avoid a circular path in a special way. The formulas for the last two variables can only be obtained by reusing the original set of equations. At this point one either uses combinations of nearly all equations to obtain very lengthy formulas, or one uses a relatively short formula with a term that sends the reduction back to a previous rule. This would normally introduce a loop, but by sending only this term through the unfinished scheme and combining the result with the remaining part of the formula, we obtain a compact reduction formula (39 terms).

The bebe reduction is more complicated, as it does not yield a regular reduction for two of the variables. However, in the original set there are equations that can be used for their reduction, provided we are willing to raise the complexity. This does not agree with the automatic nature of our derivation tools, and hence some of the work needs to be guided by hand. Furthermore, we can no longer use a number of equations from the set for generating reduction rules for other variables. As a consequence, we are left with far fewer equations after the Gaussian eliminations, although their number is still sufficient for the next 11 variables. Eventually the last variable has to be obtained again from the original set.

For the construction of a reduction scheme the bubu topology is by far the most complicated, even though it is not a master topology. There are five different numerators and the elimination of the last numerator needs to be split into several cases, each with a rather lengthy formula involving complicated rational polynomials. In order to prevent a blow-up of terms, the order of elimination of the variables is critical, as well as using the equations obtained during the Gaussian elimination that give a direct reduction of the complexity. It took more than two months to find a first suitable reduction scheme.

We use the original set and the equations that come from the Gaussian elimination before we start with the 14 reduction identities of the complete schemes. This speeds up the reduction enormously, because these equations are usually much more compact and often reduce the complexity immediately. It turns out that the final result is very sensitive to how we use these equations: sometimes there are options when there is more than one term with the highest complexity, and the order in which the equations are applied is also relevant. Additionally, one has to be careful with this ordering to avoid loops in the reduction. Unfortunately, it is not always possible to indicate which ordering is optimal, because some orderings may yield a faster scheme at the cost of more spectator terms and/or higher powers of ε in the rational polynomials.

Considering the amount of work involved in deriving the schemes, it is not excluded that better schemes will be found. This seems to be a good candidate for the application of automated AI techniques.

## 5 The Forcer framework

In essence, the Forcer program provides a method to reduce each topology to simpler ones. There is quite some freedom: sometimes multiple reduction rules can be applied, sometimes it is best to use a different set of independent momenta, etc. In order to obtain the best performance, all decisions in the Forcer program are precomputed by a script: for each topology the action is hard-coded and the momentum rewrites are known. The advantage of this method is that costly optimizations, such as choosing an optimal basis for each topology, have no runtime cost.

### 5.1 Reduction graph generation

Before going into details, we first give an overview of the program. The program structure can be viewed as a directed acyclic graph (DAG), the reduction graph, where the nodes are topologies and each edge indicates a transition from one topology to another when a propagator is removed. As a result, each node may have more than one parent. The root nodes of the reduction graph are the top-level topologies, which are topologies that only have three-point vertices. All tadpole topologies will be zero, so they are not included in the graph. To reduce the number of topologies, propagators with the same momentum are always merged.

Each node represents a topology, which consists of a graph with a certain fixed labelling of all the propagators, including momentum directions, and a fixed set of irreducible numerators. Each topology also has an action that determines how it can be reduced. The actions are, in order of preference: integration of a two-point function, the carpet rule, the triangle/diamond rule, or a custom reduction. Each topology contains transitions to other topologies for all removable edges (edges with integer power). Even though the specific rule may not be able to nullify any propagator in the graph, the dot product rewrites may, so we generate all possible transitions. If lines are missing, in most cases the topology action is not executed and the topology is automatically rewritten to another one. The exception is the integration of insertions: since insertions are guaranteed to reduce the number of loops, which simplifies the dot product basis, first rewriting the dot products to a new topology would be wasteful.

In figure 3 the reduction graph is displayed for three-loop massless propagator graphs. The names of the topologies are automatically generated. Every arrow denotes a transition that occurs when a propagator is removed. Multiple arrows can point to the same node if the resulting diagrams are isomorphic. An example of this is t1star5 (same as t1star05 in section 4), where removing any of the four outside lines results in the same topology. The central line cannot be removed, since it has a non-integer power. The four-loop reduction graph, with over 400 nodes, is far too large to display.

The reduction graph is generated from the top-level topologies down. For every topology, a new one is generated in which a particular line is missing. For this new topology, we determine its action. Next, we generate a dot product basis that is compatible with the action; e.g., for the insertion rule all dot products should involve at most one of the two momenta. We also determine its automorphisms (graph symmetries), so that we can map every instance of a topology to the same unique form (we will go into more detail about this in the next section).

The dot product basis is chosen according to the following three rules: 1) it must be compatible with the action; 2) it should minimize the number of terms created when rewriting from the parent topology, where as a criterion we use the sum of the squares of the number of terms created in rewriting each dot product; 3) the dot products should be chosen in line with the symmetries of the topology.
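The second criterion can be sketched in a few lines of Python. The rewrite tables and candidate bases below are illustrative, not actual Forcer data:

```python
# Score a candidate basis by the sum of the squares of the number of terms
# produced when rewriting each dot product. Rewrite rules are modelled as
# {dot_product: list_of_terms}; all names are illustrative.
def basis_cost(rewrites):
    return sum(len(terms) ** 2 for terms in rewrites.values())

# Two hypothetical candidate bases for the same topology:
candidate_a = {"p1.p2": ["i1/2", "i2/2", "-i9/2"],
               "Q.p1":  ["Q^2/2", "i1/2", "-i8/2"]}
candidate_b = {"p1.p2": ["i1/2", "i2/2", "-i9/2", "i3", "-i4"],
               "Q.p1":  ["Q^2/2"]}

# the basis with the smaller cost is preferred
best = min([candidate_a, candidate_b], key=basis_cost)
```

Squaring penalizes a single expensive rewrite more than several cheap ones, which matches the goal of avoiding term blow-up in any one substitution.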

We summarize the generation of the reduction graph in algorithm 1:

Input : top-level topologies
Output : reduction tree

foreach top-level topology do
      determine action and automorphisms;
while there are unprocessed topologies do
      pop a topology;
      foreach propagator of that topology do
            construct the new topology without this propagator;
            if it is isomorphic to a known topology then
                  construct the mapping to that topology;
            else
                  determine its action;
                  generate its dot product basis;
                  generate its automorphisms;
                  generate the mapping of dot products from the parent;
                  add it to the reduction tree;
                  mark it as unprocessed;
            end if
      end foreach
end while

Algorithm 1: Reduction graph generation.

The reduction graph is generated with a Python script, using igraph [40] for a basic graph representation of the topologies and for the isomorphism algorithm. Since the isomorphism algorithm by default supports only simple graphs (without self-edges and duplicate edges), we merge all duplicate edges and use a custom function to determine whether the topologies are truly isomorphic (one could view duplicate edges with possible insertions as a special edge colouring). This function enforces that the number of duplicate edges is the same, and that the distribution of insertions over duplicate edges is the same. Additionally, we generate all possible permutations over similar duplicate edges, to generate the edge isomorphisms. The reduction graph contains 438 topologies, and its implementation takes a substantial amount of automatically generated Form code.
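A simplified, self-contained version of this custom check can be sketched as follows (the real program runs it on top of igraph's isomorphism test and additionally permutes similar duplicate edges; the representation here is illustrative):

```python
# After merging duplicate edges, two topologies can only be isomorphic if the
# multiset of (edge multiplicity, total insertions) pairs agrees. This is a
# necessary condition, checked here on illustrative edge lists.
from collections import Counter

def edge_profile(edges, insertions):
    """edges: list of (u, v) endpoint pairs (duplicates allowed);
    insertions: {edge_index: number_of_insertions}."""
    mult = Counter(tuple(sorted(e)) for e in edges)
    ins = Counter()
    for i, e in enumerate(edges):
        ins[tuple(sorted(e))] += insertions.get(i, 0)
    # multiset of (multiplicity, total insertions) over the merged edges
    return Counter((mult[e], ins[e]) for e in mult)

def insertion_compatible(edges1, ins1, edges2, ins2):
    return edge_profile(edges1, ins1) == edge_profile(edges2, ins2)
```

For example, a double edge carrying one insertion matches another double edge carrying one insertion, but not a topology where the insertion sits on a single edge instead.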

### 5.2 Reduction graph execution

So far, we have discussed the generation of the reduction graph. Now we consider how the graph is processed in runtime.

As input, we have integrals that are labelled by the name of their topology in a symbol. In contrast to Mincer, the input expressions can contain multiple topologies. In Forcer, every topology is put in a separate expression and is hidden. The topologies are processed one by one, in order of the number of edges. When a topology is treated, the expression is unhidden, the integrals are symmetrized using automorphisms, the topology action is executed, and finally, the resulting integrals are rewritten to their new topologies. The topologies in the output are either master integrals, which require no further reduction, or topologies with fewer lines. These terms are merged into the designated expression for each topology. All the master integrals are stored in their own expression.

After rewriting the dot products, multiple edges may have vanished. Some of the remaining integrals may have become massless tadpoles, which are zero in dimensional regularisation. A table maps the topology and a list of missing edges to zero if the resulting topology is a tadpole.

The execution of the reduction graph is summarized in algorithm 2:

Input : input integrals
Input : reduction graph

convert the input integrals to Forcer topologies;
foreach topology do
      put its terms in their own expression and deactivate it;
foreach number of edges, from the maximum down to the minimum do
      foreach topology with that number of edges do
            activate the expression with this topology;
            symmetrize the terms (apply automorphisms);
            perform the reduction operation (triangle, carpet, etc.);
            rewrite the result with missing lines to Forcer topologies;
            move the terms of each resulting topology to its designated expression;
      end foreach
end foreach

Algorithm 2: Reduction graph execution.

### 5.3 Example

Below we give an example of the treatment of a topology. The topology is depicted in figure 4, and is internally called d366.

In the input, the integral is represented by a compact notation in terms of symbols only:

    Md366/i1/i2/i3/i4^2/i5/i6/i7/i8/i9*i10*i11*i13;


where Md366 is the marker of the topology and the powers of the symbols i represent the propagator and numerator powers. In this example the power of the fourth propagator is raised by one and the numerators i10, i11, and i13 are present. Since all rules are precomputed, the information of the topology, such as the vertex structure, momentum flow, non-integer powers of lines, and which dot products are in the basis, is never stored in the terms that are processed. Instead, the topology marker Md366 is used to call the correct routines.

When treating topology d366, we first apply symmetries to make sure that similar configurations of d366 are merged. We use the automorphisms of the graph, of which there are four. However, since there may be dot products in these momenta, a symmetry may be broken unless the set of dot products maps into itself: as the code below shows, two of the four symmetries can only be applied when the dot products represented by i12, i13, and i14 are absent, and the other two when i12 is absent.

To find the smallest isomorphic representation, we hash the powers of the indices i and take the smallest hash. In code we have:

if (match(Md366*<1/i1^n1?$n1>*...*<1/i14^n14?$n14>));
  if (($n12==0)&&($n13==0)&&($n14==0));
    #call hash(0,$n14,$n13,$n12,$n11,$n10,$n9,$n8,$n7,$n6,$n5,$n4,$n3,$n1,$n2)
    #call hash(1,$n14,$n13,$n12,$n10,$n11,$n9,$n5,$n3,$n4,$n8,$n6,$n7,$n1,$n2)
  endif;
  if (($n12==0));
    #call hash(2,$n14,$n13,$n12,$n11,$n10,$n9,$n8,$n7,$n6,$n5,$n4,$n3,$n2,$n1)
    #call hash(3,$n13,$n14,$n12,$n10,$n11,$n9,$n5,$n3,$n4,$n8,$n6,$n7,$n2,$n1)
  endif;
* stores best hash in $bestiso
  #call smallesthash(0,1,2,3)
  if ($bestiso == 0);
    Multiply replace_(i1,i2,i2,i1);
  elseif ($bestiso == 1);
    Multiply sign_($n10+$n11+$n13+$n14)
      *replace_(i1,i2,i2,i1,i3,i7,i4,i6,i5,i8,i6,i4,i7,i3,i8,i5,i10,i11,i11,i10);
  elseif ($bestiso == 3);
    Multiply sign_($n10+$n11+$n13+$n14)
      *replace_(i3,i7,i4,i6,i5,i8,i6,i4,i7,i3,i8,i5,i10,i11,i11,i10,i13,i14,i14,i13);
  endif;
endif;
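Outside Form, the idea of this symmetrization can be sketched as follows: each applicable automorphism permutes the tuple of powers (possibly contributing a sign), and the lexicographically smallest image is kept as the canonical form. The permutations below are illustrative, not the actual d366 ones:

```python
# Canonicalize an integral by trying all automorphisms on its power tuple and
# keeping the smallest image. Permutations and parities are illustrative.
def canonical(powers, automorphisms):
    """powers: tuple of integer powers; automorphisms: list of (perm, parity),
    where position i of the image is taken from position perm[i]."""
    best = (tuple(powers), 1)
    for perm, parity in automorphisms:
        image = tuple(powers[p] for p in perm)
        if image < best[0]:
            best = (image, parity)
    return best

# identity and a swap of the first two indices (hypothetical symmetry group):
autos = [(tuple(range(4)), 1), ((1, 0, 2, 3), 1)]
```

In the Form implementation the comparison is done via hashes of the power tuples, and the sign_ factor plays the role of the parity.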


The action performed on d366 is the integration of the left bubble. As can be seen in figure 4, all relevant dot products are written in terms of only one of the two bubble momenta, in alignment with the insertion rule. The dot products that involve the momentum that is integrated out can all be re-expressed in terms of inverse propagators after integrating the insertion. The two remaining dot products (represented by i10 and i11) have to be rewritten to the new topology.

The new topology is called d118:

[Figure: topology d118, with numerators n8 = Q⋅p4 and n9 = Q⋅p7.]

where we have suppressed the power of the external line.

Below is the mapping from d366 to d118, which includes rewriting the old dot products:

Multiply replace_(i3,j2,i4,j1,i5,j7,i6,j6,i7,j3,i8,j4,i9,j5);
id i10 =  Q^2/2+j2/2-j3/2-j9;
id i11 = -Q^2/2+j2/2-j3/2+j8;
Multiply replace_(Md366,Md118,<j1,i1>,...,<j7,i7>,j8,-i8,j9,i9);


## 6 Usage

The Forcer program can be downloaded from https://github.com/benruijl/forcer. Currently, the latest development version of Form is required, which can be obtained from https://github.com/vermaseren/form. The generation scripts require Python 2.7 or Python 3, as well as igraph [40], numpy [41] and sympy [42].

An example of Forcer input is the following program:

#-
#include forcer.h

L F =
+1/<p1.p1>/.../<p6.p6>*Q.p3*Q.p4*vx(Q,p1,p5,p6)*vx(-p1,p2,p3)
*vx(-p5,-p6,p4)*vx(-Q,-p2,-p3,-p4)
+1/<p1.p1>/.../<p5.p5>*vx(-Q,p2,p3)*vx(p1,-p2,p5)*vx(-p1,p4,Q)
*vx(-p3,-p4,-p5)*ex(p1,p4)
;

#call Forcer(msbarexpand=4)
B ep;
P +s;
.end


After forcer.h is included, the input integral can be defined. This is done by specifying the vertex structure using vx. The external momentum should be called Q. The propagators and momenta can simply be multiplied in, as shown in the example above. Insertions on lines can be specified using the ex function. In the second integral above, ex(p1,p4) means that there is a single insertion on the propagator associated with momentum p1, and one on p4. The topologies will automatically be matched to Forcer’s internal topologies. The dot products will also automatically be rewritten (see subsection 7.4).

By calling Forcer, the integrals are computed. The optional argument msbarexpand gives the (unrenormalized) answer expanded in ε. Otherwise, the result will be given exactly in terms of the master integrals and rational coefficients (see appendix C). Other options include polyratfunexpand=div and polyratfunexpand=maxpow, which enable the expansion of rational coefficients in ε at intermediate steps using the Form statement PolyRatFun (see appendix D).

Alternatively, one could execute each of the three transition steps of Forcer individually: first, the topologies are matched. Next, the momentum substitution to the Forcer topology basis for a given topology ‘TOPO’ is performed. Finally, the reduction is executed. The steps are sketched below:

#call loadTopologies()
#call matchTopologies(1)

#call momsubs(‘TOPO’,1,1)

#call DoForcer()


## 7 From physical diagrams to Forcer

The interface provided in the previous section expects scalar integrals as input. In order to compute Feynman diagrams, process-specific preprocessing has to be performed. Below we discuss some general optimizations that could be applied there. Since the actual implementation is highly dependent on conventions, we will only sketch certain parts.

The program Qgraf [43] provides a convenient way to generate the Feynman graphs that are needed for the actual calculations, because it can generate Form compatible output. However, the challenge remains of converting the diagrams as presented by Qgraf to something that the Forcer program can deal with. This involves mapping the topology and momenta of the diagrams to Forcer’s internal notation. For this purpose, the Python program that generates the reduction graph also generates a file called notation.h which contains a specification of all topologies in such a way that a conversion program can use it for

1. topology recognition,

2. labelling the momenta and their directions for each line,

3. using symmetry transformations.

Each topology is represented by a term in Form notation. Two typical terms are

  +vx(-Q,p4,p5)
*vx(p3,-p4,p11)
*vx(p6,p7,p10)
*vx(p2,-p3,-p10)
*vx(p1,-p2,p9)
*vx(-p5,-p6,-p9)
*vx(-p7,p8,-p11)
*vx(-p1,-p8,Q)
*SYM()
*SYM(Q,-Q,p1,-p5,p2,p6,p3,-p7,p4,-p8,p5,-p1,p6,p2,p7,-p3,p8,-p4
,p9,-p9,p10,-p10,p11,-p11)
*SYM(Q,-Q,p1,-p4,p2,-p3,p3,-p2,p4,-p1,p5,-p8,p6,p7,p7,p6,p8,-p5
,p9,p11,p11,p9)
*SYM(p1,p8,p2,p7,p3,-p6,p4,p5,p5,p4,p6,-p3,p7,p2,p8,p1,p9,-p11
,p10,-p10,p11,-p9)
*TOPO(Mno2)

+vx(-Q,p3,p4)
*vx(p2,-p3,p7)
*vx(p1,-p2,p6)
*vx(-p1,p5,Q)
*vx(-p4,-p5,-p6,-p7)
*ex(p2)
*SYM()
*SYM(Q,-Q,p1,-p3,p2,-p2,p3,-p1,p4,p5,p5,p4,p6,p7,p7,p6)
*TOPO(Mfastar2)


The first term indicates the no2 topology. The function vx indicates the vertices and the momenta belonging to that vertex. Negative momenta are incoming. The function TOPO has a symbol as an argument that indicates the topology. In the Forcer program terms that are in the notation of a given topology are labelled with one power of the corresponding symbol. The function SYM describes a symmetry operation of the topology. The Form statement

    id,once,SYM(?a) = replace_(?a);


will execute such an operation. In practice one could use it in the following way:

    id  vx(?a) = f1(vx(?a));
repeat id f1(x1?)*f1(x2?) = f1(x1*x2);
repeat id SYM(?a)*f1(x?) = f1(x)*f2(x*replace_(?a));
id  f1(x1?)*f2(x2?) = x2;
id  f2(x?) = 1;


This process makes, for each occurrence of the function SYM, a copy of the contents of the function f1 to which the corresponding symmetry operation has been applied. Because the normal ordering algorithm of Form puts the smallest of the functions f2 first, we end up with the smallest representation of the term. If this is applied at a later stage in the program, more statements may be needed, because there may be more objects than vx.

The notation file includes more topologies than actually exist in the Forcer reduction graph, since physical diagrams can have duplicate momenta. In such a case, the term in the notation file also contains a function ID, for example ID(p4,-p5), indicating that p4 and -p5 are actually the same momentum. After the topology is matched and the labelling is done, the ID function can be applied:

id ID(p1?,p2?) = replace_(p1,p2);


The first step in determining the topology of a diagram is to read the notation.h file, number its topologies, and store each of them in a dollar variable whose name is labelled by this number. We also store the names of the topologies in such an array of dollar variables. The topology of a diagram can now be determined by trying whether one of the topologies can be substituted in the term. If this pattern matching involves wildcards, and the matches of the wildcards are stored in dollar variables, we can use them to relabel the diagram itself and bring it to the notation of the topology. The main problem is creating the match structure, since we need wildcards for all the momenta, each followed by the name of a dollar variable. This issue is resolved with the dictionary feature of Form. The essential part of the code is:

    #OpenDictionary wildmom
    #do i = 1,‘$MAXPROPS’
      #add p‘i’: "p‘i’?$p‘i’"
    #enddo
    #CloseDictionary

    #do i = 1,‘$MAXPROPS’
      $p‘i’ = p‘i’;
    #enddo

    #UseDictionary wildmom($)
    #do i = 1,‘$numtopo’
      if ( match(‘$topo‘i’’) );
        $toponum = ‘i’;
        goto caught;
      endif;
    #enddo
    #CloseDictionary
    label caught;

    Multiply replace_(Q,Q,<$p1,p1>,...,<$p‘$MAXPROPS’,p‘$MAXPROPS’>)*topo($toponum);

When we try to match, the printing of the ‘$topo‘i’’ variable will result in objects like vx(p1?$p1,p2?$p2,p3?$p3)*... rather than the vx(p1,p2,p3)*... that it actually contains. This way the $-variables get the value of the momenta in the diagram for which we want to determine the topology and the notation. The final replace_ substitutes these momenta by the value they have in the topology file.

It is possible to speed up the process considerably by hashing the topologies by the number of vertices and by first stripping the signs of the momenta. These signs can be recovered in a later step.
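A sketch of this bucketing, with illustrative topology records (the vertex counts of Mno2 and Mfastar2 match the terms shown earlier; the rest is hypothetical):

```python
# Bucket the notation-file topologies by their number of vertices, so that a
# diagram is only matched against candidates of the same size.
from collections import defaultdict

def bucket_by_vertices(topologies):
    """topologies: list of (name, number_of_vertices, pattern) records."""
    buckets = defaultdict(list)
    for name, nv, pattern in topologies:
        buckets[nv].append((name, pattern))
    return buckets

# illustrative records; the match patterns are elided
topos = [("Mno2", 8, "..."), ("Mfastar2", 5, "..."), ("Md366", 8, "...")]
buckets = bucket_by_vertices(topos)
# a diagram with 5 vertices is now only tried against Mfastar2
```

Stripping the signs of the momenta before matching plays a similar role: it shrinks the number of pattern variants that have to be tried, and the signs are restored afterwards.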

### 7.1 Self-energy filtering

Another optimization is to filter self-energy insertions from the Qgraf output. Here we present an algorithm that can detect one-particle-reducible propagator insertions.

1. Select a representative for a one-loop propagator. A representative is a single diagram that occurs in this propagator. For the ghost and the quark propagators this is trivial, since there is only a single diagram. For the gluon we select the diagram with the ghost loop (not forgetting the minus sign).

2. In the propagators we indicate the number of loops with an extra parameter. Adjacent loop representatives are combined and their number of loops is the sum of those parameters. This means that the representative of a three-loop gluon propagator is a chain of three one-loop diagrams, each with a ghost loop.

3. Next we make a copy of all remaining vertices into a function acc. In this function we remove all vertices that have an external line.

4. In the function acc we start selecting one vertex in all possible ways.

5. If this special vertex has more than two lines, it ‘consumes’ in all possible ways one of its neighbouring vertices, removing the connecting momentum. If the same momentum connects twice to the new vertex, it is removed as well.

6. We keep doing this until either the super-vertex in one of the terms has only two lines remaining, in which case we can eliminate the whole diagram as it is part of a propagator, or until we cannot remove any more lines. If all possibilities end in the latter way, we keep the diagram.
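Steps 4-6 can be sketched as a recursive search. The graph representation below (endpoint pairs, with non-internal endpoints standing for lines to removed vertices) is an assumption for illustration; the actual implementation works on Form terms:

```python
# Detect a one-particle-reducible propagator insertion: try every internal
# vertex as the 'special' vertex and let it consume neighbours in all possible
# ways; if it ever has exactly two lines left, the diagram is part of a
# propagator. Internal QCD vertices start with at least three lines.
def has_propagator_insertion(edges, internal):
    """edges: list of (u, v) pairs; internal: set of internal vertex ids."""
    def consume(edges, sv):
        attached = [e for e in edges if sv in e]
        if len(attached) == 2:
            return True  # super-vertex with two lines: part of a propagator
        neighbours = {v for e in attached for v in e
                      if v != sv and v in internal}
        for nb in neighbours:
            merged = []
            for u, v in edges:
                if {u, v} == {sv, nb}:
                    continue  # connecting lines are removed by the merge
                merged.append((sv if u == nb else u, sv if v == nb else v))
            # a momentum now connecting the super-vertex to itself is removed
            merged = [(u, v) for u, v in merged if not (u == v == sv)]
            if consume(merged, sv):
                return True
        return False

    return any(consume(list(edges), v) for v in internal)

# a bubble on a line (double edge between 1 and 2, one dangling line each):
bubble = [(1, 2), (1, 2), (1, "x"), (2, "y")]
```

Here `"x"` and `"y"` mark lines running to vertices that were removed in step 3; the bubble is recognized as an insertion, while a single vertex with three dangling lines is not.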

Let us show this diagrammatically for a non-trivial diagram:

[Figure: the algorithm applied to a non-trivial diagram. After step 3 only the internal vertices remain; in step 4 the special vertex is selected in all possible ways; in steps 5,1 through 5,5 the super-vertex successively consumes its neighbours.]

In the example, the diagram can be eliminated at the moment the super-vertex with just two lines appears. This is at step 5,3. We did not stop at that point because we wanted to show how the other possibilities develop for diagrams that would survive.

The above algorithm can be programmed rather easily in Form with the new id,all option of the id statement. For instance step 4 is just the statement

    id,all,v(?a) = w(?a);


in which v represents the vertices and w is the super-vertex. This is followed by a symmetrization to reduce the number of different diagrams. A complete procedure that works for all types of diagrams, independent of the number of external lines or loops, contains 30 Form statements. The elimination of insertions simplifies the calculation considerably, because multi-loop gluon propagator insertions have many diagrams. This is particularly important when calculating moments of splitting and coefficient functions in DIS.

### 7.2 Colour split-off

We split each diagram into its colour part and its ‘Lorentz’ part before applying the Feynman rules. The 4-gluon vertex is split up into three terms, each with its own overall colour factor. Technically it is not required to do the split-off at this stage, but the remaining program will be considerably faster when the colour is a global factor.

To compute the colour factor we use a modified version of the color package of ref. [44], which is available on the Form pages (http://www.nikhef.nl/form). It has been observed that even when one has a very large number of diagrams, there are usually at most a few hundred different colour factors to be worked out. Hence the way to process these factors is by pulling all colour objects into a function color and then, after using colour projectors on the external lines, working out only the colour bracket:

   Normalize color;
B   color;
.sort: Prepare color;
Keep brackets;
Argument color;
#call color
#call simpli
EndArgument;


By replacing every .sort by

   #procedure SORT(text)
EndArgument;
B   color;
.sort:‘text’;
Keep Brackets;
Argument color;
#endprocedure


we guarantee that each different colour object is worked out only once.

### 7.3 Diagram database

Diagrams with the same topology and colour factor are grouped together into superdiagrams. The superdiagrams provide a convenient way to distribute the work over multiple computers. This grouping can speed up the calculation by a modest factor.

We use the minos database program, provided (with its source code) on the Form pages, to store the superdiagrams. After each superdiagram is computed, it is multiplied with its colour factors. Finally, the values of all superdiagrams are added. Only at this stage do we substitute the formulas for the insertion propagators and the master integrals. Up until the substitution of the master integrals the results are exact to all orders in ε if one uses rational polynomials in ε for the coefficients of the terms.

### 7.4 Momentum substitutions

After the Feynman rules have been applied, the integrals are in a form in which they can be converted to Forcer’s basis for the topologies. The conversion to this basis needs to be done with great care, as it is very easy to generate an extremely large number of terms. This process is split up into two components: rewriting the momenta to a momentum basis, and rewriting the dot products to Forcer’s basis.

The momentum basis should contain all the momenta of the irreducible dot products belonging to this Forcer topology. The other basis elements are obtained by an exhaustive search that tries to minimize the number of terms that will be created when rewriting to the basis. The optimization criterion is the sum of the squares of the number of terms created by all the momentum and dot product rewrites.

In order to prevent a blow-up in the number of terms, we create a layered rewrite of momenta. This layering is constructed automatically and makes the momentum rewrites order dependent:

 p9.p?!{p9,}   = +p2.p+p7.p+p11.p-Q.p;
 p5.p?!{p5,}   = -p11.p-p3.p+Q.p;
 p6.p?!{p6,}   = -p2.p+p3.p-p7.p;
 p1.p?!{p1,p4} = +Q.p-p8.p;
 p10.p?!{p10,} = +p2.p-p3.p;
 p4.p?!{p4,p1} = +p11.p+p3.p;
 p7.p?!{p7,}   = +p8.p-p11.p;

→

 p9.p?!{p9,}   = -p6.p-p5.p;
 p5.p?!{p5,}   = -p4.p+Q.p;
 p6.p?!{p6,}   = -p10.p-p7.p;
 p1.p?!{p1,p4} = +Q.p-p8.p;
 p10.p?!{p10,} = +p2.p-p3.p;
 p4.p?!{p4,p1} = +p11.p+p3.p;
 p7.p?!{p7,}   = +p8.p-p11.p;

Because some terms will merge during the momentum rewrites, the layered approach is much faster. Note that dot products will not be rewritten if they are elements of the dot product basis.
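The effect of ordered, layered rewrites can be illustrated with a toy model in which an expression is a linear combination of momenta (the common dot product with some momentum p is suppressed). The rules below are the first three of the layered list above; the expression is illustrative:

```python
# Apply ordered rewrite rules to a linear combination of momenta, represented
# as {momentum: coefficient}. Terms that cancel along the way shrink the work
# done by later rules, which is why the layered ordering pays off.
def apply_rules(expr, rules):
    """rules: ordered list of (momentum, replacement) pairs, each replacement
    again a {momentum: coefficient} map."""
    for mom, repl in rules:
        if mom in expr:
            c = expr.pop(mom)
            for m, r in repl.items():
                expr[m] = expr.get(m, 0) + c * r
    return {m: c for m, c in expr.items() if c != 0}  # drop cancelled terms

# p9 -> -p6-p5, p5 -> -p4+Q, p6 -> -p10-p7 (from the layered list)
rules = [("p9", {"p6": -1, "p5": -1}),
         ("p5", {"p4": -1, "Q": 1}),
         ("p6", {"p10": -1, "p7": -1})]

# in p9 + p5 the p5 terms cancel after the first rule fires
result = apply_rules({"p9": 1, "p5": 1}, rules)
```

In the first, unlayered list every left-hand side expands directly into the full basis, so no such intermediate cancellations can occur.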

Finally, the dot products are rewritten, straight to the internal Forcer notation:

id Q.p1  = Q^2/2+i1/2-i8/2;
id p1.p2 = i1/2+i2/2-i9/2;
id p2.p3 = -i10/2+i2/2+i3/2;
id Q.p4  = Q^2/2+i4/2-i5/2;
id p3.p4 = -i11/2+i3/2+i4/2;
id p1.p3 = -Q^2/2+i11/2+i13+i14-i4/2+i5/2-i7/2+i8/2;
id p2.p4 = -Q^2/2-i1/2+i12+i13+i5/2-i6/2+i8/2+i9/2;


We note that in the actual code there will be .sort statements between the id statements and that there are extra optimizations in place to prevent excessive term generation.
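The coefficients in these id statements all follow from the elementary identity 2 p·q = p² + q² − (p−q)². As an illustration, the following Python snippet verifies two of the rewrites by expanding the squares symbolically; the labels i1 = p1², i2 = p2², i8 = (Q−p1)² and i9 = (p1−p2)² are our assumption, chosen to be consistent with the first two id statements, not the actual Forcer tables.

```python
from fractions import Fraction

# Momenta as integer coefficient vectors over the basis (Q, p1, p2).
Q, P1, P2 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def square(v):
    """Expand v.v into monomials {(i, j): coeff} of basis dot products."""
    out = {}
    for i in range(len(v)):
        for j in range(i, len(v)):
            c = v[i] * v[j] if i == j else 2 * v[i] * v[j]
            if c:
                out[(i, j)] = out.get((i, j), 0) + c
    return out

def lincomb(terms):
    """Merge a list of (coefficient, monomial-dict) pairs."""
    out = {}
    for c, d in terms:
        for k, v in d.items():
            out[k] = out.get(k, Fraction(0)) + c * v
    return {k: v for k, v in out.items() if v}

half = Fraction(1, 2)
QQ, i1, i2 = square(Q), square(P1), square(P2)
i8, i9 = square(sub(Q, P1)), square(sub(P1, P2))

# id Q.p1  = Q^2/2 + i1/2 - i8/2  -> should leave the single monomial Q.p1
lhs1 = lincomb([(half, QQ), (half, i1), (-half, i8)])
# id p1.p2 = i1/2 + i2/2 - i9/2   -> should leave the single monomial p1.p2
lhs2 = lincomb([(half, i1), (half, i2), (-half, i9)])
```

Both left-hand sides collapse to a single dot-product monomial ((0,1) is Q·p1 and (1,2) is p1·p2), confirming the rewrites.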

## 8 Examples and performance

The Forcer program has recently been used in many large calculations. As a first demonstration of its capabilities, the four-loop QCD beta function has been recomputed [31, 32], and it agrees with refs. [45, 46].

Since Forcer can compute the finite pieces to any power of ε, the QCD propagators and three-point vertices with one nullified momentum could be computed straightforwardly [33]. The most expensive computation was the triple-gluon vertex, which took one week on a single machine with 24 cores.

The Forcer program has also been used extensively in the computation of Mellin moments of structure functions [29, 30, 47]. For the non-singlet case, up to six Mellin moments have been computed. For the simpler leading and sub-leading diagrams, up to the 40th Mellin moment was computed, which allowed for an analytic reconstruction. Even though most of these diagrams are essentially three-loop, the complexity of the integrals, as defined in section 2, is more than 80. Using the OPE method, the moments of the non-singlet splitting functions have been computed up to high N; in the large-nc limit even higher moments have been reached. The latter result has proven sufficient for reconstructing and validating their all-N form [47, 48]. From this quantity the four-loop cusp anomalous dimension in the large-nc limit has been computed, and it agrees with the calculation of the same quantity via the quark form factor [49]. The high-complexity integrals involved in these computations exceed by a large margin what current Laporta-style methods can handle. Even with Forcer, some computations can take weeks. To improve performance, the momentum basis was chosen such that it aligns with the momentum flow through the diagram. Additionally, expansions in ε were used (see appendix D).

Most recently, the Forcer program was essential in computing the five-loop beta function for Yang-Mills theory with fermions [34], which confirmed the SU(3) result of [50]. To compute the poles of five-loop diagrams, infrared rearrangement was used to rewrite any integral to a carpet integral [51]. This process generates a large number of counterterms with dot products. The complete computation took three days on a cluster.

Below we demonstrate some benchmarks of the Forcer program. We start with some specific configurations, displayed in table 2. We have chosen top-level topologies for the benchmark, since these are the most time-consuming ones. In their reduction, many other master topologies (and thus custom reductions) are encountered. The topology la4 is the four-loop ladder topology.

Next, we compute samples of configurations of a specific complexity for the top-level non-planar master integrals no1 and no2. In figure 5, we show the total wall-clock time of computing 10 and 100 samples of a given complexity at the same time, using 4 cores. We observe that even though the number of samples differs by a factor of 10, the computation time increases only by about 20%. This demonstrates that the Forcer program makes use of symmetries and grouping, which cause shared configurations deeper in the reduction process to merge. Additionally, the graph shows that the computation time scales exponentially in the complexity.

Finally, figure 6 shows the timings for computing the four-loop QCD self-energies up to a certain maximum power of the (unrenormalized) gauge parameter ξ. Here ξ = 0 corresponds to the Feynman gauge. In our setup, all techniques discussed in section 7 are applied.

The background field propagator in figure 6 can be used to obtain the beta function without computing an additional propagator and a vertex [52, 53]. Interestingly, the curve for the background-field gluon is quite similar to that for the gluon, even though one may expect the background-field propagator to be more time consuming than the gluon propagator because of extra vertices. The high performance can be understood from the fact that we are using superdiagrams: as we have seen in figure 5, an increase in the number of terms does not matter much, provided the complexities of the integrals are similar, since there are many chances for merges and cancellations of coefficients of the integrals at intermediate stages of the reduction.

Using the background field method, we are able to compute the four-loop beta function for Yang-Mills theory with fermions in less than three minutes in the Feynman gauge, and in about four hours for all powers of the gauge parameter on a single machine with 32 cores.

## 9 Conclusions and outlook

We have shown how the Forcer program has been constructed, which algorithms it uses, and demonstrated its performance. In addition, we have summarized how Forcer may be used for computing physical diagrams. The predominantly automatic construction of the program is not limited to four loops. We have run the fully automatic pieces for a five-loop program. Even though there are over 200 missing topology actions that require a custom reduction (and the master integrals have not yet been determined), about 30% of the diagrams of the gluon propagator can be computed with this five-loop Forcer equivalent. This means that if in the future the master topologies can be worked out automatically as well, a five-loop program could be constructed shortly after. The task is quite challenging: the number of parameters that have to be reduced grows from 14 at four loops to 20 at five loops.

The Forcer program has already been used for several large calculations at the four- and five-loop level. For Yang-Mills theory with fermions, the propagators and three-point functions with one vanishing momentum have been determined exactly in terms of master integrals, with rational coefficients in ε [33]. Recently, the Forcer program was used in the computation of the five-loop beta function with generic colour group [34]. This computation took three days on a cluster. Additionally, Forcer has been used for the determination of a number of Mellin moments of splitting and coefficient functions in DIS [29, 30, 47]. When we use the same methods as in ref. [5] for the three-loop moments, the calculations become very demanding for moments of six or higher. With the use of operator vertices the calculations require considerably fewer resources, and some rather high moments for the non-singlet splitting functions have already been obtained [47, 48]. We hope to be able to do this for the singlet splitting functions as well. Eventually this should contribute to a more precise determination of Higgs boson production at the LHC.

## Acknowledgements

This work has been supported by the ERC Advanced Grant no. 320651, HEPGAME.

We are much indebted to A. Vogt for helping with the testing of the programs and making many useful suggestions about the efficiency.

## Appendix A One-loop sub integrals

In programs such as Reduze and LiteRed, reductions to master integrals are performed purely through IBP identities. The Forcer approach is similar to that of Mincer, in which one reduces the integrals to a point where one-loop sub-integrals can be performed. For these one has the formula [54]

  \int \frac{d^D P}{(2\pi)^D}\,\frac{P_n(P)}{P^{2\alpha}(P-Q)^{2\beta}}
  = \frac{1}{(4\pi)^2}\,(Q^2)^{D/2-\alpha-\beta}
    \sum_{\sigma=0}^{[n/2]} G(\alpha,\beta,n,\sigma)\,Q^{2\sigma}
    \left\{\frac{1}{\sigma!}\Big(\frac{\Box}{4}\Big)^{\sigma} P_n(P)\right\}_{P=Q} ,   (23)

in which

  P_n(P) = P^{\mu_1} P^{\mu_2} \cdots P^{\mu_n} .   (24)

Here D is the dimension of space-time, which is also given by D = 4 − 2ε, and G can be expressed in terms of Γ-functions:

  G(\alpha,\beta,n,\sigma) = (4\pi)^{\epsilon}\,\Gamma(\alpha+\beta-\sigma-D/2)
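As a cross-check of the normalization (a standard textbook result, stated here independently of [54]): for the scalar case n = 0, only σ = 0 contributes and eq. (23) must reproduce the familiar one-loop integral,

```latex
\int \frac{d^D P}{(2\pi)^D}\,\frac{1}{P^{2\alpha}\,(P-Q)^{2\beta}}
  = \frac{(Q^2)^{D/2-\alpha-\beta}}{(4\pi)^{D/2}}\,
    \frac{\Gamma(\alpha+\beta-D/2)\,\Gamma(D/2-\alpha)\,\Gamma(D/2-\beta)}
         {\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(D-\alpha-\beta)} ,
```

so that with D = 4 − 2ε and the prefactor 1/(4π)² of eq. (23) one finds G(α,β,0,0) = (4π)^ε Γ(α+β−D/2) Γ(D/2−α) Γ(D/2−β) / [Γ(α) Γ(β) Γ(D−α−β)], whose leading Γ-factor matches the general expression above at σ = 0.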