Towards Intuitive Reasoning in Axiomatic Geometry


Maximilian Doré
Ludwig Maximilian University, Munich, Germany
Imperial College, London, UK

Krysia Broda
Imperial College, London, UK

Proving lemmas in synthetic geometry is often a time-consuming endeavour since many intermediate lemmas need to be proven before interesting results can be obtained. Improvements in automated theorem provers (ATP) in recent years now mean they can prove many of these intermediate lemmas. The interactive theorem prover Elfe accepts mathematical texts written in fair English and verifies them with the help of ATP. Geometrical texts can thereby easily be formalized in Elfe, leaving only the cornerstones of a proof to be derived by the user. This allows for teaching axiomatic geometry to students without prior experience in formalized mathematics.




1 Introduction

Formalizing mathematical proofs is something students usually neither practise nor even learn before their graduate studies. Various tools for guiding proof construction in first-order logic have been proposed: for instance, Pandora [6] is an interactive tool that students can use to construct correct natural deduction proofs, and more recently a Sequent calculus trainer has been developed [10]. An experimental project is underway with first-year students at Imperial College to use the Lean prover [13] as a vehicle for teaching both how to prove theorems and how to use theorem provers to verify such proofs [7]. Another recent approach teaches students mathematical reasoning by first introducing them to the proof assistant Coq [20] and then transferring the skills they learned to informal proofs [3]. While these successful projects create an understanding of mathematical reasoning, none of them allows the user to write proofs in English, as is done in typical undergraduate classes. Therefore, in order to ease the introduction to interactive theorem proving and constructing proofs, we have developed the Elfe system [8]. It has previously been used for proofs in discrete mathematics, for example about sets and relations; recently, we have added a library for working within synthetic geometry, which is the focus of this paper.


Lemma: for all a,b,c,d,m. midpoint(m,b,c) and a-b-c and b-c-d and a-b ≡ c-d and b ≠ c implies midpoint(m,a,d).
    Assume midpoint(m,b,c) and a-b-c and b-c-d and a-b ≡ c-d and b ≠ c.
    Then a-m ≡ m-d since b-m ≡ m-c and a-b ≡ c-d.
    Note a-m-d: Then b-m-c by DefMidpoint.
        Then a-b-m since a-b-c and b-m-c. Then m-c-d since b-m-c and b-c-d.
    qed.
    Hence midpoint(m,a,d).


Figure 1: A simple proof in elementary geometry

Consider the example Elfe text in Figure 1. The lemma in line 1 states that if a line has a midpoint m, then we can extend the line on both sides by the same distance such that the line between the outer points has the same midpoint m. In the statement, we use a-b-c to denote that the point b lies between the points a and c; the expression a-b ≡ c-d expresses that the lines a-b and c-d are equally long. The precise proposition of the lemma is therefore the following: given a line b-c with midpoint m, adding points a and d on either side of the line at equal distances from b and c, respectively, ensures that m is also the midpoint of the new line a-d. The proof of the lemma is given in an intuitive way: we observe in line 5 that the distance between a and m is the same as between d and m. Furthermore, the point m lies between a and d, as established in lines 6-8. We will see in Section 3 how the proof works and how it is checked by the Elfe system.

Geometry is an attractive candidate for teaching formalized mathematics due to its intuitive character: even high school students can understand basic lemmas involving parallel lines or midpoints. Consider the intuition behind the previous proof, which is depicted below in Figure 2. Since we extended the line between b and c on both sides by the same length, the new line a-d must have the same midpoint as the line b-c. When transforming this kind of diagrammatic reasoning into an axiomatic proof, the resulting mathematical text should be as close to the informal proof as possible.






Figure 2: The intuition behind the lemma from Figure 1

The project GeoCoq [2] has undertaken the effort to formalize a large body of elementary geometry in the proof assistant Coq. The formalization makes use of different axiomatizations of synthetic geometry, among these an axiom system thought out by Alfred Tarski. The axiomatization is relatively straightforward and only involves two basic predicates and around a dozen axioms. The proofs in GeoCoq turn out to be quite long and complex. Many intermediate lemmas need to be proven until interesting results can be obtained. The Elfe system in contrast uses automated theorem provers (ATP) in the background to free its users from proving laborious steps. This approach has turned out to be very useful in synthetic geometry, as steps in a proof that are obvious to a human prover can be simply checked by the ATP in the background. The user can therefore focus on the aspects of a proof that she wants to understand better. The resulting proof texts are significantly shorter than the proofs in Coq, which goes hand in hand with a reduced level of detail. This level of detail is in some cases crucial, but when teaching students we believe that a more high-level view of proofs can be beneficial. We will see that the proof style of Elfe is similar to the style of Isar [23], a popular language on top of the proof assistant Isabelle [15]; and shares some resemblance with the Mizar system [21]. In contrast to these systems, the user can more freely choose proof paths and leave proof steps out since Elfe utilizes the power of current ATP to check omitted details.

In the following, we will first introduce the concepts of formal axiomatic geometry necessary for our purposes in Section 2 before further analyzing the above proof text in Section 3. We will see that the system can also be used for more complex proofs in Section 4 before giving an overview of related work in Section 5 and ideas of further developments in Section 6.

2 Background

We will first give a short overview of a first-order axiomatization of elementary geometry in Section 2.1, before turning to the theorem prover used in this paper in Section 2.2.

2.1 Axiomatic Geometry

The endeavour of axiomatically capturing geometry was already pursued by Euclid in his Elements. Hilbert and Tarski, among others, undertook the effort of giving axiom systems for geometry in first-order logic. We will present the axiom system of Tarski [19] in the following.


Notation between: a-b-c.
Notation equidistant: a-b ≡ c-d.
Axiom CongrRefl: for all a,b. a-b ≡ b-a.
Axiom CongrIdent: for all a,b,c. a-b ≡ c-c implies a = b.
Axiom CongrTrans: for all a,b,p,q,r,s. a-b ≡ p-q and a-b ≡ r-s implies p-q ≡ r-s.
Axiom SegmentConstr: for all a,b,c,d. exists e. b-e ≡ c-d and a-b-e.
Axiom FiveSegment: for all a,b,c,d,a’,b’,c’,d’. (a-b-c and a’-b’-c’ and a-b ≡ a’-b’ and b-c ≡ b’-c’ and a-d ≡ a’-d’ and b-d ≡ b’-d’ and not a = b) implies c-d ≡ c’-d’.
Axiom BetwIdent: for all a,b. a-b-a implies a = b.
Axiom Pasch: for all a,b,c,p,q. a-p-c and b-q-c implies exists x. p-x-b and q-x-a.
Axiom LowerDim: exists a,b,c. not a-b-c and not b-c-a and not c-a-b.
Axiom Euclid: for all a,b,c,d,t. exists x,y.
    (a-d-t and b-d-c and not a = d) implies (a-b-x and a-c-y and x-t-y).

Figure 3: Tarski’s axioms in Elfe

Figure 3 presents the axiom system in the Elfe language. For now, we do not need to know anything about the language except that it is a version of first-order logic. The only language feature we need to know about is notations: two new notations for predicates are introduced in lines 1-2. The predicate between(a,b,c) expresses that three points a, b and c are collinear and that b lies between the other two points. By introducing the notation we can write a-b-c instead of between(a,b,c). Similarly, the notation a-b ≡ c-d stands for the predicate equidistant(a,b,c,d), which expresses that the line a-b has the same length as c-d. These two predicates are sufficient to build a complete axiom system for elementary geometry. Note that at no point can we access the coordinates of a single point; instead, we only talk about collinearities and about lengths of lines in relation to the lengths of other lines.
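To make the two predicates concrete, the following minimal sketch reads them in one familiar structure, the plane with integer coordinates. This is only an illustration under that assumption: the axiom system itself never refers to coordinates, Elfe does not work this way, and all function names and the sample grid are hypothetical. The sketch evaluates three of the axioms on a small sample of points, which is a sanity check and not a proof.

```python
def sq_dist(p, q):
    """Squared Euclidean distance; exact for integer coordinates."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def equidistant(a, b, c, d):
    """Coordinate reading of equidistant(a,b,c,d): segments ab and cd are equally long."""
    return sq_dist(a, b) == sq_dist(c, d)


def between(a, b, c):
    """Coordinate reading of between(a,b,c): b lies on the closed segment from a to c."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if cross != 0:
        return False  # a, b, c are not collinear
    # For collinear points, b is on the segment iff (b - a) and (c - b)
    # do not point in opposite directions.
    dot = (b[0] - a[0]) * (c[0] - b[0]) + (b[1] - a[1]) * (c[1] - b[1])
    return dot >= 0


# Hypothetical sample: the nine integer points of a 3x3 grid.
GRID = [(x, y) for x in range(3) for y in range(3)]

# Three axioms (CongrRefl, CongrIdent, BetwIdent), checked on the sample only.
congr_refl = all(equidistant(a, b, b, a) for a in GRID for b in GRID)
congr_ident = all(
    a == b for a in GRID for b in GRID for c in GRID if equidistant(a, b, c, c)
)
betw_ident = all(a == b for a in GRID for b in GRID if between(a, b, a))
```

The functions return plain booleans, so other axioms can be tried on the same sample in the same style.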

The first three axioms CongrRefl, CongrIdent and CongrTrans specify the behaviour of equidistance: a line a-b is as long as its reversal b-a; two points a and b coincide if the line between them is as long as the degenerate line c-c for some point c; and equidistance is transitive. The axiom SegmentConstr is illustrated in Figure 4. It states that we can extend a line a-b by a point e such that the line b-e has the same length as another line c-d.







Figure 4: The axiom SegmentConstr

We refer to [18] for an introduction to the other axioms. Note only that the axiom LowerDim in line 11 asserts that there is at least one proper triangle, so we live in a space of dimension at least 2. Conversely, one can introduce an axiom asserting that there is no point outside of the plane. For our purposes this is not necessary, since the following proof texts hold in arbitrary spaces of dimension greater than or equal to 2.

The inspiration for our work on axiomatic geometry came from the project GeoCoq [2]. GeoCoq attempts to formalize geometry in Coq. Coq utilizes a dependently typed programming language to formalize mathematics: via an extension of the Curry-Howard correspondence, types of this language are understood as mathematical propositions and programs of a type as proofs of the respective proposition. The proofs in GeoCoq are therefore constructive, except for proofs which require decidability of point equality, which is assumed as an axiom in the formalization. GeoCoq explores several axiom systems, among them the axiom system by Tarski introduced above. An excerpt of the resulting Coq axiom set can be found in Figure 5. Since betweenness is a 3-ary predicate, the type of Bet maps three points to a truth-value, i.e., Prop. Similarly, Cong maps four points to a truth-value. The axioms such as cong_pseudo_reflexivity are then straightforward versions of the axioms we already got to know in Figure 3. A current overview of the status of GeoCoq can be found in [4, 5].


Tpoint : Type;
Bet : Tpoint -> Tpoint -> Tpoint -> Prop;
Cong : Tpoint -> Tpoint -> Tpoint -> Tpoint -> Prop;
cong_pseudo_reflexivity : forall A B, Cong A B B A;
cong_inner_transitivity : forall A B C D E F, Cong A B C D -> Cong A B E F -> Cong C D E F;
cong_identity : forall A B C, Cong A B C C -> A = B;
segment_construction : forall A B C D, exists E, Bet A B E /\ Cong B E C D;

Figure 5: Tarski’s axioms in GeoCoq [2]

2.2 The Elfe Prover

Having seen what Elfe texts look like, we now sketch the inner workings of the system. The Elfe system aims to provide a theorem prover that verifies mathematical proofs that are close to informal pen-and-paper proofs. Its general mode of operation is inspired by the System for Automated Deduction (SAD) [22].

Figure 6: Architecture of the Elfe system (input channels shown: command line and web interface)

Proof texts can be entered into the system via a command line interface or a web interface, as depicted in Figure 6. A text that should be checked is first parsed and transformed into a sequence of first-order formulas. For example, the notation a-b-c used in the initial proof will be transformed into the first-order predicate between(a,b,c). Notations like this are not hard-coded in the system, but can be defined by the user in the proof text. For example, after stating

Notation subset: A ⊆ B.

in the proof text, the user can simply write the Unicode sign ⊆ between all kinds of sets. This allows for a rich input language that closely resembles pen-and-paper mathematics.
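The idea of user-declared notations can be sketched in a few lines. The following is a minimal illustration and not Elfe's actual parser: it assumes that every single lowercase letter in a declared pattern is a placeholder, that the predicate takes its arguments in order of appearance, and the helper names `compile_notation` and `desugar` are hypothetical.

```python
import re


def compile_notation(declaration):
    """Turn a declaration such as 'Notation between: a-b-c.' into (name, regex)."""
    head, pattern = declaration.rstrip(".").split(":", 1)
    name = head.split()[1]
    # Split the pattern into placeholders (single lowercase letters) and literal glue.
    parts = re.split(r"([a-z])", pattern.strip())
    regex = "".join(
        r"(\w+'?)" if re.fullmatch(r"[a-z]", part) else re.escape(part)
        for part in parts
    )
    return name, re.compile(regex)


def desugar(text, notations):
    """Replace every occurrence of a declared notation by its predicate form."""
    for name, regex in notations:
        text = regex.sub(
            lambda m, name=name: f"{name}({','.join(m.groups())})", text
        )
    return text


between_notation = compile_notation("Notation between: a-b-c.")
desugared = desugar("a-b-c and b-c-d", [between_notation])
```

Keeping the notation table as data rather than code mirrors the point made above: notations are supplied by the proof text, not built into the system.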

After the syntactic sugar of notations is removed, the proof text is transformed into a special data structure, the so-called statement sequence. This intermediary data structure gives rise to certain proof obligations, which are checked by ATP in the background. Several ATP, such as Vampire [16] and E Prover [17], are called in parallel; additionally, ATP are called to construct countermodels in case an obligation does not hold. The result of this verification process is then returned to the user, either by telling them which derivations were correct or, if a derivation step is incorrect, by presenting a countermodel. An in-depth explanation of the Elfe system can be found in [8].

3 Geometry in Elfe

With the axiom system given in Figure 3 it is possible to prove interesting lemmas in elementary geometry. We will in the following build a library on top of the axiom system and then analyze the introductory proof further.

The Elfe text depicted in Figure 7 introduces definitions of important geometrical concepts such as parallelism, solely built on top of the two predicates between and equidistant.

  • The definition of DefCol in line 1 generalizes the notion of betweenness: Three points are collinear if they lie on the same line, independent of the order of the points.

  • A point m is the midpoint of the line a-b if it lies between a and b and is equidistant to both points, as defined in DefMidpoint in lines 2-3.

  • Since we are working in an arbitrary dimension greater than 1, four points do not necessarily form a plane. The definition of DefCoplanar in lines 4-5 characterizes four points that do form a plane: If there is an intersection x of two lines formed by some combination of the four points a, b, c or d, then we know that the four points lie in the same plane.


Definition DefCol: for all a,b,c. col(a,b,c) iff a-b-c or b-c-a or c-a-b.
Definition DefMidpoint: for all a,b,m.
    midpoint(m,a,b) iff a-m-b and a-m ≡ m-b.
Definition DefCoplanar: for all a,b,c,d. coplanar(a,b,c,d) iff exists x.
    (col(a,b,x) and col(c,d,x)) or (col(a,c,x) and col(b,d,x)) or (col(a,d,x) and col(b,c,x)).
Notation parstr: a-bc-d.
Definition DefParallelStrict: for all a,b,c,d. a-bc-d iff
    (a ≠ b and c ≠ d and coplanar(a,b,c,d) and not exists x. col(x,a,b) and col(x,c,d)).
Notation parallel: a-bc-d.
Definition DefParallel: for all a,b,c,d. a-bc-d iff
    a-bc-d or (a ≠ b and c ≠ d and col(a,c,d) and col(b,c,d)).

Figure 7: Basic definitions upon Tarski’s axioms

Note that all these are explicit definitions, i.e., they do not extend the theory built up by the axiom system in Figure 3, but instead only combine the two predicates used in the axioms. On top of these defined predicates, we can build a notion of parallelism. DefParallelStrict in lines 7-8 captures the intuitive understanding of parallelism: Two lines a-b and c-d are parallel if they are coplanar and do not intersect, i.e., there exists no point x that is collinear with both lines.

When defining parallelism we also have to take into account a degenerate case: Two lines can be parallel if they have an intersection, more precisely, if they intersect completely and in fact describe the same line. This is captured in the definition DefParallel in lines 10-11: Two lines are parallel either if they are strictly parallel, or all their points are collinear. In line 6, resp. 9, we introduced additional notations such that we can write a-bc-d for strict parallelism and a-bc-d for general parallelism.
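As a sanity check of the definition of midpoint and of the introductory lemma, the sketch below reads the predicates on the integer number line and tests the lemma for every choice of five positions from a small range. This is an illustration under stated assumptions, not a proof and not how Elfe verifies anything: the number line is just one convenient structure, and the function names and the sample range are hypothetical.

```python
from itertools import product


def between(a, b, c):
    """between(a,b,c) on the number line: b lies in the closed interval from a to c."""
    return a <= b <= c or c <= b <= a


def equidistant(a, b, c, d):
    """equidistant(a,b,c,d) on the number line."""
    return abs(a - b) == abs(c - d)


def midpoint(m, a, b):
    """DefMidpoint: m is between a and b and equally far from both."""
    return between(a, m, b) and equidistant(a, m, m, b)


def hypothesis(a, b, c, d, m):
    """Antecedent of the introductory lemma."""
    return (
        midpoint(m, b, c)
        and between(a, b, c)
        and between(b, c, d)
        and equidistant(a, b, c, d)
        and b != c
    )


def midpoint_extension(a, b, c, d, m):
    """The implication stated by the lemma, for one choice of points."""
    return (not hypothesis(a, b, c, d, m)) or midpoint(m, a, d)


# Hypothetical sample: integer positions 0..8 on a line.
instances = list(product(range(9), repeat=5))
holds_everywhere = all(midpoint_extension(*p) for p in instances)
# Count the choices that satisfy the antecedent, to rule out a vacuous check.
nonvacuous = sum(1 for p in instances if hypothesis(*p))
```

For instance, the positions a=0, b=1, c=3, d=4 with m=2 satisfy the antecedent, and m=2 is indeed the midpoint of 0 and 4.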

With this geometry library in the background we can now further analyze the exemplary proof of the introduction. Consider Figure 8, which repeats the introductory proof.

In line 1 the command Include makes the geometry library available. All axioms and definitions previously stated are therefore in the context of our proof text. We state the proposition of the lemma and give it the name MidpointExtension in line 2. In order to prove the lemma, we assume the antecedent of its main implication in line 5 and try to derive the consequent in line 12. The lemma is universally quantified; fixing specific constants for the points in its proof is left implicit, which is common practice in informal proofs.

In order to show that m is the midpoint of the line a-d, we have to show that a-m is as long as m-d and that all three points indeed lie on a line. The first statement is derived in line 6; it follows directly from the axioms for equidistance and requires no further proof. With the keyword since we emphasize that it follows from the two equidistances b-m ≡ m-c and a-b ≡ c-d. Internally, the statement after since will be checked for validity by the ATP and, if correct, be put in the context of the statement before the since.

The proof that all points lie on the same line requires more work. In line 7, we start a subproof of the statement a-m-d with the keyword Note. All statements in lines 8-10 are then only in the scope of this subproof; later in the proof, in line 12, only the statement a-m-d remains in the context. With the construction by DefMidpoint in line 8 we limit the premises that are given to the background provers: only the definition of midpoint is required to derive the statement from the properties of the constants. With this construction the user can speed up the search of the background provers and see which premises make a statement true. The derivations in lines 9 and 10 do not limit the premises, so the whole context will be given to the ATP. If the background provers find proofs for all proof obligations previously stated, the text is proven to consist only of valid statements.


Include geometry.
Lemma MidpointExtension: for all a,b,c,d,m. midpoint(m,b,c) and a-b-c and b-c-d and a-b ≡ c-d and b ≠ c implies midpoint(m,a,d).
    Assume midpoint(m,b,c) and a-b-c and b-c-d and a-b ≡ c-d and b ≠ c.
    Then a-m ≡ m-d since b-m ≡ m-c and a-b ≡ c-d.
    Note a-m-d:
        Then b-m-c by DefMidpoint.
        Then a-b-m since a-b-c and b-m-c.
        Then m-c-d since b-m-c and b-c-d.
    qed.
    Hence midpoint(m,a,d).


Figure 8: A simple proof in geometry

After removing the syntactic sugar of the notations, the text only consists of structured first-order formulas, which is depicted in Figure 9.

When the Elfe language is transformed to first-order logic, the system internally keeps track of the variables in the proof. Since the user gave no other information, the variables are assumed to be universally quantified, as given in line 2. Inside the proof, the user fixes specific constants for the variables. We have to take care of the change of variables to constants when internally representing the structure of the proof, which we do below.


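…
Lemma MidpointExtension: for all a,b,c,d,m. midpoint(m,b,c) and between(a,b,c) and between(b,c,d) and equidistant(a,b,c,d) and not b = c implies midpoint(m,a,d).
    Assume midpoint(m,b,c) and between(a,b,c) and between(b,c,d) and equidistant(a,b,c,d) and not b = c.
    Then equidistant(a,m,m,d) since equidistant(b,m,m,c) and equidistant(a,b,c,d).
    Note between(a,m,d):
        Then between(b,m,c) by DefMidpoint.
        Then between(a,b,m) since between(a,b,c) and between(b,m,c).
        Then between(m,c,d) since between(b,m,c) and between(b,c,d).
    qed.
    Hence midpoint(m,a,d).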


Figure 9: The proof in desugared first-order logic





Figure 10: The statement sequence of MidpointExtension

After the syntactic sugar is removed and the statements in an Elfe text are in pure first-order logic, the text is transformed into an internal data structure, the so-called statement sequence, which provides a basic representation of structured mathematical proofs. Consider Figure 10, which shows the statement sequence corresponding to the proof of MidpointExtension. A statement is represented by a box with a label in the upper left corner. Inside each statement is a mathematical sentence in first-order logic, which is called the goal of the statement. For example, the goal of the outermost statement is the main statement of the lemma. Below the goal is the proof of the goal, which can be of different kinds. In this instance, the proof of the outermost statement consists of another statement nested inside it.

The goal of this nested statement is similar to the main lemma, except that the universally quantified variables are fixed to constants, which are coloured blue. One can easily see the correctness of this construction by the ∀-introduction rule of natural deduction: if a statement can be proven with constants that are subject to no assumptions, the statement is also true for universally quantified variables. The user did not have to state explicitly that the statement is proven for a set of fixed constants; the system inferred this automatically and created the statements accordingly.

The proof then employs a common proof tactic: in order to prove an implication, the antecedent is assumed, additional derivations are made, and finally the consequent follows. Accordingly, the proof of the statement with fixed constants consists of a sequence of two statements, one for the antecedent and one for the consequent, which are connected with an arrow. The arrow illustrates that the antecedent statement is in the context of the consequent statement and can be used in its derivations. The proof of the antecedent statement is simply Assumed, meaning that the background provers can take it as a premise in the following.

The proof of the consequent statement consists in turn of a sequence of three statements: one for the equidistance derived in line 6, one for the betweenness derived in line 7, and a final one. The final statement repeats the goal of the consequent statement, namely that m is indeed a midpoint of the line a-d. Its proof, however, is ByContext: its goal will be checked by the background provers to see if they can derive it from the context. Crucially, the context of the final statement consists of the goals of the antecedent statement and of the two statements preceding it. If we look back at the original proof text in Figure 8, these statements correspond to the following sentences: the assumption of line 5 and the derivations of lines 6 and 7 are given to the background provers to see if they can derive the sentence in line 12.

To ensure correctness, we of course also have to check that the sentences in lines 6 and 7 are correct. In the statement sequence this is ensured by requiring a proof sequence for both corresponding statements. The proof sequence of the statement for line 6 consists of two statements, which are both checked by the background provers. The first corresponds to the intermediary observation after since in the original proof text. It is then in the context of the second, which checks the main statement of line 6.

The proof of the statement for line 7 similarly consists of several intermediary derivations; we omit it here. Note that these derivations, corresponding to lines 8-10 in the proof text, are only local to the proof of this statement. Afterwards, only the statement itself is in the context of the final statement. This allows for scoping of derivation steps: many steps are only relevant for a subproof, and the background provers do not need to know about them subsequently.
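The scoping behaviour described above can be sketched with a small data structure. This is a hypothetical model and not Elfe's implementation: the proof kinds Assumed and ByContext are taken from the text, while the name BySequence, the class layout, and the string representation of goals are assumptions made for illustration. Limiting premises with by and the since clauses of lines 9 and 10 are not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Statement:
    goal: str                     # first-order sentence, kept as a plain string here
    proof: str = "ByContext"      # "Assumed", "ByContext" or "BySequence"
    children: List["Statement"] = field(default_factory=list)


def obligations(
    stmt: Statement, context: Sequence[str] = ()
) -> List[Tuple[Tuple[str, ...], str]]:
    """Collect (premises, goal) pairs that would be handed to a background prover."""
    if stmt.proof == "Assumed":
        return []                             # taken as a premise, nothing to check
    if stmt.proof == "ByContext":
        return [(tuple(context), stmt.goal)]  # derive the goal from the visible context
    found = []
    visible = list(context)
    for child in stmt.children:
        found += obligations(child, visible)
        # Only the child's goal stays visible afterwards,
        # not the intermediate steps of its own subproof.
        visible.append(child.goal)
    return found


ANTECEDENT = (
    "midpoint(m,b,c) and between(a,b,c) and between(b,c,d) "
    "and equidistant(a,b,c,d) and not b = c"
)
line6 = Statement("equidistant(a,m,m,d)", "BySequence", [
    Statement("equidistant(b,m,m,c) and equidistant(a,b,c,d)"),  # part after 'since'
    Statement("equidistant(a,m,m,d)"),
])
line7 = Statement("between(a,m,d)", "BySequence", [
    Statement("between(b,m,c)"),
    Statement("between(a,b,m)"),
    Statement("between(m,c,d)"),
    Statement("between(a,m,d)"),
])
consequent = Statement(
    "midpoint(m,a,d)", "BySequence", [line6, line7, Statement("midpoint(m,a,d)")]
)
lemma = Statement(
    ANTECEDENT + " implies midpoint(m,a,d)",
    "BySequence",
    [Statement(ANTECEDENT, "Assumed"), consequent],
)

proof_obligations = obligations(lemma)
```

In this model the last obligation asks for midpoint(m,a,d) from exactly three premises: the assumption and the goals of the two preceding statements. The steps inside the two subproofs do not leak into it.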

Figure 11: The terminal output when verifying MidpointExtension

Figure 11 shows an excerpt of the terminal output when checking the proof text from Figure 8. In the first line, we can see one assumption resulting from a definition in the background library, namely the definition of parallelism. Below, the statement for the antecedent of the lemma is assumed. Underneath, all statements that were labelled with ByContext in Figure 10 are checked by the background provers. For example, the last line before the result shows a statement that was proved by E Prover. Below the verification process, the overall result is summarized. Since the text is correct and all proof obligations could be verified by the ATP, the result is positive. If the user made a mistake or no proof was found for an obligation, the statement and its context are printed. Finally, some statistics of the verification process are given.

The web interface can be used for a more intuitive access to the Elfe system. Figure 12 shows the web interface of the prover after checking the lemma MidpointExtension. Since all lines are coloured green, the system has accepted the proof. The user can then inspect the verification process by setting the focus on a line. In the screenshot, the user has selected line 7. Below the input field we can see more information about the sentence: Two obligations have been checked by the background provers, in both cases E Prover has found a proof. The first obligation corresponds to the main statement of the sentence, the second obligation to the intermediary derivation step after the since. The user can see the internal representation of both statements in first-order logic, after removing the syntactic sugar from notations. An instance of the web interface can be found online. (Footnote 1: To inspect the introductory proof there, click on "Verify" on the right to check the lemma and set the focus in the text to inspect the verification process.)

Figure 12: The web interface of Elfe

4 A Midpoint Theorem

We will now turn to proving a more intricate lemma. Consider the following statement:

for all a,b,a’,b’,m. a ≠ b and midpoint(m,a,a’) and midpoint(m,b,b’) implies a-ba’-b’.

The Elfe sentence states that if two lines a-a’ and b-b’ intersect in a point m, which is also a midpoint of both lines, then the lines a-b and a’-b’ must be parallel to each other.











Figure 13: Parallelism of lines a-b and a’-b’

Consider the left illustration in Figure 13: lines a-a’ and b-b’ cross each other in their common midpoint, and the lines a-b and a’-b’, coloured blue, are therefore parallel. When proving the lemma, we also have to consider a degenerate case, which is shown on the right-hand side of Figure 13: if both lines a-a’ and b-b’ lie on top of each other, then all lines formed by the points are parallel to each other, in particular a-b and a’-b’.

Lemma: for all a,b,a’,b’,m. a ≠ b and midpoint(m,a,a’) and midpoint(m,b,b’) implies a-ba’-b’.
Proof:
    Assume a ≠ b and midpoint(m,a,a’) and midpoint(m,b,b’).
    Case col(a,b,b’):
        Then a’ ≠ b’ and col(a,a’,b’) and col(b,a’,b’) by MidpointCol.
        Then a-ba’-b’ by DefParallel.
    qed.
    Case not col(a,b,b’):
        …
    qed.
    Hence a-ba’-b’.
qed

Figure 14: The two main cases of the lemma

We will now prove the above statement. Consider Figure 14. In line 1 we state the lemma and give its proof below. We assume the left hand side of the statement in line 3 and want to derive the conclusion in line 35. Since we have to take into account the degenerate case, we will introduce a case distinction:

If points a, b and b’ are collinear, we are in the degenerate case. The proof of this case is given in lines 5-6: We use the following lemma to derive that also a,a’ and b’ and respectively a’,b’ and b must be collinear:

Lemma MidpointCol: for all a,b,a’,b’,m. a ≠ b and midpoint(m,a,a’) and midpoint(m,b,b’) and col(a,b,b’) implies a’ ≠ b’ and col(a,a’,b’) and col(b,a’,b’).

We will not give a proof of the lemma MidpointCol here. This proves the degenerate case of the definition DefParallel.

Case not col(a,b,b’):
    Note a’ ≠ b’:
        Assume a’ = b’.
        Then a’-b’-m and m-a’ ≡ m-b’.
        Then m-a-b and m-a ≡ m-b.
        Then a = b by BetweenCong.
        Hence contradiction.
    qed.
    Note coplanar(a,b,a’,b’):
        Then a-m-a’ and b-m-b’ by DefMidpoint.
        Then col(a,a’,m) and col(b,b’,m) by ColPerm, DefCol.
        Then coplanar(a,b,a’,b’) by DefCoplanar.
    qed.
    Note not exists x. col(x,a,b) and col(x,a’,b’):
        …
    qed.
    Then a-ba’-b’ by DefParallelStrict.
    Then a-ba’-b’ by DefParallel.
qed.

Figure 15: Strict parallelism in the non-degenerate case

To prove the non-degenerate case, i.e., the case in which a, b and b’ are not collinear, we have to give a more intricate proof; its sketch is given in Figure 15. We have to prove that both lines are strictly parallel. The definition of strict parallelism given in Figure 7 requires that we prove three properties:

  • a ≠ b and a’ ≠ b’: The first inequality already holds by assumption, so we only need to prove the second. This is done in lines 10-14. We assume that a’ is equal to b’ and want to derive a contradiction. We will use another auxiliary lemma for this:

    Lemma BetweenCong: for all a,b,c. a-b-c and a-b ≡ a-c implies b = c.

    This lemma states that if we have a line a-b-c and the distance from a to b is the same as the distance from a to c, then b must be equal to c. A proof of this lemma is given in [9]. We can make use of it in line 13 by unifying its first point with m, matching the statement m-a-b and m-a ≡ m-b given in line 12. This statement in turn follows from the fact that we also have a’-b’-m and m-a’ ≡ m-b’, and a is a’ mirrored on m, and b is b’ mirrored on m.

  • All points are coplanar: To prove that all points lie in the same plane, we have to find an intersection of two lines formed by a combination of all four points. This witness of coplanarity is conveniently given by the midpoint m. By the definition of a midpoint, m must lie between a and a’ and between b and b’, as stated in line 17. Then in particular the weaker notion of collinearity holds, and we can conclude in line 19 that all four points are coplanar.

  • The remaining property we need to prove is that both lines a-b and a’-b’ have no intersection, which we will do in the following.

Note not exists x. col(x,a,b) and col(x,a’,b’):
    Assume exists x. col(x,a,b) and col(x,a’,b’).
    Take x such that col(x,a,b) and col(x,a’,b’).
    Take x’ such that x-m-x’ and m-x’ ≡ m-x by SegmentConstr.
    Then col(a,b,x’) and col(a’,b’,x’).
    Then col(b’,x,x’) since col(b’,a’,x) and col(b’,a’,x’).
    Then col(b,x,x’) since col(b,a,x) and col(b,a,x’).
    Then col(b,x,b’) since col(b,x,x’) and col(b’,x,x’).
    Then col(a,b,b’) since col(b,x,b’) and col(b,x,a).
    Hence contradiction.
qed.

Figure 16: Proving there is no intersection of both lines

You can find the corresponding Elfe text in Figure 16. We suppose that there is one intersection, and will derive a contradiction from that.








Figure 17: Deriving a contradiction

In order to derive a contradiction, we first fix in line 23 a point x that witnesses both collinearities, with a-b and with a’-b’. The supposed situation is depicted in Figure 17. We have to draw curves to represent all collinearities, which already suggests that we are indeed deriving a contradiction. In line 24 we then employ the axiom SegmentConstr to construct a point x’ that is x mirrored on m.








Figure 18: Mirroring a line around a midpoint

In line 25 we observe that the point x’ must lie on the same line as a-b and respectively a’-b’. This is the case since we have col(a’,b’,x) respectively col(a,b,x) and then can mirror all three points on a common point m to derive col(a,b,x’) and col(a’,b’,x’). The illustration in Figure 18 demonstrates the intuitive reasoning behind this observation. In our proof text we do not have to give more derivation steps since the background provers are able to derive all collinearities.
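The mirroring argument can be illustrated in coordinates. The sketch below assumes the integer plane as a concrete structure, reads collinearity as a vanishing cross product, and checks on a small sample that mirroring three collinear points on a common point yields three collinear points again. The helper names, the grid, and the chosen centres are hypothetical, and the check is an illustration rather than a proof.

```python
from itertools import product


def col(a, b, c):
    """Collinearity in the coordinate plane: the cross product of b-a and c-a vanishes."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]) == 0


def mirror(p, m):
    """Point reflection of p on m, so that m is the midpoint of p and its image."""
    return (2 * m[0] - p[0], 2 * m[1] - p[1])


# Hypothetical sample: a 5x5 grid of points and two mirror centres.
GRID = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
CENTRES = [(0, 0), (1, -1)]

collinearity_preserved = all(
    col(mirror(a, m), mirror(b, m), mirror(x, m))
    for m in CENTRES
    for a, b, x in product(GRID, repeat=3)
    if col(a, b, x)
)
```

Since mirroring twice on the same point returns the original point, the same check covers both directions used in the proof: from col(a,b,x) to col(a’,b’,x’) and from col(a’,b’,x) to col(a,b,x’).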

In lines 26-29 we employ the following lemma four times:

Lemma ColTrans: for all a,b,c,d. (not a = b) and col(a,b,c) and col(a,b,d) implies col(a,c,d).

In line 29 we therefore obtain that a, b and b’ are collinear, which contradicts our initial assumption of the non-degenerate case, namely that these points form a proper triangle. This completes our proof of the lemma.
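The role of the premise not a = b in ColTrans can be seen on a concrete example. The following sketch again assumes the integer plane with collinearity read as a vanishing cross product; the names and the sample grid are hypothetical. It checks the lemma on a small sample and then shows one configuration in which the conclusion fails once a and b coincide.

```python
from itertools import product


def col(a, b, c):
    """Collinearity in the coordinate plane: the cross product of b-a and c-a vanishes."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]) == 0


def col_trans(a, b, c, d):
    """The implication stated by ColTrans, for one choice of points."""
    premise = a != b and col(a, b, c) and col(a, b, d)
    return (not premise) or col(a, c, d)


# Hypothetical sample: the nine integer points of a 3x3 grid.
GRID = [(x, y) for x in range(3) for y in range(3)]
holds_on_sample = all(col_trans(*p) for p in product(GRID, repeat=4))

# Without 'not a = b' the statement fails: any point is collinear with a doubled point.
a = b = (0, 0)
c, d = (1, 0), (0, 1)
degenerate_premise = col(a, b, c) and col(a, b, d)
degenerate_conclusion = col(a, c, d)
```

Here the degenerate premise holds while the conclusion does not, because (0,0), (1,0) and (0,1) form a proper triangle.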

5 Related Work

Formal systems have been used for teaching mathematics in many ways; e.g., the Pandora system [6] included an automated tutor that students could ask for help if they were stuck. Another project [3] uses the Coq proof assistant in an interesting way. When teaching students, they keep strictly to the Coq syntax in the first step. Only in the second step do they encourage students to write English comments detailing the steps being performed, in the manner of an ordinary textbook proof. Interestingly, this is a different perspective from the one we take in Elfe: we jump in straight away with a textbook proof, gradually building up notation as students become more familiar.

This is due to the fact that the Elfe system is the product of a different line of research. We have already mentioned SAD [22]. SAD works similarly to Elfe since it takes as input mathematical texts in a language fairly close to natural language before checking them with the help of ATP. The primary aim of SAD was to provide a text verifier in particular for mathematical researchers. Elfe uses the same mode of operation, but focuses with its interface and axiomatizations on mathematical beginners.

SAD in turn was influenced by other systems, in particular the Mizar [21] system and the Isabelle [15] prover. Mizar was developed as early as 1973 and is still under active development. The first version of Isabelle dates back to 1986. The Isar language, which aims at giving an intuitive proof language for Isabelle, was developed in 1999 [23]. Since then, Isar has become the standard for mathematical texts in Isabelle.

In the following, we will put Elfe in the context of the development of other proof assistants in Section 5.1. We then take a closer look at the proof style of Elfe in comparison with Mizar and Isabelle in Section 5.2.

5.1 Comparison of Proof Assistants

Figure 19 depicts a comparison of several popular theorem provers and the Elfe system. We can differentiate between two main lines of provers: Mizar influenced the development of Isabelle, SAD and Elfe. Another tradition builds on the Curry-Howard correspondence, which interprets types as propositions and programs of a type as their proofs. The prover Coq [20] builds on top of the Calculus of Constructions, a dependently typed programming language. Lean [13] and Agda [14] present further elaborations of that logical system and have been developed in the last decade.

                        Mizar      Isabelle   Coq        Lean       Agda       SAD           Elfe
Typing                  ++         ++         +++        +++        +++        +             +
ATP for proof search
Premise annotation      necessary  necessary  necessary  necessary  necessary  not possible  possible
Reasoning capabilities
Figure 19: Comparison of different provers

As shown in the first line, Isabelle uses a higher-order logic with simple types. The provers Coq, Lean and Agda have a more expressive type system that employs dependent types, i.e., types that depend on values. SAD and Elfe use first-order logic internally. Mizar adds typing on top of second-order logic, which can mostly be represented with predicates of first-order logic. This type system allows for a simple notion of dependent types, but its expressive power is not comparable to dependently typed systems in the spirit of Coq. Elfe also has a primitive version of typing, as variables can be defined to fall under a certain predicate and afterwards be used without further annotation.
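As a concrete illustration of types that depend on values, and of propositions read as types, here is a small sketch in Lean 4 syntax. The names `Vec`, `Vec.head` and `and_swap` are chosen for this example only and are not taken from any of the systems compared above.

```lean
namespace Sketch

-- A list whose length is part of its type: `Vec α n` depends on the value `n`.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons {n : Nat} : α → Vec α n → Vec α (n + 1)

-- `head` only accepts non-empty vectors; the empty case is excluded by the type.
def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
  | .cons a _ => a

-- Propositions as types: a proof of `p ∧ q → q ∧ p` is a function on proofs.
theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
  fun ⟨hp, hq⟩ => ⟨hq, hp⟩

end Sketch
```

In a first-order setting such as Elfe's, the corresponding restriction would be stated as a predicate on the variable rather than encoded in its type.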

Automated theorem provers can be employed for proof search in Mizar, and in Isabelle via its extension Sledgehammer. SAD and Elfe trust the results of the ATP and do not need annotations in the proof. On the other hand, Isabelle and Mizar translate proofs generated by ATP into their own language, which is then checked with their own reasoning mechanisms. Most systems, including Elfe, allow for using Unicode characters for mathematical notations.

5.2 Mizar, Isabelle and Elfe Compared

As we have seen in the previous section, Elfe stands in the tradition of Mizar and Isabelle. We will now take a look at an example proof in the different systems.

Consider Figure 20, which depicts two proofs of the same lemma in Elfe. The proposition of the lemma is a simple observation about the betweenness relation.


Lemma: for all a,b,c,d. a-b-d and b-c-d implies a-b-c.
Proof:
    Assume a-b-d and b-c-d.
    Take x such that b-x-b and c-x-a by Pasch.
    Then b = x since b-x-b.
    Then c-b-a since c-x-a and b = x.
    Hence a-b-c.



Lemma: for all a,b,c,d. a-b-d and b-c-d implies a-b-c.
Proof:
    Assume a-b-d and b-c-d.
    Take x such that b-x-b and c-x-a.
    Hence a-b-c.


Figure 20: Two possible proofs of the same lemma in Elfe

The first proof in Elfe is very similar to the proofs in Mizar and Isabelle depicted in Figure 21, which are due to [11] and [12], respectively. Elfe also allows for other proofs, such as the second proof in Figure 20. Here, we only need to observe that we can construct a point x with appropriate properties to show the lemma. We also do not need to state that the Pasch axiom is needed for that derivation. While this text is not necessarily instructive, it can be used by a student working on the text as a first step in constructing a more verbose proof. The student can refine the proof step by step, whereas in Mizar and Isabelle it is not immediately obvious whether the chosen proof path will succeed.


theorem LineExtension:
  for S being TarskiGeometryStruct
  for a, b, c, d being POINT of S
  st between a,b,d & between b,c,d holds
  between a,b,c

  let S be TarskiGeometryStruct ; :: thesis:
  let a, b, c, d be POINT of S; :: thesis:
  assume H1: between a,b,d ; :: thesis:
  assume between b,c,d ; :: thesis:
  then consider x being POINT of S such that
  X1: ( between b,x,b & between c,x,a ) by H1, A7;
  b = x by X1, A6;
  hence between a,b,c by Bsymmetry, X1; :: thesis:



theorem line_extension:
  assumes "B a b d" and "B b c d"
  shows "B a b c"
proof -
  from `B a b d` and `B b c d` and A7' [of a b d b c]
  obtain x where "B b x b" and "B c x a" by auto
  from `B b x b` have "b = x" by (rule A6')
  with `B c x a` have "B c b a" by simp
  thus "B a b c" by (rule th3_2)


Figure 21: The proof in Mizar [11] and Isar/Isabelle [12]
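For comparison with the dependently typed provers of Section 5.1, the same argument can be sketched in Lean 4 syntax. This is an illustration under stated assumptions, not code from any of the cited developments: `Point` and `B` are abstract, and the identity axiom `A6`, the inner Pasch axiom `A7` and the symmetry property `Bsymm` are passed in as hypotheses, with names following the labels used in Figure 21.

```lean
namespace Sketch

theorem line_extension {Point : Type} (B : Point → Point → Point → Prop)
    -- A6 (assumed): identity of betweenness
    (A6 : ∀ a b : Point, B a b a → a = b)
    -- A7 (assumed): inner Pasch
    (A7 : ∀ a p c b q : Point, B a p c → B b q c → ∃ x, B p x b ∧ B q x a)
    -- Symmetry of betweenness (assumed)
    (Bsymm : ∀ a b c : Point, B a b c → B c b a)
    {a b c d : Point} (h1 : B a b d) (h2 : B b c d) : B a b c := by
  -- Take x such that b-x-b and c-x-a by Pasch.
  have ⟨x, hbxb, hcxa⟩ := A7 a b d b c h1 h2
  -- Then b = x since b-x-b.
  have hbx : b = x := A6 b x hbxb
  -- Replace b by x in the goal, then conclude by symmetry from c-x-a.
  rw [hbx]
  exact Bsymm c x a hcxa

end Sketch
```

As in the Mizar and Isabelle texts, every step names the fact it relies on, whereas the second Elfe proof in Figure 20 leaves this search to the background provers.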

The proof texts in Elfe, Mizar and Isabelle for the simple lemma are all quite legible and provide insight into why the proof succeeds. The proof texts of the latter two provers seem more technical, which is partly due to the fact that the systems are more sophisticated and allow for annotating proof steps. Proofs in Elfe rely on ATP and can therefore be shorter than the other proofs. It is at the discretion of the users to decide when they consider a proof complete. Thereby, the Elfe system allows for more freely exploring a theory generated by an axiom system.

The Elfe system is implemented in a lean code base and allows for simple extensions. In particular, new proof structures and language constructs can simply be added to the system if they can be represented with statement sequences. Background provers can also be integrated by changing a single line in the configuration of the system.

New axiom systems can be easily created since no background mathematical system is assumed. When using existing libraries, it is not necessary to memorize possible premises since all of them can be used by the ATP to find a proof for a derivation step.

By focusing on mathematical beginners as users, the prover can be used as a valuable didactic device, and thus we think that the Elfe system fills a gap in the zoo of proof assistants. The system certainly has not reached the level of sophistication of Isabelle, but this allows the system to be used as a test bed for lean axiomatizations and highly legible proof texts.

6 Outlook

In this paper we have seen how one can formally prove complex lemmas in elementary geometry with the help of the interactive theorem prover Elfe. By giving most of the low-level proof work to automated theorem provers, the resulting proof texts are concise and give a good overview of the main proof ideas, which makes Elfe proofs more similar to informal pen-and-paper proofs. For a deeper investigation of different axiom systems, however, the lack of detail in Elfe proofs might be a hindrance, since it is not necessary to specify which premises are used, and where, in a proof. It is therefore not always apparent which exact constructions make a derivation true.

On top of the already present Elfe libraries for working with sets, relations and functions, it might be interesting to investigate other domains in discrete mathematics such as graph theory. Synthetic characterizations of non-discrete domains like topology also make them possible candidates for a formalization in Elfe.

Geometry is a great showcase for formalized mathematics due to its illustrative power: its statements can already be understood by high-school students. Geometrical proof texts in the Elfe system can provide an introduction to formal mathematics and thereby lower the barrier to entry to the field. So far, we have only conducted pilot experiments with students of mathematics and computer science; an exemplary training session for working with relations can be found online. It remains to be investigated more thoroughly if and how Elfe can be used to teach (formal) mathematics to young undergraduates and high-school students.


  • [1]
  • [2] Michael Beeson, Pierre Boutry, Gabriel Braun, Charly Gries & Julien Narboux: GeoCoq.
  • [3] Sebastian Böhne & Christoph Kreitz (2018): Learning how to Prove: From the Coq Proof Assistant to Textbook Style. In Pedro Quaresma & Walther Neuper, editors: Proceedings 6th International Workshop on Theorem proving components for Educational software, Gothenburg, Sweden, 6 Aug 2017, Electronic Proceedings in Theoretical Computer Science 267, Open Publishing Association, pp. 1–18.
  • [4] Pierre Boutry, Gabriel Braun & Julien Narboux (2018): Formalization of the Arithmetization of Euclidean Plane Geometry and Applications. Journal of Symbolic Computation, p. 23. To appear.
  • [5] Gabriel Braun & Julien Narboux (2017): A synthetic proof of Pappus’ theorem in Tarski’s geometry. Journal of Automated Reasoning 58(2), p. 23.
  • [6] Krysia Broda, Jiefei Ma, Gabrielle Sinnadurai & Alexander Summers (2007): Pandora: A Reasoning Toolbox using Natural Deduction Style. Logic Journal of the IGPL 15(4), pp. 293–304.
  • [7] Kevin Buzzard: Xena.
  • [8] Maximilian Doré & Krysia Broda (2018): The ELFE System - Verifying Mathematical Proofs of Undergraduate Students. In: Proceedings of the 10th International Conference on Computer Supported Education (CSEDU), 2, INSTICC, SciTePress, pp. 15–26.
  • [9] Maximilian Doré & Krysia Broda (2018): Intuitive reasoning in formalized mathematics with Elfe. In: Communications in Computer Science, Springer. Forthcoming.
  • [10] Arno Ehle, Norbert Hundeshagen & Martin Lange (2018): The Sequent Calculus Trainer with Automated Reasoning - Helping Students to Find Proofs. In Pedro Quaresma & Walther Neuper, editors: Proceedings 6th International Workshop on Theorem proving components for Educational software, Gothenburg, Sweden, 6 Aug 2017, Electronic Proceedings in Theoretical Computer Science 267, Open Publishing Association, pp. 19–37.
  • [11] Adam Grabowski (2016): Tarski’s geometry modelled in Mizar computerized proof assistant. In: Computer Science and Information Systems (FedCSIS), 2016 Federated Conference on, IEEE, pp. 373–381.
  • [12] T. J. M. Makarios (2012): The independence of Tarski’s Euclidean axiom. Archive of Formal Proofs. Formal proof development.
  • [13] Leonardo de Moura, Soonho Kong, Jeremy Avigad, Floris Van Doorn & Jakob von Raumer (2015): The Lean theorem prover (system description). In: International Conference on Automated Deduction, Springer, pp. 378–388.
  • [14] Ulf Norell (2008): Dependently typed programming in Agda. In: International School on Advanced Functional Programming, Springer, pp. 230–266.
  • [15] Lawrence C Paulson (1994): Isabelle: A generic theorem prover. 828, Springer Science & Business Media.
  • [16] Alexandre Riazanov & Andrei Voronkov (2002): The design and implementation of Vampire. AI Commun. 15(2, 3), pp. 91–110.
  • [17] Stephan Schulz (2013): System Description: E 1.8. In Ken McMillan, Aart Middeldorp & Andrei Voronkov, editors: Proc. of the 19th LPAR, Stellenbosch, LNCS 8312, Springer.
  • [18] Wolfram Schwabhäuser, Wanda Szmielew & Alfred Tarski (1983): Metamathematische Methoden in der Geometrie. Springer.
  • [19] Alfred Tarski & Steven Givant (1999): Tarski’s System of Geometry. Bulletin of Symbolic Logic 5(2), pp. 175–214.
  • [20] The Coq Development Team: Coq.
  • [21] Andrzej Trybulec & Howard A Blair (1985): Computer Assisted Reasoning with MIZAR. In: IJCAI, 85, pp. 26–28.
  • [22] Konstantin Verchinine, Alexander Lyaletski & Andrei Paskevich (2007): System for Automated Deduction (SAD): a tool for proof verification. In: Proc. CADE-21, Springer, pp. 398–403.
  • [23] Markus Wenzel (1999): Isar – a Generic Interpretative Approach to Readable Formal Proof Documents. In: TPHOLs ’99 Proceedings of the 12th International Conference on Theorem Proving in Higher Order Logics, Springer-Verlag, pp. 167–184.