Fermat, Leibniz, Euler, and the gang: The true history of the concepts of limit and shadow
Abstract.
Fermat, Leibniz, Euler, and Cauchy all used one or another form of approximate equality, or the idea of discarding “negligible” terms, so as to obtain a correct analytic answer. Their inferential moves find suitable proxies in the context of modern theories of infinitesimals, and specifically the concept of shadow. We give an application to decreasing rearrangements of real functions.
1. Introduction
The theories as developed by European mathematicians prior to 1870 differed from the modern ones in that none of them used the modern theory of limits. Fermat develops what is sometimes called a “precalculus” theory, where the optimal value is determined by some special condition such as equality of roots of some equation. The same can be said for his contemporaries like Descartes, Huygens, and Roberval.
Leibniz’s calculus advanced beyond them in working on the derivative function of the variable $x$. He had the indefinite integral whereas his predecessors only had concepts more or less equivalent to it. Euler, following Leibniz, also worked with such functions, but distinguished the variable (or variables) with constant differentials $dx$, a status that corresponds to the modern assignment that $x$ is the independent variable, the other variables of the problem being dependent upon it (or them) functionally.
Fermat determined the optimal value by imposing a condition using his adequality of quantities. But he did not really think of quantities as functions, nor did he realize that his method produced only a necessary condition for his optimisation condition. For a more detailed general introduction, see chapters 1 and 2 of the volume edited by Grattan-Guinness (Bos et al. 1980 [19]).
The doctrine of limits is sometimes claimed to have replaced that of infinitesimals when analysis was rigorized in the 19th century. While it is true that Cantor, Dedekind and Weierstrass attempted (not altogether successfully; see Ehrlich 2006 [32]; Mormann & Katz 2013 [79]) to eliminate infinitesimals from analysis, the history of the limit concept is more complex. Newton had explicitly written that his ultimate ratios were not actually ratios but, rather, limits of prime ratios (see Russell 1903 [89, item 316, p. 338-339]; Pourciau 2001 [84]). In fact, the sources of a rigorous notion of limit are considerably older than the 19th century.
In the context of Leibnizian mathematics, the limit of $f(x)$ as $x$ tends to $x_0$ can be viewed as the “assignable part” (as Leibniz may have put it) of $f(x_0 + dx)$, where $dx$ is an “inassignable” infinitesimal increment (whenever the answer is independent of the infinitesimal chosen). A modern formalisation of this idea exploits the standard part principle (see Keisler 2012 [67, p. 36]).
In the context of ordered fields $E$, the standard part principle is the idea that if $E$ is a proper extension of the real numbers $\mathbb{R}$, then every finite (or limited) element $x \in E$ is infinitely close to a suitable $x_0 \in \mathbb{R}$. Such a real number is called the standard part (sometimes called the shadow) of $x$, or in formulas, $\mathrm{st}(x) = x_0$. Denoting by $E_f$ the collection of finite elements of $E$, we obtain a map
\[ \mathrm{st} \colon E_f \to \mathbb{R}. \]
Here $x$ is called finite if it is smaller (in absolute value) than some real number (the term finite is immediately comprehensible to a wide mathematical public, whereas limited corresponds to correct technical usage); an infinitesimal is smaller (in absolute value) than every positive real; and $x$ is infinitely close to $x_0$ in the sense that $x - x_0$ is infinitesimal.
Briefly, the standard part function “rounds off” a finite element of $E$ to the nearest real number (see Figure 1).
The proof of the principle is easy. A finite element $x \in E$ defines a Dedekind cut on the subfield $\mathbb{R} \subset E$ (alternatively, on $\mathbb{Q} \subset \mathbb{R}$), and the cut in turn defines the real $x_0$ via the usual correspondence between cuts and real numbers. One sometimes writes down the relation
\[ x \approx x_0 \]
to express infinite closeness.
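The rounding-off performed by the standard part can be illustrated on a toy model. The following Python sketch is only an illustration under stated assumptions: it works with polynomials in a single symbol eps with rational coefficients, where eps is treated as a positive infinitesimal, rather than with a full hyperreal field. In this toy setting every element is finite and its shadow is simply its constant coefficient. The helper names (`sub`, `mul`, `div_by_eps`, `st`) are ours.

```python
from fractions import Fraction
from itertools import zip_longest

# An element a0 + a1*eps + a2*eps**2 + ... is stored as the list [a0, a1, a2, ...].
# Assumption of this sketch: eps is a positive infinitesimal, so each such
# element is finite and infinitely close to its constant coefficient a0.

def sub(p, q):
    """Coefficientwise difference p - q."""
    return [a - b for a, b in zip_longest(p, q, fillvalue=Fraction(0))]

def mul(p, q):
    """Product of two elements (ordinary polynomial multiplication in eps)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def div_by_eps(p):
    """Divide by eps; only allowed when the constant coefficient vanishes."""
    if p[0] != 0:
        raise ValueError("quotient would be infinite, so it has no standard part")
    return p[1:] or [Fraction(0)]

def st(p):
    """Standard part (shadow): discard everything except the constant coefficient."""
    return p[0]

# Difference quotient of x**2 at x = 3 with infinitesimal increment eps.
x = [Fraction(3)]
x_plus_eps = [Fraction(3), Fraction(1)]
quotient = div_by_eps(sub(mul(x_plus_eps, x_plus_eps), mul(x, x)))  # 6 + eps
print(st(quotient))  # 6
```

The quotient 6 + eps is infinitely close to 6, and taking the standard part returns exactly that real value.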
We argue that the sources of such a relation, and of the standard part principle, go back to Fermat, Leibniz, Euler, and Cauchy. Leibniz would discard the inassignable part of an expression such as $2x + dx$ to arrive at the expected answer, $2x$, relying on his law of homogeneity (see Section 4). Such an inferential move is mirrored by a suitable proxy in the hyperreal approach, namely the standard part function.
Fermat, Leibniz, Euler, and Cauchy all used one or another form of approximate equality, or the idea of discarding “negligible” terms. Their inferential moves find suitable proxies in the context of modern theories of infinitesimals, and specifically the concept of shadow.
The last two sections present an application of the standard part to decreasing rearrangements of real functions and to a problem on divergent integrals due to S. Konyagin.
This article continues efforts in revisiting the history and foundations of infinitesimal calculus and modern nonstandard analysis. Previous efforts in this direction include Bair et al. (2013 [6]); Bascelli (2014 [7]); Błaszczyk et al. (2013 [15]); Borovik et al. (2012 [16], [17]); Kanovei et al. (2013 [55]); Katz, Katz & Kudryk (2014 [61]); Mormann et al. (2013 [79]); Sherry et al. (2014 [92]); Tall et al. (2014 [97]).
2. Methodological remarks
To comment on the historical subtleties of judging or interpreting past mathematics by present-day standards (some reflections on this can be found in Lewis 1975 [76]), note that neither Fermat, Leibniz, Euler, nor Cauchy had access to the semantic foundational frameworks as developed in mathematics at the end of the 19th and first half of the 20th centuries. What we argue is that their syntactic inferential moves ultimately found modern proxies in Robinson’s framework, thus placing a firm (relative to ZFC, the Zermelo–Fraenkel set theory with the Axiom of Choice) semantic foundation underneath the classical procedures of these masters. Benacerraf (1965 [10]) formulated a related dichotomy in terms of mathematical practice vs mathematical ontology.
For example, the Leibnizian laws of continuity (see Knobloch 2002 [69, p. 67]) and homogeneity can be recast in terms of modern concepts such as the transfer principle and the standard part principle over the hyperreals, without ever appealing to the semantic content of the technical development of the hyperreals as a punctiform continuum; similarly, Leibniz’s proof of the product rule for differentiation is essentially identical, at the syntactic level, to a modern infinitesimal proof (see Section 4).
2.1. A-track and B-track
The crucial distinction between syntactic and semantic aspects of the work involving mathematical continua appears to have been overlooked by R. Arthur who finds fault with the hyperreal proxy of the Leibnizian continuum, by arguing that the latter was nonpunctiform (see Arthur 2013 [5]). Yet this makes little difference at the syntactic level, as explained above. Arthur’s brand of the syncategorematic approach following Ishiguro (1990 [52]) involves a reductive reading of Leibnizian infinitesimals as logical (as opposed to pure) fictions involving a hidden quantifier à la Weierstrass, ranging over “ordinary” values. This approach was critically analyzed in (Katz & Sherry 2013 [65]); (Sherry & Katz 2013 [92]); (Tho 2012 [101]).
Robinson’s framework poses a challenge to traditional historiography of mathematical analysis. The traditional thinking is often dominated by a kind of Weierstrassian teleology. This is a view of the history of analysis as univocal evolution toward the radiant Archimedean framework as developed by Cantor, Dedekind, Weierstrass, and others starting around 1870, described as the A-track in a recent piece in these Notices (see Bair et al. 2013 [6]).
Robinson’s challenge is to point out not only the possibility, but also the existence of a parallel Bernoullian track for the development of analysis, or B-track for short. (Historians often name Johann Bernoulli as the first mathematician to have adhered systematically and exclusively to the infinitesimal approach as the basis for the calculus.) The B-track assigns an irreducible and central role to the concept of infinitesimal, a role it played in the work of Leibniz, Euler, mature Lagrange, Cauchy, and others. (In the second edition of his Mécanique Analytique dating from 1811, Lagrange fully embraced the infinitesimal in the following terms: “Once one has duly captured the spirit of this system [i.e., infinitesimal calculus], and has convinced oneself of the correctness of its results by means of the geometric method of the prime and ultimate ratios, or by means of the analytic method of derivatives, one can then exploit the infinitely small as a reliable and convenient tool so as to shorten and simplify proofs”. See (Katz & Katz 2011 [58]) for a discussion.)
The caliber of some of the responses to Robinson’s challenge has been disappointing. Thus, the critique by Earman (1975 [30]) is marred by a confusion of second-order infinitesimals like $dx^2$ and second-order hyperreal extensions like ${}^{\ast\ast}\mathbb{R}$; see (Katz & Sherry 2013 [65]) for a discussion.
Victor J. Katz (2014 [66]) appears to imply that a B-track approach based on notions of infinitesimals or indivisibles is limited to “the work of Fermat, Newton, Leibniz and many others in the 17th and 18th centuries”. This does not appear to be Felix Klein’s view. Klein formulated a condition, in terms of the mean value theorem, for what would qualify as a successful theory of infinitesimals (the Klein–Fraenkel criterion is discussed in more detail in Kanovei et al. 2013 [55]), and concluded:
I will not say that progress in this direction is impossible, but it is true that none of the investigators have achieved anything positive (Klein 1908 [68, p. 219]).
Klein was referring to the current work on infinitesimal-enriched systems by Levi-Civita, Bettazzi, Stolz, and others. In Klein’s mind, the infinitesimal track was very much a current research topic; see Ehrlich (2006 [32]) for a detailed coverage of the work on infinitesimals around 1900.
2.2. Formal epistemology: Easwaran on hyperreals
Some recent articles are more encouraging in that they attempt a more technically sophisticated approach. K. Easwaran’s study (2014 [31]), motivated by a problem in formal epistemology, attempts to deal with technical aspects of Robinson’s theory such as the notion of internal set, and shows an awareness of recent technical developments, such as a definable hyperreal system of Kanovei & Shelah (2004 [57]). (The problem is concerned with saving philosophical Bayesianism, a popular position in formal epistemology, which appears to require that one be able to find on every algebra of doxastically relevant propositions some subjective probability assignment such that only the impossible event ($\emptyset$) will be assigned an initial/uninformed subjective probability, or credence, of $0$.)
Even though Easwaran, in the tradition of Lewis (1980 [77]) and Skyrms (1980 [94]), tries to engage seriously with the intricacies of employing hyperreals in formal epistemology (for instance, he concedes: “And the hyperreals may also help, as long as we understand that they do not tell us the precise structure of credences”; Easwaran 2014 [31], Introduction, last paragraph), not all of his findings are convincing. For example, he assumes that physical quantities cannot take hyperreal values; his explicit premise is that “All physical quantities can be entirely parametrized using the standard real numbers” (Easwaran 2014 [31, Section 8.4, Premise 3]). However, there exist physical quantities that are not directly observable. Theoretical proxies for unobservable physical quantities typically depend on the chosen mathematical model. And not surprisingly, there are mathematical models of physical phenomena which operate with the hyperreals, in which physical quantities take hyperreal values. Many such models are discussed in the volume by Albeverio et al. (1986 [1]).
For example, certain probabilistic laws of nature have been formulated using hyperreal-valued probability theory. The construction of mathematical Brownian motion by Anderson (1976 [4]) provides a hyperreal model of the botanical counterpart. It is unclear why (and indeed rather implausible that) an observer A, whose degrees of belief about botanical Brownian motion stem from a mathematical model based on the construction of mathematical Brownian motion by Wiener (1923 [104]), should be viewed as being more rational than another observer B, whose degrees of belief about botanical Brownian motion stem from a mathematical model based on Anderson’s construction of mathematical Brownian motion. (One paradoxical aspect of Easwaran’s methodology is that, despite his anti-hyperreal stance in (2014 [31]), he does envision the possibility of useful infinitesimals in an earlier joint paper (Colyvan & Easwaran 2008 [27]), where he cites John Bell’s account [Bell’s presentation of Smooth Infinitesimal Analysis in [9] involves a category-theoretic framework based on intuitionistic logic], but never the hyperreals. Furthermore, in the 2014 paper he cites the surreals as possible alternatives to the real number–based description of the “structure of physical space” as he calls it; see Subsection 2.5 below for a more detailed discussion.)
Similarly problematic is Easwaran’s assumption that an infinite sequence of probabilistic tests must necessarily be modeled by the set of standard natural numbers (this is discussed in more detail in Subsection 2.5). Such an assumption eliminates the possibility of modeling it by a sequence of infinite hypernatural length. Indeed, once one allows for infinite sequences to be modeled in this way, the problem of assigning a probability to an infinite sequence of coin tosses that was studied in (Elga 2004 [33]) and (Williamson 2007 [105]) allows for an elegant hyperreal solution (Herzberg 2007 [48]).
Easwaran reiterates the common objection that the hyperreals are allegedly “nonconstructive” entities. The bitter roots of such an allegation in the radical constructivist views of E. Bishop have been critically analyzed in (Katz & Katz 2011 [59]), and contrasted with the liberal views of the Intuitionist A. Heyting, who felt that Robinson’s theory was “a standard model of important mathematical research” (Heyting 1973 [51, p. 136]). It is important to keep in mind that Bishop’s target was classical mathematics (as a whole), the demise of which he predicted in the following terms:
Very possibly classical mathematics will cease to exist as an independent discipline (Bishop 1968 [14, p. 54]).
2.3. Zermelo–Fraenkel axioms and the Feferman–Levy model
In his analysis, Easwaran assigns substantial weight to the fact that “it is consistent with the ZF [Zermelo–Fraenkel set theory] without the Axiom of Choice” that the hyperreals do not exist (Easwaran 2014 [31, Section 8.4]); see Figure 2. However, on the same grounds, one would have to reject parts of mathematics with important applications. There are fundamental results in functional analysis that depend on the Axiom of Choice such as the Hahn–Banach theorem; yet no one would suggest that mathematical physicists or mathematical economists should stop exploiting them.
Most real analysis textbooks prove the $\sigma$-additivity (i.e., countable additivity) of Lebesgue measure, but $\sigma$-additivity is not deducible from ZF, as shown by the Feferman–Levy model; see (Feferman & Levy 1963 [36]); (Jech 1973 [54, chapter 10]). Indeed, it is consistent with ZF that the following holds:
($\ast$) the continuum of real numbers $\mathbb{R}$ is a countable union of countable sets.
See (Cohen 1966 [26, chapter IV, section 4]) for a description of a model of ZF in which ($\ast$) holds. (Property ($\ast$) may appear to be asserting the countability of the continuum. However, in order to obtain a bijective map from a countable collection of countable sets to $\mathbb{N} \times \mathbb{N}$ (and hence, by diagonalization, to $\mathbb{N}$), the Axiom of Choice (in its “countable” version which allows a countably-infinite sequence of independent choices) will necessarily be used.) Note that ($\ast$) implies that the Lebesgue measure is not countably additive, as all countable sets are null sets whereas $\mathbb{R}$ is not a null set. Therefore, countable additivity of the Lebesgue measure cannot be established in ZF.
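The last implication can be spelled out in one line. In the following sketch, $\lambda$ denotes Lebesgue measure and $X_n$ denotes the countable sets whose union is the continuum; both names are ours, introduced only for this illustration.

```latex
\[
\lambda(\mathbb{R})
  \;=\; \lambda\Bigl(\bigcup_{n\in\mathbb{N}} X_n\Bigr)
  \;\le\; \sum_{n\in\mathbb{N}} \lambda(X_n)
  \;=\; \sum_{n\in\mathbb{N}} 0
  \;=\; 0 .
\]
```

The inequality is what countable additivity would provide. Since $\lambda([0,1]) = 1$, the conclusion $\lambda(\mathbb{R}) = 0$ is absurd, so countable additivity must fail in any model of ZF where the continuum is such a union.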
Terence Tao wrote:
By giving up countable additivity, one loses a fair amount of measure and integration theory, and in particular the notion of the expectation of a random variable becomes problematic (unless the random variable takes only finitely many values). (Tao 2013 [100])
Tao’s remarks suggest that deducibility from ZF is not a reasonable criterion of mathematical plausibility by any modern standard.
There are models of ZF in which there are infinitesimal numbers, if properly understood, among the real numbers themselves. Thus, there exist models of ZF which are also models of Nelson’s (1987 [82]) radically elementary mathematics, a subsystem of Nelson’s (1977 [81]) Internal Set Theory. Here radically elementary mathematics is an extension of classical set theory (which may be understood as ZF, even though Nelson would probably argue for a much weaker system; see Herzberg (2013 [49, Appendix A.1]), citing Nelson (2011 [83])) by a unary predicate, to be interpreted as
“… is a standard natural number”,
with additional axioms that regulate the use of the new predicate (notably external induction for standard natural numbers) and ensure the existence of nonstandard numbers. Nelson (1987 [82, Appendix]) showed that a major part of the theory of continuous-time stochastic processes is in fact equivalent to a corresponding radically elementary theory involving infinitesimals, and indeed, radically elementary probability theory has seen applications in the sciences; see for example (Reder 2003 [85]).
In sum, mathematical descriptions of nontrivial natural phenomena involve, by necessity, some degree of mathematical idealisation, but Easwaran has not given us a good reason why only such mathematical idealisations that are feasible in every model of ZF should be acceptable. Rather, as we have already seen, there are very good arguments (e.g., from measure theory) against such a high reverence for ZF.
2.4. Skolem integers and Robinson integers
Easwaran recycles the well-known claim by A. Connes that a hypernatural number leads to a nonmeasurable set. However, the criticism by Connes is in the category of dressing down a feature to look like a bug, to reverse a known dictum from computer science slang (see https://en.wikipedia.org/wiki/Undocumented_feature). (Note that Connes relied on the Hahn–Banach theorem, exploited ultrafilters, and placed a nonconstructive entity, namely the Dixmier trace, on the front cover of his magnum opus; see (Katz & Leichtnam 2013 [62]) and (Kanovei et al. 2013 [55]) for details.) This can be seen as follows. The Skolem nonstandard integers $\mathbb{N}_{\mathrm{Sk}}$ are known to be purely constructive; see Skolem (1955 [93]) and Kanovei et al. (2013 [55]). Yet they embed in Robinson’s hypernaturals ${}^{\ast}\mathbb{N}$:
\[ \mathbb{N}_{\mathrm{Sk}} \hookrightarrow {}^{\ast}\mathbb{N}. \tag{2.1} \]
Viewing a purely constructive Skolem hypernatural
\[ H \in \mathbb{N}_{\mathrm{Sk}} \setminus \mathbb{N} \]
as a member of ${}^{\ast}\mathbb{N}$ via the inclusion (2.1), one can apply the transfer principle to form the set
\[ X_H = \{ A \subset \mathbb{N} : H \in {}^{\ast}A \}, \]
where ${}^{\ast}A \subset {}^{\ast}\mathbb{N}$ is the natural extension of $A$. The set $X_H$ is not measurable. What propels the set $X_H$ into existence is not a purported weakness of a nonstandard integer $H$ itself, but rather the remarkable strength of both the Łoś–Robinson transfer principle and the consequences it yields.
2.5. Williamson, complexity, and other arguments
Easwaran makes a number of further critiques of hyperreal methodology. His section 8.1, entitled “Williamson’s Argument”, concerns infinite coin tosses. Easwaran’s analysis is based on the model of a countable sequence of coin tosses given by Williamson [105]. In this model, it is assumed that
… for definiteness, [the coin] will be flipped once per second, assuming that seconds from now into the future can be numbered with the natural numbers (Easwaran 2014 [31, section 8.1]).
What is lurking behind this is a double assumption which, unlike other “premises”, is not made explicit by Easwaran. Namely, he assumes that

a vast number of independent tests is best modeled by a temporal arrangement thereof, rather than by a simultaneous collection; and

the collection of seconds ticking away “from now [and] into the future” gives a faithful representation of the natural numbers.
These two premises are not self-evident and some research mathematicians have very different intuitions about the matter, as much of the literature on applied nonstandard analysis (e.g., Albeverio et al. 1986 [1]; Reder 2003 [85]) illustrates.
It seems that in Easwaran’s model, an agent can choose not to flip the coin at some seconds, thus giving rise to events like “a coin that is flipped starting at second 2 comes up heads on every flip”. However, in all applications we are aware of, this additional structure used to rule out the use of hyperreals as the range of probability functions seems not to be relevant.
Williamson and Easwaran appear to be unwilling to assume that, once one decides to use hyperreal infinitesimals, one should also replace the original algebra “of propositions in which the agent has credences” with an internal algebra of the hyperreal setting. In fact, such an additional step allows one to avoid both the problems raised by Williamson’s argument in his formulation using conditional probability, and those raised by Easwaran in section 8.2 of his paper.
A possible model with hyperreal infinitesimals for an infinite sequence of coin tosses is given by representing every event by means of a sequence $a_1, \ldots, a_N$, where $a_n$ represents the outcome of the $n$-th flip and $N$ is a fixed hypernatural number. In this model, consider the events “$a_n =$ Heads for $n \leq N$”, that we will denote $H(1)$, and “$a_n =$ Heads for $2 \leq n \leq N$”, that we will denote $H(2)$. In such a setting, events $H(1)$ and $H(2)$ are not isomorphic, contrary to what was argued in (Williamson [105, p. 3]). This is due to the fact that hypernatural numbers are an elementary extension of the natural numbers, for which the formula $n \neq n+1$ always holds. Moreover, the probability of $H(1)$ is the infinitesimal $2^{-N}$, while the probability of $H(2)$ is the strictly greater infinitesimal $2^{-(N-1)}$, thus obeying the well-known rule for conditional probability.
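The arithmetic behind this comparison can be checked for a finite number of flips. The sketch below is an illustration under an explicit assumption: the finite value `N = 50` stands in for the fixed hypernatural length, on the understanding that the same identities carry over to a hypernatural length by the transfer principle. The function name `prob_all_heads` is ours.

```python
from fractions import Fraction

def prob_all_heads(first, last):
    """Probability that fair flips number first..last (inclusive) all come up heads."""
    return Fraction(1, 2) ** (last - first + 1)

N = 50  # assumption: finite stand-in for the fixed hypernatural length

p_h1 = prob_all_heads(1, N)  # heads on every flip from the first one
p_h2 = prob_all_heads(2, N)  # heads on every flip from the second one

# The first event is contained in the second, so the conditional probability
# of the first given the second is the quotient of the two probabilities.
conditional = p_h1 / p_h2

print(p_h1 < p_h2)   # True
print(conditional)   # 1/2
```

The two events receive different probabilities, and their quotient is the probability of heads on the first flip, as the rule for conditional probability requires.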
Easwaran’s section 8.4 entitled “The complexity argument” is based on four premises. However, his premise 3, to the effect that “all physical quantities can be entirely parametrized using the standard real numbers”, is unlikely to lead to meaningful philosophical conclusions based on “first principles”. This is because all physical quantities can be entirely parametrized by the usual rational numbers alone, due to the intrinsic limits of our capability to measure physical quantities. A clear explanation of this limitation was given by Dowek. In particular, since
a measuring instrument yields only an approximation of the measured magnitude, […] it is therefore impossible, except according to this idealization, to measure more than the first digits of a physical magnitude. […] According to this principle, this idealization of the process of measurement is a fiction. This suggests the idea, reminiscent of Pythagoras’ views, that Physics could be formulated with rational numbers only. We can therefore wonder why real numbers have been invented and, moreover, used in Physics. A hypothesis is that the invention of real numbers is one of the many situations, where the complexity of an object is increased, so that it can be apprehended more easily. (Dowek 2013 [29])
Related comments by Wheeler (1994 [103, p. 308]), Brukner & Zeilinger (2005 [22, p. 59]), and others were analyzed by Kanovei et al. (2013 [55, Section 8.4]). See also Jaroszkiewicz (2014 [53]).
If all physical quantities can be entirely parametrized by using rational numbers, there should be no compelling reason to choose the real number system as the value range of our probability measures. However, Easwaran is apparently comfortable with the idealisation of exploiting a larger number system than the rationals for the value range of probability measures. What we argue is that the real numbers are merely one among possible idealisations that can be used for this purpose. For instance, in hyperreal models for infinite sequences of coin tosses developed by Benci, Bottazzi & Di Nasso (2013 [11]), all events have hyperrational probabilities. This generalizes both the case of finite sequences of coin tosses, and the Kolmogorovian model for infinite sequences of coin tosses, where a real-valued probability is generated by applying Carathéodory’s extension theorem to the rational-valued probability measure over the cylinder sets.
Given Easwaran’s firm belief that “the function relating credences to the physical is not so complex that its existence is independent of Zermelo–Fraenkel set theory” (see his section 8.4, premise 2), it is surprising to find him suggesting that
the surreal numbers seem more promising as a device for future philosophers of probability to use (Easwaran 2014 [31, Appendix A.3]).
However, while the construction of the surreals indeed “is a simultaneous generalization of Dedekind’s construction of the real numbers and von Neumann’s construction of the ordinals”, as observed by Easwaran, it is usually carried out in the Von Neumann–Bernays–Gödel set theory (NBG) with Global Choice; see, for instance, the “Preliminaries” section of (Alling 1987 [3]). The assumption of the Global Axiom of Choice is a strong foundational assumption.
The construction of the surreal numbers can be performed within a version of NBG that is a conservative extension of ZFC, but does not need Limitation of Size (or Global Choice). However, NBG clearly is not a conservative extension of ZF; and if one wishes to prove certain interesting features of the surreals one needs an even stronger version of NBG that involves the Axiom of Global Choice. Therefore, the axiomatic foundation that one needs for using the surreal numbers is at least as strong as the one needed for the hyperreals.
2.6. Infinity and infinitesimal: let both pretty severely alone
At the previous turn of the century, H. Heaton wrote:
I think I know exactly what is meant by the term zero. But I can have no conception either of infinity or of the infinitesimal, and I think it would be well if mathematicians would let both pretty severely alone (Heaton 1898 [47, p. 225]).
Heaton’s sentiment expresses an unease about a mathematical concept of which one may have an intuitive grasp but which is not easily formalizable. (The intuitive appeal of infinitesimals makes them an effective teaching tool. The pedagogical value of teaching calculus with infinitesimals was demonstrated in a controlled study by Sullivan (1976 [96]).) Heaton points out several mathematical inconsistencies or ill-chosen terminology among the conceptions of infinitesimals of his contemporaries. This highlights the brilliant mathematical achievement of a consistent “calculus” for infinitesimals attained through the work of Hewitt (1948 [50]), Łoś (1955 [78]), Robinson (1961 [87]), and Nelson (1977 [81]), but also of their predecessors like Fermat, Leibniz, Euler, and Cauchy, as we analyze respectively in Sections 3, 4, 5, and 6.
3. Fermat’s adequality
Our interpretation of Fermat’s technique is compatible with those by Strømholm (1968 [95]) and Giusti (2009 [43]). It is at variance with the interpretation by Breger (1994 [21]), considered by Knobloch (2014 [70]) to have been refuted.
Adequality, or παρισότης (parisotēs) in the original Greek of Diophantus, is a crucial step in Fermat’s method of finding maxima, minima, tangents, and solving other problems that a modern mathematician would solve using infinitesimal calculus. The method is presented in a series of short articles in Fermat’s collected works. The first article, Methodus ad Disquirendam Maximam et Minimam, opens with a summary of an algorithm for finding the maximum or minimum value of an algebraic expression in a variable $a$. For convenience, we will write such an expression in modern functional notation as $f(a)$.
3.1. Summary of Fermat’s algorithm
One version of the algorithm can be broken up into six steps in the following way:
(1) Introduce an auxiliary symbol $e$, and form $f(a+e)$;
(2) Set adequal the two expressions $f(a+e) \,{}_{\ulcorner\!\urcorner}\, f(a)$ (the notation “${}_{\ulcorner\!\urcorner}$” for adequality is ours, not Fermat’s);
(3) Cancel the common terms on the two sides of the adequality. The remaining terms all contain a factor of $e$;
(4) Divide by $e$ (see also next step);
(5) In a parenthetical comment, Fermat adds: “or by the highest common factor of $e$”;
(6) Among the remaining terms, suppress all terms which still contain a factor of $e$. Solving the resulting equation for $a$ yields the extremum of $f$.
In modern mathematical language, the algorithm entails expanding the difference quotient
\[ \frac{f(a+e) - f(a)}{e} \]
in powers of $e$ and taking the constant term. (Fermat also envisions a more general technique involving division by a higher power of $e$ as in step (5).) The method (leaving aside step (5)) is immediately understandable to a modern reader as the elementary calculus exercise of finding the extremum by solving the equation $f'(a) = 0$. But the real question is how Fermat understood this algorithm in his own terms, in the mathematical language of his time, prior to the invention of calculus by Barrow, Leibniz, Newton, and others.
There are two crucial points in trying to understand Fermat’s reasoning: first, the meaning of “adequality” in step (2), and second, the justification for suppressing the terms involving positive powers of $e$ in step (6). The two issues are closely related because interpretation of adequality depends on the conditions on $e$. One condition which Fermat always assumes is that $e$ is positive. He did not use negative numbers in his calculations. (This point is crucial for our argument below using the transverse ray. Since Fermat is only working with positive values of his $e$, he only considers a ray, rather than a full line, starting at a point of the curve. The convexity of the curve implies an inequality, which Fermat transforms into an adequality without giving much explanation of his procedure, but assuming implicitly that the ray is tangent to the curve. But a transverse ray would satisfy the inequality no less than a tangent ray, indicating that Fermat is relying on an additional piece of geometric information. His procedure of applying the defining relation of the curve itself, to a point on the tangent ray, is only meaningful when the increment $e$ is small; see Subsection 3.2.)
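The steps of the algorithm can be traced mechanically on a polynomial. The following Python sketch does so for the expression $ba - a^2$ (the product of the two parts of a segment of length $b$), with the hypothetical value $b = 10$. The function names and the dictionary representation of polynomials are our own illustrative choices and are not taken from Fermat or from the sources cited here; step (5) is left aside.

```python
from fractions import Fraction
from math import comb

def expand_f_of_a_plus_e(f):
    """Expand f(a + e) for a polynomial f given as {power of a: coefficient}.

    Returns {(power of a, power of e): coefficient}, via the binomial theorem.
    """
    out = {}
    for n, c in f.items():
        for j in range(n + 1):
            key = (n - j, j)
            out[key] = out.get(key, 0) + c * comb(n, j)
    return out

def fermat_reduced_equation(f):
    """Follow steps (1)-(4) and (6); return the final equation as {power of a: coefficient}."""
    terms = expand_f_of_a_plus_e(f)                           # step (1): form f(a + e)
    for n, c in f.items():                                    # steps (2)-(3): adequate with
        terms[(n, 0)] = terms.get((n, 0), 0) - c              # f(a) and cancel common terms
    terms = {k: v for k, v in terms.items() if v != 0}        # every survivor has a factor e
    terms = {(i, j - 1): v for (i, j), v in terms.items()}    # step (4): divide by e
    return {i: v for (i, j), v in terms.items() if j == 0}    # step (6): suppress terms in e

b = Fraction(10)                       # hypothetical segment length
f = {1: b, 2: Fraction(-1)}            # f(a) = b*a - a**2
equation = fermat_reduced_equation(f)  # represents b - 2a = 0
a_star = -equation[0] / equation[1]    # solve the resulting linear equation for a
print(a_star)  # 5
```

The surviving equation is $b - 2a = 0$, so the product is extremal when the segment is halved. The suppression in step (6) plays the same role as taking the constant term of the difference quotient.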
Fermat introduces the term adequality in Methodus with a reference to Diophantus of Alexandria. In the third article of the series, Ad Eamdem Methodum (Sur la Même Méthode), he quotes Diophantus’ Greek term παρισότης, which he renders, following Xylander and Bachet, as adaequatio or adaequalitas (see A. Weil [102, p. 28]).
3.2. Tangent line and convexity of parabola
Consider Fermat’s calculation of the tangent line to the parabola (see Fermat [38, p. 122-123]). To simplify Fermat’s notation, we will work with the parabola $y = x^2$, or
\[ \frac{x^2}{y} = 1. \]
To understand what Fermat is doing, it is helpful to think of the parabola as a level curve of the two-variable function $\frac{x^2}{y}$.
Given a point $(x, y)$ on the parabola, Fermat wishes to find the tangent line through the point. Fermat exploits the geometric fact that by convexity, a point
\[ (p, q) \]
on the tangent line lies outside the parabola. He therefore obtains an inequality equivalent in our notation to $\frac{p^2}{q} > 1$, or $p^2 > q$. Here $q = y - e$, and $e$ is Fermat’s magic symbol we wish to understand. Thus, we obtain
\[ \frac{p^2}{y - e} > 1. \tag{3.1} \]
At this point Fermat proceeds as follows:
(i) he writes down the inequality $\frac{p^2}{y-e} > 1$, or $p^2 > y - e$;
(ii) he invites the reader to adégaler (to “adequate”);
(iii) he writes down the adequality $\frac{x^2}{p^2} \,{}_{\ulcorner\!\urcorner}\, \frac{y}{y-e}$;
(iv) he uses an identity involving similar triangles to substitute
\[ \frac{x}{p} = \frac{y + r}{y + r - e}, \]
where $r$ is the distance from the vertex of the parabola to the point of intersection of the tangent to the parabola at $y$ with the axis of symmetry;
(v) he cross multiplies and cancels identical terms on right and left, then divides out by $e$, discards the remaining terms containing $e$, and obtains $y = r$ as the solution. (In Fermat’s notation $y = d$, $y + r = a$. Step (v) can be understood as requiring the expression $\frac{y}{y-e} - \frac{(y+r)^2}{(y+r-e)^2}$ to have a double root at $e = 0$, leading to the solution $y = r$ or in Fermat’s notation $a = 2d$.)
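The algebra compressed into the last step can be written out in full. The sketch below assumes the parabola $y = x^2$, a point on the tangent at height $y - e$, and $r$ for the distance from the vertex to the point where the tangent meets the axis; the symbol ${}_{\ulcorner\!\urcorner}$ stands for adequality. These choices of notation are a reconstruction on our part.

```latex
\begin{align*}
\frac{y}{y-e} &\;{}_{\ulcorner\!\urcorner}\; \frac{(y+r)^2}{(y+r-e)^2}
  && \text{adequality after the similar-triangles substitution} \\
y\,(y+r-e)^2 &\;{}_{\ulcorner\!\urcorner}\; (y+r)^2\,(y-e)
  && \text{cross multiply} \\
y(y+r)^2 - 2y(y+r)\,e + y\,e^2 &\;{}_{\ulcorner\!\urcorner}\; y(y+r)^2 - (y+r)^2\,e
  && \text{expand} \\
-2y(y+r)\,e + y\,e^2 &\;{}_{\ulcorner\!\urcorner}\; -(y+r)^2\,e
  && \text{cancel identical terms} \\
-2y(y+r) + y\,e &\;{}_{\ulcorner\!\urcorner}\; -(y+r)^2
  && \text{divide by } e \\
2y(y+r) &= (y+r)^2
  && \text{discard the term containing } e \\
r &= y
  && \text{since } y + r > 0 .
\end{align*}
```

The result says that the tangent meets the axis as far below the vertex as the point of tangency lies above it, which agrees with the modern computation for $y = x^2$.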
What interests us here are steps (i) and (ii). How does Fermat pass from an inequality to an adequality? Giusti noted that
Comme d’habitude, Fermat est autant détaillé dans les exemples qu’il est réticent dans les explications. On ne trouvera donc presque jamais des justifications de sa règle des tangentes (Giusti 2009 [43]).
In fact, Fermat provides no explicit explanation for this step. However, what he does is to apply the defining relation for a curve to points on the tangent line to the curve. Note that here the quantity $e$, as in $q = y - e$, is positive: Fermat did not have the facility we do of assigning negative values to variables. Strømholm notes that Fermat
never considered negative roots, and if $A = 0$ was a solution of an equation, he did not mention it as it was nearly always geometrically uninteresting (Strømholm 1968 [95, p. 49]).
Fermat says nothing about considering points “on the other side”, i.e., further away from the vertex of the parabola, as he does in the context of applying a related but different method, for instance in his two letters to Mersenne (see [95, p. 51]), and in his letter to Brûlart [39]. [Footnote 18: This was noted by Giusti (2009 [43]).] Now for positive values of $e$, Fermat’s inequality (3.1) would be satisfied by a transverse ray (i.e., secant ray) starting at $(x, y)$ and lying outside the parabola, just as much as it is satisfied by a tangent ray starting at $(x, y)$. Fermat’s method therefore presupposes an additional piece of information, privileging the tangent ray over transverse rays. The additional piece of information is geometric in origin: he applies the defining relation (of the curve itself) to a point on the tangent ray to the curve, a procedure that is only meaningful when the increment $e$ is small.
In modern terms, we would speak of the tangent line being a “best approximation” to the curve for a small variation $e$; however, Fermat does not explicitly discuss the size of $e$. The procedure of “discarding the remaining terms” in step (v) admits of a proxy in the hyperreal context. Namely, it is the standard part principle (see Section 1). Fermat does not elaborate on the justification of this step, but he is always careful to speak of suppressing or deleting the remaining term in $e$, rather than setting it equal to zero. Perhaps his rationale for suppressing terms in $e$ consists in ignoring terms that don’t correspond to an actual measurement, prefiguring Leibniz’s inassignable quantities. Fermat’s inferential moves in the context of his adequality are akin to Leibniz’s in the context of his calculus; see Section 4.
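Step (v) can be sketched computationally. What follows is only a minimal illustration, not Fermat’s own procedure: it assumes the parabola $y = x^2$ and the similar-triangles relation $\frac{x}{p} = \frac{y + r}{y + r - e}$, and the function names are ours. It expands the cross-multiplied adequality as a polynomial in $e$, divides by $e$, and discards the terms still containing $e$.

```python
from fractions import Fraction

def cross_multiplied_difference(y, r):
    """Coefficients [c0, c1, c2], in powers of e, of
    (y + r)^2 * (y - e) - y * (y + r - e)^2.
    Assumption: parabola y = x^2 and x/p = (y + r)/(y + r - e)."""
    y, r = Fraction(y), Fraction(r)
    a = y + r
    # (y + r)^2 (y - e) = a^2 y - a^2 e
    lhs = [a * a * y, -a * a, Fraction(0)]
    # y (a - e)^2 = y a^2 - 2 a y e + y e^2
    rhs = [y * a * a, -2 * a * y, y]
    return [left - right for left, right in zip(lhs, rhs)]

def fermat_condition(y, r):
    """Divide by e, then discard the terms that still contain e."""
    c0, c1, c2 = cross_multiplied_difference(y, r)
    assert c0 == 0        # identical terms cancel, so division by e is possible
    quotient = [c1, c2]   # c1 + c2*e after dividing by e
    return quotient[0]    # keep only the e-free part

print(fermat_condition(3, 3))  # 0: r = y satisfies the condition
print(fermat_condition(3, 1))  # 8: a ray meeting the axis elsewhere does not
```

In this toy model the retained coefficient is $(y + r)(y - r)$, which vanishes for positive $y, r$ exactly when $y = r$. The discarded coefficient of $e$ is never set to zero; it is simply dropped.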
3.3. Fermat, Galileo, and Wallis
While Fermat never spoke of his increment as being infinitely small, the technique was known to contemporaries of Fermat such as Galileo (see Bascelli 2014 [7], [8]) and Wallis (see Katz & Katz [60, Section 24]), as well as to Fermat himself, as his correspondence with Wallis makes clear; see Katz, Schaps & Shnider (2013 [63, Section 2.1]).
Fermat was very interested in Galileo’s treatise De motu locali, as we know from his letters to Marin Mersenne dated April/May 1637, 10 August, and 22 October 1638. Galileo’s treatment of infinitesimals in De motu locali is discussed by Wisan (1974 [106, p. 292]) and Settle (1966 [91]).
Alexander (2014 [2]) notes that the clerics in Rome forbade the doctrine of the infinitely small on 10 August 1632 (a month before Galileo was put on trial over heliocentrism); this may help explain why the Catholic Fermat might have been reluctant to speak of the infinitely small explicitly. [Footnote 19: See a related discussion at http://math.stackexchange.com/questions/661999/areinfinitesimalsdangerous]
In a recent text, U. Felgner analyzes the Diophantus problems which exploit the method of παρισότης, and concludes that
Aus diesen Beispielen wird deutlich, dass die Verben παρισοῦν und adaequare nicht ganz dasselbe ausdrücken. Das griechische Wort bedeutet, der Gleichheit nahe zu sein, während das lateinische Wort das Erreichen der Gleichheit (sowohl als vollendeten als auch als unvollendeten Prozeß) ausdrückt (Felgner 2014 [37]).
Thus, in his view, even though the two expressions have slightly different meanings, the Greek meaning “being close to equality” and the Latin meaning “equality which is reached (at the end of either a finite or an infinite process),” they both involve approximation. Felgner goes on to consider some of the relevant texts from Fermat, and concludes that Fermat’s method has nothing to do with differential calculus and involves only the property of an auxiliary expression having a double zero:
Wir hoffen, deutlich gemacht zu haben, dass die fermatsche “Methode der Adaequatio” gar nichts mit dem Differential-Kalkül zu tun hat, sondern vielmehr im Studium des Wertverlaufs eines Polynoms in der Umgebung eines kritischen Punktes besteht, und dabei das Ziel verfolgt zu zeigen, dass das Polynom an dieser Stelle eine doppelte Nullstelle besitzt (ibid.)
However, Felgner’s conclusion is inconsistent with his own textual analysis, which indicates that the idea of approximation is present in the methods of both Diophantus and Fermat. As Knobloch (2014 [70]) notes, “Fermat’s method of adequality is not a single method but rather a cluster of methods.” Felgner failed to analyze the examples of tangents to transcendental curves, such as the cycloid, in which Fermat does not study the order of the zero of an auxiliary polynomial. Felgner mistakenly asserts that in the case of the cycloid Fermat did not reveal how he thought of the solution: “Wie FERMAT sich die Lösung dachte, hat er nicht verraten.” (ibid.) Quite to the contrary, as Fermat explicitly stated, he applied the defining property of the curve to points on the tangent line:
Il faut donc adégaler (à cause de la propriété spécifique de la courbe qui est à considérer sur la tangente)
(see Katz et al. (2013 [63]) for more details). Fermat’s approach involves applying the defining relation of the curve to a point on a tangent to the curve. The approach is consistent with the idea of approximation inherent in his method, involving a negligible distance (whether infinitesimal or not) between the tangent and the original curve when one is near the point of tangency. This line of reasoning is related to the ideas of the differential calculus. Note that Fermat does not say anything here concerning the multiplicities of zeros of polynomials. As Felgner himself points out, in the case of the cycloid the only polynomial in sight is of first order and the increment cancels out. Fermat correctly solves the problem by obtaining the defining equation of the tangent.
For a recent study of 17th century methodology, see the article (Carroll et al. 2013 [23]).
4. Leibniz’s Transcendental law of homogeneity
In this section, we examine a possible connection between Fermat’s adequality and Leibniz’s Transcendental Law of Homogeneity (TLH). Both of them enable certain inferential moves that play parallel roles in Fermat’s and Leibniz’s approaches to the problem of maxima and minima. Note the similarity in titles of their seminal texts: Methodus ad Disquirendam Maximam et Minimam (Fermat, see Tannery [98, pp. 133]) and Nova methodus pro maximis et minimis … (Leibniz 1684 [72] in Gerhardt [42]).
4.1. When are quantities equal?
Leibniz developed the TLH in order to enable inferences to be made between inassignable and assignable quantities. The TLH governs equations involving differentials. H. Bos interprets it as follows:
A quantity which is infinitely small with respect to another quantity can be neglected if compared with that quantity. Thus all terms in an equation except those of the highest order of infinity, or the lowest order of infinite smallness, can be discarded. For instance,
(4.1) $a + dx = a, \qquad dx + ddy = dx$
etc. The resulting equations satisfy this […] requirement of homogeneity (Bos 1974 [18, p. 33] paraphrasing Leibniz 1710 [75, pp. 381–382]).
The title of Leibniz’s 1710 text is Symbolismus memorabilis calculi algebraici et infinitesimalis in comparatione potentiarum et differentiarum, et de lege homogeneorum transcendentali. The inclusion of the transcendental law of homogeneity (lex homogeneorum transcendentalis) in the title of the text attests to the importance Leibniz attached to this law.
The “equality up to an infinitesimal” implied in TLH was explicitly discussed by Leibniz in a 1695 response to Nieuwentijt, in the following terms:
Caeterum aequalia esse puto, non tantum quorum differentia est omnino nulla, sed et quorum differentia est incomparabiliter parva; et licet ea Nihil omnino dici non debeat, non tamen est quantitas comparabilis cum ipsis, quorum est differentia (Leibniz 1695 [73, p. 322]) [emphasis added–authors]
We provide a translation of Leibniz’s Latin:
Besides, I consider to be equal not only those things whose difference is entirely nothing, but also those whose difference is incomparably small: and granted that it [i.e., the difference] should not be called entirely Nothing, nevertheless it is not a quantity comparable to those whose difference it is.
4.2. Product rule
How did Leibniz use the TLH in developing the calculus? The issue can be illustrated by Leibniz’s justification of the last step in the following calculation:
(4.2) $d(uv) = (u + du)(v + dv) - uv = u\,dv + v\,du + du\,dv = u\,dv + v\,du.$
The last step in the calculation (4.2) depends on the following inference:
$$u\,dv + v\,du + du\,dv = u\,dv + v\,du.$$
Such an inference is an application of Leibniz’s TLH. In his 1701 text Cum Prodiisset [74, pp. 46–47], Leibniz presents an alternative justification of the product rule (see Bos [18, p. 58]). Here he divides by $dx$, and argues with differential quotients rather than differentials. The role played by the TLH in these calculations is similar to that played by adequality in Fermat’s work on maxima and minima. For more details on Leibniz, see Guillaume (2014 [45]); Katz & Sherry (2012 [64]), (2013 [65]); Sherry & Katz [92]; Tho (2012 [101]).
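The inference can be mimicked by a toy computation. This is a sketch under assumptions of our own, not Leibniz’s procedure: the differentials $du$ and $dv$ are modelled as multiples of a single formal infinitesimal eps, quantities are polynomials in eps (lists of coefficients), and the TLH is modelled by keeping only the lowest-order nonzero term. All function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials in the formal infinitesimal eps."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Subtract q from p, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def tlh(p):
    """Toy model of the TLH: keep only the lowest-order nonzero term."""
    for k, c in enumerate(p):
        if c != 0:
            return [Fraction(0)] * k + [c]
    return [Fraction(0)]

def d_of_product(u, v, du, dv):
    """(u + du*eps)(v + dv*eps) - u*v as a polynomial in eps."""
    u, v, du, dv = map(Fraction, (u, v, du, dv))
    return poly_sub(poly_mul([u, du], [v, dv]), [u * v])

full = d_of_product(2, 5, 3, 7)  # u = 2, v = 5, du = 3*eps, dv = 7*eps
print(full)       # coefficients of eps^0, eps^1, eps^2
print(tlh(full))  # only the first-order term u*dv + v*du survives
```

The second-order coefficient, corresponding to $du\,dv$, is discarded rather than declared equal to zero, which is the distinction the text draws.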
5. Euler’s Principle of Cancellation
Some of the Leibnizian formulas reappear, not surprisingly, in the work of Euler, his student’s student. Euler’s formulas like
(5.1) $a + dx = a$,
where “$a$ is any finite quantity” (see Euler 1755 [35, §§ 86, 87]) are consonant with a Leibnizian tradition as reported by Bos; see formula (4.1) above. To explain formulas like (5.1), Euler elaborated two distinct ways (arithmetic and geometric) of comparing quantities, in the following terms:
Since we are going to show that an infinitely small quantity is really zero, we must meet the objection of why we do not always use the same symbol 0 for infinitely small quantities, rather than some special ones…[S]ince we have two ways to compare them, either arithmetic or geometric, let us look at the quotients of quantities to be compared in order to see the difference.
If we accept the notation used in the analysis of the infinite, then $dx$ indicates a quantity that is infinitely small, so that both $dx = 0$ and $a\,dx = 0$, where $a$ is any finite quantity. Despite this, the geometric ratio $a\,dx : dx$ is finite, namely $a : 1$. For this reason, these two infinitely small quantities, $dx$ and $a\,dx$, both being equal to $0$, cannot be confused when we consider their ratio. In a similar way, we will deal with infinitely small quantities $dx$ and $dy$ (ibid., § 86, pp. 51–52) [emphasis added–the authors].
Having defined the arithmetic and geometric comparisons, Euler proceeds to clarify the difference between them as follows:
Let $a$ be a finite quantity and let $dx$ be infinitely small. The arithmetic ratio of equals is clear: Since $n\,dx = 0$, we have
$$a \pm n\,dx - a = 0.$$
On the other hand, the geometric ratio is clearly of equals, since
(5.2) $\frac{a \pm n\,dx}{a} = 1.$
From this we obtain the well-known rule that the infinitely small vanishes in comparison with the finite and hence can be neglected [with respect to it] [35, § 87] [emphasis in the original–the authors].
Like Leibniz, Euler considers more than one way of comparing quantities. Euler’s formula (5.2) indicates that his geometric comparison is procedurally identical with the Leibnizian TLH.
To summarize, Euler’s geometric comparison of a pair of quantities amounts to their ratio being infinitely close to a finite quantity, as in formula (5.2); the same is true for TLH. Note that one has $a + dx = a$ in this sense for an appreciable $a \neq 0$, but not for $a = 0$ (in which case there is equality only in the arithmetic sense). Euler’s “geometric” comparison was dubbed “the principle of cancellation” in (Ferraro [40, pp. 47, 48, 54]).
Euler proceeds to present the usual rules of infinitesimal calculus, which go back to Leibniz, L’Hôpital, and the Bernoullis, such as
(5.3) $a\,dx^m + b\,dx^n = a\,dx^m$
provided $m < n$, “since $dx^n$ vanishes compared with $dx^m$” ([35, § 89]), relying on his “geometric” comparison. Euler introduces a distinction between infinitesimals of different order, and directly computes a ratio of the form
$$\frac{dx \pm dx^2}{dx} = 1 \pm dx = 1$$
of two particular infinitesimals, assigning the value $1$ to it (ibid., § 88). [Footnote 20: Note that Euler does not “prove that the expression is equal to 1”; such indirect proofs are a trademark of the $(\epsilon, \delta)$ approach. Rather, Euler directly computes (what would today be formalized as the standard part of) the expression.] Euler concludes:
Although all of them [infinitely small quantities] are equal to 0, still they must be carefully distinguished one from the other if we are to pay attention to their mutual relationships, which has been explained through a geometric ratio (ibid., § 89).
The Eulerian hierarchy of orders of infinitesimals harks back to Leibniz’s work (see Section 4). Euler’s geometric comparison, or “principle of cancellation”, is yet another incarnation of the idea at the root of Fermat’s adequality and Leibniz’s Transcendental Law of Homogeneity. For further details on Euler see Bibiloni et al. (2006 [13]); Bair et al. (2013 [6]); Reeder (2013 [86]).
6. What did Cauchy mean by “limit”?
Laugwitz’s detailed study of Cauchy’s methodology places it squarely in the B-track (see Section 2). In conclusion, Laugwitz writes:
The influence of Euler should not be neglected, with regard both to the organization of Cauchy’s texts and, in particular, to the fundamental role of infinitesimals (Laugwitz 1987 [71, p. 273]).
Thus, in his 1844 text Exercices d’analyse et de physique mathématique, Cauchy wrote:
…si, les accroissements des variables étant supposés infiniment petits, on néglige, vis-à-vis de ces accroissements considérés comme infiniment petits du premier ordre, les infiniment petits des ordres supérieurs au premier, les nouvelles équations deviendront linéaires par rapport aux accroissements petits des variables. Leibniz et les premiers géomètres qui se sont occupés de l’analyse infinitésimale ont appelé différentielles des variables leurs accroissements infiniment petits, … (Cauchy 1844 [25, p. 5]).
Two important points emerge from this passage. First, Cauchy specifically speaks about neglecting (“on néglige”) higher order terms, rather than setting them equal to zero. This indicates a similarity of procedure with the Leibnizian TLH (see Section 4). Like Leibniz and Fermat before him, Cauchy does not set the higher order terms equal to zero, but rather “neglects” or discards them. Second, Cauchy’s comments on Leibniz deserve special attention.
6.1. Cauchy on Leibniz
By speaking matter-of-factly about the infinitesimals of Leibniz specifically, Cauchy reveals that his (Cauchy’s) infinitesimals are consonant with Leibniz’s. This is unlike the case of differentials, where Cauchy adopts a different approach.
On page 6 of the same text, Cauchy notes that the notion of derivative
représente en réalité la limite du rapport entre les accroissements infiniment petits et simultanés de la fonction et de la variable (ibid., p. 6) [emphasis added–the authors]
The same definition of the derivative is repeated on page 7, this time emphasized by means of italics. Note Cauchy’s emphasis on the point that the derivative is not a ratio of infinitesimal increments, but rather the limit of the ratio.
Cauchy’s use of the term “limit” as applied to a ratio of infinitesimals in this context may be unfamiliar to a modern reader, accustomed to taking limits of sequences of real numbers. Its meaning is clarified by Cauchy’s discussion of “neglecting” higher order infinitesimals in the previous paragraph on page 5 cited above. Cauchy’s use of “limit” is procedurally identical with the Leibnizian TLH, and therefore similarly finds its modern proxy as extracting the standard part out of the ratio of infinitesimals.
On page 11, Cauchy chooses infinitesimal increments $\Delta s$ and $\Delta t$, and writes down the equation
(6.1) $\frac{ds}{dt} = \text{lim.}\, \frac{\Delta s}{\Delta t}.$
Modulo replacing Cauchy’s symbol “lim.” by the modern one “st” or “sh”, Cauchy’s formula (6.1) is identical to the formula appearing in any textbook based on the hyperreal approach, expressing the derivative in terms of the standard part function (shadow).
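The reading of “lim.” as a standard part can be illustrated on a polynomial. The sketch below rests on assumptions of our own and is not taken from Cauchy or from any hyperreal textbook: the infinitesimal increment is a formal symbol eps, the sample function $f(x) = x^n$ is chosen by us, and the shadow of a polynomial in eps with real coefficients is taken to be its constant term.

```python
from fractions import Fraction
from math import comb

def increment_quotient(n, x):
    """Coefficients, in powers of eps, of ((x + eps)^n - x^n) / eps
    for the sample function f(x) = x^n with n >= 1."""
    x = Fraction(x)
    # Binomial expansion: the coefficient of eps^k is C(n, k) * x^(n - k).
    expansion = [comb(n, k) * x ** (n - k) for k in range(n + 1)]
    expansion[0] -= x ** n   # subtract f(x); the eps^0 term cancels
    return expansion[1:]     # divide by eps

def shadow(poly):
    """Standard part of a polynomial in eps with real coefficients."""
    return poly[0]

q = increment_quotient(3, 2)  # f(x) = x^3 at x = 2
print(q)                      # coefficient list of 12 + 6*eps + eps^2
print(shadow(q))              # 12, that is, 3 * 2^2
```

The ratio of increments itself still contains eps; only its shadow is the derivative, which matches the emphasis that the derivative is the limit of the ratio rather than the ratio.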
6.2. Cauchy on continuity
On page 17 of his 1844 text, Cauchy gives a definition of continuity in terms of infinitesimals (an infinitesimal increment necessarily produces an infinitesimal increment). His definition is nearly identical with the italicized definition that appeared on page 34 in his Cours d’Analyse (Cauchy 1821 [24]), 23 years earlier, when he first introduced the modern notion of continuity. We will use the translation by Bradley & Sandifer (2009 [20]). In his Section 2.2 entitled Continuity of functions, Cauchy writes:
If, beginning with a value of $x$ contained between these limits, we add to the variable $x$ an infinitely small increment $\alpha$, the function itself is incremented by the difference $f(x + \alpha) - f(x)$.
Cauchy goes on to state that
the function $f(x)$ is a continuous function of $x$ between the assigned limits if, for each value of $x$ between these limits, the numerical value of the difference $f(x + \alpha) - f(x)$ decreases indefinitely with the numerical value of $\alpha$.
He then proceeds to provide an italicized definition of continuity in the following terms:
the function $f(x)$ is continuous with respect to $x$ between the given limits if, between these limits, an infinitely small increment in the variable always produces an infinitely small increment in the function itself.
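In modern notation, Cauchy’s definition can be stated as follows. Denote by $\operatorname{hal}(x)$ the halo of $x$, i.e., the collection of all points infinitely close to $x$. Then $f$ is continuous at $x$ if
(6.2) $f(\operatorname{hal}(x)) \subseteq \operatorname{hal}(f(x)).$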
Most scholars hold that Cauchy never worked with a pointwise definition of continuity (as is customary today) but rather required a condition of type (6.2) to hold in a range (“between the given limits”). It is worth recalling that Cauchy never gave an $\epsilon, \delta$ definition of either limit or continuity (though $(\epsilon, \delta)$-type arguments occasionally do appear in Cauchy). It is a widespread and deeply rooted misconception among both mathematicians and those interested in the history and philosophy of mathematics that it was Cauchy who invented the modern definitions of limit and continuity; see, e.g., Colyvan & Easwaran (2008 [27, p. 88]) who err in attributing the formal definition of continuity to Cauchy. That this is not the case was argued by Błaszczyk et al. (2013 [15]); Borovik et al. (2012 [17]); Katz & Katz (2011 [58]); Nakane (2014 [80]); Tall et al. (2014 [97]).
7. Modern formalisations: a case study
To illustrate the use of the standard part in the context of the hyperreal field extension of $\mathbb{R}$, we will consider the following problem on divergent integrals. The problem was recently posed at Stack Exchange, and is reportedly due to S. Konyagin. [Footnote 21: http://math.stackexchange.com/questions/408311/improperintegraldiverges] The solution exploits the technique of a monotone rearrangement $g$ of a function $f$, shown by Ryff to admit a measure-preserving map $\phi$ such that $f = g \circ \phi$. In general there is no “inverse” $\psi$ such that $g = f \circ \psi$; however, a hyperreal enlargement enables one to construct a suitable (internal) proxy for such a $\psi$, so as to be able to write $g = \operatorname{st}(f \circ \psi)$; see formula (8.2) below.
Theorem 7.1.
Let $f$ be a real-valued function continuous on $[0, 1]$. Then there exists a number $a$ such that the integral
(7.1) $\int_0^1 \frac{dx}{|f(x) - a|}$
diverges.
A proof can be given in terms of a monotone rearrangement of the function $f$ (see Hardy et al. [46]). We take a decreasing rearrangement $g$ of the function $f$. If $f$ is continuous, then the function $g$ will also be continuous. If $f$ is not constant on any set of positive measure, one can construct $g$ by setting
(7.2) 
Ryff (1970 [90]) showed that there exists a measure-preserving transformation $\phi$ that relates $f$ and $g$ as follows: [Footnote 22: However, see Section 8 for a hyperfinite approach avoiding measure theory altogether.]
(7.3) $f = g \circ \phi.$
Finding a map $\psi$ such that $g = f \circ \psi$ is in general impossible (see Bennett & Sharpley [12, p. 85, example 7.7] for a counterexample). This difficulty can be circumvented using a hyperfinite rearrangement (see Section 8). By measure preservation, we have
$$\int_0^1 \frac{dx}{|f(x) - a|} = \int_0^1 \frac{dx}{|g(x) - a|}$$
(for every $a$). [Footnote 23: Here one needs to replace the function $\frac{1}{|f - a|}$ by the family of its truncations $\min\left(C, \frac{1}{|f - a|}\right)$, and then let $C$ increase without bound.]
To complete the proof of Theorem 7.1, apply the result that every monotone function is a.e. differentiable. [Footnote 24: In fact, one does not really need to use the result that monotone functions are a.e. differentiable. Consider the convex hull in the plane of the graph of the monotone function $g$, and take a point where the graph touches the boundary of the convex hull (other than the endpoints $(0, g(0))$ and $(1, g(1))$). Setting $p$ equal to the $x$-coordinate of the point does the job.] Take a point $p$ where the function $g$ is differentiable. Then the number $a = g(p)$ yields an infinite integral (7.1), since the difference $|g(x) - a|$ can be bounded above in terms of a linear expression. [Footnote 25: Namely, for $x$ near such a point $p$, we have $|g(x) - a| \leq K|x - p|$ for a suitable constant $K$, hence $\frac{1}{|g(x) - a|} \geq \frac{1}{K|x - p|}$, yielding a lower bound in terms of a divergent integral.]
8. A combinatorial approach to decreasing rearrangements
The existence of a decreasing rearrangement of a function continuous on admits an elegant proof in the context of its hyperreal extension , which we will continue to denote by .
We present a combinatorial argument showing that the decreasing rearrangement obeys the same modulus of uniformity as the original function. [Footnote 26: A function on is said to satisfy a modulus of uniformity , if .] The argument actually yields an independent construction of the decreasing rearrangement (see Proposition 8.1) that avoids recourse to measure theory. It also yields an “inverse up to an infinitesimal,” (see formula (8.2)), to the function such that . For a recent application of combinatorial arguments in a hyperreal framework, see Benci et al. (2013 [11]).
In passing from the finite to the continuous case of rearrangements, Bennett and Sharpley [12] note that
nonnegative sequences and are equimeasurable if and only if there is a permutation of such that for . … The notion of permutation is no longer available in this context [of continuous measure spaces] and is replaced by that of a “measurepreserving transformation” (Bennett and Sharpley 1988 [12, p. 79]).
We show that the hyperreal framework allows one to continue working with combinatorial ideas, such as the “inverse” function , in the continuous case as well.
Let , let for . By the Transfer Principle (see e.g., Davis [28]; Herzberg [49]; Kanovei & Reeken [56]), the nonstandard domain of internal sets satisfies the same basic laws as the usual, “standard” domain of real numbers and related objects. Thus, as for finite sets, there exists a permutation of the hyperfinite grid
(8.1) 
by decreasing value of (here is the maximal value). We assume that equal values are ordered lexicographically so that . Hence we obtain an internal function
(8.2) 
Here is (perhaps nonstrictly) decreasing on the grid of (8.1). The internal sequences and , where , are equinumerable in the sense above.
Proposition 8.1.
Let be an arbitrary continuous function. Then there is a standard continuous real function such that for all , where denotes the standard part of a hyperreal .
Proof.
Let . We claim that is S-continuous (microcontinuous), i.e., for each pair , if is infinitesimal then so is . To prove the claim, we will prove the following stronger fact:
for every there are such that and .
The sets and are nonempty and there are at most points which are not in . Let and be such that is minimal. All integers between and are not in . Hence there are at most such integers, and therefore . By definition of and , we obtain , which proves the claim. Thus is indeed S-continuous.
This allows us to define, for any standard , the value to be the standard part of the hyperreal for any hyperinteger such that is infinitely close to , and then is a continuous and (nonstrictly) monotone real function equal to the decreasing rearrangement of (7.2). [Footnote 27: The argument shows in fact that the modulus of uniformity of is bounded by that of ; see footnote 26.] ∎
The hyperreal approach makes it possible to solve Konyagin’s problem without resorting to standard treatments of decreasing rearrangements which use measure theory. Note that the rearrangement defined by the internal permutation preserves the integral of (as well as the integrals of the truncations of ), in the following sense. The right-hand Riemann sums satisfy
(8.3) 
where . Thus transforms a hyperfinite Riemann sum of into a hyperfinite Riemann sum of . Since and , we conclude that and have the same integrals, and similarly for the integrals of ; see footnote 23.
The first equality in (8.3) holds automatically by the transfer principle even though is an infinite permutation. (Compare with the standard situation where changing the order of summation in an infinite sum generally requires further justification.) This illustrates one of the advantages of the hyperreal approach.
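A finite analogue of the construction can be sketched as follows. The code is an illustration under assumptions of our own: a finite grid size H stands in for the infinite hyperinteger, the sample function f is chosen by us, and psi, g and max_step are our labels for the sorting permutation, the rearranged values, and the largest jump between neighbouring grid values.

```python
import math

def decreasing_rearrangement(f, H):
    """Sort the grid values f(i/H), i = 0..H, in decreasing order.
    Returns (psi, g, values) with g[i] == values[psi[i]]."""
    values = [f(i / H) for i in range(H + 1)]
    # sorted() is stable, so equal values keep their original index order.
    psi = sorted(range(H + 1), key=lambda i: -values[i])
    g = [values[j] for j in psi]
    return psi, g, values

def max_step(seq):
    """Largest jump between neighbouring grid values."""
    return max(abs(a - b) for a, b in zip(seq, seq[1:]))

f = lambda x: math.sin(3 * math.pi * x)  # hypothetical sample function
psi, g, values = decreasing_rearrangement(f, 1000)

print(all(a >= b for a, b in zip(g, g[1:])))            # g is nonincreasing
print(math.isclose(sum(g), sum(values), abs_tol=1e-9))  # same Riemann sum, up to rounding
print(max_step(g) <= max_step(values))                  # neighbouring jumps do not grow
```

The last check is the one-step instance of the modulus-of-uniformity claim: a gap between consecutive sorted values must be crossed by some pair of neighbouring grid points of the original function. In the finite setting the reordering of the sum is trivially legitimate; the text’s point is that transfer extends this to a hyperfinite sum.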
9. Conclusion
We have critically reviewed several common misrepresentations of hyperreal number systems, not least in relation to their alleged nonconstructiveness, from a historical, philosophical, and set-theoretic perspective. In particular we have countered some of Easwaran’s recent arguments against the use of hyperreals in formal epistemology. A hyperreal framework enables a richer syntax better suited for expressing proxies for procedural moves found in the work of Fermat, Leibniz, Euler, and Cauchy. Such a framework sheds light on the internal coherence of their procedures, which have often been misunderstood from a whiggish post-Weierstrassian perspective.
Acknowledgments
The work of Vladimir Kanovei was partially supported by RFBR grant 13-01-00006. M. Katz was partially funded by the Israel Science Foundation grant no. 1517/12. We are grateful to Thomas Mormann and to the anonymous referee for helpful suggestions, and to Ivor Grattan-Guinness for contributing parts of the introduction.
References
 [1] Albeverio, S.; Høegh-Krohn, R.; Fenstad, J.; Lindstrøm, T. Nonstandard Methods in Stochastic Analysis and Mathematical Physics. Pure and Applied Mathematics, 122. Academic Press, Inc., Orlando, FL, 1986.
 [2] Alexander, A. Infinitesimal: How a Dangerous Mathematical Theory Shaped the Modern World. Farrar, Straus and Giroux, 2014.
 [3] Alling, N. Foundations of Analysis over Surreal Number Fields. North-Holland Mathematical Library, 1987.
 [4] Anderson, R. A nonstandard representation for Brownian motion and Itô integration. Israel Journal of Mathematics 25 (1976), no. 1–2, 15–46.
 [5] Arthur, R. Leibniz’s syncategorematic infinitesimals. Arch. Hist. Exact Sci. 67 (2013), no. 5, 553–593.
 [6] Bair, J.; Błaszczyk, P.; Ely, R.; Henry, V.; Kanovei, V.; Katz, K.; Katz, M.; Kutateladze, S.; McGaffey, T.; Schaps, D.; Sherry, D.; Shnider, S. Is mathematical history written by the victors? Notices of the American Mathematical Society 60 (2013), no. 7, 886–904. See http://www.ams.org/notices/201307/rnotip886.pdf and http://arxiv.org/abs/1306.5973
 [7] Bascelli, T. Galileo’s quanti: understanding infinitesimal magnitudes. Arch. Hist. Exact Sci. 68 (2014), no. 2, 121–136.
 [8] Bascelli, T. Infinitesimal issues in Galileo’s theory of motion. Revue Roumaine de Philosophie 58 (2014), no. 1, 23–41.
 [9] Bell, J. The Continuous and the Infinitesimal in Mathematics and Philosophy. Polimetrica, 2006.
 [10] Benacerraf, P. What numbers could not be. Philos. Rev. 74 (1965), 47–73.
 [11] Benci, V.; Bottazzi E.; Di Nasso, M. Elementary numerosity and measures, preprint (2013).
 [12] Bennett, C.; Sharpley, R. Interpolation of operators. Pure and Applied Mathematics 129. Academic Press, Boston, MA, 1988.
 [13] Bibiloni, L.; Viader, P.; Paradís, J. On a series of Goldbach and Euler. Amer. Math. Monthly 113 (2006), no. 3, 206–220.
 [14] Bishop, E. Mathematics as a numerical language. 1970 Intuitionism and Proof Theory (Proc. Conf., Buffalo, N.Y., 1968) pp. 53–71. North-Holland, Amsterdam.
 [15] Błaszczyk, P.; Katz, M.; Sherry, D. Ten misconceptions from the history of analysis and their debunking. Foundations of Science 18 (2013), no. 1, 43–74. See http://dx.doi.org/10.1007/s1069901292858 and http://arxiv.org/abs/1202.4153
 [16] Borovik, A.; Jin, R.; Katz, M. An integer construction of infinitesimals: Toward a theory of Eudoxus hyperreals. Notre Dame Journal of Formal Logic 53 (2012), no. 4, 557–570. See http://dx.doi.org/10.1215/002945271722755 and http://arxiv.org/abs/1210.7475
 [17] Borovik, A.; Katz, M. Who gave you the Cauchy–Weierstrass tale? The dual history of rigorous calculus. Foundations of Science 17 (2012), no. 3, 245–276. See http://dx.doi.org/10.1007/s106990119235x and http://arxiv.org/abs/1108.2885
 [18] Bos, H. J. M. Differentials, higher-order differentials and the derivative in the Leibnizian calculus. Arch. History Exact Sci. 14 (1974), 1–90.
 [19] Bos, H. J. M.; Bunn, R.; Dauben, J.; Grattan-Guinness, I.; Hawkins, T.; Pedersen, K. M. From the calculus to set theory, 1630–1910. An introductory history. Edited by I. Grattan-Guinness. Gerald Duckworth & Co. Ltd., London, 1980.
 [20] Bradley, R.; Sandifer, C. Cauchy’s Cours d’analyse. An annotated translation. Sources and Studies in the History of Mathematics and Physical Sciences. Springer, New York, 2009.
 [21] Breger, H. The mysteries of adaequare: a vindication of Fermat. Arch. Hist. Exact Sci. 46 (1994), no. 3, 193–219.
 [22] Brukner, Č.; Zeilinger, A. Quantum physics as a science of information, in Quo vadis quantum mechanics?, 47–61, Frontiers Collection, Springer, Berlin, 2005.
 [23] Carroll, M.; Dougherty, S.; Perkins, D. Indivisibles, Infinitesimals and a Tale of Seventeenth-Century Mathematics. Mathematics Magazine 86 (2013), no. 4, 239–254.
 [24] Cauchy, A. L. Cours d’Analyse de L’Ecole Royale Polytechnique. Première Partie. Analyse algébrique. Paris: Imprimérie Royale, 1821. Online at http://books.google.com/books?id=_mYVAAAAQAAJ&dq=cauchy&lr=&source=gbs_navlinks_s
 [25] Cauchy, A. L. Exercices d’analyse et de physique mathématique (vol. 3). Paris, Bachelier, 1844.
 [26] Cohen, P. Set theory and the continuum hypothesis. W. A. Benjamin, New York-Amsterdam, 1966.
 [27] Colyvan, M.; Easwaran, K. Mathematical and physical continuity. Australas. J. Log. 6 (2008), 87–93.
 [28] Davis, M. Applied nonstandard analysis. Pure and Applied Mathematics. Wiley-Interscience [John Wiley & Sons], New York-London-Sydney, 1977. Reprinted: Dover, NY, 2005, see http://store.doverpublications.com/0486442292.html
 [29] Dowek, G. Real numbers, chaos, and the principle of a bounded density of information. Invited paper at International Computer Science Symposium in Russia, 2013. See https://who.rocq.inria.fr/Gilles.Dowek/Publi/csr.pdf
 [30] Earman, J. Infinities, infinitesimals, and indivisibles: the Leibnizian labyrinth. Studia Leibnitiana 7 (1975), no. 2, 236–251.
 [31] Easwaran, K. Regularity and hyperreal credences. Philosophical Review 123 (2014), No. 1, 1–41.
 [32] Ehrlich, P. The rise of non-Archimedean mathematics and the roots of a misconception. I. The emergence of non-Archimedean systems of magnitudes. Arch. Hist. Exact Sci. 60 (2006), no. 1, 1–121.
 [33] Elga, A. Infinitesimal chances and the laws of nature. Australasian Journal of Philosophy 82 (2004), no. 1, 67–76.
 [34] Euler, L. Institutiones Calculi Differentialis. SPb, 1755.
 [35] Euler, L. Foundations of Differential Calculus. English translation of Chapters 1–9 of ([34]) by D. Blanton, Springer, N.Y., 2000.
 [36] Feferman, S.; Levy, A. Independence results in set theory by Cohen’s method II. Notices Amer. Math. Soc. 10 (1963), 593.
 [37] Felgner, U. Der Begriff der “Angleichung” (παρισότης, adaequatio) bei Diophant und Fermat (2014), preprint.
 [38] Fermat, P. Méthode pour la recherche du maximum et du minimum. pp. 121–156 in Tannery’s edition [99].
 [39] Fermat, P. Letter to Brûlart. Oeuvres, Vol. 5, pp. 120–125.
 [40] Ferraro, G. Differentials and differential coefficients in the Eulerian foundations of the calculus. Historia Mathematica 31 (2004), no. 1, 34–61. See http://dx.doi.org/10.1016/S03150860(03)000302.
 [41] Gerhardt, C. I. (ed.) Historia et Origo calculi differentialis a G. G. Leibnitio conscripta, ed. C. I. Gerhardt, Hannover, 1846.
 [42] Gerhardt, C. I. (ed.) Leibnizens mathematische Schriften (Berlin and Halle: Eidmann, 1850–1863).
 [43] Giusti, E. Les méthodes des maxima et minima de Fermat. Ann. Fac. Sci. Toulouse Math. (6) 18 (2009), Fascicule Special, 59–85.
 [44] Goldenbaum, U.; Jesseph, D. (Eds.) Infinitesimal Differences: Controversies between Leibniz and his Contemporaries. Berlin-New York: Walter de Gruyter, 2008.
 [45] Guillaume, M. “Review of Katz, M.; Sherry, D. Leibniz’s infinitesimals: their fictionality, their modern implementations, and their foes from Berkeley to Russell and beyond. Erkenntnis 78 (2013), no. 3, 571–625.” Mathematical Reviews (2014). See http://www.ams.org/mathscinetgetitem?mr=3053644
 [46] Hardy, G.; Littlewood, J.; Pólya, G. Inequalities. Second edition. Cambridge University Press, 1952.
 [47] Heaton, H. Infinity, the Infinitesimal, and Zero. American Mathematical Monthly 5 (1898), no. 10, 224–226.
 [48] Herzberg, F. Internal laws of probability, generalized likelihoods and Lewis’ infinitesimal chances—a response to Adam Elga. British Journal for the Philosophy of Science 58 (2007), no. 1, 25–43.
 [49] Herzberg, F. Stochastic calculus with infinitesimals. Lecture Notes in Mathematics, 2067. Springer, Heidelberg, 2013.
 [50] Hewitt, E. Rings of real-valued continuous functions. I. Trans. Amer. Math. Soc. 64 (1948), 45–99.
 [51] Heijting, A. Address to Professor A. Robinson. At the occasion of the Brouwer memorial lecture given by Prof. A. Robinson on the 26th April 1973. Nieuw Archief voor Wiskunde (3) 21 (1973), 134–137.
 [52] Ishiguro, H. Leibniz’s philosophy of logic and language. Second edition. Cambridge University Press, Cambridge, 1990.
 [53] Jaroszkiewicz, G. Principles of Discrete Time Mechanics. Cambridge Monographs on Mathematical Physics. Cambridge University Press, 2014.
 [54] Jech, T. The axiom of choice. Studies in Logic and the Foundations of Mathematics, Vol. 75. North-Holland Publishing Co., Amsterdam-London; American Elsevier Publishing Co., Inc., New York, 1973.
 [55] Kanovei, V.; Katz, M.; Mormann, T. Tools, Objects, and Chimeras: Connes on the Role of Hyperreals in Mathematics. Foundations of Science 18 (2013), no. 2, 259–296. See http://dx.doi.org/10.1007/s1069901293165 and http://arxiv.org/abs/1211.0244
 [56] Kanovei, V.; Reeken, M. Nonstandard analysis, axiomatically. Springer Monographs in Mathematics, Berlin: Springer, 2004.
 [57] Kanovei, V.; Shelah, S. A definable nonstandard model of the reals. Journal of Symbolic Logic 69 (2004), no. 1, 159–164.
 [58] Katz, K.; Katz, M. Cauchy’s continuum. Perspectives on Science 19 (2011), no. 4, 426–452. See http://arxiv.org/abs/1108.4201 and http://www.mitpressjournals.org/doi/abs/10.1162/POSC_a_00047
 [59] Katz, K.; Katz, M. Meaning in classical mathematics: is it at odds with Intuitionism? Intellectica 56 (2011), no. 2, 223–302. See http://arxiv.org/abs/1110.5456
 [60] Katz, K.; Katz, M. A Burgessian critique of nominalistic tendencies in contemporary mathematics and its historiography. Foundations of Science 17 (2012), no. 1, 51–89. See http://dx.doi.org/10.1007/s1069901192231 and http://arxiv.org/abs/1104.0375
 [61] Katz, K.; Katz, M.; Kudryk, T. Toward a clarity of the extreme value theorem. Logica Universalis 8 (2014), no. 2, 193–214. See http://dx.doi.org/10.1007/s1178701401028 and http://arxiv.org/abs/1404.5658
 [62] Katz, M.; Leichtnam, E. Commuting and noncommuting infinitesimals. American Mathematical Monthly 120 (2013), no. 7, 631–641. See http://dx.doi.org/10.4169/amer.math.monthly.120.07.631 and http://arxiv.org/abs/1304.0583
 [63] Katz, M.; Schaps, D.; Shnider, S. Almost Equal: The Method of Adequality from Diophantus to Fermat and Beyond. Perspectives on Science 21 (2013), no. 3, 283–324. See http://www.mitpressjournals.org/doi/abs/10.1162/POSC_a_00101 and http://arxiv.org/abs/1210.7750
 [64] Katz, M.; Sherry, D. Leibniz’s laws of continuity and homogeneity. Notices of the American Mathematical Society 59 (2012), no. 11, 1550–1558. See http://www.ams.org/notices/201211/ and http://arxiv.org/abs/1211.7188
 [65] Katz, M.; Sherry, D. Leibniz’s infinitesimals: Their fictionality, their modern implementations, and their foes from Berkeley to Russell and beyond. Erkenntnis 78 (2013), no. 3, 571–625. See http://dx.doi.org/10.1007/s106700129370y and http://arxiv.org/abs/1205.0174
 [66] Katz, V. “Review of Bair et al., Is mathematical history written by the victors? Notices Amer. Math. Soc. 60 (2013), no. 7, 886–904.” Mathematical Reviews (2014). See http://www.ams.org/mathscinetgetitem?mr=3086638
 [67] Keisler, H. J. Elementary Calculus: An Infinitesimal Approach. Second Edition. Prindle, Weber & Schmidt, Boston, 1986. Revision from February 2012 online at http://www.math.wisc.edu/keisler/calc.html
 [68] Klein, F. Elementary Mathematics from an Advanced Standpoint. Vol. I. Arithmetic, Algebra, Analysis. Translation by E. R. Hedrick and C. A. Noble [Macmillan, New York, 1932] from the third German edition [Springer, Berlin, 1924]. Originally published as Elementarmathematik vom höheren Standpunkte aus (Leipzig, 1908).
 [69] Knobloch, E. Leibniz’s rigorous foundation of infinitesimal geometry by means of Riemannian sums. Foundations of the formal sciences, 1 (Berlin, 1999). Synthese 133 (2002), no. 1–2, 59–73.
 [70] Knobloch, E. “Review of: Katz, M.; Schaps, D.; Shnider, S. Almost equal: the method of adequality from Diophantus to Fermat and beyond. Perspectives on Science 21 (2013), no. 3, 283–324.” Mathematical Reviews (2014). See http://www.ams.org/mathscinetgetitem?mr=3114421
 [71] Laugwitz, D. Infinitely small quantities in Cauchy’s textbooks. Historia Mathematica 14 (1987), no. 3, 258–274.
 [72] Leibniz, G. Nova methodus pro maximis et minimis …, in Acta Erud., Oct. 1684. See Gerhardt [42], V, pp. 220–226.
 [73] Leibniz, G. (1695) Responsio ad nonnullas difficultates a Dn. Bernardo Niewentiit circa methodum differentialem seu infinitesimalem motas. In Gerhardt [42], V, pp. 320–328.
 [74] Leibniz, G. (1701) Cum Prodiisset…mss “Cum prodiisset atque increbuisset Analysis mea infinitesimalis …” in Gerhardt [41], pp. 39–50.
 [75] Leibniz, G. (1710) Symbolismus memorabilis calculi algebraici et infinitesimalis in comparatione potentiarum et differentiarum, et de lege homogeneorum transcendentali. In Gerhardt [42, vol. V, pp. 377–382].
 [76] Lewis, A. Bringing modern mathematics to bear on historical research. Historia Mathematica 2 (1975), 84–85.
 [77] Lewis, D. A subjectivist’s guide to objective chance. In Studies in Inductive Logic and Probability (ed. Richard C. Jeffrey), pp. 263–293. University of California Press, 1980.
 [78] Łoś, J. Quelques remarques, théorèmes et problèmes sur les classes définissables d’algèbres. In Mathematical interpretation of formal systems, 98–113, North-Holland Publishing Co., Amsterdam, 1955.
 [79] Mormann, T.; Katz, M. Infinitesimals as an issue of neo-Kantian philosophy of science. HOPOS: The Journal of the International Society for the History of Philosophy of Science 3 (2013), no. 2, 236–280. See http://www.jstor.org/stable/10.1086/671348 and http://arxiv.org/abs/1304.1027
 [80] Nakane, M. Did Weierstrass’s differential calculus have a limit-avoiding character? His definition of a limit in ε-δ style. BSHM Bulletin: Journal of the British Society for the History of Mathematics, 29 (2014), no. 1, 51–59. See http://dx.doi.org/10.1080/17498430.2013.831241
 [81] Nelson, E. Internal set theory: a new approach to nonstandard analysis. Bulletin of the American Mathematical Society 83 (1977), no. 6, 1165–1198.
 [82] Nelson, E. Radically elementary probability theory. Annals of Mathematics Studies, vol. 117. Princeton University Press, 1987.
 [83] Nelson, E. Warning signs of a possible collapse of contemporary mathematics. In Infinity (eds. Michael Heller, W. Hugh Woodin), pp. 76–85. Cambridge University Press, 2011.
 [84] Pourciau, B. Newton and the notion of limit. Historia Mathematica 28 (2001), no. 1, 18–30.
 [85] Reder, C. Transient behaviour of a Galton-Watson process with a large number of types. Journal of Applied Probability 40 (2003), no. 4, 1007–1030.
 [86] Reeder, P. Internal Set Theory and Euler’s Introductio in Analysin Infinitorum. MSc Thesis, Ohio State University, 2013.
 [87] Robinson, A. Nonstandard analysis. Nederl. Akad. Wetensch. Proc. Ser. A 64 = Indag. Math. 23 (1961), 432–440 [reprinted in Selected Works, see item [88], pp. 3–11]
 [88] Robinson, A. Selected papers of Abraham Robinson. Vol. II. Nonstandard analysis and philosophy. Edited and with introductions by W. A. J. Luxemburg and S. Körner. Yale University Press, New Haven, Conn., 1979.
 [89] Russell, B. The Principles of Mathematics. Routledge. London 1903.
 [90] Ryff, J. Measure preserving transformations and rearrangements. J. Math. Anal. Appl. 31 (1970), 449–458.
 [91] Settle, T. Galilean science: essays in the mechanics and dynamics of the Discorsi. Ph.D. dissertation, Cornell University, 1966, 288 pages.
 [92] Sherry, D.; Katz, M. Infinitesimals, imaginaries, ideals, and fictions. Studia Leibnitiana 44 (2012), no. 2, 166–192. See http://arxiv.org/abs/1304.2137
 [93] Skolem, T. Peano’s axioms and models of arithmetic. In Mathematical interpretation of formal systems, pp. 1–14. North-Holland Publishing Co., Amsterdam, 1955.
 [94] Skyrms, B. Causal Necessity: A Pragmatic Investigation of the Necessity of Laws. Yale University Press, 1980.
 [95] Strømholm, P. Fermat’s methods of maxima and minima and of tangents. A reconstruction. Arch. History Exact Sci. 5 (1968), no. 1, 47–69.
 [96] Sullivan, K. Mathematical Education: The Teaching of Elementary Calculus Using the Nonstandard Analysis Approach. Amer. Math. Monthly 83 (1976), no. 5, 370–375.
 [97] Tall, D.; Katz, M. A cognitive analysis of Cauchy’s conceptions of function, continuity, limit, and infinitesimal, with implications for teaching the calculus. Educational Studies in Mathematics, to appear. See http://dx.doi.org/10.1007/s1064901495319 and http://arxiv.org/abs/1401.1468
 [98] Tannery, P., Henry, C. Oeuvres de Fermat, Vol. 1, Gauthier-Villars, 1891.
 [99] Tannery, P., Henry, C. Oeuvres de Fermat, Vol. 3, Gauthier-Villars, 1896.
 [100] Tao, T. See the post http://terrytao.wordpress.com/2013/11/16/qualitativeprobabilitytheorytypesandthegroupchunkandgroupconfigurationtheorems/
 [101] Tho, T. Equivocation in the foundations of Leibniz’s infinitesimal fictions. Society and Politics 6 (2012), no. 2, 70–98.
 [102] Weil, A. Number theory. An approach through history. From Hammurapi to Legendre. Birkhäuser Boston, Inc., Boston, MA, 1984.
 [103] Wheeler, J. At home in the universe. Masters of Modern Physics. American Institute of Physics, Woodbury, NY, 1994.
 [104] Wiener, N. Differential space. Journal of Mathematical Physics 2 (1923), no. 1, 131–174.
 [105] Williamson, T. How probable is an infinite sequence of heads? Analysis 67 (2007), no. 3, 173–180.
 [106] Wisan, W. The new science of motion: a study of Galileo’s De motu locali. Arch. History Exact Sci. 13 (1974), 103–306.