Fermat, Leibniz, Euler, and the gang: The true history of the concepts of limit and shadow

Tiziana Bascelli, Via S. Caterina, 16, 36030 Montecchio P.no (VI), Italy
Emanuele Bottazzi, Dipartimento di Matematica, Università di Trento, Italy
Frederik Herzberg, Center for Mathematical Economics, Bielefeld University, D-33615 Bielefeld, Germany
Vladimir Kanovei, IPPI, Moscow, and MIIT, Moscow, Russia
Karin U. Katz, Department of Mathematics, Bar Ilan University, Ramat Gan 52900 Israel
Mikhail G. Katz, Department of Mathematics, Bar Ilan University, Ramat Gan 52900 Israel
Tahl Nowik, Department of Mathematics, Bar Ilan University, Ramat Gan 52900 Israel
David Sherry, Department of Philosophy, Northern Arizona University, Flagstaff, AZ 86011, US
Steven Shnider, Department of Mathematics, Bar Ilan University, Ramat Gan 52900 Israel
Abstract.

Fermat, Leibniz, Euler, and Cauchy all used one or another form of approximate equality, or the idea of discarding “negligible” terms, so as to obtain a correct analytic answer. Their inferential moves find suitable proxies in the context of modern theories of infinitesimals, and specifically the concept of shadow. We give an application to decreasing rearrangements of real functions.

1. Introduction

The theories developed by European mathematicians prior to 1870 differed from the modern ones in that none of them used the modern theory of limits. Fermat developed what is sometimes called a “precalculus” theory, in which the optimal value is determined by some special condition such as equality of roots of some equation. The same can be said of his contemporaries such as Descartes, Huygens, and Roberval.

Leibniz’s calculus advanced beyond them in working on the derivative function of the variable $x$. He had the indefinite integral whereas his predecessors only had concepts more or less equivalent to it. Euler, following Leibniz, also worked with such functions, but distinguished the variable (or variables) with constant differentials $dx$, a status that corresponds to the modern assignment that $x$ is the independent variable, the other variables of the problem being dependent upon it (or them) functionally.

Fermat determined the optimal value by imposing a condition using his adequality of quantities. But he did not really think of quantities as functions, nor did he realize that his method produced only a necessary condition for an optimum. For a more detailed general introduction, see chapters 1 and 2 of the volume edited by Grattan-Guinness (Bos et al. 1980 [19]).

The doctrine of limits is sometimes claimed to have replaced that of infinitesimals when analysis was rigorized in the 19th century. While it is true that Cantor, Dedekind and Weierstrass attempted (not altogether successfully; see Ehrlich 2006 [32]; Mormann & Katz 2013 [79]) to eliminate infinitesimals from analysis, the history of the limit concept is more complex. Newton had explicitly written that his ultimate ratios were not actually ratios but, rather, limits of prime ratios (see Russell 1903 [89, item 316, p. 338-339]; Pourciau 2001 [84]). In fact, the sources of a rigorous notion of limit are considerably older than the 19th century.

In the context of Leibnizian mathematics, the limit of $f(x)$ as $x$ tends to $x_0$ can be viewed as the “assignable part” (as Leibniz may have put it) of $f(x_0 + dx)$ where $dx$ is an “inassignable” infinitesimal increment (whenever the answer is independent of the infinitesimal chosen). A modern formalisation of this idea exploits the standard part principle (see Keisler 2012 [67, p. 36]).

In the context of ordered fields $E$, the standard part principle is the idea that if $E$ is a proper extension of the real numbers $\mathbb{R}$, then every finite (or limited) element $x \in E$ is infinitely close to a suitable $x_0 \in \mathbb{R}$. Such a real number is called the standard part (sometimes called the shadow) of $x$, or in formulas, $\operatorname{st}(x) = x_0$. Denoting by $E_f$ the collection of finite elements of $E$, we obtain a map

\[ \operatorname{st} \colon E_f \to \mathbb{R}. \]

Here $x$ is called finite if it is smaller (in absolute value) than some real number (the term finite is immediately comprehensible to a wide mathematical public, whereas limited corresponds to correct technical usage); an infinitesimal is smaller (in absolute value) than every positive real; and $x$ is infinitely close to $x_0$ in the sense that $x - x_0$ is infinitesimal.

Briefly, the standard part function “rounds off” a finite element of $E$ to the nearest real number (see Figure 1).

The proof of the principle is easy. A finite element $x \in E$ defines a Dedekind cut on the subfield $\mathbb{R} \subset E$ (alternatively, on $\mathbb{Q} \subset \mathbb{R}$), and the cut in turn defines the real $x_0$ via the usual correspondence between cuts and real numbers. One sometimes writes down the relation

\[ x \approx x_0 \]

to express infinite closeness.
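The proof sketched above can be written out as a short derivation. The following is a minimal sketch only, for a finite element $x$ of an ordered field $E$ properly extending $\mathbb{R}$; the auxiliary set $L_x$ and the symbol $\varepsilon$ are names introduced here for illustration and do not come from the text.

```latex
% Sketch of the standard part principle (requires amsmath, amssymb).
% L_x and \varepsilon are illustrative names, not the authors' notation.
\begin{align*}
  L_x &= \{\, r \in \mathbb{R} : r < x \,\}
      && \text{(nonempty and bounded above, since $x$ is finite)} \\
  x_0 &= \sup L_x \in \mathbb{R}
      && \text{(completeness of $\mathbb{R}$)} \\
  x_0 - \varepsilon &< x < x_0 + \varepsilon
      && \text{for every real } \varepsilon > 0 \\
  \Longrightarrow \quad x &\approx x_0,
      \qquad \operatorname{st}(x) = x_0 .
\end{align*}
```

The real $x_0$ is unique, because two distinct real numbers differ by a non-infinitesimal amount.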

We argue that the sources of such a relation, and of the standard part principle, go back to Fermat, Leibniz, Euler, and Cauchy. Leibniz would discard the inassignable part of $2x + dx$ to arrive at the expected answer, $2x$, relying on his law of homogeneity (see Section 4). Such an inferential move is mirrored by a suitable proxy in the hyperreal approach, namely the standard part function.
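As an illustration of this proxy, consider the curve $y = x^2$ (an example chosen here for concreteness) with a nonzero infinitesimal increment $dx$. The sketch below shows the discarding of the inassignable part carried out with the standard part function.

```latex
% Worked example, assuming y = x^2, real x, and infinitesimal dx \neq 0
% (requires amsmath).
\begin{align*}
  \frac{(x + dx)^2 - x^2}{dx}
    &= \frac{2x\,dx + dx^2}{dx} = 2x + dx, \\
  \operatorname{st}(2x + dx) &= 2x .
\end{align*}
```

The result does not depend on the choice of the infinitesimal $dx$, which is the condition under which the “assignable part” plays the role of a limit.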

Fermat, Leibniz, Euler, and Cauchy all used one or another form of approximate equality, or the idea of discarding “negligible” terms. Their inferential moves find suitable proxies in the context of modern theories of infinitesimals, and specifically the concept of shadow.

The last two sections present an application of the standard part to decreasing rearrangements of real functions and to a problem on divergent integrals due to S. Konyagin.

This article continues efforts in revisiting the history and foundations of infinitesimal calculus and modern nonstandard analysis. Previous efforts in this direction include Bair et al. (2013 [6]); Bascelli (2014 [7]); Błaszczyk et al. (2013 [15]); Borovik et al. (2012 [16], [17]); Kanovei et al. (2013 [55]); Katz, Katz & Kudryk (2014 [61]); Mormann et al. (2013 [79]); Sherry et al. (2014 [92]); Tall et al. (2014 [97]).

2. Methodological remarks

To comment on the historical subtleties of judging or interpreting past mathematics by present-day standards [footnote 1: Some reflections on this can be found in (Lewis 1975 [76]).], note that neither Fermat, Leibniz, Euler, nor Cauchy had access to the semantic foundational frameworks as developed in mathematics at the end of the 19th and first half of the 20th centuries. What we argue is that their syntactic inferential moves ultimately found modern proxies in Robinson’s framework, thus placing a firm (relative to ZFC [footnote 2: The Zermelo–Fraenkel Set Theory with the Axiom of Choice.]) semantic foundation underneath the classical procedures of these masters. Benacerraf (1965 [10]) formulated a related dichotomy in terms of mathematical practice vs mathematical ontology.

For example, the Leibnizian laws of continuity (see Knobloch 2002 [69, p. 67]) and homogeneity can be recast in terms of modern concepts such as the transfer principle and the standard part principle over the hyperreals, without ever appealing to the semantic content of the technical development of the hyperreals as a punctiform continuum; similarly, Leibniz’s proof of the product rule for differentiation is essentially identical, at the syntactic level, to a modern infinitesimal proof (see Section 4).
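To make the syntactic parallel concrete, here is a sketch of the product rule in modern infinitesimal notation. It is an illustration under our own choice of symbols ($u$, $v$, $\Delta u$, $\Delta v$ for infinitesimal increments), not a transcription of Leibniz’s text.

```latex
% Product rule via infinitesimals (requires amsmath).
% u, v are functions of x; \Delta x \neq 0 is infinitesimal.
\begin{align*}
  \Delta(uv) &= (u + \Delta u)(v + \Delta v) - uv
              = u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v, \\
  \frac{\Delta(uv)}{\Delta x}
             &= u\,\frac{\Delta v}{\Delta x} + v\,\frac{\Delta u}{\Delta x}
                + \frac{\Delta u}{\Delta x}\,\Delta v, \\
  \operatorname{st}\!\left(\frac{\Delta(uv)}{\Delta x}\right)
             &= u\,v' + v\,u'
             && \text{(the last term is infinitesimal).}
\end{align*}
```

Discarding the term $\Delta u\,\Delta v$ at the level of differentials corresponds to applying the standard part at the level of the quotient.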

2.1. A-track and B-track

The crucial distinction between syntactic and semantic aspects of the work involving mathematical continua appears to have been overlooked by R. Arthur who finds fault with the hyperreal proxy of the Leibnizian continuum, by arguing that the latter was non-punctiform (see Arthur 2013 [5]). Yet this makes little difference at the syntactic level, as explained above. Arthur’s brand of the syncategorematic approach following Ishiguro (1990 [52]) involves a reductive reading of Leibnizian infinitesimals as logical (as opposed to pure) fictions involving a hidden quantifier à la Weierstrass, ranging over “ordinary” values. This approach was critically analyzed in (Katz & Sherry 2013 [65]); (Sherry & Katz 2013 [92]); (Tho 2012 [101]).

Robinson’s framework poses a challenge to traditional historiography of mathematical analysis. The traditional thinking is often dominated by a kind of Weierstrassian teleology. This is a view of the history of analysis as univocal evolution toward the radiant Archimedean framework as developed by Cantor, Dedekind, Weierstrass, and others starting around 1870, described as the A-track in a recent piece in these Notices (see Bair et al. 2013 [6]).

Robinson’s challenge is to point out not only the possibility, but also the existence of a parallel Bernoullian [footnote 3: Historians often name Johann Bernoulli as the first mathematician to have adhered systematically and exclusively to the infinitesimal approach as the basis for the calculus.] track for the development of analysis, or B-track for short. The B-track assigns an irreducible and central role to the concept of infinitesimal, a role it played in the work of Leibniz, Euler, mature Lagrange [footnote 4: In the second edition of his Mécanique Analytique dating from 1811, Lagrange fully embraced the infinitesimal in the following terms: “Once one has duly captured the spirit of this system [i.e., infinitesimal calculus], and has convinced oneself of the correctness of its results by means of the geometric method of the prime and ultimate ratios, or by means of the analytic method of derivatives, one can then exploit the infinitely small as a reliable and convenient tool so as to shorten and simplify proofs”. See (Katz & Katz 2011 [58]) for a discussion.], Cauchy, and others.

The caliber of some of the responses to Robinson’s challenge has been disappointing. Thus, the critique by Earman (1975 [30]) is marred by a confusion of second-order infinitesimals like $dx^2$ and second-order hyperreal extensions like ${}^{\ast\ast}\mathbb{R}$; see (Katz & Sherry 2013 [65]) for a discussion.

Victor J. Katz (2014 [66]) appears to imply that a B-track approach based on notions of infinitesimals or indivisibles is limited to “the work of Fermat, Newton, Leibniz and many others in the 17th and 18th centuries”. This does not appear to be Felix Klein’s view. Klein formulated a condition, in terms of the mean value theorem [footnote 5: The Klein–Fraenkel criterion is discussed in more detail in Kanovei et al. (2013 [55]).], for what would qualify as a successful theory of infinitesimals, and concluded:

I will not say that progress in this direction is impossible, but it is true that none of the investigators have achieved anything positive (Klein 1908 [68, p. 219]).

Klein was referring to the then-current work on infinitesimal-enriched systems by Levi-Civita, Bettazzi, Stolz, and others. In Klein’s mind, the infinitesimal track was very much a current research topic; see Ehrlich (2006 [32]) for a detailed coverage of the work on infinitesimals around 1900.

2.2. Formal epistemology: Easwaran on hyperreals

Some recent articles are more encouraging in that they attempt a more technically sophisticated approach. K. Easwaran’s study (2014 [31]), motivated by a problem in formal epistemology [footnote 6: The problem is concerned with saving philosophical Bayesianism, a popular position in formal epistemology, which appears to require that one be able to find on every algebra of doxastically relevant propositions some subjective probability assignment such that only the impossible event ($\emptyset$) will be assigned an initial/uninformed subjective probability, or credence, of $0$.], attempts to deal with technical aspects of Robinson’s theory such as the notion of internal set, and shows an awareness of recent technical developments, such as a definable hyperreal system of Kanovei & Shelah (2004 [57]).

Even though Easwaran, in the tradition of Lewis (1980 [77]) and Skyrms (1980 [94]), tries to engage seriously with the intricacies of employing hyperreals in formal epistemology [footnote 7: For instance, he concedes: “And the hyperreals may also help, as long as we understand that they do not tell us the precise structure of credences.” (Easwaran 2014 [31], Introduction, last paragraph).], not all of his findings are convincing. For example, he assumes that physical quantities cannot take hyperreal values [footnote 8: Easwaran’s explicit premise is that “All physical quantities can be entirely parametrized using the standard real numbers.” (Easwaran 2014 [31, Section 8.4, Premise 3]).]. However, there exist physical quantities that are not directly observable. Theoretical proxies for unobservable physical quantities typically depend on the chosen mathematical model. And not surprisingly, there are mathematical models of physical phenomena which operate with the hyperreals, in which physical quantities take hyperreal values. Many such models are discussed in the volume by Albeverio et al. (1986 [1]).

For example, certain probabilistic laws of nature have been formulated using hyperreal-valued probability theory. The construction of mathematical Brownian motion by Anderson (1976 [4]) provides a hyperreal model of the botanical counterpart. It is unclear why (and indeed rather implausible that) an observer A, whose degrees of belief about botanical Brownian motion stem from a mathematical model based on the construction of mathematical Brownian motion by Wiener (1923 [104]), should be viewed as being more rational than another observer B, whose degrees of belief about botanical Brownian motion stem from a mathematical model based on Anderson’s construction of mathematical Brownian motion [footnote 9: One paradoxical aspect of Easwaran’s methodology is that, despite his anti-hyperreal stance in (2014 [31]), he does envision the possibility of useful infinitesimals in an earlier joint paper (Colyvan & Easwaran 2008 [27]), where he cites John Bell’s account (Bell’s presentation of Smooth Infinitesimal Analysis in [9] involves a category-theoretic framework based on intuitionistic logic), but never the hyperreals. Furthermore, in the 2014 paper he cites the surreals as possible alternatives to the real number–based description of the “structure of physical space” as he calls it; see Subsection 2.5 below for a more detailed discussion.].

Similarly problematic is Easwaran’s assumption that an infinite sequence of probabilistic tests must necessarily be modeled by the set of standard natural numbers (this is discussed in more detail in Subsection 2.5). Such an assumption eliminates the possibility of modeling it by a sequence of infinite hypernatural length. Indeed, once one allows for infinite sequences to be modeled in this way, the problem of assigning a probability to an infinite sequence of coin tosses that was studied in (Elga 2004 [33]) and (Williamson 2007 [105]) allows for an elegant hyperreal solution (Herzberg 2007 [48]).

Easwaran reiterates the common objection that the hyperreals are allegedly “non-constructive” entities. The bitter roots of such an allegation in the radical constructivist views of E. Bishop have been critically analyzed in (Katz & Katz 2011 [59]), and contrasted with the liberal views of the Intuitionist A. Heyting, who felt that Robinson’s theory was “a standard model of important mathematical research” (Heyting 1973 [51, p. 136]). It is important to keep in mind that Bishop’s target was classical mathematics (as a whole), the demise of which he predicted in the following terms:

Very possibly classical mathematics will cease to exist as an independent discipline (Bishop 1968 [14, p. 54]).

2.3. Zermelo–Fraenkel axioms and the Feferman–Levy model

In his analysis, Easwaran assigns substantial weight to the fact that “it is consistent with the ZF [Zermelo–Fraenkel set theory] without the Axiom of Choice” that the hyperreals do not exist (Easwaran 2014 [31, Section 8.4]); see Figure 2. However, on the same grounds, one would have to reject parts of mathematics with important applications. There are fundamental results in functional analysis that depend on the Axiom of Choice, such as the Hahn–Banach theorem; yet no one would suggest that mathematical physicists or mathematical economists should stop exploiting them.

Most real analysis textbooks prove the $\sigma$-additivity (i.e., countable additivity) of Lebesgue measure, but $\sigma$-additivity is not deducible from ZF, as shown by the Feferman–Levy model; see (Feferman & Levy 1963 [36]); (Jech 1973 [54, chapter 10]). Indeed, it is consistent with ZF that the following holds:

• ($\ast$) the continuum $\mathbb{R}$ of real numbers is a countable union $\mathbb{R} = \bigcup_{n \in \mathbb{N}} X_n$ of countable sets $X_n$.

See (Cohen 1966 [26, chapter IV, section 4]) for a description of a model of ZF in which ($\ast$) holds [footnote 10: Property ($\ast$) may appear to be asserting the countability of the continuum. However, in order to obtain a bijective map from a countable collection of countable sets to $\mathbb{N} \times \mathbb{N}$ (and hence, by diagonalization, to $\mathbb{N}$), the Axiom of Choice (in its “countable” version which allows a countably-infinite sequence of independent choices) will necessarily be used.]. Note that ($\ast$) implies that the Lebesgue measure is not countably additive, as all countable sets are null sets whereas $\mathbb{R}$ is not a null set. Therefore, countable additivity of the Lebesgue measure cannot be established in ZF.
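The incompatibility between such a decomposition of the continuum and countable additivity can be spelled out in a few lines. The sketch below writes $\lambda$ for Lebesgue measure and $X_n$ for the countable pieces; both symbols are our notation for this illustration.

```latex
% Why a countable union of countable sets covering \mathbb{R} contradicts
% countable additivity (requires amsmath, amssymb). \lambda = Lebesgue measure.
\begin{align*}
  \mathbb{R} = \bigcup_{n \in \mathbb{N}} X_n, \quad X_n \text{ countable}
    &\;\Longrightarrow\; \lambda(X_n) = 0 \text{ for all } n, \\
  \text{countable additivity}
    &\;\Longrightarrow\; \lambda(\mathbb{R}) \le \sum_{n \in \mathbb{N}} \lambda(X_n) = 0, \\
  \text{whereas} \quad \lambda(\mathbb{R}) &\ge \lambda([0,1]) = 1 > 0 .
\end{align*}
```

Hence in a model of ZF where the decomposition exists, countable additivity must fail.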

Terence Tao wrote:

By giving up countable additivity, one loses a fair amount of measure and integration theory, and in particular the notion of the expectation of a random variable becomes problematic (unless the random variable takes only finitely many values). (Tao 2013 [100])

Tao’s remarks suggest that deducibility from ZF is not a reasonable criterion of mathematical plausibility by any modern standard.

There are models of ZF in which there are infinitesimal numbers, if properly understood, among the real numbers themselves. Thus, there exist models of ZF which are also models of Nelson’s (1987 [82]) radically elementary mathematics, a subsystem of Nelson’s (1977 [81]) Internal Set Theory. Here radically elementary mathematics is an extension of classical set theory (which may be understood as ZF [footnote 11: Even though Nelson would probably argue for a much weaker system; see Herzberg (2013 [49, Appendix A.1]), citing Nelson (2011 [83]).]) by a unary predicate, to be interpreted as

“… is a standard natural number”,

with additional axioms that regulate the use of the new predicate (notably external induction for standard natural numbers) and ensure the existence of non-standard numbers. Nelson (1987 [82, Appendix]) showed that a major part of the theory of continuous-time stochastic processes is in fact equivalent to a corresponding radically elementary theory involving infinitesimals, and indeed, radically elementary probability theory has seen applications in the sciences; see for example (Reder 2003 [85]).

In sum, mathematical descriptions of non-trivial natural phenomena involve, by necessity, some degree of mathematical idealisation, but Easwaran has not given us a good reason why only such mathematical idealisations that are feasible in every model of ZF should be acceptable. Rather, as we have already seen, there are very good arguments (e.g., from measure theory) against such a high reverence for ZF.

2.4. Skolem integers and Robinson integers

Easwaran recycles the well-known claim by A. Connes that a hypernatural number leads to a nonmeasurable set. However, the criticism by Connes [footnote 12: Note that Connes relied on the Hahn–Banach theorem, exploited ultrafilters, and placed a nonconstructive entity (namely the Dixmier trace) on the front cover of his magnum opus; see (Katz & Leichtnam 2013 [62]) and (Kanovei et al. 2013 [55]) for details.] is in the category of dressing down a feature to look like a bug, to reverse a known dictum from computer science slang. This can be seen as follows. The Skolem non-standard integers are known to be purely constructive; see Skolem (1955 [93]) and Kanovei et al. (2013 [55]). Yet they imbed in Robinson’s hypernaturals $\mathbb{N}_{\mathrm{Rob}}$:

\[ \mathbb{N}_{\mathrm{Sko}} \hookrightarrow \mathbb{N}_{\mathrm{Rob}}. \tag{2.1} \]

Viewing a purely constructive Skolem hypernatural

\[ H \in \mathbb{N}_{\mathrm{Sko}} \setminus \mathbb{N} \]

as a member of $\mathbb{N}_{\mathrm{Rob}}$ via the inclusion (2.1), one can apply the transfer principle to form the set

\[ X_H = \{ A \subset \mathbb{N} : H \in {}^{\ast}\!A \}, \]

where ${}^{\ast}\!A \subset \mathbb{N}_{\mathrm{Rob}}$ is the natural extension of $A$. The set $X_H$ is not measurable. What propels the set $X_H$ into existence is not a purported weakness of a nonstandard integer $H$ itself, but rather the remarkable strength of both the Łoś–Robinson transfer principle and the consequences it yields.
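The role of the transfer principle can be made explicit. The following sketch, in which $X_H$ denotes the collection of all $A \subset \mathbb{N}$ whose natural extension ${}^{\ast}\!A$ contains the infinite hypernatural $H$, checks that $X_H$ is a nonprincipal ultrafilter on $\mathbb{N}$; each implication uses transfer for a Boolean operation. The layout is ours and is meant as an illustration only.

```latex
% X_H is a nonprincipal ultrafilter on \mathbb{N} (requires amsmath, amssymb).
\begin{align*}
  A, B \in X_H
    &\;\Longrightarrow\; H \in {}^{\ast}\!A \cap {}^{\ast}\!B = {}^{\ast}(A \cap B)
    \;\Longrightarrow\; A \cap B \in X_H, \\
  A \notin X_H
    &\;\Longrightarrow\; H \in {}^{\ast}\mathbb{N} \setminus {}^{\ast}\!A
       = {}^{\ast}(\mathbb{N} \setminus A)
    \;\Longrightarrow\; \mathbb{N} \setminus A \in X_H, \\
  A \text{ finite}
    &\;\Longrightarrow\; {}^{\ast}\!A = A \not\ni H
    \;\Longrightarrow\; A \notin X_H .
\end{align*}
```

Identifying subsets of $\mathbb{N}$ with points of $\{0,1\}^{\mathbb{N}}$, a nonprincipal ultrafilter is a classical example of a nonmeasurable set, a result usually attributed to Sierpiński.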

2.5. Williamson, complexity, and other arguments

Easwaran makes a number of further critiques of hyperreal methodology. His section 8.1, entitled “Williamson’s Argument”, concerns infinite coin tosses. Easwaran’s analysis is based on the model of a countable sequence of coin tosses given by Williamson [105]. In this model, it is assumed that

… for definiteness, [the coin] will be flipped once per second, assuming that seconds from now into the future can be numbered with the natural numbers (Easwaran 2014 [31, section 8.1]).

What is lurking behind this is a double assumption which, unlike other “premises”, is not made explicit by Easwaran. Namely, he assumes that

1. a vast number of independent tests is best modeled by a temporal arrangement thereof, rather than by a simultaneous collection; and

2. the collection of seconds ticking away “from now [and] into the future” gives a faithful representation of the natural numbers.

These two premises are not self-evident and some research mathematicians have very different intuitions about the matter, as much of the literature on applied nonstandard analysis (e.g., Albeverio et al. 1986 [1]; Reder 2003 [85]) illustrates.

It seems that in Easwaran’s model, an agent can choose not to flip the coin at some seconds, thus giving rise to events like “a coin that is flipped starting at second 2 comes up heads on every flip”. However, in all applications we are aware of, this additional structure used to rule out the use of hyperreals as range of probability functions seems not to be relevant.

Williamson and Easwaran appear to be unwilling to assume that, once one decides to use hyperreal infinitesimals, one should also replace the original algebra “of propositions in which the agent has credences” with an internal algebra of the hyperreal setting. In fact, such an additional step allows one to avoid both the problems raised by Williamson’s argument in his formulation using conditional probability, and those raised by Easwaran in section 8.2 of his paper.

A possible model with hyperreal infinitesimals for an infinite sequence of coin tosses is given by representing every event by means of a sequence $a_1, \ldots, a_N$, where $a_n$ represents the outcome of the $n$th flip and $N$ is a fixed hypernatural number. In this model, consider the events “$a_n =$ Heads for $n \leq N$”, which we will denote $H(1)$, and “$a_n =$ Heads for $2 \leq n \leq N$”, which we will denote $H(2)$. In such a setting, events $H(1)$ and $H(2)$ are not isomorphic, contrary to what was argued in (Williamson [105, p. 3]). This is due to the fact that hypernatural numbers are an elementary extension of the natural numbers, for which the formula $k \neq k+1$ always holds. Moreover, the probability of $H(1)$ is the infinitesimal $2^{-N}$, while the probability of $H(2)$ is the strictly greater infinitesimal $2^{-(N-1)}$, thus obeying the well known rule for conditional probability.
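The conditional probability rule can be checked directly. The sketch below assumes a fair coin, independent flips, and $N$ an infinite hypernatural, and writes $H(1)$ for the event that all $N$ flips come up heads and $H(2)$ for the event that flips $2$ through $N$ do, so that $H(1) \subset H(2)$; these assumptions are stated here for the illustration.

```latex
% Conditional probability in the hyperfinite coin-toss model (requires amsmath).
% Assumptions: fair coin, independent flips, N an infinite hypernatural.
\begin{align*}
  P\bigl(H(1)\bigr) &= 2^{-N}, \qquad P\bigl(H(2)\bigr) = 2^{-(N-1)}, \\
  P\bigl(H(1) \mid H(2)\bigr)
    &= \frac{P\bigl(H(1) \cap H(2)\bigr)}{P\bigl(H(2)\bigr)}
     = \frac{2^{-N}}{2^{-(N-1)}} = \frac{1}{2} .
\end{align*}
```

The two events receive distinct infinitesimal probabilities, and conditioning on $H(2)$ returns the probability of a single fair flip.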

Easwaran’s section 8.4 entitled “The complexity argument” is based on four premises. However, his premise 3, to the effect that “all physical quantities can be entirely parametrized using the standard real numbers”, is unlikely to lead to meaningful philosophical conclusions based on “first principles”. This is because all physical quantities can be entirely parametrized by the usual rational numbers alone, due to the intrinsic limits of our capability to measure physical quantities. A clear explanation of this limitation was given by Dowek. In particular, since

a measuring instrument yields only an approximation of the measured magnitude, […] it is therefore impossible, except according to this idealization, to measure more than the first digits of a physical magnitude. […] According to this principle, this idealization of the process of measurement is a fiction. This suggests the idea, reminiscent of Pythagoras’ views, that Physics could be formulated with rational numbers only. We can therefore wonder why real numbers have been invented and, moreover, used in Physics. A hypothesis is that the invention of real numbers is one of the many situations, where the complexity of an object is increased, so that it can be apprehended more easily. (Dowek 2013 [29])

Related comments by Wheeler (1994 [103, p. 308]), Brukner & Zeilinger (2005 [22, p. 59]), and others were analyzed by Kanovei et al. (2013 [55, Section 8.4]). See also Jaroszkiewicz (2014 [53]).

If all physical quantities can be entirely parametrized by using rational numbers, there should be no compelling reason to choose the real number system as the value range of our probability measures. However, Easwaran is apparently comfortable with the idealisation of exploiting a larger number system than the rationals for the value range of probability measures. What we argue is that the real numbers are merely one among several possible idealisations that can be used for this purpose. For instance, in hyperreal models for infinite sequences of coin tosses developed by Benci, Bottazzi & Di Nasso (2013 [11]), all events have hyperrational probabilities. This generalizes both the case of finite sequences of coin tosses, and the Kolmogorovian model for infinite sequences of coin tosses, where a real-valued probability is generated by applying Carathéodory’s extension theorem to the rational-valued probability measure over the cylinder sets.

Given Easwaran’s firm belief that “the function relating credences to the physical is not so complex that its existence is independent of Zermelo-Fraenkel set theory” (see his section 8.4, premise 2), it is surprising to find him suggesting that

the surreal numbers seem more promising as a device for future philosophers of probability to use (Easwaran 2014 [31, Appendix A.3]).

However, while the construction of the surreals indeed “is a simultaneous generalization of Dedekind’s construction of the real numbers and von Neumann’s construction of the ordinals”, as observed by Easwaran, it is usually carried out in the Von Neumann–Bernays–Gödel set theory (NBG) with Global Choice; see, for instance, the “Preliminaries” section of (Alling 1987 [3]). The assumption of the Global Axiom of Choice is a strong foundational assumption.

The construction of the surreal numbers can be performed within a version of NBG that is a conservative extension of ZFC, but does not need Limitation of Size (or Global Choice). However, NBG clearly is not a conservative extension of ZF; and if one wishes to prove certain interesting features of the surreals one needs an even stronger version of NBG that involves the Axiom of Global Choice. Therefore, the axiomatic foundation that one needs for using the surreal numbers is at least as strong as the one needed for the hyperreals.

2.6. Infinity and infinitesimal: let both pretty severely alone

At the previous turn of the century, H. Heaton wrote:

I think I know exactly what is meant by the term zero. But I can have no conception either of infinity or of the infinitesimal, and I think it would be well if mathematicians would let both pretty severely alone (Heaton 1898 [47, p. 225]).

Heaton’s sentiment expresses an unease about a mathematical concept of which one may have an intuitive grasp [footnote 14: The intuitive appeal of infinitesimals makes them an effective teaching tool. The pedagogical value of teaching calculus with infinitesimals was demonstrated in a controlled study by Sullivan (1976 [96]).] but which is not easily formalizable. Heaton points out several mathematical inconsistencies or ill-chosen terminology among the conceptions of infinitesimals of his contemporaries. This highlights the brilliant mathematical achievement of a consistent “calculus” for infinitesimals attained through the work of Hewitt (1948 [50]), Łoś (1955 [78]), Robinson (1961 [87]), and Nelson (1977 [81]), but also of their predecessors like Fermat, Leibniz, Euler, and Cauchy, as we analyze respectively in Sections 3, 4, 5, and 6.

3. Fermat’s adequality

Our interpretation of Fermat’s technique is compatible with those by Strømholm (1968 [95]) and Giusti (2009 [43]). It is at variance with the interpretation by Breger (1994 [21]), considered by Knobloch (2014 [70]) to have been refuted.

Adequality, or παρισότης (parisotēs) in the original Greek of Diophantus, is a crucial step in Fermat’s method of finding maxima, minima, tangents, and solving other problems that a modern mathematician would solve using infinitesimal calculus. The method is presented in a series of short articles in Fermat’s collected works. The first article, Methodus ad Disquirendam Maximam et Minimam, opens with a summary of an algorithm for finding the maximum or minimum value of an algebraic expression in a variable $a$. For convenience, we will write such an expression in modern functional notation as $f(a)$.

3.1. Summary of Fermat’s algorithm

One version of the algorithm can be broken up into six steps in the following way:

1. Introduce an auxiliary symbol $e$, and form $f(a+e)$;

2. Set adequal the two expressions $f(a+e) =_{\mathrm{AD}} f(a)$ (the notation “$=_{\mathrm{AD}}$” for adequality is ours, not Fermat’s);

3. Cancel the common terms on the two sides of the adequality. The remaining terms all contain a factor of $e$;

4. Divide by $e$ (see also next step);

5. In a parenthetical comment, Fermat adds: “or by the highest common factor of $e$”;

6. Among the remaining terms, suppress all terms which still contain a factor of $e$. Solving the resulting equation for $a$ yields the extremum of $f$.

In modern mathematical language, the algorithm entails expanding the difference quotient

\[ \frac{f(a+e) - f(a)}{e} \]

in powers of $e$ and taking the constant term [footnote 15: Fermat also envisions a more general technique involving division by a higher power of $e$ as in step (5).]. The method (leaving aside step (5)) is immediately understandable to a modern reader as the elementary calculus exercise of finding the extremum by solving the equation $f'(a) = 0$. But the real question is how Fermat understood this algorithm in his own terms, in the mathematical language of his time, prior to the invention of calculus by Barrow, Leibniz, Newton, and others.
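To see the steps in action, consider the problem of dividing a segment of length $b$ into two parts $a$ and $b - a$ whose product is maximal, the example usually cited from Fermat’s Methodus. The sketch below follows the six steps in modern notation, writing $=_{\mathrm{AD}}$ for adequality and $e$ for the auxiliary symbol; the layout and step labels are ours.

```latex
% Fermat's algorithm applied to f(a) = a(b - a) (requires amsmath).
\begin{align*}
  f(a) &= a(b - a) = ab - a^2, \\
  f(a+e) &= ab + be - a^2 - 2ae - e^2
    && \text{step (1)} \\
  ab + be - a^2 - 2ae - e^2 &=_{\mathrm{AD}} ab - a^2
    && \text{step (2)} \\
  be - 2ae - e^2 &=_{\mathrm{AD}} 0
    && \text{step (3)} \\
  b - 2a - e &=_{\mathrm{AD}} 0
    && \text{step (4)} \\
  b - 2a &= 0 \;\Longrightarrow\; a = \tfrac{b}{2}
    && \text{step (6)}
\end{align*}
```

The expression $b - 2a - e$ obtained in step (4) is the difference quotient of $f$, and its constant term $b - 2a$ is $f'(a)$, in agreement with the modern reading.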

There are two crucial points in trying to understand Fermat’s reasoning: first, the meaning of “adequality” in step (2), and second, the justification for suppressing the terms involving positive powers of $e$ in step (6). The two issues are closely related because interpretation of adequality depends on the conditions on $e$. One condition which Fermat always assumes is that $e$ is positive. He did not use negative numbers in his calculations [footnote 16: This point is crucial for our argument below using the transverse ray. Since Fermat is only working with positive values of his $e$, he only considers a ray (rather than a full line) starting at a point of the curve. The convexity of the curve implies an inequality, which Fermat transforms into an adequality without giving much explanation of his procedure, but assuming implicitly that the ray is tangent to the curve. But a transverse ray would satisfy the inequality no less than a tangent ray, indicating that Fermat is relying on an additional piece of geometric information. His procedure of applying the defining relation of the curve itself, to a point on the tangent ray, is only meaningful when the increment $e$ is small (see Subsection 3.2).].

Fermat introduces the term adequality in Methodus with a reference to Diophantus of Alexandria. In the third article of the series, Ad Eamdem Methodum (Sur la Même Méthode), he quotes Diophantus’ Greek term παρισότης, which he renders, following Xylander and Bachet, as adaequatio or adaequalitas (see A. Weil [102, p. 28]).

3.2. Tangent line and convexity of parabola

Consider Fermat’s calculation of the tangent line to the parabola (see Fermat [38, p. 122-123]). To simplify Fermat’s notation, we will work with the parabola $y = x^2$, or

\[ \frac{x^2}{y} = 1. \]

To understand what Fermat is doing, it is helpful to think of the parabola as a level curve of the two-variable function $\frac{x^2}{y}$.

Given a point $(x, y)$ on the parabola, Fermat wishes to find the tangent line through the point. Fermat exploits the geometric fact that by convexity, a point

\[ (p, q) \]

on the tangent line lies outside the parabola. He therefore obtains an inequality equivalent in our notation to $\frac{p^2}{q} > 1$, or $p^2 > q$. Here $q = y - e$, and $e$ is Fermat’s magic symbol we wish to understand. Thus, we obtain

\[ \frac{p^2}{y - e} > 1. \tag{3.1} \]

At this point Fermat proceeds as follows:

(i) he writes down the inequality $\frac{p^2}{y-e}>1$, or $p^2>y-e$;

(iii) he writes down the adequality between $\frac{x^2}{p^2}$ and $\frac{y}{y-e}$;

(iv) he uses an identity involving similar triangles to substitute

\[ \frac{x}{p}=\frac{y+r}{y+r-e}, \]

where $r$ is the distance from the vertex of the parabola to the point of intersection of the tangent to the parabola at $(x,y)$ with the axis of symmetry;

(v) he cross multiplies and cancels identical terms on right and left, then divides out by $e$, discards the remaining terms containing $e$, and obtains $y=r$ as the solution. [Footnote 17: In Fermat’s notation $y=d$ and $y+r=a$. Step (v) can be understood as requiring the expression $\frac{y}{y-e}-\frac{(y+r)^2}{(y+r-e)^2}$ to have a double root at $e=0$, leading to the solution $y=r$ or, in Fermat’s notation, $a=2d$.]
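The last step of the procedure can be spelled out as a short computation. What follows is a reconstruction in modern notation for the parabola $y=x^2$, not Fermat's own text; the abbreviation $s=y+r$ and the use of $\approx$ to stand for adequality are assumptions introduced here purely for readability.

```latex
\begin{align*}
\frac{(y+r)^2}{(y+r-e)^2} &\approx \frac{y}{y-e}
  && \text{(adequality after the similar-triangles substitution)} \\
s^2\,(y-e) &\approx y\,(s-e)^2, \qquad s = y+r
  && \text{(cross multiplication)} \\
-s^2 e &\approx -2ys\,e + y\,e^2
  && \text{(cancel the common term } s^2 y\text{)} \\
-s^2 &\approx -2ys + y\,e
  && \text{(divide by } e\text{)} \\
s^2 &= 2ys
  && \text{(discard the remaining term containing } e\text{)} \\
y + r &= 2y, \quad\text{i.e.}\quad r = y .
\end{align*}
```

The only step that is not ordinary algebra is the passage from the fourth line to the fifth, where the term $y\,e$ is dropped rather than set equal to zero.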

What interests us here are steps (i) and (ii). How does Fermat pass from an inequality to an adequality? Giusti noted that

Comme d’habitude, Fermat est autant détaillé dans les exemples qu’il est réticent dans les explications. On ne trouvera donc presque jamais des justifications de sa règle des tangentes (Giusti 2009 [43]).

In fact, Fermat provides no explicit explanation for this step. However, what he does is to apply the defining relation for a curve to points on the tangent line to the curve. Note that here the quantity $e$, as in $q=y-e$, is positive: Fermat did not have the facility we do of assigning negative values to variables. Strømholm notes that Fermat

never considered negative roots, and if  was a solution of an equation, he did not mention it as it was nearly always geometrically uninteresting (Strømholm 1968 [95, p. 49]).

Fermat says nothing about considering points $y+e$ “on the other side”, i.e., further away from the vertex of the parabola, as he does in the context of applying a related but different method, for instance in his two letters to Mersenne (see [95, p. 51]), and in his letter to Brûlart [39]. [Footnote 18: This was noted by Giusti (2009 [43]).] Now for positive values of $e$, Fermat’s inequality (3.1) would be satisfied by a transverse ray (i.e., secant ray) starting at $(x,y)$ and lying outside the parabola, just as much as it is satisfied by a tangent ray starting at $(x,y)$. Fermat’s method therefore presupposes an additional piece of information, privileging the tangent ray over transverse rays. The additional piece of information is geometric in origin: he applies the defining relation (of the curve itself) to a point on the tangent ray to the curve, a procedure that is only meaningful when the increment $e$ is small.

In modern terms, we would speak of the tangent line being a “best approximation” to the curve for a small variation $e$; however, Fermat does not explicitly discuss the size of $e$. The procedure of “discarding the remaining terms” in step (v) admits of a proxy in the hyperreal context. Namely, it is the standard part principle (see Section 1). Fermat does not elaborate on the justification of this step, but he is always careful to speak of suppressing or deleting the remaining term in $e$, rather than setting it equal to zero. Perhaps his rationale for suppressing terms in $e$ consists in ignoring terms that don’t correspond to an actual measurement, prefiguring Leibniz’s inassignable quantities. Fermat’s inferential moves in the context of his adequality are akin to Leibniz’s in the context of his calculus; see Section 4.

3.3. Fermat, Galileo, and Wallis

While Fermat never spoke of his $e$ as being infinitely small, the technique was known both to Fermat’s contemporaries, such as Galileo (see Bascelli 2014 [7], [8]) and Wallis (see Katz & Katz [60, Section 24]), and to Fermat himself, as his correspondence with Wallis makes clear; see Katz, Schaps & Shnider (2013 [63, Section 2.1]).

Fermat was very interested in Galileo’s treatise De motu locali, as we know from his letters to Marin Mersenne dated April/May 1637, 10 August, and 22 October 1638. Galileo’s treatment of infinitesimals in De motu locali is discussed by Wisan (1974 [106, p. 292]) and Settle (1966 [91]).

Alexander (2014 [2]) notes that the clerics in Rome forbade the doctrine of the infinitely small on 10 August 1632 (a month before Galileo was put on trial over heliocentrism); this may help explain why the Catholic Fermat might have been reluctant to speak of the infinitely small explicitly. [Footnote 19: See a related discussion at http://math.stackexchange.com/questions/661999/are-infinitesimals-dangerous]

In a recent text, U. Felgner analyzes the Diophantus problems which exploit the method of παρισότης, and concludes that

Aus diesen Beispielen wird deutlich, dass die Verben und adaequare nicht ganz dasselbe ausdrücken. Das griechische Wort bedeutet, der Gleichheit nahe zu sein, während das lateinische Wort das Erreichen der Gleichheit (sowohl als vollendeten als auch als unvollendeten Prozeß) ausdrückt (Felgner 2014 [37]).

Thus, in his view, even though the two expressions have slightly different meanings, the Greek meaning “being close to equality” and the Latin meaning “equality which is reached (at the end of either a finite or an infinite process),” they both involve approximation. Felgner goes on to consider some of the relevant texts from Fermat, and concludes that Fermat’s method has nothing to do with differential calculus and involves only the property of an auxiliary expression having a double zero:

Wir hoffen, deutlich gemacht zu haben, dass die fermatsche “Methode der Adaequatio” gar nichts mit dem Differential-Kalkül zu tun hat, sondern vielmehr im Studium des Wertverlaufs eines Polynoms in der Umgebung eines kritischen Punktes besteht, und dabei das Ziel verfolgt zu zeigen, dass das Polynom an dieser Stelle eine doppelte Nullstelle besitzt (ibid.)

However, Felgner’s conclusion is inconsistent with his own textual analysis, which indicates that the idea of approximation is present in the methods of both Diophantus and Fermat. As Knobloch (2014 [70]) notes, “Fermat’s method of adequality is not a single method but rather a cluster of methods.” Felgner failed to analyze the examples of tangents to transcendental curves, such as the cycloid, in which Fermat does not study the order of the zero of an auxiliary polynomial. Felgner mistakenly asserts that in the case of the cycloid Fermat did not reveal how he thought of the solution: “Wie FERMAT sich die Lösung dachte, hat er nicht verraten.” (ibid.) Quite to the contrary, as Fermat explicitly stated, he applied the defining property of the curve to points on the tangent line:

Il faut donc adégaler (à cause de la propriété spécifique de la courbe qui est à considérer sur la tangente)

(see Katz et al. (2013 [63]) for more details). Fermat’s approach involves applying the defining relation of the curve to a point on a tangent to the curve. The approach is consistent with the idea of approximation inherent in his method, involving a negligible distance (whether infinitesimal or not) between the tangent and the original curve when one is near the point of tangency. This line of reasoning is related to the ideas of the differential calculus. Note that Fermat does not say anything here concerning the multiplicities of zeros of polynomials. As Felgner himself points out, in the case of the cycloid the only polynomial in sight is of first order and the increment “$e$” cancels out. Fermat correctly solves the problem by obtaining the defining equation of the tangent.

For a recent study of 17th century methodology, see the article (Carroll et al. 2013 [23]).

4. Leibniz’s Transcendental law of homogeneity

In this section, we examine a possible connection between Fermat’s adequality and Leibniz’s Transcendental Law of Homogeneity (TLH). Both of them enable certain inferential moves that play parallel roles in Fermat’s and Leibniz’s approaches to the problem of maxima and minima. Note the similarity in titles of their seminal texts: Methodus ad Disquirendam Maximam et Minimam (Fermat, see Tannery [98, p. 133]) and Nova methodus pro maximis et minimis … (Leibniz 1684 [72] in Gerhardt [42]).

4.1. When are quantities equal?

Leibniz developed the TLH in order to enable inferences to be made between inassignable and assignable quantities. The TLH governs equations involving differentials. H. Bos interprets it as follows:

A quantity which is infinitely small with respect to another quantity can be neglected if compared with that quantity. Thus all terms in an equation except those of the highest order of infinity, or the lowest order of infinite smallness, can be discarded. For instance,

\[ a + dx = a, \qquad dx + ddy = dx \tag{4.1} \]

etc. The resulting equations satisfy this […] requirement of homogeneity (Bos 1974 [18, p. 33] paraphrasing Leibniz 1710 [75, pp. 381-382]).

The title of Leibniz’s 1710 text is Symbolismus memorabilis calculi algebraici et infinitesimalis in comparatione potentiarum et differentiarum, et de lege homogeneorum transcendentali. The inclusion of the transcendental law of homogeneity (lex homogeneorum transcendentalis) in the title of the text attests to the importance Leibniz attached to this law.

The “equality up to an infinitesimal” implied in TLH was explicitly discussed by Leibniz in a 1695 response to Nieuwentijt, in the following terms:

Caeterum aequalia esse puto, non tantum quorum differentia est omnino nulla, sed et quorum differentia est incomparabiliter parva; et licet ea Nihil omnino dici non debeat, non tamen est quantitas comparabilis cum ipsis, quorum est differentia (Leibniz 1695 [73, p. 322]) [emphasis added–authors]

We provide a translation of Leibniz’s Latin:

Besides, I consider to be equal not only those things whose difference is entirely nothing, but also those whose difference is incomparably small: and granted that it [i.e., the difference] should not be called entirely Nothing, nevertheless it is not a quantity comparable to those whose difference it is.

4.2. Product rule

How did Leibniz use the TLH in developing the calculus? The issue can be illustrated by Leibniz’s justification of the last step in the following calculation:

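\[ \begin{aligned} d(uv) &= (u+du)(v+dv)-uv \\ &= u\,dv + v\,du + du\,dv \\ &= u\,dv + v\,du. \end{aligned} \tag{4.2} \]

The last step in the calculation (4.2) depends on the following inference:

\[ d(uv) = u\,dv + v\,du + du\,dv \;\Longrightarrow\; d(uv) = u\,dv + v\,du. \]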

Such an inference is an application of Leibniz’s TLH. In his 1701 text Cum Prodiisset [74, pp. 46-47], Leibniz presents an alternative justification of the product rule (see Bos [18, p. 58]). Here he divides by $dx$, and argues with differential quotients rather than differentials. The role played by the TLH in these calculations is similar to that played by adequality in Fermat’s work on maxima and minima. For more details on Leibniz, see Guillaume (2014 [45]); Katz & Sherry (2012 [64]), (2013 [65]); Sherry & Katz [92]; Tho (2012 [101]).
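The inference in (4.2) can be mimicked mechanically by an arithmetic that never records second-order terms. The following Python sketch is a modern illustration in the spirit of dual numbers, not a rendering of Leibniz's own procedure; the class name `FirstOrder`, the helper `product_differential`, and the numerical values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstOrder:
    """A quantity u together with its differential du, kept to first order only."""

    value: float  # the assignable part u
    diff: float   # the first-order differential du

    def __add__(self, other: "FirstOrder") -> "FirstOrder":
        return FirstOrder(self.value + other.value, self.diff + other.diff)

    def __mul__(self, other: "FirstOrder") -> "FirstOrder":
        # (u + du)(v + dv) = uv + (u dv + v du) + du dv.
        # The second-order term du*dv is discarded, not stored.
        return FirstOrder(
            self.value * other.value,
            self.value * other.diff + self.diff * other.value,
        )


def product_differential(u: FirstOrder, v: FirstOrder) -> float:
    """Return d(uv) = u dv + v du, read off from the truncated product."""
    return (u * v).diff


# Arbitrary illustrative values: u = 3 with du = 2, v = 5 with dv = 7.
u = FirstOrder(3.0, 2.0)
v = FirstOrder(5.0, 7.0)
d_uv = product_differential(u, v)  # u*dv + v*du = 3*7 + 5*2
```

The design choice worth noting is that the product never computes `du*dv` at all, which mirrors the point that the higher-order term is neglected rather than proved to be zero.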

5. Euler’s Principle of Cancellation

Some of the Leibnizian formulas reappear, not surprisingly, in the work of his student’s student Euler. Euler’s formulas like

\[ a + dx = a, \tag{5.1} \]

where $a$ “is any finite quantity” (see Euler 1755 [35, §§ 86, 87]) are consonant with a Leibnizian tradition as reported by Bos; see formula (4.1) above. To explain formulas like (5.1), Euler elaborated two distinct ways (arithmetic and geometric) of comparing quantities, in the following terms:

Since we are going to show that an infinitely small quantity is really zero, we must meet the objection of why we do not always use the same symbol 0 for infinitely small quantities, rather than some special ones…[S]ince we have two ways to compare them, either arithmetic or geometric, let us look at the quotients of quantities to be compared in order to see the difference.
If we accept the notation used in the analysis of the infinite, then $dx$ indicates a quantity that is infinitely small, so that both $dx=0$ and $a\,dx=0$, where $a$ is any finite quantity. Despite this, the geometric ratio $a\,dx : dx$ is finite, namely $a : 1$. For this reason, these two infinitely small quantities, $dx$ and $a\,dx$, both being equal to $0$, cannot be confused when we consider their ratio. In a similar way, we will deal with infinitely small quantities $dx$ and $dy$ (ibid., § 86, pp. 51-52) [emphasis added–the authors].

Having defined the arithmetic and geometric comparisons, Euler proceeds to clarify the difference between them as follows:

Let $a$ be a finite quantity and let $dx$ be infinitely small. The arithmetic ratio of equals is clear: Since $n\,dx=0$, we have

\[ a \pm n\,dx - a = 0. \]

On the other hand, the geometric ratio is clearly of equals, since

\[ \frac{a \pm n\,dx}{a} = 1. \tag{5.2} \]

From this we obtain the well-known rule that the infinitely small vanishes in comparison with the finite and hence can be neglected [with respect to it] [35, §87] [emphasis in the original–the authors].

Like Leibniz, Euler considers more than one way of comparing quantities. Euler’s formula (5.2) indicates that his geometric comparison is procedurally identical with the Leibnizian TLH.

To summarize, Euler’s geometric comparison of a pair of quantities amounts to their ratio being infinitely close to a finite quantity, as in formula (5.2); the same is true for TLH. Note that one has $a+dx=a$ in this sense for an appreciable $a$, but not for $a=0$ (in which case there is equality only in the arithmetic sense). Euler’s “geometric” comparison was dubbed “the principle of cancellation” by Ferraro [40, pp. 47, 48, 54].

Euler proceeds to present the usual rules of infinitesimal calculus, which go back to Leibniz, L’Hôpital, and the Bernoullis, such as

\[ a\,dx^m + b\,dx^n = a\,dx^m \]

provided $m<n$ “since $dx^n$ vanishes compared with $dx^m$” ([35, § 89]), relying on his “geometric” comparison. Euler introduces a distinction between infinitesimals of different order, and directly computes a ratio of the form

\[ \frac{dx \pm dx^2}{dx} = 1 \pm dx = 1 \]

of two particular infinitesimals, assigning the value $1$ to it (ibid., § 88). [Footnote 20: Note that Euler does not “prove that the expression is equal to 1”; such indirect proofs are a trademark of the $(\epsilon,\delta)$ approach. Rather, Euler directly computes (what would today be formalized as the standard part of) the expression.] Euler concludes:

Although all of them [infinitely small quantities] are equal to 0, still they must be carefully distinguished one from the other if we are to pay attention to their mutual relationships, which has been explained through a geometric ratio (ibid., § 89).

The Eulerian hierarchy of orders of infinitesimals harks back to Leibniz’s work (see Section 4). Euler’s geometric comparison, or “principle of cancellation”, is yet another incarnation of the idea at the root of Fermat’s adequality and Leibniz’s Transcendental Law of Homogeneity. For further details on Euler see Bibiloni et al. (2006 [13]); Bair et al. (2013 [6]); Reeder (2013 [86]).
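In a hyperreal framework, Euler's two computations can be read as taking a shadow. The following is a sketch under the assumptions that $dx$ is a nonzero infinitesimal, $a$ is appreciable, and $n$ is finite; the symbol $\operatorname{st}$ for the standard part is modern notation, not Euler's.

```latex
\[
\operatorname{st}\!\left(\frac{dx \pm dx^2}{dx}\right)
  = \operatorname{st}(1 \pm dx) = 1,
\qquad
\operatorname{st}\!\left(\frac{a \pm n\,dx}{a}\right)
  = \operatorname{st}\!\left(1 \pm \frac{n\,dx}{a}\right) = 1 .
\]
```

In both cases the ratio itself differs from $1$ by an infinitesimal; only its standard part is exactly $1$.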

6. What did Cauchy mean by “limit”?

Laugwitz’s detailed study of Cauchy’s methodology places it squarely in the B-track (see Section 2). In conclusion, Laugwitz writes:

The influence of Euler should not be neglected, with regard both to the organization of Cauchy’s texts and, in particular, to the fundamental role of infinitesimals (Laugwitz 1987 [71, p. 273]).

Thus, in his 1844 text Exercices d’analyse et de physique mathématique, Cauchy wrote:

…si, les accroissements des variables étant supposés infiniment petits, on néglige, vis-à-vis de ces accroissements considérés comme infiniment petits du premier ordre, les infiniment petits des ordres supérieurs au premier, les nouvelles équations deviendront linéaires par rapport aux accroissements petits des variables. Leibniz et les premiers géomètres qui se sont occupés de l’analyse infinitésimale ont appelé différentielles des variables leurs accroissements infiniment petits, … (Cauchy 1844 [25, p. 5]).

Two important points emerge from this passage. First, Cauchy specifically speaks about neglecting (“on néglige”) higher order terms, rather than setting them equal to zero. This indicates a similarity of procedure with the Leibnizian TLH (see Section 4). Like Leibniz and Fermat before him, Cauchy does not set the higher order terms equal to zero, but rather “neglects” or discards them. Second, Cauchy’s comments on Leibniz deserve special attention.

6.1. Cauchy on Leibniz

By speaking matter-of-factly about the infinitesimals of Leibniz specifically, Cauchy reveals that his (Cauchy’s) infinitesimals are consonant with Leibniz’s. This is unlike the differentials where Cauchy adopts a different approach.

On page 6 of the same text, Cauchy notes that the notion of derivative

représente en réalité la limite du rapport entre les accroissements infiniment petits et simultanés de la fonction et de la variable (ibid., p. 6) [emphasis added–the authors]

The same definition of the derivative is repeated on page 7, this time emphasized by means of italics. Note Cauchy’s emphasis on the point that the derivative is not a ratio of infinitesimal increments, but rather the limit of the ratio.

Cauchy’s use of the term “limit” as applied to a ratio of infinitesimals in this context may be unfamiliar to a modern reader, accustomed to taking limits of sequences of real numbers. Its meaning is clarified by Cauchy’s discussion of “neglecting” higher order infinitesimals in the previous paragraph on page 5 cited above. Cauchy’s use of “limit” is procedurally identical with the Leibnizian TLH, and therefore similarly finds its modern proxy as extracting the standard part out of the ratio of infinitesimals.

On page 11, Cauchy chooses infinitesimal increments $\Delta s$ and $\Delta t$, and writes down the equation

\[ \frac{ds}{dt} = \text{lim.}\;\frac{\Delta s}{\Delta t}. \tag{6.1} \]

Modulo replacing Cauchy’s symbol “lim.” by the modern one “st” or “sh”, Cauchy’s formula (6.1) is identical to the formula appearing in any textbook based on the hyperreal approach, expressing the derivative in terms of the standard part function (shadow).
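A one-line worked instance may make the reading of “lim.” as a shadow concrete. The choice $s=t^2$ is an illustrative assumption (it is not an example taken from Cauchy's text), with $t$ standard and $\Delta t$ a nonzero infinitesimal:

```latex
\[
\frac{\Delta s}{\Delta t}
  = \frac{(t+\Delta t)^2 - t^2}{\Delta t}
  = 2t + \Delta t,
\qquad
\operatorname{st}\!\left(\frac{\Delta s}{\Delta t}\right) = 2t .
\]
```

The ratio of infinitesimal increments is $2t+\Delta t$, not $2t$; the derivative is what remains after the infinitesimal term $\Delta t$ is neglected.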

6.2. Cauchy on continuity

On page 17 of his 1844 text, Cauchy gives a definition of continuity in terms of infinitesimals (an infinitesimal $x$-increment necessarily produces an infinitesimal $y$-increment). His definition is nearly identical with the italicized definition that appeared on page 34 in his Cours d’Analyse (Cauchy 1821 [24]), 23 years earlier, when he first introduced the modern notion of continuity. We will use the translation by Bradley & Sandifer (2009 [20]). In his Section 2.2 entitled Continuity of functions, Cauchy writes:

If, beginning with a value of $x$ contained between these limits, we add to the variable $x$ an infinitely small increment $\alpha$, the function itself is incremented by the difference $f(x+\alpha)-f(x)$.

Cauchy goes on to state that

the function $f(x)$ is a continuous function of $x$ between the assigned limits if, for each value of $x$ between these limits, the numerical value of the difference $f(x+\alpha)-f(x)$ decreases indefinitely with the numerical value of $\alpha$.

He then proceeds to provide an italicized definition of continuity in the following terms:

the function $f(x)$ is continuous with respect to $x$ between the given limits if, between these limits, an infinitely small increment in the variable always produces an infinitely small increment in the function itself.

In modern notation, Cauchy’s definition can be stated as follows. Denote by $\operatorname{hal}(x)$ the halo of $x$, i.e., the collection of all points infinitely close to $x$. Then $f$ is continuous at $x$ if

\[ f(\operatorname{hal}(x)) \subset \operatorname{hal}(f(x)). \tag{6.2} \]
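As a worked instance of the infinitesimal definition of continuity, consider the illustrative choice $f(x)=x^2$ (an assumption made here, not an example from Cauchy), with $x$ finite and $\alpha$ infinitesimal:

```latex
\[
f(x+\alpha) - f(x) = (x+\alpha)^2 - x^2 = \alpha\,(2x+\alpha).
\]
```

Since $2x+\alpha$ is finite and $\alpha$ is infinitesimal, the product is infinitesimal, so an infinitely small increment in the variable produces an infinitely small increment in the function.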

Most scholars hold that Cauchy never worked with a pointwise definition of continuity (as is customary today) but rather required a condition of type (6.2) to hold in a range (“between the given limits”). It is worth recalling that Cauchy never gave an $(\epsilon,\delta)$ definition of either limit or continuity (though $(\epsilon,\delta)$-type arguments occasionally do appear in Cauchy). It is a widespread and deeply rooted misconception among both mathematicians and those interested in the history and philosophy of mathematics that it was Cauchy who invented the modern definitions of limit and continuity; see, e.g., Colyvan & Easwaran (2008 [27, p. 88]) who err in attributing the formal $(\epsilon,\delta)$ definition of continuity to Cauchy. That this is not the case was argued by Błaszczyk et al. (2013 [15]); Borovik et al. (2012 [17]); Katz & Katz (2011 [58]); Nakane (2014 [80]); Tall et al. (2014 [97]).

7. Modern formalisations: a case study

To illustrate the use of the standard part in the context of the hyperreal field extension of $\mathbb{R}$, we will consider the following problem on divergent integrals. The problem was recently posed at Stack Exchange (SE), and is reportedly due to S. Konyagin. The solution exploits the technique of a monotone rearrangement $g$ of a function $f$, shown by Ryff to admit a measure-preserving map $\phi$ such that $f=g\circ\phi$. In general there is no “inverse” $\psi$ such that $g=f\circ\psi$; however, a hyperreal enlargement enables one to construct a suitable (internal) proxy for such a $\psi$, so as to be able to write $\hat g=f\circ\psi$ on a hyperfinite grid; see formula (8.2) below.

Theorem 7.1.

Let $f$ be a real-valued function continuous on $[0,1]$. Then there exists a number $a$ such that the integral

\[ \int_0^1 \frac{1}{|f(x)-a|}\,dx \tag{7.1} \]

diverges.

A proof can be given in terms of a monotone rearrangement of the function (see Hardy et al. [46]). We take a decreasing rearrangement $g$ of the function $f$. If $f$ is continuous, then the function $g$ will also be continuous. If $f$ is not constant on any set of positive measure, one can construct $g$ by setting

\[ g = m^{-1} \quad\text{where}\quad m(y) = \operatorname{meas}\{x : f(x) > y\}. \tag{7.2} \]

Ryff (1970 [90]) showed that there exists a measure-preserving transformation $\phi$ that relates $f$ and $g$ as follows:

\[ f(x) = g\circ\phi(x). \tag{7.3} \]

[Footnote 22: However, see Section 8 for a hyperfinite approach avoiding measure theory altogether.] Finding a map $\psi$ such that $g(x)=f\circ\psi(x)$ is in general impossible (see Bennett & Sharpley [12, p. 85, example 7.7] for a counterexample). This difficulty can be circumvented using a hyperfinite rearrangement (see Section 8). By measure preservation, we have

\[ \int_0^1 |f(x)-a|^{-1}\,dx = \int_0^1 |g(x)-a|^{-1}\,dx \]

(for every $a$). [Footnote 23: Here one needs to replace the function $|f(x)-a|^{-1}$ by the family of its truncations $\min\big(C, |f(x)-a|^{-1}\big)$, and then let $C$ increase without bound.]

To complete the proof of Theorem 7.1, apply the result that every monotone function is a.e. differentiable. [Footnote 24: In fact, one does not really need to use the result that monotone functions are a.e. differentiable. Consider the convex hull in the plane of the graph of the monotone function $g$, and take a point where the graph touches the boundary of the convex hull (other than the endpoints $0$ and $1$). Setting $a$ equal to the $y$-coordinate of the point does the job.] Take a point $p\in[0,1]$ where the function $g$ is differentiable. Then the number $a=g(p)$ yields an infinite integral (7.1), since the difference $|g(x)-a|$ can be bounded above in terms of a linear expression. [Footnote 25: Namely, for $x$ near such a point $p$, we have $|g(x)-a|\le(|g'(p)|+1)\,|x-p|$, hence $\frac{1}{|g(x)-a|}\ge\frac{1}{(|g'(p)|+1)\,|x-p|}$, yielding a lower bound in terms of a divergent integral.]
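The simplest instance of the theorem may help fix ideas. Taking $f(x)=x$ on $[0,1]$ and $a=\tfrac12$ is an illustrative assumption made here (the source does not work this example); the divergence comes from exactly the kind of linear bound described above.

```latex
\[
\int_0^1 \frac{dx}{\left|x-\tfrac12\right|}
  = 2\int_0^{1/2} \frac{du}{u}
  = 2\lim_{\varepsilon\to 0^+}\bigl(\ln\tfrac12 - \ln\varepsilon\bigr)
  = \infty .
\]
```

Here the decreasing rearrangement is $g(x)=1-x$, which is differentiable at $p=\tfrac12$ with $g(p)=\tfrac12=a$.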

8. A combinatorial approach to decreasing rearrangements

The existence of a decreasing rearrangement of a function $f$ continuous on $[0,1]$ admits an elegant proof in the context of its hyperreal extension ${}^{*}\!f$, which we will continue to denote by $f$.

We present a combinatorial argument showing that the decreasing rearrangement obeys the same modulus of uniformity as the original function. [Footnote 26: A function $f$ on $[0,1]$ is said to satisfy a modulus of uniformity $\mu(n)>0$, $n\in\mathbb{N}$, if $|p-q|\le\mu(n)$ implies $|f(p)-f(q)|\le\frac1n$ for all $n\in\mathbb{N}$ and all $p,q\in[0,1]$.] The argument actually yields an independent construction of the decreasing rearrangement (see Proposition 8.1) that avoids recourse to measure theory. It also yields an “inverse up to an infinitesimal,” $\psi$ (see formula (8.2)), to the function $\phi$ such that $f=g\circ\phi$. For a recent application of combinatorial arguments in a hyperreal framework, see Benci et al. (2013 [11]).

In passing from the finite to the continuous case of rearrangements, Bennett and Sharpley [12] note that

nonnegative sequences $(a_1,a_2,\dots,a_n)$ and $(b_1,b_2,\dots,b_n)$ are equimeasurable if and only if there is a permutation $\sigma$ of $\{1,2,\dots,n\}$ such that $b_i=a_{\sigma(i)}$ for $i=1,2,\dots,n$. … The notion of permutation is no longer available in this context [of continuous measure spaces] and is replaced by that of a “measure-preserving transformation” (Bennett and Sharpley 1988 [12, p. 79]).

We show that the hyperreal framework allows one to continue working with combinatorial ideas, such as the “inverse” function $\psi$, in the continuous case as well.

Let $H\in{}^{*}\mathbb{N}\setminus\mathbb{N}$, let $p_i=\frac{i}{H}$ for $i=1,\dots,H$. By the Transfer Principle (see e.g., Davis [28]; Herzberg [49]; Kanovei & Reeken [56]), the nonstandard domain of internal sets satisfies the same basic laws as the usual, “standard” domain of real numbers and related objects. Thus, as for finite sets, there exists a permutation $\psi$ of the hyperfinite grid

\[ G_H=\{p_1,\dots,p_H\} \tag{8.1} \]

by decreasing value of $f(p_i)$ (here $f(\psi(p_1))$ is the maximal value). We assume that equal values are ordered lexicographically so that if $f(p_i)=f(p_j)$ with $i<j$ then $\psi^{-1}(p_i)<\psi^{-1}(p_j)$. Hence we obtain an internal function

\[ \hat g(p_i)=f(\psi(p_i)),\quad i=1,\dots,H. \tag{8.2} \]

Here $\hat g$ is (perhaps nonstrictly) decreasing on the grid $G_H$ of (8.1). The internal sequences $(f(p_i))$ and $(\hat g(p_i))$, where $i=1,\dots,H$, are equinumerable in the sense above.

Proposition 8.1.

Let $f$ be an arbitrary continuous function. Then there is a standard continuous real function $g(x)$ such that $g(\operatorname{st}(p_i))=\operatorname{st}(\hat g(p_i))$ for all $i$, where $\operatorname{st}(y)$ denotes the standard part of a hyperreal $y$.

Proof.

Let $\hat g_i=\hat g(p_i)$. We claim that $\hat g$ is S-continuous (microcontinuous), i.e., for each pair $i,j=1,\dots,H$, if $p_i-p_j$ is infinitesimal then so is $\hat g_i-\hat g_j$. To prove the claim, we will prove the following stronger fact:

for every $i<j$ there are $m,n$ such that $|n-m|\le j-i$ and $|f(p_m)-f(p_n)|\ge\hat g_i-\hat g_j$.

The sets $A=\{k : f(p_k)\ge\hat g_i\}$ and $B=\{k : f(p_k)\le\hat g_j\}$ are nonempty and there are at most $j-i-1$ points which are not in $A\cup B$. Let $m\in A$ and $n\in B$ be such that $|m-n|$ is minimal. All integers between $m$ and $n$ are not in $A\cup B$. Hence there are at most $j-i-1$ such integers, and therefore $|n-m|\le j-i$. By definition of $A$ and $B$, we obtain $|f(p_m)-f(p_n)|\ge\hat g_i-\hat g_j$, which proves the claim. Thus $\hat g$ is indeed S-continuous.

This allows us to define, for any standard $x\in[0,1]$, the value $g(x)$ to be the standard part of the hyperreal $\hat g_i$ for any hyperinteger $i$ such that $p_i$ is infinitely close to $x$, and then $g$ is a continuous and (non-strictly) monotone real function equal to the decreasing rearrangement $g=m^{-1}$ of (7.2). [Footnote 27: The argument shows in fact that the modulus of uniformity of $g$ is bounded by that of $f$; see footnote 26.] ∎

The hyperreal approach makes it possible to solve Konyagin’s problem without resorting to standard treatments of decreasing rearrangements which use measure theory. Note that the rearrangement defined by the internal permutation $\psi$ preserves the integral of $f$ (as well as the integrals of the truncations of $|f(x)-a|^{-1}$), in the following sense. The right-hand Riemann sums satisfy

\[ \sum_{i=1}^{H} f(p_i)\,\Delta x=\sum_{i=1}^{H} f(\psi(p_i))\,\Delta x=\sum_{i=1}^{H}\hat g(p_i)\,\Delta x, \tag{8.3} \]

where $\Delta x=\frac{1}{H}$. Thus $\psi$ transforms a hyperfinite Riemann sum of $f$ into a hyperfinite Riemann sum of $\hat g$. Since $\operatorname{st}\big(\sum_{i=1}^{H} f(p_i)\,\Delta x\big)=\int_0^1 f(x)\,dx$ and $\operatorname{st}\big(\sum_{i=1}^{H}\hat g(p_i)\,\Delta x\big)=\int_0^1 g(x)\,dx$, we conclude that $f$ and $g$ have the same integrals, and similarly for the integrals of the truncations; see footnote 23.

The first equality in (8.3) holds automatically by the transfer principle even though $\psi$ is an infinite permutation. (Compare with the standard situation where changing the order of summation in an infinite sum generally requires further justification.) This illustrates one of the advantages of the hyperreal approach.
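The combinatorial content of this section can be tried out on an ordinary finite grid. The Python sketch below is only a finite analogue: a large integer `H` stands in for the infinite hypernatural, sorting plays the role of the permutation, and the test function `f` is a hypothetical choice made for illustration. It checks two things discussed above: the Riemann sum is unchanged by the rearrangement, and neighbouring values of the rearranged function are never further apart than the worst neighbouring pair of the original.

```python
import math


def grid(h: int) -> list[float]:
    """Right-hand grid points p_i = i/h, i = 1..h, on [0, 1]."""
    return [i / h for i in range(1, h + 1)]


def decreasing_rearrangement(values: list[float]) -> list[float]:
    """Values of f on the grid, permuted into non-increasing order."""
    return sorted(values, reverse=True)


def max_adjacent_gap(values: list[float]) -> float:
    """Largest |v[i+1] - v[i]| over neighbouring grid points."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))


def f(x: float) -> float:
    # Hypothetical test function: continuous and non-monotone on [0, 1].
    return math.sin(6.0 * x) + x


H = 1000  # finite stand-in for the infinite hypernatural H
f_vals = [f(p) for p in grid(H)]
g_vals = decreasing_rearrangement(f_vals)

# Same summands in a different order, so the Riemann sums coincide.
sum_f = math.fsum(f_vals) / H
sum_g = math.fsum(g_vals) / H

# Finite counterpart of the "stronger fact" with j = i + 1.
gap_f = max_adjacent_gap(f_vals)
gap_g = max_adjacent_gap(g_vals)
```

In the finite setting reordering a sum needs no justification; the point of the hyperreal argument is that the same reasoning carries over to a hyperfinite grid by transfer.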

9. Conclusion

We have critically reviewed several common misrepresentations of hyperreal number systems, not least in relation to their alleged non-constructiveness, from a historical, philosophical, and set-theoretic perspective. In particular we have countered some of Easwaran’s recent arguments against the use of hyperreals in formal epistemology. A hyperreal framework enables a richer syntax better suited for expressing proxies for procedural moves found in the work of Fermat, Leibniz, Euler, and Cauchy. Such a framework sheds light on the internal coherence of their procedures, which have often been misunderstood from a whiggish post-Weierstrassian perspective.

Acknowledgments

The work of Vladimir Kanovei was partially supported by RFBR grant 13-01-00006. M. Katz was partially funded by the Israel Science Foundation grant no. 1517/12. We are grateful to Thomas Mormann and to the anonymous referee for helpful suggestions, and to Ivor Grattan-Guinness for contributing parts of the introduction.

References

• [1] Albeverio, S.; Høegh-Krohn, R.; Fenstad, J.; Lindstrøm, T. Nonstandard Methods in Stochastic Analysis and Mathematical Physics. Pure and Applied Mathematics, 122. Academic Press, Inc., Orlando, FL, 1986.
• [2] Alexander, A. Infinitesimal: How a Dangerous Mathematical Theory Shaped the Modern World. Farrar, Straus and Giroux, 2014.
• [3] Alling, N. Foundations of Analysis over Surreal Number Fields. North-Holland Mathematical Library, 1987.
• [4] Anderson, R. A non-standard representation for Brownian motion and Itô integration. Israel Journal of Mathematics 25 (1976), no. 1-2, 15–46.
• [5] Arthur, R. Leibniz’s syncategorematic infinitesimals. Arch. Hist. Exact Sci. 67 (2013), no. 5, 553–593.
• [6] Bair, J.; Błaszczyk, P.; Ely, R.; Henry, V.; Kanovei, V.; Katz, K.; Katz, M.; Kutateladze, S.; McGaffey, T.; Schaps, D.; Sherry, D.; Shnider, S. Is mathematical history written by the victors? Notices of the American Mathematical Society 60 (2013) no. 7, 886-904. See http://www.ams.org/notices/201307/rnoti-p886.pdf and http://arxiv.org/abs/1306.5973
• [7] Bascelli, T. Galileo’s quanti: understanding infinitesimal magnitudes. Arch. Hist. Exact Sci. 68 (2014), no. 2, 121–136.
• [8] Bascelli, T. Infinitesimal issues in Galileo’s theory of motion. Revue Roumaine de Philosophie 58 (2014), no. 1, 23-41.
• [9] Bell, J. The Continuous and the Infinitesimal in Mathematics and Philosophy. Polimetrica, 2006.
• [10] Benacerraf, P. What numbers could not be. Philos. Rev. 74 (1965), 47–73.
• [11] Benci, V.; Bottazzi E.; Di Nasso, M. Elementary numerosity and measures, preprint (2013).
• [12] Bennett, C.; Sharpley, R. Interpolation of operators. Pure and Applied Mathematics 129. Academic Press, Boston, MA, 1988.
• [13] Bibiloni, L.; Viader, P.; Paradís, J. On a series of Goldbach and Euler. Amer. Math. Monthly 113 (2006), no. 3, 206–220.
• [14] Bishop, E. Mathematics as a numerical language. 1970 Intuitionism and Proof Theory (Proc. Conf., Buffalo, N.Y., 1968) pp. 53–71. North-Holland, Amsterdam.
• [15] Błaszczyk, P.; Katz, M.; Sherry, D. Ten misconceptions from the history of analysis and their debunking. Foundations of Science, 18 (2013), no. 1, 43-74. See http://dx.doi.org/10.1007/s10699-012-9285-8 and http://arxiv.org/abs/1202.4153
• [16] Borovik, A.; Jin, R.; Katz, M. An integer construction of infinitesimals: Toward a theory of Eudoxus hyperreals. Notre Dame Journal of Formal Logic 53 (2012), no. 4, 557-570. See http://dx.doi.org/10.1215/00294527-1722755 and http://arxiv.org/abs/1210.7475
• [17] Borovik, A.; Katz, M. Who gave you the Cauchy–Weierstrass tale? The dual history of rigorous calculus. Foundations of Science 17 (2012), no. 3, 245-276. See http://dx.doi.org/10.1007/s10699-011-9235-x and http://arxiv.org/abs/1108.2885
• [18] Bos, H. J. M. Differentials, higher-order differentials and the derivative in the Leibnizian calculus. Arch. History Exact Sci. 14 (1974), 1–90.
• [19] Bos, H. J. M.; Bunn, R.; Dauben, J.; Grattan-Guinness, I.; Hawkins, T.; Pedersen, K. M. From the calculus to set theory, 1630–1910. An introductory history. Edited by I. Grattan-Guinness. Gerald Duckworth & Co. Ltd., London, 1980.
• [20] Bradley, R.; Sandifer, C. Cauchy’s Cours d’analyse. An annotated translation. Sources and Studies in the History of Mathematics and Physical Sciences. Springer, New York, 2009.
• [21] Breger, H. The mysteries of adaequare: a vindication of Fermat. Arch. Hist. Exact Sci. 46 (1994), no. 3, 193–219.
• [22] Brukner, Č.; Zeilinger, A.: Quantum physics as a science of information, in Quo vadis quantum mechanics?, 47-61, Frontiers Collection, Springer, Berlin, 2005.
• [23] Carroll, M.; Dougherty, S.; Perkins, D. Indivisibles, Infinitesimals and a Tale of Seventeenth-Century Mathematics. Mathematics Magazine 86 (2013), no. 4, 239–254.
• [24] Cauchy, A. L. Cours d’Analyse de L’Ecole Royale Polytechnique. Première Partie. Analyse algébrique. Paris: Imprimérie Royale, 1821. Online at http://books.google.com/books?id=_mYVAAAAQAAJ&dq=cauchy&lr=&source=gbs_navlinks_s
• [25] Cauchy, A. L. Exercices d’analyse et de physique mathématique (vol. 3). Paris, Bachelier, 1844.
• [26] Cohen, P. Set theory and the continuum hypothesis. W. A. Benjamin, New York-Amsterdam, 1966.
• [27] Colyvan, M.; Easwaran, K. Mathematical and physical continuity. Australas. J. Log. 6 (2008), 87–93.
• [28] Davis, M. Applied nonstandard analysis. Pure and Applied Mathematics. Wiley-Interscience [John Wiley & Sons], New York-London-Sydney, 1977. Reprinted: Dover, NY, 2005, see http://store.doverpublications.com/0486442292.html
• [29] Dowek, G. Real numbers, chaos, and the principle of a bounded density of information. Invited paper at International Computer Science Symposium in Russia, 2013. See https://who.rocq.inria.fr/Gilles.Dowek/Publi/csr.pdf
• [30] Earman, J. Infinities, infinitesimals, and indivisibles: the Leibnizian labyrinth. Studia Leibnitiana 7 (1975), no. 2, 236–251.
• [31] Easwaran, K. Regularity and hyperreal credences. Philosophical Review 123 (2014), No. 1, 1–41.
• [32] Ehrlich, P. The rise of non-Archimedean mathematics and the roots of a misconception. I. The emergence of non-Archimedean systems of magnitudes. Arch. Hist. Exact Sci. 60 (2006), no. 1, 1–121.
• [33] Elga, A. Infinitesimal chances and the laws of nature. Australasian Journal of Philosophy 82 (2004), no. 1, 67-76.
• [34] Euler, L. Institutiones Calculi Differentialis. SPb, 1755.
• [35] Euler, L. Foundations of Differential Calculus. English translation of Chapters 1–9 of ([34]) by D. Blanton, Springer, N.Y., 2000.
• [36] Feferman, S.; Levy, A. Independence results in set theory by Cohen’s method II. Notices Amer. Math. Soc. 10 (1963), 593.
• [37] Felgner, U. Der Begriff der ”Angleichung” (παρισότης, adaequatio) bei Diophant und Fermat (2014), preprint.
• [38] Fermat, P. Méthode pour la recherche du maximum et du minimum. p. 121-156 in Tannery’s edition [99].
• [39] Fermat, P. Letter to Brûlart. Oeuvres, Vol. 5, pp. 120-125.
• [40] Ferraro, G. Differentials and differential coefficients in the Eulerian foundations of the calculus. Historia Mathematica 31 (2004), no. 1, 34–61. See http://dx.doi.org/10.1016/S0315-0860(03)00030-2.
• [41] Gerhardt, C. I. (ed.) Historia et Origo calculi differentialis a G. G. Leibnitio conscripta, ed. C. I. Gerhardt, Hannover, 1846.
• [42] Gerhardt, C. I. (ed.) Leibnizens mathematische Schriften (Berlin and Halle: Eidmann, 1850-1863).
• [43] Giusti, E. Les méthodes des maxima et minima de Fermat. Ann. Fac. Sci. Toulouse Math. (6) 18 (2009), Fascicule Special, 59–85.
• [44] Goldenbaum U.; Jesseph D. (Eds.) Infinitesimal Differences: Controversies between Leibniz and his Contemporaries. Berlin-New York: Walter de Gruyter, 2008.
• [45] Guillaume, M. “Review of Katz, M.; Sherry, D. Leibniz’s infinitesimals: their fictionality, their modern implementations, and their foes from Berkeley to Russell and beyond. Erkenntnis 78 (2013), no. 3, 571–625.” Mathematical Reviews (2014). See http://www.ams.org/mathscinet-getitem?mr=3053644
• [46] Hardy, G.; Littlewood, J.; Pólya, G. Inequalities. Second edition. Cambridge University Press, 1952.
• [47] Heaton, H. Infinity, the Infinitesimal, and Zero. American Mathematical Monthly 5 (1898), no. 10, 224–226.
• [48] Herzberg, F. Internal laws of probability, generalized likelihoods and Lewis’ infinitesimal chances—a response to Adam Elga. British Journal for the Philosophy of Science 58 (2007), no. 1, 25-43.
• [49] Herzberg, F. Stochastic calculus with infinitesimals. Lecture Notes in Mathematics, 2067. Springer, Heidelberg, 2013.
• [50] Hewitt, E. Rings of real-valued continuous functions. I. Trans. Amer. Math. Soc. 64 (1948), 45–99.
• [51] Heijting, A. Address to Professor A. Robinson. At the occasion of the Brouwer memorial lecture given by Prof. A. Robinson on the 26th April 1973. Nieuw Archief voor Wiskunde (3) 21 (1973), 134–137.
• [52] Ishiguro, H. Leibniz’s philosophy of logic and language. Second edition. Cambridge University Press, Cambridge, 1990.
• [53] Jaroszkiewicz, G. Principles of Discrete Time Mechanics. Cambridge Monographs on Mathematical Physics. Cambridge University Press, 2014.
• [54] Jech, T. The axiom of choice. Studies in Logic and the Foundations of Mathematics, Vol. 75. North-Holland Publishing Co., Amsterdam-London; American Elsevier Publishing Co., Inc., New York, 1973.
• [55] Kanovei, V.; Katz, M.; Mormann, T. Tools, Objects, and Chimeras: Connes on the Role of Hyperreals in Mathematics. Foundations of Science 18 (2013), no. 2, 259–296. See http://dx.doi.org/10.1007/s10699-012-9316-5 and http://arxiv.org/abs/1211.0244
• [56] Kanovei, V.; Reeken, M. Nonstandard analysis, axiomatically. Springer Monographs in Mathematics, Berlin: Springer, 2004.
• [57] Kanovei, V.; Shelah, S. A definable nonstandard model of the reals. Journal of Symbolic Logic 69 (2004), no. 1, 159–164.
• [58] Katz, K.; Katz, M. Cauchy’s continuum. Perspectives on Science 19 (2011), no. 4, 426-452. See http://arxiv.org/abs/1108.4201 and http://www.mitpressjournals.org/doi/abs/10.1162/POSC_a_00047
• [59] Katz, K.; Katz, M. Meaning in classical mathematics: is it at odds with Intuitionism? Intellectica 56 (2011), no. 2, 223–302. See http://arxiv.org/abs/1110.5456
• [60] Katz, K.; Katz, M. A Burgessian critique of nominalistic tendencies in contemporary mathematics and its historiography. Foundations of Science 17 (2012), no. 1, 51–89. See http://dx.doi.org/10.1007/s10699-011-9223-1 and http://arxiv.org/abs/1104.0375
• [61] Katz, K.; Katz, M.; Kudryk, T. Toward a clarity of the extreme value theorem. Logica Universalis 8 (2014), no. 2, 193-214. See http://dx.doi.org/10.1007/s11787-014-0102-8 and http://arxiv.org/abs/1404.5658
• [62] Katz, M.; Leichtnam, E. Commuting and noncommuting infinitesimals. American Mathematical Monthly 120 (2013), no. 7, 631–641. See http://dx.doi.org/10.4169/amer.math.monthly.120.07.631 and http://arxiv.org/abs/1304.0583
• [63] Katz, M.; Schaps, D.; Shnider, S. Almost Equal: The Method of Adequality from Diophantus to Fermat and Beyond. Perspectives on Science 21 (2013), no. 3, 283-324. See http://www.mitpressjournals.org/doi/abs/10.1162/POSC_a_00101 and http://arxiv.org/abs/1210.7750
• [64] Katz, M.; Sherry, D. Leibniz’s laws of continuity and homogeneity. Notices of the American Mathematical Society 59 (2012), no. 11, 1550-1558. See http://www.ams.org/notices/201211/ and http://arxiv.org/abs/1211.7188
• [65] Katz, M.; Sherry, D. Leibniz’s infinitesimals: Their fictionality, their modern implementations, and their foes from Berkeley to Russell and beyond. Erkenntnis 78 (2013), no. 3, 571–625. See http://dx.doi.org/10.1007/s10670-012-9370-y and http://arxiv.org/abs/1205.0174
• [66] Katz, V. “Review of Bair et al., Is mathematical history written by the victors? Notices Amer. Math. Soc. 60 (2013), no. 7, 886–904.” Mathematical Reviews
• [67] Keisler, H. J. Elementary Calculus: An Infinitesimal Approach. Second Edition. Prindle, Weber & Schmidt, Boston, 1986. Revision from February 2012 online at http://www.math.wisc.edu/keisler/calc.html
• [68] Klein, F. Elementary Mathematics from an Advanced Standpoint. Vol. I. Arithmetic, Algebra, Analysis. Translation by E. R. Hedrick and C. A. Noble [Macmillan, New York, 1932] from the third German edition [Springer, Berlin, 1924]. Originally published as Elementarmathematik vom höheren Standpunkte aus (Leipzig, 1908).
• [69] Knobloch, E. Leibniz’s rigorous foundation of infinitesimal geometry by means of Riemannian sums. Foundations of the formal sciences, 1 (Berlin, 1999). Synthese 133 (2002), no. 1-2, 59–73.
• [70] Knobloch, E. “Review of: Katz, M.; Schaps, D.; Shnider, S. Almost equal: the method of adequality from Diophantus to Fermat and beyond. Perspectives on Science 21 (2013), no. 3, 283–324.” Mathematical Reviews
• [71] Laugwitz, D. Infinitely small quantities in Cauchy’s textbooks. Historia Mathematica 14 (1987), no. 3, 258–274.
• [72] Leibniz, G. Nova methodus pro maximis et minimis …, in Acta Erud., Oct. 1684. See Gerhardt [42], V, pp. 220-226.
• [73] Leibniz, G. (1695) Responsio ad nonnullas difficultates a Dn. Bernardo Niewentiit circa methodum differentialem seu infinitesimalem motas. In Gerhardt [42], V, p. 320–328.
• [74] Leibniz, G. (1701) Cum Prodiisset…mss “Cum prodiisset atque increbuisset Analysis mea infinitesimalis …” in Gerhardt [41], pp. 39–50.
• [75] Leibniz, G. (1710) Symbolismus memorabilis calculi algebraici et infinitesimalis in comparatione potentiarum et differentiarum, et de lege homogeneorum transcendentali. In Gerhardt [42, vol. V, pp. 377-382].
• [76] Lewis, A. Bringing modern mathematics to bear on historical research. Historia Mathematica 2 (1975), 84-85.
• [77] Lewis, D. A subjectivist’s guide to objective chance. In Studies in Inductive Logic and Probability (ed. Richard C. Jeffrey), pp. 263–293. University of California Press, 1980.
• [78] Łoś, J. Quelques remarques, théorèmes et problèmes sur les classes définissables d’algèbres. In Mathematical interpretation of formal systems, 98–113, North-Holland Publishing Co., Amsterdam, 1955.
• [79] Mormann, T.; Katz, M. Infinitesimals as an issue of neo-Kantian philosophy of science. HOPOS: The Journal of the International Society for the History of Philosophy of Science 3 (2013), no. 2, 236-280. See http://www.jstor.org/stable/10.1086/671348 and http://arxiv.org/abs/1304.1027
• [80] Nakane, M. Did Weierstrass’s differential calculus have a limit-avoiding character? His definition of a limit in ε-δ style. BSHM Bulletin: Journal of the British Society for the History of Mathematics, 29 (2014), no. 1, 51-59. See http://dx.doi.org/10.1080/17498430.2013.831241
• [81] Nelson, E. Internal set theory: a new approach to nonstandard analysis. Bulletin of the American Mathematical Society 83 (1977), no. 6, 1165–1198.
• [82] Nelson, E. Radically elementary probability theory. Annals of Mathematics Studies, vol. 117. Princeton University Press, 1987.
• [83] Nelson, E. Warning signs of a possible collapse of contemporary mathematics. In Infinity (eds. Michael Heller, W. Hugh Woodin), pp. 76–85. Cambridge University Press, 2011.
• [84] Pourciau, B. Newton and the notion of limit. Historia Mathematica 28 (2001), no. 1, 18–30.
• [85] Reder, C. Transient behaviour of a Galton-Watson process with a large number of types. Journal of Applied Probability 40 (2003), no. 4, 1007–1030.
• [86] Reeder, P. Internal Set Theory and Euler’s Introductio in Analysin Infinitorum. MSc Thesis, Ohio State University, 2013.
• [87] Robinson, A. Non-standard analysis. Nederl. Akad. Wetensch. Proc. Ser. A 64 = Indag. Math. 23 (1961), 432–440 [reprinted in Selected Works, see item [88], pp. 3-11]
• [88] Robinson, A. Selected papers of Abraham Robinson. Vol. II. Nonstandard analysis and philosophy. Edited and with introductions by W. A. J. Luxemburg and S. Körner. Yale University Press, New Haven, Conn., 1979.
• [89] Russell, B. The Principles of Mathematics. Routledge. London 1903.
• [90] Ryff, J. Measure preserving transformations and rearrangements. J. Math. Anal. Appl. 31 (1970), 449–458.
• [91] Settle, T. Galilean science: essays in the mechanics and dynamics of the Discorsi. Ph.D. dissertation, Cornell University, 1966, 288 pages.
• [92] Sherry, D.; Katz, M. Infinitesimals, imaginaries, ideals, and fictions. Studia Leibnitiana 44 (2012), no. 2, 166-192. See http://arxiv.org/abs/1304.2137
• [93] Skolem, T. Peano’s axioms and models of arithmetic. In Mathematical interpretation of formal systems, pp. 1–14. North-Holland Publishing Co., Amsterdam, 1955.
• [94] Skyrms, B. Causal Necessity: A Pragmatic Investigation of the Necessity of Laws. Yale University Press, 1980.
• [95] Strømholm, P. Fermat’s methods of maxima and minima and of tangents. A reconstruction. Arch. History Exact Sci. 5 (1968), no. 1, 47–69.
• [96] Sullivan, K. Mathematical Education: The Teaching of Elementary Calculus Using the Nonstandard Analysis Approach. Amer. Math. Monthly 83 (1976), no. 5, 370–375.
• [97] Tall, D.; Katz, M. A cognitive analysis of Cauchy’s conceptions of function, continuity, limit, and infinitesimal, with implications for teaching the calculus. Educational Studies in Mathematics, to appear. See http://dx.doi.org/10.1007/s10649-014-9531-9 and http://arxiv.org/abs/1401.1468
• [98] Tannery, P., Henry, C. Oeuvres de Fermat, Vol. 1 Gauthier-Villars, 1891.
• [99] Tannery, P., Henry, C. Oeuvres de Fermat, Vol. 3 Gauthier-Villars, 1896.
• [100]
• [101] Tho, T. Equivocation in the foundations of Leibniz’s infinitesimal fictions. Society and Politics 6 (2012), no. 2, 70–98.
• [102] Weil, A. Number theory. An approach through history. From Hammurapi to Legendre. Birkhäuser Boston, Inc., Boston, MA, 1984.
• [103] Wheeler, J. At home in the universe. Masters of Modern Physics. American Institute of Physics, Woodbury, NY, 1994.
• [104] Wiener, N. Differential space. Journal of Mathematical Physics 2 (1923), no. 1, 131–174.
• [105] Williamson, T. How probable is an infinite sequence of heads? Analysis 67 (2007), no. 3, 173–180.
• [106] Wisan, W. The new science of motion: a study of Galileo’s De motu locali. Arch. History Exact Sci. 13 (1974), 103–306.