
# A Sharp Double Inequality for the Inverse Tangent Function

Gholamreza Alirezaei

G. Alirezaei is with the Institute for Theoretical Information Technology, RWTH Aachen University, 52056 Aachen, Germany (e-mail: alirezaei@ti.rwthaachen.de). The present work is categorized in terms of Mathematics Subject Classification (MSC2010): 26D05, 26D07, 26D15, 33B10, 39B62.

###### Abstract

The inverse tangent function can be bounded by different inequalities, for example by Shafer’s inequality. In this publication, we propose a new sharp double inequality, consisting of a lower and an upper bound, for the inverse tangent function. In particular, we sharpen Shafer’s inequality and calculate the best corresponding constants. The maximum relative errors of the obtained bounds are approximately smaller than 0.27 % and 0.23 % for the lower and upper bound, respectively. Furthermore, we determine an upper bound on the relative errors of the proposed bounds in order to describe their tightness analytically. Moreover, some important properties of the obtained bounds are discussed in order to describe their behavior and achieved accuracy.

Keywords: trigonometric bounds; Shafer’s inequality; inverse tangent approximation.

## I Introduction

The inverse tangent function is an elementary mathematical function that appears in many applications, especially in different fields of engineering. In electrical engineering, especially in communication theory and signal processing, it is mostly used to describe the phase of a complex-valued signal. But there are many other applications in which the inverse tangent function plays an important role. On the one hand, it is often used as an approximation for more complex functions because of its elementary behavior. For instance, the Heaviside step function is the most famous function that can be very accurately approximated by the inverse tangent function. On the other hand, it is sometimes approximated by simpler functions in order to enable further calculations. For instance, the inverse tangent function can be accurately approximated by its argument if the absolute value of the argument is sufficiently small.

Quite naturally, the problem arises of how to replace the inverse tangent function with a surrogate function in order to approximate the inverse tangent function, as well as other conceivable functions, accurately. If a surrogate function with a mathematically simple form could be found, then it would have considerable subsequent applications. A few application cases are in the field of information and estimation theory, where an unknown phase shift or the direction of arrival is estimated, for example by using the CORDIC algorithm [1], the MUSIC algorithm [2], or MAP and ML estimators [3]. Some other cases are in the field of system design and control theory, where a non-linear network unit is modeled by a non-linear function, for instance the saturation behavior of an amplifier [4] or the sigmoidal non-linearity in neural networks [5]. Some more application cases are related to the theory of signals and systems, where a signal should be mapped into a set of coefficients of basis functions, but the transformation is not feasible because of the phase description by the inverse tangent function, for example in some Fourier-related transforms [6]. Many other applications are likewise conceivable.

But we have to mention that finding a simple replacement for the inverse tangent function is in fact difficult. In the present work, we thus focus only on a special idea which has some nice properties and is described in the following.

In [7], R. E. Shafer proposed the elementary problem: Show that for all $x > 0$ the inequality

$$ \frac{3x}{1 + 2\sqrt{1 + x^2}} < \arctan(x) \tag{1} $$

holds, where $\arctan(x)$ denotes the inverse tangent function that is defined for all real numbers $x$. From Shafer’s problem several inequalities have emerged to date. In particular, the authors in [8] investigated double inequalities of the form

$$ \frac{a_1 x}{a_2 + \sqrt{1 + x^2}} < \arctan(x) < \frac{a_3 x}{a_2 + \sqrt{1 + x^2}} , \quad x > 0 , \tag{2} $$

and they determined the coefficients $a_1$, $a_2$, and $a_3$ such that the above double inequality is sharp. In the present work, we follow a similar idea and investigate a generalized version of the double inequality in (2). We consider functions of the type

$$ \frac{x}{c_1 + \sqrt{c_2 + c_3 x^2}} \tag{3} $$

with positive real coefficients $c_1$, $c_2$ and $c_3$, because functions of this kind have advantageous properties for replacing the inverse tangent function, as we will discuss in the next section. Then we determine the triple $(c_1, c_2, c_3)$ such that a lower and an upper bound for the inverse tangent function are achieved. In order to describe the tightness of the obtained bounds, we determine an upper bound on the relative errors of the proposed bounds. Furthermore, we discuss some corresponding properties of the proposed bounds and visualize the achieved results.

G. A.

July 18, 2013

### Mathematical Notations:

Throughout this paper we denote the set of real numbers by $\mathbb{R}$. The mathematical operation $|x|$ denotes the absolute value of any real number $x$. Furthermore, $\mathcal{O}(u(x))$ denotes the order of any function $u(x)$.

## II Main Theorems

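In the current section, we present the new bounds for the inverse tangent function and describe some of their important properties.

###### Theorem.

For all $x \in \mathbb{R}$, let $f(x)$, $g(x)$ and $h(x)$ be defined by

$$ f(x) := \frac{x}{\frac{4}{\pi^2} + \sqrt{\bigl(1 - \frac{4}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}} , \tag{4} $$

$$ g(x) := \arctan(x) \tag{5} $$

and

$$ h(x) := \frac{x}{1 - \frac{6}{\pi^2} + \sqrt{\bigl(\frac{6}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}} . \tag{6} $$

Then, for all $x \in \mathbb{R}$, the double inequality

$$ |f(x)| \leq |g(x)| \leq |h(x)| \tag{7} $$

holds.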

###### Proof.

See Appendix A. ∎

###### Remark.

The functions $f(x)$, $g(x)$ and $h(x)$ are point symmetric such that the identities $f(x) = -f(-x)$, $g(x) = -g(-x)$ and $h(x) = -h(-x)$ hold. Hence, it is sufficient to consider only the case of $x \geq 0$.

###### Remark.

The triples $(c_1, c_2, c_3) = \bigl(\frac{4}{\pi^2}, \bigl(1 - \frac{4}{\pi^2}\bigr)^2, \frac{4}{\pi^2}\bigr)$ and $(c_1, c_2, c_3) = \bigl(1 - \frac{6}{\pi^2}, \frac{36}{\pi^4}, \frac{4}{\pi^2}\bigr)$ are the best possible ones such that the double inequality (7) holds. In other words, no component of the first triple can be replaced by a smaller value and no component of the second triple can be replaced by a larger value while keeping the other components fixed. In this sense, the double inequality (7) is sharp.

We get a first impression of the nature of the bounds from Figure 1. As we can see, the double inequality (7) is very tight. The curves seem to be continuous, strictly increasing and convex. Hence, we discuss the mathematical properties of the obtained bounds in detail in the following.
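As a quick numerical sanity check of the double inequality (7), the following Python sketch evaluates both bounds on a grid. It assumes that the bounds have the form (3) with the triples $(4/\pi^2, (1-4/\pi^2)^2, 4/\pi^2)$ and $(1-6/\pi^2, 36/\pi^4, 4/\pi^2)$, which is what the right-hand sides of (20) and (27) suggest; the helper names `bound` and `double_inequality_holds` are illustrative only. A grid check is of course no substitute for the proof in Appendix A.

```python
import math

PI2 = math.pi ** 2


def bound(x: float, c1: float, c2: float, c3: float) -> float:
    """Evaluate x / (c1 + sqrt(c2 + c3 * x^2)), the function type in (3)."""
    return x / (c1 + math.sqrt(c2 + c3 * x * x))


def f(x: float) -> float:
    """Lower bound; the triple is an assumption consistent with (27)."""
    return bound(x, 4 / PI2, (1 - 4 / PI2) ** 2, 4 / PI2)


def h(x: float) -> float:
    """Upper bound; the triple is an assumption consistent with (20)."""
    return bound(x, 1 - 6 / PI2, (6 / PI2) ** 2, 4 / PI2)


def double_inequality_holds(x: float, tol: float = 1e-15) -> bool:
    """Check |f(x)| <= |arctan(x)| <= |h(x)| up to a rounding tolerance."""
    g = math.atan(x)
    return abs(f(x)) <= abs(g) + tol and abs(g) <= abs(h(x)) + tol


grid = [k / 100 for k in range(-10000, 10001)]  # x from -100 to 100
all_ok = all(double_inequality_holds(x) for x in grid)
print(all_ok)
```

The tolerance only absorbs floating-point rounding; it is several orders of magnitude smaller than the gap between the bounds and the inverse tangent at the smallest non-zero grid point.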

On the one hand, the first three elements in the Taylor series expansions of $f(x)$, $g(x)$ and $h(x)$ as $x$ approaches zero are obtained as

$$ f(x) \simeq x - \frac{2}{\pi^2 - 4}\,x^3 + \frac{2(3\pi^2 - 8)}{(\pi^2 - 4)^3}\,x^5 + \mathcal{O}(x^7) , \tag{8} $$

$$ g(x) \simeq x - \frac{1}{3}\,x^3 + \frac{1}{5}\,x^5 + \mathcal{O}(x^7) \tag{9} $$

and

$$ h(x) \simeq x - \frac{1}{3}\,x^3 + \frac{\pi^2 + 12}{108}\,x^5 + \mathcal{O}(x^7) . \tag{10} $$

###### Remark.

Only the first elements in the Taylor series expansions of $f(x)$ and $g(x)$ are identical to each other, while in the Taylor series expansions of $g(x)$ and $h(x)$ the first two elements are pairwise identical to each other. Thus, $h(x)$ achieves a better approximation of $g(x)$ than $f(x)$ for sufficiently small $x$.

On the other hand, the first three elements in the asymptotic power series expansions of $f(x)$, $g(x)$ and $h(x)$ as $x$ approaches infinity are obtained as

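$$ f(x) \simeq \frac{\pi}{2} - \frac{1}{x} - \frac{\pi^4 - 8\pi^2 - 16}{16\pi x^2} + \mathcal{O}(x^{-3}) , \tag{11} $$

$$ g(x) \simeq \frac{\pi}{2} - \frac{1}{x} + \frac{1}{3x^3} + \mathcal{O}(x^{-5}) \tag{12} $$

and

$$ h(x) \simeq \frac{\pi}{2} - \frac{\pi^2 - 6}{4x} + \frac{\pi^4 - 12\pi^2 + 18}{8\pi x^2} + \mathcal{O}(x^{-3}) \tag{13} $$

by using the general definition of the asymptotic power series expansion [9, p. 11, Definition 1.3.3] and simple calculations.

###### Remark.

The first two elements in the asymptotic power series expansions of $f(x)$ and $g(x)$ are pairwise identical to each other, while in the asymptotic power series expansions of $h(x)$ and $g(x)$ only the first elements are identical to each other. Thus, $f(x)$ achieves a better approximation of $g(x)$ than $h(x)$ for sufficiently large $x$.

###### Corollary.

From equations (8)–(13) we conclude that

$$ \lim_{x \to \pm 0} f(x) = \lim_{x \to \pm 0} g(x) = \lim_{x \to \pm 0} h(x) = 0 \tag{14} $$

and

$$ \lim_{x \to \pm\infty} f(x) = \lim_{x \to \pm\infty} g(x) = \lim_{x \to \pm\infty} h(x) = \pm\frac{\pi}{2} . \tag{15} $$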
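The expansions (8), (10), (11) and (13) can be cross-checked numerically. The sketch below is only an illustration and assumes the closed forms of $f$ and $h$ implied by (20) and (27); the function names are hypothetical. If the truncated series are correct, the scaled remainders (divided by $x^7$ near zero and multiplied by $x^3$ for large $x$) should stay bounded as $x$ varies.

```python
import math

PI2 = math.pi ** 2
PI4 = PI2 ** 2


def f(x):
    """Lower bound (assumed closed form, consistent with (27))."""
    return x / (4 / PI2 + math.sqrt((1 - 4 / PI2) ** 2 + 4 * x * x / PI2))


def h(x):
    """Upper bound (assumed closed form, consistent with (20))."""
    return x / (1 - 6 / PI2 + math.sqrt((6 / PI2) ** 2 + 4 * x * x / PI2))


def f_small(x):
    """Truncated Taylor series (8)."""
    return x - 2 / (PI2 - 4) * x ** 3 + 2 * (3 * PI2 - 8) / (PI2 - 4) ** 3 * x ** 5


def h_small(x):
    """Truncated Taylor series (10)."""
    return x - x ** 3 / 3 + (PI2 + 12) / 108 * x ** 5


def f_large(x):
    """Truncated asymptotic series (11)."""
    return math.pi / 2 - 1 / x - (PI4 - 8 * PI2 - 16) / (16 * math.pi * x * x)


def h_large(x):
    """Truncated asymptotic series (13)."""
    return math.pi / 2 - (PI2 - 6) / (4 * x) + (PI4 - 12 * PI2 + 18) / (8 * math.pi * x * x)


# Remainders scaled by the expected order; they should remain bounded.
small_ratios = [(abs(f(x) - f_small(x)) / x ** 7, abs(h(x) - h_small(x)) / x ** 7)
                for x in (0.2, 0.1, 0.05)]
large_ratios = [(abs(f(x) - f_large(x)) * x ** 3, abs(h(x) - h_large(x)) * x ** 3)
                for x in (50.0, 100.0, 200.0)]
print(small_ratios)
print(large_ratios)
```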
###### Lemma.

For all $x \in \mathbb{R}$, both bounds $f(x)$ and $h(x)$ are continuous.

###### Proof.

Both numerators and denominators of $f(x)$ and $h(x)$ are continuous functions in $x$ and the denominators are always non-zero, which implies the absence of discontinuities. ∎

###### Lemma.

For all $x \in \mathbb{R}$, both bounds $f(x)$ and $h(x)$ are strictly increasing.

###### Proof.

By differentiation we obtain the following first derivative

$$ \frac{\mathrm{d}}{\mathrm{d}x}\,\frac{x}{c_1 + \sqrt{c_2 + c_3 x^2}} = \frac{c_2 + c_1\sqrt{c_2 + c_3 x^2}}{\sqrt{c_2 + c_3 x^2}\,\bigl[c_1 + \sqrt{c_2 + c_3 x^2}\bigr]^2} . \tag{16} $$

This derivative is positive for all $x \in \mathbb{R}$ because $c_1$, $c_2$ and $c_3$ are positive constants in both bounds. Hence, the bounds are strictly increasing. ∎

###### Corollary.

Both bounds $f(x)$ and $h(x)$ are differentiable on $\mathbb{R}$, because the first derivative of the bounds exists due to the derivative in equation (16).

###### Corollary.

Both bounds $f(x)$ and $h(x)$ are limited, due to equation (15) and because of the monotonicity shown in the preceding lemma.

###### Corollary.

Both bounds $f(x)$ and $h(x)$ do not have any critical points, because they are strictly increasing and differentiable on $\mathbb{R}$.

###### Lemma.

For all $x \geq 0$, both bounds $f(x)$ and $h(x)$ are concave, while for all $x \leq 0$, both bounds are convex.

###### Proof.

By differentiation we obtain the following second derivative

$$ \frac{\mathrm{d}^2}{\mathrm{d}x^2}\,\frac{x}{c_1 + \sqrt{c_2 + c_3 x^2}} = -c_3 x\,\frac{3c_1c_2 + 2c_1c_3x^2 + 3c_2\sqrt{c_2 + c_3 x^2}}{(c_2 + c_3 x^2)^{3/2}\,\bigl[c_1 + \sqrt{c_2 + c_3 x^2}\bigr]^3} . \tag{17} $$

The sign of this derivative depends only on $x$ because $c_1$, $c_2$ and $c_3$ are positive constants in both bounds. Hence, this derivative is non-positive for all $x \geq 0$ and non-negative for all $x \leq 0$, which completes the proof. ∎
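The closed-form derivatives (16) and (17) can be compared against central finite differences. The following sketch does so for an arbitrary positive triple; the values in `coeffs` are hypothetical test values and not the constants of the bounds, and all function names are illustrative.

```python
import math


def bound(x, c1, c2, c3):
    """Function type (3): x / (c1 + sqrt(c2 + c3 * x^2))."""
    return x / (c1 + math.sqrt(c2 + c3 * x * x))


def first_derivative(x, c1, c2, c3):
    """Closed form of the first derivative, equation (16)."""
    s = math.sqrt(c2 + c3 * x * x)
    return (c2 + c1 * s) / (s * (c1 + s) ** 2)


def second_derivative(x, c1, c2, c3):
    """Closed form of the second derivative, equation (17)."""
    s = math.sqrt(c2 + c3 * x * x)
    numerator = -c3 * x * (3 * c1 * c2 + 2 * c1 * c3 * x * x + 3 * c2 * s)
    return numerator / (s ** 3 * (c1 + s) ** 3)


def central_difference(func, x, step=1e-5):
    """Second-order accurate numerical derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2 * step)


coeffs = (0.4, 0.36, 0.4)  # arbitrary positive triple (hypothetical values)
max_gap = 0.0
for x in (-3.0, -0.5, 0.0, 0.7, 2.0, 10.0):
    num1 = central_difference(lambda t: bound(t, *coeffs), x)
    num2 = central_difference(lambda t: first_derivative(t, *coeffs), x)
    max_gap = max(max_gap,
                  abs(num1 - first_derivative(x, *coeffs)),
                  abs(num2 - second_derivative(x, *coeffs)))
print(max_gap)
```

The largest gap should be on the order of the finite-difference error, and the sign pattern of `second_derivative` reflects the concavity for positive and the convexity for negative arguments.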

###### Corollary.

Both bounds $f(x)$ and $h(x)$ have the same unique inflection point at the origin, due to opposing convexities for $x \leq 0$ and $x \geq 0$, see the preceding lemma.

In the following enumeration, we now summarize the properties of the bounds that have been shown thus far.

1. The bounds are equal only at zero and they approach the same limit as $x$ approaches infinity.

2. Both bounds are point symmetric, continuous, strictly increasing, differentiable, and limited.

3. They are convex for all $x \leq 0$ and concave otherwise.

4. There are no critical points.

5. Both bounds have the same unique inflection point.

The above properties enable us to use the proposed bounds suitably in future works. It remains to show the tightness of the bounds with respect to the inverse tangent function. For this purpose we deduce an upper bound on the actual relative errors of the bounds in the following.

###### Definition.

For all $x \in \mathbb{R}$, the relative errors of the bounds given in (4) and (6) are defined by

$$ r_f(x) := \frac{g(x) - f(x)}{g(x)} \tag{18} $$

and

$$ r_h(x) := \frac{h(x) - g(x)}{g(x)} . \tag{19} $$

Note that $x = 0$ is a removable singularity for both ratios above because of approximations (8), (9) and (10). Thus, $r_f(x)$ and $r_h(x)$ are continuously extendable over $x = 0$. All following fractions are also continuously extendable over $x = 0$ in a similar manner, so that no difficulties related to singularities occur hereinafter.

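###### Theorem.

For all $x \in \mathbb{R}$, the inequalities

$$ \max\{r_f(x), r_h(x)\} \leq \frac{h(x) - f(x)}{f(x)} = \frac{10 - \pi^2 - 2\sqrt{9 + \pi^2 x^2} + \sqrt{(\pi^2 - 4)^2 + 4\pi^2 x^2}}{\pi^2 - 6 + 2\sqrt{9 + \pi^2 x^2}} \tag{20} $$

and

$$ \min\{r_f(x), r_h(x)\} \leq \frac{h(x) - f(x)}{h(x) + f(x)} = \frac{10 - \pi^2 - 2\sqrt{9 + \pi^2 x^2} + \sqrt{(\pi^2 - 4)^2 + 4\pi^2 x^2}}{\pi^2 - 2 + 2\sqrt{9 + \pi^2 x^2} + \sqrt{(\pi^2 - 4)^2 + 4\pi^2 x^2}} \leq \max\{r_f(x), r_h(x)\} \tag{21} $$

hold.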

###### Proof.

See Appendix B. ∎

Note that the inequalities (20) and (21) do not contain the inverse tangent function at all.

In Figure 2, the relative errors of the obtained bounds are shown. The maximum relative errors of the bounds are approximately smaller than 0.27 % and 0.23 % for $f(x)$ and $h(x)$, respectively. It is worthwhile mentioning that both bounds are valid for the whole domain of real numbers.
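The relative errors (18) and (19) and the inequalities (20) and (21) can be explored with a simple grid scan. The sketch below is an illustration under the assumption that $f$ and $h$ have the closed forms implied by (20) and (27); the grid, the tolerance and the helper names are arbitrary choices. On such a grid the maxima of the relative errors should come out close to the values visible in Figure 2.

```python
import math

PI2 = math.pi ** 2


def f(x):
    """Lower bound (assumed closed form, consistent with (27))."""
    return x / (4 / PI2 + math.sqrt((1 - 4 / PI2) ** 2 + 4 * x * x / PI2))


def h(x):
    """Upper bound (assumed closed form, consistent with (20))."""
    return x / (1 - 6 / PI2 + math.sqrt((6 / PI2) ** 2 + 4 * x * x / PI2))


def relative_errors(x):
    """Return (r_f(x), r_h(x)) as defined in (18) and (19), for x != 0."""
    g = math.atan(x)
    return (g - f(x)) / g, (h(x) - g) / g


def scan(xs, tol=1e-12):
    """Track the largest relative errors and test (20) and (21) on xs."""
    max_rf = max_rh = 0.0
    ok20 = ok21 = True
    for x in xs:
        rf, rh = relative_errors(x)
        max_rf, max_rh = max(max_rf, rf), max(max_rh, rh)
        upper = (h(x) - f(x)) / f(x)            # right-hand side of (20)
        middle = (h(x) - f(x)) / (h(x) + f(x))  # middle term of (21)
        ok20 = ok20 and max(rf, rh) <= upper + tol
        ok21 = ok21 and min(rf, rh) - tol <= middle <= max(rf, rh) + tol
    return max_rf, max_rh, ok20, ok21


xs = [k / 100 for k in range(1, 100001)]  # x from 0.01 to 1000
max_rf, max_rh, ok20, ok21 = scan(xs)
print(f"max r_f = {max_rf:.5f}, max r_h = {max_rh:.5f}, (20): {ok20}, (21): {ok21}")
```

Only positive arguments are scanned, which is sufficient because all involved functions are point symmetric.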

## III Conclusion

In the present work, we have investigated the approximation of the inverse tangent function and deduced two new bounds. We have derived a lower and an upper bound with simple closed-form formulae which are sharp and very accurate. Furthermore, we have presented some useful and important properties of the obtained bounds. These properties can be necessary in future works. Moreover, we have investigated the relative errors of the proposed bounds. The corresponding maximum relative errors of the bounds are approximately smaller than 0.27 % and 0.23 % for the lower bound and upper bound, respectively. These values show that the obtained bounds are very accurate and thus are suitably applicable in most engineering problems. Finally, we have illustrated some results in order to visualize the achieved gains.

## Appendix A Proof of the Bounds

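###### Lemma.

The transcendental number $\pi^2$ can be bounded by the double inequality

$$ \frac{29}{3} < \pi^2 < 10 . \tag{22} $$

###### Proof.

Both bounds have long been well known, see for example [10]. A new proof of the upper bound can be found in [11]. We here give an elementary proof of the lower bound. The identities $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$ and $\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1$, see for example [12, p. 8, eq. 0.233.3 and p. 12, eq. 0.244.3], are used to deduce

$$ \frac{\pi^2}{6} = 1 + \sum_{k=1}^{\infty} \frac{1}{k^2} - \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1 + \sum_{k=1}^{\infty} \frac{1}{k^2(k+1)} = 1 + \frac{1}{2} + \frac{1}{12} + \frac{1}{36} + \sum_{k=4}^{\infty} \frac{1}{k^2(k+1)} = \frac{29}{18} + \sum_{k=4}^{\infty} \frac{1}{k^2(k+1)} > \frac{29}{18} . \tag{23} $$

Hence $\pi^2 > \frac{29}{3}$ follows. ∎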
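The arithmetic behind the lower bound can be reproduced exactly with rational numbers. The following sketch sums the first terms of the series $1 + \sum_k 1/(k^2(k+1))$ with Python's `fractions` module; the function name `partial_sum` is illustrative.

```python
import math
from fractions import Fraction


def partial_sum(n_terms: int) -> Fraction:
    """Exact value of 1 + sum_{k=1}^{n_terms} 1 / (k^2 (k + 1))."""
    return 1 + sum(Fraction(1, k * k * (k + 1)) for k in range(1, n_terms + 1))


three_terms = partial_sum(3)
print(three_terms)                        # exact rational partial sum
print(6 * three_terms < math.pi ** 2 < 10)

# All terms are positive, so longer partial sums approach pi^2 / 6 from below.
gap = math.pi ** 2 / 6 - float(partial_sum(1000))
print(gap)
```

Since every term of the series is positive, each partial sum is a valid lower bound for $\pi^2/6$, and stopping after three terms gives the constant used in the lemma.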

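In the following, we denote the differences $g(x) - f(x)$ and $h(x) - g(x)$ by

$$ \Delta_f(x) := g(x) - f(x) \tag{24} $$

and

$$ \Delta_h(x) := h(x) - g(x) , \tag{25} $$

respectively. From the limits (14) and (15) it is immediately deduced that

$$ \lim_{x \to \pm 0} \Delta_f(x) = \lim_{x \to \pm 0} \Delta_h(x) = \lim_{x \to \pm\infty} \Delta_f(x) = \lim_{x \to \pm\infty} \Delta_h(x) = 0 . \tag{26} $$

By direct algebra the first derivatives of $\Delta_f(x)$ and $\Delta_h(x)$ are given as

$$ \frac{\mathrm{d}\Delta_f(x)}{\mathrm{d}x} = \frac{1}{1 + x^2} - \frac{1}{\frac{4}{\pi^2} + \sqrt{\bigl(1 - \frac{4}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}} + \frac{x^2}{\frac{\pi^2}{4}\sqrt{\bigl(1 - \frac{4}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}\,\Bigl(\frac{4}{\pi^2} + \sqrt{\bigl(1 - \frac{4}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}\Bigr)^2} \tag{27} $$

and

$$ \frac{\mathrm{d}\Delta_h(x)}{\mathrm{d}x} = \frac{1}{1 - \frac{6}{\pi^2} + \sqrt{\bigl(\frac{6}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}} - \frac{x^2}{\frac{\pi^2}{4}\sqrt{\bigl(\frac{6}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}\,\Bigl(1 - \frac{6}{\pi^2} + \sqrt{\bigl(\frac{6}{\pi^2}\bigr)^2 + \frac{4x^2}{\pi^2}}\Bigr)^2} - \frac{1}{1 + x^2} , \tag{28} $$

respectively.

###### Corollary.

The first derivatives of $\Delta_f(x)$ and $\Delta_h(x)$ vanish only at three real points, namely

$$ x_f \in \left\{ 0 , \pm\frac{(\pi^2 - 4)\sqrt{-2\pi^4 + 36\pi^2 - 160}}{\pi^4 - 8\pi^2 - 16} \right\} \tag{29} $$

and

$$ x_h \in \left\{ 0 , \pm\frac{\sqrt{-5\pi^4 + 108\pi^2 - 576}}{\pi(10 - \pi^2)} \right\} , \tag{30} $$

respectively.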

###### Proof.

We set (27) and (28) equal to zero and obtain the points in (29) and (30) by direct calculations. It remains to prove that all points are real. This is done by showing that the discriminant functions

$$ y_f(\nu) := -2\nu^4 + 36\nu^2 - 160 = 2(\nu^2 - 8)(10 - \nu^2) \tag{31} $$

and

$$ y_h(\nu) := -5\nu^4 + 108\nu^2 - 576 = (5\nu^2 - 48)(12 - \nu^2) \tag{32} $$

are non-negative for $\nu = \pi$. A curve tracing of $y_f(\nu)$ and $y_h(\nu)$ leads to the relationships

$$ y_f(\nu) \geq 0 \;\Leftrightarrow\; 8 \leq \nu^2 \leq 10 \tag{33} $$

and

$$ y_h(\nu) \geq 0 \;\Leftrightarrow\; \frac{48}{5} \leq \nu^2 \leq 12 , \tag{34} $$

respectively. Hence, both $y_f(\nu)$ and $y_h(\nu)$ are non-negative for all $\frac{48}{5} \leq \nu^2 \leq 10$. By comparing the latter double inequality with the double inequality (22), we deduce that $y_f(\pi)$ and $y_h(\pi)$ are non-negative, and hence all roots in (29) and (30) are real. ∎
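The non-zero stationary points in (29) and (30) can be evaluated numerically and checked against a finite-difference slope of the differences $g - f$ and $h - g$. The sketch assumes the closed forms of $f$ and $h$ implied by (20) and (27); all names are illustrative.

```python
import math

PI2 = math.pi ** 2
PI4 = PI2 ** 2


def f(x):
    """Lower bound (assumed closed form, consistent with (27))."""
    return x / (4 / PI2 + math.sqrt((1 - 4 / PI2) ** 2 + 4 * x * x / PI2))


def h(x):
    """Upper bound (assumed closed form, consistent with (20))."""
    return x / (1 - 6 / PI2 + math.sqrt((6 / PI2) ** 2 + 4 * x * x / PI2))


def delta_f(x):
    """Difference (24): arctan(x) - f(x)."""
    return math.atan(x) - f(x)


def delta_h(x):
    """Difference (25): h(x) - arctan(x)."""
    return h(x) - math.atan(x)


def slope(func, x, step=1e-5):
    """Central finite-difference approximation of the first derivative."""
    return (func(x + step) - func(x - step)) / (2 * step)


# Positive stationary points according to (29) and (30).
x_f = (PI2 - 4) * math.sqrt(-2 * PI4 + 36 * PI2 - 160) / (PI4 - 8 * PI2 - 16)
x_h = math.sqrt(-5 * PI4 + 108 * PI2 - 576) / (math.pi * (10 - PI2))

print(x_f, slope(delta_f, x_f))
print(x_h, slope(delta_h, x_h))
```

Both slopes should vanish up to the finite-difference error, and both discriminants under the square roots are positive because $\pi^2$ lies between $48/5$ and $10$.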

###### Corollary.

The difference $\Delta_f(x)$ is positive for all sufficiently small positive real numbers $x$. For all negative real numbers $x$ with sufficiently small absolute value, the difference $\Delta_f(x)$ is negative.

###### Proof.

We incorporate the equations (8) and (9) into (24) to derive the first-order approximation of $\Delta_f(x)$ as

$$ \Delta_f(x) \simeq \frac{10 - \pi^2}{3(\pi^2 - 4)}\,x^3 + \mathcal{O}(x^5) \tag{35} $$

for all sufficiently small values of $x$. From the double inequality (22), we deduce that the last ratio is always positive, which completes the proof. ∎

###### Corollary.

The difference $\Delta_h(x)$ is positive for all sufficiently large positive real numbers $x$. For all negative real numbers $x$ with sufficiently large absolute value, the difference $\Delta_h(x)$ is negative.

###### Proof.

We incorporate the equations (12) and (13) into (25) to derive the first-order asymptotic approximation of $\Delta_h(x)$ as

$$ \Delta_h(x) \simeq \frac{10 - \pi^2}{4}\,x^{-1} + \mathcal{O}(x^{-2}) \tag{36} $$

for all sufficiently large values of $x$. From the double inequality (22), we deduce that the last ratio is always positive, which completes the proof. ∎

###### Proof of the double inequality (7).

We only consider the case of $x \geq 0$. The case of $x \leq 0$ can be proved analogously, due to the point symmetry of all functions in (7). On the one hand, we know from the corollary on the stationary points (29) and (30) that each of the differences $\Delta_f(x)$ and $\Delta_h(x)$ has only one stationary point for all $x > 0$. On the other hand, each of them attains equal values at $x = 0$ and as $x \to \infty$, i.e., $\Delta_f(0) = \lim_{x \to \infty} \Delta_f(x) = 0$ and $\Delta_h(0) = \lim_{x \to \infty} \Delta_h(x) = 0$, according to equation (26). Hence, and because of the two preceding corollaries on the signs of $\Delta_f(x)$ and $\Delta_h(x)$, as $x$ increases from zero to infinity each of the differences $\Delta_f(x)$ and $\Delta_h(x)$ increases monotonically from zero to a maximum value and from there on decreases monotonically toward zero. Thus, both differences $\Delta_f(x)$ and $\Delta_h(x)$ are non-negative for all $x \geq 0$. In other words, if one of the differences had at least one sign change for some value of $x > 0$, then it would have at least two stationary points for $x > 0$, but this contradicts the curve tracing in the corollary on the stationary points. ∎

## Appendix B Proof of the Relative Errors

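###### Definition.

Let $r_f(x)$ and $r_h(x)$ be defined as in (18) and (19). Then, we define two auxiliary sets by

$$ \mathbb{I}_f := \{ x \in \mathbb{R} \mid r_f(x) \geq r_h(x) \} \tag{37} $$

and

$$ \mathbb{I}_h := \{ x \in \mathbb{R} \mid r_f(x) < r_h(x) \} . \tag{38} $$

Note that both sets $\mathbb{I}_f$ and $\mathbb{I}_h$ are disjoint and their union is the whole real domain.

###### Corollary.

For all $x \in \mathbb{I}_f$, $x \geq 0$, the inequality

$$ g(x) \geq \frac{h(x) + f(x)}{2} \tag{39} $$

holds. If $x \in \mathbb{I}_h$, $x \geq 0$, then the inequality

$$ g(x) < \frac{h(x) + f(x)}{2} \tag{40} $$

holds. In the case of $x \in \mathbb{I}_f$ with $x < 0$ and $x \in \mathbb{I}_h$ with $x < 0$ the above inequalities are reversed.

###### Proof.

For all $x \in \mathbb{I}_f$ with $x \geq 0$, it follows from the definitions (18), (19) and (37) that

$$ r_f(x) \geq r_h(x) \;\Leftrightarrow\; r_f(x)\,g(x) \geq r_h(x)\,g(x) \;\Leftrightarrow\; g(x) - f(x) \geq h(x) - g(x) \;\Leftrightarrow\; g(x) \geq \frac{h(x) + f(x)}{2} . \tag{41} $$

Similarly, for all $x \in \mathbb{I}_h$ with $x \geq 0$ it follows that

$$ r_f(x) < r_h(x) \;\Leftrightarrow\; r_f(x)\,g(x) < r_h(x)\,g(x) \;\Leftrightarrow\; g(x) - f(x) < h(x) - g(x) \;\Leftrightarrow\; g(x) < \frac{h(x) + f(x)}{2} . \tag{42} $$

In the case of $x < 0$, the functions $f(x)$, $g(x)$ and $h(x)$ are negative, and hence the inequalities in (39) and (40) are reversed. ∎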

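###### Proof of the inequalities (20) and (21).

The proof of inequality (20) follows from inequality (7) and the definitions (18) and (19). It gives

$$ r_f(x) = \frac{g(x) - f(x)}{g(x)} \leq \frac{h(x) - f(x)}{g(x)} \leq \frac{h(x) - f(x)}{f(x)} \tag{43} $$

and

$$ r_h(x) = \frac{h(x) - g(x)}{g(x)} \leq \frac{h(x) - f(x)}{g(x)} \leq \frac{h(x) - f(x)}{f(x)} \tag{44} $$

which in turn result in

$$ \max\{r_f(x), r_h(x)\} \leq \frac{h(x) - f(x)}{f(x)} . \tag{45} $$

The proof of inequality (21) follows from the definitions (18) and (19) and the inequalities (39) and (40). For all $x \in \mathbb{I}_f$ with $x \geq 0$, it gives

$$ r_f(x) = 1 - \frac{f(x)}{g(x)} \geq 1 - \frac{f(x)}{\frac{h(x) + f(x)}{2}} = \frac{h(x) - f(x)}{h(x) + f(x)} \tag{46} $$

and

$$ r_h(x) = \frac{h(x)}{g(x)} - 1 \leq \frac{h(x)}{\frac{h(x) + f(x)}{2}} - 1 = \frac{h(x) - f(x)}{h(x) + f(x)} \tag{47} $$

which in turn result in

$$ r_h(x) \leq \frac{h(x) - f(x)}{h(x) + f(x)} \leq r_f(x) . \tag{48} $$

If $x \in \mathbb{I}_h$ with $x \geq 0$, then it gives

$$ r_f(x) = 1 - \frac{f(x)}{g(x)} < 1 - \frac{f(x)}{\frac{h(x) + f(x)}{2}} = \frac{h(x) - f(x)}{h(x) + f(x)} \tag{49} $$

and

$$ r_h(x) = \frac{h(x)}{g(x)} - 1 > \frac{h(x)}{\frac{h(x) + f(x)}{2}} - 1 = \frac{h(x) - f(x)}{h(x) + f(x)} \tag{50} $$

which in turn result in

$$ r_f(x) < \frac{h(x) - f(x)}{h(x) + f(x)} < r_h(x) . \tag{51} $$

From the double inequalities (48) and (51), we deduce that

$$ \min\{r_f(x), r_h(x)\} \leq \frac{h(x) - f(x)}{h(x) + f(x)} \leq \max\{r_f(x), r_h(x)\} \tag{52} $$

for all $x \geq 0$. For the case of $x < 0$, the proof can be obtained analogously. The identities in (20) and (21) arise from straightforward calculations. ∎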

## Acknowledgment

Research described in the present work was supervised by Univ.-Prof. Dr. rer. nat. R. Mathar, Institute for Theoretical Information Technology, RWTH Aachen University. The author would like to thank him for his professional advice and patience.

## References

• [1] J. E. Volder, “The cordic trigonometric computing technique,” Electronic Computers, IRE Transactions on, vol. EC-8, no. 3, pp. 330–334, 1959.
• [2] R. Schmidt, “Multiple emitter location and signal parameter estimation,” Antennas and Propagation, IEEE Transactions on, vol. 34, no. 3, pp. 276–280, 1986.
• [3] H. Fu and P.-Y. Kam, “MAP/ML estimation of the frequency and phase of a single sinusoid in noise,” Signal Processing, IEEE Transactions on, vol. 55, no. 3, pp. 834–845, 2007.
• [4] X.-B. Zeng, Q.-M. Hu, J.-M. He, Q.-P. Tu, and X.-J. Yu, “High power RF amplifier’s new nonlinear models,” in Microwave Conference Proceedings, 2005. APMC 2005. Asia-Pacific Conference Proceedings, vol. 2, 2005.
• [5] B. Widrow and M. Lehr, “30 years of adaptive neural networks: perceptron, madaline, and backpropagation,” Proceedings of the IEEE, vol. 78, no. 9, pp. 1415–1442, 1990.
• [6] C. Hwang and H.-C. Chow, “Simplification of z-transfer function via Padé approximation of tangent phase function,” Automatic Control, IEEE Transactions on, vol. 30, no. 11, pp. 1101–1104, 1985.
• [7] R. E. Shafer, “Elementary problems: E 1867,” The American Mathematical Monthly, vol. 73, no. 3, p. 309, 1966.
• [8] F. Qi, S.-Q. Zhang, and B.-N. Guo, “Sharpening and generalizations of Shafer’s inequality for the arc tangent function,” Journal of Inequalities and Applications, vol. 2009, 2009.
• [9] N. Bleistein and R. A. Handelsman, Asymptotic Expansions of Integrals.   New York: Dover Publications, 1986.
• [10] O. Neugebauer, The Exact Sciences in Antiquity.   Dover Publications Incorporated, 1969.
• [11] N. D. Elkies, “Why is $\pi^2$ so close to 10?” The American Mathematical Monthly, vol. 110, no. 7, p. 592, 2003.
• [12] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 7th ed.   London: Academic Press, 2007.