Newton's Method

Type of physical science: Computation

Field of study: Numerical methods

Although linear equations can be solved directly using a finite number of algebraic operations, nonlinear equations often cannot be solved in closed form. Newton's method is an iterative technique for approximating the solution to a nonlinear equation by a sequence of solutions to linear equations.


Overview

When it is not possible to obtain a closed form solution to a given nonlinear equation, numerical methods can be used to approximate a solution. The most successful numerical schemes are often modeled on Sir Isaac Newton's method. The basic idea in Newton's method is to start with a guess for the desired solution and replace the nonlinear equation by a linear equation involving lines or planes that are tangent to the graph at the starting point. The solution to the linear approximating equation (it is hoped) provides an improved estimate for the solution to the nonlinear equation. This improved estimate for the root is used to obtain a new linearization. Again, it is hoped that the solution to the linear equation yields a still better approximation to the desired root. The iteration continues in this way.

Figure 1 shows Newton's method for the equation x² - 2 = 0 and the starting guess x1 = 8. In the first iteration, one constructs the tangent to the graph of y = x² - 2 at x1. The intersection of the tangent line with the horizontal x-axis is the new iterate x2. In the second iteration, one constructs the tangent to the graph at x2, and the intersection of the tangent line with the x-axis is x3. Observe that the iterations converge to the root where the graph crosses the x-axis and y is 0.

Now consider a general equation f(x) = 0. Let x1 denote the starting guess for a root. The equation of the line tangent to the graph of y = f(x) at x = x1 is y = f(x1) + f'(x1)(x - x1), where f', the derivative of f, denotes the slope of the tangent to the graph.

Since a root of the equation f(x) = 0 is a point on the graph of y = f(x) where y is zero, set y = 0 in the linearization to obtain an approximation to the original nonlinear equation: f(x1) + f'(x1)(x - x1) = 0. The root of this linear equation, denoted x2, is the new approximation to the root of the nonlinear equation. Thus x2 satisfies the equation f(x1) + f'(x1)(x2 - x1) = 0. The difference x2 - x1, called the Newton step, is denoted s1. In terms of the Newton step, the iteration can be expressed x2 = x1 + s1, where f'(x1)s1 = -f(x1). To generate x3, one linearizes about x2 to obtain x3 = x2 + s2, where f'(x2)s2 = -f(x2). The iteration continues in the same way. At iteration k, the new iterate xk+1 is expressed xk+1 = xk + sk, where f'(xk)sk = -f(xk). One can divide by the derivative to get an explicit formula for the Newton step: sk = -f(xk)/f'(xk), so that the Newton iteration reduces to xk+1 = xk - f(xk)/f'(xk).

This linearization technique applies to very general equations; in every case, nonlinear functions are replaced by linear functions to compute the Newton step. Consider a system of two equations in two unknowns x and y: f(x, y) = 0 and g(x, y) = 0. Now there are two functions, f and g, to linearize, and Newton's iteration can be expressed:

xk+1 = xk + sk, where fx(xk, yk)sk + fy(xk, yk)tk = -f(xk, yk)

yk+1 = yk + tk, where gx(xk, yk)sk + gy(xk, yk)tk = -g(xk, yk).

Here fx and fy stand for the partial derivatives with respect to x and y, respectively. Since there are two unknowns, x and y, the Newton step has two components, sk and tk. Although the equations for the components of the Newton step appear complicated, computers can easily solve for sk and tk since the equations are linear. Each of the equations is the equation of a line with sk on the vertical axis and tk on the horizontal axis. The Newton step corresponds to the intersection of the lines.
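Because the Newton step solves a pair of linear equations, it can be computed in closed form by Cramer's rule. The following Python sketch illustrates the two-dimensional iteration; the function name `newton2`, its argument layout, and the stopping tolerance are illustrative choices, not taken from the article.

```python
# Newton's method for the 2-by-2 system f(x, y) = 0, g(x, y) = 0.
# Each iteration solves the linear system
#   fx*s + fy*t = -f,   gx*s + gy*t = -g
# for the Newton step (s, t), here via Cramer's rule.
def newton2(f, g, fx, fy, gx, gy, x, y, tol=1e-12, max_iter=50):
    for _ in range(max_iter):
        det = fx(x, y) * gy(x, y) - fy(x, y) * gx(x, y)
        s = (-f(x, y) * gy(x, y) + g(x, y) * fy(x, y)) / det
        t = (-g(x, y) * fx(x, y) + f(x, y) * gx(x, y)) / det
        x, y = x + s, y + t          # take the Newton step
        if abs(s) + abs(t) < tol:    # stop when the step is negligible
            break
    return x, y
```

For example, applied to f(x, y) = x² + y² - 4 and g(x, y) = x - y from the starting guess (2, 1), the iterates converge to (√2, √2).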

Typically, Newton's method converges only when the starting guess is sufficiently close to the desired root of the equation. The collection of starting guesses for which Newton's method converges is its domain of convergence. To expand the domain of convergence, a damping parameter τ can be inserted in the iteration. In the case of one equation in one unknown, the damped iteration is xk+1 = xk + τsk, where sk is the usual Newton step, and τ is a positive parameter, less than or equal to 1, chosen to enhance convergence. The damping parameter is frequently chosen using "Armijo's rule," which can be stated in the following way: τ = 2^(-j), where j is the first nonnegative integer for which |f(xk + τsk)| ≤ (1 - τ/2)|f(xk)|.
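The damped iteration with Armijo's rule can be sketched as follows; the function name `damped_newton` and the tolerance defaults are illustrative assumptions.

```python
# Damped Newton iteration with Armijo's rule: try tau = 1, 1/2, 1/4, ...
# until |f(x + tau*s)| <= (1 - tau/2)|f(x)|, then take the damped step.
def damped_newton(f, fprime, x, tol=1e-12, max_iter=100):
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        s = -fx / fprime(x)          # full Newton step s_k
        tau = 1.0
        while abs(f(x + tau * s)) > (1 - tau / 2) * abs(fx):
            tau /= 2                 # Armijo backtracking: tau = 2**(-j)
            if tau < 1e-12:          # safeguard: accept a tiny step
                break
        x = x + tau * s
    return x
```

For f(x) = arctan(x) with starting guess x1 = 3, the undamped Newton iteration overshoots and diverges, while the damped iteration converges to the root at 0.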

If f' does not vanish at the desired root of the equation, and f is a nice smooth function, then Newton's method is quadratically convergent when the starting guess is sufficiently close to a root. That is, if ek denotes the error at step k, there exists a constant C such that ek+1 ≤ C·ek². Modulo the constant C, the error is squared in each iteration. Even when f' vanishes at a root, Newton's method can converge; however, the convergence is typically linear. That is, there exists a constant c < 1 such that ek+1 ≤ c·ek.

When implementing Newton's method in specific applications, often the most difficult part of the implementation is the evaluation of derivatives. In quasi-Newton methods, these derivatives are approximated using differences of function values. In the case of one equation and one unknown, the simplest quasi-Newton scheme is obtained using the approximation f'(xk) ≈ [f(xk) - f(xk-1)]/(xk - xk-1). With this substitution in Newton's method, we have xk+1 = xk + sk, where {[f(xk) - f(xk-1)]/(xk - xk-1)}sk = -f(xk).

This scheme is called the secant method, since the derivative at xk is approximated by the slope of the chord through the previous two iterates.
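The secant iteration can be sketched in Python as follows; the function name `secant` and the stopping rule are illustrative, and the method needs two starting guesses rather than one.

```python
# Secant method: the derivative in Newton's iteration is replaced by the
# slope of the chord through the two most recent iterates.
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        slope = (f1 - f0) / (x1 - x0)   # chord slope approximating f'(x1)
        x0, f0 = x1, f1
        x1 = x1 - f1 / slope            # quasi-Newton step
        f1 = f(x1)
        if abs(x1 - x0) < tol:
            break
    return x1
```

Applied to x² - 2 = 0 with starting guesses 8 and 7, the iterates converge to √2.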

Another way to simplify Newton's iteration is to reuse an old derivative rather than compute the derivative at the current iterate. For example, if the derivative at the starting point x1 is used repeatedly, one obtains the iteration xk+1 = xk + sk, where f'(x1)sk = -f(xk). In this scheme, called modified Newton's method, the derivative is evaluated at x1 in each iteration, instead of at xk. The modified Newton iteration (typically) converges when the starting guess is sufficiently close to a root, although the convergence is slower than that of Newton's method; as a general rule, the modified iteration converges linearly while the true Newton iteration converges quadratically.
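A sketch of the modified scheme, with the derivative frozen at the starting point; the name `modified_newton` and the iteration limits are illustrative.

```python
# Modified Newton's method: the derivative is evaluated once, at the
# starting point x1, and reused in every iteration (linear convergence).
def modified_newton(f, fprime, x, tol=1e-12, max_iter=200):
    d = fprime(x)            # frozen derivative f'(x1)
    for _ in range(max_iter):
        step = -f(x) / d     # step solves f'(x1)*s_k = -f(x_k)
        x = x + step
        if abs(step) < tol:
            break
    return x
```

For x² - 2 = 0 with starting guess 1.5, the frozen derivative is 3, and the iterates still converge to √2, only more slowly than with the true Newton iteration.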

Applications

Newton's method is used to solve a nonlinear equation by means of the solutions to a sequence of linear equations. If the original equation is already linear, Newton's method is inappropriate: the linearization coincides with the equation itself, so the equation can simply be solved directly.

For example, if an electrical circuit is composed of batteries and resistors that obey Ohm's law, then current through a resistor is proportional to voltage. Since current is a linear function of voltage, the equations describing the currents and voltages are linear, and Newton's method does not apply. Newton's method is appropriate whenever the underlying physical principle is nonlinear.

In modeling semiconductor devices, systems of equations involving exponentials often arise. Consider the equation x = 1 - x², which does not contain an exponential but illustrates the ideas nevertheless. This equation has a root near x = 0.618 . . . that can be computed by Newton's method; since f(x) = x - 1 + x², the iteration reduces to xk+1 = xk - (xk - 1 + xk²)/(1 + 2xk). Starting from x1 = 0, one obtains the iterates given in Table 1. Observe that the number of correct digits nearly doubles in each iteration; this behavior is characteristic of quadratic convergence. This doubling property usually occurs in a neighborhood of the root; far from the root, the convergence can be much slower.

Table 1. Newton iterates for x = 1 - x², starting from x1 = 0: x1 = 0, x2 = 1, x3 = 0.66667, x4 = 0.61905, x5 = 0.61803, . . .
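The iterates of Table 1 can be reproduced with a short computation; the variable names here are illustrative.

```python
# Newton iterates for f(x) = x - 1 + x^2, starting from x1 = 0.
# The root is (sqrt(5) - 1)/2 = 0.6180339887...; printing the iterates
# shows the number of correct digits roughly doubling each step.
x = 0.0
iterates = [x]
for _ in range(6):
    x = x - (x - 1 + x * x) / (1 + 2 * x)   # Newton update
    iterates.append(x)
```

The computed sequence begins 0, 1, 0.66667, 0.61905, 0.61803 . . ., exhibiting the doubling of correct digits characteristic of quadratic convergence.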

Another application of Newton's method is to the evaluation of elementary functions by a computer or a calculator. A common elementary function that is available on many calculators and in many computer programming languages is the square root function. To be specific, one can examine the problem of computing the square root of 2. The basic idea is the following: One devises an equation whose solution is the square root of 2 and solves the equation by Newton's method. One of the simplest equations with solution √2 is x² - 2 = 0. Newton's method for this equation reduces to xk+1 = xk/2 + 1/xk. An iteration like this, involving multiplications, divisions, additions, or subtractions, can be implemented in the electronic circuits of a calculator or coded in the machine language of a computer.
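The square root iteration is short enough to show in full; the name `sqrt2` and the fixed iteration count are illustrative assumptions.

```python
# Newton's iteration x_{k+1} = x_k/2 + 1/x_k for x^2 - 2 = 0, using only
# division, addition, and halving; a few iterations from x = 1 suffice
# to reach double precision.
def sqrt2(x=1.0, iterations=6):
    for _ in range(iterations):
        x = x / 2 + 1 / x
    return x
```

The same pattern, with x² - a = 0 in place of x² - 2 = 0, underlies square root routines on many calculators.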

In other applications, the equations that one wishes to solve are obtained through the discretization of a differential equation. Consider the following boundary-value problem associated with semiconductors: d²x(t)/dt² = -e^(-x(t)), with x(0) = 0 and x(1) = 0. In other words, find a function x(t), defined for t between 0 and 1, whose second derivative at any t has the form -e^(-x(t)). One technique for approximating this function is to partition the interval [0, 1] into N subintervals and replace the continuous equation by a finite difference approximation. If ti = i/N are the subdivision points, the following finite difference approximations to the original boundary-value problem are commonly employed: xi+1 - 2xi + xi-1 = -e^(-xi)/N², 1 ≤ i ≤ N - 1, where x0 = 0 and xN = 0, and xi is an approximation to x(ti). For each value of N, there is a different system of equations, and as N increases, the solution to the finite difference system approaches the continuous function x(t). Because this system of equations is nonlinear, Newton's method can be applied.
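As an illustrative sketch, Newton's method for the finite difference system can be coded as follows. The function name `solve_bvp`, the choice N = 50, and the use of the Thomas algorithm to solve the tridiagonal linearized system are assumptions, not from the article; the Jacobian of the system is tridiagonal because equation i involves only xi-1, xi, and xi+1.

```python
import math

def solve_bvp(N=50, newton_iters=20):
    """Newton's method for x_{i+1} - 2x_i + x_{i-1} = -exp(-x_i)/N^2,
    1 <= i <= N-1, with boundary values x_0 = x_N = 0."""
    x = [0.0] * (N + 1)                  # initial guess satisfying the BCs
    for _ in range(newton_iters):
        # residual F_i and Jacobian diagonal for i = 1..N-1
        # (sub- and super-diagonal Jacobian entries are all 1)
        F = [x[i+1] - 2*x[i] + x[i-1] + math.exp(-x[i]) / N**2
             for i in range(1, N)]
        diag = [-2 - math.exp(-x[i]) / N**2 for i in range(1, N)]
        # Thomas algorithm for the tridiagonal linear system J s = -F
        n = N - 1
        c = [0.0] * n                    # modified super-diagonal
        d = [0.0] * n                    # modified right-hand side
        c[0] = 1 / diag[0]
        d[0] = -F[0] / diag[0]
        for i in range(1, n):
            denom = diag[i] - c[i-1]
            c[i] = 1 / denom
            d[i] = (-F[i] - d[i-1]) / denom
        s = [0.0] * n
        s[n-1] = d[n-1]
        for i in range(n - 2, -1, -1):   # back substitution
            s[i] = d[i] - c[i] * s[i+1]
        for i in range(1, N):            # take the Newton step
            x[i] += s[i - 1]
    return x
```

Because the full linear system is never formed, each Newton step costs only a few arithmetic operations per grid point, which is how such discretized problems are solved in practice.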

Context

Newton's method is named in honor of Sir Isaac Newton, who used it as early as 1669 to solve various equations. He applied it both to an equation of Johannes Kepler and to polynomial equations like xⁿ = α. Since the solution of this equation is the nth root of α, he was able to compute, to any desired level of accuracy, roots of a given number. Computing in the 1600's was done by hand, and the rapid convergence of Newton's method made it an attractive method.

In the systematic development of Newton's method, modern mathematicians obtained the secant scheme by simplifying Newton's method, replacing Newton's derivative by a difference quotient. Historically, however, the secant method came first. The secant method first appeared in work of Franciscus Vieta in 1600, and Newton's method was a modification of this idea, in which the difference quotient was replaced by Newton's derivative. Despite the significance of Vieta's method, it was described in the 1670's as "work unfit for a Christian and more proper to one that can undertake to remove the Italian Alps into England." The extension of the Newton linearization approach to general abstract equations was studied in the 1950's by the Soviet mathematician Kantorovich.

Although Newton's method has been used for several hundred years to approximate solutions of nonlinear equations, its significance increased immensely with the introduction of the computer in the 1940's. By means of a computer, large linear systems of equations can be solved quickly, and, in particular, linearizations associated with Newton's method are solved routinely. As computer technology evolves, it is anticipated that Newton's method will continue to play a fundamental role in computational methods for science and engineering.

Principal terms

DERIVATIVE: the slope of a curve

DIVERGENT METHOD: a numerical method that does not approach the desired solution

DOMAIN OF CONVERGENCE: the starting guesses for which a numerical method converges

ITERATION: the operation of transforming one number into another number using some given formula

LINEAR CONVERGENCE: the error in a numerical method decreases by a constant factor less than one in each iteration

LINEAR EQUATION: the equation describing a line or a plane

LINEARIZATION: an approximation to an equation obtained by replacing nonlinear curves (or surfaces) by tangent lines (or planes)

NONLINEAR EQUATION: an equation that is not linear

QUADRATIC CONVERGENCE: a numerical method approaches the desired solution and the error is roughly squared in each iteration

ROOT: a solution to an equation

SECOND DERIVATIVE: the rate of change in the slope, an indication of the curvature

Bibliography

Dennis, J. E., Jr., and Robert B. Schnabel. NUMERICAL METHODS FOR UNCONSTRAINED OPTIMIZATION AND NONLINEAR EQUATIONS. Englewood Cliffs, N.J.: Prentice-Hall, 1983. Provides a nice treatment of both Newton and quasi-Newton methods. The treatment is more technical than that of Hager, but less technical than the book by Ortega and Rheinboldt.

Goldstine, Herman H. A HISTORY OF NUMERICAL ANALYSIS FROM THE SIXTEENTH THROUGH THE NINETEENTH CENTURY. New York: Springer-Verlag, 1977. Attempts to trace the development of numerical analysis during the period in which its foundation was being developed. Newton's method is one of many methods that are discussed.

Hager, William W. APPLIED NUMERICAL LINEAR ALGEBRA. Englewood Cliffs, N.J.: Prentice-Hall, 1988. Chapter 4 provides an easy-to-read development of Newton's method.

Larson, Roland E., Robert P. Hostetler, and Bruce H. Edwards. CALCULUS. Lexington, Mass.: D. C. Heath, 1990. A very popular introduction to calculus. Newton's method appears near the beginning, after derivatives have been introduced.

Ortega, James M., and Werner C. Rheinboldt. ITERATIVE SOLUTION OF NONLINEAR EQUATIONS IN SEVERAL VARIABLES. New York: Academic Press, 1970. A comprehensive mathematical treatment of methods for solving systems of equations.

Newton's method for x² - 2 = 0 with starting guess x1 = 8

Newton's method for x = 1 - x²

Numerical Solutions of Differential Equations

Essay by William W. Hager