MATH2070: LAB #4: Newton's method

1 Introduction
The bisection method is very reliable, but it can be relatively slow, and it does not generalize easily to more
than one dimension. The secant and Muller’s methods are faster, but still do not generalize easily to multiple
dimensions. In this lab we will look at Newton’s method for finding roots of functions. Newton’s method
naturally generalizes to multiple dimensions and can be much faster than bisection. On the negative side,
it requires a formula for the derivative as well as the function, and it can easily fail. Nonetheless, it is a
workhorse method in numerical analysis. We will be taking three sessions to complete this lab.
2 Stopping Tests
Root finding routines check after each step to see whether the current result is good enough. The tests that
are made are called “termination conditions” or “stopping tests”. Common tests include:
• Residual size: |f(x)| < ε
• Increment size: |x_new − x_old| < ε
• Number of iterations: itCount > ITMAX
The size of the residual looks like a good choice because the residual is zero at the solution; however, it turns
out to be a poor choice because the residual can be small even if the iterate is far from the true solution.
You saw this when you found the root of the function f5(x) = (x − 1)^5 in the previous lab.
The size of the increment is a reasonable choice because Newton's method (usually) converges quadratically, and when it does, the increment is an excellent approximation of the true error. The third stopping criterion, when the number of iterations exceeds a maximum, is a safety measure that ensures the iteration always terminates in finite time.
It should be emphasized that the stopping criteria are estimated errors. In this lab, in class, and in
later labs, you will see many other expressions for estimating errors. You should not confuse the estimated
error with the true error, |x_approx − x_exact|. The true error is not included among the stopping tests
because you would need to know the exact solution to use it.
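To make the three tests concrete, here is a minimal sketch of how they typically appear together in a root-finding loop. The names find_root and step are illustrative, not part of the lab files; EPSILON and ITMAX stand in for the tolerance and iteration limit used later in this lab.

    import numpy as np

    EPSILON = 5.0e-5      # tolerance (the same value appears later in this lab)
    ITMAX = 100           # maximum number of iterations

    def find_root(f, x, step):
        """Illustrative only: `step` maps the current iterate to the next one."""
        for itCount in range(1, ITMAX + 1):
            xnew = step(x)
            if np.abs(f(xnew)) < EPSILON:    # residual-size test
                return xnew, itCount
            if np.abs(xnew - x) < EPSILON:   # increment-size test
                return xnew, itCount
            x = xnew
        return x, ITMAX                      # iteration budget exhausted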
3 Failure
You know that the bisection method is very reliable and rarely fails, but it always takes a (sometimes large) fixed number of steps. Newton's method works more rapidly a good deal of the time, but it can fail. And Newton's method works in more than one dimension. One objective of this lab is to see how Newton's method can fail and to get a flavor of what might be done about it.
Although we will seem to spend a lot of time looking at failures, you should still expect that Newton's method will converge most of the time. It is just that when it converges there is nothing special to say, but when it fails to converge there are always the interesting questions of "Why?" and "How can I remedy it?"
4 Introduction to Newton’s Method
The idea of Newton's method is that, starting from a guessed point x0, we find the equation of the straight line that passes through the point (x0, f(x0)) and has slope f'(x0). The next iterate, x1, is simply the root of this linear equation, i.e., the location where the straight line intersects the x-axis.
A "quick and dirty" derivation of the formula can be taken from the definition of the derivative of a function. Assume you are at the point x^(k). (I am using superscripts in parentheses to denote iteration number, to reduce confusion between iteration number and vector components.) Then

    (f(x^(k) + Δx) − f(x^(k))) / Δx ≈ f'(x^(k)).
If f were linear, this approximate equality would be true equality and the next iterate, x^(k+1), would be the exact solution, satisfying f(x^(k+1)) = 0. In other words,

    f(x^(k+1)) = f(x^(k) + Δx) = 0.
This yields

    −f(x^(k)) / Δx ≈ f'(x^(k)),

or

    Δx = x^(k+1) − x^(k) = −f(x^(k)) / f'(x^(k)).    (1)
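For example, take f(x) = x^2 − 9 and x^(0) = 1, so f(x^(0)) = −8 and f'(x^(0)) = 2. Equation (1) gives Δx = −(−8)/2 = 4, so x^(1) = 5; the next step gives x^(2) = 5 − 16/10 = 3.4, already close to the root x = 3.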
As you know, Newton’s method also will work for vectors, so long as the derivative is properly handled.
Assume that x and f are n-vectors. Then the Jacobian matrix is the matrix J whose elements are

    J_ij = ∂f_i / ∂x_j.
(Here, i = 1 for the first row of J, i = 2 for the second row of J, etc., so that the matrix subscripts correspond
to the usual ones.) Thus, Newton’s method for vector functions can be written
    x^(k+1) − x^(k) = −J(x^(k))^(−1) f(x^(k)).    (2)
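As a sketch of Equation (2) in Python (the two-equation system below is only an illustration, not one of this lab's files), note that in practice one solves the linear system J(x^(k)) Δx = −f(x^(k)) rather than forming the inverse:

    import numpy as np

    def f(x):
        # example system: x0^2 + x1^2 - 4 = 0 and x0 - x1 = 0
        return np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])

    def J(x):
        # Jacobian of the example system, J[i, j] = d f_i / d x_j
        return np.array([[2.0 * x[0], 2.0 * x[1]],
                         [1.0, -1.0]])

    x = np.array([1.0, 2.0])                 # initial guess
    for k in range(10):
        dx = np.linalg.solve(J(x), -f(x))    # solve J dx = -f instead of inverting J
        x = x + dx
    print(x)                                 # approaches (sqrt(2), sqrt(2))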
5 Writing Python code for functions
Newton's method requires both the function value and its derivative, unlike the bisection method, which requires only the function value. We can extend our existing function files with extra code that returns the derivative. For instance, consider the file f1.py from the previous lab. Here's how we can modify it so it can be used for either bisection or Newton's method:
    def f1(x):
        """Computes y = x^2 - 9."""
        y = x**2 - 9
        return y

    def f1p(x):
        """Computes the derivative of y = x^2 - 9."""
        yprime = 2 * x
        return yprime
Now, when we set up a problem, if we just need the function value, we will use a statement like
    from f1 import f1
but for Newton’s method, where we need both function and derivative, we will ask for both definitions.
    from f1 import f1, f1p
Note that from f1 is telling us to look in the file f1.py, but import f1, f1p is telling us to look inside that
file for definitions of those names.
6 Exercise 1
In this exercise, you will modify function files from the previous lab to include the definition of the derivative.
1. Copy the files f0.py, f1.py, f2.py, f3.py and f4.py from the previous lab. Modify each file to include
derivative definitions f0p, f1p and so on. Recall that the functions are:
f0: y=x-1 yprime=???
f1: y=x^2-9 yprime=???
f2: y=x^5-x-1 yprime=???
f3: y=x*exp(-x) yprime=???
f4: y=2*cos(3*x)-exp(x) yprime=???
2. Use your new codes to compute the values of the function and derivative at x=-1 for each of these functions. (A sketch of this check for f1 appears below.)
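For instance, assuming the modified f1.py shown above, the check for f1 looks like this:

    from f1 import f1, f1p

    print(f1(-1.0))    # (-1)^2 - 9 = -8
    print(f1p(-1.0))   # 2 * (-1) = -2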
There are several other programming strategies for computing derivatives.
• You could compute the derivative value in a new file with a new name such as df0, df1, etc.
• You could compute the derivative using some sort of divided difference formula. This method is useful
for very complicated functions.
• You could use the symbolic capabilities of the Python package sympy to symbolically differentiate the functions. This method seems attractive, but it would greatly increase the time spent in function evaluation. (A sketch of this approach appears after this list.)
• There are automated ways to discover the formula for a derivative from the file or symbolic formula
defining the function, and these can be used to generate the file for the derivative automatically.
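As a minimal sketch of the sympy approach (none of this is required for the lab, and the name f1p_sym is illustrative):

    import sympy as sp

    x = sp.symbols('x')
    expr = x**2 - 9                 # symbolic form of f1
    dexpr = sp.diff(expr, x)        # symbolic derivative: 2*x

    # lambdify turns the symbolic expression into an ordinary numeric function
    f1p_sym = sp.lambdify(x, dexpr)
    print(f1p_sym(-1.0))            # prints -2.0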
7 Exercise 2
In this exercise, you will get down to the task of writing Newton's method as a function file. In this file, you will see how to give an argument a default value in order to simplify later calls. The maximum number of iterations will be an optional argument; when it is omitted, it takes on its default value.
The function you are about to write is shorter than bisect.py and the instructions take you through it
step by step.
1. Create a new, empty file named newton1.py.
2. Start off the function with its signature line and follow that with comments. The comments should
repeat the signature as a comment, contain a line or two explaining what the function does, explain
what each of the parameters mean, and include your name and the date.
    def newton1(func, dfunc, x0, maxIts=100):
        """Uses Newton's method to find x so that func(x) = 0.

        Args:
            func: Function to find the root of.
            dfunc: defines the derivative of the function.
            x0: The initial guess.
            maxIts: The maximum number of iterations. Default is 100.

        Returns:
            x: ???
            numIts: ???
        """
        import numpy as np
(You may have to complete this exercise before coming back to explain all of the parameters.)
3. From a programming standpoint, the iteration should be limited to a fixed (large) number of steps.
The reason for this is that if it does not meet some termination criterion it will run forever.
4. Define the convergence criterion
    EPSILON = 5.0e-5
5. Initialize the solution estimate:
x = x0
6. Some people like to use while loops so that the stopping criterion appears at the beginning of the
loop. My opinion is that if you are going to count the iterations you may as well use a for statement.
Start off the loop with
    for numIts in range(1, maxIts + 1):
7. Evaluate the function and its derivative at the current point x, much as was done in bisect.py. You
may choose the variable names as you please.
8. Define a Python variable increment as the negative of the function value divided by the value of its
derivative. This is the right side of Equation (1).
9. Complete equation (1) with the statement
    x += increment
10. Finish the loop and the file with the following statements
    errorEstimate = np.abs(increment)
    print(f"{numIts}, x = {x}, error estimate = {errorEstimate}")
    if errorEstimate < EPSILON:
        return x, numIts

When convergence is only linear, as it is for functions with repeated roots, the increment can badly underestimate the true error. A better estimate can be built from the ratio of successive increments, r1 = |Δx^(k+1)| / |Δx^(k)|. If r1 settles down to a constant value below 1, then for k > n,

    Δx^(k) = r1^(k−n) Δx^(n).

Hence,

    x^(∞) = x^(n) + Δx^(n) (1 + r1 + r1^2 + · · ·) = x^(n) + Δx^(n) / (1 − r1).

This equation indicates that a good error estimate would be

    |x^(∞) − x^(n)| ≈ |Δx^(n)| / (1 − r1).
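For example, if r1 = 0.9 and the latest increment is |Δx^(n)| = 10^−5, the increment alone suggests an error of about 10^−5, while the estimate above gives 10^−5 / (1 − 0.9) = 10^−4, ten times larger; stopping on the raw increment would quit too early.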
10 Exercise 5
We are now going to make yet another version of the code, so that we can use the new error estimate idea.
Make a copy of newton2.py called newton3.py.
1. Inside the file, change the name of the function from newton2() to newton3().
2. Comment out the print statement displaying r1 and r2, since it will become a distraction when large
numbers of iterations are needed.
3. Conveniently, r1 either goes to zero or remains bounded. If the sequence converges, r1 should remain below 1, or at least its average should remain below 1. Replace the if-test for stopping with
    if errorEstimate < EPSILON * (1 - r1):
        return x, numIts

Note: This code is mathematically equivalent to errorEstimate = abs(increment)/(1 - r1), but I have multiplied through by (1 − r1) to avoid the problems that occur when r1 ≥ 1. Convergence will never be indicated when r1 ≥ 1 because errorEstimate is non-negative.

4. Use newton3() to fill in the following table, where you can compute the absolute value of the true error because you can easily guess the exact solution. Use maxIts=1000 for this exercise. You have already done these cases using the original convergence criterion in Exercise 3, so you can get those values from that exercise.

                         newton1   newton1    newton3   newton3
   Function       start  numIts    true err   numIts    true error
   f1=x^2-9       0.1    ______    ______     _______   _______
   f6=(x-4)^2     0.1    ______    ______     _______   _______
   f7=(x-4)^20    0.1    ______    ______     _______   _______

You should see that the modified convergence criterion does not harm quadratic convergence (about the same number of iterations is required) and greatly improves the estimated error and stopping criterion in the linear case. A reliable error estimate is almost as important as the correct solution; if you don't know how accurate a solution is, what can you do with it?

This modification of the stopping criterion is very nice when r1 settles down to a constant value quickly. In real problems, a great deal of care must be taken because r1 can cycle among several values, some larger than 1, or it can take a long time to settle down.

11 Choice of initial guess

The theorems about Newton's method generally start off with the assumption that the initial guess is "close enough" to the solution. Since you don't know the solution when you start, how do you know when it is "close enough"? In one-dimensional problems, the answer is basically that if you stay away from places where the derivative is zero, then any initial guess is OK. More to the point, if you know that the solution lies in some interval and f'(x) ≠ 0 on that interval, then the Newton iteration will converge to the solution, starting from any point in the interval. When there are zeros of the derivative nearby, Newton's method can display highly erratic behavior and may or may not converge. In the last part of the previous exercise, you saw a case where there are several roots, with zeros of the derivative between them, and moving the initial guess to the right moved the chosen root to the left.

12 Exercise 6

In this and the following exercise, you will be interested in the sequence of iterates, not just the final result. Re-enable the print statement displaying the values of the iterates in newton3().

1. Copy the file cosmx.py from the previous lab, which defines cosmx as the function f(x) = cos x − x. Modify the file to add a definition cosmxp which is the derivative of the function.

2. Use newton3() to find the root of cosmx starting from the initial value x=0.5. What is the solution and how many iterations did it take? (If it took more than ten iterations, go back and be sure your formula for the derivative is correct.)

3. Again, use newton3() to find the root of cosmx, but start from the initial value x=12. Note that 3π < 12 < 4π, so there are several zeros of the derivative between the initial guess and the root. You should observe that it takes the maximum number of iterations and seems not to converge.

4. Try the same initial value x=12, but also use maxIts=5000.
(This is going to cause a large number of lines of printed information, but you are going to look at some of those lines.) Does it locate a solution in fewer than 5000 iterations? How many iterations does it take? Does it get the same root as before?

5. Look at the sequence of values of x that Newton's method chose when starting from x=12. There is no real pattern to the numbers, and it is pure chance that finally put the iteration near the root. Once it is "near enough," of course, it finds the root quickly, as you can see from the estimated errors. Is the final estimated error smaller than the square of the immediately preceding estimated error?

You have just observed a common behavior: the iterations seem to jump about without any real pattern until, seemingly by chance, an iterate lands inside the circle of convergence and they converge rapidly. This has been described as "wandering around looking for a good initial guess." It is even more striking in multidimensional problems, where the chance of eventually landing inside the ball of convergence can be very small.

13 Exercise 7

Another possible behavior is simple divergence to infinity. The following exercise presents a case of divergence to infinity. Attempt to find the root of f3 = x*exp(-x), starting from x=2. You should find it diverges in a monotone manner, so it is clear that the iterates are unbounded. What are the values of iterates 95 through 100? This behavior can be proved, but the proof is not required for this lab.

14 Exercise 8

Sometimes people become so confident of their computational tools that they attempt the impossible. What would happen if you attempted to find a root of a function that had no roots? Basically, the same kind of behavior occurs as when there are zeros of the derivative between the initial guess and the root.

1. Write a file for f8=x^2+9, which defines f8 as the function value, and f8p as its derivative.

2. Apply newton3() to f8, starting from x=0.1. Describe what happens.

15 Exercise 9

But the function x^2 + 9 does have roots! The roots are complex, but Python knows how to do complex arithmetic. (Actually the roots are imaginary, but it is all the same to Python.) All Python needs is to be told to use complex arithmetic, and starting the iteration from a complex initial guess accomplishes that. In Python, the imaginary unit is written as the suffix j attached to a numeric literal, so for complex constants Python will accept the syntax 2+3j to mean the complex number whose real part is 2 and whose imaginary part is 3. (A bare variable named j in your code is unrelated to this literal syntax.)

1. Apply newton3() to f8=x^2+9, starting from the initial guess x=1+1j. (Python interprets "1j" and "3j" as √−1 and 3√−1, respectively.) You should get a result close to 0+3j and it should take fewer than 10 iterations.

2. Fill in the following table by using newton3() to find a root of f8 from various initial guesses. The exact roots, as you can easily see, are ±3j, so you can compute the true error (typically a small number), as you did in Exercise 1.

   Initial guess   numIts    true error
   1+1j            _______   _________
   1-1j            _______   _________
   10+5j           _______   _________
   10+eps*1j       _______   _________

   Recall that eps is the smallest number you can add to 1.0 and still change it. Define eps by the commands

       import numpy as np
       eps = np.finfo(float).eps

What happened with that last case? The derivative is 2x and is not zero near x = 10 + (eps)j ≈ 10. In fact, the derivative is zero only at the origin. The origin is not near the initial guess nor either root. It turns out that the complex plane is just that: a plane.
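As a quick illustration (a minimal sketch, assuming the f8.py file written in Exercise 8), the Newton update needs no changes at all to work in complex arithmetic:

    from f8 import f8, f8p

    x = 1 + 1j                  # complex initial guess
    for k in range(10):
        x -= f8(x) / f8p(x)     # the usual Newton update, now in complex arithmetic
    print(x)                    # should be close to 0+3j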
While Newton's method works in two or more dimensions, it is harder to see when it is going to have problems and when not. We will elaborate a bit in a later section and also return to this issue in the next lab.

16 Unpredictable convergence

The earlier, one-dimensional cases presented in this lab might lead you to think that there is some theory relating the initial guess and the final root found using Newton's method. For example, it is natural to expect that Newton's method will converge to the root nearest the initial guess. This is not true in general! In the exercise below, you will see that it is not possible, in general, to predict which of several roots will arise starting from a particular initial guess.

Consider the function f9(z) = z^3 + 1, where z = x + yj is a complex number with real part x and imaginary part y. This function has the following three roots.

    ω1 = −1
    ω2 = (1 + √3 j)/2
    ω3 = (1 − √3 j)/2

17 Exercise 10

In the following exercise, you will choose a number of regularly-spaced points in the square given by |x| ≤ 2 and |y| ≤ 2 and then start your newton3() iteration at each of these points. You will then construct an image by coloring each starting point in the square according to which of the three roots was found when starting from the given point. The surprise comes in seeing that nearby initial guesses do not necessarily result in the same root at convergence.

1. Be sure your newton3() does not have any active print statements in it.

2. Create a file f9.py that defines the function f9 and derivative f9p.

3. Download the file exercise10.py, which uses Newton's method to analyze the function and to plot the results. Describe the information that the plot is exhibiting.

    import numpy as np
    import matplotlib.pyplot as plt
    from f9 import f9, f9p
    from newton3 import newton3

    NPTS = 100
    x = np.linspace(-2, 2, NPTS)
    y = np.linspace(-2, 2, NPTS)

    omega = np.zeros(3, dtype=complex)
    omega[0] = -1                            # red
    omega[1] = (1 + np.sqrt(3) * 1j) / 2     # orange
    omega[2] = (1 - np.sqrt(3) * 1j) / 2     # purple

    z = np.zeros((NPTS, NPTS), dtype=complex)
    for k in range(0, NPTS):
        for m in range(0, NPTS):
            z[k, m] = x[k] + y[m] * 1j
    z = z.flatten()

    which = []
    for k in range(0, len(z)):
        root, numIters = newton3(f9, f9p, z[k], maxIts=500)
        whichRoot = np.argmin(np.abs(root - omega))
        if whichRoot == 0:
            which += ['red']
        elif whichRoot == 1:
            which += ['orange']
        elif whichRoot == 2:
            which += ['purple']
        else:
            print('You have a bug')

    plt.scatter(z.real, z.imag, color=which)
    plt.scatter(omega.real, omega.imag)
    plt.ylabel('Imaginary')
    plt.xlabel('Real')
    plt.show()

4. Execute your modified file. The position (z=x+yj) on this image corresponds to the initial guess, and the color values red, orange and purple correspond to roots 1, 2 and 3 to which the Newton iteration converged starting from that initial guess. This image illustrates that, for example, some initial guesses with large positive real parts converge to the root ω1 = −1 despite being closer to both of the other roots. Please include this plot with your summary file.

It turns out that the set you have plotted is a Julia set (fractal). You can increase the detail of the plot by increasing NPTS and waiting longer.
If you wish to learn more about "Newton Fractals," I suggest the Wikipedia article https://en.wikipedia.org/wiki/Newton_fractal or the very comprehensive paper by J. H. Hubbard, D. Schleicher, and S. Sutherland, "How to Find All Roots of Complex Polynomials by Newton's Method," Inventiones Mathematicae 146 (2001), 1-33. In addition, the chatty but informative article at https://www.chiark.greenend.org.uk/~sgtatham/newton contains interesting information and plots.

18 Newton's method revisited

One disadvantage of Newton's method is that we have to supply not only the function, but also a derivative. In the following exercise, we will try to make life a little easier by numerically approximating the derivative of the function instead of finding its formula. One way would be to consider a nearby point, evaluate the function there, and then approximate the derivative by

    y' ≈ (f(x + Δx) − f(x)) / Δx    (5)

There are many possible ways to pick step sizes Δx that change as the iteration proceeds. One way is considered in Exercise 13 below, but we will look at simply choosing a fixed value in the following two exercises.

19 Exercise 11

Make a copy of newton3.py and call it newtonfd.py. The convergence criterion in newtonfd.py should involve the factor (1-r1) because using an approximate derivative usually leads to slower, linear convergence.

1. Change the function definition to be

    def newtonfd(func, x0, dx, maxIts):

and change the comment lines to reflect the new function.

2. Previously, your code probably had a line like

    derivative = dfunc(x)

Now we don't have the dfunc() function and instead are going to approximate the derivative with Equation (5), using the constant stepsize dx. Write this new approximation:

    derivative = ???

3. Check out your code using the function f1(x)=x^2-9, starting from the point x = 0.1, using an increment dx=0.00001. Since newtonfd() doesn't need a user-defined derivative function, your import statement can be simplified to request only the function definition:

    from f1 import f1

You should observe essentially the same converged value in the same number (±1) of iterations as using newton3().

20 Exercise 12

It turns out that the function f1(x)=x^2-9 is an "easy" function to solve using Newton iterations. More "difficult" functions require a more delicate choice for dx. Now we will compare the performance of the Newton method with user-supplied and approximated derivative values. We have f2(x)=x^5-x-1 and f6(x)=(x-4)^2. Use the starting point x = 5.0, and compute the number of steps taken to converge using newton3() once and newtonfd() with a sequence of different stepsizes dx. You may have to increase the value of maxIts by a very great deal to get f6 to converge for the cases of large dx.

                     Steps for f2 + f2p   steps for f6 + f6p
   using newton3     __________________   __________________

   using newtonfd    Steps for f2         steps for f6
   dx = 0.00001      __________________   __________________
   dx = 0.0001       __________________   __________________
   dx = 0.001        __________________   __________________
   dx = 0.01         __________________   __________________
   dx = 0.1          __________________   __________________
   dx = 0.5          __________________   __________________

As you can see, the choice of dx is critical. It is also quite difficult in more complicated problems. As mentioned earlier, Newton's method generally converges quite quickly. If you write a Newton solver and observe poor or no convergence, the first thing you should look for is an error in the derivative.
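One quick way to hunt for such an error is to compare the analytic derivative with the forward-difference approximation of Equation (5) at a few points. Here is a minimal sketch using the f2 pair from this lab; the sample points and step size are illustrative:

    from f2 import f2, f2p

    # compare the analytic derivative against the forward difference of Equation (5)
    dx = 1.0e-6
    for x in [0.5, 1.0, 2.0]:
        fd = (f2(x + dx) - f2(x)) / dx
        print(x, f2p(x), fd)   # the two derivative values should agree closely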
A check like the one sketched above is one way of finding such an error: apply a method like newtonfd() and print both the estimated derivative and the analytical derivative computed inside your function. They should be close. This trick is especially useful when you are using Newton's method in many dimensions, where the Jacobian can have some correctly computed components and some components with errors in them.

You saw the secant method in Lab 3, and it had the update formula

    x = b − ((b − a) / (f(b) − f(a))) f(b).    (6)

21 Exercise 13

1. Compare (6) with (1) and show that, by properly identifying the variables x, a, and b, (6) can be interpreted as a Newton update using an approximation for f'(x^(k)). Explain your reasoning.

2. Starting from your newtonfd.py, copy it to a new file named secantfd.py and modify it to carry out the secant method. The secant method requires an extra value for the iterations to start, but there is only the value x0 in the signature line. Take b=x and a=x-dx only for the first iteration. Fill in the following table.

                         newtonfd  newtonfd   secantfd  secantfd
   Function       start  numIts    solution   numIts    solution
   f1=x^2-9       0.1    ______    ______     ______    ______
   f3=x*exp(-x)   0.1    ______    ______     ______    ______
   f6=(x-4)^2     0.1    ______    ______     ______    ______
   f7=(x-4)^20    0.1    ______    ______     ______    ______

You should observe that both methods converge to the same solution, and that the secant method takes more iterations, though not nearly so many as newtonfd() with a larger fixed dx. This is because the theoretical convergence rate is quadratic for Newton's method but only (1 + √5)/2 ≈ 1.62 for the secant method, because of the approximation to the derivative.