STAT 480 Homework 5 solved

Exercises for All Students
Note: In Python, use float rather than int for non-integer arithmetic, in particular when dividing numbers.
Dividing with int can result in integer arithmetic, where the remainder is dropped instead of being kept as a decimal, so
results computed with int are subject to truncation errors.
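A minimal illustration of this note, using made-up numbers. In Python 2 the `/` operator on two int values floors the result; in Python 3 `/` always returns a float and `//` floors. Converting one operand with `float()` gives a decimal result in both versions.

```python
# Hypothetical values: a sum of readings and the number of readings.
total = 7
count = 2

floor_result = total // count        # integer (floor) division drops the remainder
float_result = float(total) / count  # float division keeps the decimal part

print(floor_result)
print(float_result)
```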
Exercise 1:
Using Hadoop and MapReduce, find the minimum recorded air temperature for each month over the period from 1915 to 1924
and return those minimum values in degrees Celsius. (You should have 12 values in total, one for each month.)
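One possible shape for the reduce step is sketched below. It assumes a mapper has already filtered the records and emitted lines of the form `month<TAB>temperature` in degrees Celsius. That line format, the function name `reduce_min`, and the sample values are all hypothetical; the record layout of the actual data set is not described here.

```python
def reduce_min(lines):
    """Return {month: minimum temperature} from 'month<TAB>temp' lines.

    The input format is an assumption made for this sketch.
    """
    minima = {}
    for line in lines:
        month, temp = line.strip().split("\t")
        temp = float(temp)  # float, not int, so decimals are kept
        if month not in minima or temp < minima[month]:
            minima[month] = temp
    return minima


# Made-up mapper output for two months.
sample = ["01\t-5.0", "01\t-11.1", "02\t3.3", "02\t-2.2"]
print(reduce_min(sample))
```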
Exercise 2:
Using Hadoop and MapReduce, obtain the number of trusted temperature observations and the minimum and
maximum monthly temperatures in degrees Fahrenheit over the period of 1915 to 1924. Make sure your code only goes
through the data once to get these results (to do this you will need to update the minimum, maximum, and count at the
same step in the code).
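The single-pass requirement can be sketched as follows: the count, minimum, and maximum are all updated in the same loop iteration. The sketch assumes the input is an iterable of `(month, temp_c)` pairs that have already passed the quality check and are in degrees Celsius, and it groups the results by month; both the input shape and the names `c_to_f` and `summarize` are assumptions for illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def summarize(pairs):
    """Return {month: [count, min_f, max_f]} in one pass over the pairs."""
    stats = {}
    for month, temp_c in pairs:
        temp_f = c_to_f(temp_c)
        if month not in stats:
            stats[month] = [1, temp_f, temp_f]
        else:
            entry = stats[month]
            entry[0] += 1                      # count
            entry[1] = min(entry[1], temp_f)   # running minimum
            entry[2] = max(entry[2], temp_f)   # running maximum
    return stats


# Made-up observations for one month.
observations = [("01", 0.0), ("01", 100.0), ("01", -40.0)]
print(summarize(observations))
```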
Exercise 3:
Using Hadoop and MapReduce, obtain the total number of air temperature observations that are not missing for each
month during the period from 1915 to 1924 and the total number of observations with acceptable quality codes for each
month during that period. Make sure your code only goes through the data once to get these results (to do this, you
could have the mapper return (month, tempcount, validqcount) for each observation, and have the reducer aggregate).
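The hinted mapper and reducer could look roughly like this. The missing-value marker and the set of acceptable quality codes below are placeholders, not the real values for the data set; substitute the ones given in the course materials. The function names are likewise hypothetical.

```python
MISSING = "+9999"        # placeholder missing-value marker (assumption)
GOOD_CODES = {"0", "1"}  # placeholder acceptable quality codes (assumption)


def map_record(month, temp, quality):
    """Emit (month, tempcount, validqcount) for one observation."""
    tempcount = 0 if temp == MISSING else 1
    validqcount = 1 if quality in GOOD_CODES else 0
    return (month, tempcount, validqcount)


def reduce_counts(triples):
    """Sum both counters per month in a single pass."""
    totals = {}
    for month, tempcount, validqcount in triples:
        entry = totals.setdefault(month, [0, 0])
        entry[0] += tempcount
        entry[1] += validqcount
    return totals


# Made-up records: (month, temperature field, quality code).
records = [("03", "+0012", "1"), ("03", "+9999", "9"), ("03", "-0040", "2")]
print(reduce_counts(map_record(*r) for r in records))
```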
Additional Exercise for Graduate Students
Exercise 4:
Using Hadoop and MapReduce, obtain the monthly mean air temperature in degrees Celsius for the period from 1915 to
1924. If you use a combiner, make sure your code will work when data needs to be recombined from samples of
different sizes.
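The combiner caveat matters because averaging partial means gives the wrong answer when the partial samples differ in size. One common way around this, sketched here under the assumption that the input is `(month, temp)` pairs in degrees Celsius, is to have the combiner emit a partial sum and count, and to divide only in the reducer. The names `combine` and `reduce_mean` and the sample splits are hypothetical.

```python
def combine(pairs):
    """Collapse (month, temp) pairs into (month, partial_sum, partial_count)."""
    partial = {}
    for month, temp in pairs:
        entry = partial.setdefault(month, [0.0, 0])
        entry[0] += temp
        entry[1] += 1
    return [(month, entry[0], entry[1]) for month, entry in partial.items()]


def reduce_mean(partials):
    """Merge partial sums and counts, then divide once per month."""
    totals = {}
    for month, total, count in partials:
        entry = totals.setdefault(month, [0.0, 0])
        entry[0] += total
        entry[1] += count
    return {month: entry[0] / entry[1] for month, entry in totals.items()}


# Two made-up input splits of different sizes for the same month.
split_a = [("01", 10.0), ("01", 20.0), ("01", 30.0)]
split_b = [("01", 40.0)]
means = reduce_mean(combine(split_a) + combine(split_b))
print(means)
```

With these made-up splits the result for month 01 is 25.0, whereas averaging the two partial means (20.0 and 40.0) would give 30.0.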