Description

Exercise 1. This problem deals with parsing numerical data and performing simple statistical analysis. Imagine this scenario: your chemistry lab-mate has collected many measurements from an experiment and has put them all into a single file for you. This file is named
A3-data-file.txt and is posted on OWL alongside these instructions. It is now your job to
analyze the results. The experiment had four distinct trials, and you must perform
the analysis on each trial individually. Unfortunately, your lab-mate has mixed data from
different trials together! Fortunately, each measurement has a label indicating the trial to
which it belongs. For example, the first few lines of the data file are:
trial1 123.43
trial3 341.32
trial2 123.42
trial4 89.337
trial3 355.12
Therefore, your program must perform the analysis of the data in three steps:
1. Read the data in the file and group each measurement by the trial to which it belongs.
2. Perform the statistical analysis on each trial.
3. Write the statistical analysis of each trial to a new file (i.e. write out four different files).
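The first step can be sketched as follows. This is only an illustration, assuming each line holds a trial label and a value separated by whitespace, as in the sample lines above. The helper name group_by_trial is hypothetical, and skipping malformed lines is an assumption rather than a stated requirement.

```python
def group_by_trial(lines):
    """Group lines of the form 'trialN value' into a dictionary of lists."""
    trials = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # assumption: silently skip blank or malformed lines
        label, value = parts
        # Create the list for this trial on first sight, then append to it.
        trials.setdefault(label, []).append(float(value))
    return trials


# The sample lines quoted from the data file.
sample = [
    "trial1 123.43",
    "trial3 341.32",
    "trial2 123.42",
    "trial4 89.337",
    "trial3 355.12",
]
grouped = group_by_trial(sample)
print(grouped["trial3"])  # [341.32, 355.12]
```

A dictionary keyed by trial label avoids hard-coding four separate list variables.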
To perform these three steps your program should be broken into two parts.
Part 1. In this section we will define the contents of the myStatistics.py file. In this
file we look to implement six functions for statistical analysis: myMin, myMax, myAverage,
myMedian, myStandardDeviation, myCountBins.
For all functions, do not use Python’s statistics package or the built-in min and max
functions. You must implement the math yourself.
myMin is a function which takes as its only parameter a list of floating point values and
returns the minimum value among all values in the list.
myMax is a function which takes as its only parameter a list of floating point values and
returns the maximum value among all values in the list.
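One possible shape for these two functions is a single pass over the list that remembers the best value seen so far. This is a sketch, not the required solution. It assumes the list is non-empty, since the instructions do not say what should happen for an empty list; an empty list raises IndexError here.

```python
def myMin(values):
    """Return the smallest value in a non-empty list of floats."""
    smallest = values[0]  # assumption: the list has at least one element
    for v in values[1:]:
        if v < smallest:
            smallest = v
    return smallest


def myMax(values):
    """Return the largest value in a non-empty list of floats."""
    largest = values[0]  # assumption: the list has at least one element
    for v in values[1:]:
        if v > largest:
            largest = v
    return largest


print(myMin([3.5, -2.0, 7.25]))  # -2.0
print(myMax([3.5, -2.0, 7.25]))  # 7.25
```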
myAverage is a function which takes as its only parameter a list of floating point values and
returns the average of all values in the list.
myMedian is a function which takes as its only parameter a list of floating point values and
returns the median of the values in the list.
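A minimal sketch of the average and the median follows. It assumes a non-empty list, and it assumes that the built-in sorted function is permitted, since only the statistics package and the built-in min and max are ruled out. For a list of even length, the median is taken as the mean of the two middle values, which is the usual convention but is not spelled out in the instructions.

```python
def myAverage(values):
    """Return the arithmetic mean of a non-empty list of floats."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def myMedian(values):
    """Return the median of a non-empty list of floats."""
    ordered = sorted(values)  # sorted copy; the caller's list is unchanged
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd length: the single middle value
    # Even length: assumed convention, mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(myAverage([1.0, 2.0, 3.0, 4.0]))  # 2.5
print(myMedian([5.0, 1.0, 3.0]))        # 3.0
print(myMedian([4.0, 1.0, 3.0, 2.0]))   # 2.5
```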
myStandardDeviation is a function which takes as its only parameter a list of floating point
values and returns the standard deviation of the sample of values. Standard deviation (𝜎)
can be computed by the following formula:
\[
\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
\]
where $x_i$ are the individual values in a list of $n$ values, and $\bar{x}$ is the average of the list of values.
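The sample standard deviation formula translates into code roughly as follows. This is a sketch under the assumption that the list holds at least two values; with a single value the divisor n - 1 is zero and the function raises ZeroDivisionError, a case the instructions do not address.

```python
import math


def myStandardDeviation(values):
    """Return the sample standard deviation (divisor n - 1) of a list of floats."""
    n = len(values)

    # Average of the list, computed directly.
    total = 0.0
    for v in values:
        total += v
    mean = total / n

    # Sum of squared deviations from the average.
    squared_sum = 0.0
    for v in values:
        squared_sum += (v - mean) ** 2

    # Assumption: n >= 2, otherwise this division fails.
    return math.sqrt(squared_sum / (n - 1))


# For 1..5 the average is 3 and the squared deviations sum to 10,
# so the result is sqrt(10 / 4), approximately 1.58114.
print(myStandardDeviation([1.0, 2.0, 3.0, 4.0, 5.0]))
```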
myCountBins is a function which takes two parameters: a list of floating point values, and a
floating point number. This second parameter is the bin size. This function will implement
a simplified form of data binning (https://en.wikipedia.org/wiki/Data_binning). It goes through the list of values given as the first parameter and counts the number of
values in the list which fall into a certain “bin”. The bins are defined as: 0 ≤ 𝑥𝑖 < bin size,
bin size ≤ 𝑥𝑖 < 2 × bin size, 2 × bin size ≤ 𝑥𝑖 < 3 × bin size, . . . until all values in the input list
have been found to exist in a certain bin. For example, if the maximum value in the input list
is 30 and the bin size is 10 then there should be 4 bins: 0 ≤ 𝑥𝑖 < 10, 10 ≤ 𝑥𝑖 < 20, 20 ≤ 𝑥𝑖 < 30,
30 ≤ 𝑥𝑖 < 40.
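One way to realize this binning is to compute each value's bin index with floor division and grow the list of counts as needed, which avoids needing the maximum in advance. This is a sketch assuming non-negative values and a positive bin size. Returning an empty list for an empty input is an assumption.

```python
def myCountBins(values, bin_size):
    """Count how many values fall in each bin [k * bin_size, (k + 1) * bin_size)."""
    counts = []
    for v in values:
        index = int(v // bin_size)  # bin number for this value
        # Extend the list with empty bins until the index exists.
        while len(counts) <= index:
            counts.append(0)
        counts[index] += 1
    return counts


# Maximum 30 with bin size 10 gives 4 bins, as in the example above.
print(myCountBins([0.0, 5.0, 10.0, 29.9, 30.0], 10.0))  # [2, 1, 1, 1]
```

Because the last bin is created only when a value lands in it, the number of bins always matches the bin containing the largest value.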
All functions only need to handle lists of floating point numbers. That is to say, if you
come across a non-number in the input list then your program is allowed to crash. The
myCountBins function only needs to handle non-negative numbers in its input list. All
other functions must handle all possible floating point numbers.
Part 2. In this section we define the contents of the userid_main.py file. In this file you
shall import your other file (e.g. by the command from myStatistics import *) and then
use those imported functions within your main function to prompt the user for the name of
the data file, read the data in that file, and then output the results of the analysis to four
different files. In particular, your main function should:
1. Prompt the user to input the name of the file which contains the data to analyze.
2. Open the file for reading, if possible. If the file is not found or not available for any
reason, simply print the error message “Sorry, the file is not available” and
then terminate the program.
• Hint: use try: except: to accomplish this.
3. Read all of the data in the file and separate the data into four different lists, one list
for each trial.
• Hint: A dictionary of lists would be very helpful here!
4. For each trial compute, using your myStatistics module,
• the minimum,
• the maximum,
• the average,
• the median,
• the standard deviation, and
• the list of bin counts for a bin size of 25.
5. For each trial we wish to output the computed data to the files trial1-data-analysis.txt,
trial2-data-analysis.txt, trial3-data-analysis.txt, and trial4-data-analysis.txt
where trial1 data goes in the file trial1-data-analysis.txt, etc. The data should be
output in the following format, where the value computed for that trial follows each
label. All numbers should be printed using 5 digits after the decimal point, except for
bin_count, which can simply be the string representation of the list of counts.
minimum :
maximum :
average :
median :
std_dev :
bin_count:
An example output file can be found on OWL next to these assignment instructions
and is also repeated here. trial1-data-analysis.txt should have the
following contents:
minimum : 0.47074
maximum : 398.21285
average : 207.06971
median : 209.04432
std_dev : 115.91112
bin_count: [16, 12, 16, 15, 18, 19, 16, 13, 13, 12, 29, 11, 19, 14, 19, 19]
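The file handling and output formatting in the steps above might be sketched like this. The helper names read_lines, format_report, and write_report are hypothetical, and catching OSError is one reasonable reading of "not available for any reason". The label spacing follows the format lines as they appear in these instructions.

```python
def read_lines(filename):
    """Return the lines of the file, or None if it cannot be opened."""
    try:
        with open(filename, "r") as infile:
            return infile.readlines()
    except OSError:
        # Covers a missing file, a permission problem, and similar failures.
        print("Sorry, the file is not available")
        return None  # the caller is expected to terminate the program


def format_report(minimum, maximum, average, median, std_dev, bin_count):
    """Build the report text with 5 digits after the decimal point."""
    return (
        f"minimum : {minimum:.5f}\n"
        f"maximum : {maximum:.5f}\n"
        f"average : {average:.5f}\n"
        f"median : {median:.5f}\n"
        f"std_dev : {std_dev:.5f}\n"
        f"bin_count: {bin_count}\n"  # plain string form of the list
    )


def write_report(trial_name, report_text):
    """Write the report to '<trial_name>-data-analysis.txt'."""
    with open(f"{trial_name}-data-analysis.txt", "w") as outfile:
        outfile.write(report_text)


# Values copied from the example output, with a shortened bin list.
print(format_report(0.47074, 398.21285, 207.06971, 209.04432, 115.91112, [16, 12]))
```

In a main function, a None result from read_lines would be followed by an immediate return, and write_report would be called once per trial.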

CS1026A Assignment 3 solved
