Sale!

CSC 465 Homework 1 solved

Original price was: $35.00.Current price is: $30.00. $25.50

Category:

Description

5/5 - (8 votes)

The idea behind this assignment is to get you using the tools we’ll work with for this
course and demonstrate your understanding of the first two weeks’ material. You will
make graphs with both R and Tableau. Follow the guidelines below for making quality
graphs. Make each visualization and revise, making conscious decisions about the
appearance, rather than using the default settings. It requires some fiddling with
settings/code to get graphs the way you want them.
We’ll learn more about guidelines for clear graphs, but here is a starting point:
 Each graph should be clean with easy-to-read graphical elements (not too thick,
but not too thin either, not too much overlap of plot elements).
 The font size and weight should make labels easy to read, while not being
intrusive.
 The defaults may be fine, but you are highly encouraged to experiment with
different formatting options to try to improve the readability of the graphs. Plus,
doing so helps you learn the software better!
1) (20 pts) For this problem, we’ll look at data about Intel stock (Intel-1998 dataset from
the website). The data covers stock market trading for the Intel corporation in 1998.
Each row is a day, with the following columns: Date, Trading Day (integer day
number, including skips), Open (price at market open), High (highest price of day),
Low (lowest price of day), Close (price at market close), Volume (shares traded), and
Adj. Close (adjusted closing price, meaning accounting for stock splits, which are not
a problem in this data).
Make the specified graphs in either R or Tableau:
a. Graph the closing price vs. the date with an ordinary line graph. If you use
Tableau, you need to right-click on the Date and choose Exact Date from the
dropdown menu so that it uses the full date with “day”.
b. Graph the Volume vs. the exact Date as in the last part with a bar graph.
c. Create a scatterplot that graphs the Volume on the x-axis and the daily price
range on the y-axis. You will need to create an additional column that
contains the “range” of the prices for the day as the difference between the
fields High and Low.
Range = High – Low
Tableau can do it with a Calculated Field. In R you can do it by making a
new column equal to the result from subtracting the two columns. In Tableau,
to get a scatter plot, you will need to right click on both the Range and Volume
entries in graph and change them to “Dimensions”.
2) (20 pts) Use Tableau for this question. Open the GM cars dataset included with this
assignment (gmcar_price.txt). Each row represents a different car that was sold and
includes information about features like the mileage and the price of sale. Hint: use
the “Show Me” menu.
a. A treemap based on Price with a main subdivision for the Make of the car and
a minor subdivision based on the Model. Because each row of the data file
represents a single car but each box in the treemap represents all the cars with
a given make and model, pay very close attention to what kind of aggregation
is being used.
b. A packed bubble chart of the same type.
c. Write a short paragraph discussing the differences between the two plots.
Describe for each something that displayed more clearly than with the other.
d. Create a contingency plot (Tableau calls it a heat map under Show Me)
showing with color the number of cars (Number of Records) of each Type sold
by each Make. Explain at least one observation about that data that this chart
makes it easy to see.
3) (20 pts) This problem works with a dataset containing the population of Montana and
of each of the 7 Native American reservations within it (reservation70-00.xlsx). There
is a measurement for each decade between 1970 and 2000. Sheet1 has the original
data.
We will use Tableau for this question, but Sheet1 has a header that confuses Tableau.
If you’re interested, check out the “Data Interpreter” feature in Tableau to learn how
to deal with this. Otherwise, use Sheet2, where I’ve removed the header. We need a
few transformations to get the data ready to work with:
1. Renaming the 1970* field so it has no * and can be converted to a number
2. “Pivoting” the year fields in a similar manner to how it was demonstrated
in the tutorial.
3. Changing the name of the pivot fields to Year and Population, and
changing the type of the year field to “whole number”.
4. You can also hide the “Percent Change” field as it only contains
information for change over the entire period, not per decade.
5. If you would like to have an actual Date field for the Year, so that it is
treated by Tableau as a time instead of just a number, you need to create a
“Calculated Field”. It should construct a Date using the Year, i.e. make a
Date field that is on January 1 of the specified year:
makedate([Year], 1, 1)
6. We are not interested in the Montana population, only the reservation
populations. When you have used Location on your graph, you can right
mouse click (or click the down arrow within it) to apply filters. You can
also use “Exclude” from the right click menu on the legend just below the
“Marks” configuration.
Create graphs to show the following information, using appropriate graph types.
Make sure that the graphs are properly labeled and that the axis scales properly reflect
the type of data represented.
a. One chart that graphs the population growth over the years for the individual
reservations.
b. One that graphs the total reservation population for each year, subdivided
among the different reservations. The difference between this and (a) is that in
(b) we are not looking only at each population individually but at the growth
of the total population of all of them together, then subdivided by the
reservations.
4) For this question, answer only with text. You may include an illustration if you
would like, but you do not need to visualize data for this question.
a. Explain what we mean by ‘pre-attentive’ attributes. Are these as
effectively recognized by human perception when they are used in
combinations?
b. Use Weber’s Law to explain why it is important to include 0 in the
numerical axis of a bar chart.
5) This graph of cell phone
pricing plans is not very easy
to use. Use R for this
question and recreate this
graph in two different ways
of your choice. For each one,
explain what you are trying to
help the user see. For
example, one might be to
compare the cell phone
companies to see what kind
of plans they have. Another
might be best for examining
the trend of the relationship
between price and data
bandwidth. That relationship
may hold overall, or you
could look to see if it is
different per company. You
can decide what to visualize,
i.e. what question to answer
with your visualization, but make sure to explain what this visualization should be
showing. To get full credit, you must produce a graph which makes the answer to
your question immediately clear. It must also be well implemented, i.e. following
the guidelines at the top for a clean graph.
You do not need to type in all the values by hand. Here is R code that makes a
dataframe with these values in it:
cellPlans = data.frame(
c(“ATT”, “Sprint”, “Verizon”, “ATT”, “Sprint”,
“Verizon”, “ATT”, “Sprint”, “Verizon”, “ATT”,
“Verizon”, “Sprint”, “Verizon”, “ATT”,
“Verizon”, “Sprint”, “ATT”, “ATT”, “Sprint”),
c(1, 1, 2, 3, 3, 4, 6, 6, 8, 10, 12, 12, 16, 16,
24, 24, 25, 30, 40),
c(30, 20, 35, 40, 30, 50, 60, 45, 70, 80, 80, 60,
90, 90, 110, 80, 110, 135, 100))
names(cellPlans) = c(“Company”, “DataGB”, “Price”)