ITS836 Assignment 1: Data Analysis in R
1) Read the income dataset, “zipIncomeAssignment.csv”, into R. (You can find the csv file in iLearn under the Content -> Week 2 folder.)
2) Change the column names of your data frame so that zcta becomes zipCode and meanhouseholdincome becomes income.
3) Analyze the summary of your data. What are the mean and median average incomes?
4) Plot a scatter plot of the data. Although this graph is not too informative, do you see any outlier values? If so, what are they?
5) In order to omit outliers, create a subset of the data so that:
$7,000 < income < $200,000 (or in R syntax, income > 7000 & income < 200000)
6) What’s your new mean?
7) Create a simple box plot of your data. Be sure to add a title and label the axes.
HINT: Take a look at: https://www.tutorialspoint.com/r/r_boxplots.htm (specifically, Creating the Boxplot). Instead of “mpg ~ cyl”, you want to use “income ~ zipCode”.
In the box plot you created, notice that all of the income data is pushed towards the bottom of the graph because most average incomes tend to be low. Create a new box plot where the y-axis uses a log scale. Be sure to add a title and label the axes.
For the next 2 questions, use the ggplot2 library in R, which enables you to create graphs with several different types of plots layered over each other.
8) Make a ggplot that consists of just a scatter plot using the function geom_point() with position = “jitter” so that the data points are grouped by zip code. Be sure to use ggplot’s function for taking the log10 of the y-axis data. (Hint: for geom_point, set alpha=0.2).
9) Create a new ggplot by adding a box plot layer to your previous graph. To do this, add the ggplot function geom_boxplot(). Also, add color to the scatter plot so that data points in different zip codes are different colors. Be sure to label the axes and add a title to the graph. (Hint: for geom_boxplot, set alpha=0.1 and outlier.size=0).
10) What can you conclude from this data analysis/visualization?
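
As a rough starting point (not a model answer), the workflow in the questions above might be sketched as follows. The sketch runs on made-up data rather than the real csv file: the data frame name zip_income, the ten single-digit zcta groups, the simulated incomes, and the plot titles are all assumptions for illustration, so your numbers and graphs will differ. The ggplot2 part only runs if that package is installed.

```r
# Synthetic stand-in for the assignment file (ASSUMPTION: values are simulated;
# with the real data you would start from read.csv("zipIncomeAssignment.csv")).
set.seed(1)
zip_income <- data.frame(
  zcta = rep(0:9, each = 50),
  meanhouseholdincome = c(round(rlnorm(498, meanlog = 10.8, sdlog = 0.5)),
                          0, 250000)  # two deliberate outliers
)

# Rename the two columns.
names(zip_income) <- c("zipCode", "income")

# Summary statistics, including mean and median.
print(summary(zip_income$income))

# Scatter plot of the raw data; extreme values stand out visually.
plot(zip_income$zipCode, zip_income$income,
     main = "Income by zip code (raw)", xlab = "Zip code", ylab = "Income")

# Keep only rows with 7000 < income < 200000.
trimmed <- subset(zip_income, income > 7000 & income < 200000)
print(mean(trimmed$income))

# Box plot on a linear y-axis, then again with a log-scaled y-axis.
boxplot(income ~ zipCode, data = trimmed,
        main = "Income by zip code", xlab = "Zip code", ylab = "Income")
boxplot(income ~ zipCode, data = trimmed, log = "y",
        main = "Income by zip code (log scale)",
        xlab = "Zip code", ylab = "Income (log scale)")

# Layered ggplot2 version: jittered points plus a box plot layer.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(trimmed, aes(x = factor(zipCode), y = income)) +
    geom_point(aes(colour = factor(zipCode)),
               position = "jitter", alpha = 0.2) +
    geom_boxplot(alpha = 0.1, outlier.size = 0) +
    scale_y_log10() +
    labs(title = "Income by zip code", x = "Zip code",
         y = "Income (log10 scale)", colour = "Zip code")
  print(p)
}
```

Wrapping zipCode in factor() makes ggplot2 treat it as a grouping variable rather than a continuous number, which is what gives one box and one color per zip code.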