Steven Holland

Problem Set 2: Plotting

Plot 1

Generate 1000 random numbers from a log-normal distribution with a log mean of 1.7 and a log standard deviation of 0.5, and assign them to an object with a name of your choosing.

Plot a frequency distribution (histogram) of these data. Suggest 40 breaks, color the bars gray, rotate the y-axis labels, and do not show a main title.
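As a rough sketch of the kind of calls involved, the example below draws log-normal values with rlnorm() and plots them with hist(). The sample size, parameters, number of breaks, and the object names x and h are illustrative assumptions, deliberately not the values requested above.

```r
# Illustrative values only; substitute the ones the problem asks for
set.seed(1)                                # makes the example repeatable
x <- rlnorm(200, meanlog = 0, sdlog = 1)   # log-normal random numbers

# breaks is only a suggestion that hist() may adjust; las = 1 makes the
# axis labels horizontal; main = "" suppresses the main title
h <- hist(x, breaks = 20, col = "gray", las = 1, main = "", xlab = "x")
```

Because a single number passed to breaks is treated as a suggestion, the number of bars drawn may differ somewhat from the number requested.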

Plot 2

For the next three plots, we will use our textbook by Michael J. Crawley, “Statistics: An Introduction using R”, which I will just call Crawley from here on. Read Chapters 1, 2, and the Appendix (Essentials of the R Language) in Crawley.

Download the worms data set (the URL is in the preface), and leave the file name unchanged (that is, it should stay as worms.csv). Read the data into R using read.table() and assign it to an object named worms. You should not edit the data before reading it in; load it as it is. Display the data frame; it should look like what is shown on the bottom of page 25 in Crawley.

Using bracket notation, display columns 1 and 4 for all rows. Hint: use c() to identify the columns.

In one command, display columns 2 and 3 for the samples for which the Damp field is FALSE.
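The bracket notation used in the last two steps can be sketched on a small made-up data frame. The object df and its columns (site, area, slope, wet) are hypothetical stand-ins, not the worms data.

```r
# Hypothetical data frame standing in for real data
df <- data.frame(
  site  = c("A", "B", "C", "D"),
  area  = c(3.6, 5.1, 2.8, 2.4),
  slope = c(11, 2, 3, 5),
  wet   = c(FALSE, TRUE, FALSE, TRUE)
)

# An empty row index means all rows; c() selects several columns at once
cols_1_4 <- df[, c(1, 4)]

# A logical test in the row position keeps only the rows where it is TRUE
dry <- df[df$wet == FALSE, c(2, 3)]
```

The general pattern is object[rows, columns], where either position may hold numbers, names, or a logical vector.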

Make a new plot window. Be sure to use a command that will work on all operating systems.

In that window, make a scatterplot of worm density versus soil pH. Do not create new temporary variables for worm density and soil pH, and do not use attach(). Remember that “A vs. B” means that A is the dependent variable and is shown on the y-axis, whereas B is the independent variable and is displayed on the x-axis. Use appropriate labels for the x and y axes (i.e., ones that do not include periods in the names). Rotate the y-axis values so that they are horizontal. Do not add a title to the plot. Use small solid black circles for the data points. Do all of this with one line of code.

Add a line for the regression in one line of code, making sure that you get the dependent and independent variables correct. The line should be blue and dashed (see lty in par).
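A minimal sketch of a scatterplot with a fitted regression line, using made-up data, might look like the following. The data frame d, its column names, and the object fit are assumptions for illustration; the formula interface shown here is one of several ways to avoid temporary variables and attach().

```r
# Hypothetical data; the period in soil.ph mimics names produced on import
d <- data.frame(soil.ph = c(4.1, 4.5, 5.0, 5.2, 5.7),
                density = c(1, 3, 4, 6, 8))

dev.new()   # opens a new graphics device on any operating system

# Formula interface: y ~ x, so the dependent variable comes first
plot(density ~ soil.ph, data = d, xlab = "Soil pH", ylab = "Density",
     las = 1, pch = 16, cex = 0.7)

# abline() accepts a fitted lm object; lty = 2 draws a dashed line
fit <- lm(density ~ soil.ph, data = d)
abline(fit, col = "blue", lty = 2)
```

Here the fit is stored in an object only so it can be inspected; lm() can also be nested directly inside abline() to keep the step to one line.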

Once you have made your plot, open a 7" × 7" pdf file with the pdf() function, recreate your plot in it, and close the file. Save this plot as XXXXScatter.pdf, where XXXX is your last name, lowercase (e.g., hollandScatter.pdf).
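The usual pattern for writing a plot to a pdf file is to open the device, issue the plotting commands, and close the device. In this sketch the file name exampleScatter.pdf and the use of tempdir() are assumptions chosen so the example does not clutter the working directory.

```r
# Hypothetical file name, written to R's temporary directory
out <- file.path(tempdir(), "exampleScatter.pdf")

pdf(out, width = 7, height = 7)   # dimensions are in inches
plot(1:10, (1:10)^2, pch = 16, las = 1, xlab = "x", ylab = "x squared")
dev.off()                         # closes the device and finishes the file
```

The file is not complete until dev.off() is called, so forgetting it typically leaves an empty or unreadable pdf.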

Plot 3

Make a new plot window, and plot worm density versus pH, with all labeling as in Plot 2, but use the following colors for the small filled circles. You may find it easiest to build this plot in multiple steps with the points() command.

Arable: Blue
Grassland: Green
Meadow: Yellow
Orchard: Orange
Scrub: Red

Add these points using logical operations, not row numbers (hint: worms$Soil.pH[worms$Vegetation=='Orchard']). Ideally, you should be able to make Plot 3 in seven lines of code or fewer. NEW: As always, rotate the y-axis labels and give meaningful names to your x and y axes. Also, feel free to use legend() to identify the colors, but this is not required.
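One way to build such a plot in steps is to draw empty axes first and then add each group with points(). The data frame d, its columns, and the group names below are made up for illustration and are not the worms data.

```r
# Hypothetical data with three groups
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(2, 1, 4, 3, 6, 5),
                group = c("a", "a", "b", "b", "c", "c"))

# type = "n" sets up the axes for the full data range without drawing points
plot(d$x, d$y, type = "n", xlab = "X", ylab = "Y", las = 1)

# Each call selects one group with a logical test, not row numbers
points(d$x[d$group == "a"], d$y[d$group == "a"], pch = 16, col = "blue")
points(d$x[d$group == "b"], d$y[d$group == "b"], pch = 16, col = "green")
points(d$x[d$group == "c"], d$y[d$group == "c"], pch = 16, col = "red")

# Optional key for the colors
legend("topleft", legend = c("a", "b", "c"),
       col = c("blue", "green", "red"), pch = 16)
```

Setting up the axes from all of the data first ensures that no group falls outside the plotting region when it is added.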

Plot 4

Make a new plotting window that is 4" wide and 7" tall. Use the mfrow argument in par to create six plotting areas in this window, with the plots in two columns and three rows. Add these six plots of worm density versus pH, in this order:

all of the data - small black filled circles
only arable data - small blue filled circles
only grassland data - small green filled circles
only meadow data - small yellow filled circles
only orchard data - small orange filled circles
only scrub data - small red filled circles

Use a consistent xlim so that all plots have the same range for their x axes; do the same for their y axes. Do not embed magic numbers in xlim and ylim (Hint: use min and max). Set an appropriate main title for each plot (e.g., All, Grassland, etc.). You should be able to create the entire set of plots in eight lines of code or fewer. NEW: Again, rotate the y-axis labels and give meaningful names to your x and y axes. You should do this for all plots from here on.
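A multi-panel layout with shared axis limits can be sketched as follows, here with only two panels. The data frame d, the objects xl, yl, and sel, and the panel titles are assumptions for illustration.

```r
# Hypothetical data with three groups
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(2, 1, 4, 3, 6, 5),
                group = c("a", "a", "b", "b", "c", "c"))

# Limits computed from the full data set rather than typed in by hand
xl <- c(min(d$x), max(d$x))
yl <- c(min(d$y), max(d$y))

dev.new(width = 4, height = 7)   # window size in inches
par(mfrow = c(2, 1))             # 2 rows, 1 column, filled row by row

plot(d$x, d$y, xlim = xl, ylim = yl, main = "All",
     xlab = "X", ylab = "Y", las = 1, pch = 16)

sel <- d$group == "a"            # logical vector marking one group
plot(d$x[sel], d$y[sel], xlim = xl, ylim = yl, main = "Group a",
     xlab = "X", ylab = "Y", las = 1, pch = 16, col = "blue")
```

Because every panel reuses xl and yl, the subsets can be compared directly against the full data set, and the limits update automatically if the data change.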

Final instructions

Assemble and edit your R commands in a text editor. Separate the main sections of your code with a blank line and a comment, such as # Plot 1. Running your commands should produce four windows with plots and one pdf file.

Do not email the data file to me, as I already have it. Do not email your pdf file either, as your commands will generate it when I run them. Email your commands file to Steven Holland, following the standard instructions. This problem set is due 4 September.