Steven Holland

Becoming a better coder

7 September

Include only what is necessary

Only include the steps that are necessary to generate the answers to the problems. You will need to edit your commands down to what is needed.

Don’t display the values of vectors or data frames in your answers unless I specifically ask for them. This is particularly true for large data frames and long vectors. It is fine to display them for yourself as you work, but do not include them in the file you turn in.

Don’t generate additional objects unless you need them or unless they clarify the work, such as in long function calls.

If a function returns a vector, don’t wrap the result in c().

When accessing parts of a data frame with a logical test, don’t wrap the logical test in which().
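
As a small sketch of the last two points, assuming a hypothetical data frame called rocks with made-up values (neither the name nor the numbers come from the problem sets): seq() already returns a vector, so it needs no c(), and a logical test can select from a data frame directly.

```r
# Hypothetical data frame with made-up values, for illustration only
rocks <- data.frame(type=c("granite", "basalt", "granite", "gabbro"), SiO2=c(72.1, 49.5, 70.3, 50.2))

# seq() already returns a vector, so wrapping it in c() adds nothing
depths <- seq(from=0, to=30, by=10)

# The logical test selects the rows directly; which() is not needed
granite <- rocks[rocks$type == "granite", ]
graniteSiO2 <- rocks$SiO2[rocks$type == "granite"]
```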

Avoid paths

Don't include anything with a path in your code; a path that exists on your computer will generate errors on someone else's. Set your working directory before you start working, and leave that step out of your code.

Do not include the command setwd(), even if you comment it out. Use setwd() when you do work, but delete it from what you turn in.

Important: This is such a serious error that setting the path or embedding a path will now be assessed -3 per occurrence, not -1.

Improve your plots

If you are going to add points with points() after you create a plot, be sure to set type="n" when you call plot(). Not doing this can cause two problems. First, if you are not planning to add all of the data with points(), not setting type="n" will cause all of the points to appear anyway. Second, the open black circles that plot() draws by default will show as thin black rims around the points you add.
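
One way this might look, assuming hypothetical vectors depth and temperature with made-up values (they are not from any problem set): plot() sets up the axes without drawing anything, and points() then adds each group.

```r
# Hypothetical data with made-up values, for illustration only
depth <- c(1, 2, 3, 4, 5, 6)
temperature <- c(14.2, 13.8, 12.9, 11.5, 10.1, 9.6)
shallow <- depth <= 3

# Set up the axes without drawing any points
plot(depth, temperature, type="n", xlab="Depth (m)", ylab="Temperature (degrees C)")

# Add each group of points separately; no default circles lie underneath
points(depth[shallow], temperature[shallow], pch=16, col="blue")
points(depth[!shallow], temperature[!shallow], pch=16, col="red")
```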

Use informative labels for main and for your axes. In particular, do not use names of objects, like granite.SiO2 or myRandomNumbers. Also, do not use the main label to repeat what is on your axes. For example, if you plot pH vs. alkalinity, a main title that says “pH vs. alkalinity” is redundant and should be removed.
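
A minimal sketch of such labels, assuming hypothetical vectors alkalinity and pH with made-up values and an invented title; the units shown are only an example.

```r
# Hypothetical water-chemistry values, made up for illustration only
alkalinity <- c(0.8, 1.4, 2.1, 2.9, 3.6)
pH <- c(6.9, 7.2, 7.6, 7.9, 8.1)

# Axis labels name the quantity and its units rather than the object;
# the title adds context instead of repeating the axes
plot(alkalinity, pH, xlab="Alkalinity (meq/L)", ylab="pH", main="Hypothetical stream samples")
```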

Future-proof your code

Avoid embedding magic numbers (hard-coded numeric constants) in your code. For example, if you want your plot limits to be the maximum and minimum of your data, don't find the maximum and minimum and embed those numbers in xlim. If the maximum and minimum change, your code will no longer work. Instead, call max() and min() where you need them, for example:

plot(x, y, xlim=c(min(x), max(x)))

If you need to use these repeatedly, save them as an object. This saves you from repeating yourself, decreases the likelihood of errors, and makes your code more self-explanatory.

xlimits <- c(min(x), max(x))
plot(x, y, xlim=xlimits)
plot(x, w, xlim=xlimits)
plot(x, z, xlim=xlimits)

Make your code readable

We use spaces in our writing to make it more readable; good programmers do the same with their code. In particular, put a space before and after the assignment operator <-; this makes the object easy to distinguish from how it was created. Likewise, in a function's list of arguments, put a space after every comma, but not around the = operators; this makes each argument more obvious.

Use single quotes or use double quotes, but don’t use both, because that makes the reader think that you are trying to convey something when you aren’t.
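
The spacing and quoting conventions above could look like this, assuming a hypothetical vector grainSize with made-up values:

```r
# Hypothetical values, for illustration only
grainSize <- c(0.12, 0.25, 0.31, 0.48, 0.55)

# Spaces around <-, a space after each comma, none around =, double quotes throughout
meanSize <- mean(grainSize, na.rm=TRUE)
hist(grainSize, xlab="Grain size (mm)", main="", col="gray")

# Harder to read (shown as a comment so that it is not run):
# meanSize<-mean(grainSize,na.rm = TRUE)
```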

1 September

Have a sense of style

Use blank lines to separate groups of related commands. Code with no blank lines is hard to read, and code with too many is just as hard. Think of blank lines as the punctuation in code; they are there to help you read. For the problem sets, treat each paragraph in the assignment as a block of code, and put one blank line before it and one blank line after it.

Do not precede lines of code with spaces or tabs, unless you are inside of an if statement, a while statement, a for loop, or a function definition. Indents in these cases clarify the code, and adding them elsewhere is unnecessary and causes confusion over your intent.
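
As an illustration of where indents belong, here is a short sketch that assumes a hypothetical vector counts with made-up values. Top-level lines start at the left margin, and only the bodies of the for loop and the if statement are indented.

```r
# Hypothetical values, for illustration only
counts <- c(3, 8, 1, 12)

largeTotal <- 0

for (count in counts) {
    if (count > 5) {
        largeTotal <- largeTotal + count
    }
}
```

This particular sum could be written without a loop; the loop is here only to show the indentation.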

Use comments where necessary to identify what a block of code does, or to explain a critical or confusing step. Avoid commenting every line of code, or even most lines of code.

The convention in R is to use <- for assignments rather than =, and we will adhere to that convention in this course.