Please submit your responses to all questions below via WebCampus (text box submission) by the date indicated on WebCampus.
If you are new to R, or just want a refresher, please work at your own pace through the following submodules of the UNR R workshop:
I know some of you are enrolled in these Saturday R workshops for credit as part of GRAD 778, but I’d like you to at least skim these modules now so you don’t fall behind!
Let’s look at the syntax of a custom R function:
f <- function(<arguments>) {
  ## Do something interesting
}
The arguments, passed to the function when it is called, may or may not
specify default values (set with an equals sign). The value returned by a
function is the last expression evaluated in the body of the function,
unless a value is returned explicitly with the return() function.
For example, the lm() function comes with base R and is used to fit
linear models, including simple linear regression. Type the following in
your R console to see what the arguments of that function are:
args(lm)
Here is a simple custom function for calculating any number cubed, with example applications:
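Cube.me <- function(x = 1) {
  x * x * x              # last expression evaluated, so this is the return value
}
Cube.me()                # no argument supplied: uses the default, x = 1
Cube.me(3)               # a single number
x.vec <- c(2, 19, 99)
Cube.me(x.vec)           # a vector: each element is cubed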
Try it in R!
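The cube example relies on a default value and on the implicit return of the last expression. As a complementary sketch, the function below has one argument without a default and uses an explicit return(). The name `Power.me` and its argument `p` are hypothetical, made up for illustration, and are not part of the assignment.

```r
# Hypothetical helper for illustration only (not part of the assignment)
Power.me <- function(x, p = 2) {   # x has no default; p defaults to 2
  if (!is.numeric(x)) {
    stop("x must be numeric")      # fail early with a clear message
  }
  result <- x^p
  return(result)                   # explicit return of the computed value
}

Power.me(3)          # p falls back to its default, so this squares 3
Power.me(3, p = 3)   # p supplied by name, so this cubes 3
Power.me(c(2, 4))    # operates element-wise on a vector
```

Calling `Power.me()` with no arguments produces an error, because `x` has no default value to fall back on.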
When you’re ready, write four functions according to the instructions below, applying them and testing them as indicated.
Please submit all responses (answers to all questions) in a single text box submission via the assignment link posted in WebCampus, by the date and time indicated there. All answers should include readable, well-commented R code along with any requested code for testing your functions. Please make sure your submission is as clear and legible as possible!
Please note that these assignments are not formal exams; you are encouraged to ask for help as needed! You may also work in groups, but please submit your own assignment so you get credit!
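Several of the exercises ask you to compare your function against a built-in equivalent. One possible pattern for that kind of check is sketched here with a deliberately different statistic (the mean), so it does not answer any of the questions. The function name `My.mean` is hypothetical, and the “Girth” column of “trees” is used only as sample data.

```r
# Hypothetical example: a hand-written mean, checked against base R's mean()
My.mean <- function(x) {
  sum(x) / length(x)   # sum of the values divided by the number of values
}

my.answer <- My.mean(trees$Girth)   # apply the custom function
r.answer  <- mean(trees$Girth)      # apply the built-in equivalent

# all.equal() compares numbers within a small tolerance, which is safer
# than == for values produced by floating-point arithmetic
all.equal(my.answer, r.answer)
```

Note that column names in R are case-sensitive, so `trees$Girth` works while `trees$girth` returns NULL.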
Using the sample variance formula (see the “summary metrics” section),
reproduced below, write a function that computes the sample variance of
any arbitrary data vector. Use this function to compute the sample
variance for the “Height” column in the “trees” dataset built into base
R. Compare your answer with the result from the var() function (built
into R!). Your submission should include (1) the function code, (2) a
command (line of code) using your function to compute the sample
variance of the “Height” column, and (3) a test using the var() function
to compute the same thing.
Here’s the formula: \(s^2 = \sum_{i=1}^{N}{\frac{(x_i-\bar{x})^2}{N-1}}\)
Using the population standard deviation formula (see the “summary
metrics” section), reproduced below, write a function that computes the
population standard deviation (the standard deviation for a complete
population rather than an incomplete sample) for any arbitrary data
vector. Use this function to compute the standard deviation for the
“Height” column in the “trees” dataset built into base R. Compare your
answer with the result from the sd() function. Note that the results
should not be exactly the same, because sd() computes the sample
standard deviation! Your submission should include (1) the function
code, (2) a command (line of code) using your function to compute the
population sd for the “Height” column, and (3) a test using the sd()
function to compute the sample sd.
Here’s the formula: \(\sigma = \sqrt{\sum_{i=1}^{N}{\frac{(x_i-\mu)^2}{N}}}\)
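As a quick hand check of how the two denominators differ, take the made-up vector \(x = (2, 4, 6)\), chosen here only for illustration. Its mean is 4, so the squared deviations are 4, 0, and 4, which sum to 8:

\(s^2 = \frac{8}{3-1} = 4, \quad s = \sqrt{4} = 2\)

\(\sigma = \sqrt{\frac{8}{3}} \approx 1.63\)

The population value is smaller because it divides by \(N\) rather than \(N-1\), and the gap shrinks as \(N\) grows.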
Write a function to compute the coefficient of variation (CV) for any
data vector and apply it to the “Height” column in the “trees” dataset.
[hint: the CV is the ratio of the (sample) standard deviation to the
mean.] Use the formula for the sample standard deviation (not the
population standard deviation) to compute the CV. You are welcome to use
the sd() function in base R within your function if you’d like. Your
submission should include (1) the function code and (2) a command (line
of code) using your function to compute the CV of the “Height” column in
the “trees” dataset.
Write a function for drawing a regression line through a scatter plot.
[hint: within your custom function, you will use the existing plot(),
lm(), and abline() functions. You will need to include at least two
arguments in your function, specifying the response vector and the
predictor vector.] Apply this function to the “Height” (predictor) and
“Volume” (response) columns in the “trees” dataset, and then to the
“waiting” (predictor) and “eruptions” (response) columns in the
“faithful” dataset. Your submission should include (1) the function
code, (2) a command (line of code) using your function on the “trees”
dataset as specified, and (3) a command (line of code) using your
function on the “faithful” dataset as specified.