In this submodule we will talk about how you can continue learning, gaining confidence, and expanding what you can do in R using contributed packages.

Whether you are an expert or a novice, you will always encounter situations where you don’t know exactly what to do. The great thing about R is that there is nearly always a way to do it (whether in base R or via packages contributed by R users from around the world), and once you get familiar with the basics, it’s usually not that hard to figure out how to accomplish your analysis objectives!

Load script for submodule #2.1

  1. Click here to download the script! Save the script to a convenient folder on your laptop.

  2. Load your script in RStudio. To do this, open RStudio and click on the folder icon in the toolbar at the top and load your script.

Packages

How is it possible to have so many features implemented in a single programming environment?

The answer is that you have a whole community of researchers adding new functionality all the time! This new functionality is implemented in the form of R Packages. Each package includes a set of functions and other objects for accomplishing specific tasks (and associated help files!).

You’ve already installed and worked with several packages associated with the ‘tidyverse’, like dplyr, ggplot2, tidyr, and lubridate.

Mode example

For a simple example (borrowed from Prelude in R), base R does not include a function for computing the mode of a vector. BUT, some R user decided that functionality would be useful! They coded up a function for computing the mode (in many different ways!), and then they documented it and wrapped it up into a package, called “modeest”, and made the package available in the CRAN repository (the ‘official’ repository system for R packages).

As you’ve already seen, packages in the CRAN repository can be installed using the function “install.packages()”:

# PACKAGES!  ---------------------

# install.packages("modeest")    # uncomment and run this if you haven't yet installed the package from CRAN! 
                                   # you only need to install this once, so you can comment it out if
                                   # you already have this package installed
library(modeest)    # load the package: This is package 'modeest' written by P. PONCET.
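
If you share a script with others, you may not know which packages they already have installed. Here is a minimal sketch (not part of the course script) of one way to check before installing. The helper name missing_packages() is made up for this illustration; requireNamespace() and install.packages() are standard R functions. Note that requireNamespace() loads a package's namespace as a side effect, which is usually harmless.

```r
# Hypothetical helper: return the packages in 'pkgs' that are not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("stats", "modeest")          # packages this script relies on
to_install <- missing_packages(needed)   # character(0) if nothing is missing
print(to_install)

# uncomment to install whatever is missing from CRAN:
# if (length(to_install) > 0) install.packages(to_install)
```

After this, you would still call library() on each package you want to use in the session.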

Now that we have installed and loaded the package in our current R session, we can learn more about it! One quick way to get an overview of the package is to use the following syntax:

library(help = "modeest")    # get overview of package

Can you use the package overview to identify a function for computing the mode of a data vector?

Let’s try to use it! If you haven’t already loaded the “data_missing.txt” dataset, download it here and save it to your working directory.

newdf <- read.table(file = "data_missing.txt", sep = "\t", header = TRUE)

# ?mlv   ?mfv   # learn more about the functions for computing the mode
                # (mlv = most likely value, mfv = most frequent value).
                # Who knew there were so many methods for computing the mode?

# let's find the most frequent value(s) in the "Export" column:
mfv(newdf$Export, na.rm = TRUE)
## [1] 0
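
To get a feel for what a "most frequent value" function does, here is a rough base R sketch. It is an illustration only, not how the ‘modeest’ package is implemented; the helper name most_frequent() and the data are made up.

```r
# Hypothetical helper: most frequent value(s) of a numeric vector, base R only
most_frequent <- function(x) {
  counts <- table(x)                                  # table() drops NA values by default
  as.numeric(names(counts)[counts == max(counts)])    # keep the value(s) with the top count
}

x <- c(0, 2, 0, 5, NA, 2, 0)   # made-up data
most_frequent(x)               # 0 appears three times, more than any other value
```

If two or more values tie for the highest count, this sketch returns all of them.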

Removing a package from your environment

Not that you need to do this very often, but if you want to remove a package from your current session, you can use the “detach()” function:

detach("package:modeest")  # remove the package from your current working session
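
If you want to confirm that a package really was detached, one option is to look at the search path with search(). The sketch below uses the ‘tools’ package only because it ships with R (so nothing needs to be installed); the same idea should work for "package:modeest".

```r
library(tools)                      # 'tools' ships with R, so no installation is needed
"package:tools" %in% search()       # TRUE: the package is on the search path

detach("package:tools")             # remove it from the current session
"package:tools" %in% search()       # FALSE: no longer attached (but still installed)
```

Detaching does not uninstall anything. You can bring the package back at any time with library().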

Loading packages from GitHub

Many package developers use GitHub to make their packages available. Sometimes these packages are not yet on CRAN because they are not yet ready for prime time (so be careful!). Other times developers may choose to bypass the CRAN repository entirely, and instead just use GitHub. Most of the time, developers use both GitHub and CRAN: CRAN hosts the released version, and GitHub hosts the latest development version. There are some good reasons to put an R package on GitHub:

  1. It’s easier for others to peruse your code.
  2. People can actually fix any problems and send you a patch, which you can easily test and then incorporate into your package. Rather than having someone say, “There’s a typo in your documentation” they can say “Here, I’ve fixed a typo in your documentation.”
  3. With the install_github() function in the ‘remotes’ package (also available through the ‘devtools’ package), it’s easy for people to install your package directly from GitHub. It doesn’t have to be on CRAN (getting your package on CRAN can be a bit difficult).

The syntax for using “install_github()” is just install_github("[author]/[package]"). Here’s an example using the “install_github()” function:

# install package from GitHub:

 # install.packages("remotes")    # run this if you haven't already installed the "remotes" package
library(remotes)
remotes::install_github("AckerDWM/gg3D")  # install a package from GitHub!
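
Installing from GitHub can take a while, so you may prefer to skip it when the package is already there. Here is a small sketch of that idea. It assumes the installed package has the same name as the GitHub repository ("gg3D"), which is common but not guaranteed.

```r
author  <- "AckerDWM"
package <- "gg3D"                               # assumed to match the repository name
repo <- paste(author, package, sep = "/")       # builds the "[author]/[package]" string
repo

already_installed <- requireNamespace(package, quietly = TRUE)
if (!already_installed) {
  message("Not installed yet; run: remotes::install_github(\"", repo, "\")")
}
```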

3d scatterplot example!

Before we move on, let’s make a 3D scatterplot! The example below uses the ‘plotly’ package (available on CRAN) rather than the package we just installed from GitHub.

Let’s load the packages:

# 3D Plotting example ---------------

library(tidyverse)   # includes ggplot2, and tibble() for building the data set below
library(plotly)      # interactive plots

Here is a data set that represents the number of times a dog barks in a day, and two potential explanatory factors: food given (in lbs) and cars passing by the house.

# Data: dog barks per day (and two explanatory variables)

dogbarks <- tibble(
  Cars= c(32, 28, 9, 41, 23, 26, 26, 31, 12, 25, 32, 13, 19, 19, 38,
          36, 43, 26, 21, 15, 17, 12, 7, 41, 38, 33, 31, 9, 40, 21),
  Food= c(0.328, 0.213, 0.344, 0.339, 0.440, 0.335, 0.167, 0.440, 0.328,
          0.100, 0.381, 0.175, 0.238, 0.360, 0.146, 0.430, 0.446, 0.345,
          0.199, 0.301, 0.417, 0.409, 0.142, 0.301, 0.305, 0.230, 0.118,
          0.272, 0.098, 0.415),
  Bark=c(15, 14, 6, 12, 8, 1, 9, 8, 1, 12, 14, 9, 8, 1, 19, 8, 13, 9,
       15, 11, 8, 7, 8, 16, 15, 10, 15, 4, 17, 0)
)

Let’s use the plotly package:

# plotly package to make 3d plot

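# set a fixed marker color via 'marker' (the 'color' argument is meant for mapping data values to colors)
plot_ly(data = dogbarks, x = ~Cars, y = ~Food, z = ~Bark,
        type = "scatter3d", mode = "markers",
        marker = list(color = "lightblue"))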

Plotly specializes in interactive plots. Try changing the view angle by clicking and dragging, or zooming in and out!

Finding the right package for the job

Usually a Google search is the best way to find a package for the task at hand! But you may also learn about new packages from a journal (e.g., Methods in Ecology and Evolution), a textbook, a conference, or just by talking with other researchers!

Learning more about how a package works

It’s not always perfectly obvious how to get started in actually using a new package. Here are some tips.

Built-in package overview

We’ve already seen this one: just use the following syntax to pull up an overview of any package:


library(help = "[package name]")

For example,

# Learning more about packages --------------------

# Package overview

library(help = "car")    # help file for the "car" package for applied regression
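
The overview opens in a separate window, which is not always convenient if you want to search the contents with code. As a rough sketch of an alternative, you can list what an attached package makes available with ls(). The ‘stats’ package is used here only because it is attached by default in every R session; for ‘car’ you would first call library(car).

```r
pkg <- "stats"                                 # attached by default in every R session
packageVersion(pkg)                            # installed version of the package

exported <- ls(paste0("package:", pkg))        # objects the attached package makes available
length(exported)                               # how many there are
grep("^median", exported, value = TRUE)        # search those names with a pattern
```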

Package vignettes

“Vignettes” are provided in many packages, and are tremendously useful, providing worked examples. Usually the package overview (see above) will tell you which vignettes are available. Use the following syntax to pull up a vignette:


vignette('[name of vignette]', '[name of package]')

For example,

# package vignette 

browseVignettes('car')
vignette('embedding','car')   # pull up the "embedding" vignette in the 'car' package
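
If you would rather see the available vignettes in the console than in a browser, calling vignette() with only the package argument returns a listing. The sketch below uses the ‘grid’ package because it ships with R; the listing may be empty on installations where vignettes were not built.

```r
v <- vignette(package = "grid")                    # listing of vignettes in the 'grid' package
v$results[, c("Item", "Title"), drop = FALSE]      # vignette names and their titles
```

The names in the "Item" column are what you would pass as the first argument to vignette().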

Package documentation (html or PDF)

The ‘master’ documentation for any package is often packaged as a PDF file (the reference manual) that can be downloaded from CRAN. Each package on the CRAN repository has its own webpage; for example, here’s the one for the raster package.

Also, you can load the documentation for any package you have installed using the “help.start()” function:

# Load html documentation for R and all installed packages 

help.start()
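
When you only half-remember a function name, browsing the full documentation can be slow. As a small sketch, apropos() searches the names of objects in your current session, and help.search() (the same thing as typing ??) searches the help pages of installed packages.

```r
hits <- apropos("^median")      # objects on the search path whose names start with "median"
hits

# help.search("median")         # same as ??median: searches help pages of installed packages
```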

Learning by example!

To me, there is no better way to learn (1) general coding, (2) best practices for statistical analyses, and (3) how to take advantage of specialized functions in packages than to try to work through and understand code examples that others have written.

One way to do this, for base R and for packages, is to access the examples provided in the package documentation. These examples are provided at the end of the help file for any function (using ‘help()’ or ‘?’). You can also run these examples using the “example()” function. For example:

# Built-in examples 

example(lm)   # run examples for "lm" function (included in base R)
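
Sometimes you want to read the example code before running it. One way to do that, sketched below, is the give.lines argument of example(), which returns the code as text instead of executing it.

```r
ex_lines <- example(mean, give.lines = TRUE)   # return the example code without running it
head(ex_lines)                                 # first few lines of the example code
```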

Unfortunately, the code examples provided in help files (e.g., when using the leading question mark to pull up a help document) are often surprisingly unhelpful. Instead, you can try the package vignettes, or just use your favorite search engine to find more helpful examples on the web!

Package citations

Most packages include one or more citations, which are often peer-reviewed papers that document the novel algorithms included in the package, often in more detail than the help files. These are also the papers you should cite if you use the package for a manuscript. You can find these citations using the “citation()” function:

# package citations 

citation('car')   # citation for the 'car' package
## To cite the car package in publications use:
## 
##   Fox J, Weisberg S (2019). _An R Companion to Applied Regression_,
##   Third edition. Sage, Thousand Oaks CA.
##   <https://socialsciences.mcmaster.ca/jfox/Books/Companion/>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Book{,
##     title = {An {R} Companion to Applied Regression},
##     edition = {Third},
##     author = {John Fox and Sanford Weisberg},
##     year = {2019},
##     publisher = {Sage},
##     address = {Thousand Oaks {CA}},
##     url = {https://socialsciences.mcmaster.ca/jfox/Books/Companion/},
##   }
citation()    # and here's the citation for R in general- useful for when you use R for manuscripts
## To cite R in publications use:
## 
##   R Core Team (2023). _R: A Language and Environment for Statistical
##   Computing_. R Foundation for Statistical Computing, Vienna, Austria.
##   <https://www.R-project.org/>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {R: A Language and Environment for Statistical Computing},
##     author = {{R Core Team}},
##     organization = {R Foundation for Statistical Computing},
##     address = {Vienna, Austria},
##     year = {2023},
##     url = {https://www.R-project.org/},
##   }
## 
## We have invested a lot of time and effort in creating R, please cite it
## when using it for data analysis. See also 'citation("pkgname")' for
## citing R packages.
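
If you manage references with BibTeX, you may want just the BibTeX entry rather than the full printout. Here is a minimal sketch using toBibtex(); the file name in the commented line is only a placeholder.

```r
bib <- toBibtex(citation())    # BibTeX entry for R itself
print(bib)

# writeLines(bib, "R-citation.bib")   # uncomment to save the entry (placeholder file name)
```

The same approach should work for a package, e.g. toBibtex(citation('car')), as long as the package is installed.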

Online forums

In some cases you may run into issues with a particular package that you just can’t seem to solve. It’s still likely that others have had this problem! If you use some appropriate keywords in a Google search you may find a link to an online forum (like Stack Overflow) where someone has answered a similar question. It’s amazing how many knowledgeable people are answering questions on these forums. They are a tremendously valuable resource!

And of course, if you still can’t find an answer to your question, you can ask your question on one of these online forums. The best ones are probably Stack Overflow and Cross Validated.
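
Questions tend to get better answers when others can reproduce your problem. As a suggestion (not a forum requirement), you can include a tiny data set that anyone can paste into R, plus information about your R setup. The sketch below uses the built-in mtcars data as a stand-in for your own data.

```r
small <- head(mtcars[, c("mpg", "cyl")], 3)   # tiny slice of a built-in data set
dput(small)                                   # prints R code that recreates 'small'

sessionInfo()                                 # R version, operating system, attached packages
```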

Ask the package author!!

When all else fails, you can always reach out to the package author! The author’s name and contact info are listed in the package overview, and I have found they are generally very responsive and willing to help!!
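
You can also pull up the maintainer’s contact details directly from R. The sketch below uses the ‘stats’ package because it is always installed; substitute the name of the package you are asking about.

```r
m <- maintainer("stats")                   # maintainer name and contact address
m

packageDescription("stats")$BugReports     # issue tracker URL, or NULL when none is listed
```

When a package lists an issue tracker, that is often a better place to report a bug than email.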

‘Cheat sheets’

There are lots of “cheat sheets” available for reminding you of basic R functionality.

For example, here is a good cheat sheet for using base R.

See the Links page for more resources, including cheat sheets!

Google!

I don’t even understand why it works so well to google R help (‘r’ is just a single letter after all, how does a single letter help a search?), but it works pretty well!! With some basic programming knowledge, basic R know-how including how R packages work (and some cheat sheets by your desk), and the internet, you can pretty much always find a solution!

More helpful links for going further with R are provided on the Links page.

Any questions???
