Loading Data
We won’t go through loading data into R much in this tutorial because we’ll be using the interactive interface. Loading data is sometimes one of the trickier steps for people who aren’t familiar with file paths. The linked page on file paths explains why you should keep your files organized in folders, what a file path is, and where to find it.
For more information on loading files into R, you can go to the loading data workshop that will walk you through how to add files on both the server and desktop versions of R.
Load Libraries Before Data
R comes with built-in functions, but you can access many, many more by loading libraries. Since R is open source, libraries can be written by anyone and can contain very specific functions. To load a library, you use the function library(), with the name of the library you want inside the parentheses. However, a library must be installed before you can load it, and to do that you use the function install.packages(), with the name in quotes.
Since you are using the server version of R, you should not need to install most packages; you can just load them. If you’re using your own version of R, or want to use something more esoteric, you may need to install packages first.
Think of installing packages like buying a book and loading libraries like getting the book off the shelf. You only need to buy it once (install.packages()), but every time you want to use it, you need to get it off the shelf (library()).
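The words “package” and “library” are often used interchangeably. Strictly speaking, a package is the bundle of functions you install and a library is the folder where installed packages are stored, but for this workshop you can treat them as the same thing.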
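As a minimal sketch of the buy-once, fetch-every-time idea, the code below installs {tidyverse} only if it is missing and then loads it. The requireNamespace() check is an addition for illustration, not something this workshop requires, and on the server version the install step should simply be skipped.

```r
# "Buy the book": install the package only if it is not already installed.
# requireNamespace() returns FALSE when the package cannot be found.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# "Get the book off the shelf": this line is needed in every new R session.
library(tidyverse)
```

Note that install.packages() takes the package name in quotes, while library() accepts it with or without them.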
Reading in a File
Now that you have uploaded your file, you can read it in using a file path. A file path is a line of code that says exactly where your file is. If you have a Workshop folder in your Home folder, its file path is ~/Workshop. If you wanted to look at a specific file called “Example.csv” in that folder, the file path to that file would be ~/Workshop/Example.csv.
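As a rough illustration of how that path is put together, base R's file.path() joins a folder and a file name, and file.exists() reports whether anything is actually at that location. The folder and file are the illustrative ones from the paragraph above, and the object names workshop_dir and example_path are hypothetical.

```r
# Folder from the example above; "~" is shorthand for your Home folder.
workshop_dir <- "~/Workshop"

# Join the folder and the file name into one file path.
example_path <- file.path(workshop_dir, "Example.csv")
example_path

# Show the full path with "~" replaced by the actual Home folder.
path.expand(example_path)

# TRUE if the file is really there, FALSE if the path or name is wrong.
file.exists(example_path)
```

If file.exists() returns FALSE, the usual culprits are a misspelled folder, different capitalization, or a missing file extension.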
If you are confused by file paths or you’re using the desktop version of R and not the server, you can find more info on file paths here.
To load a CSV file (a comma-separated values file) you use the command read_csv(), which comes from the {readr} package and is loaded as part of {tidyverse}. To load an Excel file you need to load an additional library, {readxl}, and use the command read_excel(). CSV files are a good way to store data because they are not platform specific (not branded by Microsoft or another company) and they don’t include hidden formatting.
To load our data, we assign it to an object. You can call the object raw_data or whatever you’d like, as long as the name is informative. Note that when you reference the file, the name must match exactly: it is case sensitive and requires the file extension (the bit after the .). A bare file name like "Example.csv" is looked up in your current working directory; if the file is somewhere else, use its full file path.
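For the Excel case, here is a minimal sketch using the {readxl} package, which is where read_excel() lives. To keep it runnable without a file of your own, it reads a small example workbook that ships with {readxl}; the object names example_workbook and excel_data are hypothetical, and with your own data you would pass your own file path instead.

```r
# read_excel() is not loaded by library(tidyverse), so load {readxl} separately.
library(readxl)

# Path to an example workbook bundled with {readxl} (stand-in for your own file).
example_workbook <- readxl_example("datasets.xlsx")

# An Excel file can hold several sheets; list their names first.
excel_sheets(example_workbook)

# Read the first sheet into an object.
excel_data <- read_excel(example_workbook, sheet = 1)
head(excel_data)
```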
# load library
library(tidyverse)
# read in data
raw_data <- read_csv("Example.csv")
Workshop Setup
For this workshop, all the libraries we use are pre-loaded for you. However, if you want to run the code in this workshop in your own version of R, you can copy, paste, and run the following code:
# load library
library(tidyverse)
library(palmerpenguins)
# this line is just to make the object 'penguins' appear in the Environment
penguins <- penguins
Now you’ll be set up the same way as this document.
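One quick way to confirm the setup worked is to peek at the penguins object. This is only a suggested check, assuming {tidyverse} and {palmerpenguins} are installed in your version of R.

```r
library(tidyverse)
library(palmerpenguins)

# Make the object 'penguins' appear in the Environment, as in the setup code.
penguins <- penguins

# Show each column with its type and first few values.
glimpse(penguins)

# Number of rows and columns.
dim(penguins)
```

If glimpse() prints a list of columns such as species and island, the libraries and the data loaded correctly.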