Packages in R
Last updated on 2025-06-24 | Edit this page
Overview
Questions
- What is an R package?
- How can we use R packages
Objectives
- Explain how to use R packages
- Explain how to install R packages
- Understand the difference between installing and loading an R package
R Packages
Per default, R provides you with some basic functions like
sum()
, mean()
or t.test()
. These
functions can already accomplish a lot, but for more specialized
analyses or more user-friendly functions, you might want to use
additional functions.
If you are in need of a specific function to achieve your goal, you can either write it yourself (more on this later) or use functions written by other users. These functions are often collected in so-called “packages”. The official source for these packages on R is CRAN (the comprehensive R archive network).
Packages you may encounter
Packages make R really powerful. For 95% of your analysis-needs,
there probably exists a package designed specifically to hand this. For
example, some packages you might use often are tidyverse
for data cleaning, psych
for some psychology specific
functions, afex
for ANOVAs or lme4
for
multi-level models. You can even use R packages for more complicated
analyses like structural equation models (lavaan
) or
bayesian modeling (brms
). You can even write papers using R
using papaya
. Even this website was written using the
R-packages rmarkdown
and sandpaper
.
CRAN makes it really easy to use the over 7000 R packages other users
provide. You can install them using
install.packages("packagename")
with the name of the
package in quotation marks. This installs all functionalities of this
packages on your machine. However, this package is not automatically
available to you. Before using it in a script (or the console) you need
to tell R to “activate” this package. You can do this using
library(packagename)
. This avoids loading all installed
packages every time R is starting (which would take a while).
Using functions without loading a package
If you are only using a few functions from a certain package (maybe
even only once), you can avoid loading the entire package and only
specifically access that function using the ::
operator.
You can do this by typing packagename::function()
. If the
package is installed, it will allow you to use that function without
calling library(packagename)
first. This may also be useful
in cases where you want to allow the reader of your code to easily
understand what package you used for a certain function.
Demonstration
First, we need to install a package. This will often generate a lot of text in your console. This is nothing to worry about. In most cases, it is enough to look at the last few messages, they will tell you what went wrong or whether everything went right.
R
install.packages("dplyr")
OUTPUT
The following package(s) will be installed:
- dplyr [1.1.4]
These packages will be installed into "~/work/r-for-empra/r-for-empra/renv/profiles/lesson-requirements/renv/library/linux-ubuntu-jammy/R-4.5/x86_64-pc-linux-gnu".
# Installing packages --------------------------------------------------------
- Installing dplyr ... OK [linked from cache]
Successfully installed 1 package in 4.9 milliseconds.
Then, we will need to load the package to make its functions
available for use. For most packages, this will also print a lot of
messages in the console in the bottom left. Again, this is usually
harmless. If something does go wrong, you will see the word
Error:
along with a message somewhere.
Warnings:
can often be ignored in package installation.
R
# Loading the package dplyr
library(dplyr)
OUTPUT
Attaching package: 'dplyr'
OUTPUT
The following objects are masked from 'package:stats':
filter, lag
OUTPUT
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Now we can use all the functions that dplyr
provides.
Let’s start by using glimpse()
to get a quick glance at
some data. For this case, we are using the iris
data, that
comes with your default R installation.
R
# iris is a dataset provided as default in R
glimpse(iris)
OUTPUT
Rows: 150
Columns: 5
$ Sepal.Length <dbl> 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4, 4.…
$ Sepal.Width <dbl> 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.…
$ Petal.Length <dbl> 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5, 1.…
$ Petal.Width <dbl> 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.…
$ Species <fct> setosa, setosa, setosa, setosa, setosa, setosa, setosa, s…
R
# Using a function without loading the entire package
# dplyr::glimpse(iris)
Here, we can see that iris
has 150 rows (or
observations) and 5 columns (or variables). The first four variables are
size measurements regarding length and width of sepal and petal and the
fifth variable is a variable containing the species of flower.
Challenges
Challenge 1:
Install the following packages: dplyr
,
ggplot2
, and psych
.
Challenge 2:
Load the package dplyr
and get an overview of the data
mtcars
using glimpse()
.
Challenge 3:
Figure out what kind of data mtcars
contains. Make a
list of the columns in the dataset and what they might mean.
Hint
You are allowed to use Google (or other sources) for this. It is common practice to google information you don’t know or look online for code that might help.
Challenge 4:
Use the function describe()
from the package
psych
without loading it first.
What differences do you notice between glimpse()
and
describe()
?
Key Points
- R packages are “add-ons” to R, they provide useful new tools.
- Install a package using
install.packages("packagename")
. - Use a package using
library(packagename)
at the beginning of your script. - Use
::
to access specific functions from a package without loading it entirely.