R - Beginners Workshop: Key Points

Pre-Alpha

Introduction

Follow the structured outline to learn R basics and data analysis
Submit weekly scripts following the specified format
Contact me via mail or Slack for any queries

R Setup

RStudio is an integrated development environment (IDE) that helps you write R code
You can either write code directly in the console, or use a script to organize and save your code
You can assign variables using <-
Name variables consistently so that other people can read and understand your code
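
As a minimal sketch of assignment and consistent naming, the lines below could live in a script. The variable names and values are made up for illustration and are not taken from the workshop material.

```r
# Assign values to variables with <- (names and values are hypothetical)
participant_name <- "Alex"
participant_age <- 27

# Stored values can be reused in later computations
participant_age_next_year <- participant_age + 1

print(participant_name)
print(participant_age_next_year)
```

All names here use the same lowercase_with_underscores style, which is one common convention among several.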

Packages in R

R packages are “add-ons” to R that provide useful new tools.
Install a package using install.packages("packagename").
Load a package using library(packagename) at the beginning of your script.
Use packagename::function() to call a specific function without attaching the whole package.
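
The three steps above can be sketched as follows. The example uses the tools package only because it ships with R and needs no download; the package name in the install line and the file name "scores.csv" are placeholders chosen for illustration.

```r
# Install once per computer (commented out so the script does not
# reinstall the package on every run; "dplyr" is just an example name)
# install.packages("dplyr")

# Attach a package at the top of the script; "tools" ships with R
library(tools)
ext_attached <- file_ext("scores.csv")

# Alternatively, call a single function with :: without attaching the package
ext_direct <- tools::file_ext("scores.csv")

print(ext_attached)
print(ext_direct)
```

Both calls return the same result; the :: form also makes it explicit which package a function comes from.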

Vectors and variable types

Scripts facilitate reproducible research
Create vectors using c()
Numeric variables are used for computations; character variables often contain additional information
You can index vectors by using vector[index] to return specific elements, or vector[-index] to exclude them
Use which() to find the positions of elements that meet a condition, then use those positions to filter the vector
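
A short sketch of these points, using two made-up vectors (the names and values are assumptions for illustration only):

```r
# A numeric vector for computations and a character vector with extra information
test_scores <- c(12, 18, 7, 22, 15)
student_names <- c("Ana", "Ben", "Cem", "Dua", "Eli")

# Indexing: positive indices return elements, negative indices exclude them
second_score <- test_scores[2]
without_first <- test_scores[-1]

# which() returns the positions where the condition is TRUE
high_positions <- which(test_scores > 14)

# Those positions can then be used to filter either vector
high_scores <- test_scores[high_positions]
high_scorers <- student_names[high_positions]

print(high_positions)
print(high_scorers)
```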

Projects

The working directory is where R looks for and saves files.
Absolute paths give full file locations; relative paths use the working directory.
R Projects manage the working directory automatically, keeping work organized.
Using R Projects makes code portable, reproducible, and easier to share.
A structured project with separate folders for data and scripts improves workflow.
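
To illustrate the difference between relative and absolute paths, here is a minimal sketch. The folder name "data" and the file name "survey.csv" are hypothetical and only stand in for whatever a project actually contains.

```r
# Where R currently looks for and saves files
working_dir <- getwd()

# Relative path: interpreted starting from the working directory
relative_path <- file.path("data", "survey.csv")

# Absolute path: the full location, which differs from computer to computer
absolute_path <- file.path(working_dir, relative_path)

print(relative_path)
print(absolute_path)

# Inside an R Project, a script could then read the file portably with
# read.csv(relative_path)   # commented out because the file is hypothetical
```

Because the relative path does not mention any user-specific folders, the same script would work unchanged on another computer as long as the project folder keeps its structure.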

Data Visualization (1)

Get an overview of the data by inspecting it or using glimpse() / describe()
Consult a codebook for more in-depth descriptions of the variables in the data
Visualize the distribution of a variable using hist()
Use ggplot(data = data, mapping = aes()) to provide the data and mapping to the plots
Add visualization steps like geom_point() or geom_smooth() using +
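
The plotting steps above might look like the sketch below. It assumes the ggplot2 package is installed and uses R's built-in mtcars data instead of the workshop data; the choice of columns and of method = "lm" for the smoothing line are illustrative assumptions.

```r
library(ggplot2)

# Base R histogram of a single variable
hist(mtcars$mpg)

# Provide the data and the mapping, then add layers with +
weight_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")  # straight-line fit; other methods are possible

print(weight_plot)
```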

Data Visualization (2)

Get to know new data by inspecting it and computing key descriptive statistics
Visualize distributions of key variables in order to learn about factors that impact them
Visualize the distribution of a numeric variable across the levels of a categorical variable using geom_density()
Visualize the joint distribution of two categorical variables using geom_bar()
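
As a hedged sketch of the two plot types, the example below again uses the built-in mtcars data rather than the workshop data and assumes ggplot2 is installed. Converting cyl and am to factors, and the alpha and position settings, are choices made for this illustration.

```r
library(ggplot2)

# Treat number of cylinders and transmission type as categorical variables
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am, labels = c("automatic", "manual"))

# Numeric variable (mpg) split by a categorical variable (cyl)
density_plot <- ggplot(data = cars, mapping = aes(x = mpg, fill = cyl)) +
  geom_density(alpha = 0.5)  # semi-transparent so overlapping curves stay visible

# Two categorical variables: counts of cyl, split by transmission type
bar_plot <- ggplot(data = cars, mapping = aes(x = cyl, fill = am)) +
  geom_bar(position = "dodge")  # bars side by side instead of stacked

print(density_plot)
print(bar_plot)
```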

Filtering data

Use the pipe operator %>% to link multiple functions together
Use filter() to filter rows based on certain conditions
Use select() to keep only those columns that interest you
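
A minimal sketch of a pipeline that filters rows and then selects columns, assuming the dplyr package is installed. The built-in mtcars data and the threshold of 25 are stand-ins chosen for illustration.

```r
library(dplyr)

efficient_cars <- mtcars %>%
  filter(mpg > 25) %>%    # keep only the rows that meet the condition
  select(mpg, cyl, wt)    # keep only the columns of interest

print(efficient_cars)
```

Reading the pipe as "and then" often helps: take the data, and then filter it, and then select from the result.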

Creating new columns

Use mutate() to create and modify columns in a dataset.
Assign constant values, compute values from other columns, or use conditions to define new columns.
Use ifelse() for conditional column creation.
Compute row-wise sums and means efficiently using rowSums() and rowMeans().
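
The different ways of defining columns can be sketched in one mutate() call. The data frame, its item columns, and the cutoff of 3 are invented for this example, and dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical questionnaire data with three items per person
survey <- data.frame(
  item1 = c(1, 4, 5),
  item2 = c(2, 4, 3),
  item3 = c(3, 5, 1)
)

survey <- survey %>%
  mutate(
    wave = 1,                                          # constant value
    total_score = rowSums(cbind(item1, item2, item3)), # row-wise sum
    mean_score = rowMeans(cbind(item1, item2, item3)), # row-wise mean
    score_level = ifelse(mean_score >= 3, "high", "low") # conditional column
  )

print(survey)
```

Here cbind() simply collects the item columns into a matrix so that rowSums() and rowMeans() can work across them; other ways of selecting the columns would serve equally well.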

Count and Summarize

Use count() to compute the number of occurrences of each value (or combination of values) in one or more columns
Use summarize() to compute any summary statistics for your data
Use group_by() to group your data so you can receive summaries for each group separately
Combine functions like filter(), group_by() and summarize() using the pipe to receive specific results
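
As an illustrative sketch of counting and grouped summaries, the example below uses the built-in mtcars data and assumes dplyr is installed. The columns and the horsepower threshold are arbitrary choices, not part of the workshop material.

```r
library(dplyr)

# Number of cars for each combination of cylinders and transmission type
combination_counts <- mtcars %>%
  count(cyl, am)

# Filter first, then summarize separately for each group
mpg_by_cylinders <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarize(
    mean_mpg = mean(mpg),
    n_cars = n()
  )

print(combination_counts)
print(mpg_by_cylinders)
```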

Midterms

Apply what you have learned to new data!

t-Test

Something

Factor Analysis - Introduction & EFA

Something

Factor Analysis - CFA

Something

Factor Analysis - Advanced Measurement Models

Something