Introduction


  • Follow the structured outline to learn R basics and data analysis
  • Submit weekly scripts following the specified format
  • Contact me via email or Slack with any questions

R Setup


  • RStudio is an integrated development environment (IDE) that helps you write R code
  • You can either write code directly in the console or use a script to organize and save your code
  • You can assign values to variables using <-
  • Be consistent in naming variables so that other people can read and understand your code
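
As a minimal sketch of assignment and consistent naming, the lines below use snake_case throughout. The variable names and values are made up for illustration.

```r
# Assign values to variables with <-
participant_age <- 27          # numeric value (hypothetical)
participant_name <- "Alex"     # character value (hypothetical)

# A consistent naming scheme (here: lowercase words joined by underscores)
# makes it easier for others to follow the computation
age_in_months <- participant_age * 12
print(age_in_months)
```

Typing these lines in the console runs them once; saving them in a script lets you rerun and share them later.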

Packages in R


  • R packages are “add-ons” to R that provide useful new tools.
  • Install a package once using install.packages("packagename").
  • Load a package using library(packagename) at the beginning of your script.
  • Use :: to access specific functions from a package without loading it entirely.
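
The three steps above can be sketched as follows. The example uses the tools package because it ships with R and needs no download; the install.packages() line is commented out and uses the placeholder name from the text.

```r
# Install once per machine (placeholder name, left commented out on purpose)
# install.packages("packagename")

# Load a package at the top of the script; tools ships with every R installation
library(tools)
extension <- file_ext("analysis.R")

# Alternatively, call a single function with :: without loading the package
same_extension <- tools::file_ext("analysis.R")
print(same_extension)
```

The :: form is handy when you need only one function, or when two loaded packages export functions with the same name.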

Vectors and variable types


  • Scripts facilitate reproducible research
  • Create vectors using c()
  • Numeric variables are used for computations, character variables often contain additional information
  • You can index vectors by using vector[index] to return or exclude specific indices
  • Use which() to find the indices of elements that meet a condition, then use those indices to filter the vector
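
A short sketch of creating, indexing, and filtering vectors. The vectors and their values are invented for illustration.

```r
# Create a numeric and a character vector with c()
scores <- c(12, 25, 7, 31, 18)
groups <- c("a", "b", "a", "b", "a")

# Indexing: return or exclude specific positions
second_score <- scores[2]          # the second element
first_and_third <- scores[c(1, 3)] # several positions at once
without_first <- scores[-1]        # a negative index excludes that position

# which() returns the positions where a condition is TRUE
high_index <- which(scores > 15)
high_scores <- scores[high_index]
print(high_scores)
```

Because which() returns positions rather than values, the same index can also be used on a second vector of equal length, for example groups[high_index].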

Projects


  • The working directory is where R looks for and saves files.
  • Absolute paths give full file locations; relative paths use the working directory.
  • R Projects manage the working directory automatically, keeping work organized.
  • Using R Projects makes code portable, reproducible, and easier to share.
  • A structured project with separate folders for data and scripts improves workflow.
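
The difference between absolute and relative paths can be sketched like this. The folder layout (a data folder inside the project) and the file name survey.csv are assumptions for illustration only.

```r
# Show the current working directory; inside an R Project this is the project folder
current_dir <- getwd()
print(current_dir)

# Build a relative path: it is resolved against the working directory,
# so the same line works on any computer that has the project folder
data_path <- file.path("data", "survey.csv")   # hypothetical file
print(data_path)

# Reading the file would then look like this (commented out: the file is hypothetical)
# survey <- read.csv(data_path)
```

An absolute path would instead spell out the full location on one particular machine, which breaks as soon as the project is moved or shared.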

Data Visualization (1)


  • Get an overview of the data by inspecting it or using glimpse() / describe()
  • Consult a codebook for more in-depth descriptions of the variables in the data
  • Visualize the distribution of a variable using hist()
  • Use ggplot(data = data, mapping = aes()) to provide the data and mapping to the plots
  • Add visualization steps like geom_point() or geom_smooth() using +
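
A minimal sketch of these steps, assuming the ggplot2 and dplyr packages are installed and using the built-in mtcars dataset as stand-in data (the lesson's own dataset is not shown here).

```r
library(ggplot2)

# Overview of the data: one line per column with its type and first values
dplyr::glimpse(mtcars)

# Distribution of a single variable with base R
hist(mtcars$mpg)

# Provide data and mapping first, then add layers with +
scatter_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point() +                 # one point per car
  geom_smooth(method = "lm")     # linear trend line (method chosen for illustration)
print(scatter_plot)
```

Each + adds one layer to the plot, so layers can be added or removed without touching the data and mapping.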

Data Visualization (2)


  • Get to know new data by inspecting it and computing key descriptive statistics
  • Visualize distributions of key variables in order to learn about factors that impact them
  • Visualize the distribution of a numeric variable across the levels of a categorical variable using geom_density()
  • Visualize the joint distribution of two categorical variables using geom_bar()
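
As a sketch of both plot types, the example below uses the mpg dataset that ships with ggplot2. The choice of variables (hwy, drv, class) is an assumption made for illustration.

```r
library(ggplot2)

# Numeric by categorical: one density curve of highway mileage per drive type
density_plot <- ggplot(data = mpg, mapping = aes(x = hwy, fill = drv)) +
  geom_density(alpha = 0.5)      # alpha makes overlapping curves visible
print(density_plot)

# Categorical by categorical: bars per vehicle class, split by drive type
bar_plot <- ggplot(data = mpg, mapping = aes(x = class, fill = drv)) +
  geom_bar()                     # counts rows in each combination
print(bar_plot)
```

In both plots the second variable is mapped to fill, which is one of several ways to show groups; color or facets would work as well.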

Filtering data


  • Use the pipe operator %>% to link multiple functions together
  • Use filter() to filter rows based on certain conditions
  • Use select() to keep only those columns that interest you
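
A minimal sketch that chains both functions with the pipe, assuming dplyr is installed and using the built-in mtcars dataset. The threshold of 25 is arbitrary.

```r
library(dplyr)

efficient_cars <- mtcars %>%
  filter(mpg > 25) %>%       # keep rows: only cars above 25 miles per gallon
  select(mpg, cyl, wt)       # keep columns: only these three variables

print(efficient_cars)
```

Read the pipe as "and then": take mtcars, and then filter the rows, and then select the columns.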

Creating new columns


  • Use mutate() to create and modify columns in a dataset.
  • Assign constant values, compute values from other columns, or use conditions to define new columns.
  • Use ifelse() for conditional column creation.
  • Compute row-wise sums and means efficiently using rowSums() and rowMeans().
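
The points above can be sketched on a small made-up questionnaire. The data frame, the column names, and the cutoff of 4 are all hypothetical; across() assumes dplyr 1.0 or later.

```r
library(dplyr)

# Hypothetical data: three respondents, three questionnaire items
survey <- data.frame(
  item1 = c(1, 4, 5),
  item2 = c(2, 4, 3),
  item3 = c(3, 5, 1)
)

survey <- survey %>%
  mutate(
    wave = 1,                                         # constant value
    item_sum = rowSums(across(item1:item3)),          # row-wise sum of the items
    item_mean = rowMeans(across(item1:item3)),        # row-wise mean of the items
    high_scorer = ifelse(item_mean >= 4, "yes", "no") # conditional column
  )

print(survey)
```

Columns created earlier inside one mutate() call can be used by later lines in the same call, which is why high_scorer can refer to item_mean.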

Count and Summarize


  • Use count() to compute the number of occurrences of each value in a column, or of each combination of values across several columns
  • Use summarize() to compute any summary statistics for your data
  • Use group_by() to group your data so you can receive summaries for each group separately
  • Combine functions like filter(), group_by() and summarize() using the pipe to receive specific results
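
A short sketch of counting and grouped summaries, assuming dplyr is installed and using the built-in mtcars dataset. The filter condition is arbitrary.

```r
library(dplyr)

# Number of cars for each combination of cylinders and gears
cyl_gear_counts <- mtcars %>%
  count(cyl, gear)
print(cyl_gear_counts)

# Filter, group, then summarize: one row of results per group
mpg_by_cyl <- mtcars %>%
  filter(wt < 4) %>%                 # lighter cars only (arbitrary cutoff)
  group_by(cyl) %>%
  summarize(
    mean_mpg = mean(mpg),            # average mileage within each group
    n_cars = n()                     # group size
  )
print(mpg_by_cyl)
```

Without group_by(), summarize() would return a single row for the whole dataset.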

Midterms


  • Apply what you have learned to new data!

t-Test


  • Something

Factor Analysis - Introduction & EFA


  • Something

Factor Analysis - CFA


  • Something

Factor Analysis - Advanced Measurement Models


  • Something