ECON 314 · Reference Guide

R Guide

Kyler Patrick, 2026.

Sections

01

R Markdown

Create .rmd files, insert code chunks, format text with LaTeX

02

Coding Tips

Data types, clean code practices, naming objects & variables

03

Packages

Installing and loading packages; recommended starter packages

04

Loading Data

Load .RData and .csv files; clean data, remove NAs, create dummies

05

Summary Tables

summary(), stargazer, and export_summs for professional tables

06

Regression Models

Linear models with lm(), logarithmic models, robust standard errors

07

Visualizations

Base R plots, histograms, correlation, and intro to ggplot2

A. Creating an R Markdown File (.rmd)

Open RStudio. Click File → New File → R Markdown…
Name the file and select an output type: HTML, PDF, or Word. Word suits most users because the result is easy to edit and convert. HTML needs neither LaTeX nor Microsoft Office. PDF requires a LaTeX installation, which can be difficult to set up. You can change the output type later by editing the output: field at the top of the file.
Every new R Markdown file starts with a setup chunk — do not delete it:

R

{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

⚠ Important

Delete any other auto-generated example code when you open a new file, but always keep the setup chunk.

Save the file: File → Save As… and choose your location.
R does not auto-save. Save frequently with File → Save (or Ctrl+S), especially after writing major code chunks.

B. Inserting a Code Chunk

The quickest way to insert a code chunk is with a keyboard shortcut:

OS	Shortcut
Windows	Ctrl + Alt + I
Mac	Cmd + Option + I

Alternatively, you can manually type the chunk delimiters:

R Markdown

```{r}
print("Example Code Chunk")
```

Output

## [1] "Example Code Chunk"

C. Text Formatting in R Markdown

Emphasis

Text outside of code chunks is treated as plain text. Use asterisks for emphasis:

Syntax	Result
*italics*	italics
**bold**	bold

Math & Equations (LaTeX Syntax)

Use dollar signs to write math inline or as a displayed equation:

Syntax	Use	Example
$...$	Inline math	Write beta: $\beta$
$$...$$	Displayed equation	$$\frac{1}{2} + \frac{1}{2} = 1$$
x^{n}	Superscript	$2^{2} = 4$
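
As a further illustration of the same syntax, a regression equation like the ones estimated later in this guide could be typed as a displayed equation. This is only a sketch: the variable names follow the wage1 examples, while the error term u and the \text{} wrappers are assumptions about notation that the guide itself does not specify.

```latex
$$\text{wage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + u$$
```

Subscripts use an underscore in the same way that superscripts use a caret, so \beta_1 renders as beta with a subscript 1.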

💡 Tip

To type a literal dollar sign in markdown text (not math), escape it with a backslash: \$. Do not escape it inside code chunks, where $ is the R operator for selecting a column.

A. Types of Data

Type	Description	Example
Character	Text / string data. Created with "" or ''. Any non-numeric entry in loaded data becomes a character.	"hello"
Integer	Whole numbers, written with an L suffix. Can be converted to/from character or numeric.	5L
Numeric	Real numbers, with or without a decimal part; R's default type for numbers. Can be converted to/from character or integer.	3.14
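
The types in the table can be inspected and converted with base R functions. The following is a minimal sketch using made-up values (TextValue, NumValue, IntValue, and BackToText are illustrative names, not objects from the course data):

```r
# Inspect a value's type with class()
TextValue <- "42"                    # character, because of the quotes
class(TextValue)

# Convert between types with the as.*() functions
NumValue   <- as.numeric(TextValue)  # character -> numeric
IntValue   <- as.integer(3.9)        # numeric -> integer (drops the decimal part)
BackToText <- as.character(5L)       # integer -> character

class(NumValue)
class(IntValue)
class(BackToText)
```

Checking class() right after loading data is a quick way to spot a numeric column that was read in as character.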

B. Clean Coding Practices

1. Use Spacing

Spaces between lines and between operators improve readability and make debugging easier.

R — Good

ExampleDF <- as.data.frame(ExampleData)

GuideLm1 <- lm(Var1 ~ Var2 + Var3, data = ExampleDF)

R — Avoid this

# Harder to read and debug:
ExampleDF<-as.data.frame(ExampleData)
GuideLm1<-lm(Var1~Var2+Var3,data=ExampleDF)

2. Sensible Variable Names

Since variable names can't contain spaces, use one consistent substitution method — either underscores or CamelCase — and never mix them:

R

# CamelCase:
ExampleVariable <- ExampleData$Variable1

# Underscores:
Example_Variable <- ExampleData$Variable1

3. Abbreviating Long Names

If a variable name is too long, remove vowels — but do it consistently for everything:

R

ExmplVrbl <- ExampleVariable

4. Naming Models

Pick a consistent prefix for model names (lm, reg, or model) and number them sequentially:

R

lm1 <- lm(data$y ~ data$x1 + data$x2)
lm2 <- lm(data$y ~ data$x3 + data$x4)

5. Comments

Use # to add comments or prevent code from running. To comment/uncomment multiple lines at once, highlight them and press Ctrl+Shift+C (Cmd+Shift+C on Mac).

R

# This is a comment — it won't run
# summary(data)  <-- this line is "commented out"
summary(data)  # runs normally

C. Saving Files

Create one main folder for the course, then subfolders for data files and markdown files. If you're using a lab desktop, use a flash drive so you can access files on any machine.

D. Naming & Accessing Objects

Creating named objects

Assign any object (data frames, models, variables, tables) using <-:

R

Object1 <- summary(ExampleData$x1)

Selecting a specific variable

Use the $ operator to pull a specific column from a data frame:

R

ExampleData$x1
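
A short sketch of the $ operator in use, on a toy data frame with hypothetical values (ToyData and its columns are made up for illustration):

```r
# Toy data frame (hypothetical values)
ToyData <- data.frame(x1 = c(2, 4, 6), x2 = c("a", "b", "c"))

ToyData$x1                      # pull one column out as a vector
mean(ToyData$x1)                # pass it to any function
ToyData$x1Sq <- ToyData$x1^2    # $ on the left side creates a new column
names(ToyData)
```

The same pattern is what later sections rely on when they create log variables and dummies as new columns.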

Installing Packages

Use install.packages() to install a package. The name must be spelled correctly and surrounded by quotes. Only install each package once.

R

install.packages("stargazer")
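
Because each package only needs to be installed once, it can help to check what is already on the machine first. The sketch below is one possible way to do that with base R; the vector name Wanted and the idea of installing in bulk are assumptions, not part of the course instructions.

```r
# Packages we want available (the starter packages recommended in this guide)
Wanted <- c("stargazer", "jtools", "lmtest", "sandwich", "tidyverse")

# Compare against what is already installed on this machine
Missing <- setdiff(Wanted, rownames(installed.packages()))
Missing

# Uncomment to install only the missing ones:
# if (length(Missing) > 0) install.packages(Missing)
```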

💡 Recommended Starter Packages for ECON 314

Package	Purpose
stargazer	Publication-quality summary and model tables
jtools	Model tables with robust SE support
lmtest	Hypothesis tests for regression models
sandwich	Robust standard error estimation
tidyverse	Includes ggplot2, dplyr, readr, tidyr, purrr, tibble, stringr, forcats, lubridate

Loading Packages

Use library() to load a package at the start of every new R session; installing a package does not load it. Quotes around the package name are not needed.

R

library(stargazer)

Output

## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

💡 Tip for Beginners

While still learning R, avoid putting all your library() calls at the top of the document. Instead, load each package right before you need it — this helps you learn what each package actually does.

📦 Example Dataset

The examples below use the wage1 dataset from the wooldridge package. Run this to follow along:

R

# install.packages("wooldridge")  # run once, then comment out
library(wooldridge)
data("wage1")
ExampleData <- wage1
View(ExampleData)

A. Loading a .RData File

Copy the file path. On Windows: click the file then Ctrl+Shift+C, or right-click → "Copy as path". On Mac: Control-click → hold Option → "Copy as Pathname".
Switch any backslashes (\) to forward slashes (/) in the path. Then use load() and View():

R

# load("C:/Users/kkpat/Desktop/Econometrics/RData/attend.Rdata")
# ExampleData1 <- attend
# View(ExampleData1)

B. Loading a .csv File

Convert Excel files (.xlsx/.xls) to .csv by saving as CSV in Excel first.
Load tidyverse or readr, then use read_csv(). The result is a tibble — a modernized data frame:

R

library(tidyverse)
ExampleData2 <- read_csv("C:/Users/kkpat/Desktop/STAT320/Data/SelectedVars.csv")
View(ExampleData2)
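
To try the CSV workflow without a file of your own, the sketch below writes a tiny CSV to a temporary location and reads it back. It deliberately swaps in base R's write.csv() and read.csv() instead of readr's read_csv() so that it runs without extra packages; the data values and the names CsvPath, ToyData, and ToyRead are hypothetical.

```r
# Write a tiny CSV to a temporary file so the example is self-contained
CsvPath <- tempfile(fileext = ".csv")
ToyData <- data.frame(wage = c(3.10, 5.25, NA), educ = c(11, 12, 16))
write.csv(ToyData, CsvPath, row.names = FALSE)

# Read it back; cells containing NA are read in as missing values
ToyRead <- read.csv(CsvPath)
str(ToyRead)      # check column types right after loading
nrow(ToyRead)
```

Unlike read_csv(), read.csv() returns an ordinary data frame, not a tibble.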

To save the tibble as an .RData file (it is written to the working directory, which is the folder containing your R Markdown file when you knit):

R

save(ExampleData2, file = "ExampleData2.RData")
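
A round trip through save() and load() shows that an .RData file stores the object together with its name, which is why a loaded dataset appears under its original name and not one you choose. This is a sketch with a hypothetical toy object written to a temporary folder:

```r
ToyData <- data.frame(x = 1:3)
RDataPath <- file.path(tempdir(), "ToyData.RData")

save(ToyData, file = RDataPath)   # store the object and its name
rm(ToyData)                       # remove it from the workspace
Restored <- load(RDataPath)       # recreates ToyData and returns its name
Restored
nrow(ToyData)
```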

C. Cleaning Data

Remove NA values

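Use na.omit() on the full dataset to drop every row that contains an NA, or on a single variable to get that variable without its NAs:

R

ExampleDataNoNA <- na.omit(ExampleData)

# Store the result as its own vector: it can be shorter than the data frame,
# so assigning it back as a column fails whenever an NA is removed
WageNoNA <- na.omit(ExampleData$wage)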
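
The difference between dropping every incomplete row and dropping rows based on one variable is easier to see on a small example. This sketch uses a hypothetical four-row data frame; the is.na() row filter is one common base R approach and is not the only option.

```r
ToyData <- data.frame(wage = c(3.1, NA, 6.0, 4.5),
                      educ = c(12, 10, NA, 16))

colSums(is.na(ToyData))                        # count NAs per column first
NoNA     <- na.omit(ToyData)                   # drops rows 2 and 3
WageOnly <- ToyData[!is.na(ToyData$wage), ]    # drops only the row where wage is NA

nrow(NoNA)       # 2
nrow(WageOnly)   # 3
```

Counting NAs before cleaning shows how many observations each approach will cost.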

Create a subset of variables

Use subset() to select specific columns from a large dataset:

R

ExampleDataSubset <- subset(ExampleData, select = c(wage, educ, exper))
View(ExampleDataSubset)

Create a numeric dummy variable

Use ifelse() to convert a categorical variable into a binary (0/1) dummy:

R

# x1 stands in for a categorical column in your own data
# Creates a dummy = 1 if "old", = 0 otherwise
ExampleData$x1Dummy <- ifelse(ExampleData$x1 == "old", 1, 0)
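
A runnable version of the same idea, on a hypothetical toy column, also shows what happens to missing values: ifelse() returns NA wherever the comparison itself is NA.

```r
ToyData <- data.frame(condition = c("old", "new", "old", NA))

ToyData$OldDummy <- ifelse(ToyData$condition == "old", 1, 0)
ToyData$OldDummy                              # 1 0 1 NA
table(ToyData$condition, ToyData$OldDummy)    # cross-check the coding
```

The table() call is a simple sanity check that each category landed on the intended 0 or 1.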

A. The summary() Command

For data — five number summary + mean

R

summary(ExampleData$wage)

Output

##  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.530   3.330   4.650   5.896   6.880  24.980

For models — full regression summary

R

Model1 <- lm(wage1$wage ~ wage1$educ + wage1$exper)
summary(Model1)

Output

## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -3.39054    0.76657  -4.423 1.18e-05 ***
## wage1$educ    0.64427    0.05381  11.974  < 2e-16 ***
## wage1$exper   0.07010    0.01098   6.385 3.78e-10 ***
##
## Multiple R-squared: 0.2252,  Adjusted R-squared: 0.2222
## F-statistic: 75.99 on 2 and 523 DF,  p-value: < 2.2e-16

B. Stargazer

Stargazer produces publication-quality tables for both data and models. Use type = "text" for working in R; switch to type = "html" with an out = "Table.htm" argument for final papers.

Data summary table

R

library(stargazer)
stargazer(as.data.frame(ExampleData), type = "text", title = "Data Summary")

Model summary table

R

stargazer(Model1, type = "text", title = "Model Summary")

Multiple models side by side

R

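# Model2 is an illustrative second model; define your own before comparing
Model2 <- lm(wage1$wage ~ wage1$educ + wage1$exper + wage1$female)
stargazer(Model1, Model2, type = "html", out = "ModelTable1.htm",
          title = "Model 1 and 2 Summary")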

Robust standard errors comparison

R

library(stargazer)
library(sandwich)

RobustSE <- sqrt(diag(vcovHC(Model1, type = "HC1")))

stargazer(Model1, Model1,
          se = list(NULL, RobustSE),
          type = "text",
          title = "Without and With Robust Standard Errors",
          notes = "Robust standard errors on the right",
          notes.append = TRUE)
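
To see what the robust standard errors are doing, the HC1 estimator can be written out by hand in base R. The sketch below assumes the standard definition of HC1 (the heteroskedasticity-robust "sandwich" variance scaled by n / (n - k)) and uses the built-in mtcars data as a stand-in for the course data; the object names are hypothetical.

```r
# Fit an OLS model on a built-in dataset (stand-in for the course data)
Fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(Fit)     # regressors, including the intercept column
e <- residuals(Fit)        # OLS residuals
n <- nrow(X)               # number of observations
k <- ncol(X)               # number of estimated coefficients

Bread <- solve(crossprod(X))   # (X'X)^-1
Meat  <- crossprod(X * e)      # X' diag(e^2) X

# HC1: sandwich estimator with a degrees-of-freedom correction
VcovHC1 <- (n / (n - k)) * Bread %*% Meat %*% Bread
RobustSEByHand <- sqrt(diag(VcovHC1))

# Compare with the usual (homoskedastic) standard errors
cbind(Usual = sqrt(diag(vcov(Fit))), Robust = RobustSEByHand)
```

In practice vcovHC() does this work for you; the hand calculation only illustrates where the numbers come from.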

C. export_summs (jtools)

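A simpler alternative that outputs text-style model tables. export_summs() relies on the huxtable package, so install it once with install.packages("huxtable") before use.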

Basic model table

R

library(jtools)
export_summs(Model1)

Standard SE vs Robust SE comparison

R

library(jtools)
Model1Robust <- summ(Model1, robust = TRUE)

export_summs(Model1, Model1Robust,
             model.names = c("Standard SE", "Robust SE"))

A. Linear Models — lm()

Use lm() to fit an Ordinary Least Squares regression. Two equivalent formats:

R — Format 1 ($ notation)

Reg1 <- lm(ExampleData$wage ~ ExampleData$educ
                              + ExampleData$exper
                              + ExampleData$female)

R — Format 2 (data = argument)

Reg1 <- lm(wage ~ educ + exper + female, data = ExampleData)
# Variable names must exactly match the dataset columns
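
The equivalence of the two formats can be checked directly. This sketch uses the built-in mtcars data so it runs without the course dataset (FitDollar and FitData are hypothetical names); it also shows one practical reason to prefer the data = form, namely that predict() can then be given new observations by column name.

```r
FitDollar <- lm(mtcars$mpg ~ mtcars$wt + mtcars$hp)    # Format 1
FitData   <- lm(mpg ~ wt + hp, data = mtcars)          # Format 2

# Same estimates; only the coefficient labels differ
all.equal(unname(coef(FitDollar)), unname(coef(FitData)))

# With data =, predictions for new observations work by column name
NewCar <- data.frame(wt = 3, hp = 110)
Predicted <- predict(FitData, newdata = NewCar)
Predicted
```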

💡 Tip

The tilde ~ (meaning "is modeled by") is typed with Shift + ` (the key left of 1).

summary(Reg1) Output

## Coefficients:
##                    Estimate Std. Error t value  Pr(>|t|)
## (Intercept)       -1.73448    0.75362  -2.302   0.0218 *
## ExampleData$educ   0.60258    0.05112  11.788  < 2e-16 ***
## ExampleData$exper  0.06424    0.01040   6.177 1.32e-09 ***
## ExampleData$female -2.15552   0.27031  -7.974 9.74e-15 ***
##
## Multiple R-squared: 0.3093,  Adjusted R-squared: 0.3053

B. Logarithmic Models

To estimate a log-linear model, first create the log of the variable as a new column, then run lm() as usual.

R

# Step 1: Create the log variable
ExampleData$Logwage <- log(ExampleData$wage)

# Step 2: Run the regression on the log outcome
Reg2 <- lm(ExampleData$Logwage ~ ExampleData$educ
                                  + ExampleData$exper
                                  + ExampleData$female)
summary(Reg2)

Output

## Coefficients:
##                     Estimate Std. Error t value  Pr(>|t|)
## (Intercept)         0.480836   0.105016   4.579 5.86e-06 ***
## ExampleData$educ    0.091290   0.007123  12.816  < 2e-16 ***
## ExampleData$exper   0.009414   0.001449   6.496 1.93e-10 ***
## ExampleData$female -0.343597   0.037667  -9.122  < 2e-16 ***
##
## Multiple R-squared: 0.3526,  Adjusted R-squared: 0.3488
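
Coefficients in a log-linear model are usually read as approximate percent changes in the outcome. The sketch below applies the standard conversion 100 * (exp(beta) - 1) to two coefficients copied from the Reg2 output above; the object names are hypothetical, and the interpretation assumes the usual ceteris paribus reading.

```r
# Coefficients copied from the Reg2 output above
BetaEduc   <- 0.091290
BetaFemale <- -0.343597

# Approximate percent effect: 100 * beta
100 * BetaEduc                               # about 9.1

# Exact percent effect: 100 * (exp(beta) - 1)
PctEduc   <- 100 * (exp(BetaEduc) - 1)       # about 9.6
PctFemale <- 100 * (exp(BetaFemale) - 1)     # about -29.1
round(c(PctEduc, PctFemale), 1)
```

The approximation 100 * beta is close for small coefficients such as educ but noticeably off for larger ones such as female.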

A. Simple Base R Plots

Scatter plot

Use plot(x, y) — list the x variable first, then y:

R

plot(ExampleData$exper, ExampleData$wage)

Correlation

Calculate the correlation coefficient between two variables with cor():

R

cor(ExampleData$exper, ExampleData$wage)

Output

## [1] 0.1129034
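
The correlation coefficient is the covariance divided by the product of the two standard deviations, and in a regression with a single explanatory variable its square equals R-squared. Both facts can be verified in a few lines; this sketch uses the built-in mtcars data as a stand-in, and the object names are hypothetical.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Correlation by hand: cov(x, y) / (sd(x) * sd(y))
ByHand <- cov(x, y) / (sd(x) * sd(y))
all.equal(ByHand, cor(x, y))

# In a simple regression, R-squared is the squared correlation
SimpleFit <- lm(y ~ x)
all.equal(summary(SimpleFit)$r.squared, cor(x, y)^2)
```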

Histogram

Use hist() with a single variable to see its distribution:

R

hist(ExampleData$wage)

B. ggplot2

The ggplot2 package (part of the tidyverse) is far more powerful and flexible than base R plotting for creating publication-quality visualizations. It supports scatter plots with best-fit lines, histograms, violin plots, box-and-whisker plots, line charts, pie charts, and much more.

📖 Resource

The official ggplot2 documentation and cheat sheet are available at: https://ggplot2.tidyverse.org/

Quick ggplot2 starter example

R

library(ggplot2)

# Scatter plot with a best-fit line
ggplot(ExampleData, aes(x = educ, y = wage)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(title = "Wage vs Education",
       x = "Years of Education",
       y = "Hourly Wage ($)") +
  theme_minimal()
