Introduction to R

Earo Wang
earo.wang@gmail.com

Consultation

Thursday 9:30 to 10:30 at W1105

What is R?

R is a programming language designed for data analysis.

- It works with objects and functions.

- Its basic functions are similar to what you can do in Excel.

- But it also has almost every statistical and data analytics tool available.

- It makes it easy for people to create their own functions and share them with the world.

- It is free, available on all major operating systems, and growing in popularity in business.

- It can be hard to learn at first, but it gets easier the more you use it.

Objects in R

Objects allow things to be created and stored.

They can be variables,

x <- 5
x + 1
## [1] 6

or they can be vectors, matrices, and entire datasets.

y <- c(1, 4, 2, 6, 5, 0)
y
## [1] 1 4 2 6 5 0
z <- matrix(y, nrow = 2, ncol = 3)
z
##      [,1] [,2] [,3]
## [1,]    1    2    5
## [2,]    4    6    0

Almost anything in R can be stored as an object!
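
As a small illustration of that last point, the sketch below stores a data frame, a list, and a function as objects. The names `scores`, `info`, and `twice`, and the values inside them, are made up for this example.

```r
# A data frame: a table whose columns can have different types
# (illustrative values, not real data)
scores <- data.frame(name = c("Ann", "Bob"), mark = c(82, 75))

# A list: a container that can hold objects of any kind
info <- list(course = "Intro to R", n_students = nrow(scores))

# A function is an object too, and can be given a name
twice <- function(v) 2 * v

class(scores)        # "data.frame"
class(info)          # "list"
class(twice)         # "function"
twice(scores$mark)   # 164 150
```

`class()` reports what kind of object a name refers to, which is handy when you are unsure what a piece of code has created.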

Functions in R

A function is a set of instructions that turns an input into an output.

myMean <- function(x) {
  N <- length(x)       # number of values
  avg <- sum(x) / N    # total divided by the count
  return(avg)
}
x <- c(4, 1, 6, -3, 7, 1)
myMean(x)
## [1] 2.666667

It's easy to give them multiple inputs.

addAndDouble <- function(x, y) {
  # the value of the last expression is returned automatically
  2 * (x + y)
}
addAndDouble(5, 2)
## [1] 14
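
Inputs can also have default values and can be passed by name. The sketch below uses a hypothetical variant, `addAndScale`, whose extra argument `times` is an assumption made for illustration and is not part of the original slides.

```r
# Hypothetical variant of addAndDouble: `times` defaults to 2
addAndScale <- function(x, y, times = 2) {
  times * (x + y)
}

addAndScale(5, 2)              # default used: 2 * (5 + 2) = 14
addAndScale(5, 2, times = 3)   # default overridden: 3 * (5 + 2) = 21
addAndScale(y = 2, x = 5)      # named inputs can come in any order: 14
```

Defaults keep the common case short while still letting the caller change the behaviour when needed.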

R comes with many built-in functions.

x <- c(4, 1, 6, -3, 7, 1)
mean(x)
## [1] 2.666667
max(x)
## [1] 7
sd(x)
## [1] 3.723797
cor(x, c(0, -5, 1, 3, 11, 2))
## [1] 0.4016281
sort(x)
## [1] -3 1 1 4 6 7
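
A few more built-in functions, sketched on the same vector. The second half shows how missing values (`NA`) behave; the vector `x_na` is made up for this example.

```r
x <- c(4, 1, 6, -3, 7, 1)
median(x)    # 2.5, the middle of the sorted values -3 1 1 4 6 7
range(x)     # -3 7, the smallest and largest values
length(x)    # 6

# Missing values propagate unless you ask for them to be removed
x_na <- c(x, NA)
mean(x_na)                  # NA
mean(x_na, na.rm = TRUE)    # 2.666667
```

Many summary functions (`mean`, `sum`, `max`, `sd`, and others) accept `na.rm = TRUE` in the same way.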

R can run all of the regressions you would run in EViews, Excel, Stata, SAS, etc.

x <- rnorm(1000)
y <- 2 + 3*x + rnorm(1000, sd = 0.1)
linReg <- lm(y ~ x)
summary(linReg)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.35405 -0.06416 -0.00191  0.06639  0.28754
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.002005   0.003126   640.5   <2e-16 ***
## x           3.002633   0.003196   939.5   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09884 on 998 degrees of freedom
## Multiple R-squared: 0.9989, Adjusted R-squared: 0.9989
## F-statistic: 8.826e+05 on 1 and 998 DF, p-value: < 2.2e-16
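
Once a model is fitted, its pieces can be pulled out and reused. This is a minimal sketch: the seed in `set.seed(1)` is an assumption added so that reruns give identical numbers (it will not reproduce the exact output above), and `newData` is a made-up input.

```r
set.seed(1)  # assumed seed, only to make the random draws repeatable
x <- rnorm(1000)
y <- 2 + 3 * x + rnorm(1000, sd = 0.1)
linReg <- lm(y ~ x)

# Estimated intercept and slope as a named vector
coef(linReg)

# Predicted y at new x values (0 and 1 are chosen for illustration)
newData <- data.frame(x = c(0, 1))
predict(linReg, newdata = newData)

# With an intercept in the model, least squares residuals average to zero
mean(residuals(linReg))
```

Since the data were simulated from `y = 2 + 3x` plus a little noise, the estimates should land close to 2 and 3.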

Packages and Datasets

People have written thousands of packages that add extra functions and data.

The dataset economics is part of the ggplot2 package and the function glimpse is part of the dplyr package. Install these once per computer.

install.packages("ggplot2")
install.packages("dplyr")

Then use library(ggplot2) and library(dplyr) to load their contents into R.

library(ggplot2)
library(dplyr)
data(economics, package="ggplot2")

Now you have access to everything in dplyr and ggplot2 until you close R!
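
Two related habits, sketched under assumptions: checking whether a package is already installed before installing it, and calling a single function with `pkg::fun` without loading the whole package. The helper `isInstalled` is hypothetical (not part of R or the slides), and the `interactive()` guard is a design choice so that scripts never install anything unattended.

```r
# Hypothetical helper: TRUE if the package is already installed
isInstalled <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

isInstalled("stats")   # TRUE, because stats ships with every R installation

# Install only when missing, and only when a person is at the console
# (needs internet access and a CRAN mirror)
if (interactive() && !isInstalled("dplyr")) {
  install.packages("dplyr")
}

# pkg::fun calls one function without library(); shown here with stats::sd
stats::sd(c(1, 2, 3))   # 1
```

The same `::` form works for add-on packages, for example `dplyr::glimpse(economics)` once dplyr is installed.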

Now we can look at the economics data:

glimpse(economics)
## Observations: 574
## Variables: 6
## $ date     <date> 1967-07-01, 1967-08-01, 1967-09-01, 1967-10-01, 1967...
## $ pce      <dbl> 507.4, 510.5, 516.3, 512.9, 518.1, 525.8, 531.5, 534....
## $ pop      <int> 198712, 198911, 199113, 199311, 199498, 199657, 19980...
## $ psavert  <dbl> 12.5, 12.5, 11.7, 12.5, 12.5, 12.1, 11.7, 12.2, 11.6,...
## $ uempmed  <dbl> 4.5, 4.7, 4.6, 4.9, 4.7, 4.8, 5.1, 4.5, 4.1, 4.6, 4.4...
## $ unemploy <int> 2944, 2945, 2958, 3143, 3066, 3018, 2878, 3001, 2877,...

Alternatively, any of these functions will give you an idea of what the dataset looks like:

head(economics)
tail(economics)
summary(economics)
str(economics)
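
After looking at a dataset, the next step is usually to pull out parts of it. The sketch below uses a tiny made-up data frame, `toy`, as a stand-in so that it runs without any extra packages; the same calls work on `economics`.

```r
# Made-up stand-in for a real dataset (values are illustrative only)
toy <- data.frame(
  year  = c(2001, 2002, 2003),
  sales = c(10.5, 12.0, 11.2)
)

dim(toy)          # 3 2, that is 3 rows and 2 columns
names(toy)        # "year" "sales"
toy$sales         # one column, returned as a vector
toy[2, ]          # second row, all columns
mean(toy$sales)   # built-in functions work directly on a column
```

The `$` operator is the quickest way to reach a single column, for example `economics$pce`.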

Basic plotting with ggplot2

Histogram

# histogram
ggplot(data = economics) +
  geom_histogram(aes(pce))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
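
The message above is a hint rather than an error: ggplot2 fell back to 30 bins because none were specified. A minimal sketch of setting the bin width yourself follows; the value 500 is an assumption chosen only for illustration, and the object name `histPlot` is made up.

```r
library(ggplot2)  # also provides the economics dataset

# binwidth = 500 is an assumed value; try a few and compare the shapes
histPlot <- ggplot(data = economics) +
  geom_histogram(aes(x = pce), binwidth = 500)

histPlot  # printing the object draws the plot
```

Saving the plot in an object first means it can be drawn again or extended with more layers later.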

Scatterplot

ggplot(data = economics) +
  geom_point(aes(x = psavert, y = pce))
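
Column names such as `psavert` are not very readable on a plot. The sketch below adds labels with `labs()`; the label wording and title are assumptions based on the dataset's help page, and the object name `scatterPlot` is made up.

```r
library(ggplot2)

# Label text is an assumption based on ?economics
scatterPlot <- ggplot(data = economics) +
  geom_point(aes(x = psavert, y = pce)) +
  labs(
    x = "Personal savings rate (%)",
    y = "Personal consumption expenditures (billions of dollars)",
    title = "Consumption against savings rate"
  )

scatterPlot  # printing the object draws the plot
```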

If you're ever stuck on how to use a function, you can get help by typing ? followed by the function's name:

?mean
?ggplot
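
A few other ways to look things up from inside R, as a sketch. The search term and the object name `found` are made up for this example.

```r
args(sd)                    # show a function's arguments
found <- apropos("mean")    # names of objects that contain "mean"
found
example(mean)               # run the examples from the help page
help.search("regression")   # search all help pages; same as ??regression
```

`apropos()` helps when you remember only part of a name, and `help.search()` helps when you know the topic but not the function.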

There are also plenty of online resources available.
