Intro to R
The goal of this lab is to introduce you to R and RStudio,
which we’ll be using throughout the course both
To clarify which is which:
As the labs progress,
Before we get to that stage, however,
Today we begin with the fundamental building blocks of R and RStudio:
Go ahead and launch RStudio.
You should see a window that looks like the image shown below.
The panel on the lower left is where the action happens.
The panel in the upper right
Any plots that you generate will show up
generateData <- function(N) {
  data.frame(
    time = Sys.time() + 1:N,
    sym  = "AUDUSD",
    bid  = 1.2345 + runif(N, min = -0.0010, max = 0.0010),
    ask  = 1.2356 + runif(N, min = -0.0010, max = 0.0010),
    exch = sample(c("EBS", "RTM", "CNX"), N, replace = TRUE)
  )
}
prices <- generateData(50)
head(prices, 5)
time sym bid ask exch
1 2023-09-07 16:03:45 AUDUSD 1.233667 1.235822 EBS
2 2023-09-07 16:03:46 AUDUSD 1.234914 1.236369 CNX
3 2023-09-07 16:03:47 AUDUSD 1.233837 1.236454 CNX
4 2023-09-07 16:03:48 AUDUSD 1.234259 1.234837 EBS
5 2023-09-07 16:03:49 AUDUSD 1.234900 1.234914 CNX
time sym bid ask exch spread mid
1 2023-09-07 16:03:45 AUDUSD 1.233667 1.235822 EBS 2.155507e-03 1.234744
2 2023-09-07 16:03:46 AUDUSD 1.234914 1.236369 CNX 1.454971e-03 1.235641
3 2023-09-07 16:03:47 AUDUSD 1.233837 1.236454 CNX 2.616981e-03 1.235146
4 2023-09-07 16:03:48 AUDUSD 1.234259 1.234837 EBS 5.787885e-04 1.234548
5 2023-09-07 16:03:49 AUDUSD 1.234900 1.234914 CNX 1.417177e-05 1.234907
6 2023-09-07 16:03:50 AUDUSD 1.234711 1.235277 EBS 5.656128e-04 1.234994
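The code that produced the spread and mid columns is not shown above. The printed values are consistent with spread = ask - bid and mid = (bid + ask) / 2, so a minimal sketch in base R, under that assumption, might look like the following. The two rows are re-entered by hand from the printed output purely for illustration.

```r
# Two rows copied from the printed output above (illustration only)
prices <- data.frame(
  sym  = "AUDUSD",
  bid  = c(1.233667, 1.234914),
  ask  = c(1.235822, 1.236369),
  exch = c("EBS", "CNX")
)

# Assumed definitions: spread is ask minus bid, mid is the average of the two
prices$spread <- prices$ask - prices$bid
prices$mid    <- (prices$bid + prices$ask) / 2

head(prices)
```

Because the arithmetic is vectorized, the same two assignments work unchanged on the full 50-row data frame.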
\[MAD(x) = \operatorname{median}\left(\left|x_i - \operatorname{median}(x)\right|\right)\]
function (x, center = median(x), constant = 1.4826, na.rm = FALSE,
low = FALSE, high = FALSE)
{
if (na.rm)
x <- x[!is.na(x)]
n <- length(x)
constant * if ((low || high) && n%%2 == 0) {
if (low && high)
stop("'low' and 'high' cannot be both TRUE")
n2 <- n%/%2 + as.integer(high)
sort(abs(x - center), partial = n2)[n2]
}
else median(abs(x - center))
}
<bytecode: 0x55ab66b63b70>
<environment: namespace:stats>
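To connect the printed source to the formula, we can compute the MAD by hand and compare it with the built-in function. This is a small sketch on a made-up vector; note that the default constant = 1.4826 rescales the raw median absolute deviation, so we pass constant = 1 to get the unscaled value.

```r
# Made-up data with one obvious outlier (illustration only)
x <- c(1, 2, 3, 4, 100)

# Unscaled MAD: median of the absolute deviations from the median
raw_manual <- median(abs(x - median(x)))

# Built-in versions: unscaled (constant = 1) and default (constant = 1.4826)
raw_builtin    <- mad(x, constant = 1)
scaled_builtin <- mad(x)

raw_manual
raw_builtin
scaled_builtin
```

The manual and built-in unscaled values agree, and the default result is the unscaled value multiplied by 1.4826.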
Using binomial identity:
# $\binom{n}{k}p^{k}(1-p)^{n-k} = \binom{10}{3}\left(\frac{1}{2}\right)^{3}\left(\frac{1}{2}\right)^{7}$
choose(10,3)*(.5)^3*(.5)^7
[1] 0.1171875
Using binomial distribution density function:
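The original call is not shown here. A likely equivalent, assuming the same setup as the identity (3 heads in 10 tosses of a fair coin), uses dbinom from the stats package:

```r
# P(X = 3) for X ~ Binomial(size = 10, prob = 0.5)
p_density <- dbinom(3, size = 10, prob = 0.5)
p_density
```

This should match the value obtained from the binomial identity.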
Using simulation (100,000 tosses):
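The simulation code is not shown here either. One way to sketch it, assuming "100,000 tosses" means 100,000 repetitions of a 10-toss experiment, is to draw the number of heads in each repetition with rbinom and take the share of repetitions with exactly 3 heads. The seed is an arbitrary choice made only so the run is repeatable.

```r
set.seed(1)  # arbitrary seed, an assumption for repeatability

# Number of heads in each of 100,000 simulated 10-toss experiments
heads <- rbinom(100000, size = 10, prob = 0.5)

# Proportion of experiments with exactly 3 heads
p_sim <- mean(heads == 3)
p_sim
```

The estimate will vary slightly with the seed, but it should land close to the exact probability.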
R is an open-source programming language,
meaning that users can contribute packages that make our lives easier,
and we can use them for free.
For this lab, and many others in the future,
If these packages were not already available in your R environment,
You may be asked to select a server from which to download;
Next, you need to load these packages
library function. Run the following three lines in your console.
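The three lines themselves are not reproduced here. As a hedged sketch of the pattern, using only the tidyverse (the one package named in this lab; the others are unknown), loading a package only if it is already installed could look like this:

```r
# Package name is an assumption; substitute the packages your lab requires
pkg <- "tidyverse"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Load the package into the current session
  library(pkg, character.only = TRUE)
} else {
  # Installation is a one-time step, done separately with install.packages()
  message("Package not installed. Run install.packages(\"", pkg, "\") once.")
}
```

In everyday use you would simply write library(tidyverse); the check above only makes the install-once, load-every-session distinction explicit.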
You only need to install packages once,
The Tidyverse packages
You can find more about the packages in the tidyverse
We will be using R Markdown to create reproducible lab reports.
See the following videos describing why and how:
Going forward you should refrain
If at any point you need to start over,
That was a short introduction to R and RStudio,
In this course we will be using the suite of R packages from the tidyverse.
The book R For Data Science by Grolemund and Wickham
If you are googling for R code,
These cheatsheets may come in handy throughout the semester:
Console: interface for the R language
Prompt: area to type individual lines of code
Environment: collection of all of the objects that have been loaded into R
Primitives: a type of data in R, including numeric, integer, character, logical and factor
Vectors: basic data structure in R
Lists: a collection of elements in a sequence, comprised of vectors
Dataframes: a way of organizing measurements into a coherent structure, organized by columns and rows
Functions: a set of pre-defined operations to be applied to a certain object
Vectorized: a way of programming that inherently understands mathematical operations as linear algebra
Package: a collection of functions used for a specific purpose
Open Source: an approach to programming that allows users to read and change the underlying software
Library: the collection of packages that have been loaded into R to be used when running code
Reproducible: an approach to programming built on reusable code, so that other members of the community can rerun it and test the conclusions
Markdown: a computer language used to create visual documents from programming scripts
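Several of the terms above are easier to grasp in code. The following is a small illustrative sketch in base R; all object names and values are made up for the example.

```r
# Vector: values of a single type
scores <- c(90, 85, 77)

# Vectorized: the addition is applied to every element at once
curved <- scores + 5

# List: elements of possibly different types and lengths
student <- list(name = "Ana", scores = scores)

# Data frame: rows are observations, columns are variables
roster <- data.frame(name = c("Ana", "Ben", "Cy"), score = scores)

# Function: a named set of operations applied to an object
add_curve <- function(x, points = 5) x + points

curved
str(student)
roster
add_curve(roster$score, points = 2)
```

Typing these lines at the prompt in the console creates the objects, which then appear in the environment pane.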