Data Types and Structures in R
In this section, we will explore how R stores and organizes data.
Understanding data types and structures is essential because they determine which operations you can perform and how R interprets your variables.
All data in R belongs to some type (what kind of thing it is) and some structure (how it is arranged).
1. Data Types in R
The most common atomic data types in R are:
| Type | Example | Description | JavaScript Equivalent |
|---|---|---|---|
| Numeric | x <- 3.14 | Real numbers with decimals | let x = 3.14; |
| Integer | x <- 5L | Whole numbers (note the L) | let x = 5; |
| Character | word <- "hello" | Text strings | let word = "hello"; |
| Logical | flag <- TRUE | TRUE/FALSE values | let flag = true; |
| NA | value <- NA | Missing value placeholder (a special value, not a type of its own) | let value = null; |
Checking a Variable’s Type
x <- 3.14
typeof(x)     # "double" (how R stores the value internally)
class(x)      # "numeric"
is.numeric(x) # TRUE
You can test whether something is of a specific type using functions like is.numeric(), is.character(), or is.logical().
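As a small sketch of how these checks behave across the types in the table above, the following applies typeof() to one value of each kind. The variable names are illustrative only and do not appear elsewhere in these notes.

```r
# One value of each kind from the table (names are illustrative)
num  <- 3.14
int  <- 5L
chr  <- "hello"
lgl  <- TRUE
miss <- NA

typeof(num)   # "double": R's storage type for numeric values
typeof(int)   # "integer"
typeof(chr)   # "character"
typeof(lgl)   # "logical"
typeof(miss)  # "logical": a bare NA is logical by default

is.numeric(int)  # TRUE: integers also count as numeric
is.na(miss)      # TRUE: the usual way to test for a missing value
```

Note that 5 without the L suffix is stored as a double, so typeof(5) and typeof(5L) give different answers.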
2. Creating and Using Vectors
A vector is the simplest structure in R. It stores values of the same type.
Create a vector with the c() function (short for combine).
scores <- c(88, 92, 95, 90)
scores
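Because a vector holds a single type, mixing types inside c() makes R silently convert (coerce) every element to one common type. The sketch below illustrates this with made-up values; the variable names are hypothetical.

```r
# Mixing types in c() coerces everything to one common type
mixed_chr <- c(1, "two", TRUE)    # everything becomes character
mixed_num <- c(1.5, TRUE, FALSE)  # logicals become 1 and 0

typeof(mixed_chr)  # "character"
mixed_chr          # "1" "two" "TRUE"
typeof(mixed_num)  # "double"
mixed_num          # 1.5 1.0 0.0
```

If a numeric vector unexpectedly prints with quotation marks, a stray character value in the input is a likely cause.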
You can perform vectorized operations, meaning R applies the operation to every element automatically.
scores + 5 # adds 5 to every element
mean(scores) # calculates the mean
sd(scores) # calculates standard deviation
In JavaScript, the equivalent would be using arrays and map():
let scores = [88, 92, 95, 90];
let updated = scores.map(x => x + 5);
console.log(updated);
Accessing Elements
scores[1] # first element
scores[2:3] # second through third
scores[-1] # all but the first
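Unlike JavaScript arrays, R vectors are indexed starting at 1. Beyond positions, a vector can also be indexed with a logical condition. The following is a minimal sketch using the same scores vector as above.

```r
scores <- c(88, 92, 95, 90)

scores[scores > 90]     # 92 95: elements where the condition is TRUE
scores[c(1, 4)]         # 88 90: first and fourth elements
length(scores)          # 4
scores[length(scores)]  # 90: the last element
```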
3. Lists
A list can hold elements of different types — numbers, strings, vectors, even other lists.
student <- list(name = "Alex", age = 20, scores = c(88, 92, 95))
student
Access elements by $ or double brackets:
student$name
student[["age"]]
Lists are similar to JavaScript objects:
let student = { name: "Alex", age: 20, scores: [88, 92, 95] };
console.log(student.name);
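One detail worth seeing in R itself is the difference between single and double brackets on a list. The sketch below reuses the student list from above; the major field is a hypothetical addition for illustration.

```r
student <- list(name = "Alex", age = 20, scores = c(88, 92, 95))

sub_list <- student["age"]    # single brackets: a list of length 1
value    <- student[["age"]]  # double brackets: the element itself

class(sub_list)  # "list"
class(value)     # "numeric"

mean(student$scores)           # works on the numeric vector stored in the list
student$major <- "Psychology"  # hypothetical new field, added by assignment
names(student)                 # "name" "age" "scores" "major"
```

In short, [ ] keeps the list wrapper while [[ ]] and $ extract what is inside it.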
4. Matrices and Arrays
A matrix is a 2-dimensional structure where all elements are of the same type.
m <- matrix(1:9, nrow = 3, byrow = TRUE)
m
Access elements by row and column:
m[1, 2] # row 1, column 2
m[, 3] # all rows, column 3
For higher dimensions, R uses arrays.
arr <- array(1:12, dim = c(2, 3, 2))
arr
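To see what byrow changes and how indexing extends to a third dimension, here is a short sketch. The names m_by_row and m_by_col are illustrative only.

```r
m_by_row <- matrix(1:9, nrow = 3, byrow = TRUE)  # fill row by row
m_by_col <- matrix(1:9, nrow = 3)                # default: fill column by column

m_by_row[1, 2]  # 2
m_by_col[1, 2]  # 4
dim(m_by_row)   # 3 3

arr <- array(1:12, dim = c(2, 3, 2))
dim(arr)        # 2 3 2
arr[1, 2, 2]    # 9: row 1, column 2, second layer
```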
5. Data Frames
A data frame is one of the most important structures in R.
It stores data in a tabular format — like a spreadsheet — where each column can have a different type.
df <- data.frame(
id = 1:4,
name = c("Alice", "Bob", "Carmen", "Diego"),
score = c(88, 92, 95, 90)
)
df
Access columns and rows:
df$name # access column
df[1, ] # first row
df[, "score"] # entire score column
Get a quick summary:
summary(df)
str(df)
Data frames in R are similar to arrays of objects in JavaScript:
let df = [
{id: 1, name: "Alice", score: 88},
{id: 2, name: "Bob", score: 92}
];
console.log(df[0].score);
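Two everyday data frame operations are filtering rows by a condition and adding a column. The sketch below shows both on the df defined earlier; the passed column is a hypothetical addition for illustration.

```r
df <- data.frame(
  id = 1:4,
  name = c("Alice", "Bob", "Carmen", "Diego"),
  score = c(88, 92, 95, 90)
)

high <- df[df$score > 90, ]  # keep rows where score exceeds 90
high$name                    # "Bob" "Carmen"

df$passed <- df$score >= 90  # hypothetical new logical column
nrow(df)                     # 4
ncol(df)                     # 4
sapply(df, class)            # class of each column
```

The comma in df[df$score > 90, ] matters: the condition selects rows, and the empty slot after the comma keeps all columns.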
6. Factors (Categorical Variables)
A factor represents categorical data (e.g., "male"/"female", "control"/"treatment").
group <- factor(c("control", "treatment", "control"))
group
levels(group)
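Convert factors to character labels or to their underlying integer codes if needed:
as.character(group) # the labels: "control" "treatment" "control"
as.numeric(group) # the integer codes of the levels: 1 2 1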
Factors are useful when performing statistical modeling or creating labeled visualizations.
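A common pitfall is calling as.numeric() on a factor whose labels look like numbers, because the result is the level codes rather than the labels. The following sketch uses a hypothetical years factor to illustrate.

```r
group <- factor(c("control", "treatment", "control"))
as.numeric(group)  # 1 2 1: integer codes, levels sorted alphabetically
table(group)       # counts per level

# Pitfall: a factor whose labels look like numbers (hypothetical data)
years <- factor(c("2021", "2019", "2021"))
as.numeric(years)                # 2 1 2: codes, not the years
as.numeric(as.character(years))  # 2021 2019 2021: the actual values
```

Converting to character first, then to numeric, recovers the values the labels represent.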
7. Combining Data Structures
You can combine vectors, lists, and data frames to organize data hierarchically.
participants <- list(
group = c("control", "treatment"),
results = data.frame(
id = 1:4,
score = c(88, 92, 95, 90)
)
)
participants
Access nested elements:
participants$results$score
participants[["group"]]
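Accessors can be chained to reach into nested structures and compute on them directly. This is a minimal sketch using the participants object from above.

```r
participants <- list(
  group = c("control", "treatment"),
  results = data.frame(id = 1:4, score = c(88, 92, 95, 90))
)

str(participants, max.level = 1)       # overview of the top-level elements
mean(participants$results$score)       # 91.25
participants[["results"]][2, "score"]  # 92: row 2 of the nested data frame

# Score for the participant whose id is 3
participants$results$score[participants$results$id == 3]  # 95
```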
8. Converting Between Types
R provides built-in functions for converting data types.
as.numeric("5")
as.character(123)
as.logical(0)
as.data.frame(matrix(1:6, nrow = 2))
If a conversion isn’t possible (e.g., "text" → numeric), R will return NA with a warning.
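The sketch below shows what a failed conversion looks like in practice and one way to detect it afterwards. The raw vector is hypothetical input, and suppressWarnings() is used only to keep the demonstration quiet.

```r
as.numeric("5")    # 5
as.character(123)  # "123"
as.logical(0)      # FALSE (any non-zero number gives TRUE)

bad <- suppressWarnings(as.numeric("text"))  # warning silenced for the demo
is.na(bad)         # TRUE

raw   <- c("12", "7.5", "n/a")               # hypothetical raw input
clean <- suppressWarnings(as.numeric(raw))
clean                      # 12.0 7.5 NA
sum(is.na(clean))          # 1 value failed to convert
mean(clean, na.rm = TRUE)  # 9.75, ignoring the missing value
```

In real analyses it is usually better to leave the warning visible and inspect which values became NA.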
9. Practical Example
Let’s create a small dataset and explore it.
# Create sample data
subject_id <- 1:5
rt <- c(520, 410, 615, 450, 395)
congruent <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
# Combine into data frame
data <- data.frame(subject_id, rt, congruent)
# Inspect
head(data)
mean(data$rt)
summary(data)
Example output of head(data):
subject_id rt congruent
1 1 520 TRUE
2 2 410 TRUE
3 3 615 FALSE
4 4 450 TRUE
5 5 395 FALSE
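As one possible next step with this dataset, the sketch below compares mean reaction time across the two congruent conditions using base R's tapply(). It rebuilds the same data frame so that it runs on its own.

```r
data <- data.frame(
  subject_id = 1:5,
  rt = c(520, 410, 615, 450, 395),
  congruent = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

mean(data$rt)                                      # 478
group_means <- tapply(data$rt, data$congruent, mean)
group_means                                        # FALSE: 505, TRUE: 460
data[data$rt > 500, "subject_id"]                  # 1 3
```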
10. Summary
- R stores information using data types (numeric, character, logical, etc.) and data structures (vectors, lists, data frames, etc.).
- Vectors are one-dimensional; data frames and matrices are two-dimensional.
- Lists can hold mixed data types.
- Data frames are the foundation for data analysis in R.
- Always check structure with str() and summary() before analysis.
Understanding data types and structures will make your future work with data manipulation, visualization, and modeling in R much easier.
Summary of R Data Types and Structures
| Type | Description | Most Important (for our purposes) |
|---|---|---|
| Atomic Types | Numeric, Integer, Character, Logical, NA | Yes |
| Vector | 1D, homogeneous data (all same type) | |
| List | 1D, heterogeneous data (can store multiple types) | |
| Matrix | 2D, homogeneous data | |
| Array | Multi-dimensional, homogeneous data | |
| Data Frame | 2D, heterogeneous data (columns can vary in type) | Yes |
| Factor | Categorical data with unique levels | Yes |
| Function | First-class objects for performing tasks | Yes |