PSY 1903 Programming for Psychologists

Indexing and Subsetting in R

In this section, we’ll explore how to access and manipulate specific parts of your data using indexing and subsetting.
These skills allow you to extract or modify elements of vectors, lists, matrices, and data frames efficiently.


1. Indexing Basics

Indexing means selecting elements from a data object using their position, names, or logical conditions.
R uses square brackets [] for indexing.

Example with a Vector

fruits <- c("apple", "banana", "cherry", "date")
fruits[1]       # first element
fruits[2:4]     # elements 2 through 4
fruits[-1]      # all but the first element

R is 1-indexed, meaning counting starts at 1 (not 0, as in some other languages).
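A few edge cases follow from 1-based indexing and are worth trying yourself. The sketch below reuses the fruits vector from above; the variable names (first, last, empty, beyond, rest) are only illustrative.

```r
fruits <- c("apple", "banana", "cherry", "date")

first  <- fruits[1]                # "apple"
last   <- fruits[length(fruits)]   # "date": length() gives the last position
empty  <- fruits[0]                # index 0 selects nothing: character(0)
beyond <- fruits[10]               # past the end gives NA, not an error
rest   <- fruits[-c(1, 2)]         # drop the first two: "cherry" "date"
```

Note that a negative index in R excludes a position; it does not count from the end as it does in some other languages.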

JavaScript Comparison

let fruits = ["apple", "banana", "cherry", "date"];
console.log(fruits[0]); // first element in JS

2. Logical Indexing

You can subset data using logical values (TRUE or FALSE).
Only elements corresponding to TRUE will be selected.

nums <- c(5, 10, 15, 20)
nums[c(TRUE, FALSE, TRUE, FALSE)]  # selects 5 and 15
nums[nums > 10]                    # selects elements greater than 10

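Logical indexing is the basis for filtering data: a comparison such as nums > 10 produces a logical vector, and R keeps only the positions where that vector is TRUE.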
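To see what happens behind the brackets, it can help to store the logical vector and inspect it. This is a small sketch using the nums vector from above; the variable names are illustrative.

```r
nums <- c(5, 10, 15, 20)

is_big    <- nums > 10                      # FALSE FALSE TRUE TRUE
in_range  <- nums[nums >= 10 & nums < 20]   # 10 15 (& combines conditions element-wise)
positions <- which(is_big)                  # 3 4 (positions of the TRUE values)
n_big     <- sum(is_big)                    # 2 (each TRUE counts as 1)
```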


3. Indexing by Name

If a vector or list has names, you can access elements by those names.

scores <- c(math = 90, english = 85, science = 92)
scores["math"]
scores[c("math", "science")]

A named vector still supports position-based indexing, so either approach works on the same vector:

scores[1]
scores["english"]
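Names can also be inspected and used for assignment. The following sketch assumes the scores vector from above; "history" is a deliberately missing name chosen for illustration.

```r
scores <- c(math = 90, english = 85, science = 92)

subject_names <- names(scores)          # "math" "english" "science"
scores["english"] <- 88                 # replace a value by name
math_value <- unname(scores["math"])    # 90, with the name stripped
missing <- scores["history"]            # no such name: NA rather than an error
```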

4. Subsetting Lists

Lists can contain elements of different types: numbers, strings, vectors, or even other lists.

student <- list(
  name = "Alex",
  age = 20,
  scores = c(88, 92, 95)
)

Access elements with $ or double brackets [[]]:

student$name
student[["age"]]
student$scores[2]
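Single brackets also work on lists, but they return a smaller list rather than the element itself. The sketch below contrasts the two forms using the student list from above; the major element is a hypothetical addition for illustration.

```r
student <- list(name = "Alex", age = 20, scores = c(88, 92, 95))

as_list  <- student["age"]      # single brackets: a list of length 1
as_value <- student[["age"]]    # double brackets: the number 20 itself

class(as_list)     # "list"
class(as_value)    # "numeric"

student$major <- "Psychology"             # assigning to a new name adds an element
mean_score <- mean(student[["scores"]])   # extract the vector, then compute on it
```

If you plan to do arithmetic on an element, $ or [[ ]] is usually the form you want.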

JavaScript Comparison

let student = { name: "Alex", age: 20, scores: [88, 92, 95] };
console.log(student.name);
console.log(student.scores[1]);

5. Indexing Matrices

Matrices are two-dimensional, so you index them with [row, column]. Leaving one side of the comma empty selects everything along that dimension.

m <- matrix(1:9, nrow = 3, byrow = TRUE)
m
m[1, 2]     # row 1, column 2
m[ , 3]     # all rows, column 3
m[2, ]      # entire second row

You can also use negative indices to exclude specific rows or columns:

m[-1, ]     # exclude the first row
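When you select a single row or column, R simplifies the result to a plain vector by default. The sketch below, using the same matrix m, shows the drop = FALSE argument that keeps the matrix shape; the variable names are illustrative.

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)

row2_vector <- m[2, ]                  # 4 5 6 as a plain vector
row2_matrix <- m[2, , drop = FALSE]    # same values, kept as a 1 x 3 matrix
block       <- m[1:2, 2:3]             # rows 1-2, columns 2-3

dim(row2_vector)    # NULL: vectors have no dimensions
dim(row2_matrix)    # 1 3
```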

6. Subsetting Data Frames

Data frames are the most common structure you’ll work with in R.
They behave similarly to matrices but can hold columns of different types.

df <- data.frame(
  id = 1:4,
  name = c("Alice", "Bob", "Carmen", "Diego"),
  score = c(88, 92, 95, 90)
)

# Subset by position or column name
df[1, ]       # first row
df[, 2]       # second column (returned as a vector)
df[1:2, c("id", "score")]  # rows 1–2, specific columns

Access by column name using $:

df$name
df$score
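Because a data frame is built on a list of columns, the bracket forms from the list section apply here too. A minimal sketch, assuming the df defined above; the variable col is introduced only for illustration.

```r
df <- data.frame(
  id = 1:4,
  name = c("Alice", "Bob", "Carmen", "Diego"),
  score = c(88, 92, 95, 90)
)

score_vector <- df$score        # numeric vector
score_same   <- df[["score"]]   # same vector, with the name given as a string
score_df     <- df["score"]     # one-column data frame

col <- "score"                  # a column name stored in a variable
score_by_var <- df[[col]]       # works; df$col would look for a column named "col"
```

The [[ ]] form is the one to reach for when the column name is not known until the code runs.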

A rough JavaScript equivalent uses an array of objects, one object per row:

let df = [
  {id: 1, name: "Alice", score: 88},
  {id: 2, name: "Bob", score: 92}
];
console.log(df[0].score);

7. Conditional Subsetting

You can use logical conditions to filter rows.

df[df$score > 90, ]          # rows where score > 90
df[df$name == "Alice", ]     # rows where name is Alice
df[df$score >= 90 & df$name != "Bob", ]   # score at least 90 and name is not Bob

You can also create a logical vector first and reuse it:

high_scores <- df$score > 90
df[high_scores, ]
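Two related tools are worth knowing alongside bracket filtering: subset(), which lets you refer to columns by bare name, and %in%, which matches against several values at once. The sketch below also shows how missing values behave, since a comparison with NA yields NA. The with_na copy is a made-up variation of df for illustration.

```r
df <- data.frame(
  id = 1:4,
  name = c("Alice", "Bob", "Carmen", "Diego"),
  score = c(88, 92, 95, 90)
)

by_bracket <- df[df$score > 90, ]
by_subset  <- subset(df, score > 90)                    # same rows as by_bracket
picked     <- df[df$name %in% c("Alice", "Diego"), ]    # match any of several values

# Missing values: the NA comparison result makes [ ] return an all-NA row
with_na <- df
with_na$score[1] <- NA
n_bracket <- nrow(with_na[with_na$score > 90, ])          # 3 (includes an all-NA row)
n_which   <- nrow(with_na[which(with_na$score > 90), ])   # 2 (which() drops the NA)
```

Wrapping the condition in which() is one way to avoid the extra NA row when your data contain missing values.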

8. Adding and Removing Columns or Rows

Add a new column using $:

df$passed <- df$score >= 90
df

Remove a column by assigning NULL:

df$passed <- NULL
df

Add a new row using rbind():

new_row <- data.frame(id = 5, name = "Eva", score = 93)
df <- rbind(df, new_row)

Remove rows by negative indexing:

df <- df[-1, ]  # removes first row
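One detail that can surprise newcomers: removing rows does not renumber the row labels. The sketch below repeats the rbind() and negative-index steps on a fresh copy of df and then resets the labels; labels_before and labels_after are illustrative names.

```r
df <- data.frame(
  id = 1:4,
  name = c("Alice", "Bob", "Carmen", "Diego"),
  score = c(88, 92, 95, 90)
)

df <- rbind(df, data.frame(id = 5, name = "Eva", score = 93))
df <- df[-1, ]                   # drop the first row

labels_before <- rownames(df)    # "2" "3" "4" "5": labels are kept, not renumbered
rownames(df) <- NULL             # reset the labels to 1, 2, 3, ...
labels_after <- rownames(df)     # "1" "2" "3" "4"
```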

9. Advanced Subsetting

You can filter rows with a logical condition and select columns by name in a single call.

df[df$score > 90 & df$id < 4, c("name", "score")]

Or select specific columns programmatically:

columns_to_keep <- c("id", "score")
df[ , columns_to_keep]
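The vector of column names can itself be computed. As a sketch, the code below picks the numeric columns with sapply() and drops a column by name with setdiff(); it assumes the original four-row df, and the variable names are illustrative.

```r
df <- data.frame(
  id = 1:4,
  name = c("Alice", "Bob", "Carmen", "Diego"),
  score = c(88, 92, 95, 90)
)

# Keep only the columns whose contents are numeric
numeric_cols <- names(df)[sapply(df, is.numeric)]   # "id" "score"
numeric_part <- df[, numeric_cols]

# Keep every column except id
without_id <- df[, setdiff(names(df), "id")]
```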

10. Practical Example

Let’s apply everything to a small example from a reaction time experiment.

# Create a data frame
data <- data.frame(
  subject_id = 1:5,
  rt = c(520, 410, 615, 450, 395),
  congruent = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Subset only congruent trials
congruent_trials <- data[data$congruent, ]   # congruent is already TRUE/FALSE

# Subset fast trials (RT < 500)
fast_trials <- data[data$rt < 500, ]

# Subset specific columns
subset_cols <- data[, c("subject_id", "rt")]

For reference, printing data shows the full data frame that these subsets draw from:

  subject_id  rt congruent
1          1 520      TRUE
2          2 410      TRUE
3          3 615     FALSE
4          4 450      TRUE
5          5 395     FALSE
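Subsetting is often just the first step before computing a summary. As a sketch of how the pieces fit together, the code below uses logical indexing on the same data to compare mean reaction times by condition; the variable names are illustrative.

```r
data <- data.frame(
  subject_id = 1:5,
  rt = c(520, 410, 615, 450, 395),
  congruent = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Mean RT within each condition, using a logical index on the rt column
mean_congruent   <- mean(data$rt[data$congruent])     # 460
mean_incongruent <- mean(data$rt[!data$congruent])    # 505

# IDs of rows that are both fast (RT < 500) and congruent
fast_congruent_ids <- data$subject_id[data$rt < 500 & data$congruent]   # 2 4
```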

11. Summary

  • Use [] to extract elements from vectors, lists, matrices, and data frames.
  • Use $ or [["name"]] for named list or data frame elements.
  • Logical subsetting filters data based on conditions.
  • Negative indices exclude elements from the result; the original object changes only if you assign the result back.
  • Combine multiple methods for flexible data manipulation.

Mastering indexing and subsetting makes your R workflow efficient and precise, which is essential for data wrangling, cleaning, and analysis.


[Figure: R indexing diagram]