PSY 1903 Programming for Psychologists

Best Practices for Writing Readable, Maintainable, and Reproducible Code in R

In this section, we’ll learn how to write R code that is clear, consistent, and reproducible — code that you can understand six months from now and that others can follow and rerun.

Readable and reproducible code is critical for psychological research, where transparency and replication are essential parts of good scientific practice.


1. Why Readability and Reproducibility Matter

Readable code:

  • Makes it easy to see what’s happening step by step.
  • Reduces errors and confusion — for you and for collaborators.
  • Encourages better habits for debugging and organization.

Reproducible code:

  • Ensures anyone can run your script from start to finish and get the same results.
  • Supports transparent, open, and collaborative science.
  • Makes your workflow scalable — when your dataset or methods change, your code still works.

2. Writing Clear and Consistent Code

a) Use Descriptive Variable Names

Avoid single letters or vague placeholders. Your variable names should communicate what the object actually represents — not just its type or format.

# Poor
x <- 10
df1 <- read.csv("data.csv")

# Better
num_participants <- 10
survey_data <- read.csv("survey_responses.csv")

Descriptive names make your code easier to read, debug, and revisit later, especially in collaborative projects or reproducible research workflows.

Remember that you’re often not just writing code for the computer — you’re writing it for future you (or your collaborators) who need to understand what each piece of code does without guessing.

Why It Matters

  • Clarity reduces errors: Ambiguous names make it harder to track which variable does what, increasing the chance of using the wrong one in an analysis.
  • Ease of debugging: When code breaks, meaningful names make it obvious where the problem lies (“survey_data is missing a column” is clearer than “df1 is broken”).
  • Supports reproducibility: Well-named variables communicate intent — critical for sharing code in research, publications, or teaching.

Balancing Clarity and Brevity

While names should be descriptive, they don’t need to be long. Choose names that are clear but concise:

  • Avoid overly short names like x, df, or tmp.
  • Avoid overly long or sentence-like names such as number_of_participants_in_study_one.
  • Instead, aim for short, descriptive names like num_participants, study1_data, or reaction_times.

Tip: If you can’t remember what a variable does after reading it aloud, rename it.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

b) Consistent Style and Indentation

Good code style is like good writing — it’s not just about correctness, but clarity and structure.
Consistent naming and indentation make your scripts easier to read, debug, and share with others.

Consistent Naming Style

Use the same naming convention throughout your scripts.
In R, the most common is snake_case, where words are lowercase and separated by underscores.

Common conventions:

  • snake_case → most common in R (reaction_time, mean_score)
  • Avoid spaces or inconsistent capitalization (ReactionTime, reaction time, MeanScore)

Pick one style and stick with it across all your variables, functions, and files.
This helps both you and others immediately recognize what your code is doing — without pausing to decode naming quirks.

Consistent Indentation

Indentation shows structure — it tells readers which lines of code belong together.
Each new code block (such as inside { }) should be indented one level deeper than the block it’s inside.

RStudio will handle most of this automatically as you type, but you can also fix formatting at any time by selecting all (Cmd + A on Mac, Ctrl + A on Windows) and pressing Cmd + I (Mac) or Ctrl + I (Windows).

Example: Poor vs. Consistent Formatting
# Hard to read
for(i in 1:length(rt)){
if(rt[i]>800){
print("Very slow")
}else if(rt[i]>500){
print("Moderate speed")
}else{
print("Fast")
}}

# Easier to read
for (i in 1:length(rt)) {
  if (rt[i] > 800) {
    print("Very slow")
  } else if (rt[i] > 500) {
    print("Moderate speed")
  } else {
    print("Fast")
  }
}

Notice how indentation makes the structure visible — you can instantly see where the condition begins and ends.
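One robustness note on the loop above, offered as a sketch and not as part of the original example: 1:length(rt) counts backwards (1, 0) when rt is empty, whereas seq_along(rt) simply produces no iterations. The same labels can also be produced without a loop using nested ifelse() calls. The rt values and the speed_label name below are made up for illustration.

```r
## Illustrative reaction times in ms (made-up values)
rt <- c(480, 530, 810, 365, 630)

## seq_along() is safe even when the vector is empty
for (i in seq_along(rt)) {
  if (rt[i] > 800) {
    print("Very slow")
  } else if (rt[i] > 500) {
    print("Moderate speed")
  } else {
    print("Fast")
  }
}

## Vectorized alternative: label every value in one step
speed_label <- ifelse(rt > 800, "Very slow",
                      ifelse(rt > 500, "Moderate speed", "Fast"))
```

The vectorized version stores the labels in an object you can reuse, while the loop only prints them.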

Setting Auto-Indent Preferences in RStudio

You can customize RStudio’s automatic indentation behavior:

  1. Go to Tools → Global Options → Code → Editing
  2. Check or adjust the following:
    • “Auto-indent code after paste”
    • “Reindent lines automatically”
    • Set preferred Tab width (usually 2 spaces for R)
    • “Insert spaces for tab” to keep indentation consistent across systems
  3. Optional: enable “Soft-wrap R source files” so long lines wrap visually without affecting code structure.

Bonus: Reformat Code Automatically with styler

If you want to clean up spacing and indentation across a whole file in one step, you can use the styler package.

install.packages("styler")
library(styler)

Then use Addins → Style active file (or assign a shortcut).
By default, the addin restyles the file but leaves it unsaved. To have it save the file after styling, add this to your .Rprofile:

options(styler.save_after_styling = TRUE)

styler restyles a file when you run the addin, not each time you save, so run it before sharing or committing a script. This keeps your code consistent with the same formatting conventions across projects and collaborators.

Why It Matters

  • Readability: Well-formatted code communicates structure immediately.
  • Debugging: It’s easier to spot where a missing brace or parenthesis occurs.
  • Collaboration: Consistent style helps others read and contribute to your codebase smoothly.
  • Professionalism: Clean, consistent code reflects good programming habits and reproducibility practices.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

c) Using Comments Thoughtfully — Preferably with ##

Comments are essential for making your code understandable — not just to others, but to future you.
They explain the reasoning behind your decisions and make your scripts easier to read, debug, and reproduce.

Comment the “Why,” Not Just the “What”

A common beginner mistake is to comment every line to restate what the code already says.
Instead, use comments to explain why your code does something or what assumption it’s based on.

## Calculate average RT, excluding missing values
mean_rt <- mean(data$rt, na.rm = TRUE)

## Filter to include only congruent trials for comparison
filtered_data <- subset(data, congruent == TRUE)

These comments clarify your logic — not just your syntax.
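To make the contrast concrete, here is a small sketch with one comment that restates the code and one that explains the reasoning. The rt values and the 200 ms cutoff are assumptions chosen only for illustration, not a recommendation from these notes.

```r
## Illustrative reaction times in ms (made-up values)
rt <- c(480, 530, 150, 610)

## "What" comment (adds little): keep rt values greater than 200
## "Why" comment (more useful): responses faster than 200 ms are treated here
## as anticipations rather than real decisions, so they are dropped before
## averaging (the 200 ms cutoff is an assumption for this sketch)
valid_rt <- rt[rt > 200]

mean_valid_rt <- mean(valid_rt)
```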

Prefer Using Double Hashes (##)

We recommend using two hash symbols (##) for comments rather than one.

Why? Because in RStudio, if you comment or uncomment entire blocks of code, a single # line might lose its comment status — but double ## lines will stay commented.
This prevents accidental “un-commenting” when toggling large code blocks on and off.

## This is a comment that will stay commented
## even if you use RStudio’s comment/uncomment shortcut.

This convention is especially useful in teaching, collaborative work, and reproducible analysis scripts where you might frequently run or toggle sections of code.

Use Comments as Structure

You can also use comments to organize your code into readable sections — especially in longer scripts.

#### Load Packages ------------------------------------------------------------
library(tidyverse)

#### Load and Inspect Data ---------------------------------------------------
data <- read.csv("experiment_data.csv")
summary(data)

#### Calculate Summary Statistics --------------------------------------------
mean_rt <- mean(data$rt, na.rm = TRUE)

Notice how these section headers act as visual signposts, helping you (and others) quickly navigate your code.

Finding the Right Balance

Too few comments make your logic hard to follow.
Too many make your code cluttered and repetitive.
Aim for comments that:

  • Introduce a logical block or transformation
  • Explain decisions or assumptions
  • Clarify any non-obvious syntax or “tricks”

A good rule of thumb:

Write code as if you’re leaving a trail of breadcrumbs for someone else to follow.

Summary

| Best Practice | Why It Matters |
|---|---|
| Comment why, not what | Communicates reasoning instead of restating code |
| Use ## for comments | Prevents accidental uncommenting when toggling blocks |
| Use headers as structure | Makes long scripts easier to navigate |
| Balance brevity and clarity | Avoid both over-commenting and under-commenting |

Well-commented code is self-documenting, making it easier to share, reproduce, and extend — a key skill in scientific programming and collaborative projects.


3. Organizing Projects and Files

Clean project organization is one of the most important habits for writing reproducible, professional code.
A well-structured project helps you (and others) understand where everything lives, prevents file path errors, and ensures your analyses can be rerun easily — even on another computer.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

a) Use an R Project

Always work within an R Project (.Rproj).

An R Project keeps all your work — data, code, outputs, and reports — inside a single, self-contained folder.
When you open the .Rproj file, RStudio automatically sets your working directory to that project folder, so your paths will always start from the same place.

This means you don’t have to use full file paths like:

setwd("/Users/myname/Documents/psy1903_project/data/")

Instead, you can just reference files relative to the project root, like:

data <- read.csv("data/raw/iat_data.csv")

This makes your code portable and reproducible — anyone who downloads your project can run it without editing file paths.

Checking and Setting the Working Directory

Your working directory is the folder R treats as your project’s “home base.” It determines where R looks for files to read (like datasets) and where it saves anything you create (like plots, models, or outputs). If your working directory isn’t set correctly, R won’t be able to find your files — even if they’re in the same project folder — leading to errors like:

> Error in file(file, "rt") : cannot open the connection

You can confirm where R currently thinks you are using:

getwd()

If you’re not using an R Project, you can still set the working directory manually with:

setwd("path/to/your/project")

However, avoid using setwd() inside scripts, especially with full file paths like this:

setwd("/Users/student/Documents/psy1903_project")

These are called fixed (absolute) paths, and they point to a specific location on your computer. If you share your code with someone else, or move your files to another folder or machine, those paths will break.

Relative vs. Fixed Paths

Instead of using fixed file paths, it’s best to use relative paths — paths that start from your project’s root directory and work their way down. This makes your code portable and reproducible because it will work the same way on any computer, as long as the folder structure stays the same.

Example Comparison:

| Type | Example | Works Everywhere? | Notes |
|---|---|---|---|
| Fixed (absolute) path | read.csv("/Users/alex/Documents/psy1903_project/data/raw/iat_data.csv") | No | Breaks when shared or moved to another computer |
| Relative path | read.csv("data/raw/iat_data.csv") | Yes | Works anywhere as long as the project structure is the same |

When you open an .Rproj file, RStudio automatically sets your working directory to the project’s top-level folder (see next section for recommended project structure). From there, you can always refer to files using relative paths, such as:

# Load a dataset from the "data/raw" folder
iat_data <- read.csv("data/raw/iat_data.csv")

# Save a plot to the "output/plots" folder
ggsave("output/plots/reaction_time_histogram.png")

By relying on relative paths, you ensure that your scripts are:

  • Reproducible: others can run them without editing file paths.
  • Portable: they’ll still work if you move the entire project folder.
  • Clean and professional: no personal system paths cluttering your code.

In short:

R Projects and relative paths work together to make your workflow both reliable and shareable — no setwd() required.
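As a small sketch of the same idea, file.path() can build a relative path from its parts, and file.exists() lets a script report a clear message when the file is not where it expects. The raw_data_path name is illustrative; the folder layout is the one used in these notes.

```r
## Build a relative path from its parts; file.path() inserts the separators
raw_data_path <- file.path("data", "raw", "iat_data.csv")

## Report a helpful message if the file is not where we expect it
if (file.exists(raw_data_path)) {
  iat_data <- read.csv(raw_data_path)
} else {
  message("Could not find ", raw_data_path, " starting from ", getwd())
}
```

If the message appears, the working directory is probably not the project root; opening the .Rproj file usually fixes that.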

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

b) Use a Consistent Folder Structure

Organizing your project into folders helps you keep data, scripts, results, and reports clearly separated. This improves clarity and organization while also preventing mistakes like accidentally overwriting important files (for example, saving cleaned data over your raw data). Saving intermediate steps, such as cleaned datasets, also means you don’t have to rerun every part of your code each time, which increases both efficiency and reproducibility. Here’s an example structure you can adapt for your own analyses:

psy1903_project/
├── psy1903_project.Rproj   # R Project file that sets working directory automatically
├── data/
│   ├── raw/                # unmodified, original data files
│   └── cleaned/            # processed or cleaned data ready for analysis
├── scripts/                # R scripts used for analysis and visualization
├── output/
│   ├── plots/              # figures and visualizations
│   └── tables/             # tables or model outputs
└── reports/                # R Markdown or Quarto reports

This layout mirrors a typical data analysis workflow:

  1. Raw data is stored safely in the data/raw folder and should never be overwritten. Cleaned or transformed data used for analysis can be saved in data/cleaned.
  2. Scripts read data from the cleaned folder and produce outputs such as plots or tables that are saved in output/.
  3. Reports (e.g., R Markdown or Quarto files) bring everything together to summarize results for interpretation or presentation.
  4. The .Rproj file lives in the project’s root directory. Opening this file automatically sets your working directory to the project folder and keeps file paths consistent, which is essential for reproducibility and collaboration.

To follow along with the rest of the video, create the directory structure below and then set up the .Rproj file:

setwd("/Users/garthcoombs/Desktop/psy1903/") # Will need to update to your path
dir.create("psy1903_project")
dir.create("psy1903_project/data")
dir.create("psy1903_project/data/raw")
dir.create("psy1903_project/data/cleaned")
dir.create("psy1903_project/scripts")
dir.create("psy1903_project/output")
dir.create("psy1903_project/output/plots")
dir.create("psy1903_project/output/tables")
dir.create("psy1903_project/reports")

Then do the following:

  1. In RStudio, go to File → New Project → Existing Directory
  2. Choose the folder: /Users/garthcoombs/Desktop/psy1903/psy1903_project
  3. Click Create Project

This will:

  • open RStudio in that directory,
  • create psy1903_project.Rproj inside that folder,
  • and automatically set your working directory to that project root.

You can confirm with getwd(), which should return "/Users/garthcoombs/Desktop/psy1903/psy1903_project".
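The folder-creation code above can also be written more compactly. This is only a sketch: it builds the same folders inside a temporary directory (tempdir()) so it can be run anywhere without touching your Desktop, and the project_root and subfolders names are illustrative.

```r
## Demo location only; in practice this would be your own project folder
project_root <- file.path(tempdir(), "psy1903_project")

## The same subfolders as in the layout above
subfolders <- c("data/raw", "data/cleaned", "scripts",
                "output/plots", "output/tables", "reports")

## recursive = TRUE creates parent folders (e.g., data/) as needed;
## showWarnings = FALSE keeps reruns quiet if a folder already exists
for (folder in file.path(project_root, subfolders)) {
  dir.create(folder, recursive = TRUE, showWarnings = FALSE)
}

## List what was created
list.dirs(project_root, full.names = FALSE)
```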

Now set up an example data file:

# Example IAT-style dataset
iat_data <- data.frame(
  subject_id = 1:20,
  rt = c(480, 530, 495, 610, 455, 390, 510, 565, 430, NA,
         380, 230, 395, 710, 755, 590, 810, 365, 630, 200),
  congruent = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE,
                TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE),
  condition = c("control", "control", "math_anxiety", "control",
                "math_anxiety", "control", "math_anxiety", "math_anxiety",
                "control", "math_anxiety", "control", "nature_anxiety",
                "math_anxiety", "control", "math_anxiety", "control",
                "math_anxiety", "nature_anxiety", "control", "math_anxiety"),
  prime = c("neutral", "neutral", "math", "neutral",
            "math", "neutral", "math", "math",
            "neutral", "math", "neutral", "nature",
            "math", "neutral", "math", "neutral",
            "math", "nature", "neutral", "math"),
  target = c("school", "nature", "school", "school",
             "school", "nature", "school", "school",
             "nature", "school", "nature", "school",
             "school", "nature", "school", "school",
             "school", "school", "nature", "school"),
  response = c("left", "right", "right", "left", "right", "left",
               "right", "right", "left", "right", "left", "left",
               "right", "left", "right", "left", "right", "right",
               "left", "right"),
  correct = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE,
              FALSE, FALSE, TRUE, FALSE, TRUE, TRUE,
              FALSE, TRUE, FALSE, TRUE, TRUE, FALSE,
              TRUE, FALSE)
)

write.csv(iat_data, "data/raw/iat_data.csv", row.names = FALSE)
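The raw-versus-cleaned split described earlier can be sketched as follows. This is an illustration under stated assumptions: it uses a small stand-in data frame instead of the full dataset, the only cleaning step is dropping trials with a missing rt, and the cleaned file goes to tempdir() so the sketch runs anywhere (in the project it would go to data/cleaned/).

```r
## Small stand-in for the raw data (the real file lives in data/raw/)
raw_data <- data.frame(
  subject_id = 1:5,
  rt = c(480, NA, 495, 610, 455),
  correct = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

## Keep only trials with a recorded reaction time;
## raw_data itself is left untouched
cleaned_data <- raw_data[!is.na(raw_data$rt), ]

## Write to a separate file so the raw data is never overwritten
cleaned_path <- file.path(tempdir(), "iat_data_cleaned.csv")
write.csv(cleaned_data, cleaned_path, row.names = FALSE)
```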

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

c) Why It Matters

A structured R Project setup provides major benefits:

  • Reproducibility: Your code can run anywhere — no broken file paths.
  • Portability: You can share your project folder and others can run it immediately.
  • Clarity: You’ll always know where each part of your analysis lives.
  • Safety: Separating “raw” and “cleaned” data prevents accidental overwriting of original data.
  • Efficiency: It’s easier to navigate, debug, and collaborate when your code and data follow consistent patterns.

Think of your R Project as your analysis home base — everything your research needs should live inside it.


4. Quarto and R Markdown: Reproducible Reports

a) What They Are

Quarto (.qmd) and R Markdown (.Rmd) are tools for literate programming — they let you write text, run code, and display results all in one document.
This keeps your analysis, results, and explanations synchronized so that if you rerun your code, your figures, tables, and statistics all update automatically.

Both tools are used for reproducible research, where anyone (including future you) can re-create your results from the same source file.

An .Rproj file defines the overall RStudio project: it sets the working directory and keeps all files organized. A .qmd file is a Quarto document inside that project, used to write and run code, text, and output together in a reproducible report.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

Pause the video and create and save a new .qmd file:

Create a New Quarto Project

  1. Open RStudio
  2. Go to File ▸ New Project ▸ New Directory ▸ New Quarto Project
  3. Give it a name (e.g., IAT_RT_Analysis)
  4. Click Browse… and navigate to the reports folder inside your project (e.g., ~/Desktop/psy1903/psy1903_project/reports/)
  5. Click Create Project
  6. RStudio will open a new window with your Quarto project ready.

Save Your File

  1. In the new project, go to File ▸ New File ▸ Quarto Document
  2. Paste or write your R and Markdown code
  3. Save it as IAT_RT_Analysis.qmd inside /reports

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

b) Document Structure

Each Quarto or R Markdown document has three main parts:

  1. A YAML header — title, author, and output settings.
  2. Narrative text, written in Markdown.
  3. Code chunks, where you place your R code.

Example — Quarto (.qmd):

---
title: "IAT Reaction Time Analysis"
author: "Your Name"
format: html
---

## Introduction
In this report, we analyze reaction times from an Implicit Association Test.

```{r}
iat_data <- read.csv("../data/raw/iat_data.csv")
mean(iat_data$rt, na.rm = TRUE)
```

Example — R Markdown (.Rmd):

---
title: "IAT Reaction Time Analysis"
author: "Your Name"
output: html_document
---

## Introduction
In this report, we analyze reaction times from an Implicit Association Test.

```{r}
iat_data <- read.csv("../data/raw/iat_data.csv")
mean(iat_data$rt, na.rm = TRUE)
```

In both formats, text outside the gray code chunks is written in plain Markdown, while code inside the chunks is executed when you render the report.

You can run a single code chunk by clicking the green “Run” arrow above it in RStudio or by pressing Cmd/Ctrl + Shift + Enter.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

c) What “Render” (or “Knit”) Means

When you click Render (Quarto) or Knit (R Markdown), R runs your entire file from start to finish, executes every code chunk, and produces a final output — usually:

  • HTML (interactive and shareable)
  • PDF (print-ready)
  • Word (editable)

This process supports reproducibility because clicking Render or Knit runs your code in a fresh R session each time, not in your interactive workspace.
If your file depends on something not defined inside it, rendering will fail, which is a good thing! It shows that the document cannot yet run cleanly and independently.

Chunks you run interactively still use your global workspace, so render the full document regularly to confirm it doesn’t depend on leftover variables.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

d) Working Directories and Paths

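When rendering a Quarto or R Markdown document, code chunks run by default with the working directory set to the folder that contains the document (here, reports/), not the project root.
That’s why the chunks above read the data with "../data/raw/iat_data.csv": the ../ steps up from reports/ to the project root. Because the path is still relative, the report stays portable and reproducible.

Avoid using setwd() inside Quarto or R Markdown. Relative paths written from the document’s folder already resolve correctly wherever the project is copied.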
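If you would like paths that work the same way from scripts in the project root and from documents in reports/, one option is to locate the project root in code. The helper below is a hypothetical, base-R sketch (find_project_root is not part of R or RStudio): it walks up from a starting folder until it finds a folder containing a .Rproj file. The here package offers a maintained version of this idea through here::here().

```r
## Hypothetical helper: walk up the folder tree until a .Rproj file is found
find_project_root <- function(start = getwd()) {
  current <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    ## Does this folder contain a file ending in .Rproj?
    if (length(list.files(current, pattern = "\\.Rproj$")) > 0) {
      return(current)
    }
    parent <- dirname(current)
    ## Reached the top of the file system without finding one
    if (identical(parent, current)) {
      stop("No .Rproj file found at or above ", start)
    }
    current <- parent
  }
}

## Example use (commented out because it needs a real project on disk):
## data_path <- file.path(find_project_root(), "data", "raw", "iat_data.csv")
## iat_data <- read.csv(data_path)
```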

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

e) Inline Code

You can include R results directly in your text using inline code:

The average reaction time was `r round(mean(iat_data$rt, na.rm = TRUE), 2)` milliseconds.

When rendered, this sentence will display the actual computed mean rather than the code.

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

f) Syntax Summary

| Feature | R Markdown (.Rmd) | Quarto (.qmd) |
|---|---|---|
| Format field | output: | format: |
| Inline R code | `r expression` | same |
| Button label | “Knit” | “Render” |
| Default file extension | .Rmd | .qmd |

⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯

g) Quarto vs. R Markdown — Which Should You Use?

R Markdown and Quarto share nearly identical syntax.
Quarto is the newer system and the one we recommend. It offers a consistent interface across R, Python, and Julia, whereas R Markdown is built around R.
R Markdown remains fully supported and is fine for legacy projects or simple assignments.

| Feature | R Markdown | Quarto (Recommended) |
|---|---|---|
| Introduced | 2014 | 2022 |
| Default button | Knit | Render |
| Language support | Primarily R (other languages through knitr engines) | R, Python, Julia, Observable JS |
| Best for | Existing .Rmd workflows | All new projects and reproducible pipelines |

In short: R Markdown got us here, but Quarto is where the field is going.
For new work, start with Quarto. It does everything R Markdown does, plus more, with nearly identical syntax.


5. Debugging and Common Sources of Errors

Debugging means systematically finding and fixing errors in your code — it’s a normal part of programming. Even experienced coders spend much of their time testing, refining, and debugging.

When errors occur, start by reading the message carefully, checking your objects with ls() and str(), and using tools like print() and traceback() to locate the issue.

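If things get confusing, clear your environment with rm(list = ls()) and rerun your script from the top. This checks that your code works independently of leftover variables. Note that rm(list = ls()) removes objects but leaves loaded packages in place, so restarting R (Session → Restart R in RStudio) is the more thorough test.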
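As a sketch of how inspecting an object can locate a problem, the example below reads a tiny made-up dataset in which one reaction time was typed as "N/A". Checking the column type shows that rt was not read as a number, which explains why a later mean() would not behave as expected. The raw_text, demo_data, and fixed_data names and the values are illustrative assumptions.

```r
## Tiny made-up dataset supplied as text so the sketch is self-contained
raw_text <- "subject_id,rt\n1,480\n2,N/A\n3,495"

demo_data <- read.csv(text = raw_text)
str(demo_data)               # shows that rt was not read as a number
is.numeric(demo_data$rt)     # FALSE: the "N/A" entry forced a non-numeric column

## Tell read.csv() which strings mean "missing", then check again
fixed_data <- read.csv(text = raw_text, na.strings = c("NA", "N/A"))
str(fixed_data)
mean_rt <- mean(fixed_data$rt, na.rm = TRUE)
```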

Scope and Hidden Dependencies

Sometimes the problem isn’t syntax or logic, but scope — when your code relies on a global variable instead of one passed as an argument.
These hidden dependencies can make a function appear to work, but only as long as certain objects exist in your workspace.

If a script works once and then fails after you clear your environment, that’s a sign of a scope issue.
Functions should be self-contained, with all needed variables passed as arguments.

You can check what’s in your environment with ls(), and environment() tells you which environment the current code is running in.
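Here is a minimal sketch of a hidden dependency and its self-contained counterpart. The function names (count_slow_hidden, count_slow), the rt_cutoff variable, and the values are made up for illustration.

```r
rt_cutoff <- 500  # a global variable that happens to exist in the workspace

## Hidden dependency: works only while rt_cutoff exists outside the function
count_slow_hidden <- function(rt) {
  sum(rt > rt_cutoff, na.rm = TRUE)
}

## Self-contained: everything the function needs is passed as an argument
count_slow <- function(rt, cutoff) {
  sum(rt > cutoff, na.rm = TRUE)
}

rt <- c(480, 530, NA, 610)
count_slow_hidden(rt)          # 2, but only because rt_cutoff is defined above
count_slow(rt, cutoff = 500)   # 2

## Simulate a cleared environment
rm(rt_cutoff)

## count_slow_hidden(rt) would now stop with: object 'rt_cutoff' not found
count_slow(rt, cutoff = 500)   # still 2
```

The second function keeps working after the cleanup because nothing it needs lives outside its arguments.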

For a deeper look at how scope works and how to avoid global-variable bugs, see the Scope and Troubleshooting Loops & Functions notes.

For a full walkthrough of debugging tools, workflows, and using AI to help fix errors, see the Debugging and AI Tools notes and video.

Common Mistakes and How to Avoid Them

Even when your code runs, it may not be working the way you expect.
Many issues in R come from small but common mistakes that can be prevented with a few simple habits.

| Mistake | Why It Happens | How to Fix |
|---|---|---|
| Missing parentheses or quotes | Typos or mismatched brackets cause R to stop reading your command. | Check that every opening (, [, {, or " has a closing pair. |
| Object not found | Variable hasn’t been defined or has been cleared. | Run ls() to view existing objects; re-run the code that defines it. |
| Using old global variables | Code relies on values left in memory from earlier runs. | Pass inputs as function arguments; test after clearing your environment. |
| Ignoring NAs | mean() and similar functions return NA if data contain missing values. | Add na.rm = TRUE to ignore missing values during calculations. |
| Hardcoded file paths | Code breaks when run on a different computer or folder structure. | Use relative paths (e.g., "data/cleaned/myfile.csv") instead of full absolute paths. |
| Mixing comment styles | Single # lines can become uncommented accidentally when toggling. | Use ## for comments that should always stay commented. |
| No structure | Code runs but is difficult to read or debug later. | Use headers (## Section Name) and consistent indentation to organize scripts. |
| Skipping testing | Functions rely on global objects or untested logic. | Clear the environment and rerun your script to verify it works from scratch. |
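The “Ignoring NAs” row is easy to see in a short sketch. The rt values are made up; the point is that one missing value turns the whole mean into NA unless you ask R to skip it, and that it is worth counting how many values were skipped.

```r
## Illustrative reaction times with one missing value
rt <- c(480, 530, NA, 610)

mean(rt)                            # NA: the missing value propagates
mean_rt <- mean(rt, na.rm = TRUE)   # 540: computed from the three recorded values

## Report how many values were dropped so the exclusion is transparent
n_missing <- sum(is.na(rt))         # 1
```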

Tip: The best time to catch bugs is while you’re writing code — add comments, use consistent names, and test incrementally as you build.


6. Saving and Reloading Objects

Before clearing your environment or closing RStudio, it’s important to save any objects you’ll need later — such as data frames, models, or analysis results.
Saving objects ensures your work is reproducible and prevents loss of progress if R crashes or your session resets.

R can save objects in several formats, depending on what you need:

| Format | Function | Description |
|---|---|---|
| .csv | write.csv() / read.csv() | Saves tabular data as a human-readable text file. Can be opened in Excel or Google Sheets. |
| .rds | saveRDS() / readRDS() | Saves one R object in its exact structure. Ideal for models or single datasets. |
| .RData | save() / load() | Saves multiple objects together into one workspace file. |

Example:

## Load Data
iat_data <- read.csv("data/raw/iat_data.csv")

## Save multiple objects together
my_model <- lm(rt ~ target + condition, data = iat_data)
save(iat_data, my_model, file = "data/cleaned/important_objects.RData")

## Clear the environment
rm(list = ls())

## Reload them later
load("data/cleaned/important_objects.RData")

Note: This code is meant to be run from an R script, not the .qmd file we created, so the relative paths are different. The .qmd file runs with /psy1903_project/reports/ as its working directory, while the R script runs from the project root, /psy1903_project/. To run the code from the .qmd, add ../ at the start of each path, e.g., iat_data <- read.csv("../data/raw/iat_data.csv").

When to use each:

  • Use .csv for sharing data with others or storing raw and cleaned datasets.
  • Use .rds for saving intermediate R objects you’ll reload into scripts or reports.
  • Use .RData for saving several related objects together (e.g., data + model results).
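The example above uses save() and load(); for a single object, saveRDS() and readRDS() work as sketched below. The data values and object names are made up, and the file is written to tempdir() only so the sketch runs anywhere (in the project it might go in output/ or data/cleaned/).

```r
## Small made-up dataset and model
model_data <- data.frame(
  rt = c(480, 530, 495, 610),
  congruent = c(TRUE, TRUE, FALSE, FALSE)
)
rt_model <- lm(rt ~ congruent, data = model_data)

## Save exactly one object to an .rds file
model_path <- file.path(tempdir(), "rt_model.rds")
saveRDS(rt_model, model_path)

## readRDS() returns the object, so you choose the name it is assigned to
reloaded_model <- readRDS(model_path)
all.equal(coef(rt_model), coef(reloaded_model))
```

Unlike load(), which restores objects under their original names, readRDS() makes the assignment explicit, so it is harder to overwrite something in your environment by accident.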

Tip: Avoid saving your entire workspace automatically on exit.
Instead, save specific objects intentionally — this keeps your project clean and ensures you always know what each file contains.


7. Summary

  • Use clear, descriptive names, consistent indentation, and double-hash (##) comments to make your code readable and maintainable.
  • Plan scripts with headers and sections so others (and future you) can follow your workflow easily.
  • Work inside an R Project and use relative paths — this ensures your code runs anywhere without manual adjustments.
  • Keep your analyses organized: store raw data, cleaned data, scripts, and outputs in separate folders for clarity and reproducibility.
  • Write self-contained functions that use arguments instead of relying on global variables. Clear your environment regularly to test this.
  • Use Quarto (or R Markdown for legacy projects) to produce reproducible reports that integrate text, code, and output. Rendering (or knitting) confirms your workflow runs from start to finish.
  • When debugging, work systematically:
    • Use print() or paste() to inspect intermediate results.
    • Use tools like str(), traceback(), and browser() to isolate problems.
    • Remember that hidden scope issues often cause “object not found” errors.
  • Save important objects (data, models, results) intentionally using saveRDS() or save() before clearing your environment.
  • Prefer vectorized functions where they fit. They are usually shorter, faster, and easier to read than manual loops.

Readable, reproducible code reflects careful thinking. It’s not just good programming practice — it’s good scientific practice.