PSY 1903 Programming for Psychologists


Week 11 · Scoring, Filtering, and Summarizing One Participant

1 · What “Scoring” Means in This Context

In most experiments, we collect two main types of data:

  • Behavioral performance, such as reaction times and accuracy, and
  • Questionnaire responses, which measure self-reported states or attitudes.

Questionnaire responses usually consist of multiple items that need to be combined into a single number representing a participant’s overall standing on a construct—such as engagement, focus, or anxiety. This process is called scoring.

Different questionnaires are scored in different ways:

  • Most use sum scores, while others use the mean to keep the overall score on the same scale as each item.
  • Some items are worded in the opposite direction and must be reverse-scored so that higher values consistently indicate more of the same construct.
  • Some questionnaires yield both a global score and sub-scores that capture related but distinct dimensions.
    • For example, the NEO Personality Inventory’s global Neuroticism construct includes sub-scores for anxiety, angry hostility, depressed mood, self-consciousness, impulsiveness, and vulnerability.

This is a common workflow in psychological research: pairing self-report data with behavioral measures to understand both subjective and objective aspects of performance.

The goal of scoring is to turn raw responses into valid, interpretable measures that summarize each participant’s behavior and psychological state.
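To make the sum-versus-mean distinction concrete, here is a tiny R sketch with made-up responses (not from any real questionnaire):

## Five made-up item responses on a 0–4 scale
responses <- c(3, 2, 4, 3, 4)

sum(responses)    ## sum score: 16 (possible range 0–20 for five items)
mean(responses)   ## mean score: 3.2 (stays on the 0–4 item scale)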


2 · Example Questionnaire

After completing the NPT, participants completed the Task Engagement and Focus (TEF-10) questionnaire, which measures how well they felt they stayed on task. The TEF-10 is scored by taking the mean of its ten items.

  1. I stayed focused on the digits.
  2. I felt distracted during the task. (reverse)
  3. I understood what each key meant.
  4. My mind wandered while I was responding. (reverse)
  5. I tried my best throughout the task.
  6. I felt confident in my responses.
  7. I rushed through the trials without thinking. (reverse)
  8. The instructions were clear.
  9. I paid attention to the mapping rules.
  10. I was motivated to do well.

Reverse-scored items: 2, 4, 7

Each item uses a five-point Likert scale:

  • 0 = Strongly Disagree
  • 1 = Disagree
  • 2 = Neutral
  • 3 = Agree
  • 4 = Strongly Agree

To reverse-score an item, use the formula: reversed_value = (max_likert_value + min_likert_value) - original_value

Because jsPsychSurveyLikert starts at 0 and ends at 4, the reverse-score formula is:
reversed_value = 4 − original_value

If your questionnaire instead used a 1–5 scale, the reverse formula would be:
reversed_value = 6 − original_value
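As a quick R sketch (with made-up responses, purely for illustration), reverse-scoring a couple of items on a 0–4 scale looks like this:

## Made-up responses to five items on a 0–4 scale
responses <- c(3, 1, 4, 0, 2)

## Suppose items 2 and 4 are reverse-worded: (max + min) - original = 4 - original
responses[c(2, 4)] <- 4 - responses[c(2, 4)]
responses   ## 3 3 4 4 2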

Understanding the structure of this questionnaire will help you write the scoring function in a systematic way. We know we need to extract the responses, correct the reversed ones, and then calculate a total or mean score. Each step of the code will mirror these conceptual steps.


3 · Working with and Scoring the Questionnaire

When you use jsPsychSurveyLikert, all of a participant’s questionnaire responses get saved in a single row of your data file. In our experiments, including today's example data, we’ve been tagging that row with something like trialType = "questionnaire" so we can easily find it later when we’re scoring.

If you look at that row in your imported data, you will see that the response column contains a long text string that looks like this:

{"item1":4,"item2":3,"item3":1,"item4":2,"item5":5,"item6":4,"item7":2,"item8":1,"item9":4,"item10":5}

This format is called JSON, which stands for JavaScript Object Notation. JSON is a compact, structured way to store information so that different programs can easily read and exchange it.

Inside the curly braces { }, information is organized as key–value pairs.
Each key (for example "item1") represents a variable name or questionnaire item, and each value (for example 4) represents the participant’s response.
Together, these pairs describe one participant’s full set of answers.

Key       Value   Meaning
"item1"   4       The participant selected response 4 on item 1
"item2"   3       The participant selected response 3 on item 2
"item3"   1       The participant selected response 1 on item 3

Although this looks organized, R treats the entire JSON object as a single character string because it sees only text inside that cell.
To analyze the responses, we must decode the JSON into a structure that R can work with.


Decoding the JSON

We will use two R functions to turn this text into numeric data:

  1. fromJSON() from the jsonlite package will read the JSON string and convert it into an R list, where each item corresponds to a questionnaire response.
  2. unlist() will flatten that list into a simple numeric vector, which can then be used to compute means or totals.

This decoding step is essential because R cannot perform numeric operations (such as averaging or reversing) on text values.
Once the responses are numeric, we can score the questionnaire by applying the reverse-scoring rules and calculating an overall mean or total.
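For example, here is a minimal sketch of what those two steps produce, using the JSON string shown above (the object names are just illustrative):

library(jsonlite)

json_string <- '{"item1":4,"item2":3,"item3":1,"item4":2,"item5":4,"item6":4,"item7":2,"item8":1,"item9":4,"item10":4}'

## fromJSON() returns a named list: $item1 is 4, $item2 is 3, and so on
responses_list <- fromJSON(json_string)

## unlist() flattens the list into a named vector; as.numeric() drops the names
responses <- as.numeric(unlist(responses_list))
responses   ## 4 3 1 2 4 4 2 1 4 4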


Scoring the Questionnaire

Now that we understand how the data are stored and what needs to happen, we can work through the code to actually accomplish this.
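Before we can decode anything, we need to pull that JSON string out of the questionnaire row. Assuming your imported data frame is called participant_data and the questionnaire row is tagged as described above (a sketch; adjust the column names to match your own data):

## Extract the response cell from the questionnaire row
json_string <- participant_data$response[participant_data$trialType == "questionnaire"]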

Below is the scaffold for this analysis pipeline.

#### score_questionnaire.R -----------------------------------------------------
## Purpose: Take a JSON string from the questionnaire row and return a single score.
## Scale: jsPsychSurveyLikert default 0–4. Reverse items: 2, 4, 7.

library(jsonlite)   ## needed for fromJSON()

## 1) Parse the JSON string into an R object
##    Use jsonlite::fromJSON() to convert the text into a list.
## Example:
responses <- fromJSON(json_string)

## 2) Flatten and convert to numeric
##    Use unlist() to turn the list into a vector and coerce to numeric if needed.
## Example:
responses <- as.numeric(unlist(responses))

## 3) Reverse-score the specified items
rev_items <- c(2, 4, 7)
responses[rev_items] <- 4 - responses[rev_items]

## 4) Compute the final score
mean_score <- mean(responses, na.rm = TRUE)

Now we've calculated this participant's mean score on the Task Engagement and Focus (TEF-10) questionnaire.


4 · Working with and Scoring the Behavioral Data

Once we’ve scored the questionnaire for a single participant, the next step is to summarize their behavioral performance — how quickly and accurately they responded during the task.

Our raw data include multiple trial types and blocks, so we’ll calculate separate summary statistics for each subset of trials. This lets us see how performance differs across conditions like practice, magnitude, and parity.

We’ll focus on two key measures for each subset:

  • Mean reaction time (RT), which tells us how fast participants responded on valid trials
  • Accuracy, which tells us the proportion of trials they answered correctly

Reaction time (RT) data often contain outliers.
Some responses are unrealistically fast (accidental keypresses), while others are extremely slow (distraction or inattention).
Filtering helps remove those implausible values before calculating averages.

For this task, we will exclude:

  • RTs below 250 ms (too fast)
  • RTs above 900 ms (too slow)

Filtering should be done separately within each subset (practice, magnitude, and parity) so that each summary is based only on that subset's plausible trials.

Typical workflow:

  1. Split the data into practice and experiment blocks.
  2. Within the experiment block, split by trial type (for example, magnitude vs parity).
  3. Filter each subset to keep only RTs between 250 and 900 ms.
  4. Compute mean RT and accuracy for each subset.

Step-by-step example

After filtering out implausible reaction times (for example, keeping only 250–900 ms), we can use the mean() function to compute these summaries. We’ll include na.rm = TRUE so R ignores any missing data.

#### Score Behavioral Data ------------------------------------
## Separate data into block and trial types
practice_filtered  <- participant_data[participant_data$block == "practice", ]
magnitude_filtered <- participant_data[participant_data$block == "experiment" & participant_data$trial_type == "magnitude", ]
parity_filtered    <- participant_data[participant_data$block == "experiment" & participant_data$trial_type == "parity", ]
  
## Filter out unreasonable reaction times (keep 250–900 ms)
practice_filtered  <- practice_filtered[practice_filtered$rt  >= 250 & practice_filtered$rt  <= 900, ]
magnitude_filtered <- magnitude_filtered[magnitude_filtered$rt >= 250 & magnitude_filtered$rt <= 900, ]
parity_filtered    <- parity_filtered[parity_filtered$rt    >= 250 & parity_filtered$rt    <= 900, ]
  
## Calculate mean reaction time and accuracy for each trial type
practice_mean_rt  <- mean(practice_filtered$rt, na.rm = TRUE)
practice_acc      <- mean(practice_filtered$correct, na.rm = TRUE)
magnitude_mean_rt <- mean(magnitude_filtered$rt, na.rm = TRUE)
magnitude_acc     <- mean(magnitude_filtered$correct, na.rm = TRUE)
parity_mean_rt    <- mean(parity_filtered$rt, na.rm = TRUE)
parity_acc        <- mean(parity_filtered$correct, na.rm = TRUE)

## Calculate standard deviation of reaction times for each trial type
practice_sd_rt  <- sd(practice_filtered$rt, na.rm = TRUE)
magnitude_sd_rt <- sd(magnitude_filtered$rt, na.rm = TRUE)
parity_sd_rt    <- sd(parity_filtered$rt, na.rm = TRUE)

Now we've calculated this participant's mean reaction times and accuracy for all relevant experiment trial types (practice, magnitude, and parity).


5 · Summarizing One Participant

Now that we have our questionnaire score (mean_score) and behavioral summaries (practice_mean_rt, practice_acc, magnitude_mean_rt, magnitude_acc, parity_mean_rt, parity_acc, practice_sd_rt, magnitude_sd_rt, parity_sd_rt), we can assemble them into a single participant-level row.

  1. Derive a subject ID from the file name

When working with real data, each participant's results are usually stored in a separate file. We need a way to keep track of which participant each file belongs to so that, once we combine them later, we can still identify individual participants.

Rather than typing IDs manually, we can extract them directly from the file names using a few simple R commands.

file_path  <- "psy1903/web/npt_project/data/raw/npt-experiment-2025-11-05-12-34-56.csv"
subject_id <- sub("\\.csv$", "", basename(file_path))
  • file_path: Stores the full path to one participant’s data file. Here it’s written explicitly, but later we’ll pass it in automatically when looping through all participants.
  • basename(file_path): Removes the folder path and leaves only the file name: "npt-experiment-2025-11-05-12-34-56.csv"
  • sub("\\.csv$", "", ...): Removes the .csv extension from the file name
    • sub() replaces part of a string that matches a pattern
    • "\\.csv$" is a regular expression meaning “the characters .csv at the end of the string.”
    • The second argument "" means “replace it with nothing.”

The result, stored in subject_id, is a clean, readable participant identifier such as: "npt-experiment-2025-11-05-12-34-56"
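If you prefer, base R's tools package has a helper that strips the file extension in one step; this is an optional alternative, not something the rest of the workflow depends on:

## Optional alternative: strip the extension with tools::file_path_sans_ext()
subject_id <- tools::file_path_sans_ext(basename(file_path))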

We’ll include this ID as a column in each participant’s summary row so that when all participants are combined into one data frame, we can trace every row back to its source file.

  2. Create a one-row data frame for this participant
participant_summary <- data.frame(
  subject_id         = subject_id,
  tef10_score        = mean_score,                  
  practice_mean_rt   = practice_mean_rt,
  practice_acc       = practice_acc,
  magnitude_mean_rt  = magnitude_mean_rt,
  magnitude_acc      = magnitude_acc,
  parity_mean_rt     = parity_mean_rt,
  parity_acc         = parity_acc,
  practice_sd_rt     = practice_sd_rt,
  magnitude_sd_rt    = magnitude_sd_rt,
  parity_sd_rt       = parity_sd_rt,
  stringsAsFactors   = FALSE
)

participant_summary

This gives us a tidy, single-row summary for one participant that we can later stack with others.
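As a preview of where this is headed, once every participant has a summary row like this one, stacking them is a single rbind() call (participant_summary_2 here is a hypothetical second row, just to show the idea):

## Hypothetical: combine two participant-level rows into one data frame
all_summaries <- rbind(participant_summary, participant_summary_2)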


6 · Quality Checks

Let's run a few quick checks on our single-participant objects before we move on, just to make sure everything in our code looks like it ran correctly.

# Questionnaire score is numeric and within the 0–4 scale range (allowing NA)
mean_score
is.numeric(mean_score)
mean_score >= 0 & mean_score <= 4

# RTs were filtered as intended (spot check ranges)
practice_mean_rt; magnitude_mean_rt; parity_mean_rt

# Accuracy is a proportion between 0 and 1
practice_acc; magnitude_acc; parity_acc

# The summary is exactly one row
nrow(participant_summary) == 1

# Optional: check for missing values you expect or need to handle
anyNA(participant_summary)

If anything looks off, revisit the earlier steps (subsetting, RT filtering, reverse scoring).
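If you would rather have R stop loudly when a check fails, you can wrap the same checks in stopifnot(); this is an optional sketch, not part of the required pipeline:

## Optional: error out immediately if any check fails (NA values also trigger an error)
stopifnot(
  is.numeric(mean_score),
  nrow(participant_summary) == 1,
  practice_acc >= 0 & practice_acc <= 1,
  magnitude_acc >= 0 & magnitude_acc <= 1,
  parity_acc >= 0 & parity_acc <= 1
)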


7 · Summary

We now have a clear, single-row summary for one participant that includes a questionnaire score (items 2, 4, and 7 reversed, then averaged on the 0–4 scale), reaction times filtered to 250–900 ms, and mean RT, SD of RT, and accuracy for the practice, magnitude, and parity trials.

Next, we will refactor this working code into small, reusable functions and apply them to every participant file automatically. This will let us combine all participant summaries into one data frame and save the results for analysis.