PSY 1903 Programming for Psychologists

Data Types in R

R has a variety of data types and structures to manage different kinds of information. Here's an overview of the main ones you'll encounter, including atomic data types and data structures like vectors, arrays, data frames, and more.

Data Types

1. Atomic Data Types:

Atomic data types are the simplest types in R and represent individual pieces of data. R has no separate scalar type: a single value such as 3.5 is simply a vector of length one.

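  • Numeric: Numeric values represent real numbers (e.g., 3.5, -4, 100). By default R stores them as double-precision (decimal) values, so a whole number such as 100 is still numeric, not integer, unless you mark it as an integer.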

    x <- 3.5  # Numeric value
    
  • Integer: Integers are whole numbers stored with the integer type, created by appending an "L" to a whole number (e.g., 5L, -2L).

    x <- 5L  # Integer
    
  • Character (String): Character values are text, or strings, enclosed in quotes ("Hello", 'World').

    x <- "Hello, R!"  # Character
    
  • Logical (Boolean): Logical values are either TRUE or FALSE, often used for conditions and comparisons.

    x <- TRUE  # Logical value
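
The four atomic types above can be checked directly in the console. The sketch below is an illustration rather than part of the course materials, and the variable names (num_value, num_type, and so on) are made up for the example. It uses typeof(), which reports how R stores a value, and class(), which reports the more general label.

```r
# Store one value of each atomic type
num_value <- 3.5
int_value <- 5L
chr_value <- "Hello, R!"
lgl_value <- TRUE

# typeof() reports the internal storage type
num_type <- typeof(num_value)  # "double"
int_type <- typeof(int_value)  # "integer"
chr_type <- typeof(chr_value)  # "character"
lgl_type <- typeof(lgl_value)  # "logical"
print(c(num_type, int_type, chr_type, lgl_type))

# class() uses the friendlier label "numeric" for doubles
num_class <- class(num_value)  # "numeric"

# A whole number typed without the L suffix is still stored as a double
whole_is_integer <- is.integer(5)    # FALSE
suffix_is_integer <- is.integer(5L)  # TRUE
```

Note that typeof() says "double" where the text says numeric: double is the storage type R uses for numeric values by default.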
    

2. Data Structures:

Data structures are collections of these atomic data types and can represent more complex data.

  • Vector: A vector is a sequence of elements of the same type, making it the most basic data structure in R.

    • Vectors can hold any atomic type, but they must be homogeneous (all elements of the same type).
    • You can check the type of a vector with typeof() and the structure with str().
    • You can create a vector by passing numbers or "words" separated by commas to the combine (concatenate) function, c(). Note that words must be in quotes.
    numeric_vector <- c(1.5, 2.3, 5.0)  # Numeric vector
    character_vector <- c("apple", "banana", "cherry")  # Character vector
    typeof(character_vector) # Will output "character" in the console window
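
    Because vectors must be homogeneous, R converts mixed inputs to a single common type instead of raising an error. The sketch below illustrates that behavior along with basic indexing; the names mixed, num_and_logical, and second_value are made up for the example.

    ```r
    # Mixing types in c() coerces everything to the most flexible type
    mixed <- c(1.5, TRUE, "apple")
    typeof(mixed)  # "character"
    print(mixed)   # "1.5" "TRUE" "apple"

    # Logical values become numbers when combined with numbers (TRUE is 1)
    num_and_logical <- c(1.5, TRUE)
    typeof(num_and_logical)  # "double"

    # Indexing starts at 1, not 0
    numeric_vector <- c(1.5, 2.3, 5.0)
    second_value <- numeric_vector[2]  # 2.3

    # str() gives a compact summary of type, length, and values
    str(numeric_vector)
    ```

    If the values in a vector unexpectedly show up in quotes, a stray text element that forced this conversion is a likely cause.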
    
  • List: A list is a more flexible structure than a vector because it can hold elements of different types, including vectors, other lists, and even functions. Lists are heterogeneous.

    my_list <- list(1.5, "apple", TRUE, c(1, 2, 3))  # Mixed elements
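
    Pulling elements back out of a list is a common stumbling block, so here is a small sketch of the two bracket styles. The names second_item, sub_list, and named_list are illustrative assumptions, not part of the original example.

    ```r
    my_list <- list(1.5, "apple", TRUE, c(1, 2, 3))

    # Double brackets extract the element itself
    second_item <- my_list[[2]]  # "apple"
    class(second_item)           # "character"

    # Single brackets return a smaller list that still wraps the element
    sub_list <- my_list[2]
    class(sub_list)              # "list"

    # An extracted vector can be indexed as usual
    inner_value <- my_list[[4]][2]  # 2

    # List elements can also be named and then accessed with $
    named_list <- list(id = 1, fruit = "apple")
    fruit_value <- named_list$fruit  # "apple"
    ```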
    
  • Matrix: A matrix is a two-dimensional collection of elements of the same type (usually numeric). You create a matrix with matrix() and define the number of rows and columns.

    my_matrix <- matrix(1:9, nrow = 3, ncol = 3)  # 3x3 matrix
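
    One detail worth seeing in action is that matrix() fills values column by column unless told otherwise. The sketch below shows this using [row, column] indexing; by_row and the other result names are made up for the example.

    ```r
    my_matrix <- matrix(1:9, nrow = 3, ncol = 3)
    print(my_matrix)

    # dim() returns the number of rows and columns
    matrix_dims <- dim(my_matrix)  # 3 3

    # Index with [row, column]; leave one side blank to take a whole row or column
    one_cell <- my_matrix[2, 3]    # 8
    first_row <- my_matrix[1, ]    # 1 4 7

    # byrow = TRUE fills across rows instead of down columns
    by_row <- matrix(1:9, nrow = 3, byrow = TRUE)
    first_row_by_row <- by_row[1, ]  # 1 2 3
    ```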
    
  • Array: An array is a multi-dimensional generalization of a matrix, capable of having more than two dimensions. Elements in an array must be of the same type.

    my_array <- array(1:12, dim = c(3, 2, 2))  # 3D array
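
    Indexing an array works like indexing a matrix with one index per dimension. A minimal sketch, with first_layer and one_value as illustrative names:

    ```r
    my_array <- array(1:12, dim = c(3, 2, 2))

    # Three dimensions: 3 rows, 2 columns, 2 layers
    array_dims <- dim(my_array)  # 3 2 2

    # Leaving the first two indices blank returns a whole layer, which is a 3x2 matrix
    first_layer <- my_array[, , 1]
    print(first_layer)

    # Row 2, column 1, layer 2
    one_value <- my_array[2, 1, 2]  # 8
    ```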
    
  • Data Frame: A data frame is a two-dimensional structure similar to a table or spreadsheet. Each column holds a single type, but different columns can have different types (e.g., numeric, character, logical), so a data frame is homogeneous within each column and heterogeneous across columns. You can access a column using $ notation, such as my_data$name, and preview the first rows with head().

    my_data <- data.frame(
      id = 1:3,
      name = c("Alice", "Bob", "Charlie"),
      score = c(85.5, 92.0, 88.5)
    )
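
    The sketch below shows a few everyday operations on this data frame: checking column types, previewing rows, and filtering. It is an illustration under one assumption that differs from the example above: stringsAsFactors = FALSE is set explicitly so that name stays a character column on R versions before 4.0 as well. The result names (column_classes, high_scorers, and so on) are made up.

    ```r
    my_data <- data.frame(
      id = 1:3,
      name = c("Alice", "Bob", "Charlie"),
      score = c(85.5, 92.0, 88.5),
      stringsAsFactors = FALSE  # assumption: keep text as character on older R
    )

    # Each column has its own type
    column_classes <- sapply(my_data, class)
    print(column_classes)  # id: integer, name: character, score: numeric

    # $ pulls out one column as a vector
    all_names <- my_data$name
    mean_score <- mean(my_data$score)

    # head() previews the first rows (here, the first two)
    first_two <- head(my_data, 2)

    # Keep only the rows where score is above 88
    high_scorers <- my_data[my_data$score > 88, ]
    print(high_scorers$name)  # "Bob" "Charlie"
    ```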
    

3. Special Data Types:

R also has a few additional object types that don't fit neatly into the categories above.

  • Factor: A factor represents categorical data and stores unique categories (levels). It’s commonly used for grouping and statistical modeling.

    colors <- factor(c("red", "green", "blue", "green", "red"))
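
    To see what a factor actually stores, it helps to look at its levels and the integer codes behind them. A minimal sketch, with color_levels, color_counts, and color_codes as illustrative names:

    ```r
    colors <- factor(c("red", "green", "blue", "green", "red"))

    # levels() lists the unique categories, sorted alphabetically by default
    color_levels <- levels(colors)  # "blue" "green" "red"

    # table() counts how many observations fall in each level
    color_counts <- table(colors)
    print(color_counts)  # blue 1, green 2, red 2

    # Internally each value is stored as an integer code pointing to a level
    color_codes <- as.integer(colors)  # 3 2 1 2 3
    ```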
    
  • Function: Functions are first-class objects in R, meaning you can store them in variables, pass them as arguments, and return them from other functions. You define functions using the function keyword.

    my_function <- function(x, y) {
      return(x + y)
    }
    my_function(3, 5)  # Calls the function and returns 8
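
    The three "first-class" behaviors named above can each be shown in a line or two. This is a sketch for illustration; add, make_adder, and add_ten are hypothetical names, and mapply() is just one of several base R functions that accept another function as an argument.

    ```r
    my_function <- function(x, y) {
      return(x + y)
    }

    # 1. Store the function under a second name
    add <- my_function
    stored_result <- add(3, 5)  # 8

    # 2. Pass the function as an argument: mapply() calls it on each pair of values
    pair_sums <- mapply(my_function, c(1, 2, 3), c(10, 20, 30))  # 11 22 33

    # 3. Return a function from another function
    make_adder <- function(n) {
      function(x) my_function(x, n)
    }
    add_ten <- make_adder(10)
    returned_result <- add_ten(5)  # 15
    ```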
    

Summary of R Data Types and Structures

Type          Description
Atomic Types  Numeric, Integer, Character, Logical
Vector        1D, homogeneous data (all same type)
List          1D, heterogeneous data (can store multiple types)
Matrix        2D, homogeneous data
Array         Multi-dimensional, homogeneous data
Data Frame    2D, heterogeneous data (columns can vary in type)
Factor        Categorical data with unique levels
Function      First-class objects for performing tasks