    R Programming Language Data Types: Examples and Usage

    Data Types

    R has a wide variety of data types including scalars, vectors (numerical, character, logical), matrices, data frames, and lists.

    Vectors

    a <- c(1,2,5.3,6,-2,4) # numeric vector
    b <- c("one","two","three") # character vector
    c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE) #logical vector

    Refer to elements of a vector using subscripts.

    a[c(2,4)] # 2nd and 4th elements of vector
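
    Subscripts can be positions, negative positions, or logical conditions. The following is a minimal sketch of those forms; it redefines the vector a from above so it runs on its own, and the result names (second_and_fourth, without_first, positives, first_three) are illustrative only.

```r
# Self-contained copy of the numeric vector from the text
a <- c(1, 2, 5.3, 6, -2, 4)

second_and_fourth <- a[c(2, 4)]  # select elements by position
without_first     <- a[-1]       # a negative index drops that element
positives         <- a[a > 0]    # a logical index keeps elements where the test is TRUE
first_three       <- a[1:3]      # a range of positions
```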

    Matrices

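    All columns in a matrix must have the same mode (numeric, character, etc.) and the same length. The general format is

    mymatrix <- matrix(vector, nrow=r, ncol=c, byrow=FALSE,
      dimnames=list(char_vector_rownames, char_vector_colnames))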

    byrow=TRUE indicates that the matrix should be filled by rows. byrow=FALSE indicates that the matrix should be filled by columns (the default). dimnames provides optional labels for the columns and rows.

    # generates 5 x 4 numeric matrix 
    y<-matrix(1:20, nrow=5,ncol=4)

    # another example
    cells <- c(1,26,24,68)
    rnames <- c("R1", "R2")
    cnames <- c("C1", "C2") 
    mymatrix <- matrix(cells, nrow=2, ncol=2, byrow=TRUE,
      dimnames=list(rnames, cnames))
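
    To see what byrow changes, the same cells can be laid out both ways. This is a small sketch; the names by_row and by_col are made up for the illustration.

```r
cells <- c(1, 26, 24, 68)

# Filled row by row: row 1 is 1 26, row 2 is 24 68
by_row <- matrix(cells, nrow = 2, ncol = 2, byrow = TRUE)

# Filled column by column (byrow = FALSE is the default):
# column 1 is 1 26, column 2 is 24 68
by_col <- matrix(cells, nrow = 2, ncol = 2)

print(by_row)
print(by_col)
```

    With the same input, one layout is the transpose of the other.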

    Identify rows, columns or elements using subscripts.

    y[,4] # 4th column of matrix y
    y[3,] # 3rd row of matrix y
    y[2:4,1:3] # rows 2,3,4 of columns 1,2,3
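
    As a runnable sketch of these subscripts, applied to the 5 x 4 matrix built earlier (redefined here so the block stands alone; the result names are illustrative). Note that selecting a single row or column returns a plain vector unless drop = FALSE is given.

```r
y <- matrix(1:20, nrow = 5, ncol = 4)  # filled by columns

col4  <- y[, 4]        # 4th column, returned as a vector
row3  <- y[3, ]        # 3rd row, returned as a vector
block <- y[2:4, 1:3]   # rows 2 to 4 of columns 1 to 3, still a matrix

# drop = FALSE keeps the matrix structure for a single column
col4_matrix <- y[, 4, drop = FALSE]
```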

    Arrays

    Arrays are similar to matrices but can have more than two dimensions. See help(array) for details.
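
    A minimal sketch of a three-dimensional array follows; the name arr and the dimensions 2 x 3 x 4 are chosen only for illustration.

```r
# 24 values arranged as 2 rows x 3 columns x 4 layers
arr <- array(1:24, dim = c(2, 3, 4))

dims    <- dim(arr)      # 2 3 4
element <- arr[2, 3, 1]  # row 2, column 3, layer 1
layer2  <- arr[, , 2]    # the second layer, a 2 x 3 matrix
```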

    Data Frames

    A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). This is similar to SAS and SPSS datasets.

    d <- c(1,2,3,4)
    e <- c("red", "white", "red", NA)
    f <- c(TRUE,TRUE,TRUE,FALSE)
    mydata <- data.frame(d,e,f)
    names(mydata) <- c("ID","Color","Passed") # variable names

    There are a variety of ways to identify the elements of a data frame.

    mydata[2:3] # columns 2 and 3 of data frame
    mydata[c("ID","Color")] # columns ID and Color from data frame
    mydata$ID # variable ID in the data frame
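
    The access patterns above can be sketched against the small data frame built earlier. The block rebuilds mydata so it runs on its own; the result names are illustrative, and the row filter at the end is an extra not shown in the text.

```r
d <- c(1, 2, 3, 4)
e <- c("red", "white", "red", NA)
f <- c(TRUE, TRUE, TRUE, FALSE)
mydata <- data.frame(ID = d, Color = e, Passed = f)

cols_by_position <- mydata[2:3]              # data frame with Color and Passed
cols_by_name     <- mydata[c("ID", "Color")] # data frame with ID and Color
id_values        <- mydata$ID                # a single column as a vector
passed_rows      <- mydata[mydata$Passed, ]  # rows where Passed is TRUE

str(mydata)  # compact summary of columns and their types
```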

    Lists

    A list is an ordered collection of objects (components). It allows you to gather a variety of (possibly unrelated) objects under one name.

    # example of a list with 4 components:
    # a string, a numeric vector, a matrix, and a scalar
    w <- list(name="Fred", mynumbers=a, mymatrix=y, age=5.3)

    # example of combining the components of two existing lists
    # (list1 and list2) into a single list
    v <- c(list1,list2)

    Identify elements of a list using the [[]] convention.

    w[[2]] # 2nd component of the list w
    w[["mynumbers"]] # component named mynumbers in list w
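
    The difference between [[ ]] and [ ], and between c() and list() when joining lists, is easy to miss. The sketch below uses made-up lists (w, list1, list2 and their contents are assumptions for the illustration).

```r
w <- list(name = "Fred", mynumbers = c(1, 2, 5.3), age = 5.3)

by_position <- w[[2]]            # the numeric vector itself
by_name     <- w[["mynumbers"]]  # same component, selected by name
by_dollar   <- w$mynumbers       # shorthand for the line above
sub_list    <- w[2]              # single brackets return a list of length 1

list1 <- list(a = 1, b = "x")
list2 <- list(c = TRUE)
flat   <- c(list1, list2)     # one list with 3 components
nested <- list(list1, list2)  # a list whose 2 components are lists
```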

    Factors

    Tell R that a variable is nominal by making it a factor. The factor stores the nominal values as a vector of integers in the range [ 1... k ] (where k is the number of unique values in the nominal variable), and an internal vector of character strings (the original values) mapped to these integers.

    # variable gender with 20 "male" entries and 
    # 30 "female" entries 
    gender <- c(rep("male",20), rep("female", 30)) 
    gender <- factor(gender) 
    # stores gender as 20 2s and 30 1s and associates
    # 1=female, 2=male internally (alphabetically)
    # R now treats gender as a nominal variable 
    summary(gender)

    An ordered factor is used to represent an ordinal variable.

    # variable rating coded as "large", "medium", "small"
    rating <- ordered(rating)
    # recodes rating to 1,2,3 and associates
    # 1=large, 2=medium, 3=small internally
    # R now treats rating as ordinal

    R will treat factors as nominal variables and ordered factors as ordinal variables in statistical procedures and graphical analyses. You can use options in the factor( ) and ordered( ) functions to control the mapping of integers to strings (overriding the alphabetical ordering). You can also use factors to create value labels. For more on factors see the UCLA page.
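
    As a sketch of how the integer mapping works and how the levels argument overrides the alphabetical default: the rating values below are invented for the illustration, and rating_alpha and rating_sized are hypothetical names.

```r
gender <- factor(c(rep("male", 20), rep("female", 30)))
gender_levels <- levels(gender)      # "female" "male" (alphabetical)
gender_codes  <- as.integer(gender)  # 2 for each male, 1 for each female

# Hypothetical ratings, for illustration only
rating <- c("small", "large", "medium", "small")

# Default ordering is alphabetical: large < medium < small
rating_alpha <- ordered(rating)

# Explicit levels give the intended order: small < medium < large
rating_sized <- factor(rating,
                       levels = c("small", "medium", "large"),
                       ordered = TRUE)

small_lt_large <- rating_sized[1] < rating_sized[2]  # compares small with large
```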

    Useful Functions

    length(object) # number of elements or components
    str(object)    # structure of an object 
    class(object)  # class or type of an object
    names(object)  # names

    c(object,object,...)       # combine objects into a vector
    cbind(object, object, ...) # combine objects as columns
    rbind(object, object, ...) # combine objects as rows 

    object     # prints the object

    ls()       # list current objects
    rm(object) # delete an object

    newobject <- edit(object) # edit copy and save as newobject 
    fix(object)               # edit in place
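
    A few of these functions in use, as a minimal sketch; x1, x2, scratch, and the result names are invented for the example. The interactive editors edit( ) and fix( ) are left out because they open a window.

```r
x1 <- c(1, 2, 3)
x2 <- c(4, 5, 6)

combined <- c(x1, x2)     # one vector of length 6
as_cols  <- cbind(x1, x2) # 3 x 2 matrix, columns named x1 and x2
as_rows  <- rbind(x1, x2) # 2 x 3 matrix, rows named x1 and x2

n <- length(combined)     # number of elements
str(as_cols)              # structure of the matrix
class(x1)                 # "numeric"

scratch <- 1
rm(scratch)                           # delete the object
still_there <- "scratch" %in% ls()    # FALSE once it has been removed
```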