R is a free software environment for statistical computing and graphics. The following covers some basics of R data types.

## Comparison of vector, list, matrix and data frame

| | vector | list | matrix | data frame |
|---|---|---|---|---|
| creation | `c` | `list` | `matrix` | `data.frame` |
| same type | Y | N | Y | N |
| class | class of its elements | list | matrix (also array in R 4.0 and later) | data.frame |
| name | `names` | `names` | `rownames`, `colnames` | `names` (for column names) |
| dimnames | – | – | `dimnames(m) <- list(c("row1", "row2"), c("col1", "col2"))` | same as matrix |
| arithmetic | member-wise (recycling rule) | – | `rbind` or `cbind` (to combine) | – |
| index | `[]` with a numeric index or a name | `[]`: list slice; `[[]]`: member (vector) | `[row,]`, `[row,column]`, `[,col]`; `[n]` and `[[n]]` retrieve a member of the deconstructed matrix | `[[]]`: column vector; `[]`: column slice |

## Basic Data Types

### Character

```r
x <- as.character(3.14)
fname <- "Joe"; lname <- "Smith"
paste(fname, lname)
```

```
[1] "Joe Smith"
```

To extract a substring, we apply the substr function.

```r
substr("Mary has a little lamb.", start = 3, stop = 12)
```

Replace:

```r
sub("little", "big", "Mary has a little lamb.")
```

```
[1] "Mary has a big lamb."
```
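As a small aside, `sub` replaces only the first match, while its sibling `gsub` replaces every match. The sketch below illustrates the difference; the string `s` is made up for illustration and is not from the original example.

```r
# Hypothetical example string with two occurrences of "little"
s <- "A little lamb and a little goat."

first_only  <- sub("little", "big", s)   # replaces the first match only
all_matches <- gsub("little", "big", s)  # replaces every match

print(first_only)
print(all_matches)

# substr counts characters from 1 and includes both end points,
# so start = 3, stop = 12 returns 10 characters
piece <- substr("Mary has a little lamb.", start = 3, stop = 12)
print(piece)
```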

### Complex

A complex value in R is defined via the pure imaginary value i.

```r
x <- 1 + 2i
class(x)
```

```
[1] "complex"
```
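A minimal sketch of a few base R helpers for complex values follows. The variable names `z`, `r` and `w` are chosen here for illustration only.

```r
z <- 1 + 2i

re_part <- Re(z)    # real part
im_part <- Im(z)    # imaginary part
modulus <- Mod(z)   # sqrt(1^2 + 2^2)
conj_z  <- Conj(z)  # complex conjugate, 1 - 2i

# The square root of a negative *real* number is NaN (R also warns) ...
r <- suppressWarnings(sqrt(-1))

# ... but it is defined once the value is coerced to complex
w <- sqrt(as.complex(-1))

print(c(re_part, im_part, modulus))
print(w)
```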

### Vector

#### Vectors can be combined via the function `c`. Elements will be coerced into the *same type* if they are not already.

```r
a <- c(1, 2, 6)
b <- c("a", "b", "c", "d")
c <- c(a, b)
c
```

```
[1] "1" "2" "6" "a" "b" "c" "d"
```
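The coercion follows a fixed order: logical values become integers, integers become doubles, and anything combined with a string becomes a string. The sketch below checks this with `class`; the variable names are illustrative.

```r
cls_int <- class(c(TRUE, 2L))   # logical + integer  -> "integer"
cls_num <- class(c(2L, 3.5))    # integer + double   -> "numeric"
cls_chr <- class(c(3.5, "a"))   # double + character -> "character"

# Mixing all three ends up as character
mixed <- c(TRUE, 2, "x")

print(c(cls_int, cls_num, cls_chr))
print(mixed)
```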

#### Arithmetic operations of vectors are performed member-by-member, i.e., memberwise.

**Recycling Rule**

If two vectors are of unequal length, the shorter one will be recycled in order to match the longer vector.

```r
u <- c(10, 20, 30)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
u + v
```

```
[1] 11 22 33 14 25 36 17 28 39
```
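In the example above the longer length (9) is a multiple of the shorter one (3). When it is not, R still recycles but emits a warning. A short sketch, with the vector `w` introduced here for illustration:

```r
u <- c(10, 20, 30)
w <- c(1, 2, 3, 4)   # length 4 is not a multiple of length 3

# Record whether the addition raises a warning
warned <- tryCatch({ u + w; FALSE }, warning = function(cond) TRUE)

# The result is still computed: u is recycled as 10 20 30 10
res <- suppressWarnings(u + w)

# A length-one vector is recycled silently across every member
scaled <- u * 2

print(warned)
print(res)
print(scaled)
```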

#### Vector Index

- We retrieve values in a vector by declaring an index inside a *single* square bracket “[]” operator.
- Negative index: it strips the member whose position has the same absolute value as the negative index.
- An out-of-range index is reported as `NA`.
- A numeric index vector can be used, such as `a[c(2,3,3)]`.
- A logical index vector can be used, such as `a[c(TRUE,FALSE)]`. NB: it should have the *same length* as the original.
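The indexing rules above can be sketched as follows. The vectors `a` and `named` are made-up examples.

```r
a <- c("aa", "bb", "cc", "dd", "ee")

third     <- a[3]            # single member by position
not_third <- a[-3]           # negative index drops the third member
beyond    <- a[10]           # out-of-range index gives NA
repeated  <- a[c(2, 3, 3)]   # numeric index vector, repeats allowed
picked    <- a[c(FALSE, TRUE, FALSE, TRUE, FALSE)]  # logical index vector

# A logical index shorter than the vector is recycled,
# which is why matching the length is the safer habit
recycled <- a[c(TRUE, FALSE)]

# Members can also be retrieved by name
named <- c(first = "Mary", last = "Smith")
by_name <- named["first"]

print(not_third)
print(recycled)
```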

### Matrix

A matrix is constructed with the *matrix* function. It can be combined by rows with *rbind* or by columns with *cbind*, transposed with *t*, and deconstructed into a vector with *c*.
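A minimal sketch of these operations, using a small made-up matrix `m` (the row and column names are illustrative):

```r
# matrix() fills column by column unless byrow = TRUE:
#   1 3 5
#   2 4 6
m <- matrix(1:6, nrow = 2, ncol = 3)
dimnames(m) <- list(c("row1", "row2"), c("col1", "col2", "col3"))

cell   <- m["row2", "col3"]          # single element by name
row_2  <- m[2, ]                     # whole row (a named vector)
tm     <- t(m)                       # transpose: 3 rows, 2 columns
taller <- rbind(m, row3 = 7:9)       # add a row
wider  <- cbind(m, col4 = c(7, 8))   # add a column
flat   <- c(m)                       # deconstruct, column by column

print(m)
print(flat)
```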

### List

A list is a generic vector containing other objects.

We retrieve a list slice with the single square bracket “[]” operator, and a member of the list with double brackets “[[]]”.
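The difference between a slice and a member can be sketched like this. The list `x` and its member names are invented for illustration.

```r
x <- list(n = c(2, 3, 5), s = c("aa", "bb"), flag = TRUE)

slice  <- x["n"]     # single brackets: still a list, of length one
member <- x[["n"]]   # double brackets: the numeric vector itself
same   <- x$n        # $ is shorthand for [[ ]] with a name

# A member can be modified in place through [[ ]]
x[["n"]][1] <- 10

print(class(slice))
print(class(member))
print(x$n)
```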

### Data Frame

A data frame is used for storing data tables. It is a list of vectors of *equal length*.
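A minimal sketch of building a data frame from vectors of equal length. The column names and values are made up for illustration.

```r
df <- data.frame(
  name   = c("Ann", "Bob", "Cy"),
  age    = c(31, 45, 27),
  member = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE   # keep strings as character in older R versions
)

n_rows   <- nrow(df)
n_cols   <- ncol(df)
col_type <- sapply(df, class)   # each column keeps its own type

# Columns whose lengths cannot be reconciled are rejected
outcome <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(cond) "error"
)

print(df)
print(col_type)
```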

### Data Frame Column Vector

We reference a data frame column with the double square bracket “[[]]” operator.

```r
mtcars[[9]]      # by position
mtcars[["am"]]   # or by its name
mtcars$am        # or with the $ operator
mtcars[, "am"]   # or with single brackets and a comma
```

### Data Frame Column Slice

We retrieve a data frame column slice with the single square bracket “[]” operator.

```r
mtcars[1]
mtcars["mpg"]            # or by its column name
mtcars[c("mpg", "hp")]   # or with an index vector
summary(mtcars)
```

| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|
| Min. :10.40 | Min. :4.000 | Min. : 71.1 | Min. : 52.0 | Min. :2.760 | Min. :1.513 | Min. :14.50 | Min. :0.0000 | Min. :0.0000 | Min. :3.000 | Min. :1.000 |
| 1st Qu.:15.43 | 1st Qu.:4.000 | 1st Qu.:120.8 | 1st Qu.: 96.5 | 1st Qu.:3.080 | 1st Qu.:2.581 | 1st Qu.:16.89 | 1st Qu.:0.0000 | 1st Qu.:0.0000 | 1st Qu.:3.000 | 1st Qu.:2.000 |
| Median :19.20 | Median :6.000 | Median :196.3 | Median :123.0 | Median :3.695 | Median :3.325 | Median :17.71 | Median :0.0000 | Median :0.0000 | Median :4.000 | Median :2.000 |
| Mean :20.09 | Mean :6.188 | Mean :230.7 | Mean :146.7 | Mean :3.597 | Mean :3.217 | Mean :17.85 | Mean :0.4375 | Mean :0.4062 | Mean :3.688 | Mean :2.812 |
| 3rd Qu.:22.80 | 3rd Qu.:8.000 | 3rd Qu.:326.0 | 3rd Qu.:180.0 | 3rd Qu.:3.920 | 3rd Qu.:3.610 | 3rd Qu.:18.90 | 3rd Qu.:1.0000 | 3rd Qu.:1.0000 | 3rd Qu.:4.000 | 3rd Qu.:4.000 |
| Max. :33.90 | Max. :8.000 | Max. :472.0 | Max. :335.0 | Max. :4.930 | Max. :5.424 | Max. :22.90 | Max. :1.0000 | Max. :1.0000 | Max. :5.000 | Max. :8.000 |
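To make the difference between a column vector and a column slice concrete, the sketch below compares the two on the built-in `mtcars` data set. The variable names are illustrative.

```r
col_vec   <- mtcars[["mpg"]]   # double brackets: a plain numeric vector
col_slice <- mtcars["mpg"]     # single brackets: a one-column data frame

# With the comma form, drop = FALSE keeps the data frame structure
kept <- mtcars[, "mpg", drop = FALSE]

print(class(col_vec))
print(class(col_slice))
print(dim(col_slice))
```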

### Data Frame Row Slice

We retrieve rows from a data frame with the single square bracket operator, just as we did with columns. However, in addition to an index vector of row positions, we append an extra *comma character*. This is important, as the extra comma signals a wildcard match for the second coordinate, the column positions.

```r
mtcars[c(3, 24), ]                        # with numeric indexing
mtcars["Camaro Z28", ]                    # or with a row name
mtcars[c("Datsun 710", "Camaro Z28"), ]   # or with a vector of row names
L <- mtcars$am == 0
mtcars[L, ]                               # or with logical indexing
```
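The role of the comma can be checked with a short sketch on `mtcars`. The variable names are chosen here for illustration.

```r
# With the comma: rows 3 and 24, all columns
rows <- mtcars[c(3, 24), ]
row_names <- rownames(rows)

# Without the comma the index refers to columns instead
third_col <- names(mtcars[3])

# Logical row indexing: cars with automatic transmission (am == 0)
auto <- mtcars[mtcars$am == 0, ]

print(row_names)
print(third_col)
print(nrow(auto))
```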
