Understanding R from PHP perspective: Arrays in R

Published Sep 30, 2017. Last updated Oct 02, 2017.

R can be quite intimidating when you first start learning it from a PHP background. It has many basic concepts that you don't see in PHP, most notably the vector.

What the hell is a vector? I just didn't get it when I first started. You see this word everywhere in R tutorials.

A vector, simply put, in PHP terms, is a value. It is the most basic R data object. There are six types of (atomic) vectors: logical, integer, double, complex, character and raw. Even when you write just one value in R, it becomes a vector of length 1 and belongs to one of these vector types.

1. Single Element Vector

# Atomic vector of type character:
str("R")

# Atomic vector of type logical:
str(TRUE)

When we execute the above code, it produces the following result:

chr "R"
logi TRUE
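
To see all six atomic types side by side, here is a minimal sketch. The variable names (values, types, lengths_seen) are made up for illustration; it only relies on base R's typeof() and length() to show that every single value is already a vector of length 1.

```r
# One literal of each atomic type; a list keeps each value's own type.
values <- list(TRUE, 1L, 1.5, 2+3i, "R", as.raw(255))

# typeof() reports the atomic type of each value.
types <- vapply(values, typeof, character(1))

# length() shows that each "single value" is a vector of length 1.
lengths_seen <- vapply(values, length, integer(1))

print(types)
print(lengths_seen)
```

Note that a plain number such as 1.5 (or even 1) is a double; you need the L suffix, as in 1L, to get an integer.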

PHP:

var_dump("PHP");
var_dump(TRUE);

Result:

string(3) "PHP"
bool(true)

You notice that 'characters' in R are 'strings' in PHP.

2. Multiple Elements Vector

In PHP, an array is multiple values in one single variable:

$cars = array("Volvo", "BMW", "Toyota");

We use the array() function to create an indexed array. In R, it is a bit more complicated. The most basic form of an array in R is the vector. R has an array() function too, but it is different from PHP's: R's array() creates an n-dimensional array (think of a three-dimensional array in PHP). A one-dimensional array is called a vector in R, and a two-dimensional array is called a matrix. We use the c() function to create a one-dimensional array (we will call it a vector from now on).

Quoted from w3schools:

... the c function used for concatenating different values and vectors for creating longer vectors.
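
As a rough illustration of that concatenating behaviour, the sketch below (with made-up variable names a and b) combines single values and existing vectors with c(). The result is always one flat vector, not a nested structure as it would be with nested arrays in PHP.

```r
a <- c(11, 12, 13)

# c() accepts single values and whole vectors, and flattens everything
# into one longer vector.
b <- c(a, 14, c(15, 16))

print(b)
print(length(b))

# A plain vector has no dim attribute, which is why it counts as
# "one-dimensional".
print(is.null(dim(b)))
```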

Vectors/ One-dimensional arrays

R:

> x <- c(11, 12, 13)

# Extract the 3rd value from the vector 'x':
> x[3]
[1] 13

# Replace the 3rd value in the vector 'x':
> x[3] <- 14
> x[3]
[1] 14
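
Indexing in R can do a little more than picking one position. The following sketch uses a separate, hypothetical vector v to show a few base R indexing forms; keep in mind that R counts from 1, while PHP counts from 0.

```r
v <- c(11, 12, 13)

# Several positions at once: pass a vector of indices.
first_and_third <- v[c(1, 3)]

# A negative index drops that position instead of selecting it.
without_first <- v[-1]

# Index 0 selects nothing, because R positions start at 1.
nothing <- v[0]

print(first_and_third)
print(without_first)
print(length(nothing))
```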

PHP:

$x = [11, 12, 13];

# Extract the 3rd value from the array 'x' (PHP indexes start at 0):
echo $x[2];
# 13

# Replace the 3rd value in the array 'x':
$x[2] = 14;
echo $x[2];
# 14

Factors (One-dimensional arrays)

Quoted from R-bloggers:

Factors are categorical variables that are super useful in summary statistics, plots, and regressions. They basically act like dummy variables that R codes for you.

R:

> x <- c("male", "male", "male", "male", "male", "female", "female", "female", "female", "others", "others", "others", "others")
> factor(x)
[1] male male male male male female female female female others others others others
Levels: female male others

# Or we can see all the levels (categories) this way:
> levels(factor(x))
[1] "female" "male"   "others"

# A factor is useful when we want a count per gender:
> summary(factor(x))
female   male others
     4      5      4
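
To make the "dummy variables that R codes for you" idea a bit more concrete, here is a small sketch with a shorter, made-up vector named gender. It shows that a factor stores integer codes that point into its sorted levels, and that table() gives the same kind of counts as summary().

```r
gender <- factor(c("male", "female", "others", "male"))

# Levels are the distinct categories, sorted alphabetically by default.
print(levels(gender))

# Internally each element is an integer code into the levels.
codes <- as.integer(gender)
print(codes)

# table() counts how often each level occurs.
counts <- table(gender)
print(counts)
```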

PHP:

$x = array("male", "male", "male", "male", "male", "female", "female", "female", "female", "others", "others", "others", "others");
print_r(array_count_values($x));
# Array ( [male] => 5 [female] => 4 [others] => 4 )

Matrices/ Two-dimensional arrays

Matrices (or two-dimensional arrays) are built from vectors (one-dimensional arrays).

Quoted from r-tutor:

A matrix is a collection of data elements arranged in a two-dimensional rectangular layout.

You can create a matrix in two ways:

  1. Use the command matrix(vector, nrow = 3, ncol = 2).
  2. Use cbind() or rbind() - in plain English: column-bind, row-bind
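
Both approaches can be sketched as follows. The names values, m1, m2 and m3 are only for illustration. One thing worth knowing is that matrix() fills column by column unless you pass byrow = TRUE.

```r
values <- c(11, 12, 13, 55, 33, 12)

# Way 1: matrix() fills the cells column by column by default.
m1 <- matrix(values, nrow = 3, ncol = 2)

# byrow = TRUE fills row by row instead.
m2 <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

# Way 2: cbind() glues vectors together as columns.
m3 <- cbind(c(11, 12, 13), c(55, 33, 12))

print(m1)
print(m2)
print(m3)
```

Here m1 and m3 hold the same values in the same layout (3 rows, 2 columns), while m2 is the 2-by-3 version.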

The R example below uses row-bind, rbind().

R:

> x <- c(11, 12, 13)
> y <- c(55, 33, 12)
> foo <- rbind(x, y)
> foo
  [,1] [,2] [,3]
x   11   12   13
y   55   33   12

# Extract the 1st row of foo:
> foo[1,]
[1] 11 12 13

# Extract the 1st column of foo:
> foo[,1]
 x  y
11 55
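
A few more ways to read from the same matrix, as a hedged sketch: it rebuilds foo so that it runs on its own, and the name first_col is an assumption for illustration. Because rbind(x, y) was called with named variables, the rows can also be selected by name.

```r
x <- c(11, 12, 13)
y <- c(55, 33, 12)
foo <- rbind(x, y)

# dim() returns the number of rows and columns.
print(dim(foo))

# A single cell: row 2, column 3.
print(foo[2, 3])

# Rows created by rbind(x, y) are named "x" and "y".
print(foo["y", 2])

# drop = FALSE keeps the result as a 2 x 1 matrix instead of a vector.
first_col <- foo[, 1, drop = FALSE]
print(dim(first_col))
```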

PHP:

$foo = [
    [11, 12, 13],
    [55, 33, 12]
];
print_r($foo);
# Array (
#    [0] => Array ( [0] => 11 [1] => 12 [2] => 13 )
#    [1] => Array ( [0] => 55 [1] => 33 [2] => 12 )
# )

# Extract the 1st row of foo:
print_r($foo[0]);
# Array ( [0] => 11 [1] => 12 [2] => 13 )

# Extract the 1st column of foo:
$firstColumn = array_column($foo, 0);
print_r($firstColumn);
# Array ( [0] => 11 [1] => 55 )

Arrays/ n-dimensional arrays

Arrays, specifically in R, are similar to matrices but can have more than two dimensions. In other words, an array is a set of stacked matrices of identical dimensions. In PHP, by contrast, "array" is a generic term that covers both one-dimensional and multi-dimensional arrays.

Quoted from w3schools:

The vector variables that you have implemented at so far on the last tutorial are all single – dimensional / one dimensional objects. This is because they have only length but no other dimensions. Arrays can contain multi dimensional rectangular shaped data storage structure. “Rectangular” in the sense, each row is having the same length and similarly for each column and other dimensions. Matrices are a special type of two – dimensional arrays.

We can use the array() function to create arrays. It takes a vector as input and uses the values in the dim parameter to shape it into an array. Its format is:

array(data_vector, dim_vector)

Where dim_vector:

c(number of rows, number of columns, number of matrices)
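
As a minimal sketch of this format, the example below builds a 2 x 3 x 2 array from exactly 12 values, so every cell gets its own number. The variable name a is made up for illustration.

```r
# 2 rows, 3 columns, 2 stacked matrices = 12 cells.
a <- array(1:12, dim = c(2, 3, 2))

print(dim(a))

# Cells are filled column by column, one matrix after another,
# so the second matrix holds the values 7 to 12.
print(a[, , 2])

# Row 1, column 2 of the 2nd matrix.
print(a[1, 2, 2])
```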

Note: dim stands for dimension. This is another thing about R: the short forms are difficult to decipher. In PHP, JavaScript and other popular languages, we are always told to name our variables, functions and classes as descriptively as possible for maintainability. R simply does not care.

R:

# 1:9 supplies only 9 values for 2 x 3 x 2 = 12 cells, so R recycles them from the start:
> x <- array(1:9, c(2,3,2))
> x
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9    2
[2,]    8    1    3

# Extract the 1st matrix of x:
> x[, , 1]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

# Extract the 1st row of the 1st matrix of x:
> x[, , 1][1,]
[1] 1 3 5

Looking at x[, , 1] or x[, , 1][1,] is quite a pain due to the lack of readability. Personally, I rarely use arrays in R, to avoid confusion.
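
One way to soften the readability problem, sketched here under my own assumptions, is to give the dimensions names through the dimnames argument of array(). The labels r1, c1, m1 and so on are hypothetical. The sketch also shows that the chained x[, , 1][1, ] can be written as a single x[1, , 1].

```r
x <- array(1:12, dim = c(2, 3, 2),
           dimnames = list(c("r1", "r2"),
                           c("c1", "c2", "c3"),
                           c("m1", "m2")))

# Chained form: take the 1st matrix, then its 1st row.
row_a <- x[, , 1][1, ]

# Single-step form: row 1, all columns, matrix 1.
row_b <- x[1, , 1]

# Same selection by name.
row_c <- x["r1", , "m1"]

print(row_c)
```

All three variables hold the same row; the named form at least says which dimension each index refers to.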

Quoted from Advanced R by Hadley Wickham:

Matrices are used commonly as part of the mathematical machinery of statistics. Arrays are much rarer, but worth being aware of.

PHP:

# Assuming that you have created a three-dimensional array...

# Extract the 1st row of 1st array of x:
print_r($x[0][0]);
# Array ( [0] => 1 [1] => 3 [2] => 5 )

Even in PHP alone, reading three-dimensional arrays can be intimidating. Imagine this: $x[0][0][0][0];... Generally I don't use deep indices to select an element. I use associative arrays instead for complex data.

Quoted from w3schools:

PHP understands multidimensional arrays that are two, three, four, five, or more levels deep. However, arrays more than three levels deep are hard to manage for most people.

Conclusion

I hope this article comparing R and PHP is helpful and gives you an idea of what to expect if you ever run into the same learning curve. Let me know what you think and what you struggled with when first learning R. If you have any suggestions or spot any errors, please leave a comment below.

References

  • https://cran.r-project.org/doc/manuals/R-intro.html
  • http://www.r-tutor.com/r-introduction/vector
  • http://statmethods.net/input/datatypes.html
  • https://www.tutorialspoint.com/r/r_vectors.htm
  • https://r.iq.harvard.edu/docs/zelig/3.4-8/Ways_to_create.html
  • https://www.r-bloggers.com/data-types-part-3-factors/
  • https://www.rdocumentation.org/packages/base/versions/3.4.1/topics/array
  • http://adv-r.had.co.nz/Data-structures.html
  • https://www.programiz.com/r-programming/vector
Discover and read more posts from LAU TIAM KOK