Matrix calculations

construction

To build up a matrix, there are many possibilities. The most simple one is to tell R about explicitly about every element. Of course, you have to build a matrix first. This is done by the matrix command. It takes some data and builds a matrix. If there are not enough elements, they are repeated to fill up the whole matrix. The c command is just there to build a vector. Next you can use the already mentioned fix or edit command to enter more values. To replace a specific element, use the setter functionality: matrix[row,col] <- value or matrix[element number] <- value - here you can use boolean values, too. Some examples:
> m <- matrix(c(1,2),4,3)
> m
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2
[3,]    1    1    1
[4,]    2    2    2

> m[m>=2] <- 10
> m
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]   10   10   10
[3,]    1    1    1
[4,]   10   10   10

> m[3,3] <- 0
> m
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]   10   10   10
[3,]    1    1    0
[4,]   10   10   10

> m[,3] <- 99
> m
     [,1] [,2] [,3]
[1,]    1    1   99
[2,]   10   10   99
[3,]    1    1   99
[4,]   10   10   99
Upper and lower triangular matrices are created with the lower.tri and upper.tri commands. They return TRUE/FALSE statements which are used to address the correct entries.
> matrix(0,4,4) -> m

> m[lower.tri(m)] <- 1
> m
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]    1    0    0    0
[3,]    1    1    0    0
[4,]    1    1    1    0

> m[upper.tri(m, diag=TRUE)] <- 9
> m
     [,1] [,2] [,3] [,4]
[1,]    9    9    9    9
[2,]    1    9    9    9
[3,]    1    1    9    9
[4,]    1    1    1    9
Another elegant way is to glue vectors or matrices together. The cbind and rbind command does it for rows and columns:
> cbind(m,m)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    1    1    1
[2,]   10   10   10   10   10   10
[3,]    1    1    0    1    1    0
[4,]   10   10   10   10   10   10

> rbind(m,m)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]   10   10   10
[3,]    1    1    0
[4,]   10   10   10
[5,]    1    1    1
[6,]   10   10   10
[7,]    1    1    0
[8,]   10   10   10

> rbind(1:9,1:3,7)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    2    3    4    5    6    7    8    9
[2,]    1    2    3    1    2    3    1    2    3
[3,]    7    7    7    7    7    7    7    7    7
Using negative indices, you tell R to miss out those elements (or rows, columns).
> matrix(sample(4:15),4) -> m
> m
     [,1] [,2] [,3]
[1,]    8   10    6
[2,]    5    9   12
[3,]   13    7   11
[4,]   15    4   14
> m[-2:-3,]
     [,1] [,2] [,3]
[1,]    8   10    6
[2,]   15    4   14
The rep command is useful for repeating. It has some powerful parameters, which could help creating matrices.
> rep(1:5,4)
 [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
 
> rep(1:5,each=2, length.out=15)
 [1] 1 1 2 2 3 3 4 4 5 5 1 1 2 2 3 
 
> matrix(rep(1:5,each=2, length.out=16),4,byrow=TRUE)
     [,1] [,2] [,3] [,4]
[1,]    1    1    2    2
[2,]    3    3    4    4
[3,]    5    5    1    1
[4,]    2    2    3    3
More tricky is the dim command. It returns the dimensions of the matrix, but the other way round it can set them, too. Here an example with a new m:
> m <- matrix(1:18,2)
> m
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    3    5    7    9   11   13   15   17
[2,]    2    4    6    8   10   12   14   16   18
> dim(m)
[1] 2 9
> dim(m) <- c(9,2)
> m
      [,1] [,2]
 [1,]    1   10
 [2,]    2   11
 [3,]    3   12
 [4,]    4   13
 [5,]    5   14
 [6,]    6   15
 [7,]    7   16
 [8,]    8   17
 [9,]    9   18
Diagonal matrix is build with the diag command. If you want some offset, you have to be more creative. row and col return the positions, they can be used in boolean expressions - rest should be clear.
> diag(8,4)
     [,1] [,2] [,3] [,4]
[1,]    8    0    0    0
[2,]    0    8    0    0
[3,]    0    0    8    0
[4,]    0    0    0    8

> diag(seq(0,1,length.out=6)) -> m
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0  0.0  0.0  0.0  0.0    0
[2,]    0  0.2  0.0  0.0  0.0    0
[3,]    0  0.0  0.4  0.0  0.0    0
[4,]    0  0.0  0.0  0.6  0.0    0
[5,]    0  0.0  0.0  0.0  0.8    0
[6,]    0  0.0  0.0  0.0  0.0    1

# diag with offset
> m[row(m) == col(m)+2] <- c(9,4,-11,5)
> m
     [,1] [,2]  [,3] [,4] [,5] [,6]
[1,]    0  0.0   0.0  0.0  0.0    0
[2,]    0  0.2   0.0  0.0  0.0    0
[3,]    9  0.0   0.4  0.0  0.0    0
[4,]    0  4.0   0.0  0.6  0.0    0
[5,]    0  0.0 -11.0  0.0  0.8    0
[6,]    0  0.0   0.0  5.0  0.0    1
If you want to transform a matrix to a vector, use as.vector(matrix).
Flipping upside down or from left to right is only possible with indexing. If you need this more often, a custom function could be written. You could use this to idea to flip only parts of a matrix or to do a rotation.
> m <- matrix(1:12,4)
> m
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

> m[nrow(m):1,]
     [,1] [,2] [,3]
[1,]    4    8   12
[2,]    3    7   11
[3,]    2    6   10
[4,]    1    5    9

> m[nrow(m):1,ncol(m):1]
     [,1] [,2] [,3]
[1,]   12    8    4
[2,]   11    7    3
[3,]   10    6    2
[4,]    9    5    1

> t(m[nrow(m):1,])   # rotation by 90 deg
     [,1] [,2] [,3] [,4]
[1,]    4    3    2    1
[2,]    8    7    6    5
[3,]   12   11   10    9
kronecker is a very flexible function for calculating the generalized kronecker product. It could be used to simulate Matlab's repmat function, but it's abilities are beyond that.
> m <- matrix(runif(6),3)
> m
          [,1]      [,2]
[1,] 0.2549469 0.6108653
[2,] 0.2650235 0.3189252
[3,] 0.4786128 0.8176978

> kronecker(diag(1,3),m)
           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
 [1,] 0.2549469 0.6108653 0.0000000 0.0000000 0.0000000 0.0000000
 [2,] 0.2650235 0.3189252 0.0000000 0.0000000 0.0000000 0.0000000
 [3,] 0.4786128 0.8176978 0.0000000 0.0000000 0.0000000 0.0000000
 [4,] 0.0000000 0.0000000 0.2549469 0.6108653 0.0000000 0.0000000
 [5,] 0.0000000 0.0000000 0.2650235 0.3189252 0.0000000 0.0000000
 [6,] 0.0000000 0.0000000 0.4786128 0.8176978 0.0000000 0.0000000
 [7,] 0.0000000 0.0000000 0.0000000 0.0000000 0.2549469 0.6108653
 [8,] 0.0000000 0.0000000 0.0000000 0.0000000 0.2650235 0.3189252
 [9,] 0.0000000 0.0000000 0.0000000 0.0000000 0.4786128 0.8176978

> kronecker(matrix(1:4,2),m,FUN="+")
         [,1]     [,2]     [,3]     [,4]
[1,] 1.254947 1.610865 3.254947 3.610865
[2,] 1.265023 1.318925 3.265023 3.318925
[3,] 1.478613 1.817698 3.478613 3.817698
[4,] 2.254947 2.610865 4.254947 4.610865
[5,] 2.265023 2.318925 4.265023 4.318925
[6,] 2.478613 2.817698 4.478613 4.817698

basic functions

t, min, max & co.

t transposes the matrix. If the values are complex, you have to use Conj, too:
> m <- cbind(c(1i,4),c(-.3i+1,2))

> m
     [,1]   [,2]
[1,] 0+1i 1-0.3i
[2,] 4+0i 2+0.0i

> Conj(t(m))
       [,1] [,2]
[1,] 0-1.0i 4+0i
[2,] 1+0.3i 2+0i
max/min tells you straightforward the maximum and minimum element. If you want to know which element index it is, just take which.max.
pairwise maximum: pmax and pmin
> a <- runif(10)
> a
 [1] 0.50733122 0.60042481 0.68056024 0.86560839 0.44542177 0.74983840
 [7] 0.25642679 0.00837266 0.42336540 0.21190927
 
> b <- runif(5)
> b
[1] 0.9463352 0.1807076 0.4240673 0.3246081 0.8375254

> pmax(a,b)
 [1] 0.9463352 0.6004248 0.6805602 0.8656084 0.8375254 0.9463352 0.2564268
 [8] 0.4240673 0.4233654 0.8375254
Values of b are repeated silently.
If you want to know more than just min and max, rank returns the ranking of all elements. Could be used to order them.
> m <- matrix(runif(10),5)
> m
          [,1]      [,2]
[1,] 0.1827412 0.6714100
[2,] 0.7780485 0.9693234
[3,] 0.7844098 0.6602473
[4,] 0.4153272 0.5644285
[5,] 0.1804519 0.4953415

> rank(m)
 [1]  2  8  9  3  1  7 10  6  5  4
 
> m[rank(m)]
 [1] 0.7780485 0.6602473 0.5644285 0.7844098 0.1827412 0.9693234 0.4953415
 [8] 0.6714100 0.1804519 0.4153272
To get the sum of all values in a matrix, just use sum, which could be easily combined with subsets. If you just want to know row or column sums, you have to use the more general apply function to apply the function to the first or second index (if more dimensions, there is more possible, also combinations of some indices together). If you want something else than a sum, just replace with another vector function or program one yourself. Here a three dimensional matrix.
> m <- 1:24
> dim(m) <- 4:2
> m
, , 1

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

, , 2

     [,1] [,2] [,3]
[1,]   13   17   21
[2,]   14   18   22
[3,]   15   19   23
[4,]   16   20   24

> apply(m,1:2,mean)
     [,1] [,2] [,3]
[1,]    7   11   15
[2,]    8   12   16
[3,]    9   13   17
[4,]   10   14   18

> apply(m,3,sum)
[1]  78 222
 
Sorting is done with sort or when just meant for a row or column, use the apply function:

cumulative

cumsum, cumprod, cummax and cummin return cumulative values.
> example(cumsum)

cumsum> cumsum(1:10)
 [1]  1  3  6 10 15 21 28 36 45 55

cumsum> cumprod(1:10)
 [1]       1       2       6      24     120     720    5040   40320  362880
[10] 3628800

cumsum> cummin(c(3:1, 2:0, 4:2))
[1] 3 2 1 1 1 0 0 0 0

cumsum> cummax(c(3:1, 2:0, 4:2))
[1] 3 3 3 3 3 3 4 4 4
margin.table computes sums in contingency tables.
> array(1:32,dim=c(4,4,2)) -> t

> t
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

, , 2

     [,1] [,2] [,3] [,4]
[1,]   17   21   25   29
[2,]   18   22   26   30
[3,]   19   23   27   31
[4,]   20   24   28   32

> margin.table(t,c(1,2))
     [,1] [,2] [,3] [,4]
[1,]   18   26   34   42
[2,]   20   28   36   44
[3,]   22   30   38   46
[4,]   24   32   40   48

> margin.table(t,2:3)
     [,1] [,2]
[1,]   10   74
[2,]   26   90
[3,]   42  106
[4,]   58  122

function(matrix)

Functions like variance var, covariancecov, correlationcor, eigenvalues, eigenvectors and so on work straightforward. They return a list object, which holds all the information. To access a specific element, you have to append $identifier as always to the function or to the variable where you store the result.
> eigen(matrix(1/1:16,nr=4))
$values
[1] 1.179585e+00 1.363303e-01 4.117886e-03 4.257033e-05

$vectors
           [,1]       [,2]        [,3]         [,4]
[1,] -0.7989567 -0.2354152 -0.02858180 -0.001756516
[2,] -0.4435888  0.4453984  0.54539392  0.161220234
[3,] -0.3182774  0.5938981 -0.15569642 -0.732375522
[4,] -0.2521839  0.6272838 -0.82309609  0.661536881
Decompositions work the same way: qt, chol, svd.
> svd(matrix(5/30:15,nr=4))
$d
[1] 9.506581e-01 1.030435e-02 6.405979e-05 1.930725e-07

$u
           [,1]       [,2]       [,3]       [,4]
[1,] -0.4627515 -0.6523668  0.5344744 -0.2731589
[2,] -0.4853732 -0.2914693 -0.4107503  0.7146627
[3,] -0.5104109  0.1473443 -0.5784381 -0.6190152
[4,] -0.5382887  0.6839247  0.4593807  0.1773745

$v
           [,1]        [,2]       [,3]        [,4]
[1,] -0.3698892 -0.58392055  0.5937209 -0.41196398
[2,] -0.4306703 -0.42438386 -0.1418523  0.78377252
[3,] -0.5154256 -0.08213729 -0.7196409 -0.45793767
[4,] -0.6419047  0.68715981  0.3308933  0.07924272
These are only some examples. Make sure to search the help, and online resources, for more.
There is no direct function to calculate the inverse, but there is a solve function which is able to do this. Of course, for solving, use the solve command.
> A=matrix(c(4,-5,2,3),nr=2)
> A
     [,1] [,2]
[1,]    4    2
[2,]   -5    3

> b=c(3,0.01)
> b
[1] 3.00 0.01

> solve(A,b) -> sol
> sol
[1] 0.4081818 0.6836364

> A %*% sol
     [,1]
[1,] 3.00
[2,] 0.01

huge matrices

Handling bigger matrices is more complicated than small ones. Calculations do not only take more time, but there are also memory constraints in the computer. Solutions for this problem are generally about reducing the amount of memory taken by the matrix. R stores all object in memory, that's a problem with large datasets. Sparse matrices try to reduce the memory usage by storing only the relevant bits of information and omitting areas where no information is kept (e.g. only zeros). Another possibility is to page out the data and retrieving only the needed data for each step of calculation. Another possibility is a more general system of compressing data in memory using data compression algorithms. I want to introduce some methods in R for those methods.

sparse matrix

A promising package is SparseM. It's PDF documentation (possibly in /usr/local/lib/R/site-library/SparseM/doc/SparseM.pdf after installing it) talks about twenty different storage models. It uses csr by default, this is similar to the also supported csc model, which is used by Matlab. It is possible to do linear algebra stuff as well as programming with this package. Here a short example which reveals the underlying storage principle:
> set.seed(1)
> m <- matrix(runif(150*150),150)
> m[abs(m)<0.999] <- 0
> m.csr <- as.matrix.csr(m)
> m.csr
An object of class “matrix.csr”
Slot "ra":
 [1] 0.9996333 0.9991947 0.9990684 0.9998202 0.9994554 0.9998635 0.9992230
 [8] 0.9996335 0.9998552 0.9992397 0.9999306 0.9994426 0.9990106

Slot "ja":
 [1] 130  24 129  89   6  18 102 110  82  93   7 113 117

Slot "ia":
  [1]  1  1  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  4  4 
 [26]  4  4  4  5  5  6  6  6  6  6  6  6  6  6  7  8  8  8  8  8  8  8  8  8  8 
 [51]  8  8  8  8  8  9  9  9  9  9  9  9  9  9  9 10 10 10 10 10 10 10 10 10 10 
 [76] 10 10 10 10 10 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12
[101] 12 12 12 12 13 13 13 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 14 14 14
[126] 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14
[151] 14

Slot "dimension":
[1] 150 150

> image(m.csr)
visualization of the matrix: (only a few dots)

database

One method of storing externally is using a database. The possibly best choice is PostgreSQL, because it is not only one of the fastest but also very reliably. Another valuable feature are stored procedures - it can be programmed in about a dozen languages. This means, even complex calculations can be made entirely in the database and you just get the peace of information you want to know. On CRAN, there is an information site,