Operations on a large matrix

I have a large dataset and need to perform a series of matrix operations on it. I have the following line of code:

summa = sig^2.*inv(spEye(nobs) - rho.*weights);

where sig and rho are parameters to be estimated and nobs = 116279. I can't create the full dense matrix due to RAM restrictions and am wondering whether there is a different approach that might work here. I have already read through https://www.aptech.com/questions/inverse-of-sparse-matrix/.
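
For reference, a quick back-of-envelope check of why the dense version won't fit (assuming standard 8-byte doubles):

// Approximate memory needed to hold a dense nobs x nobs matrix of 8-byte doubles
nobs = 116279;
gb_needed = (nobs^2 * 8) / 1024^3;
print "Dense nobs x nobs matrix needs roughly (GB): " gb_needed;

which comes out to roughly 100 GB for the matrix alone, before any working copies made by inv().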

Thanks,

Jason

7 Answers



To clarify, weights is also a sparse matrix of 1s with dimensions 116279 x 116279.



Do you mean that weights is a 116,279x116,279 matrix where every element is a 1? Is rho a 116,279x1 vector?

aptech

weights is a sparse 0/1 matrix with about 1% of the cells being 1s. rho is a single scalar parameter. I found the function for multiplying a sparse matrix by a scalar, but the inverse and the determinant are still an issue.
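
For reference, the sparse system matrix itself can be formed without densifying anything, since recent GAUSS versions support the standard operators on sparse matrices. A small sketch (the values, indices, and rho below are made up for illustration, and the spCreate() argument order should be checked against your docs):

// Small illustrative example of forming B = I - rho*W with sparse matrices
n = 5;
vals  = { 1, 1, 1 };
rinds = { 1, 2, 4 };
cinds = { 3, 5, 2 };

// Sparse 5x5 matrix with 1s at the listed (row, column) positions
W = spCreate(n, n, vals, rinds, cinds);

rho = 0.5;
B = spEye(n) - rho .* W;    // stays sparse; for the real problem n = 116279 and W = weights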



I am basically trying to reproduce eq. 9 in the following paper:

https://link.springer.com/content/pdf/10.1007%2Fs00362-008-0178-4.pdf

 



I spoke with my supervisor and we came up with an alternative method. The original objective was to make full use of the 167,000 records in the dataset, but sampling about 2,000 records multiple times and using a bootstrap method should still give a good result.

As a note, I also explored the spConjGradSol() function and some other avenues, but it doesn't seem to help in my case since the inverse is nested inside a larger function. I tried estimating with about 29,000 records, but the inversion is still too much for my 16 GB of RAM.
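
For reference, the resampling step itself is cheap; something along these lines (the sample size and the data matrix are placeholders):

// Stand-in for the real nobs x k data matrix
x = rndn(116279, 3);

n_samp = 2000;
nobs = rows(x);

// Random row indices in 1..nobs, drawn with replacement
idx = ceil(nobs .* rndu(n_samp, 1));

// One bootstrap subsample; re-estimate the model on x_boot inside a loop over replications
x_boot = x[idx, .];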



Another thing you could try, though it might be slow, is to compute the matrix inverse by a linear solve with the identity matrix. For example:

// Example dense matrix
A = {   25   22   30   14   22, 
         7    2   21   20   28,
        27    7   19   15   12, 
        30   19   10   18   19, 
         1    1    3   17    2 };

A_i = inv(A);
print "Inverse of A = " A_i;

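// Same inverse via a linear solve against the identity matrix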
A_i_cmp = eye(rows(A)) / A;
print "eye(rows(A)) / A = " A_i_cmp;

Which will print out

Inverse of A =
 -0.0269  -0.0080   0.0472   0.0160  -0.0270
  0.0491  -0.0276  -0.0615   0.0201   0.0251
  0.0380  -0.0051   0.0248  -0.0531   0.0091
 -0.0055  -0.0023  -0.0005   0.0035   0.0628
 -0.0214   0.0452  -0.0256   0.0319  -0.0467

eye(rows(A)) / A =
 -0.0269  -0.0080   0.0472   0.0160  -0.0270
  0.0491  -0.0276  -0.0615   0.0201   0.0251
  0.0380  -0.0051   0.0248  -0.0531   0.0091
 -0.0055  -0.0023  -0.0005   0.0035   0.0628
 -0.0214   0.0452  -0.0256   0.0319  -0.0467 

You will have to modify that for your sparse matrix and perform the linear solve one column at a time, like this:


for i(1, cols(A), 1);
    // Create 1 column of the identity matrix
    e_tmp = zeros(rows(A), 1);
    e_tmp[i] = 1;

    // Perform the linear solve
    A_i_col = e_tmp / A;
    print A_i_col;
endfor;

Which will print out:

 -0.0269
  0.0491
  0.0380
 -0.0055
 -0.0214

 -0.0080
 -0.0276
 -0.0051
 -0.0023
  0.0452

  0.0472
 -0.0615
  0.0248
 -0.0005
 -0.0256

  0.0160
  0.0201
 -0.0531
  0.0035
  0.0319

 -0.0270
  0.0251
  0.0091
  0.0628
 -0.0467 

I don't think that is the most elegant solution, but it might be worth trying for a couple of columns at least to compare to your bootstrap solution.
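
For the sparse case, the same idea might look roughly like this (untested at this scale; rho and weights are the quantities from your model, and this assumes the / operator performs a sparse linear solve just as in the dense example above):

nobs = 116279;

// Sparse system matrix; weights is assumed to already be a sparse matrix
B = spEye(nobs) - rho .* weights;

// Only compute a handful of columns of inv(B) to start with
n_cols = 3;
B_i_cols = zeros(nobs, n_cols);

for i(1, n_cols, 1);
    // i-th column of the identity matrix
    e_tmp = zeros(nobs, 1);
    e_tmp[i] = 1;

    // Sparse linear solve for the i-th column of inv(B)
    B_i_cols[., i] = e_tmp / B;
endfor;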

I tried to look at that paper, but I do not have access to the full version, so I could not see equation 9. However, if your matrix is positive definite, there might be a better option for at least the determinant.
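
For example, if the system matrix is symmetric positive definite, the log-determinant can be taken from a Cholesky factor rather than from det(), which is far better behaved numerically. A small dense sketch of the identity (at your problem size you would still need a sparse factorization, but the idea carries over):

// Log-determinant via the Cholesky factorization: if B = C'C with C upper
// triangular, then ln(det(B)) = 2 * sum of ln(diag(C))
B = { 4 1 0,
      1 3 1,
      0 1 2 };

C = chol(B);
logdet_B = 2 * sumc(ln(diag(C)));

print "log-determinant via Cholesky = " logdet_B;
print "check with ln(det(B))        = " ln(det(B));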

aptech

Great! This is very helpful. I will do some experimenting with it.
