Hi,

Can anyone help me with this:

I have an NxN correlation matrix and I need to find all index vectors of length 3 or more in which every pair of variables is correlated, that is, correlation(a,b) > c0 for every pair (a,b) in the vector.

For example, assume we have the following table of correlated pairs (each row lists two variable indices whose correlation is > c0):

| variable index | variable index |
|---|---|
| 4 | 5 |
| 4 | 6 |
| 4 | 9 |
| 5 | 9 |
| 5 | 10 |
| 6 | 9 |

Assume we only want index vectors with 3 or more elements.

Then the index vectors are

index_vec1 = 4|6|9 and

index_vec2 = 4|5|9.

Notice that 5 and 6 do not have correlation > c0, so they are not "correlated" and cannot appear in the same index vector.

Can anyone help?

Thanks,

T.

## 2 Answers

accepted

Here is a start. I'll see if I can find some time later to make it loop over the vector to find all of the index vectors rather than just one of them.

    c = { 4 5, 4 6, 4 9, 5 9, 5 10, 6 9 };

    //Sort by first column, then secondarily by second column
    c = sortmc(c, 1|2);

    //Grab first variable
    var_1 = c[1,1];

    //Grab first correlating variable
    var_2 = c[1,2];

    //Select rows of 'var_1' except for
    //first row which references 'var_2'
    c_1 = selif(c, 0|(c[2:rows(c),1] .== var_1));

    //Remove observations of 'var_1'
    c = delif(c, c[.,1] .== var_1);

    //Create vector of 'var_2's correlations
    c_2 = selif(c, (c[.,1] .== var_2));

    //Find variables with which 'var_1'
    //and 'var_2' correlate
    idx_1 = selif(c_1[.,2], sumr(c_1[.,2] .== c_2[.,2]'));

    //Add 'var_1' and 'var_2' to the list
    //of shared correlations
    idx_1 = var_1 | var_2 | idx_1;

    print "idx_1 = " idx_1;
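One possible way to cover every triple rather than just the first one is to turn the pair table into a symmetric 0/1 matrix and test each combination i < j < k directly. This is only a minimal sketch of that idea, not the looped version promised above; the names `adj`, `n`, `a` and `b` are introduced here for illustration, and it assumes the variable indices are positive integers.

```gauss
// Pair table from the question: each row is a pair with correlation > c0
c = { 4 5, 4 6, 4 9, 5 9, 5 10, 6 9 };

// Largest variable index that appears in the table
n = maxc(maxc(c));

// Symmetric 0/1 matrix: adj[a,b] = 1 when a and b are correlated
adj = zeros(n, n);
for r(1, rows(c), 1);
    a = c[r,1];
    b = c[r,2];
    adj[a,b] = 1;
    adj[b,a] = 1;
endfor;

// Visit every triple i < j < k and report it
// when all three pairs are correlated
for i(1, n-2, 1);
    for j(i+1, n-1, 1);
        if adj[i,j] == 1;
            for k(j+1, n, 1);
                if (adj[i,k] == 1) and (adj[j,k] == 1);
                    print "index vector: " i j k;
                endif;
            endfor;
        endif;
    endfor;
endfor;
```

For the table in the question this should report the triples 4, 5, 9 and 4, 6, 9. It only looks for vectors of length exactly 3, so larger groups would need a more general search.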

I think this is similar to the maximal clique problem:

http://en.wikipedia.org/wiki/Clique_problem

Is there GAUSS code for this?

Thanks

T.
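For reference, a common way to list maximal cliques is the Bron-Kerbosch recursion. Below is a minimal GAUSS sketch of the basic version without pivoting, under a few assumptions: the proc name `listCliques` and the argument `minSize` are made up for this example, sets are stored as n x 1 vectors of 0/1 flags, and variable indices are positive integers. It is not a built-in GAUSS routine.

```gauss
// Pair table from the question: each row is a pair with correlation > c0
c = { 4 5, 4 6, 4 9, 5 9, 5 10, 6 9 };
n = maxc(maxc(c));

// Symmetric 0/1 matrix of correlated pairs.
// Starting from a full correlation matrix (called rho here, an assumed
// name) one could instead use: adj = (rho .> c0) .* (1 - eye(n));
adj = zeros(n, n);
for i(1, rows(c), 1);
    a = c[i,1];
    b = c[i,2];
    adj[a,b] = 1;
    adj[b,a] = 1;
endfor;

// Start with an empty clique, every variable as a candidate,
// nothing excluded, and a minimum vector length of 3
listCliques(zeros(n,1), ones(n,1), zeros(n,1), adj, 3);

// r = current clique, p = candidates, x = already-processed variables
proc (0) = listCliques(r, p, x, adj, minSize);
    local v, nbr, rNew, members;

    // No candidates and nothing excluded: r cannot be extended
    if (sumc(p) == 0) and (sumc(x) == 0);
        if sumc(r) >= minSize;
            members = indexcat(r, 1)';
            print "index vector: " members;
        endif;
        retp;
    endif;

    v = 1;
    do while v <= rows(adj);
        if p[v] == 1;
            // Neighbours of v; multiplying 0/1 flags intersects the sets
            nbr = adj[.,v];
            rNew = r;
            rNew[v] = 1;
            listCliques(rNew, p .* nbr, x .* nbr, adj, minSize);

            // Move v from the candidates to the processed set
            p[v] = 0;
            x[v] = 1;
        endif;
        v = v + 1;
    endo;
endp;
```

For the table in the question the maximal groups of size 3 or more are 4, 5, 9 and 4, 6, 9. Only maximal groups are reported, so any subset of length 3 or more of a reported vector is also fully correlated. The number of cliques can grow exponentially with N, so this may be slow for large, densely correlated matrices.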
