How to do "countifs" (like in Excel) in Gauss please?

Hi,

I am trying to count the occurrences of each element in one column based on the criterion in another column.

For example,

Given:

1.0000000 2.0000000
1.0000000 2.0000000
1.0000000 4.0000000
3.0000000 6.0000000
3.0000000 6.0000000
4.0000000 9.0000000

I would like to count the number of occurrences of each element in the second column for the same element in the first column. My desired outcome would be: (the last column counts the number of occurrences)

1.0000000 2.0000000 2.0000000
1.0000000 4.0000000 1.0000000
3.0000000 6.0000000 2.0000000
4.0000000 9.0000000 1.0000000

Any ideas on how this could be realized please?

Thank you very much!

 

1 Answer



0



Here is a procedure that will accomplish what you want:


new;
x = { 1.0000000 2.0000000,
    1.0000000 2.0000000,
    1.0000000 4.0000000,
    3.0000000 6.0000000,
    3.0000000 6.0000000,
    4.0000000 9.0000000 };

x_counted = countIfs(x);
print "x_counted = " x_counted;
proc (1) = countIfs(x);
    local idx, tmp, out, j, tmp_u, tmp_inner;
    local i;
    
    out = {};
    i = 1;
    
    //Keep looping until we have removed all elements from 'x'
    do until scalmiss(x);
        //Grab chunk of first row matches
        tmp = selif(x, x[.,1] .== x[.,1]);
        
        //Remove chunk of first row matches from 'x' for next iteration
        x = delif(x, x[.,1] .== x[.,1]);
        
        //Grab unique elements from second column
        tmp_u = unique(tmp[.,2]);
        
        for (1, rows(tmp_u), 1);
            //Grab all rows where the second column is the same
            tmp_inner = selif(tmp, tmp[.,2] .== tmp[1,2]);
            
            //Create output row 
            out = out | (tmp_inner[1,.] ~ rows(tmp_inner));
            
            //Remove rows with this iteration's second column value from 'tmp'
            tmp = delif(tmp, tmp[.,2] .== tmp[1,2]);
        endfor;
    endo;
    retp(out);
endp;

aptech

1,773

Your Answer

1 Answer

0

Here is a procedure that will accomplish what you want:


new;
x = { 1.0000000 2.0000000,
    1.0000000 2.0000000,
    1.0000000 4.0000000,
    3.0000000 6.0000000,
    3.0000000 6.0000000,
    4.0000000 9.0000000 };

x_counted = countIfs(x);
print "x_counted = " x_counted;
proc (1) = countIfs(x);
    local idx, tmp, out, j, tmp_u, tmp_inner;
    local i;
    
    out = {};
    i = 1;
    
    //Keep looping until we have removed all elements from 'x'
    do until scalmiss(x);
        //Grab chunk of first row matches
        tmp = selif(x, x[.,1] .== x[.,1]);
        
        //Remove chunk of first row matches from 'x' for next iteration
        x = delif(x, x[.,1] .== x[.,1]);
        
        //Grab unique elements from second column
        tmp_u = unique(tmp[.,2]);
        
        for (1, rows(tmp_u), 1);
            //Grab all rows where the second column is the same
            tmp_inner = selif(tmp, tmp[.,2] .== tmp[1,2]);
            
            //Create output row 
            out = out | (tmp_inner[1,.] ~ rows(tmp_inner));
            
            //Remove rows with this iteration's second column value from 'tmp'
            tmp = delif(tmp, tmp[.,2] .== tmp[1,2]);
        endfor;
    endo;
    retp(out);
endp;


You must login to post answers.

Have a Specific Question?

Get a real answer from a real person

Need Support?

Get help from our friendly experts.