Find all rows with matching value

Question

Hi there,

I'm trying to write code that returns the rows of a column vector that match a certain value. Consider the following example:

x = { 10 10 20 20 30 40 30 20 20 10 20 30 30 20 40 };

x = x';

my_rows = find_rows(40,x);

And I would expect that the new variable "my_rows" would contain the values 6 and 15 (because the number 40 was in rows 6 and 15). Does this find/index function exist?

I saw that the function indnv gets close, but it only returns the first index value: 6.

Is there a way to find this quickly without having to build a for-loop with an if statement?

I ask because my dataset is quite large and even though it's a simple operation, the size of my original dataset (230,000 obs) makes it take quite a while to process.

Thank you!

4 Answers

aptech · Answer 1

The function indexcat does exactly what you are looking for.

// Use commas to create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

// Find the indices of all rows containing 40
my_rows = indexcat(x, 40);

will set my_rows equal to

6
15
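
As a minimal sketch of how the returned indices might then be used, the example below pulls the matching rows out of a second vector. The variable y and its contents are hypothetical, made up only for illustration; they are not part of the original question.

```gauss
// Column vector of values (from the question)
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

// Hypothetical second column with the same number of rows as 'x':
// row number times 100, i.e. 100, 200, ..., 1500
y = 100 * seqa(1, 1, rows(x));

// Indices of the rows where 'x' equals 40
my_rows = indexcat(x, 40);

// Use the index vector to extract those rows from 'y'
y_sub = y[my_rows, 1];

print y_sub;
```

Since my_rows holds 6 and 15, y_sub should hold the 6th and 15th elements of y, which are 600 and 1500.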

phildias · Answer 2

Thank you!

And is there a slightly different version, where I find all rows that are NOT equal to a certain value?

So, in the example I gave, it would look like this:

x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

my_rows = indexcat(x,40);

other_rows = opposite_indexcat(x,40);

And then my_rows contains {6, 15} while other_rows contains {1,2,3,4,5,7,8,9,10,11,12,13,14}.

Does this exist too?

Thanks again!

aptech · Answer 3

I don't believe there is a function that is the exact opposite of indexcat, but you can find the rows which do not match a certain value by combining a couple of GAUSS functions.

Method 1

// Create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

// Create the sequence 1, 2, 3...rows(x)
idx = seqa(1, 1, rows(x));

// Remove the rows of 'idx' which correspond
// to the rows of 'x' that equal 40
// Note that the `.==` operator will return a vector
// of 0's and 1's
idx_2 = delif(idx, x .== 40);

Method 2

// Create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

// 1. Find the indices of 'x' which equal 40.
// 2. Remove the rows found in step 1, from
//    the sequence 1, 2, 3...rows(x)
idx_2 = delrows(seqa(1, 1, rows(x)), indexcat(x, 40));
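
A third variant, offered only as a sketch and not as part of the answer above, keeps the non-matching rows directly instead of deleting the matching ones. It assumes selif (the counterpart of delif) together with the element-wise not-equal operator `.!=`.

```gauss
// Create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

// Create the sequence 1, 2, 3...rows(x)
idx = seqa(1, 1, rows(x));

// Keep the rows of 'idx' which correspond to the rows
// of 'x' that are NOT equal to 40. The `.!=` operator
// returns a vector of 0's and 1's, like `.==` in Method 1
idx_2 = selif(idx, x .!= 40);

print idx_2;
```

This should give the same idx_2 as Methods 1 and 2; which form reads better is largely a matter of taste.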

phildias · Answer 4

Awesome, thank you! This is much faster than my original for-loop =)
