I have a numeric vector x that I'll need to rank. Before ranking, however, I'd like to check whether any of its elements occur more than once.
I suppose one way that I could do that is to check whether
rows(x) == rows(unique(x));
That seems like a lot of work, collecting all the unique entries only to count them and then discard them.
Is there a quicker or more elegant way?
I believe it will be quite a bit faster to first sort the data and then do a vectorized comparison of each element x[i] with the next element x[i+1].
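// Sort so that equal values end up next to each other
sort_x = sortc(x, 1);
// Compare every element with its successor; rep is 1 wherever two neighbors match
rep = sort_x[1:rows(sort_x)-1] .== sort_x[2:rows(sort_x)];
if sumc(rep);
    // we have repetitions
else;
    // all data unique
endif;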
Some quick and limited testing shows this to be about 40% faster.
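If the check is needed in more than one place, the sort-and-compare idea above could be wrapped in a small procedure. This is only a sketch: the name hasDuplicates is made up for illustration, and the guard for vectors with fewer than two rows is an added assumption, since the index range 1:rows(x)-1 is not valid for a single-element vector.

```gauss
// Hypothetical helper (name is illustrative, not a built-in):
// returns 1 if the column vector x contains a repeated value, 0 otherwise.
proc (1) = hasDuplicates(x);
    local n, sort_x;

    n = rows(x);

    // Added guard: with fewer than two elements nothing can repeat,
    // and the neighbor comparison below would index out of range.
    if n < 2;
        retp(0);
    endif;

    // After sorting, equal values sit next to each other
    sort_x = sortc(x, 1);

    // Count matching neighbors and reduce the count to a 0/1 flag
    retp(sumc(sort_x[1:n-1] .== sort_x[2:n]) > 0);
endp;

// Usage example
x = { 3, 1, 2, 3 };
if hasDuplicates(x);
    print "x has repeated values";
else;
    print "all values in x are unique";
endif;
```

Note that this relies on exact equality, as the original comparison does, so values that differ only by floating-point rounding are treated as distinct.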