I am a new user of GAUSS. Could you please tell me how I can drop the top 5% of observations from my data set? I did not find any code related to this, but I think I could line up (sort) all the observations and then delete the last rows, where the number of rows to delete is 5% times the number of observations.
Can anyone help me with this?
Thank you very much
I am not certain that I am understanding your question correctly. Below is something that may help you. If it does not answer your question, let us know and we will be happy to provide more help.
Deleting the first 5% of your observations
Let us suppose that we have one variable with 100 observations. For our example, we will create a random normal vector to represent this variable. We can remove the first 5% of observations by using an indexing operation to select the final 95% like this:
//create example variable
x_1 = rndn(100, 1);

//assign 'x_1' to equal the last 95% of observations
x_1 = x_1[6:100];
Since we will not always have 100 observations and would like code that works even if we get more data, we should make the code more abstract. In this next code snippet, we will use the rows function to calculate the length of our vector and the ceil function to round up in case 5% of our total number of rows is not a whole number.
//create example variable
x_1 = rndn(100, 1);

//calculate index of first row we want to keep:
//ceil(rows * 0.05) rows are dropped, so we keep from the next row onward
start_idx = ceil(rows(x_1) * 0.05) + 1;

//assign 'x_1' to equal the last 95% of observations
x_1 = x_1[start_idx:rows(x_1)];
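If by "top 5%" you mean the observations with the largest values rather than the first rows of the data set, the approach you describe (sort, then delete the rows at the end) could look roughly like the sketch below. This is only an illustration under that assumption: the names x_sorted, n_drop, last_idx and x_trimmed are made up for this example, and it uses sortc to sort the rows in ascending order by the first column.

```gauss
//create example variable
x_1 = rndn(100, 1);

//sort the rows in ascending order by column 1,
//so the largest values end up in the last rows
x_sorted = sortc(x_1, 1);

//number of rows to drop: 5% of the observations, rounded up
n_drop = ceil(rows(x_sorted) * 0.05);

//index of the last row we want to keep
last_idx = rows(x_sorted) - n_drop;

//keep everything except the largest 5% of observations
x_trimmed = x_sorted[1:last_idx, .];
```

If your data has several columns, the second argument of sortc selects the column to sort by, and the "." in the index keeps all columns of the rows that remain. Note that the result is in sorted order, not the original order of your observations.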