Empirical CDF

Hello,

I am trying to plot an empirical cumulative distribution function, but I can't seem to find the command that I use in other software such as Matlab, where it is ecdf(x). Is there a way to do this in GAUSS?

Thanks.

 

1 Answer



I am not sure if any of the GAUSS applications have an ECDF function, but I think this procedure will do what you need.

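proc (1) = ecdf(x);
	local n, bp_start, bp_end, bp_inc, bp_pts, out, num_breaks;

	n = rows(x);

	//number of breakpoints; lower it for speed, raise it for a smoother curve
	num_breaks = n;

	//calculate evenly spaced breakpoints from min(x) to max(x)
	bp_start = minc(x);
	bp_end = maxc(x);
	bp_inc = (bp_end - bp_start) ./ num_breaks;
	bp_pts = seqa(bp_start, bp_inc, num_breaks + 1);

	//pin the last breakpoint to max(x) to guard against rounding in seqa
	bp_pts[rows(bp_pts)] = bp_end;

	//pre-allocate 'out' vector to avoid concatenation
	out = zeros(rows(bp_pts), 1);

	//Clear code to count the observations at or below each breakpoint
	for i(1, rows(bp_pts), 1);
		out[i] = sumc(x .<= bp_pts[i]);
	endfor;

	//convert counts to cumulative proportions in [0, 1]
	retp(out ./ n);
endp;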
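
Since the question asks for a graph, here is a rough usage sketch, assuming the ecdf procedure above has already been defined. The sample data, the seed and the variable names are made up for illustration. The procedure returns only the y-values, so the sketch rebuilds the breakpoints with the same formula to get the x-axis.

```gauss
// Hypothetical sample data: 200 standard normal draws
rndseed 12345;
x = rndn(200, 1);

// y-values: the vector returned by ecdf() at each breakpoint
F = ecdf(x);

// x-values: rebuild the breakpoints with the same formula used inside ecdf()
n = rows(x);
bp_pts = seqa(minc(x), (maxc(x) - minc(x)) ./ n, n + 1);

// Draw the curve
plotXY(bp_pts, F);
```

If you change num_breaks inside the procedure, use the same value in place of n when rebuilding the breakpoints so that the two vectors have the same length.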

You can adjust the number of breakpoints, num_breaks, to trade off speed against the smoothness of the curve. If speed is a concern, you might want to replace the for loop at the end with a matrix-based operation. Note that it builds a rows(bp_pts) by rows(x) matrix, so memory use grows quickly with the sample size:

//Faster matrix-based code to count the observations at or below each breakpoint
//x is overwritten with one copy of x' per breakpoint: rows(bp_pts) by rows(x)
x = reshape(x', rows(bp_pts), rows(x));
out = sumr(x .<= bp_pts);
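
If you would rather avoid breakpoints altogether, another option is to sort the data, which is a different technique from the procedure above. After sorting, the i-th smallest observation has an ECDF value of i/n. The following is only a sketch of that idea; the name ecdf_sorted and the sample data are hypothetical.

```gauss
// Sort-based ECDF: returns the sorted data and the cumulative proportions
proc (2) = ecdf_sorted(x);
	local n, xs, F;

	n = rows(x);

	// sort the observations in ascending order
	xs = sortc(x, 1);

	// the i-th sorted observation gets the value i/n
	F = seqa(1, 1, n) ./ n;

	retp(xs, F);
endp;

// Hypothetical sample data
rndseed 12345;
x = rndn(200, 1);

{ xs, F } = ecdf_sorted(x);
plotXY(xs, F);
```

This evaluates the ECDF at the observations themselves, so it needs no num_breaks setting. With tied values, the largest F at a given x is the ECDF value at that point.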


© Aptech Systems, Inc. All rights reserved.
