
# Too much data: error G0030: Insufficient memory

When I run my program, the error is:

    d:\gauss\ex(20) : error G0030 : Insufficient memory
    Currently active call: nonpara [20] d:\gauss\ex
    Stack trace:
    nonpara called from d:\gauss\ex, line 10

I guess the reason is that the sample size is too large. I am using GAUSS 11. My code is as follows (the code file is named `ex`):

    format /rd 10,4;
    clear n,y,x;
    load data[5515478,11] = d:\gauss\data.txt;
    y = data[2:5515478,4];
    x = data[2:5515478,9];
    n = rows(x);
    yx = nonpara(y,x,x);
    "Nonparametric Estimation at 1-20 sample points";
    yx[1:20];
    end;

    proc nonpara(y,x,x0);  // nonparametric estimation: y, x, x0: nx1
        local h,kx,ykx,r;
        h = 1.06*stdc(x)*(n^(-1/5));
        kx = (x-x0')./h;
        kx = pdfn(kx);
        r = minc(rows(kx)|cols(kx));
        kx = diagrv(kx,zeros(r,1));
        ykx = y.*kx;
        r = minc(rows(ykx)|cols(ykx));
        ykx = diagrv(ykx,zeros(r,1));
        r = meanc(ykx)./meanc(kx);
        retp(r);  // nx1
    endp;

## 5 Answers

On the third line in your procedure:

    kx = (x-x0')./h;

The variable `kx` will end up with a size of 5,515,478 rows and 5,515,478 columns for a total size of about 221 terabytes. That is too large for any computer these days.
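As a rough check of that figure, assuming 8 bytes per double-precision element and terabytes counted as $2^{40}$ bytes:

$$
5{,}515{,}478^{2} \times 8\ \text{bytes} \approx 2.43 \times 10^{14}\ \text{bytes} \approx 221 \times 2^{40}\ \text{bytes}
$$

In decimal units that is about 243 TB.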

Some GAUSS procedures, such as `ols`, have the option of passing in a dataset and telling GAUSS to process it in chunks for this reason. However, since this is user-written code, you will need to do the chunking yourself.

You will need to process the statements that create the largest matrices a few rows at a time. Here is a start:

    proc nonpara(y,x,x0);  // nonparametric estimation: y, x, x0: nx1
        local h, kx, ykx, r, i, inc, index_end;

        // h is not too big, at just rows(x) by 1, so leave as is
        h = 1.06*stdc(x)*(n^(-1/5));

        // number of rows to process at a time
        // change this based upon computer memory size
        inc = 10;

        for i(1, rows(x), inc);
            index_end = i + inc - 1;
            kx = (x[i:index_end]-x0')./h;
            kx = pdfn(kx);

            // rest of code to process 'inc' rows of data
        endfor;

        retp(r);  // nx1
    endp;
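As a rough guide for choosing `inc`, assuming 8 bytes per element and that `x0` has as many rows as `x`, a single chunk of `kx` with `inc = 10` occupies

$$
10 \times 5{,}515{,}478 \times 8\ \text{bytes} \approx 4.4 \times 10^{8}\ \text{bytes},
$$

or roughly 0.4 GB, before counting any temporaries created by `pdfn` and the element-wise operations.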

If this is not clear, or you need help with part of the conversion, please post your questions.

Thanks a lot. I have now written the rest of `proc nonpara(y,x,x0)` for the case where I don't use leave-one-out estimation and x0 is a scalar rather than a vector. Is it coded correctly for this case?

    proc nonpara(y,x,x0);  // nonparametric estimation: y, x: nx1, x0: 1x1
        local h, kx, ykx, r1, r2, i, inc, index_end;

        // h is not too big, at just rows(x) by 1, so leave as is
        h = 1.06*stdc(x)*(n^(-1/5));

        // number of rows to process at a time
        // change this based upon computer memory size
        inc = 10;

        for i(1, rows(x), inc);
            index_end = i + inc - 1;
            kx = (x[i:index_end]-x0')./h;
            kx = pdfn(kx);
            ykx = y[i:index_end].*kx;
            r1 = sumc(ykx);
            r2 = sumc(kx);
        endfor;

        retp(r1/r2);  // nx1
    endp;

However, I still have some questions about this program. Please advise.

First, in the original code, x0 is also a vector (for example, x itself) with a large dimension. So in `kx=(x[i:index_end]-x0')./h;` the memory will still be insufficient.

Second, I need a leave-one-out estimate at each sample point xi. In the previous code I used `kx=diagrv(kx,zeros(r,1));` to set all the diagonal elements of the large matrix kx to zero. How can I achieve this leave-one-out estimation in the revised version?

Thanks.

Your code has a couple of problems:

- You need to add zeros to the diagonal of `kx` and `ykx`. But placing zeros on the diagonal of each chunk of rows we process is not the same as placing zeros on the diagonal of the full-size `kx` matrix.
- You forgot to take the mean of `r1` and `r2`.
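To spell out the first point, here is the leave-one-out estimate that the original code computes at sample point $x_k$. The symbols $\hat m_{-k}$, $\phi$ (the standard normal density returned by `pdfn`) and $\hat\sigma_x$ (the value of `stdc(x)`) are notation introduced here for illustration:

$$
\hat m_{-k}(x_k) = \frac{\sum_{j \ne k} y_j \, \phi\!\left(\frac{x_j - x_k}{h}\right)}{\sum_{j \ne k} \phi\!\left(\frac{x_j - x_k}{h}\right)},
\qquad h = 1.06 \, \hat\sigma_x \, n^{-1/5}
$$

The excluded term is the one with $j = k$, which lies on the diagonal of the full $n \times n$ matrix. When rows `i` to `i + inc - 1` are processed as a chunk, local row `j` is row `i + j - 1` of the full matrix, so the element to zero is `kx[j, i+j-1]` rather than `kx[j, j]`.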

Take a look at this procedure and I will answer your new questions a little later.

    proc nonpara(y,x,x0);  // nonparametric estimation: y, x, x0: nx1
        local h, kx, ykx, r1, r2, i, j, inc, index_end;

        // h is not too big, at just rows(x) by 1,
        // so leave as is
        h = 1.06*stdc(x)*(rows(x)^(-1/5));

        r1 = zeros(rows(x), 1);
        r2 = zeros(rows(x), 1);

        // number of rows to process at a time
        // change this based upon computer memory size
        inc = 5;

        for i(1, rows(x), inc);
            index_end = i + inc - 1;
            kx = (x[i:index_end]-x0')./h;
            kx = pdfn(kx);

            // We don't want 0's on the diagonal of each chunk we process.
            // Row j of the chunk is row i+j-1 of the full matrix,
            // so zero column i+j-1 instead.
            for j(1, inc, 1);
                kx[j,i+j-1] = 0;
            endfor;

            ykx = y[i:index_end].*kx;
            r1 = r1 + sumc(ykx);
            r2 = r2 + sumc(kx);
        endfor;

        // Calculate mean in return line
        retp(((r1)./rows(x))./((r2)./rows(x)));  // nx1
    endp;

Thanks for these careful instructions. Now suppose I take only the first 5515 rows of data for the nonparametric estimation, and for simplicity x0 is 1 by 1 in the procedure. I wrote the code below according to your suggestion. When I run it, the error is:

error G0058 : Index out of range

Currently active call: nonpara [38] d:\gauss\ex

Stack trace:

nonpara called from d:\gauss\ex, line 15

That is, something is wrong in `kx=(x[i:index_end]-x0)./h;`. Why? Please help.

-------------------------------

    clear n,y,x;
    load data[5515478,11] = d:\gauss\data.txt;
    y = data[2:5515,4];
    x = data[2:5515,9];
    n = rows(x);
    a = minc(x);
    b = maxc(x);
    s = 10;
    h0 = (b-a)/s;
    x0 = seqa(a,h0,s);  // ten points in [a,b]
    yx = x0;
    for i(1, rows(x0), 1);
        yx[i] = nonpara(y,x,x0[i]);
    endfor;
    s~yx;
    end;

    proc nonpara(y,x,x0);  // y, x: nx1, x0: 1x1
        local h, kx, ykx, r1, r2, i, j, inc, index_end;

        // h is not too big, at just rows(x) by 1, so leave as is
        h = 1.06*stdc(x)*(rows(x)^(-1/5));

        r1 = 0;
        r2 = 0;

        // number of rows to process at a time
        // change this based upon computer memory size
        inc = 1000;

        for i(1,rows(x),inc);
            if i>5*inc+1;  // 5001 to rows(x)=5514
                kx = (x[(5*inc+1):rows(x)]-x0)./h;
                kx = pdfn(kx);
                ykx = y[5*inc+1:rows(x)].*kx;
            else;
                index_end = i + inc - 1;
                kx = (x[i:index_end]-x0)./h;
                kx = pdfn(kx);
                ykx = y[i:index_end].*kx;
            endif;
            r1 = r1 + sumc(ykx);  // 1x1
            r2 = r2 + sumc(kx);   // 1x1
        endfor;

        retp(r1/r2);
    endp;

A couple of things here:

- Don't try to make the code work for row counts that are not evenly divisible by `inc` until it works correctly for row counts that are. So pull out the entire block that starts with `if i > 5*inc+1;`. Adding this part last will make it much simpler to get working.
- Your code needs to add zeros on the diagonal like the original code does.
- My last posted revision of the procedure assumed that all three inputs would have the same number of rows. Some changes are needed now that `x0` can have a different number of rows. For example, the `for` loop for adding zeros on the diagonal will have to change, since `kx` will not necessarily have the same number of columns as `x` has rows.

Here is a version that should work for any input where `y` and `x` have the same number of rows and `inc` divides that number evenly. Try this out, think about the changes you see, and then try adding a clean-up loop to take care of the rows left over after the main loop over `i`. I recommend starting with small data and using the debugger, then bumping up the problem size once it seems to work with the smaller data.

    proc nonpara(y,x,x0);  // nonparametric estimation: y, x: nx1, x0: mx1
        local h, kx, ykx, r1, r2, i, j, inc, index_end;

        // h is not too big, at just rows(x) by 1,
        // so leave as is
        h = 1.06*stdc(x)*(rows(x)^(-1/5));

        r1 = 0;
        r2 = 0;

        // number of rows to process at a time
        // change this based upon computer memory size
        inc = 5;

        for i(1, rows(x), inc);
            index_end = i + inc - 1;
            kx = (x[i:index_end]-x0')./h;
            kx = pdfn(kx);

            // We don't want 0's on the diagonal of each chunk we process.
            // Row j of the chunk is row i+j-1 of the full matrix, so zero
            // column i+j-1, but only where that column exists in kx.
            for j(1, inc, 1);
                if (i + j - 1 <= cols(kx));
                    kx[j, i+j-1] = 0;
                endif;
            endfor;

            ykx = y[i:index_end].*kx;
            r1 = r1 + sumc(ykx);
            r2 = r2 + sumc(kx);
        endfor;

        // Calculate mean in return line
        retp(((r1)./rows(x))./((r2)./rows(x)));
    endp;
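For the leftover rows, one option is to cap `index_end` at `rows(x)` instead of adding a separate branch. This is a sketch under stated assumptions, not the procedure posted in this thread. In the script that raised G0058, `i` takes the values 1, 1001, ..., 5001, and the test `i > 5*inc+1` is still false at `i = 5001`, so the `else` branch asks for rows 5001 to 6000 of a 5514-row vector. The version below assumes the estimate is wanted at the sample points themselves (`x0` equal to `x`), so the leave-one-out entries lie on the diagonal of the full matrix. The name `nonparaLoo` and the `inc` argument are hypothetical.

```gauss
// Hypothetical helper: leave-one-out kernel regression evaluated at
// the sample points, processed inc rows at a time.
// Assumes y and x are n x 1 and that x0 is x itself.
proc (1) = nonparaLoo(y, x, inc);
    local n, h, r1, r2, i, j, index_end, kx;

    n = rows(x);
    h = 1.06 * stdc(x) * n^(-1/5);

    // Running numerator and denominator, one entry per sample point
    r1 = zeros(n, 1);
    r2 = zeros(n, 1);

    for i(1, n, inc);
        // Cap the last chunk so the index never runs past n
        index_end = minc((i + inc - 1) | n);

        // (rows in this chunk) x n block of kernel weights
        kx = pdfn((x[i:index_end] - x') ./ h);

        // Chunk row j is row i+j-1 of the full matrix, so its
        // diagonal element sits in column i+j-1
        for j(1, index_end - i + 1, 1);
            kx[j, i + j - 1] = 0;
        endfor;

        r1 = r1 + sumc(y[i:index_end] .* kx);
        r2 = r2 + sumc(kx);
    endfor;

    // Ratio of sums; dividing both by n would give the same result
    retp(r1 ./ r2);
endp;
```

A call such as `yx = nonparaLoo(y, x, 1000);` holds an `inc` by `rows(x)` block in memory at a time, so choose `inc` to suit the memory available. The total work is still proportional to `rows(x)^2`, so for the full 5.5 million rows the run time is likely to become the next limit.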