I have been trying the following, but it still has not finished after a full day.
The DAT file is about 1 GB. Should I split it into several pieces?
My workstation has plenty of memory (128 GB) and a high-end CPU.
What is the format of the data? i.e. is it comma delimited text with a header like this:
x,y,z
1,2,3
4,5,6
or is it separated by semi-colons, or is it a binary data file?
I tried both comma-delimited and tab-delimited text files without a header, but I still cannot produce the fmt file.
You can convert a comma separated file without a header to a GAUSS dataset with the GAUSS ATOG utility. atog.exe comes with GAUSS and is located in your GAUSSHOME directory.
Let's assume that our GAUSSHOME directory is C:\gauss and that we have a file named my_data.csv that we would like to convert to a GAUSS dataset. The contents of my_data.csv are:
12,0.065,79
10,0.04,84
9,0.07,103
11,0.055,92
Let's assume that we want to create a GAUSS dataset with the following variable names: prod, WACC and price. The first thing we need to do is create a command file for atog.exe to use. In it we need to specify: 1) the input data file, 2) the output data file, 3) the incoming variable names, 4) the final variable names and 5) (optionally) whether to preserve variable name case.
We will create an ATOG command file named my_atog_cmd.txt in C:\gauss with the following contents:
input my_data.csv;
output my_data.dat;
invar prod WACC price;
outvar prod WACC price;
preservecase;
Now we need to execute the ATOG utility. We can either:
- Open a Windows Command Prompt
- Go to C:\gauss
- Execute the following command:
or from GAUSS, we can:
- Set our GAUSS working directory to C:\gauss
- From the GAUSS command prompt we can execute ATOG by entering this command:
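The exact line is not shown in this thread, so the following is only a minimal sketch. It assumes ATOG is launched through GAUSS's exec function, with the command file name passed as the command-line argument and the working directory already set to C:\gauss. The variable name ret is just illustrative.

```
// Assumption: atog.exe and my_atog_cmd.txt are both in the current
// working directory (C:\gauss in this example).
// exec runs an external program and returns its exit code.
ret = exec("atog", "my_atog_cmd.txt");

// A return code of 0 indicates that the conversion succeeded.
print "atog return code:" ret;

if ret != 0;
    errorlog "atog failed";
endif;
```

From the Windows Command Prompt, the corresponding call would presumably be atog my_atog_cmd.txt, run from C:\gauss.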
If the exec of ATOG succeeds, it will print out a return value of 0 and you will have a new file in C:\gauss named my_data.dat. This conversion for a large file might take 30 seconds or so, depending on how large the file is and how fast your computer is. However, when you load the data from the GAUSS dataset into GAUSS, the same size data should load in a second or two.
You can then load in the matrix of all variables simply with the loadd command like this:
X = loadd("C:\\gauss\\my_data.dat");
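As a quick sanity check after loading, it can help to confirm the dimensions and the variable names stored in the dataset. This is a sketch rather than a required step. It assumes the same file path as above, and the variable name names is just illustrative.

```
// Load the converted dataset (same path as in the example above)
X = loadd("C:\\gauss\\my_data.dat");

// The row and column counts should match the source CSV file
print "rows:" rows(X);
print "cols:" cols(X);

// Read the variable names stored in the dataset header
names = getname("C:\\gauss\\my_data.dat");
print $names;
```

If the column count or the names differ from what the command file specified, the invar and outvar lines are the first place to look.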
Many thanks! I was able to create the GAUSS dataset!