Reading whitespace-delimited data from a txt file

Hi,

I want to read data from a txt file. The data is organized so that each line is a record, e.g. with a string in columns 1-7, a string in columns 8-12, and values in columns 13-20, 21-28 and 29-36. Some values can be blank, e.g. in one record the value in columns 8-12 may be blank.

Until now I have used a very cumbersome workaround: first the file is imported into Excel and all blanks are replaced by a specific character, normally &, and then it is saved again as a txt file with just one space between each variable in the record.

Then it is converted to a GAUSS dat-file by running Atog.exe with this command file:

input myfile.txt;
output myfile.dat;
msym &;
invar $ var1 var2 # var3 var4 var5 ;
outvar var1 var2 var3 var4 var5 ;
outtyp d;

I hope someone can suggest a smarter way to do this.

Thanks in advance.

3 Answers



I'm sure we can come up with a better solution, but let me make sure I am understanding the problem.

Are you saying that in the original data file, spaces can represent a separator or a missing value?

Can I see a couple of representative lines from one of these files?

aptech




Thanks,

Here are a few records from the data file. The columns have a fixed width. In this example the first three variables should be read as strings and the last two as numerical values. In the real file there are more columns with numerical values.

V621105         3110          03113                       24             200
V621105         5210                                       0              23
V621105         5251                                      84               0
V621105         5252                                     165               0
V621105         6001                                     382               0
V621105         6007                                      63               0
V621200         0100          130000                      17               
V621200         0100          140000                       3               
V621200         0700                                    1222               
V621200         0900                                       0               
V621200         2000          140000                      17               9





Delimited text data is easy to read in recent versions of GAUSS. For example, if the file were comma-separated, you could read it into GAUSS like this:

X = loadd("myfile.csv");

A file where a space could mean either a separator or a missing value is harder to handle, because splitting on whitespace cannot tell the two apart.
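To illustrate the ambiguity, here is a small Python sketch. The two records are made up, modelled on the first two sample lines posted above with the spacing shortened; the variable names are only for illustration.

```python
# Made-up records with five fixed-width fields each.
# In the second record the third field is blank.
full = "V621105  3110  03113     24    200"
partial = "V621105  5210             0     23"

# Splitting on runs of whitespace silently drops the blank field,
# so every later value moves one position to the left.
print(full.split())     # five tokens
print(partial.split())  # only four tokens; "0" now sits in the third position
```

Nothing in the second token list indicates which field was missing, which is why the positions of the columns are needed to read such a file reliably.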

  1. Where do these files come from?
  2. Is there a way to get a version of the file with a different delimiter such as a tab, comma, semi-colon, etc?
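If no such export is available, one possible workaround is to cut each record by column position instead of by spaces and write a comma-separated file, which avoids the Excel step. Below is a minimal Python sketch of that idea, not a GAUSS solution. The column positions are the ones given in the question (1-7, 8-12, 13-20, 21-28, 29-36); the posted sample appears to use wider columns, so treat the positions as an assumption to adjust. The names LAYOUT, split_fixed_width and fixed_width_to_csv are hypothetical, and the two records are made up.

```python
import csv
import io

# (name, first column, last column), 1-based and inclusive.
# Assumption: positions as described in the question; adjust to the real file.
LAYOUT = [
    ("var1", 1, 7),
    ("var2", 8, 12),
    ("var3", 13, 20),
    ("var4", 21, 28),
    ("var5", 29, 36),
]

def split_fixed_width(line, layout=LAYOUT):
    """Cut one record by column position; blank fields become empty strings."""
    line = line.rstrip("\r\n")
    return [line[start - 1:end].strip() for _, start, end in layout]

def fixed_width_to_csv(src, dst, layout=LAYOUT):
    """Copy records from the text stream src to dst as comma-separated rows."""
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow([name for name, _, _ in layout])  # header row
    for line in src:
        if line.strip():  # skip completely empty lines
            writer.writerow(split_fixed_width(line, layout))

# Two made-up records in the assumed layout; the second has blank var2 and var4.
sample = (
    f"{'V621105':<7}{'3110':<5}{24:>8}{200:>8}{1:>8}\n"
    f"{'V621200':<7}{'':<5}{17:>8}{'':>8}{9:>8}\n"
)

out = io.StringIO()
fixed_width_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

With real files, src and dst would be file objects opened with open(). The resulting file could then be tried with loadd as shown above; how the empty fields are treated on the GAUSS side should be checked rather than assumed.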

aptech



© Aptech Systems, Inc. All rights reserved.
