I loaded a CSV datafile with many variables using the GAUSS function `loadd()`

. The first row of the CSV file contains the variable names, including date variable. I loaded first the dependent variable as follows:

`y = loadd(fname, "wage");`

and then I know I could load all the variables except `wage`

using:

`x = loadd(fname, ". -wage").`

However, I do not want to create `x`

, but selecting some variables from my CSV file. For example, some of the variables contained in my CSV file are `x1, x2, x3, x4, x5,..., x80`

. How can select `x1, x2, x10, x15, x79 and x80`

from my CSV file? Can GAUSS do it?

## 3 Answers

0

Yes, you can do it in GAUSS and I will post a code snippet with a GAUSS `proc`

which will work. However, in most cases, you are actually better off to load all the variables in a single matrix.

If you would like to explore that option (which may well serve you much better), please post a new question asking something like "Is there a way to avoid using column vectors" or something like that and we will gladly help you to find a better solution.

Here is the code snippet to create the local variables, though I don't recommend it. The code snippet assumes that you have a file named `mydata.csv`

in your current working directory which contains at least six variables: `x1`

, `x2`

, `x10`

, `x15`

, `x79`

and `x80`

.

```
new;
cls;
fname = "mydata.csv";
// Create new global variables in your GAUSS workspace,
// named 'x1', 'x2', 'x10', 'x15', 'x79' and 'x80'
// from 'mydata.csv'.
loadvars(fname, "x1 + x2 + x10 + x15 + x79 + x80");
/*
** Creates global column vectors.
**
** fname String, name of a data file (CSV,DAT,XLSX,SAS,STAT)
** which can be properly loaded by 'loadd'.
**
** vars Simple formula string which can only contain
** variable names and the '+' operator. i.e.
** "GDP + Inflation + Exports".
**
** No transformations are allowed.
**
*/
proc (0) = loadvars(fname, vars);
local x;
// Load specified variables as colmns of 'x'
x = loadd(fname, vars);
// Split the forumla string into
// a column vector, making splits at each '+'
// sign.
vars = strsplit(vars, "+");
// Trim any empty space from either side of
// the variable name
vars = strtrim(vars);
for i(1, cols(x), 1);
// Create each GLOBAL column vector
call varput(x[.,i], vars[i]);
endfor;
endp;
```

0

OK and thank you for your help. Could you please explain how to use those global variables? I thought that after creating these column vector variables, I could concatenate them using `~`

to form a matrix x, where `x = x1~x2~x10~x15~x79~x80`

, but it did not work. I am trying form `x`

, since I want to compute LS estimates using these specific independent variables. So, I need to form `xtx = x'x`

.

0

If you want to create a matrix `x`

, where `x = x1 ~ x2 ~ x10 ~ x15 ~ x79 ~ x80`

. Then all you need is `loadd`

.

```
new;
cls;
fname = "mydata.csv";
// Create new global variable named 'x' in your GAUSS workspace,
// with 6 columns which correspond to the variables: 'x1', 'x2',
// 'x10', 'x15', 'x79' and 'x80' in the file 'mydata.csv'
x = loadd(fname, "x1 + x2 + x10 + x15 + x79 + x80");
// Add a column of ones so you can estimate the intercept
x = ones(rows(x), 1) ~ x;
// Create moment matrix
xtx = x'x;
```

Based on your last message, I am pretty sure that this is exactly what you are looking for.

### Possible problems with global column vectors

I think you will be fine just using the solution above and can probably ignore this. However, for completeness, here is an answer to the global variable problem. My best guess is that you did something like this:

```
new;
cls;
fname = "mydata.csv";
// Create new global variables in your GAUSS workspace,
// named 'x1', 'x2', 'x10', 'x15', 'x79' and 'x80'
// from 'mydata.csv'.
loadvars(fname, "x1 + x2 + x10 + x15 + x79 + x80");
// Concatenate all column vectors into one matrix
x = x1 ~ x2 ~ x10 ~ x15 ~ x79 ~ x80;
/*
** Creates global column vectors.
**
** fname String, name of a data file (CSV,DAT,XLSX,SAS,STAT)
** which can be properly loaded by 'loadd'.
**
** vars Simple formula string which can only contain
** variable names and the '+' operator. i.e.
** "GDP + Inflation + Exports".
**
** No transformations are allowed.
**
*/
proc (0) = loadvars(fname, vars);
local x;
// Load specified variables as colmns of 'x'
x = loadd(fname, vars);
// Split the forumla string into
// a column vector, making splits at each '+'
// sign.
vars = strsplit(vars, "+");
// Trim any empty space from either side of
// the variable name
vars = strtrim(vars);
for i(1, cols(x), 1);
// Create each GLOBAL column vector
call varput(x[.,i], vars[i]);
endfor;
endp;
```

And you got an error message like this `Undefined symbol: 'x1'`

. If that is what happened, the reason is that GAUSS turns the compiles and checks the code for errors before it runs the code.

One of the errors it checks for is `undefined symbols`

. This is a very good thing for you because otherwise, you could have some code running for hours or days and then have it fail because you misspelled a variable name at the end of the code.

This can be resolved by either using `varindxi`

with the `open`

command when opening a GAUSS dataset or by using the `declare`

command.

The `declare`

command allows you to tell GAUSS that it may not see a particular variable, but you will create it during the program so GAUSS does not have to return an error. `declare`

can also set a default value, or assign a particular value if the variable has not otherwise been set.

`varindxi`

is a flag passed to the `open`

command when you open a GAUSS dataset. It tells GAUSS to create variables to index the columns of the dataset. These index variables will be `i`

and then your variable name. So if you loaded a matrix from a GAUSS dataset with three variables `GDP`

, `Inflation`

and `Exports`

, then after the load you would have three new scalar values as well `iGDP = 1`

, `iInflation = 2`

and `iExports = 3`

, because `GDP`

would be the first column of the loaded matrix.