Bayesian Importance Sampling

Introduction

In this tutorial, we will

  1. Write a GAUSS procedure for evaluating a univariate t-density at a specified point.
  2. Write a GAUSS procedure for getting a random draw from the t(n) distribution.
  3. Calculate the posterior mean and standard deviation of a $\theta \text{ ~ } N(0,1)$ using Monte Carlo integration.
  4. Calculate the posterior mean and standard deviation of $\theta$ using importance sampling with an importance function of $t(0, 1, v)$.
  5. Calculate the mean and standard deviation of the importance function.

Background

Importance sampling is a Bayesian estimation technique which estimates a parameter by drawing from a specified importance function rather than drawing directly from the joint posterior distribution. Under the condition that the importance function's support includes the posterior's support, then

$$\hat{g} = \frac{\sum_{r=1}^{R} w(\theta^{(r)})g(\theta^{(r)})}{\sum_{r=1}^{R} w(\theta^{(r)})}$$

where $w(\theta^{(r)})$ is the importance weight given by

$$w(\theta^{(r)}) = \frac{p(\theta^{(r)}|y)}{q(\theta^{(r)})}$$

where $q(\theta^{(r)})$ is the importance function.

There are several steps to implementing the importance sampling algorithm.

  1. Set up sampler specifications including the number of iterations, the parameters of the importance function, and initialize all Monte Carlo and importance sums to zero.
  2. Draw $\theta$ from the normal posterior for the Monte Carlo integration.
  3. Draw $\theta$ from the importance function.
  4. Calculate the importance sampling weight $w(\theta^{(r)}) = \frac{p(\theta^{(r)}|y)}{q(\theta^{(r)})}$ and the weight squared.
  5. Repeat 2-4 for the desired number of iterations.
  6. Average the draws and squared draws to find the Monte Carlo estimates.
  7. Find the weighted average of the draws and squared draws.
  8. Find the mean and standard deviation of the weights.

GAUSS procedures

Our first step in this tutorial will be creating two GAUSS procedures which will be used to calculate the importance weights. GAUSS procedures are user-defined functions which allow you to organize and easily reuse multiple lines of commands in a compact manner. A GAUSS procedure definition must always begin with the procstatement and end with the endp statement.

proc (number_returns) = procedure_name(argument_list);
    local variable_1, variable_2, ...., variable_k;

    . . .
    . . .
   retp(return_list);
endp;

Create procedures to compute t random variates and the univariate t density

We will define two procedures, one for taking a random draw from the t(0, 1, v) distribution and another for evaluating the t-distribution at a point. To create a function for drawing from the t distribution we start with fact that a random variable from the t-distribution is given by

$$T = Z\sqrt{\frac{v}{V}}$$

where $Z$ is a standard normal random variable with zero mean and variance 1 and V is a chi-squared random variable with $v$ degrees of freedom. Therefore, we can build a function for drawing a t-distribution random variable using GAUSS's rndn command to draw from the standard normal distribution and the rndChiSquare.

//Clear the GAUSS workspace
new;
cls;

//Define procedure to draw t-distributed random numbers
proc (1) = rndTDist(n, df);
    //Local variables used only inside procedure
    local z, x, t;

    //Z drawn standard normal distribution
    z = rndn(n,1);

    //Draw from the chi-squared distribution
    x = rndChiSquare(n, 1, df);

    //Compute the t-distribution RV
    t = (z*sqrt(df))./sqrt(x);

    retp(t);
endp;     

Next, we need to a procedure to evaluate the t-distribution at a point. The p.d.f for a general t-distribution, $t(\mu, \Sigma, \nu)$, with mean $\mu$, scale matrix $\Sigma$, and $\nu$ degrees-of-freedom is given by

$$\frac{1}{c_t}|\Sigma|^{-\frac{1}{2}}\big[\nu + (y - \mu)'\Sigma(y - \mu)\big]^{-\frac{\nu + \kappa}{2}}$$

where

$$c_t = \frac{\pi^{\frac{\kappa}{2}}\Gamma\big(\frac{\nu}{2}\big)}{\nu^{\frac{\nu}{2}}\Gamma\big(\frac{\nu + \kappa}{2}\big)}$$

We will set $\kappa$ equal to one and simplify the entry of this into GAUSS by taking the first taking the log of everything, then taking the exponential of our result. First, the log of $c_t$

$$log\big(\Sigma^\frac{1}{2}c_t\big) = \frac{1}{2}\log\big(\Sigma\big) + \frac{1}{2}log\big(pi\big) + log\Big(\Gamma\big(\frac{\nu}{2}\big)\Big) - \frac{\nu}{2}log\big(\nu\big) - log\Big(\Gamma\big(\frac{\nu + 1}{2}\big)\Big)$$

In GAUSS the lnfact command can be used to find $\log\big(\Gamma(\theta)\big)$ using the fact that $log\big(\Gamma\big(\theta\big)\big) = lnfact(\theta-1)$.

Taking the log of the density we find

$$log(t(\mu, \Sigma, \nu)) = -log\big(c_t\big) - \frac{1}{2}log\big(|\Sigma|\big) + log\big(\nu + \frac{(y - \mu)^2}{\Sigma}\big)$$

//Define a procedure to compute the t PDF
proc (1) = tDensity(apoint, a_mean, acapv, adof);

    local ln_constant, dens;    
    //evaluate univariate t density with arguments amean, acapv and adof at apoint

    ln_constant =.5*ln(acapv) + .5*ln(pi) + lnfact(.5*adof-1) - lnfact(.5*(adof+1)-1) - .5*adof*ln(adof);
    dens = -.5*(adof+1)*ln(adof+((apoint-a_mean)^2)/acapv) - ln_constant;

    retp(exp(dens));
endp;

Once we have written these procedures we can call these functions the same as we would any GAUSS function. For example, we can draw a 2x1 vector of values from a standard t-distribution with 2 degrees of freedom

//Set random number seed for repeatable random numbers
rndseed 34532;

t = rndTDist(2, 2);
print "Random draws from the t-distribution" 
t;
Random draws from the t-distribution
 -0.0378
 -2.0359 

Set sampler specifications

With our t-distributions written, we are now ready to set up sampler specifications. We need to specify the number of iterations, the parameters of the importance function and initialize all Monte Carlo and importance sums to zero.

//Number of iterations for Monte Carlo and Importance sampling
keep_draws = 10000;

//Mean of importance function
mean_is = 0;

//Scale factor of imporatance function
scale_is = 1;

//Degrees of freedom of importance function
dof_is = 2;

//Start all MC and Importance sampling sums at zero
theta_mean_mc = 0;
th2mo_mc = 0;
theta_mean_is = 0;
th2mo_is = 0;
wsum = 0;
wsum2 = 0;

Compute importance sampling

These two functions will be used inside our for loop which we are now ready to build. Within our for loop, we will combine our Monte Carlo integration and importance sampler. For the Monte Carlo sampler, we will

  1. Make a current draw from the $N(0,1)$ distribution.
  2. Add the current draw to the sum of all Monte Carlo draws.
  3. Add the square of the current draw to the sum of all squares of the Monte Carlo draws.

For the importance sampler, we will

  1. Make a current draw from the $t(\mu, \Sigma, \nu)$ distribution.
  2. Compute the importance weight using $w(\theta^{(r)}) = \frac{p(\theta^{(r)}|y)}{q(\theta^{(r)})}$.
  3. Add the current importance sampler draw to the sum of all importance sampler draws.
  4. Add the square of the current importance sampler draw to the sum of all squares of the importance sampler draws.
  5. Add the current importance weight draw to the sum of all importance weights.
  6. Add the square of the current importance weight to the sum of the square of all importance weights.
for i(1, keep_draws, 1);

    //Monte Carlo draw
    //Current draw from N(0,1)
    theta_draw = rndn(1, 1);    

    // Add current draw to sum of MC draws
    theta_mean_mc = theta_mean_mc + theta_draw;

    //Add square of current draw to sum of squares of MC draws
    th2mo_mc = th2mo_mc + theta_draw^2;

    //Importance sampling draw.
    //Current importance sampler draw from t-distribution
    theta_draw = mean_is + sqrt(scale_is)*rndTDist(1, dof_is);

    //Importance sampling weight. Note: tdens evaluates the t density at the point drawn
    //by the importance sampler
    w = pdfn(theta_draw)/tDensity(theta_draw, mean_is, scale_is, dof_is);

    //Weighted sum of the importance sampler draws
    theta_mean_is = theta_mean_is + w*theta_draw;

    //Weighted sum of the square of the importance sampler draws
    th2mo_is = th2mo_is + w*theta_draw^2;

    //Find sum of weights 
    wsum = wsum + w;

    //Find sum of squared weights
    wsum2 = wsum2 + w^2;

endfor;

Finally, we will find the sample averages to estimate our parameters of interest.

//Monte Carlo Statistics
//Mean
theta_mean_mc = theta_mean_mc / keep_draws;

//Standard deviation
th2mo_mc = th2mo_mc / keep_draws;
thsd_mc = sqrt(th2mo_mc - theta_mean_mc^2);

print "Monte Carlo Posterior Mean and Standard Deviation";
theta_mean_mc thsd_mc;

//Importance Sampler
//Mean
theta_mean_is = theta_mean_is / wsum;

//Standard deviation
th2mo_is = th2mo_is / wsum;
thsd_is = sqrt(th2mo_is - theta_mean_is^2);

print "Importance Sampling Posterior Mean and Standard Deviation";
theta_mean_is thsd_is;

//Importance sampler weights
//Mean
wmean = wsum / keep_draws;

//Standard deviation
wstd = sqrt(wsum2/keep_draws - wmean^2);

print "Mean and standard deviation of importance sampling weights";
print wmean wstd;

The code above will produce the following output.

Monte Carlo Posterior Mean and Standard Deviation
 -0.0002   0.9900
Importance Sampling Posterior Mean and Standard Deviation
  0.0034   1.0016
Mean and standard deviation of importance sampling weights
  1.0012   0.3793

Note: procedures tDensity and pdfTDist are based upon code by James LeSage, University of Toledo.

Have a Specific Question?

Get a real answer from a real person

Need Support?

Get help from our friendly experts.

REQUEST A FREE QUOTE

Thank you for your interest in the GAUSS family of products.

© Aptech Systems, Inc. All rights reserved.

Privacy Policy | Sitemap