I just wanted to check a couple of things and would greatly appreciate a response.
1. My first question is how to replicate the relative gradients that are printed when I run maxLik using BFGS. I could replicate the numerical gradient with my analytical gradient, but I could not replicate the relative gradient.
I suppose it is either the gradient of each parameter divided by the parameter value itself, or the gradient of each parameter divided by the parameter and multiplied by the value of the function. Is that correct?
2. My second question: if a parameter value goes toward zero during the iterations, is it reasonable for its relative gradient to become very large? Is this because the relative gradient is obtained by dividing the gradient by the parameter value? Theoretically, the gradient with respect to the parameter in question goes to zero as the parameter goes to zero, and both the numerical and the analytical gradients show this. However, the relative gradient becomes very large as the parameter approaches zero. I would appreciate it if someone could confirm whether this is reasonable behavior or whether there is something more to it.
Thanks and Regards
Maxlik relative gradient
Section 2.3.3 of the Maxlik manual explains the relative gradient:
Convergence is declared when the relative gradient is less than _max_GradTol. The relative gradient is a scaled gradient and is used for determining convergence in order to reduce the effects of scale. It is defined as the absolute value of the gradient times the absolute value of the parameter vector divided by the larger of zero and the absolute value of the function. By default, _max_GradTol = 1e-5.
The code is in the file maxutil.src. You can look for assignments to the variable relgrad.
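To make the quoted definition concrete, here is a minimal Python sketch of it. This is not the GAUSS code from maxutil.src; the function names relative_gradient and has_converged are made up for illustration, and the sketch assumes the scaling is applied element by element and that convergence requires every element to fall below the tolerance. It follows the manual wording literally ("the larger of zero and the absolute value of the function"), so it is undefined when the function value is exactly zero.

```python
def relative_gradient(grad, params, fval):
    """Element-wise |g_i| * |x_i| / max(0, |f|), read literally from the manual text."""
    denom = max(0.0, abs(fval))  # literal reading of the quote; this equals |f|
    if denom == 0.0:
        raise ZeroDivisionError("function value is zero; undefined under this reading")
    return [abs(g) * abs(x) / denom for g, x in zip(grad, params)]


def has_converged(grad, params, fval, grad_tol=1e-5):
    # Assumption: all elements must be below the tolerance (1e-5 is the quoted default).
    return max(relative_gradient(grad, params, fval)) < grad_tol


# Hypothetical numbers: the second parameter sits exactly at zero.
rel = relative_gradient(grad=[0.5, -2.0], params=[4.0, 0.0], fval=-10.0)
print(rel)  # [0.2, 0.0]
```

Under this reading the gradient is multiplied by the parameter, not divided by it, so a parameter near zero would shrink its relative gradient. If the printed values grow as the parameter approaches zero, the assignments to relgrad in maxutil.src are the place to check how the scaling and the denominator are actually computed.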