Common models include the following. Threshold regression model (threshold regression):
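Bruce E. Hansen has made many contributions to threshold regression. In 1996 he published "Inference when a nuisance parameter is not identified under the null hypothesis" in Econometrica, which proposed estimation and testing for the time-series threshold autoregressive (TAR) model. He kept working on threshold models afterwards and published several classic papers, notably "Threshold effects in non-dynamic panels: Estimation, testing and inference" (1999), "Sample splitting and threshold estimation" (2000), and the co-authored "Instrumental Variable Estimation of a Threshold Model" (2004).
Hansen (1999) was the first to present econometric methods for panel threshold models with individual effects. The method determines the threshold value by minimizing the residual sum of squares and tests the significance of the threshold, which avoids the bias of setting a structural break point subjectively. The idea is as follows: choose a variable as the threshold variable; use the threshold value(s) found by the search to split the regression model into several regimes, each with its own regression equation; assign the sample observations to the regimes defined by the thresholds; and, after estimation, compare how the coefficients change across regimes.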
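The search for the threshold value described above can be sketched in a few lines. This is a minimal illustration, not Hansen's code: it assumes a stripped-down model with one regressor, no intercept and no individual effects, y = b1*x*1[q <= gamma] + b2*x*1[q > gamma] + e, and the function names and simulated data are hypothetical.

```python
import random

def ssr_given_threshold(y, x, q, gamma):
    """SSR of the two-regime model at a fixed gamma (OLS slope fitted per regime)."""
    total = 0.0
    for in_low in (True, False):
        idx = [i for i in range(len(y)) if (q[i] <= gamma) == in_low]
        sxx = sum(x[i] * x[i] for i in idx)
        sxy = sum(x[i] * y[i] for i in idx)
        beta = sxy / sxx if sxx > 0 else 0.0
        total += sum((y[i] - beta * x[i]) ** 2 for i in idx)
    return total

def estimate_threshold(y, x, q):
    """Return the observed value of q that minimizes the SSR."""
    candidates = sorted(set(q))[:-1]  # drop the maximum so both regimes are non-empty
    return min(candidates, key=lambda g: ssr_given_threshold(y, x, q, g))

# Simulated data (hypothetical): slope 1 when q <= 0.5, slope 3 otherwise.
rng = random.Random(0)
n = 200
q = [rng.random() for _ in range(n)]
x = [1.0 + rng.random() for _ in range(n)]
y = [(1.0 if q[i] <= 0.5 else 3.0) * x[i] + rng.gauss(0, 0.1) for i in range(n)]

gamma_hat = estimate_threshold(y, x, q)
print(round(gamma_hat, 3))
```

Because the SSR is constant between two adjacent observed values of q, it is enough to evaluate it at the observed values only.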
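Here S0 is the residual sum of squares under the null hypothesis. Because the LM statistic does not follow a standard distribution, Hansen (2000) proposed obtaining its asymptotic distribution by the bootstrap and computing the corresponding p-value, known as the bootstrap p-value.
The basic idea is as follows: holding the explanatory variables and the threshold value fixed, simulate a series for the dependent variable drawn from N(0, e^2), where e is the residual from equation (4). Each bootstrap sample yields one simulated LM statistic. This process is repeated 1,000 times. According to Hansen (1996), the share of simulations in which the simulated LM statistic exceeds the value from equation (6) is the bootstrap estimate of the p-value. The bootstrap p-value is read like the p-value of an ordinary test. For example, a bootstrap p-value below 0.01 means that the LM test rejects the null hypothesis at the 1% significance level, and so on.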
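The bootstrap procedure can be sketched as follows. This is an illustration under stated assumptions, not the routine used by any Stata command: the model has a single regressor with no intercept, the test statistic is written in the F form (S0 - S1) / (S1 / (n - 2)), the residuals are taken from the fitted threshold model, and all function names and the simulated data are hypothetical. The example uses 300 replications to keep the run short; the text above uses 1,000.

```python
import random

def fit_slope(y, x, idx):
    """OLS slope of y on x (no intercept) over idx; returns (slope, SSR)."""
    sxx = sum(x[i] * x[i] for i in idx)
    sxy = sum(x[i] * y[i] for i in idx)
    beta = sxy / sxx if sxx > 0 else 0.0
    return beta, sum((y[i] - beta * x[i]) ** 2 for i in idx)

def sup_f(y, x, q, candidates):
    """Return (F, residuals) at the SSR-minimizing candidate threshold."""
    n = len(y)
    _, s0 = fit_slope(y, x, range(n))  # SSR under the null of no threshold
    best = None
    for g in candidates:
        low = [i for i in range(n) if q[i] <= g]
        high = [i for i in range(n) if q[i] > g]
        b_low, s_low = fit_slope(y, x, low)
        b_high, s_high = fit_slope(y, x, high)
        if best is None or s_low + s_high < best[0]:
            best = (s_low + s_high, g, b_low, b_high)
    s1, g, b_low, b_high = best
    resid = [y[i] - (b_low if q[i] <= g else b_high) * x[i] for i in range(n)]
    return (s0 - s1) / (s1 / (n - 2)), resid

def bootstrap_p_value(y, x, q, candidates, reps, rng):
    """Share of simulated statistics that exceed the observed one."""
    f_obs, resid = sup_f(y, x, q, candidates)
    exceed = 0
    for _ in range(reps):
        # Simulated dependent variable: y*_i ~ N(0, e_i^2), regressors held fixed.
        y_star = [rng.gauss(0.0, abs(e)) for e in resid]
        f_star, _ = sup_f(y_star, x, q, candidates)
        exceed += f_star > f_obs
    return exceed / reps

# Simulated data (hypothetical) with a strong threshold effect at q = 0.5.
rng = random.Random(42)
n = 60
q = [rng.random() for _ in range(n)]
x = [1.0 + rng.random() for _ in range(n)]
y = [(1.0 if q[i] <= 0.5 else 3.0) * x[i] + rng.gauss(0, 0.1) for i in range(n)]
candidates = sorted(q)[6:54:3]  # trimmed, thinned grid of candidate thresholds

p_value = bootstrap_p_value(y, x, q, candidates, reps=300, rng=rng)
print(p_value)
```

A small p-value here means that few simulated statistics are as large as the observed one, which is evidence against the null hypothesis of no threshold effect.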
In Stata 15, threshold regression is run with the threshold command. The syntax is:
threshold depvar [indepvars] [if] [in], threshvar(varname) [options]
Here depvar is the dependent variable and indepvars are the explanatory variables. The required option threshvar(varname) specifies varname as the threshold variable.
nthresholds(#) sets the number of thresholds. The default is one threshold, nthresholds(1); nthresholds(2), for example, specifies two thresholds. This option is not allowed with optthresh().
optthresh(#[, ictype]) selects the optimal number of thresholds less than or equal to #. This option is not allowed with nthresholds(). Three information criteria are available: the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the Hannan-Quinn information criterion (HQIC). BIC is the default.
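Choosing the number of thresholds by an information criterion can be illustrated with a short sketch. It assumes the textbook Gaussian-regression form BIC = n*ln(SSR/n) + k*ln(n) and a parameter count of (m + 1) * params_per_region for m thresholds; the formula Stata applies internally may differ, and the SSR values below are made-up numbers for illustration only.

```python
import math

def bic(ssr, n_obs, n_params):
    """Gaussian-regression BIC up to an additive constant."""
    return n_obs * math.log(ssr / n_obs) + n_params * math.log(n_obs)

def pick_num_thresholds(ssr_by_m, n_obs, params_per_region):
    """Return (best m, BIC per m); m thresholds give m + 1 regions."""
    scores = {
        m: bic(ssr, n_obs, (m + 1) * params_per_region)
        for m, ssr in ssr_by_m.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical SSRs for models with 0 to 3 thresholds.
ssr_by_m = {0: 120.0, 1: 80.0, 2: 78.5, 3: 78.0}
best_m, scores = pick_num_thresholds(ssr_by_m, n_obs=200, params_per_region=3)
print(best_m)
```

In this example the first threshold lowers the SSR enough to pay for its extra parameters, while the second and third do not, so the penalty term stops the search from adding them.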
Menu path: Statistics > Time series > Threshold regression model
webuse usmacro
threshold fedfunds, regionvars(l.fedfunds inflation ogap) threshvar(l2.ogap)
threshold fedfunds, regionvars(l.fedfunds inflation ogap) threshvar(l2.ogap) optthresh(5)
Bootstrap test method (the xthreg command). The syntax is:
xthreg depvar [indepvars] [if] [in], rx(varlist) qx(varname) [thnum(#) grid(#) trim(numlist) bs(numlist) thlevel(#)
gen(newvarname) noreg nobslog thgiven options]
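depvar is the dependent variable and indepvars are the explanatory variables. qx(varname) is the threshold variable. thnum(#) is the number of thresholds; in the version for Stata 13 it must be at least 1 and at most 3, with a default of 1, so a model can have at most three thresholds.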
rx(varlist) is the regime-dependent variable. Time-series operators are allowed. rx is required.
qx(varname) is the threshold variable. Time-series operators are allowed. qx is required.
thnum(#) is the number of thresholds. In the current version (Stata 13), # must be equal to or less than 3. The default is thnum(1).
grid(#) is the number of grid points. grid is used to avoid consuming too much time when computing large samples. The default is grid(300).
trim(numlist) is the trimming proportion to estimate each threshold. The number of trimming proportions must be equal to the number of thresholds specified in thnum. The default is trim(0.01) for all thresholds. For example, to fit a triple-threshold model, you may set trim(0.01 0.01 0.05).
bs(numlist) is the number of bootstrap replications. If bs is not set, xthreg does not use bootstrap for the threshold-effect test.
thlevel(#) specifies the confidence level, as a percentage, for confidence intervals of the threshold. The default is thlevel(95).
gen(newvarname) generates a new categorical variable with 0, 1, 2, ... for each regime. The default is gen(_cat).
noreg suppresses the display of the regression result.
nobslog suppresses the iteration process of the bootstrap.
thgiven fits the model based on previous results.
options are any options available for [XT] xtreg.
Time-series operators are allowed in depvar, indepvars, rx, and qx.
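The categorical variable produced by gen() can be pictured with a small sketch. It assumes the convention from the indicator I(q <= gamma) in Hansen (1999), so an observation exactly at a threshold falls in the lower regime; how xthreg itself treats ties is not stated in the help text above, and the function name and values here are hypothetical.

```python
from bisect import bisect_left

def regime_category(q_value, thresholds):
    """Regime index 0, 1, 2, ... for one observation.

    thresholds must be sorted in ascending order; q_value <= thresholds[0]
    maps to regime 0, and a value equal to a threshold stays in the lower regime.
    """
    return bisect_left(thresholds, q_value)

thresholds = [0.2, 0.6]             # hypothetical double-threshold estimates
q = [0.05, 0.2, 0.35, 0.6, 0.9]     # hypothetical threshold-variable values
cats = [regime_category(v, thresholds) for v in q]
print(cats)  # [0, 0, 1, 1, 2]
```

With k thresholds the result takes k + 1 distinct values, one per regime, which matches the 0, 1, 2, ... coding described for gen().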
use hansen1999
* Estimate a single-threshold model
xthreg i q1 q2 q3 d1 qd1, rx(c1) qx(d1) thnum(1) trim(0.01) grid(400) bs(300)
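The output has four parts. The first part reports the threshold estimates and the bootstrap results. The second part tabulates the threshold values with their confidence intervals: Th-1 is the estimate from the single-threshold model, and Th-21 and Th-22 are the two estimates from the double-threshold model; Th-21 sometimes coincides with Th-1. The third part reports the threshold-effect tests, including the RSS, MSE, F statistic and its p-value, and the critical values at the 10%, 5%, and 1% levels. The fourth part is the fixed-effects regression result.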
The xtthres command has the following syntax:
xtthres varlist [if] [in] , thres(varname) dthres(varname) [ qn(#) bs1(#) bs2(#) bs3(#) level(#) minobs(#) ]
thres(varname) specifies the threshold variable, denoted by q_it in Hansen (1999). This option is required.
dthres(varname) specifies the variable that shows threshold effects, denoted by x_it in Hansen (1999). This variable is multiplied by the indicator function I(.). This option is also required.
qn(#) specifies the number of distinct values to be searched when finding the optimal threshold estimate, r_hat, which minimizes the sum of squared residuals of the model. The default is 400.
bs1(#), bs2(#), bs3(#) specify the number of bootstrap replications in the single-threshold, double-threshold and triple-threshold models, respectively. The defaults are all 300.
level(#) specifies the confidence level, in percent, for confidence intervals. The default is level(95) or as set by set level; see help level.
minobs(#) specifies the minimum number of observations in each of the regimes when searching for r_hat. The default is 10.
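The roles of qn() and minobs() in the threshold search can be illustrated as follows. This is a sketch of one plausible way to build the candidate set, not the actual xtthres implementation: it keeps only candidates that leave at least minobs observations in each regime and then thins them to at most qn evenly spaced values. The function name and the simulated data are hypothetical.

```python
import random
from bisect import bisect_right

def candidate_thresholds(q, qn=400, minobs=10):
    """Candidate thresholds with >= minobs observations on each side, at most qn values."""
    if qn < 2:
        raise ValueError("qn must be at least 2")
    ordered = sorted(q)
    n = len(ordered)
    feasible = []
    for g in sorted(set(q)):
        n_low = bisect_right(ordered, g)  # observations with q <= g
        if n_low >= minobs and n - n_low >= minobs:
            feasible.append(g)
    if len(feasible) <= qn:
        return feasible
    # Thin the feasible values to qn evenly spaced positions.
    step = (len(feasible) - 1) / (qn - 1)
    return [feasible[round(k * step)] for k in range(qn)]

rng = random.Random(1)
q = [rng.random() for _ in range(1000)]   # hypothetical threshold variable
grid = candidate_thresholds(q, qn=50, minobs=30)
print(len(grid), round(grid[0], 3), round(grid[-1], 3))
```

A larger minobs narrows the search range at both tails, while a smaller qn makes the search faster at the cost of a coarser grid.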
xtthres tobin size tang prof, th(grow) d(tl)
xtthres tobin size tang prof, th(grow) d(tl) bs2(200) bs3(100) minobs(30)
xtthres tobin size tang prof if year<=2001, th(grow) d(tl) qn(200)
cd E:stataresults // set the working directory where the output is saved
use E:statapersonal18datahansen1999, clear // load the Hansen (1999) data
*-Table 1: Summary statistics
tabstat i q1 c1 d1, s(min p25 p50 p75 max) format(%6.3f) c(s)
xtthres i q1 q2 q3 d1 qd1, th(d1) d(c1) min(120) bs1(300) bs2(300) bs3(200)
