Application 3.2: Parameter instability
This application estimates several regressions that, following the corresponding test of parameter stability, allow the structural parameters of the regression model to vary across the sample.
Wage differentiation by sex (cross-sectional data)
As in Application 2.5, this example takes as its starting econometric specification the following log-linear model, which corresponds to the Mincer wage equation:
\[ \log(SALARIO_{i}) = \beta_1 + \beta_2 EDUC_{i} + \beta_3 EXPER_{i} + e_{i} \]
where \(SALARIO\), \(EDUC\) and \(EXPER\) denote, respectively, the wage received, the level of education and the amount of experience of each of the 526 individuals in the sample.
Code
# Load libraries
library(tidyverse)
library(car)
# Read data
SAL_SEX <- read_csv("data/SAL_SEX.csv")
dim(SAL_SEX)
[1] 526 4
Code
str(SAL_SEX)
spc_tbl_ [526 × 4] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ EDUC : num [1:526] 11 12 11 8 12 16 18 12 12 17 ...
$ EXPER : num [1:526] 2 22 2 44 7 9 15 5 26 22 ...
$ MUJER : num [1:526] 1 1 0 0 0 0 0 1 1 0 ...
$ SALARIO: num [1:526] 3.1 3.24 3 6 5.3 ...
- attr(*, "spec")=
.. cols(
.. EDUC = col_double(),
.. EXPER = col_double(),
.. MUJER = col_double(),
.. SALARIO = col_double()
.. )
- attr(*, "problems")=<externalptr>
Code
head(SAL_SEX)
# A tibble: 6 × 4
EDUC EXPER MUJER SALARIO
<dbl> <dbl> <dbl> <dbl>
1 11 2 1 3.1
2 12 22 1 3.24
3 11 2 0 3
4 8 44 0 6
5 12 7 0 5.3
6 16 9 0 8.75
Code
tail(SAL_SEX)
# A tibble: 6 × 4
EDUC EXPER MUJER SALARIO
<dbl> <dbl> <dbl> <dbl>
1 12 2 0 5.65
2 16 14 1 15
3 10 2 1 2.27
4 15 13 0 4.67
5 16 5 0 11.6
6 14 5 1 3.5
Code
# Distribution of the dependent variable for the full sample
Boxplot(~SALARIO, data=SAL_SEX, main="", ylab="SALARIO", id=list(method = "none"))
Code
# Convert the numeric variable MUJER (0/1) to a qualitative type (factor)
class(SAL_SEX$MUJER)
[1] "numeric"
Code
SEXO <- factor(SAL_SEX$MUJER, labels=c("Hombre", "Mujer"))
summary(SEXO)
Hombre Mujer
274 252
Code
# Distribution of the dependent variable by sex
Boxplot(SALARIO~SEXO, data=SAL_SEX, ylab="SALARIO", id=list(method = "none"))
Code
# Wage equation for the full sample
summary(lm_SAL <- lm(log(SALARIO) ~ EDUC + EXPER , data = SAL_SEX))
Call:
lm(formula = log(SALARIO) ~ EDUC + EXPER, data = SAL_SEX)
Residuals:
Min 1Q Median 3Q Max
-2.05800 -0.30136 -0.04539 0.30601 1.44425
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.216854 0.108595 1.997 0.0464 *
EDUC 0.097936 0.007622 12.848 < 2e-16 ***
EXPER 0.010347 0.001555 6.653 7.24e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4614 on 523 degrees of freedom
Multiple R-squared: 0.2493, Adjusted R-squared: 0.2465
F-statistic: 86.86 on 2 and 523 DF, p-value: < 2.2e-16
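In a log-linear specification, a slope coefficient multiplied by 100 approximates the percentage change in the wage for a one-unit change in the regressor, while the exact figure is 100·(exp(β) − 1). The following minimal sketch applies both formulas to the full-sample estimates reported above. The helper name `pct_effect` is illustrative and is not part of the original application.

```python
import math

# Full-sample estimates copied from the regression output above
b_educ = 0.097936
b_exper = 0.010347

def pct_effect(beta):
    """Exact percentage change in SALARIO for a one-unit increase in the regressor."""
    return 100 * (math.exp(beta) - 1)

educ_pct = pct_effect(b_educ)
exper_pct = pct_effect(b_exper)

print(f"EDUC:  approximate {100 * b_educ:.2f}%, exact {educ_pct:.2f}%")
print(f"EXPER: approximate {100 * b_exper:.2f}%, exact {exper_pct:.2f}%")
```

For small coefficients such as that of EXPER the two figures nearly coincide; the gap widens as the coefficient grows.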
Code
# Partial plots differentiated by sex
scatterplot(log(SALARIO) ~ EDUC| SEXO, data=SAL_SEX, smooth=FALSE,
boxplots=FALSE, ylab="log(Salario)")
Code
scatterplot(log(SALARIO) ~ EXPER| SEXO, data=SAL_SEX, smooth=FALSE,
boxplots=FALSE, ylab="log(Salario)")
Code
# Differentiation by sex:
# is there really a statistically significant wage gap between the sexes?
# Men
summary(lm_SAL_h <- lm(log(SALARIO) ~ EDUC + EXPER ,
data = SAL_SEX[which(SAL_SEX$MUJER==0),]))
Call:
lm(formula = log(SALARIO) ~ EDUC + EXPER, data = SAL_SEX[which(SAL_SEX$MUJER ==
0), ])
Residuals:
Min 1Q Median 3Q Max
-1.20460 -0.29936 -0.01032 0.28558 1.25532
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.249794 0.144264 1.732 0.0845 .
EDUC 0.101813 0.009656 10.544 < 2e-16 ***
EXPER 0.014908 0.002148 6.941 2.87e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4457 on 271 degrees of freedom
Multiple R-squared: 0.3106, Adjusted R-squared: 0.3055
F-statistic: 61.04 on 2 and 271 DF, p-value: < 2.2e-16
Code
# Women
summary(lm_SAL_m <- lm(log(SALARIO) ~ EDUC + EXPER ,
data = SAL_SEX[which(SAL_SEX$MUJER==1),]))
Call:
lm(formula = log(SALARIO) ~ EDUC + EXPER, data = SAL_SEX[which(SAL_SEX$MUJER ==
1), ])
Residuals:
Min 1Q Median 3Q Max
-1.96983 -0.21953 -0.06965 0.21896 1.22467
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.334836 0.141456 2.367 0.0187 *
EDUC 0.082314 0.010458 7.871 1.08e-13 ***
EXPER 0.004116 0.001894 2.173 0.0307 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.399 on 249 degrees of freedom
Multiple R-squared: 0.1996, Adjusted R-squared: 0.1932
F-statistic: 31.06 on 2 and 249 DF, p-value: 9.089e-13
Code
# Comparison of estimated coefficients
compareCoefs(lm_SAL_h, lm_SAL_m)
Calls:
1: lm(formula = log(SALARIO) ~ EDUC + EXPER, data =
SAL_SEX[which(SAL_SEX$MUJER == 0), ])
2: lm(formula = log(SALARIO) ~ EDUC + EXPER, data =
SAL_SEX[which(SAL_SEX$MUJER == 1), ])
Model 1 Model 2
(Intercept) 0.250 0.335
SE 0.144 0.141
EDUC 0.10181 0.08231
SE 0.00966 0.01046
EXPER 0.01491 0.00412
SE 0.00215 0.00189
Code
# Chow test for structural change
lm_SAL_int <- lm(log(SALARIO) ~ (EDUC + EXPER)*MUJER, data = SAL_SEX)
anova(lm_SAL, lm_SAL_int)
Analysis of Variance Table
Model 1: log(SALARIO) ~ EDUC + EXPER
Model 2: log(SALARIO) ~ (EDUC + EXPER) * MUJER
Res.Df RSS Df Sum of Sq F Pr(>F)
1 523 111.345
2 520 93.477 3 17.868 33.133 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
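The F statistic in the ANOVA table can be reproduced by hand from the two residual sums of squares. The sketch below is only an arithmetic check using the rounded figures printed above; the variable names are illustrative.

```python
# Values copied from the ANOVA table above
rss_restricted = 111.345     # pooled model (523 residual degrees of freedom)
rss_unrestricted = 93.477    # model with MUJER interactions (520 residual df)
q = 3                        # restrictions tested: MUJER, EDUC:MUJER, EXPER:MUJER
df_unrestricted = 520

# Chow statistic: relative increase in RSS caused by imposing equal parameters
chow_f = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_unrestricted)
print(f"Chow F = {chow_f:.3f}")
```

Because the inputs are rounded, the result agrees with the reported value of 33.133 only up to the third decimal place.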
Code
# Regression differentiated by sex (interaction variables)
summary(lm_SAL_int)
Call:
lm(formula = log(SALARIO) ~ (EDUC + EXPER) * MUJER, data = SAL_SEX)
Residuals:
Min 1Q Median 3Q Max
-1.96983 -0.26177 -0.03718 0.25663 1.25532
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.249794 0.137237 1.820 0.069308 .
EDUC 0.101813 0.009186 11.084 < 2e-16 ***
EXPER 0.014908 0.002043 7.296 1.11e-12 ***
MUJER 0.085042 0.203534 0.418 0.676247
EDUC:MUJER -0.019499 0.014417 -1.352 0.176818
EXPER:MUJER -0.010792 0.002868 -3.763 0.000187 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.424 on 520 degrees of freedom
Multiple R-squared: 0.3698, Adjusted R-squared: 0.3637
F-statistic: 61.03 on 5 and 520 DF, p-value: < 2.2e-16
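The interaction model nests the two separate regressions: the coefficients without MUJER describe the reference group (men), and adding the MUJER terms gives the equation for women. As a quick check under that reading, the sketch below adds the estimates printed above and recovers the coefficients of the women-only regression. The dictionary names are illustrative.

```python
# Estimates copied from summary(lm_SAL_int) above
base = {"(Intercept)": 0.249794, "EDUC": 0.101813, "EXPER": 0.014908}
shift = {"(Intercept)": 0.085042, "EDUC": -0.019499, "EXPER": -0.010792}

# Men are the reference group (MUJER = 0); women add the MUJER terms
women = {name: base[name] + shift[name] for name in base}

for name in base:
    print(f"{name:12s} men = {base[name]:.6f}   women = {women[name]:.6f}")
```

The point estimates coincide with those of the split-sample regressions, but the standard errors differ because the interaction model imposes a common error variance for both groups.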
Code
# Load libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.stats as smstats
import scipy as sp
# Read data
SAL_SEX = pd.read_csv("data/SAL_SEX.csv")
SAL_SEX.shape
(526, 4)
Code
SAL_SEX.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 526 entries, 0 to 525
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 EDUC 526 non-null int64
1 EXPER 526 non-null int64
2 MUJER 526 non-null int64
3 SALARIO 526 non-null float64
dtypes: float64(1), int64(3)
memory usage: 16.6 KB
Code
SAL_SEX.head()
EDUC EXPER MUJER SALARIO
0 11 2 1 3.10
1 12 22 1 3.24
2 11 2 0 3.00
3 8 44 0 6.00
4 12 7 0 5.30
Code
SAL_SEX.tail()
EDUC EXPER MUJER SALARIO
521 16 14 1 15.00
522 10 2 1 2.27
523 15 13 0 4.67
524 16 5 0 11.56
525 14 5 1 3.50
Code
# Distribution of the dependent variable for the full sample
plt.figure(1)
sns.boxplot(x=SAL_SEX["SALARIO"])
plt.show()
Code
# Convert the numeric variable to a qualitative type (category)
SAL_SEX['SEXO']=SAL_SEX['MUJER'].astype('category')
SAL_SEX['SEXO']=SAL_SEX['SEXO'].cat.rename_categories(['Hombre', 'Mujer'])
# Distribution of the dependent variable by sex
plt.figure(2)
sns.boxplot(x=SAL_SEX["SALARIO"] , y=SAL_SEX["SEXO"])
plt.show()
Code
# Wage equation for the full sample
lm_SAL = smf.ols(formula='np.log(SALARIO) ~ EDUC + EXPER', data=SAL_SEX).fit()
print(lm_SAL.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(SALARIO) R-squared: 0.249
Model: OLS Adj. R-squared: 0.246
Method: Least Squares F-statistic: 86.86
Date: Sun, 09 Feb 2025 Prob (F-statistic): 2.68e-33
Time: 13:52:09 Log-Likelihood: -338.01
No. Observations: 526 AIC: 682.0
Df Residuals: 523 BIC: 694.8
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 0.2169 0.109 1.997 0.046 0.004 0.430
EDUC 0.0979 0.008 12.848 0.000 0.083 0.113
EXPER 0.0103 0.002 6.653 0.000 0.007 0.013
==============================================================================
Omnibus: 7.740 Durbin-Watson: 1.789
Prob(Omnibus): 0.021 Jarque-Bera (JB): 9.485
Skew: 0.165 Prob(JB): 0.00872
Kurtosis: 3.569 Cond. No. 130.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# Partial plots differentiated by sex
SAL_SEX['l_SALARIO']=np.log(SAL_SEX['SALARIO'])
plt.figure(3)
sns.lmplot(x="EDUC", y="l_SALARIO", hue="SEXO", data=SAL_SEX);
plt.show()
Code
plt.figure(4)
sns.lmplot(x="EXPER", y="l_SALARIO", hue="SEXO", data=SAL_SEX);
plt.show()
Code
# Differentiation by sex:
# is there really a statistically significant wage gap between the sexes?
# Men
lm_SAL_h = smf.ols(formula='np.log(SALARIO) ~ EDUC + EXPER',
                   subset=(SAL_SEX['MUJER'] == 0), data=SAL_SEX).fit()
print(lm_SAL_h.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(SALARIO) R-squared: 0.311
Model: OLS Adj. R-squared: 0.305
Method: Least Squares F-statistic: 61.04
Date: Sun, 09 Feb 2025 Prob (F-statistic): 1.30e-22
Time: 13:52:11 Log-Likelihood: -165.86
No. Observations: 274 AIC: 337.7
Df Residuals: 271 BIC: 348.5
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 0.2498 0.144 1.732 0.084 -0.034 0.534
EDUC 0.1018 0.010 10.544 0.000 0.083 0.121
EXPER 0.0149 0.002 6.941 0.000 0.011 0.019
==============================================================================
Omnibus: 0.921 Durbin-Watson: 1.872
Prob(Omnibus): 0.631 Jarque-Bera (JB): 0.674
Skew: 0.098 Prob(JB): 0.714
Kurtosis: 3.143 Cond. No. 131.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# Women
lm_SAL_m = smf.ols(formula='np.log(SALARIO) ~ EDUC + EXPER',
                   subset=(SAL_SEX['MUJER'] == 1), data=SAL_SEX).fit()
print(lm_SAL_m.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(SALARIO) R-squared: 0.200
Model: OLS Adj. R-squared: 0.193
Method: Least Squares F-statistic: 31.06
Date: Sun, 09 Feb 2025 Prob (F-statistic): 9.09e-13
Time: 13:52:11 Log-Likelihood: -124.54
No. Observations: 252 AIC: 255.1
Df Residuals: 249 BIC: 265.7
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 0.3348 0.141 2.367 0.019 0.056 0.613
EDUC 0.0823 0.010 7.871 0.000 0.062 0.103
EXPER 0.0041 0.002 2.173 0.031 0.000 0.008
==============================================================================
Omnibus: 17.664 Durbin-Watson: 2.103
Prob(Omnibus): 0.000 Jarque-Bera (JB): 56.098
Skew: 0.007 Prob(JB): 6.58e-13
Kurtosis: 5.311 Cond. No. 133.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# Regression differentiated by sex (interaction variables)
lm_SAL_int = smf.ols(formula='np.log(SALARIO) ~ (EDUC + EXPER)*MUJER',
                     data=SAL_SEX).fit()
print(lm_SAL_int.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(SALARIO) R-squared: 0.370
Model: OLS Adj. R-squared: 0.364
Method: Least Squares F-statistic: 61.03
Date: Sun, 09 Feb 2025 Prob (F-statistic): 5.27e-50
Time: 13:52:11 Log-Likelihood: -292.01
No. Observations: 526 AIC: 596.0
Df Residuals: 520 BIC: 621.6
Df Model: 5
Covariance Type: nonrobust
===============================================================================
coef std err t P>|t| [0.025 0.975]
-------------------------------------------------------------------------------
Intercept 0.2498 0.137 1.820 0.069 -0.020 0.519
EDUC 0.1018 0.009 11.084 0.000 0.084 0.120
EXPER 0.0149 0.002 7.296 0.000 0.011 0.019
MUJER 0.0850 0.204 0.418 0.676 -0.315 0.485
EDUC:MUJER -0.0195 0.014 -1.352 0.177 -0.048 0.009
EXPER:MUJER -0.0108 0.003 -3.763 0.000 -0.016 -0.005
==============================================================================
Omnibus: 12.139 Durbin-Watson: 1.802
Prob(Omnibus): 0.002 Jarque-Bera (JB): 22.040
Skew: 0.063 Prob(JB): 1.64e-05
Kurtosis: 3.995 Cond. No. 333.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# Chow test for structural change
sm.stats.anova_lm(lm_SAL,lm_SAL_int)
df_resid ssr df_diff ss_diff F Pr(>F)
0 523.0 111.344709 0.0 NaN NaN NaN
1 520.0 93.476534 3.0 17.868174 33.132916 1.306612e-19
Exports before and after Spain's accession to the European Union (time series data)
Both economic theory and international econometric evidence indicate that a country's exports of goods and services depend on two basic variables: an indicator of world economic activity and an indicator of relative prices. To estimate an export equation for Spain, historical data series for the period 1970-1997 were taken for the following variables: real exports of goods and services, excluding tourism (\(XGS\)); the annual index of world real gross domestic product (\(WGDP\)); and the real effective exchange rate against the ten main currencies, adjusted by the ratio of Spain's export prices to the weighted average of the export prices of the main countries (\(REER\)).
The regression model used to support this example is the following:
\[ \log(XGS_{t}) = \beta_1 + \beta_2 \log(WGDP_{t}) + \beta_3 \log(REER_{t}) + e_{t} \]
Code
# Load libraries
library(tidyverse)
library(car)
# Read data
EXP_ESP <- read_delim("data/EXP_ESP_Y.csv", delim = ";")
# Split the sample into preUE (1970-1985) and postUE (1986-1997)
Y1986 = match(as.Date("1986-01-01"), EXP_ESP$date)
Y1986
[1] 17
Code
# Numeric and qualitative subperiod variables
UE <- factor(c(rep(0, 16), rep(1, 12)), labels=c("preUE", "postUE"))
UE
[1] preUE preUE preUE preUE preUE preUE preUE preUE preUE preUE
[11] preUE preUE preUE preUE preUE preUE postUE postUE postUE postUE
[21] postUE postUE postUE postUE postUE postUE postUE postUE
Levels: preUE postUE
Code
class(UE)
[1] "factor"
Code
EXP_ESP$D1986 <- as.numeric(UE)-1
EXP_ESP$D1986
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
Code
class(EXP_ESP$D1986)
[1] "numeric"
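The subperiod dummy can also be derived directly from the calendar years instead of hard-coding the 16 and 12 repetitions. The sketch below assumes only that the sample consists of the 28 annual observations from 1970 to 1997 described in the text; the variable names are illustrative.

```python
# Annual observations 1970-1997 (28 in total)
years = list(range(1970, 1998))

# Dummy equal to 1 from 1986 onwards (post-accession period), 0 otherwise
d1986 = [1 if year >= 1986 else 0 for year in years]

# 1-based position of the first post-accession observation, as R would report it
first_post = d1986.index(1) + 1
n_pre, n_post = d1986.count(0), d1986.count(1)
print(first_post, n_pre, n_post)
```

The position of the first post-accession observation corresponds to `Y1986`, and the two counts correspond to the sizes of the preUE and postUE subsamples.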
Code
# Time series format
EXP_ESP_ts <- ts(EXP_ESP[,2:5], start=c(1970), end = c(1997), frequency = 1)
plot(EXP_ESP_ts)
Code
# Export equation for the full period (1970-1997)
lm_X_ESP <- lm(log(XGS) ~ log(WGDP) + log(REER), data = EXP_ESP_ts)
summary(lm_X_ESP)
Call:
lm(formula = log(XGS) ~ log(WGDP) + log(REER), data = EXP_ESP_ts)
Residuals:
Min 1Q Median 3Q Max
-0.08302 -0.04419 -0.02013 0.03883 0.14376
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.24239 0.84432 -0.287 0.776
log(WGDP) 2.04618 0.06235 32.820 <2e-16 ***
log(REER) -0.34878 0.21480 -1.624 0.117
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.06663 on 25 degrees of freedom
Multiple R-squared: 0.9853, Adjusted R-squared: 0.9841
F-statistic: 836.6 on 2 and 25 DF, p-value: < 2.2e-16
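Because both sides of the export equation are in logarithms, the slope coefficients are elasticities. As an illustration, the sketch below converts the full-period estimate for log(WGDP) into the percentage change in exports implied by 1% growth in world GDP, holding REER fixed. The 1% scenario is a hypothetical chosen for the example.

```python
# Full-period estimate for log(WGDP), copied from the output above
b_wgdp = 2.04618
growth = 0.01   # hypothetical 1% increase in world real GDP

# Exact effect implied by the log-log form, and the usual elasticity approximation
exact_pct = 100 * ((1 + growth) ** b_wgdp - 1)
approx_pct = 100 * b_wgdp * growth

print(f"exact: {exact_pct:.3f}%   approximation: {approx_pct:.3f}%")
```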
Code
# Partial plots differentiated by subperiod
scatterplot(log(XGS) ~ log(WGDP)| D1986, data=EXP_ESP_ts,
smooth=FALSE, boxplots=FALSE,
ylab="log(XGS)")
Code
scatterplot(log(XGS) ~ log(REER)| D1986, data=EXP_ESP_ts,
smooth=FALSE, boxplots=FALSE,
ylab="log(XGS)")
Code
# Differentiation by subperiod
# PreUE
preUE <- window(EXP_ESP_ts, start=1970, end = 1985)
lm_X_ESP_preUE <- lm(log(XGS) ~ log(WGDP) + log(REER) , data = preUE)
summary(lm_X_ESP_preUE)
Call:
lm(formula = log(XGS) ~ log(WGDP) + log(REER), data = preUE)
Residuals:
Min 1Q Median 3Q Max
-0.085980 -0.029433 0.003653 0.027247 0.084831
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.5078 1.6493 1.520 0.1523
log(WGDP) 1.8144 0.0846 21.447 1.57e-11 ***
log(REER) -0.6945 0.3187 -2.179 0.0483 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.05124 on 13 degrees of freedom
Multiple R-squared: 0.9763, Adjusted R-squared: 0.9727
F-statistic: 267.8 on 2 and 13 DF, p-value: 2.725e-11
Code
# PostUE
postUE <- window(EXP_ESP_ts, start=1986, end = 1997)
lm_X_ESP_postUE <- lm(log(XGS) ~ log(WGDP) + log(REER) , data = postUE)
summary(lm_X_ESP_postUE)
Call:
lm(formula = log(XGS) ~ log(WGDP) + log(REER), data = postUE)
Residuals:
Min 1Q Median 3Q Max
-0.052680 -0.032117 0.005373 0.029904 0.037074
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.6788 0.8889 -1.889 0.09153 .
log(WGDP) 2.6425 0.1032 25.610 1.02e-09 ***
log(REER) -0.7220 0.1933 -3.735 0.00466 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.03661 on 9 degrees of freedom
Multiple R-squared: 0.9871, Adjusted R-squared: 0.9842
F-statistic: 344.6 on 2 and 9 DF, p-value: 3.133e-09
Code
# Comparison of estimated coefficients
compareCoefs(lm_X_ESP_preUE,lm_X_ESP_postUE)
Calls:
1: lm(formula = log(XGS) ~ log(WGDP) + log(REER), data = preUE)
2: lm(formula = log(XGS) ~ log(WGDP) + log(REER), data = postUE)
Model 1 Model 2
(Intercept) 2.508 -1.679
SE 1.649 0.889
log(WGDP) 1.8144 2.6425
SE 0.0846 0.1032
log(REER) -0.694 -0.722
SE 0.319 0.193
Code
# Is there statistically significant differentiation across periods?
# Chow test for structural change
# Method 1 (ANOVA)
lm_X_EXP_int <- lm(log(XGS) ~ (log(WGDP) + log(REER))*D1986,
                   data = EXP_ESP_ts)
summary(lm_X_EXP_int)
Call:
lm(formula = log(XGS) ~ (log(WGDP) + log(REER)) * D1986, data = EXP_ESP_ts)
Residuals:
Min 1Q Median 3Q Max
-0.085980 -0.030598 0.004582 0.029904 0.084831
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.50775 1.47498 1.700 0.1032
log(WGDP) 1.81445 0.07566 23.982 < 2e-16 ***
log(REER) -0.69447 0.28504 -2.436 0.0234 *
D1986 -4.18659 1.84754 -2.266 0.0336 *
log(WGDP):D1986 0.82803 0.14968 5.532 1.47e-05 ***
log(REER):D1986 -0.02750 0.37389 -0.074 0.9420
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.04583 on 22 degrees of freedom
Multiple R-squared: 0.9939, Adjusted R-squared: 0.9925
F-statistic: 713.6 on 5 and 22 DF, p-value: < 2.2e-16
Code
anova(lm_X_ESP, lm_X_EXP_int)
Analysis of Variance Table
Model 1: log(XGS) ~ log(WGDP) + log(REER)
Model 2: log(XGS) ~ (log(WGDP) + log(REER)) * D1986
Res.Df RSS Df Sum of Sq F Pr(>F)
1 25 0.110991
2 22 0.046204 3 0.064787 10.283 0.0001978 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
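The same statistic follows from the classical form of the Chow test, which compares the pooled residual sum of squares with the sum of those from the two subperiod regressions. The sketch below rebuilds each subperiod RSS from the residual standard error and degrees of freedom printed in the preUE and postUE outputs, so it matches the ANOVA result only up to rounding. The variable names are illustrative.

```python
# Subperiod RSS = (residual standard error)^2 * residual degrees of freedom
rss_pre = 0.05124 ** 2 * 13     # 1970-1985 regression
rss_post = 0.03661 ** 2 * 9     # 1986-1997 regression
rss_pooled = 0.110991           # full-period regression (ANOVA table)

k = 3    # parameters per regime: intercept, log(WGDP), log(REER)
t = 28   # total number of observations

# Unrestricted RSS is the sum over the two regimes
rss_split = rss_pre + rss_post
chow = ((rss_pooled - rss_split) / k) / (rss_split / (t - 2 * k))

print(f"RSS split = {rss_split:.6f}, Chow F = {chow:.3f}")
```

The split RSS is close to the 0.046204 reported for the interaction model, which is what one expects when every coefficient is allowed to differ between regimes.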
Code
# Method 2 (strucchange library)
library(strucchange)
sctest(log(XGS) ~ log(WGDP) + log(REER), data=EXP_ESP_ts,
type = "Chow", point = Y1986-1)
Chow test
data: log(XGS) ~ log(WGDP) + log(REER)
F = 10.283, p-value = 0.0001978
Code
# Chow-type tests based on recursive estimates
# [a percentage of observations is trimmed at each end]
sbtest <- Fstats(log(XGS) ~ log(WGDP) + log(REER), data = EXP_ESP_ts,
                 from = 0.15, to = 0.85)
sbtest[["Fstats"]]
Time Series:
Start = 1973
End = 1992
Frequency = 1
[1] 26.94858 40.37538 52.72550 53.14715 43.28653 41.27132 51.76225 53.52206
[9] 43.97098 39.04442 33.07102 30.47271 30.84853 32.31100 37.23453 41.05085
[17] 41.68393 41.71401 38.20854 33.80421
Code
plot(sbtest, alpha = 0.05)
Code
# Chow (1960) test [Chi2 version]
# [known break point: 17 - 4 (15% trimmed on the left) ]
Chow_F <- sbtest$Fstats[Y1986-4]
Chow_F
[1] 30.84853
Code
# It can be checked that CHOW=Chow_F/K
# Chow_F has an asymptotic Chi^2 distribution, whereas
# CHOW has an exact F_K,T-2*K distribution
pval <- 1-pchisq(Chow_F,sbtest$nreg)
pval
[1] 9.148179e-07
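The relationship between the two versions of the statistic can be checked numerically. The sketch below divides the Chi-square version by K = 3 to recover the F version, and evaluates the upper tail of a Chi-square with 3 degrees of freedom using its closed form (valid for 3 degrees of freedom only), so no statistical library is needed. The function name `chi2_sf_3df` is illustrative.

```python
import math

chow_chi2 = 30.84853   # Chow_F taken from the sequence of F statistics above
k = 3                  # number of regression coefficients (sbtest$nreg)

# F version of the Chow statistic
chow_f = chow_chi2 / k

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

pval = chi2_sf_3df(chow_chi2)
print(f"CHOW = {chow_f:.3f}, asymptotic p-value = {pval:.6e}")
```

The asymptotic p-value is much smaller than the exact F-based one (0.0001978), which is unsurprising with only 28 observations.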
Code
# Andrews (1993) and Andrews and Ploberger (1994) tests: supF, aveF, expF
# [unknown break point]
sctest(sbtest, type = "supF")
supF test
data: sbtest
sup.F = 53.522, p-value = 2.556e-10
Code
sctest(sbtest, type = "aveF")
aveF test
data: sbtest
ave.F = 40.323, p-value = 1.043e-13
Code
sctest(sbtest, type = "expF")
expF test
data: sbtest
exp.F = 24.845, p-value = 0.0007622
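The three statistics are simple functionals of the sequence of F statistics: the maximum, the mean, and the log of the mean of exp(F/2). The sketch below recomputes them from the values printed earlier, on the assumption that these are the definitions applied here (they do reproduce the reported figures). The p-values rely on non-standard asymptotic distributions and are not recomputed.

```python
import math

# F statistics copied from sbtest[["Fstats"]] (candidate break dates 1973-1992)
fstats = [26.94858, 40.37538, 52.72550, 53.14715, 43.28653, 41.27132,
          51.76225, 53.52206, 43.97098, 39.04442, 33.07102, 30.47271,
          30.84853, 32.31100, 37.23453, 41.05085, 41.68393, 41.71401,
          38.20854, 33.80421]

sup_f = max(fstats)                                   # supF
ave_f = sum(fstats) / len(fstats)                     # aveF
exp_f = math.log(sum(math.exp(f / 2) for f in fstats) / len(fstats))  # expF

# Date label attached to the largest F statistic in the series
sup_year = 1973 + fstats.index(sup_f)

print(f"supF = {sup_f:.3f} (at {sup_year}), aveF = {ave_f:.3f}, expF = {exp_f:.3f}")
```

Note that the largest statistic is not located at the 1986 accession date, which illustrates the difference between testing at a known break point and searching over all admissible dates.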
Code
# CUSUM test (Brown, Durbin and Evans, 1975)
plot(efp(log(XGS) ~ log(WGDP) + log(REER), data = EXP_ESP_ts))
Code
# Regression differentiated by time period
summary(lm(log(XGS) ~ (log(WGDP) + log(REER))*D1986, data=EXP_ESP_ts))
Call:
lm(formula = log(XGS) ~ (log(WGDP) + log(REER)) * D1986, data = EXP_ESP_ts)
Residuals:
Min 1Q Median 3Q Max
-0.085980 -0.030598 0.004582 0.029904 0.084831
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.50775 1.47498 1.700 0.1032
log(WGDP) 1.81445 0.07566 23.982 < 2e-16 ***
log(REER) -0.69447 0.28504 -2.436 0.0234 *
D1986 -4.18659 1.84754 -2.266 0.0336 *
log(WGDP):D1986 0.82803 0.14968 5.532 1.47e-05 ***
log(REER):D1986 -0.02750 0.37389 -0.074 0.9420
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.04583 on 22 degrees of freedom
Multiple R-squared: 0.9939, Adjusted R-squared: 0.9925
F-statistic: 713.6 on 5 and 22 DF, p-value: < 2.2e-16
Code
# Load libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scipy as sp
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.stats as smstats
# Read data
EXP_ESP = pd.read_csv("data/EXP_ESP_Y.csv", sep=";",
                      parse_dates=['date'], index_col='date')
date = pd.date_range(start = '1970', periods = len(EXP_ESP.index), freq = 'Y')
EXP_ESP.index = date
EXP_ESP.index
DatetimeIndex(['1970-12-31', '1971-12-31', '1972-12-31', '1973-12-31',
'1974-12-31', '1975-12-31', '1976-12-31', '1977-12-31',
'1978-12-31', '1979-12-31', '1980-12-31', '1981-12-31',
'1982-12-31', '1983-12-31', '1984-12-31', '1985-12-31',
'1986-12-31', '1987-12-31', '1988-12-31', '1989-12-31',
'1990-12-31', '1991-12-31', '1992-12-31', '1993-12-31',
'1994-12-31', '1995-12-31', '1996-12-31', '1997-12-31'],
dtype='datetime64[ns]', freq='YE-DEC')
Code
EXP_ESP.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 28 entries, 1970-12-31 to 1997-12-31
Freq: YE-DEC
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 XGS 28 non-null float64
1 REER 28 non-null float64
2 WGDP 28 non-null float64
dtypes: float64(3)
memory usage: 896.0 bytes
Code
EXP_ESP.head()
XGS REER WGDP
1970-12-31 2225.123047 100.495876 104.200000
1971-12-31 2541.091064 101.909402 109.410000
1972-12-31 2881.596924 107.146269 115.646370
1973-12-31 3169.757080 108.713460 123.972908
1974-12-31 3138.059082 108.003898 127.072231
Code
EXP_ESP.tail()
XGS REER WGDP
1993-12-31 9579.586 114.520445 225.825635
1994-12-31 11181.767 109.674712 235.084486
1995-12-31 12299.895 116.021269 243.782612
1996-12-31 13604.854 121.631204 253.777699
1997-12-31 15611.863 116.142901 264.436363
Code
# Numeric and qualitative subperiod variables
EXP_ESP['D1986'] = np.where(EXP_ESP.index > '1985-12-31', 1, 0)
EXP_ESP['UE']=EXP_ESP['D1986'].astype('category')
EXP_ESP['UE']=EXP_ESP['UE'].cat.rename_categories(['preUE', 'postUE'])
# Export equation for the full period (1970-1997)
lm_X_ESP = smf.ols(formula='np.log(XGS) ~ np.log(WGDP) + np.log(REER)',
                   data=EXP_ESP).fit()
print(lm_X_ESP.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(XGS) R-squared: 0.985
Model: OLS Adj. R-squared: 0.984
Method: Least Squares F-statistic: 836.6
Date: Sun, 09 Feb 2025 Prob (F-statistic): 1.26e-23
Time: 13:52:12 Log-Likelihood: 37.697
No. Observations: 28 AIC: -69.39
Df Residuals: 25 BIC: -65.40
Df Model: 2
Covariance Type: nonrobust
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
Intercept -0.2424 0.844 -0.287 0.776 -1.981 1.497
np.log(WGDP) 2.0462 0.062 32.820 0.000 1.918 2.175
np.log(REER) -0.3488 0.215 -1.624 0.117 -0.791 0.094
==============================================================================
Omnibus: 2.960 Durbin-Watson: 0.290
Prob(Omnibus): 0.228 Jarque-Bera (JB): 2.618
Skew: 0.693 Prob(JB): 0.270
Kurtosis: 2.434 Cond. No. 485.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# Partial plots differentiated by subperiod
EXP_ESP['l_XGS']=np.log(EXP_ESP['XGS'])
EXP_ESP['l_WGDP']=np.log(EXP_ESP['WGDP'])
EXP_ESP['l_REER']=np.log(EXP_ESP['REER'])
plt.figure(5)
sns.lmplot(x="l_WGDP", y="l_XGS", hue="UE", data=EXP_ESP);
plt.show()
Code
plt.figure(6)
sns.lmplot(x="l_REER", y="l_XGS", hue="UE", data=EXP_ESP);
plt.show()
Code
# Differentiation by subperiod
# PreUE
lm_X_ESP_preUE = smf.ols(formula='np.log(XGS) ~ np.log(WGDP) + np.log(REER)',
                         subset=(EXP_ESP['D1986'] == 0), data=EXP_ESP).fit()
print(lm_X_ESP_preUE.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(XGS) R-squared: 0.976
Model: OLS Adj. R-squared: 0.973
Method: Least Squares F-statistic: 267.8
Date: Sun, 09 Feb 2025 Prob (F-statistic): 2.72e-11
Time: 13:52:14 Log-Likelihood: 26.496
No. Observations: 16 AIC: -46.99
Df Residuals: 13 BIC: -44.68
Df Model: 2
Covariance Type: nonrobust
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
Intercept 2.5077 1.649 1.520 0.152 -1.055 6.071
np.log(WGDP) 1.8144 0.085 21.447 0.000 1.632 1.997
np.log(REER) -0.6945 0.319 -2.179 0.048 -1.383 -0.006
==============================================================================
Omnibus: 0.094 Durbin-Watson: 0.755
Prob(Omnibus): 0.954 Jarque-Bera (JB): 0.314
Skew: 0.078 Prob(JB): 0.855
Kurtosis: 2.331 Cond. No. 899.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# PostUE
lm_X_ESP_postUE = smf.ols(formula='np.log(XGS) ~ np.log(WGDP) + np.log(REER)',
                          subset=(EXP_ESP['D1986'] == 1), data=EXP_ESP).fit()
print(lm_X_ESP_postUE.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(XGS) R-squared: 0.987
Model: OLS Adj. R-squared: 0.984
Method: Least Squares F-statistic: 344.6
Date: Sun, 09 Feb 2025 Prob (F-statistic): 3.13e-09
Time: 13:52:14 Log-Likelihood: 24.386
No. Observations: 12 AIC: -42.77
Df Residuals: 9 BIC: -41.32
Df Model: 2
Covariance Type: nonrobust
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
Intercept -1.6788 0.889 -1.889 0.092 -3.690 0.332
np.log(WGDP) 2.6425 0.103 25.610 0.000 2.409 2.876
np.log(REER) -0.7220 0.193 -3.735 0.005 -1.159 -0.285
==============================================================================
Omnibus: 3.241 Durbin-Watson: 1.134
Prob(Omnibus): 0.198 Jarque-Bera (JB): 1.223
Skew: -0.318 Prob(JB): 0.543
Kurtosis: 1.572 Cond. No. 620.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
# Regression differentiated by time period
lm_X_ESP_int = smf.ols(formula='np.log(XGS) ~ (np.log(WGDP) + np.log(REER))*D1986',
                       data=EXP_ESP).fit()
print(lm_X_ESP_int.summary())
OLS Regression Results
==============================================================================
Dep. Variable: np.log(XGS) R-squared: 0.994
Model: OLS Adj. R-squared: 0.992
Method: Least Squares F-statistic: 713.6
Date: Sun, 09 Feb 2025 Prob (F-statistic): 1.46e-23
Time: 13:52:14 Log-Likelihood: 49.966
No. Observations: 28 AIC: -87.93
Df Residuals: 22 BIC: -79.94
Df Model: 5
Covariance Type: nonrobust
======================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------------
Intercept 2.5077 1.475 1.700 0.103 -0.551 5.567
np.log(WGDP) 1.8144 0.076 23.982 0.000 1.658 1.971
np.log(REER) -0.6945 0.285 -2.436 0.023 -1.286 -0.103
D1986 -4.1866 1.848 -2.266 0.034 -8.018 -0.355
np.log(WGDP):D1986 0.8280 0.150 5.532 0.000 0.518 1.138
np.log(REER):D1986 -0.0275 0.374 -0.074 0.942 -0.803 0.748
==============================================================================
Omnibus: 0.097 Durbin-Watson: 0.854
Prob(Omnibus): 0.953 Jarque-Bera (JB): 0.319
Skew: 0.000 Prob(JB): 0.853
Kurtosis: 2.477 Cond. No. 2.08e+03
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.08e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
Code
# Chow test for structural change
sm.stats.anova_lm(lm_X_ESP,lm_X_ESP_int)
df_resid ssr df_diff ss_diff F Pr(>F)
0 25.0 0.110991 0.0 NaN NaN NaN
1 22.0 0.046204 3.0 0.064787 10.282842 0.000198