Chapter 5 First Difference Model and Lagged Dependent Variable Model
Arranging the contents and ordering the terms of this chapter has been difficult and energy-consuming. I have made many revisions, but I have not finished organizing it yet. I am not sure whether my background in psychology has led me to misunderstand things; perhaps I have confused psychometric models with econometric ones. I spent a considerable amount of time demonstrating the econometric assumptions that the change model of psychometrics needs to satisfy.
1 Model Space
As Chapter 3 introduced, panel data have a two-dimensional structure (Xi,t,Yi,t): a cross-sectional dimension (i=1,2,...,N) and a time dimension (t=1,2,...,T). We are usually interested in changes in the same object and want to explain the source of those changes. We can model the development of an object, ΔYi,t, by subtracting the past observation of the dependent variable from the current one, i.e., the current dependent variable Yi,t minus the lagged dependent variable Yi,t−1:
$$\Delta Y_{i,t} = Y_{i,t} - Y_{i,t-1}$$
Similarly, we intuitively guess that changes in the dependent variable over time, ΔYi,t, may be due to changes in the independent variable, ΔXi,t:
$$\Delta X_{i,t} = X_{i,t} - X_{i,t-1}$$
The greatest advantage of panel data is that they provide information on units at multiple points in time, allowing us to use not only the current level values of variables but also their lagged values and first differences to construct models. More importantly, we can combine these variables in different ways to construct a variety of model structures, each corresponding to a theoretical assumption and an estimation strategy.
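The three building blocks mentioned above (level, lagged value, first difference) can be sketched in a few lines. This is only an illustration: the toy panel, the unit labels, and the helper names `lag` and `diff` are hypothetical and not part of the chapter. The point is that lags and differences are taken within each unit, never across units.

```python
# Hypothetical toy panel: unit id -> observations ordered by time t = 1..T.
panel = {
    "A": [2.0, 3.0, 5.0, 4.0],
    "B": [1.0, 1.5, 1.5, 3.0],
}

def lag(series, k=1):
    """Shift a unit's series back by k periods; the first k entries are None."""
    return [None] * k + series[:-k]

def diff(series):
    """First difference: value at t minus value at t-1 (None where undefined)."""
    return [None if prev is None else cur - prev
            for cur, prev in zip(series, lag(series))]

# Each unit is processed separately, so a lag never reaches into another unit.
for unit, y in panel.items():
    d = diff(y)
    print(unit, "level:", y, "lag:", lag(y), "diff:", d, "lagged diff:", lag(d))
```

Every lag or difference costs one observation per unit, which is why models built on lagged changes need at least three periods of data.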
First, I will present the combinations that we consider reasonable, regardless of how they are derived; the derivations will be addressed in later chapters.
Level
Yi,t=βXi,t+εi,t
ΔYi,t=βXi,t+εi,t
ΔYi,t=βYi,t+εi,t
ΔYi,t=βXi,t+γYi,t+εi,t
Lagged Level
Yi,t=βXi,t−1+εi,t
ΔYi,t=βXi,t−1+εi,t
Yi,t=βYi,t−1+εi,t
ΔYi,t=βYi,t−1+εi,t
Yi,t=βXi,t−1+γYi,t−1+εi,t
ΔYi,t=βXi,t−1+γYi,t−1+εi,t
Change
Yi,t=βΔXi,t+εi,t
ΔYi,t=βΔXi,t+εi,t
Yi,t=βΔYi,t+εi,t
ΔYi,t=βΔYi,t+εi,t
Yi,t=βΔXi,t+γΔYi,t+εi,t
ΔYi,t=βΔXi,t+γΔYi,t+εi,t
Lagged Change
Yi,t=βΔXi,t−1+εi,t
ΔYi,t=βΔXi,t−1+εi,t
Yi,t=βΔYi,t−1+εi,t
ΔYi,t=βΔYi,t−1+εi,t
Yi,t=βΔXi,t−1+γΔYi,t−1+εi,t
ΔYi,t=βΔXi,t−1+γΔYi,t−1+εi,t
Level & Change
Yi,t=βXi,t+γΔXi,t+εi,t
Lagged Level & Change
Yi,t=βXi,t−1+γΔXi,t+εi,t
ΔYi,t=ρYi,t+βΔXi,t+εi,t
Level & Lagged Change
ΔYi,t=ρΔYi,t−1+βΔXi,t−1+γXi,t+εi,t
The table above outlines the model space generated by combining different forms of the dependent and explanatory variables (current level, lagged value, and first difference). To represent the model structures uniformly, all error terms in this table are denoted as εi,t. However, it should be noted that in models where ΔYi,t is the dependent variable, this error term should in some cases be interpreted as Δεi,t=εi,t−εi,t−1, and its statistical properties differ significantly from those of the original error term.
Some of the models, such as change on change or level on lagged level, are widely used in applied panel data research, with clear causal logic and established estimation strategies. Others, while algebraically valid, pose logical or econometric challenges. For instance, using Yi,t to explain ΔYi,t risks circular reasoning and endogeneity, because Yi,t appears on both sides of the equation, and should be treated with caution.
There are also less commonly used model forms that, though rarely applied in practice, offer promising theoretical avenues, especially for capturing dynamic effects, inertia, or lagged transmission. I truly believe that we should not only master standard models but also think creatively about variable construction and causal assumptions, paving the way for novel insights in dynamic panel data modeling.
2 First Change Model
In panel data analysis, we often seek to explain the change in a variable between two points in time, such as income growth, attitude shifts, or improvements in educational performance. On the surface, these models may appear to differ only in the form or structure of the variables, but in essence, they represent different theoretical assumptions, error structures, and identification logics.
In this section, we will systematically sort out the following four modeling methods with "change" as the core:
1. Lagged Dependent Variable Model
2. First Change Model
3. First Difference Model
4. Stable Gain Model
Actually, the first change model, sometimes also known as the change score model, comes from pre-post studies in psychology and pedagogy. In panel data analysis it is a descriptive rather than an explanatory model: it does not control for the fixed effect αi, so it is suitable for intervention studies and regression trend analysis, but not for panel modeling.
2.1 Model Specification and Assumptions
The First Change (FC) model uses the raw change in the outcome as the dependent variable and regresses it on the current level of an explanatory variable:
$$\Delta Y_{i,t} = Y_{i,t} - Y_{i,t-1} = \beta X_{i,t} + e_{i,t}$$
The basic idea of the FC model is that the change in Y is caused by the current level of X.
There are three main assumptions for the FC model:
No autocorrelation in error: ei,t∼i.i.d.(0,σ2)
No omitted variable bias from the initial level Yi,t−1
No unobserved confounding between Xit and unmodeled causes of change
I will use econometric frameworks (I hope one day I can use the term "sociometric"), in particular panel data models and the potential outcomes framework, to derive and interpret the three key assumptions of the FC model more rigorously.
2.2 Assumption 1: Strict exogeneity / Unconfoundedness
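Proposition: The error term eit is mean-independent of Xit, and independent of the potential outcomes for all periods (in particular t and t−1). More precisely, E[eit∣Xit]=0.
Introduce the FC model:
$$\Delta Y_{it} = \beta X_{it} + e_{it}$$
where eit contains all the factors that are not modeled.
The OLS estimator of β is:
$$\hat{\beta} = \frac{\sum_{i,t} X_{it}\,\Delta Y_{it}}{\sum_{i,t} X_{it}^{2}} = \beta + \frac{\sum_{i,t} X_{it}\,e_{it}}{\sum_{i,t} X_{it}^{2}}$$
It is consistent, plim β^=β, if the regressor and the error are uncorrelated, namely:
$$E[X_{it}\,e_{it}] = 0$$
which is implied by the conditional expectation form:
$$E[e_{it} \mid X_{it}] = 0$$
or by the stronger condition of independence from probability theory:
$$e_{it} \perp X_{it}$$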
But in change modeling, eit contains the following unmodeled factors:
Yi,t−1: previous values of Y, which may affect the dependence path of Yit;
Zit: unobserved confounding variables;
αi: the time-invariant fixed effect of the unit;
uit: i.i.d. error term.
In order for E[eit∣Xit]=0 to hold, it must be required that:
$$X_{it} \perp \left(Y_{i,t-1},\; Z_{it},\; \alpha_i,\; u_{it}\right)$$
That is, the explanatory variable Xit must be independent of all factors that are not modeled but affect ΔYit.
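For example, if the real data generation process is:
$$Y_{i,t} = \alpha_i + \beta_1 X_{i,t} + \beta_2 Y_{i,t-1} + u_{i,t}$$
which is a dynamic panel model, but we estimate ΔYi,t=Yi,t−Yi,t−1=αi+β1Xi,t+ei,t,
then ei,t=(β2−1)Yi,t−1+ui,t and, provided Xit is uncorrelated with uit, Cov(Xit,eit)=(β2−1)Cov(Xit,Yi,t−1). The covariance vanishes only if β2=1 or Xit is uncorrelated with Yi,t−1.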
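The covariance expression above can be checked numerically. The sketch below is an illustration under stated assumptions, not the chapter's own procedure: it sets αi=0, draws one cross-section, lets Xit depend on Yi,t−1 with a hypothetical loading of 0.5, and uses hypothetical values β1=1.0 and β2=0.6. It then compares Cov(Xit,eit) with (β2−1)Cov(Xit,Yi,t−1) and reports the resulting FC slope.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(0)
n, beta1, beta2 = 50_000, 1.0, 0.6   # hypothetical sample size and parameters

y_lag = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * yl + rng.gauss(0, 1) for yl in y_lag]   # X is correlated with Y_{t-1}
u = [rng.gauss(0, 1) for _ in range(n)]
# Assumed true process: a dynamic model with alpha_i = 0.
y = [beta1 * xi + beta2 * yl + ui for xi, yl, ui in zip(x, y_lag, u)]

dy = [yi - yl for yi, yl in zip(y, y_lag)]
e = [d - beta1 * xi for d, xi in zip(dy, x)]       # error of the FC model

lhs = cov(x, e)
rhs = (beta2 - 1) * cov(x, y_lag)
beta_fc = cov(x, dy) / cov(x, x)                   # OLS slope of dY on X
print(f"Cov(X, e) = {lhs:.3f}, (beta2 - 1) Cov(X, Y_lag) = {rhs:.3f}")
print(f"FC slope = {beta_fc:.3f} (true beta1 = {beta1})")
```

Under these assumptions the two covariances agree up to sampling noise, and the FC slope falls below β1 because β2<1 and X is positively related to the omitted Yi,t−1.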
2.3 Assumption 2: No Dynamic Selection / Exogeneity of Initial Conditions
Proposition: Yi,t−1 has no impact on ΔYi,t or, even if it has an impact on ΔYi,t, it is unrelated to Xi,t.
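If the real data generation process is:
$$Y_{i,t} = \alpha_i + \beta_1 X_{i,t} + \beta_2 Y_{i,t-1} + u_{i,t}$$
but we use ΔYi,t=Yi,t−Yi,t−1=αi+β1Xi,t+ei,t, then ei,t=(β2−1)Yi,t−1+ui,t.
So at least one of the following conditions must hold:
$$\beta_2 = 1 \quad \text{or} \quad \mathrm{Cov}(X_{i,t},\, Y_{i,t-1}) = 0$$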
2.4 Assumption 3: Serially Uncorrelated Errors
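Proposition: The error term eit is serially uncorrelated, that is:
$$\mathrm{Cov}(e_{i,t},\, e_{i,t-1}) = 0$$
Consider the change model we hypothesize:
$$\Delta Y_{it} = \beta X_{it} + e_{it}$$
If the true data generation process is Yit=β1Xit+αi+uit, then after differencing we get:
$$\Delta Y_{it} = \beta_1 \Delta X_{it} + \varepsilon_{it}, \qquad \varepsilon_{it} = u_{it} - u_{i,t-1}$$
Let εi,t−1=ui,t−1−ui,t−2, then
$$\mathrm{Cov}(\varepsilon_{it},\, \varepsilon_{i,t-1}) = \mathrm{Cov}(u_{it} - u_{i,t-1},\; u_{i,t-1} - u_{i,t-2}) = -\mathrm{Var}(u_{i,t-1}) = -\sigma^{2}$$
which indicates that even if ui,t∼i.i.d.(0,σ2), the differenced error εit exhibits first-order negative serial correlation (an MA(1) structure). So the FC model must additionally assume Cov(ei,t−1,eit)=0.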
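The negative first-order correlation of a differenced i.i.d. error can be seen in a short simulation. As a simplifying assumption, the sketch uses one long series of standard normal draws instead of a panel; the helper `autocorr` is hypothetical. Since Var(ut−ut−1)=2σ2 and the lag-1 covariance is −σ2, the lag-1 autocorrelation should be close to −0.5, and the lag-2 autocorrelation close to 0.

```python
import random

def autocorr(z, lag=1):
    """Sample autocorrelation of the series z at the given lag."""
    m = sum(z) / len(z)
    num = sum((z[t] - m) * (z[t - lag] - m) for t in range(lag, len(z)))
    den = sum((v - m) ** 2 for v in z)
    return num / den

rng = random.Random(1)
T = 200_000                                      # hypothetical series length
u = [rng.gauss(0, 1) for _ in range(T)]          # i.i.d. level errors
eps = [u[t] - u[t - 1] for t in range(1, T)]     # differenced errors

rho1 = autocorr(eps, 1)
rho2 = autocorr(eps, 2)
print(f"lag-1 autocorrelation: {rho1:.3f}")
print(f"lag-2 autocorrelation: {rho2:.3f}")
```

The correlation stops after one lag because adjacent differences share exactly one common term, which is what the MA(1) label expresses.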
3 First Difference Model (FDM)
In contrast to the FC model, the first difference model bases its explanation for ΔYit on ΔXit instead of Xit. Before introducing the specification of the first difference model, we consider the general linear model in two periods: Yi,t=βXi,t+εi,t and Yi,t−1=βXi,t−1+εi,t−1. Differencing them gives:
$$Y_{i,t} - Y_{i,t-1} = \beta\,(X_{i,t} - X_{i,t-1}) + (\varepsilon_{i,t} - \varepsilon_{i,t-1})$$
in which we have assumed that the coefficient of X is constant over time, so
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t} + \Delta\varepsilon_{i,t}$$
where the problem lies in whether we should assume εi,t=εi, that is, εi,t−εi,t−1=0, which means the residual term itself does not fluctuate over time. Assuming εi,t=εi,
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t}$$
Clearly, ΔYi,t=βΔXi,t is not a statistical model but a structural identity, which makes statistical inference impossible. Naturally, some of us might think of adding an error term ui,t∼i.i.d. to the right side of this identity to make it statistically inferable, i.e.,
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t} + u_{i,t} \tag{3}$$
Crucially, Model (3) is almost impossible to hold true in the panel data structure. But
$$\Delta Y_{i} = \beta\,\Delta X_{i} + u_{i}$$
will hold true in the cross-sectional and experimental data structures, where it is equivalent to Y=βX+u: formally linear and amenable to consistent estimation and standard inference using OLS.
3.1 Model Specification and Estimation
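From the perspective of the DGP Yi,t=βXi,t+εi,t with εit=αi+eit, differencing gives
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t} + \Delta e_{i,t} \tag{5}$$
where Δεi,t=Δei,t because αi cancels. This is the specification of the First Difference Model (FDM).
Dropping the time index in the two-period case, the OLS estimate of the coefficient β of the FDM is:
$$\hat{\beta}_{FD} = \frac{\sum_{i}(\Delta X_i - \Delta\bar{X})(\Delta Y_i - \Delta\bar{Y})}{\sum_{i}(\Delta X_i - \Delta\bar{X})^{2}} \tag{6}$$
Substituting (5) into (6) yields:
$$\hat{\beta}_{FD} = \beta + \frac{\sum_{i}(\Delta X_i - \Delta\bar{X})\,\Delta e_i}{\sum_{i}(\Delta X_i - \Delta\bar{X})^{2}} \tag{7}$$
For the first difference estimator β^FD to converge in probability to the true value β, the following condition must be satisfied:
$$\operatorname*{plim}_{N\to\infty} \frac{1}{N}\sum_{i}(\Delta X_i - \Delta\bar{X})\,\Delta e_i = 0 \tag{8}$$
and a sufficient condition for (8) to be true is
$$E\big[(\Delta X_i - \Delta\bar{X})\,\Delta e_i\big] = 0 \tag{9}$$
(9) is equivalent to (10), as ΔXˉ is constant and E[Δei]=0:
$$E[\Delta X_i\,\Delta e_i] = 0 \tag{10}$$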
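A small two-period simulation illustrates why differencing matters when the unit effect is correlated with the regressor. Everything here is an assumption made for illustration: the true β=2.0, the way Xit loads on αi, the sample size, and the helper `slope`. The sketch compares the first difference slope with a pooled OLS slope on the levels.

```python
import random

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2)
n, beta = 20_000, 2.0                  # hypothetical sample size and true effect
x1, x2, y1, y2 = [], [], [], []
for _ in range(n):
    alpha = rng.gauss(0, 1)            # time-invariant unit effect
    xa = alpha + rng.gauss(0, 1)       # X in period 1, correlated with alpha
    xb = alpha + rng.gauss(0, 1)       # X in period 2, correlated with alpha
    x1.append(xa)
    x2.append(xb)
    y1.append(alpha + beta * xa + rng.gauss(0, 1))
    y2.append(alpha + beta * xb + rng.gauss(0, 1))

dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]

beta_fd = slope(dx, dy)                    # alpha cancels in the differences
beta_pooled = slope(x1 + x2, y1 + y2)      # alpha stays in the error term
print(f"first difference: {beta_fd:.3f}, pooled levels: {beta_pooled:.3f}")
```

In this design the pooled slope is pulled away from β by Cov(X,α)/Var(X), while the first difference slope is not, because the errors of the differenced equation satisfy condition (10) by construction.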
3.2 Assumption 1: Strict Exogeneity
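Proposition: E[eit∣Xi1,Xi2,αi]=0 for t=1,2
As Δei=ei2−ei1 and ΔXi=Xi2−Xi1, we have E[Δei∣ΔXi]=E[ei2−ei1∣Xi2−Xi1].
To satisfy E[ΔXiΔei]=0, the following must hold:
$$E[(X_{i2} - X_{i1})(e_{i2} - e_{i1})] = E[X_{i2}e_{i2}] - E[X_{i2}e_{i1}] - E[X_{i1}e_{i2}] + E[X_{i1}e_{i1}] = 0$$
which is guaranteed when E[eit∣Xi1,Xi2,αi]=0 for t=1,2. Because the cross terms E[Xi2ei1] and E[Xi1ei2] appear alongside the contemporaneous ones, it must be assumed that Xi1,Xi2 are uncorrelated with both ei1 and ei2, i.e., strict exogeneity is the condition required for effective identification by the FDM.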
3.3 Assumption 2: Fixed Effect
Proposition: αit=αi=constant⇒Δαi=0
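If αit is not constant but changes over time, then:
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t} + \Delta\alpha_{i,t} + \Delta e_{i,t}$$
Apparently, differencing no longer removes the unit effect: Δαi,t remains in the error term, and the estimate of β is biased whenever Δαi,t is correlated with ΔXi,t.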
3.4 Assumption 3: Structurally Independent and Homoskedastic Errors
Proposition: Δei∼i.i.d.(0,σ2)
Var(Δei∣ΔXi)=σ2
4 Conditional Change Model (CCM)
The conditional change model (CCM), also known as the lagged dependent variable (LDV) model, is a widely used approach in panel data analysis. Unlike the first difference model, which focuses on raw changes over time, the CCM explicitly incorporates the lagged value of the dependent variable as a regressor, allowing us to model dynamic adjustments while controlling for past states.
However, astute readers will surely notice that we seem to have assumed a priori a true structure for the panel data, such as Yi,t=αi+β1Xi,t+β2Yi,t−1+ui,t, which will be discussed later.
In Nerlove's (1958) study of farmers' price response, it was found that farmers' supply behavior showed hysteresis in responding to current prices, that is, supply adjustment often lagged behind price changes. The static linear model used at the time could not explain this phenomenon. Nerlove introduced the lagged dependent variable Yt−1 to represent the individual's own "adjustment inertia" or "expectation formation lag" mechanism. Modeled in this way, the model can naturally explain the partial adjustment process, which in its standard form is:
$$Y_t - Y_{t-1} = \lambda\,(Y_t^{*} - Y_{t-1}), \qquad 0 < \lambda \le 1$$
where Y*t is the desired (target) level and λ is the speed of adjustment.
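To make the partial adjustment idea concrete, the sketch below assumes the standard partial adjustment form Yt−Yt−1=λ(Y*−Yt−1) with a fixed target Y*. The function name `partial_adjustment` and the numbers (start at 0, target 10, λ=0.5) are hypothetical choices for illustration only.

```python
def partial_adjustment(y0, target, lam, periods):
    """Iterate Y_t = Y_{t-1} + lam * (target - Y_{t-1}) and return the path."""
    path = [y0]
    for _ in range(periods):
        path.append(path[-1] + lam * (target - path[-1]))
    return path

# Each period closes half of the remaining gap to the target.
path = partial_adjustment(y0=0.0, target=10.0, lam=0.5, periods=5)
print(path)  # [0.0, 5.0, 7.5, 8.75, 9.375, 9.6875]
```

Rearranging the same equation gives Yt=λY*+(1−λ)Yt−1, so the coefficient on the lagged dependent variable, 1−λ, measures inertia: the closer it is to 1, the slower the adjustment.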
The lag term Yt−1 is actually a minimal approximation of a low-dimensional state space. In situations where inertia exists but higher-order differential or latent variable dynamics cannot be modeled, Yt−1 provides a highly practical device capable of approximating more complex system memory. This also explains why lagged dependent variables are nearly indispensable components in social science research across fields.
4.1 Unconditional Change Model
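Starting from the most basic linear cross-sectional regression model, we assume that for any unit i and time point t, the dependent variable Yit is determined by the explanatory variable Xit:
$$Y_{i,t} = \alpha + \beta X_{i,t} + \varepsilon_{i,t} \tag{1}$$
where εi,t is the disturbance term. Now consider the observed value of the same unit in the previous period t−1, whose structure is as follows:
$$Y_{i,t-1} = \alpha + \beta X_{i,t-1} + \varepsilon_{i,t-1}$$
If we difference the equations at the above two time points, we obtain:
$$Y_{i,t} - Y_{i,t-1} = \beta\,(X_{i,t} - X_{i,t-1}) + (\varepsilon_{i,t} - \varepsilon_{i,t-1})$$
i.e.,
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t} + \Delta\varepsilon_{i,t}$$
where Δεi,t=εi,t−εi,t−1. This regression of the change in Y on the change in X is known as the unconditional change score or first difference method of panel analysis.
Assume εi,t=ai+ei,t, then
$$\Delta\varepsilon_{i,t} = (a_i + e_{i,t}) - (a_i + e_{i,t-1}) = \Delta e_{i,t}, \qquad \Delta Y_{i,t} = \beta\,\Delta X_{i,t} + \Delta e_{i,t}$$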
From a statistical perspective, the differencing operation serves not merely as a numerical transformation, but as a structural reparameterization of the data-generating process. By differencing observations of the same unit over time, we effectively eliminate all time-invariant sources of heterogeneity, including both the intercept term α and the unit-specific fixed effects ai, assuming they are constant across periods. This transforms the model into one that estimates the marginal effect β purely from within-unit variation over time, which allows for consistent estimation even when Xi,t is correlated with the fixed effect ai, a common source of omitted variable bias in cross-sectional analyses.
Crucially, this first-differenced model alters the statistical properties of the error term. The composite error Δεi,t=εi,t−εi,t−1 inherits a moving-average (MA(1)) structure, because adjacent differences share the common term εi,t−1. This has direct implications for inference: while the OLS estimator of β remains unbiased under standard assumptions, standard errors computed without accounting for this serial correlation may be inconsistent. Robust inference therefore requires either corrected variance estimation or alternative methods such as GMM, which will be discussed in later sections.
The identification of the change score model rests on the premise of "no dynamic dependence". In other words, the lagged dependent variable (or lagged endogenous variable) Yt−1 is assumed to have no impact on Yt and ΔYt. In many social processes, however, the dependent variable exhibits temporal inertia or lagged feedback, whereby Yt−1 carries predictive power for Yt even after accounting for Xt. Ignoring this leads to model misspecification.
4.2 Conditional Change Model
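Including the lagged dependent variable in regression model (1) yields the so-called conditional change model:
$$Y_{i,t} = \alpha + \beta X_{i,t} + \gamma Y_{i,t-1} + \varepsilon_{i,t}$$
which, by subtracting Yi,t−1 from both sides, can be transformed into
$$\Delta Y_{i,t} = \alpha + \beta X_{i,t} + (\gamma - 1)\,Y_{i,t-1} + \varepsilon_{i,t}$$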
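The level form and the change form of the conditional change model are the same regression written two ways, and this can be verified numerically. The sketch below is an illustration with hypothetical parameter values (intercept 1.0, coefficient 0.8 on X, coefficient 0.5 on the lagged Y) and a hypothetical helper `ols2` that solves the two-regressor normal equations; it is not the chapter's estimation routine.

```python
import random

def ols2(y, x1, x2):
    """OLS slopes of y on x1 and x2 (with intercept), from centered moments."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(3)
n = 5_000                                              # hypothetical sample size
y_lag = [rng.gauss(0, 1) for _ in range(n)]
x = [0.3 * yl + rng.gauss(0, 1) for yl in y_lag]
y = [1.0 + 0.8 * xi + 0.5 * yl + rng.gauss(0, 1) for xi, yl in zip(x, y_lag)]
dy = [yi - yl for yi, yl in zip(y, y_lag)]

b_level, g_level = ols2(y, x, y_lag)      # Y_t on X_t and Y_{t-1}
b_change, g_change = ols2(dy, x, y_lag)   # dY_t on X_t and Y_{t-1}
print(f"level form:  beta = {b_level:.4f}, gamma     = {g_level:.4f}")
print(f"change form: beta = {b_change:.4f}, gamma - 1 = {g_change:.4f}")
```

The coefficient on X is identical in both fits and the coefficient on the lagged Y differs by exactly one, so the choice between the two forms affects interpretation and reported fit, not the estimate of β.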
5 Lagged Variation Model
In panel data analysis, we often focus on the impact of the current or lagged level of the explanatory variable on the dependent variable. Whether for explanatory or dependent variables, the most common approaches are to estimate their first differences or their dynamic adjustment paths. However, one less commonly discussed structure, models with lagged differences as predictors, actually holds significant potential for theoretical interpretation, particularly when analyzing delayed response mechanisms and trend persistence mechanisms.
The basic idea of the lagged difference model is that the trend of change in the explanatory variable, rather than its level, is the key driver of changes or levels in the dependent variable. In other words, the behavior of social individuals or institutions is not directly influenced by the magnitude of a variable's value but rather responds to its past rate or direction of change. This modeling logic can be expressed in two ways, explaining respectively the level or the trend of change in the dependent variable.
The first type of structure is the level on lagged change model, whose basic form is as follows:
$$Y_{i,t} = \beta\,\Delta X_{i,t-1} + \varepsilon_{i,t}$$
The core assumption of this model is that the current state of the dependent variable is not directly determined by the level of the explanatory variable, but rather driven by its trend of change in the previous time period. For example, the social impact of a policy change often does not manifest immediately but gradually emerges over a period of time following its implementation. Similarly, significant fluctuations in household income may not alter happiness scores in the current period but could lead to a reassessment of psychological well-being in the following year. This model is therefore well suited for capturing institutional lagged response mechanisms, emotional adjustment lags, and lagged path transition processes. Since its explanatory variables are the previous period's change terms, it carries weaker endogeneity risks than lagged level models in terms of econometric identification and clearly reflects the "change first, then stabilize" response mechanism.
The second type of structure is the change on lagged change model, whose basic form is as follows:
$$\Delta Y_{i,t} = \beta\,\Delta X_{i,t-1} + \varepsilon_{i,t}$$
Unlike the former model, this model emphasizes the continuity of change trends. Its hypothesized structure is as follows: if the explanatory variable undergoes rapid change at an earlier time point, this trend will continue to influence the current change in the dependent variable. For example, in the process of information diffusion, online public participation rates may rise rapidly in later stages; similarly, sustained growth in education spending may manifest itself in sustained improvement of student academic performance several periods later. This model is suitable for analyzing acceleration effects, lagged mobilization mechanisms, and trend inertia processes in social diffusion. Additionally, it can serve as a linear approximation embedded within more complex diffusion or mobilization models.
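The practical step in either lagged difference model is aligning each observation with the previous period's change within the same unit. The sketch below simulates the level on lagged change form as an illustration; the random-walk process for X, the true β=1.5, the panel dimensions, and the helper `slope` are all assumptions rather than anything taken from the chapter.

```python
import random

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(4)
n_units, T, beta = 5_000, 6, 1.5     # hypothetical panel size and true effect

lagged_dx, y_level = [], []
for _ in range(n_units):
    # X follows a random walk within each unit, so its changes are i.i.d.
    x = [rng.gauss(0, 1)]
    for _ in range(T - 1):
        x.append(x[-1] + rng.gauss(0, 1))
    # dx[t] = X_t - X_{t-1}; undefined (None) in the first period.
    dx = [None] + [x[t] - x[t - 1] for t in range(1, T)]
    # The lagged change dx[t-1] first exists at index t = 2, so the first
    # two periods of every unit drop out of the estimation sample.
    for t in range(2, T):
        lagged_dx.append(dx[t - 1])
        y_level.append(beta * dx[t - 1] + rng.gauss(0, 1))

beta_hat = slope(lagged_dx, y_level)
print(f"usable rows: {len(lagged_dx)}, beta_hat = {beta_hat:.3f}")
```

The change on lagged change form uses the same alignment with ΔYi,t on the left-hand side, at the cost of losing the same two initial periods per unit.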
In areas such as institutional change, policy response, behavioral imitation, group polarization, and risk perception, social behavior is often not a static response to variable levels, but rather a dynamic response to the path of changes in variables. Lagged difference models provide the possibility of capturing this non-instantaneous causal chain.