# Reduced form

In statistics, and particularly in econometrics, the reduced form of a system of equations is the result of solving the system for the endogenous variables. This gives the latter as a function of the exogenous variables, if any. In econometrics, "structural form" models begin from deductive theories of the economy, while "reduced form" models begin by identifying particular relationships between variables.

Let Y and X be random vectors. Y is the vector of the variables to be explained (endogenous variables) by a statistical model and X is the vector of explanatory (exogenous) variables. In addition, let $\varepsilon$ be a vector of error terms. Then the general expression of a structural form is $f(Y, X, \varepsilon) = 0$, where f is a function, possibly from vectors to vectors in the case of a multiple-equation model. The reduced form of this model is given by $Y = g(X, \varepsilon)$, with g a function.

## Structural form

As an example, we use a system of two equations. Both equations are linear. The system models the supply and demand of some specific good. The quantity of the demand varies inversely with the price: a higher price decreases demand. The quantity of the supply varies directly with the price: a higher price makes supply more profitable. In formulas:

supply:    $Q = a_S + b_S P \,$
demand:   $Q = a_D + b_D P \,$

with positive $b_S$ and negative $b_D$. This is the structural form of the equation system: the equations are derived from the theory (in this case, the economic theory of supply and demand).

The two endogenous variables are the traded quantity Q and the price P, which are jointly determined by the two equations of the system. A complete system has as many equations as endogenous variables.

## Reduced form

To find the reduced form, one must solve the equations for the endogenous variables. This reduces the system considerably. For instance, we know that the two right-hand sides of the equations are the same (both equal to Q), and hence $a_S + b_S P = a_D + b_D P$. This can be written as $P ( b_S-b_D) = a_D-a_S$, or $P = (a_D-a_S ) / ( b_S-b_D)$. Thus, P is in fact a fixed number, independent of Q. Below, this number is called $\pi_2$, while the similar number for Q is $\pi_1$:

$Q = \pi_1 \,$
$P = \pi_2 \,$

The structure of supply and demand has disappeared. The two $\pi$ coefficients are the reduced form coefficients. They are easily identified from data on Q and P. However, the four structural form coefficients above cannot be identified from the same data: this is the parameter identification problem.
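The identification point can be illustrated numerically. The following is a minimal sketch: the helper name `equilibrium` and all coefficient values are hypothetical and chosen only for illustration. It solves the two structural equations as in the derivation above and shows two different structures that imply the same observable (Q, P).

```python
from fractions import Fraction

def equilibrium(a_s, b_s, a_d, b_d):
    """Solve Q = a_s + b_s*P (supply) and Q = a_d + b_d*P (demand) for (Q, P)."""
    if b_s == b_d:
        raise ValueError("equal slopes: no unique equilibrium")
    # P = (a_D - a_S) / (b_S - b_D), then Q from the supply equation
    p = Fraction(a_d - a_s) / Fraction(b_s - b_d)
    q = a_s + b_s * p
    return q, p

# Two different hypothetical structures (a_S, b_S, a_D, b_D) ...
structure_1 = (2, 1, 10, -1)
structure_2 = (-2, 2, 18, -3)

# ... that imply exactly the same observable outcome.
print(equilibrium(*structure_1))
print(equilibrium(*structure_2))
```

Both calls return Q = 6 and P = 4, so data on Q and P alone cannot tell the two structures apart.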

It is easily verified that:

$\pi_1 = (a_D b_S - a_S b_D) /(b_S - b_D) \,$
$\pi_2 = ( a_D-a_S ) / ( b_S-b_D ) \,$
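
The verification is a single substitution, spelled out here for completeness. Inserting the solution for P into the supply equation gives:

$\pi_1 = a_S + b_S \pi_2 = a_S + b_S (a_D - a_S) / (b_S - b_D) \,$
$\pi_1 = (a_S b_S - a_S b_D + a_D b_S - a_S b_S) / (b_S - b_D) = (a_D b_S - a_S b_D) / (b_S - b_D) \,$

Substituting into the demand equation instead yields the same expression. Both formulas require $b_S \neq b_D$, which holds here because $b_S$ is positive and $b_D$ is negative.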

## Structural and reduced forms with an exogenous variable

Exogenous variables are variables which are not determined by the system. Suppose that demand is influenced not only by price but also by an exogenous variable, Z. The structural form becomes:

supply:    $Q = a_S + b_S P \,$
demand:   $Q = a_D + b_D P + c Z\,$

In the above set of equations, the choice of the endogenous variables cannot be derived from the equations themselves; the modeller might alternatively have chosen, for instance, Q and Z as endogenous variables, which would make P the exogenous variable.

This structural model can be rewritten in the reduced form:

$Q = \pi_{10} + \pi_{11} Z\,$
$P = \pi_{20} + \pi_{21} Z\,$

As before, the four reduced form ($\pi$) coefficients can be derived from the five structural form coefficients. Note that both endogenous variables depend on the exogenous variable Z.
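
For reference, the derivation can be spelled out as follows. This is a worked sketch obtained by equating the two right-hand sides as in the previous section, assuming $b_S \neq b_D$. Solving $a_S + b_S P = a_D + b_D P + c Z$ for P gives:

$\pi_{20} = (a_D - a_S) / (b_S - b_D) \,$
$\pi_{21} = c / (b_S - b_D) \,$

and substituting P into the supply equation gives:

$\pi_{10} = a_S + b_S \pi_{20} = (a_D b_S - a_S b_D) / (b_S - b_D) \,$
$\pi_{11} = b_S \pi_{21} = b_S c / (b_S - b_D) \,$

Setting c = 0 recovers the coefficients $\pi_1$ and $\pi_2$ of the system without Z.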

By combining the two reduced form equations to eliminate Z, the structural coefficients of the supply side model ($a_S$ and $b_S$) can be derived from the four reduced form coefficients ($\pi_{10}$, $\pi_{11}$, $\pi_{20}$ and $\pi_{21}$):

$a_S = (\pi_{10}\pi_{21} - \pi_{11}\pi_{20}) / \pi_{21}$
$b_S = \pi_{11} / \pi_{21}$
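
The recovery of the supply parameters can be checked with a round trip. The sketch below is illustrative only: the function names and coefficient values are hypothetical, and the reduced form coefficients are computed by equating the right-hand sides of the two structural equations.

```python
from fractions import Fraction

def reduced_form(a_s, b_s, a_d, b_d, c):
    """Reduced form coefficients (pi10, pi11, pi20, pi21) of
    supply Q = a_s + b_s*P and demand Q = a_d + b_d*P + c*Z."""
    denom = Fraction(b_s - b_d)
    if denom == 0:
        raise ValueError("b_s == b_d: the system has no unique solution")
    pi20 = (a_d - a_s) / denom
    pi21 = c / denom
    pi10 = a_s + b_s * pi20   # substitute P into the supply equation
    pi11 = b_s * pi21
    return pi10, pi11, pi20, pi21

def supply_from_reduced_form(pi10, pi11, pi20, pi21):
    """Recover (a_S, b_S) from the reduced form; undefined when pi21 == 0."""
    if pi21 == 0:
        raise ValueError("pi21 == 0: the formulas for a_S and b_S are undefined")
    b_s = pi11 / pi21
    a_s = (pi10 * pi21 - pi11 * pi20) / pi21
    return a_s, b_s

# Hypothetical structural coefficients: a_S, b_S, a_D, b_D, c
pis = reduced_form(2, 1, 10, -1, 4)
print(supply_from_reduced_form(*pis))
```

With these values the supply parameters $a_S = 2$ and $b_S = 1$ are returned. Changing the demand coefficients while keeping the supply coefficients fixed changes the reduced form but not the recovered supply parameters.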

Note, however, that this still does not allow us to identify the structural parameters of the demand-side model. For that, we would need an exogenous variable which is included in the supply side of the structural model but not in the demand side.

## The general linear case

Let y be a column vector of M endogenous variables. In the case above with Q and P, we have M = 2. Let x be a column vector of exogenous variables; in the case above x consists of Z, together with a constant term if the intercepts are to be included. The structural linear model (without error terms, as above) is:

$A y = B x \,$

where A and B are matrices; A is a square M × M matrix. The reduced form of the system is:

$y = \Pi x \,$

Again, each endogenous variable depends, in general, on every exogenous variable. Provided A is invertible, it is easily verified that:

$\Pi = A^{-1}B \,$
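
As an illustration, the supply and demand system with Z can be put in this matrix form. The sketch below makes several assumptions that are not part of the original text: y = (Q, P), x = (1, Z) with the constant 1 carrying the intercepts, hypothetical coefficient values, and small hand-written 2 × 2 helpers in place of a linear algebra library.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("A is singular: the reduced form is not uniquely determined")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(m, n):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))] for i in range(len(m))]

# Hypothetical structural coefficients
a_s, b_s, a_d, b_d, c = 2, 1, 10, -1, 4

# Supply: Q - b_S*P = a_S          -> first rows of A and B
# Demand: Q - b_D*P = a_D + c*Z    -> second rows of A and B
A = [[1, -b_s], [1, -b_d]]
B = [[a_s, 0], [a_d, c]]

# Rows of Pi are (pi10, pi11) for Q and (pi20, pi21) for P
Pi = matmul(inverse_2x2(A), B)
print(Pi)
```

Here the rows of `Pi` come out as (6, 2) for Q and (4, 2) for P, which agrees with solving the two equations directly for these coefficient values. For larger systems one would normally rely on a linear algebra library rather than explicit inversion.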

Without restrictions on A and B, their coefficients cannot be identified from data on y and x: each row of the structural model is just a linear relation between y and x with unknown coefficients. (This is again the parameter identification problem.) The M reduced form equations (the rows of the matrix equation y = Π x above) can be identified from the data because each of them contains only one endogenous variable.
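
The first claim can be made explicit with a short calculation. The matrix F is introduced here for illustration only and is not part of the notation above. For any invertible M × M matrix F, the transformed structure (FA, FB) has the same reduced form as (A, B):

$(FA)^{-1}(FB) = A^{-1}F^{-1}FB = A^{-1}B = \Pi \,$

Data on y and x can determine at most Π, so they cannot distinguish (A, B) from (FA, FB) unless restrictions on A and B limit the admissible choices of F. In the supply and demand example, the exclusion of Z from the supply equation is such a restriction, which is why the supply parameters could be recovered there.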

## Transformation

In passing from the structural form to the reduced form, a coherency condition may be needed to ensure that the reduced form is uniquely determined. In the linear case above, this amounts to requiring that the matrix A be invertible.