Forecasting house prices for the four census regions and the aggregate US economy in a data-rich environment

11 Nov 2013

This article considers the ability of large-scale time-series models (involving 145 fundamental variables), estimated by dynamic factor analysis and Bayesian shrinkage, to forecast real house price growth rates for the four US census regions and the aggregate US economy. Besides the standard Minnesota prior, we also use additional priors that constrain the sum of coefficients of the VAR models. We compare 1- to 24-months-ahead forecasts of the large-scale models over the out-of-sample period 1995:01–2009:03, based on an in-sample period of 1968:02–1994:12, against a random walk model, a small-scale VAR model comprising just the five real house price growth rates, and a medium-scale VAR model containing 36 of the 145 fundamental variables in addition to the five real house price growth rates. Beyond this forecast comparison across small-, medium- and large-scale models, we also examine how well the ‘optimal’ model for a specific region (i.e. the model that produces the minimum average mean squared forecast error) predicts real house prices (in levels) ex ante over the period 2009:04–2012:02. Factor-based models (classical or Bayesian) perform best for the North East, Mid-West and West census regions and for the aggregate US economy, and perform as well as a small-scale VAR for the South region. The ‘optimal’ factor models also tend to predict the downward trend in the data in the ex ante forecasting exercise. Our results highlight the importance of the information contained in a large number of fundamentals for predicting house prices accurately.
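The selection rule for the ‘optimal’ model can be illustrated with a minimal sketch. It assumes that the average mean squared forecast error is a simple unweighted mean of the per-horizon errors, which the abstract does not spell out. The function names, model labels and numbers below are hypothetical and serve only to show the mechanics; they are not the article's data or results.

```python
from statistics import fmean


def average_msfe(actual, forecasts_by_horizon):
    """Average the mean squared forecast error (MSFE) over horizons.

    actual: realised values for the out-of-sample dates.
    forecasts_by_horizon: one list of forecasts per horizon, each aligned
    with `actual` (hypothetical layout chosen for this sketch).
    """
    msfe_per_horizon = [
        fmean((f - a) ** 2 for f, a in zip(forecasts, actual))
        for forecasts in forecasts_by_horizon
    ]
    # Assumption: horizons are weighted equally in the average.
    return fmean(msfe_per_horizon)


def select_optimal(actual, forecasts_by_model):
    """Return the model with the smallest average MSFE, plus all scores."""
    scores = {
        name: average_msfe(actual, forecasts)
        for name, forecasts in forecasts_by_model.items()
    }
    return min(scores, key=scores.get), scores


# Toy, made-up growth rates and forecasts for two horizons.
actual = [1.0, 2.0, 3.0, 4.0]
forecasts_by_model = {
    "factor_model": [[1.1, 2.1, 3.1, 4.1], [1.2, 2.2, 3.2, 4.2]],
    "random_walk": [[1.5, 2.5, 3.5, 4.5], [1.5, 2.5, 3.5, 4.5]],
}

best, scores = select_optimal(actual, forecasts_by_model)
print(best, scores)
```

In this toy setup the model with the smaller forecast errors at both horizons is selected. Applying the rule region by region would yield one ‘optimal’ model per census region and one for the aggregate.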