
Analysis of Land Use Change: Theoretical and Modeling Approaches
Helen Briassoulis, Ph.D.

4 MODELS OF LAND USE CHANGE

4.1. Introduction
4.2. Models of Land Use Change – Classification (link to Table 4.1a)
4.3. Statistical and Econometric models (link to Table 4.1b)
4.3.1 Statistical Models
4.3.2 Econometric Models (EMPIRIC)
4.4. Statistical Models


4.1. Introduction

This chapter presents a representative collection of models of land use change. Prior to this presentation, it is necessary, however, to: (a) clarify the meaning or definition of "model" as it will be used in the present context, (b) indicate which models of land use change will be included and which will be excluded, and (c) discuss the need for and the uses of models in the context of the analysis of land use change.

Models are defined variously. They can be considered as the formal representation of some theory of a system of interest (Wilson 1974, 4). More broadly, models can be considered as abstractions, approximations of reality achieved through simplification of complex real world relations to the point that they are understandable and analytically manageable. The representation of reality is expressed through the use of symbols. Mathematical techniques are applied for the manipulation of the relationships among the entities represented by these symbols. Hence, the term symbolic (or operational or empirical) model is used to distinguish it from other types of representation (e.g. conceptual models) (Lonergan and Prudham 1994). It should be noted that the term "model" is sometimes used interchangeably with the term "theory" in the literature, both when dealing with theories as defined in the previous chapters and when dealing with mathematical models as defined above. However, the two terms are not equivalent. Theory provides a more general framework of "connected statements used in the process of explanation" while a model is "an idealized and structured representation of the real" (Johnston et al. 1994, 385, 622) or "an experimental design based on a theory" (Harris 1966, 258; see also, Romanos 1977, 135). In this author’s opinion, the use of the term "theory" to denote a mathematical, symbolic model is misleading and unsuccessful, even though the model may be the mathematical expression of theoretical statements and assertions (which is not always the case).

Turning to the second issue, which models of land use change are included in this chapter and which are excluded, a criterion similar to that applied for the selection of theories of land use change is employed. In other words, a broad distinction is drawn between those models which treat land, land use and, more importantly, land use change explicitly and directly and those where land use and its change are treated in a less direct and explicit fashion. Certain qualifications as regards this distinction are necessary.

Models which treat land and land use change explicitly are basically those in which the direct object of model building is land use change. In these models, land (and land use) is conceptualized, at a minimum, as "a delineable area of the earth’s terrestrial surface" (see Chapter 1). Land use is characterized by: (a) its areal (stock) and not point character, (b) its relative immobility (compared to a single location), (c) the relative stability of its occupancy (durability), and (d) the relatively high cost of change from one type to another (see, for example, Arnott 1986 for the case of residential land use). Hence, models in which land is reduced to a point in space are not considered land use (and change) models here. This is the case with models where supply (production of goods and/or services) or demand (consumption of goods and/or services) have a point representation (for a concise description of the "point" nature of spatial equilibrium analysis, see Takayama and Labys 1986, 171). These models represent more general processes of spatial change – which may imply land use change – as was the case with theories which did not treat land use explicitly.

As is the case with all "spatial" models, land use-explicit models employ some type of zonal system for spatial representation. Each zone is characterized by its particular distribution of land use types. The number of zones, however, should exceed some minimum for the spatial representation offered by a model to be considered satisfactory. This means that models with a two- or three-zone system are extremely crude spatially, at least for the present purposes. The recent trend is, however, towards individual land unit-level models which make the use of a zoning system redundant.

An important distinction which should be clarified is that between "location" and "land use" as frequently they are used interchangeably in the literature or without proper qualifications as to their difference. The distinction is addressed by Beckmann and Thisse (1986) and Andersson and Kuenne (1986), among others, as noted in Chapter 3. Most of the time, "location" refers to a point representation of a (public or private) firm or facility in space. When the term "location" is used to denote "land use" this is usually because the analysis starts from individual location decisions and then proceeds to obtain spatial (location) patterns which represent a market solution, i.e. the result of the aggregate behavior of locators. This contribution refers to models where spatial/areal patterns are used as opposed to point patterns. An additional point about models where space has a point representation is that – at least in the short run – they assume the location of demand and supply points given; hence, they are not models of change. Only their dynamic versions may consider potential changes of the spatial distribution of the demand and supply points.

A similar confusion exists frequently in the literature over the terms "spatial pattern" and "spatial structure". The terms may denote either point patterns (e.g. hexagonal, triangular) or areal patterns (however simplified and abstracted – e.g. disc, etc.) (see, for example, Andersson and Kuenne’s (1986) reference to Puu’s work). Finally, "market areas" are not considered to constitute necessarily land uses (with the characteristics stated before) although they are expressed as areal entities whose extent is assessed appropriately. The reason is that a market area for a good or service may, in general, comprise several types of land use (e.g. residential, commercial, open space); hence, the exclusion of related models dealing with market areas and their changes.

Models which treat land and land use indirectly or not at all are, in general, those models in which the primary purpose of model building is not the modeling of land use change. First, there is a large number of models which deal with changes in some of the determinants of land use (e.g. product or service demand, income, investments, accessibility), employ a zonal spatial system of reference, and contain a land use component (however crude). In these models, the assessment of changes in land use may be internal or external to the model. These will be considered here although they suffer from incomplete (or nonexistent) conceptualization of land, land use and its change. Second, there are the "pure" aspatial models which are concerned with broader socio-economic changes that may impinge on, or cause, land use change in one way or another – such as the economic base model, the single-region input-output model, economic growth and international trade models, etc. These are not considered here unless they constitute components of larger integrated models.

Based on the above points of clarification, the genre of location models is not considered in this contribution as its emphasis is on particular, individual activities locating in space and not on an area of land used for a given purpose by various locators. Indicatively, the following groups of models are excluded: Central Place theoretic models (Christaller, Lösch), Weber-problem models, network models, spatial equilibrium models (and related agricultural, energy and mineral models; see, Takayama and Labys 1986). It is noted that spatial equilibrium models may deal with long-run, equilibrium land use patterns. However, first, these patterns are not unique (Beckmann and Thisse 1986) and, second, they address a state of equilibrium achieved some time in the future and do not deal with the process of change towards this state; hence, they cannot be considered effectively as land use change models.

From the group of location models, this contribution will consider, however, the residential location models. These models usually start from modeling individual (residential location) choice behavior but then they aggregate over individual choices to derive residential land use patterns (or, segments of the housing market). In addition, residential location models account explicitly for the amount of housing "consumed"; i.e. they incorporate directly the land requirements of residential (housing) demand. Moreover, many of their versions are land use change models as their specification includes explicitly factors causing change such as preferences, prices, mobility. Residential land use has all the features specified for land use above and, among all urban land use types, represents the most land-extensive (and intensive) type.

The interrelated issues of the need for and the uses of models for the analysis of land use change are finally addressed briefly. Land use change is the result of a complex web of interactions between bio-physical and socio-economic forces over space and time. Coping with this complexity for practical purposes, at least – such as policy making and land management for sustainable land use – is impossible without some simplification of the complex relationships to manageable and understandable dimensions. Hence, the need for some model, in general, and for some symbolic model, in particular, which will express operationally the relationships of interest (see, also, Turner et al. 1995).

Given the need for models of land use change, the uses of these models are not difficult to derive. A first, general use is to provide decision support in various decision and policy making contexts. More specifically, models can be used to describe the spatial and temporal relationships between the drivers and the resulting patterns of land uses and their changes. Concise, well specified descriptions grounded on rigorous theory are the cornerstones of understanding and defining the exact problem of land use change decision makers are facing (or, simply, interested in) and acting about it (if necessary). Models of land use change can be used also as explanatory vehicles of observed relationships. This is a debatable aspect of model use, however, as it depends on what one means by "explanation". In several operational models explanation is reduced to statistical or mathematical explanation which is not necessarily equivalent to theoretical explanation which attempts to get into the causality of the relationships analyzed and modeled (see, for example, Achen 1982, D. Lee 1973, Sayer 1976, 1979a, 1979b, 1982). Some theories and models have been, in fact, conceived simultaneously in which case the terms "theory" and "model" are used interchangeably to denote a set of theoretical and operational statements about reality (such as von Thunen’s and Alonso’s theories and models).

Very frequently, in practical situations, models are used to predict (or, forecast) future configurations of land use patterns under various scenarios of bio-physical (e.g. climatic) and socio-economic change. Plausible and successful predictions depend, among other things, on the assumptions, specification and theoretical grounding of the models themselves as well as on the scenarios of change from which they borrow the levels (values) of the variables "driving" the prediction. In situations of extremely complex, cloudy and unpredictable futures, simulation is most commonly used in which case theoretical soundness may not be of critical importance.

Models of land use change can play an instrumental role in impact assessment of past or future activities in the environmental and/or the socio-economic spheres. This use has two facets; on the one hand, it may concern assessment of qualitative and/or quantitative changes of land use caused by autonomous or planned changes in one or more of its determinants; on the other, it may concern assessment of the environmental and socio-economic impacts of changes in land use (such as land degradation, desertification, food security, health and safety hazards, unemployment, etc.).

Models of land use change have been and are currently being used to prescribe "optimum" patterns of land use for sustainable use of land resources and development, in general. In this case, they rest usually on optimization techniques which are used to produce land use configurations which satisfy specified objectives as well as a variety of environmental and socio-economic constraints. One of these constraints is the availability of land. Optimization models are commonly used in planning and management contexts.

Evaluation is a final model use which is associated with the last three uses mentioned – prediction, impact assessment and prescription. Models of land use change for the purposes of evaluation per se do not exist as evaluation is an activity which can be performed on any set of alternatives which have to be evaluated on the basis of specific criteria (see, for example, exposition of related methods in Nijkamp and Rietveld 1986, Voogd 1983). Therefore, in the particular case of the analysis of land use change, land use alternatives generated by models (either for the purposes of prediction, impact assessment or prescription) can be evaluated using any of the available evaluation techniques. This topic is not covered in the present contribution. It is noted, however, that the literature refers to particular land use models as evaluation models but this usage is not widely accepted.

The next sections present a selection from the large and variegated pool of land use change models and detail certain of them further. In particular, this contribution expands more on models of a more recent origin, as past generation models – and especially the "classics" – are covered completely and adequately by a voluminous literature which will be indicated appropriately. The last section offers a summary account of the main characteristics of the models of land use change presented here.

4.2. Models of land use change – Classification

The literature contains a considerable number and variety of models of land use change where land use and its change are treated explicitly and are the direct object of the modeling exercise. Eight interrelated sources of variation, in a roughly decreasing order of importance, can be discerned in extant models: the purpose for which the model is built, the theory (or, the lack of it) underlying the model (reflecting, in part, the types of the determinants of land use change taken into account), the spatial scale and level of spatial aggregation adopted as well as the degree of "spatial explicitness" of the model, the types of land use considered as principal objects of analysis, the types of land use change processes considered, the treatment of the temporal dimension (which in the case of analysis of change, in general, should be inherent in any project), and the solution techniques used. Hence, there exist:

  1. descriptive, explanatory, prescriptive, predictive and impact assessment models
  2. micro-economic and macro-economic theoretic models, gravity or spatial interaction theory-based models, integrated models as well as a-theoretic models
  3. local, regional, interregional, national and global level models
  4. geo-referenced (fully spatially explicit) and non-geo-referenced (incompletely spatially explicit) models
  5. urban (mostly residential), agricultural (crop), forest sector models
  6. deforestation, urbanization, etc. models
  7. static, quasi-static (or, quasi-dynamic) and dynamic models (however counterintuitive static models of change may sound)
  8. statistical, programming, gravity-type, simulation and integrated models.

It is, thus, evident that, for the purposes of systematic exposition of extant models, it is necessary to adopt a classification scheme as a presentation and discussion vehicle of these models.

The modeling literature suggests several model classification schemes. Wilson (1974) proposes a classification scheme based on the dominant technique used in model building (pp. 173-176). Batty (1976) distinguishes between substantive and design criteria for model classification (pp. 12-15). Issaev et al. (1982) mention four possible approaches to model classification: "(a) construction of a list of attributes characterizing aspects of the models, (b) specification of a set of criteria serving as a general evaluation framework, (c) construction of an ‘ideal’ model as a frame of reference for judging all other models, and (d) cross-comparison of models on the basis of general structure characteristics of these models" (Issaev et al. 1982, 4). Stahl (1986) suggests a number of substantive criteria for classifying business location models including issues of theory and model purpose (Stahl 1986, 769-771).

In general, it seems that a general purpose, unambiguous classification scheme of models which can reflect meaningfully the eight sources of variation mentioned previously does not exist, for various reasons. The same subject can be modeled at various levels of spatial detail employing corresponding theories (e.g. micro-, macro-economic) as well as within either a static or a dynamic framework. In addition, the same problem a model addresses can be approached by means of more than one modeling technique and/or model design. Model specification for the same problem under study may range from very simple to highly sophisticated. Hence, for the present purposes, a decision was made to adopt a classification scheme based on an aggregate, composite criterion, the modeling tradition to which a model belongs. This criterion is governed by the dominant feature of model design and solution technique which is more relevant for model building and discriminates among various model types. Moreover, model design is usually associated with particular model purposes, underlying theories and types of land use modeled (and, usually, the discipline where models originate), and spatial and temporal levels of analysis.

Based on this criterion, four main categories of models were distinguished:

  1. statistical and econometric models
  2. spatial interaction models
  3. optimization models, and
  4. integrated models.

A fifth category has been added which contains those models for which classification is not straightforward as they reflect a variety of modeling traditions. Within each of the above modeling traditions, models are further classified according to criteria particular to this tradition and they are ordered approximately from early to recent models. Because modeling tradition is a composite criterion, it is probable that several models can be classified under more than one category (such as the spatial interaction models which can be considered as simulation or programming models or integrated models which can be classified as simulation models as well).

A selection of models belonging to each modeling tradition is presented and evaluated next based on the following model features/criteria:

Table 4.1a contains the principal groups of models which are presented in the following classified according to modeling tradition. For each group of models the reader is referred to more detailed Tables (Tables 4.1b, 4.1c, 4.1d) where all models included in the group which this contribution presents are shown. The presentation of the models which follows is kept simple and not mathematically sophisticated to make it accessible to a wider audience. For several models already built and/or used and for which there is complete documentation the reader is referred to the original sources. More emphasis is placed on models of a recent origin as: (a) they are not as widely publicized and covered in the literature as the past generation models and (b) it is interesting to see whether the more recent models are in a better position to model land use change than was the case with past modeling efforts.

4.3. Statistical and Econometric Models

Application of statistical techniques to derive the mathematical relationships between dependent variables and sets of independent (or predictor) variables is widespread in modeling socio-economic and other systems of interest (see, for example, Colenut 1968, Lee 1973). The most commonly used statistical technique is multiple regression analysis (and its variations such as stepwise regression and two-stage least squares) although application of other multivariate techniques (such as factor analysis, canonical analysis, etc.) is not uncommon. The application of multiple regression techniques to the analysis of problems involving economic demand and supply has given rise to what are known as econometric models. Simply stated, these are systems of equations which express the relationships between demand and/or supply and their determinants as well as between demand and supply themselves (for economic/market equilibrium) (Batty 1976, Wilson 1974). A large body of specialized statistical techniques, broadly known as econometric analysis (or techniques), has been developed for estimating their coefficients (see, for example, Judge et al. 1982). A selection of models belonging to this modeling tradition is presented in the following (Table 4.1b).

4.3.1. Statistical Models

Statistical models whose direct object of analysis is land use change date from the 1960s at least and they are still employed in several related studies. Frequently, they constitute components of larger models employed for the analysis of land use change and its determinants. A distinction can be drawn between continuous models, which treat land use as a continuous variable (area of land devoted to a land use type), and discrete models, which treat land use as a discrete variable (different land use types are distinguished). In the following, the basic structure of a statistical model for land use change is indicated and then existing models are presented and discussed briefly.

In a statistical model of land use change, the study area is usually subdivided into a number of zones (or grid cells, if a grid system is adopted), the size and shape of each cell depending on the level of aggregation chosen as well as the availability of data. (The case where the observation units are individual land parcels instead of zones or grid cells is discussed later.) In the continuous case, for each zone, the distribution of land use types (the dependent variables) as well as the values of other environmental and socio-economic predictor variables (e.g. population, employment, soil conditions, slope, climate (temperature, rainfall, etc.)) are given. A multiple regression equation for each land use type is fit to these data (usually referring to a given year). The general form of the equation is:

$$LUT_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$$          (4.1)

where $LUT_i$ is the area of land occupied by land use type i (in each cell), $X_1, X_2, \dots, X_n$ are the predictor variables used and $\beta_0, \beta_1, \dots, \beta_n$ are the regression coefficients. The term $\varepsilon$ is the error term of the statistical model.
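As an illustration of the continuous case, the following minimal sketch fits a multiple regression equation for one land use type by ordinary least squares and then substitutes a changed predictor value to assess the implied change in area. It is not taken from any of the models cited here. The zone data, the two predictors (population density and mean slope) and the helper names `solve_linear_system`, `fit_ols` and `predict` are all hypothetical; the data are noise-free and were generated from known coefficients so that the fit can be checked.

```python
# Minimal sketch: least-squares fit of land use area on zone-level predictors.
# All data and names are hypothetical and serve only as an illustration.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bn] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, x):
    """Evaluate the fitted equation for one zone's predictor values."""
    return coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))


# Hypothetical zones: (population density X1, mean slope X2).
zones = [(10, 5), (20, 3), (15, 8), (30, 2), (25, 6)]
# Hypothetical built-up area per zone, generated as 2.0 + 0.5*X1 - 0.1*X2.
built_up_area = [6.5, 11.7, 8.7, 16.8, 13.9]

coefs = fit_ols(zones, built_up_area)

# Assess the change in built-up area if X1 in the second zone rises from 20 to 25.
change = predict(coefs, (25, 3)) - predict(coefs, (20, 3))
print("coefficients:", coefs)
print("estimated change in area:", change)
```

In practice a statistical package would be used, which also reports standard errors and significance tests; the hand-written solver is used here only to keep the sketch self-contained.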

This model form can be used to assess the changes in the area covered by a given land use type for specified changes in one or more of the predictor variables by substituting their values in the equation shown above. Early applications were made by Chapin and Weiss (1968) (see, also, Chapin 1965) in the context of a broader "probabilistic model of residential growth" (known as the North Carolina model) and by Swerdloff and Stowers (1966 cited in Chapin and Kaiser 1979) using the same data set. The dependent variable in their model was the attractiveness of a zone of the study region for residential growth measured as (Chapin and Kaiser 1979):

An exploratory regression analysis was first applied to a larger set of candidate variables, which the literature indicated as influencing land use, to identify the variables which had a statistically significant relationship to the dependent variable. The independent variables which were finally included in the Chapin and Weiss version of the model were:

The independent variables of the Swerdloff and Stowers version of the same model were (Chapin and Kaiser 1979):

A similar statistical model is used in the CHANGE module of the CLUE model which is discussed below under the category of integrated models (Veldkamp and Fresco 1996b, Verburg et al. 1997). The CHANGE module uses linear regression models to estimate the changes in the area of given land use types which are caused by changes in the values of environmental and socio-economic driving factors projected from other modules of the CLUE model.

Discrete statistical models (or, discrete choice models) are used to represent choice situations in general (see, for example, McFadden 1978, Hensher 1981, Anas 1982). In the case of land use modeling, each land use type is described as a function of a number of characteristics (which usually differ from one cell to another). For each cell, the utility of every land use type is assessed as a function of these characteristics. The probability of choosing a particular land use type in a given cell is calculated as a function of the utilities associated with the land use types considered. The most common mathematical forms used in discrete choice models are the logit and probit models. Examples of this approach are presented next. The discussion of discrete choice models resumes in section 4.6.3A on integrated land use-transportation models.

In the context of a larger modeling exercise for the analysis of land use change in Japan, Kitamura et al. (1997) and Morita et al. (1997) use a multinomial logit model to assess changes in land use by type. The model assesses the probability of choice of a particular land use type in each of the cells in which the study area is subdivided as a function of the values of a set of predictor/explanatory variables. These probabilities are interpreted as land use proportions for each of a specified number of land use types. The mathematical form of the model is as follows:

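$$P_{ij} = \frac{\exp(V_{ij})}{\sum_{m} \exp(V_{mj})}$$          (4.2)

$$V_{ij} = \sum_{k} \beta_{ik} X_{jk}$$          (4.3)

where:

$P_{ij}$ the land use proportion of land use type i in cell j
$V_{ij}$ the utility of the ith land use type in cell j
$X_{jk}$ the kth explanatory variable in cell j
$\beta_{ik}$ the multiple regression coefficients of the explanatory variables $X_{jk}$

The sum in equation (4.2) runs over all land use types m considered.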

The above formulation calculates first the utility of each land use type in each cell of the study area as a linear function of the values of a set of predictor variables (equation 4.3) and then uses this utility to estimate the probability of a particular land use type occurring in each cell. The predictor variables are shown below. As it was the case with the previous multiple regression model, changes in the predictor variables calculated from other modules of the larger model are fed into equation (4.3) to estimate changes in the utility of each land use type. These changes are then used in equation (4.2) to estimate changes in the proportion of each land use type in each cell of the study region.
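The two-step calculation described above can be sketched as follows, assuming the standard multinomial logit form. This is an illustration only, not the specification used by Kitamura et al. (1997) or Morita et al. (1997): the coefficient values, the choice of predictors (a constant, population density and slope) and the function names `utilities_for_cell` and `land_use_shares` are hypothetical.

```python
# Minimal sketch of a multinomial logit for land use shares in one cell.
# Coefficients and predictor values are hypothetical, for illustration only.
import math


def utilities_for_cell(coefs, x):
    """Utility of each land use type as a linear function of the predictors."""
    return {lu: sum(b * xk for b, xk in zip(betas, x))
            for lu, betas in coefs.items()}


def land_use_shares(utilities):
    """Convert the utilities of one cell into shares that sum to one."""
    vmax = max(utilities.values())  # subtract the maximum for numerical stability
    expv = {lu: math.exp(v - vmax) for lu, v in utilities.items()}
    total = sum(expv.values())
    return {lu: e / total for lu, e in expv.items()}


# Hypothetical coefficients per land use type for the predictors
# [constant, population density, slope]; "other" is the reference type.
coefs = {
    "farmland": [0.5, -0.01, -0.05],
    "forest":   [0.2, -0.02,  0.08],
    "built_up": [-1.0, 0.03, -0.10],
    "other":    [0.0,  0.0,   0.0],
}

x_before = [1.0, 50.0, 5.0]  # hypothetical cell in the base year
x_after = [1.0, 80.0, 5.0]   # same cell after a projected rise in density

shares_before = land_use_shares(utilities_for_cell(coefs, x_before))
shares_after = land_use_shares(utilities_for_cell(coefs, x_after))

for lu in coefs:
    print(lu, round(shares_before[lu], 3), "->", round(shares_after[lu], 3))
```

Because the shares are constrained to sum to one, a rise in the share of one land use type in a cell is necessarily offset by a fall in the shares of the others, which a set of independently fitted regression equations does not guarantee.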

Four land use types were taken into account in this study: farmland, forestry land, built-up area, and other land. The dependent variables were operationalized, thus, as: farmland share, forestry land share, built-up area share and other land share. The predictor/explanatory variables used were grouped into three groups: socio-economic driving forces, land use policy and planning factors, natural factors. The socio-economic driving forces category included the variables:

The land use policy and planning factors category included the variables:

The natural factors category, lastly, included the variables:

In the framework of the same modeling exercise for Japan, another statistical technique, canonical correlation analysis, has been used to explore the environmental and socio-economic determinants of land use change as well as to test the temporal stability of the land use patterns of the study area in the study period (Hoshino 1996). Canonical correlation analysis (CCA) is a multivariate statistical technique used to explore the structure of the relationships between a set of dependent variables and a set of independent variables which is especially suitable when the independent variables are correlated with each other. The study area was subdivided into its 138 municipalities which were chosen as the basic unit of analysis. Four types of land use were distinguished as above: farmland, forest land, residential land, other land (public uses, etc.). The results of CCA indicated the relationships between particular land use types and the determinants used in the predictor set.

Recently, discrete choice models have been built where the observation units are individual land parcels. The advantage of these models as compared to those using a zone or grid system is that the observation unit is the decision-making unit (the owner of the land parcel); hence, they model the actual land use choice or land use conversion behavior of the individual parcel owner. The independent variables entering the relationship are the actual factors which affect the land use choice or land use conversion decision. These may include policy variables (e.g. land use regulations and restrictions) as well as the physical/environmental characteristics of the site. Hence, these models allow for more realistic representation of the land use change process. Moreover, the environmental (as well as the social and economic) impacts of land use change at a very disaggregate level – that of the land parcel – can be assessed directly. Models following this approach can be found in Bockstael (1996), Geoghegan et al. (1997), Bockstael and Bell (1997), Bell and Bockstael (1999), Irwin and Bockstael (1999).

For example, in Bockstael (1996), Bockstael and Bell (1997), and Irwin and Bockstael (1999), profit-maximizing individuals are assumed to own undeveloped parcels of land and to make decisions to convert them to residential land use. Land conversion depends on expected returns which are assumed to be a function of the expected sales price of the land in residential use, conversion costs, and the opportunity cost in terms of alternative uses. The expected price of a converted parcel is assessed as a function of commuting distance to urban centers, provision of public services, zoning restrictions, and indices of surrounding land uses (Bockstael and Irwin 1999). The spatial explicitness of this modeling approach permits also the analysis of the spatial and temporal dynamics of land use change. For example, in Irwin and Bockstael (1999), the theoretical framework of agent-based theories in the urban and regional economics theorization tradition discussed in Chapter 3 is used to model: (a) the attracting effects among developed land parcels which exogenous features create (e.g. central city, road, public services) and (b) the repelling effects arising out of interactions among land users. The authors demonstrate that this model provides a viable explanation of the fragmented residential development pattern observed in urban fringe areas (in the United States where the model has been applied). In fact, fragmented land use patterns can be modeled only at the disaggregate level of land parcels which provides the necessary detail to represent land fragmentation. The authors present also a duration model of residential land use conversion. The land conversion decision is a function of both exogenous landscape features and a temporally lagged interaction effect among neighboring land users (Bockstael and Irwin 1999).
A general observation with respect to models in this direction is that they are based on microeconomic foundations providing, hence, theoretically sound models (in an economic sense, at least) of land use change decisions.
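
The decision rule described above can be illustrated with a minimal sketch. It assumes a binary logit form, which is only one possible discrete choice specification and is not taken from the cited studies; the function name, the `scale` parameter and all numerical values are hypothetical.

```python
import math

def conversion_probability(expected_price, conversion_cost, opportunity_cost, scale=1.0):
    """Probability that an owner converts an undeveloped parcel to residential use.

    Assumed (hypothetical) binary logit form: the net expected return
    V = expected price - conversion cost - opportunity cost
    enters a logistic curve, so V = 0 gives a probability of 0.5.
    """
    net_return = expected_price - conversion_cost - opportunity_cost
    return 1.0 / (1.0 + math.exp(-net_return / scale))

# Hypothetical parcels: (expected price, conversion cost, opportunity cost)
parcels = {"near_center": (12.0, 4.0, 5.0), "remote": (6.0, 4.0, 5.0)}
probabilities = {name: conversion_probability(*values) for name, values in parcels.items()}
```

In an estimated model the expected price would itself be a function of commuting distance, public services, zoning and surrounding land uses, and the coefficients would be fitted to parcel data rather than fixed in advance.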

Before turning to the econometric models, the statistical models of land use change presented above are briefly discussed and evaluated. Their purpose is description, explanation and (conditional and unconditional) prediction of land use changes as functions of selected determinants. They are mostly cross-sectional, static models operating on the basis of annual data. Usually they are national or regional level models based on a zonal system of spatial reference where the zones usually coincide with administrative districts (for which data are available). Exceptions are the recently built models employing parcel level data: these can be considered local level models, they do not employ a zonal system, and they may incorporate lagged values of certain independent variables, which makes them quasi-static models. Most models consider four to five major types of land use (arable land, permanent crops, pastures and range lands, natural vegetation, other uses); i.e. they employ a rather coarse level of land use detail. Naturally, the spatially explicit land use models can accommodate a finer level of land use detail if available data permit.

There is no specific theory underlying most statistical models except for the broad theoretical claim that land use changes result from changes in environmental and socio-economic driving forces. In other words, they adopt an instrumentalist view of theory which has "a rather shadowy, secondary role in providing a source of assumptions on which models …. can be based" (Sayer 1979a, 858). Some of the statistical techniques used (such as CCA) attempt to elicit the structural relationships between land use change and its determinants but this is a mechanistic procedure devoid of theoretical meaning and guidance. An exception again are the spatially explicit, discrete statistical models of individual land user behavior which are grounded in microeconomic theory of consumer behavior (Bockstael 1996, Bockstael and Bell 1997, Irwin and Bockstael 1999).

Model specification in most models rests mostly on ratio (quantitative) variables, which means that the qualitative aspects of land use change which cannot be quantified and measured do not enter the model. This leaves much to be desired from these models. An exception again are the spatially explicit, discrete statistical models which can accommodate nonquantitative aspects (e.g. personal, cultural, and other characteristics of the land users) of the land use change environment. However, whether the statistical treatment and the operationalization of such characteristics are valid are open questions which should be investigated in the light of recent advances in (and currency of) qualitative analysis. The socio-economic variables frequently include population growth and population distribution among age groups as a proxy measure for several socio-economic determinants (e.g. literacy, demand, etc.). This may not be the best choice given that population almost always correlates well statistically with many variables including the dependent (land use change) variables. Hence, the models appear to possess significant statistical explanatory power when, in fact, this may be an inevitable numerical result. It has been argued that population has to be considered either as one (but not the primary or the sole) of the independent variables influencing land use change or, preferably, as an intermediate variable affected by others (see, for example, Sunderlin and Resosudarmo 1999 for an analysis of this issue in the context of forest cover loss). This issue relates to the broader issue of endogeneity of the independent variables. In other words, one or more of the independent variables included in a model may be endogenous to land use (the dependent variable) in that it is affected by changes in it. An example is the use of the variable "distance to road" as an explanatory variable: building roads (i.e. changing the value of this variable) may cause land use changes which, in their turn, modify the distance to roads (see, for example, Pfaff 1999).

Multiple regression and related multivariate models reveal correlations or associations among variables, a fact which has nothing to do with causation. Causal models require a rigorous grounding in theory which the models discussed previously lack. Most of the statistical models of land use change are linear multiple regression models which suffer from the linearity assumption. In other words, a unit change in any one of the independent variables always produces the same level (amount) of land use change of a specific type, an assumption difficult to verify and defend on both theoretical and practical grounds. As regards the particular statistical techniques used, it appears from the model descriptions available in the published literature that the spatial data used are analyzed by means of conventional statistical techniques rather than by spatial statistical techniques. This is a serious weak point of model estimation as spatial data almost always suffer from spatial autocorrelation which should be taken into account and corrected by means of appropriate spatial statistical techniques (see, for example, LeSage 1999). In addition, it should be noted that multiple regression models suffer from multicollinearity as the predictor variables are usually correlated with one another. This is a problem when using the models for explanation but not when using them for prediction.
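
One common diagnostic for the spatial autocorrelation mentioned above is the global Moran's I statistic, applied for instance to the residuals of a fitted regression. The sketch below is illustrative only: the zone layout, the binary contiguity weights and the residual values are hypothetical, and none of the models reviewed is claimed to use this particular test.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i**2,
    where z are deviations from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Hypothetical example: four zones in a row; w_ij = 1 when zones i and j share a border.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trending_residuals = [1.0, 2.0, 3.0, 4.0]       # similar values cluster together
alternating_residuals = [1.0, -1.0, 1.0, -1.0]  # neighbors take opposite values

positive_case = morans_i(trending_residuals, weights)
negative_case = morans_i(alternating_residuals, weights)
```

Values well above zero suggest that neighboring zones have similar residuals, which violates the independence assumption of conventional regression; values well below zero indicate that neighbors tend to differ.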

The data used are rather coarse, aggregate data easily available from population, agricultural and other censuses as well as from governmental agencies. However, they are spatially explicit which offers the models better spatial representation than possible aspatial versions. Lack of a host of other types of (geo-referenced) data, however, limits the utility of the models as several determinants (especially the socio-economic) of land use change are under- (or mis-) represented by proxy variables while some others are not represented at all. The models presented have been used in real world applications. More specifically, the CLUE model has been calibrated with data from Costa Rica. The multinomial logit model is part of a larger modeling exercise on the analysis of land use change in Japan (this model will be discussed in greater detail in the section on integrated models below). The models by Bockstael (1996), Geoghegan et al. (1997), Bockstael and Bell (1997), Bell and Bockstael (1999) have been applied to the central Maryland region in the U.S. The latter models present the highest data requirements of all statistical models presented above.


4.3.2. Econometric models

To this author’s knowledge, there are no econometric models of land use change in which land use is treated explicitly either as a continuous or as a discrete variable. The common practice in econometric modeling is to estimate changes in some determinants of land use (say, population, housing demand, retail demand, employment) and then convert those estimates to land use requirements (by land use type) with the use of land use/activity coefficients. One of the well known econometric models in this tradition is the EMPIRIC model which is presented briefly in the following as it represents a prototype model built in the 1960s and was used as a rather simple vehicle to model metropolitan structure.

EMPIRIC is essentially an indirect model of land use change as the direct object of model building was not the analysis of land use change but rather, modeling the distribution of employment and population within metropolitan areas. Several publications describe the main features and structure of EMPIRIC to which the reader is referred such as Hill (1965) and Rothenberg-Pack (1978). The latter includes a detailed bibliography on various related documents and discusses the various applications of the model.

EMPIRIC is a regional activity allocation model built with the broad purpose of providing future forecasts of population, economic activity and land use patterns in metropolitan areas under various policy (development) scenarios; i.e. it was mostly suitable for impact assessment and policy analysis. More specific and detailed purposes for which the model has been used are found in the related literature, a selection of which is presented here drawing on Rothenberg-Pack (1978). Hill (1965 cited in Rothenberg-Pack 1978, 31), a member of the consulting firm which first developed the EMPIRIC model, concluded in the presentation of the 1965 version of the model: "the model may enable the planner to simulate a chain of events in urban development starting with the variables under public control (such as the transportation system and zoning regulations) and ending with a sufficient and desirable pattern of residential and industrial development." In its 1967 version (Brand et al. 1967 cited in Rothenberg-Pack 1978, 32), EMPIRIC could be used to generate the development pattern (Z) associated with a set of policies (Y) and aggregate growth assumptions (X). Boyce et al. (1970 cited in Rothenberg-Pack 1978, 32-33) summarize the model’s uses: "the primary use was seen to be the provision of ‘…….forecasts of land use activities which could be input to traffic forecasting’, or ‘forecasts of population and employment for use or input to traffic models and transportation systems analysis’ or, more generally, ‘future year land use patterns for various planning purposes, particularly input to traffic models’. The secondary uses were variously judged to be: ‘sufficiently sensitive to public policy inputs to enable use as a design tool producing different future year patterns of land use’ or to be ‘a regional planning tool which will help in the evaluation of alternative regional plans, or, less generally, .. to be applied ‘in testing alternative regional plans for feasibility of implementation as well as for functional utility." However, as Rothenberg-Pack (1978, 31) observes, "the model can provide inputs to the evaluation stage but cannot carry out evaluation…the model does not derive the "best" policy through an optimizing process; rather it simulates the impacts of prespecified policy mixes." Hence, its uses were more limited than its users and builders had hoped.

EMPIRIC is a spatially explicit model which assumes a study region subdivided into a number of zones. The model consists of three main elements or modules:

  1. the activity allocation model
  2. the forecast monitoring module, and
  3. the land consumption module.

The activity allocation model receives exogenous population and employment forecasts and distributes them to the subareas (districts) of the study region through a system of simultaneous equations. There is one equation for each of four household income classes (and a fifth group of unrelated individuals) and for each of four employment (industry) groups. The independent (explanatory) variables are:

  1. the base period activity levels (number of households in each income class and number of employees in each industry group)
  2. changes of these levels over the forecast period
  3. other base period characteristics of each zone such as distribution of land uses, densities, etc., and
  4. base period and forecast period values of policy variables (alternative forms of transportation accessibility and availability of water and sewer services).

The model consists of a system of nine simultaneous equations of the general form:

where,

Rik denotes change in activity i in zone k
Rnk denotes (simultaneous) changes in other activities by category in zone k
Rnk(t0) denotes the base year values of the activities in zone k
Cpk(t0) denotes base period characteristics of each zone k
Zmk(t0) denotes initial or base year values of policy variables in zone k
Zmk(t) denotes changes in the policy variables over the forecast period in zone k
t0 denotes the base year and
t1 denotes the forecast year

ain, bin, cip, dim, eim, fim are the regression coefficients estimated by fitting the model to available cross-sectional data, usually by means of Ordinary Least Squares or Two-Stage Least Squares, the latter technique being more appropriate to fitting simultaneous regression equations (Batty 1976).

The forecast monitoring module adjusts the initial unconstrained forecast activity allocations generated by the activity allocation model to be consistent with preset minimum and maximum activity constraints. Examples of such constraints which could be simulated include: (a) a fair share housing project which required a minimum number of low income households in each district or, (b) an urban renewal policy manifested in counter growth activity locations or, (c) location preferences of certain activities (Rothenberg-Pack 1978).

The land consumption module, finally, converts the adjusted activity allocations to land use requirements on the basis of activity-specific density calculations. Following Rothenberg-Pack (1978, 234), "each district is first allocated to an urbanization category based on the proportion of its area which is developed in the base period (four urbanization categories are defined ranging from urban core to fringe). Within each of the urbanization categories, permitted development densities for each activity land use category are specified, based upon the average densities observed in the base period for that type of district. If the land required to accommodate all of the population and employment activity allocated to the district would push the district into another urbanization category or exceeds the initially specified available land for development, then the urbanization category or the average density may be adjusted to reflect development pressures." This basic process may be subject to a number of discretionary policies or "overrides"; i.e. arbitrary changes in the data inputs to the module (the initial land uses, the amount of vacant land, permitted densities), or forced allocations to, or limits upon land for particular uses. Major instruments of land use control (i.e. control of land use change to achieve specific goals such as via zoning or land reservations), i.e. policy variables, are accommodated in the last two modules of EMPIRIC (Rothenberg-Pack 1978).
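
The core calculation of the land consumption module, dividing allocated activity by a permitted density and comparing the result with the land available, can be sketched as follows. This is an illustration of the arithmetic only, not a reconstruction of EMPIRIC's actual procedure; the function names, activity categories, densities and land figures are all hypothetical.

```python
def land_required(activity_levels, densities):
    """Land needed per activity = allocated activity / permitted density.

    activity_levels: e.g. households or employees allocated to the district
    densities: permitted units per hectare for the district's urbanization category
    """
    return {name: activity_levels[name] / densities[name] for name in activity_levels}

def check_district(activity_levels, densities, available_land):
    """Return land by activity, the total, and whether the total exceeds available land."""
    required = land_required(activity_levels, densities)
    total = sum(required.values())
    return required, total, total > available_land

# Hypothetical district in a "fringe" urbanization category.
allocated = {"households": 1200.0, "employees": 900.0}
permitted_density = {"households": 20.0, "employees": 45.0}  # units per hectare
required, total, exceeds = check_district(allocated, permitted_density, available_land=70.0)
```

When the flag is raised, the text above indicates that the urbanization category or the average density may be adjusted, or an "override" applied; those adjustments are discretionary and are therefore left outside the sketch.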

The version of EMPIRIC described above is the most commonly used in its various applications. It is a static model where the dynamics of the urban system modeled are implicit. In Batty’s (1976) words: "Although certain lags are built into the system, their explanation is also largely statistical and, as the dynamic process which these models are attempting to simulate is implicit, there are few guiding principles in the choice of the time interval. Furthermore, these models do not attempt to identify the mover pool and their equilibrium properties are unspecified" (Batty 1976, 299-300). EMPIRIC lacks an underlying theory, as is frequently the case with regression models. Batty (1976) notes: "One central problem with .. linear models revolves around their rather inductive bias in that the emphasis upon explanation is completely statistical and lacks little of the causal focus of the activity allocation models" (Batty 1976, 299-300). Rothenberg-Pack (1978) notes that in the activity allocation model itself, "the lack of a theoretical or behavioral base results in the differential specification of similar policy variables in ways which are very difficult to rationalize; moreover, their relationship to some activities and not to others is in many cases difficult to rationalize" (Rothenberg-Pack 1978, 244). Kain (1986) observes that: "relying heavily on the persistence of land use patterns, the model provides very little insight about the forces causing changes in metropolitan structure and is nearly worthless in situations where conditional forecasts are required" (Kain 1986, 850).

Focusing on the land consumption module which is of central interest in this work, policy-induced land use changes are estimated in a simplistic, mechanistic, linear, and additive way by means of activity coefficients without due consideration of nonlinearities in land consumption (more households may not consume proportionately more land), the suitability of available land for the forecast uses and the interactions among allocated forecast uses in each district (the positive or negative neighborhood effects of development in one zone on the adjacent zones). In other words, the lack of a theoretical framework for assessing future changes in land use as a function of autonomous or policy changes leaves much to be desired of EMPIRIC. An important point as regards the allowable "overrides" to the basic land use calculations is mentioned by Rothenberg-Pack (1978): "(their use)…..has important implications for the subsequent use of the estimated activity allocation model (AAM), since the AAM parameter values are very likely to be sensitive to the land use and zoning constraints obtaining during the calibration period" (Rothenberg-Pack 1978, 244). Similarly, the alleged uses of EMPIRIC for policy impact analyses have been seriously questioned on the above and on more detailed grounds.

Despite the various criticisms this model has received, it is one of the few urban models which had many applications in the "golden age of quantitative analysis" – the 1960s and 1970s. Cities to which the model has been applied include: Boston, Atlanta, Denver, Puget-Sound, Minneapolis-St. Paul, Washington, D.C. (for details see Rothenberg-Pack 1978; also, Kain 1986).


4.4. Spatial interaction models

The spatial interaction modeling tradition draws from the original efforts to model interaction of human activities in space based on the analogy of the Law of Gravity in Physics. This was one of the first manifestations of the application of the Social Physics theoretical approach mentioned in chapter 3. Hence, the models included in this group are the well known gravity-type models and their newer versions known more generally as spatial interaction models (Table 4.1b). These models have been used to model a variety of types of interactions arising out of a host of human activities such as the journey-to-work, shopping, circulation, and mobility, in general. As there are numerous accounts of these models in the literature whose emphasis varies with their purpose and the particular subject studied, the present discussion will be confined mostly to those aspects which bear more closely on how these models handle issues of land use and its change.

"Spatial interaction is a broad term encompassing any movement over space that results from a human process. It includes journey-to-work, migration, information and commodity flows…." (Haynes and Fotheringham 1984, 9). In a generic sense, the study of spatial interaction involves the study of both the interacting entities and the form of interaction between them. In the case of analysis of land use or spatial structure, the interacting entities are individuals residing or engaging in some activity (mostly work or shopping) in origin and destination zones which are characterized by corresponding types of land uses – e.g. residential areas, retail areas, employment areas. Although interaction results from the actions of individuals, i.e. from human activities, the description of these models is commonly worded in terms of interactions between different land use types (e.g. between residential and employment areas). These interactions take several forms such as journeys-to-work, shopping trips, flows of goods and information, etc.

Naturally and logically, the strength of interaction between land use types will depend on the magnitude and nature of the associated activity; hence, changes in activities (which are reflected also in changes in their interactions) may cause some kind of land use change – either qualitative (the amount of land occupied by a given use remains unchanged but its character and intensity change) or quantitative or both. The opposite is also true. Changes in land use may induce changes in the associated activities as well as in the interactions between them. Finally, changes in the ease of interaction between two areas – such as those brought about by changes in accessibility following transport network improvements – may induce changes in the interacting activities and the associated uses of land. This is broadly the rationale for considering spatial interaction models as land use change models. Spatial interaction models have been applied to residential location, retail location, and transportation analyses and they have been used also as components in integrated land use-transportation models (these latter versions are discussed in the section on integrated models.). In the following, the basic structure of these models is presented in a historical context and variations of the original formulation are examined.

According to Carrothers (1956), the origins of the use of the concept of gravitation to explain human spatial interactions are placed in the late 19th century in the work of H. C. Carey which was directly inspired by Newton’s Law of Universal Gravitation. Reilly’s Law of Retail Gravitation followed, modeled essentially on the same idea and applied to the case of retail trade between cities. Three other researchers, working on the gravitational formula from independent angles – Stewart (1948) on demographic gravitation, Zipf (1949) on the principle of least effort in human interaction, and Dodd (1950) on the interactance hypothesis for human groups – formulated the first versions of the gravity model applied to modeling socio-economic behavior (cited, among others, in Haynes and Fotheringham 1984, 16; Batten and Boyce 1986, 359ff).

Hansen (1959) proposed a first formulation of a gravity/potential model to predict the location of population in residential zones of an urban region. It is based on the assumption that accessibility to employment is the principal determinant of the location of population and it is concerned with "potential interaction" or relative accessibility of zones (Lee 1973). An accessibility index expresses the relationship between population location and employment:

$A_{ij} = \dfrac{E_j}{d_{ij}^{b}}$           (4.5)

where:

Aij the accessibility index of zone i in relation to zone j
Ej total employment in zone j
dij distance between i and j
b exponent of distance reflecting the "friction of distance" between i and j

The overall index for zone i is the sum of all individual indices (all other zones j):

$A_i = \sum_{j} A_{ij} = \sum_{j} \dfrac{E_j}{d_{ij}^{b}}$           (4.6)

Hansen introduced the notion of the "holding capacity" of a zone which is the amount of vacant land which is suitable for residential development. Combining the accessibility index with the holding capacity measure, the "development potential" of a zone is calculated which can be considered as a measure of attractiveness of a zone:

$D_i = A_i H_i$           (4.7)

Hansen suggested essentially that the share of total population growth which will be received by any one zone of the study region depends on its attractiveness in relation to all other competing zones. Hence, if the projected population growth at a future time t is Gt, then the allocation of this growth to each individual zone is given by the allocation formula:

$G_i = G_t \dfrac{D_i}{\sum_{k} D_k} = G_t \dfrac{A_i H_i}{\sum_{k} A_k H_k}$           (4.8)

This last formula provides a simple, "quick and dirty", way to calculate changes in the allocation of population to zones given changes in either holding capacity and/or the accessibility index of each zone. Evidently, despite the simplicity, spatial explicitness, and intuitive appeal of Hansen’s potential model, its lack of theoretical underpinnings, static nature, and the restricted number of ill-defined types of land uses which are considered (residential and employment areas) render it a very naïve model of land use change (if it can be considered as such). In addition, it is not a "complete" gravity model as its operational expression (equation 4.5) includes only one of the two interacting entities – the destination zones whose attractiveness is assessed.
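
The chain of calculations in Hansen's model (accessibility, development potential, allocation of growth) can be sketched in a few lines. The sketch assumes the standard forms A_i = sum over j of E_j / d_ij^b, D_i = A_i H_i, and a share of growth equal to D_i divided by the sum of all D_k; following the text, the sum runs over all other zones j. Function names and input values are hypothetical.

```python
def accessibility(employment, distance, b):
    """Overall accessibility index of each zone i: sum over other zones j of E_j / d_ij**b."""
    n = len(employment)
    return [
        sum(employment[j] / distance[i][j] ** b for j in range(n) if j != i)
        for i in range(n)
    ]

def hansen_allocation(total_growth, employment, distance, holding_capacity, b):
    """Allocate total growth to zones in proportion to development potential D_i = A_i * H_i."""
    access = accessibility(employment, distance, b)
    potential = [a * h for a, h in zip(access, holding_capacity)]
    total_potential = sum(potential)
    return [total_growth * p / total_potential for p in potential]

# Hypothetical two-zone region: equal employment, unequal vacant land.
employment = [100.0, 100.0]
distance = [[0.0, 2.0], [2.0, 0.0]]
holding_capacity = [1.0, 3.0]
allocation = hansen_allocation(1000.0, employment, distance, holding_capacity, b=1.0)
```

With equal accessibility in both zones, the growth is split purely by holding capacity, which makes visible how strongly the result depends on the exogenous estimate of vacant developable land.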

One of the restrictive assumptions of Hansen’s model, that of the equal desirability of the total supply of housing to all households independent of income levels, employment type, etc., was relaxed by Stouffer (1940, 1960) who proposed the intervening opportunities model in which the total supply of residences was stratified in housing submarkets. Stouffer argued that "the number of individuals (or families) Gr going a given distance r is directly proportional to the number of opportunities Qr (residences) at that distance, and inversely proportional to the number of ‘intervening opportunities’ Q" (Romanos 1976, 24). Stouffer’s idea was further improved by Schneider (1959 cited in Romanos 1976, 24) who developed an opportunity-accessibility model. In this model, "the distribution of population growth is a continuing evaluation of potential dwelling units which are rank-ordered from an urban center serving as the location of employment. These potential dwelling units are the opportunities and are obtained from the product of vacant land available for residential development times the appropriate population density. …Similar to Stouffer’s model, this formulation describes an allocation of households by starting their search from the center of the city and moving across rings containing the opportunities" (Romanos 1976, 24-25). For a brief mathematical exposition of this model the reader is referred to Wilson (1974, 397-399). Finally, another opportunity-accessibility model based on the intervening opportunities concept has been designed by Lathrop and Hamburg (1965) to allocate different activities to zones in a region. The model was tested in Upper New York State. For a brief mathematical exposition of this model the reader is referred to Batty (1976, 52-55).

Before moving to the contemporary forms of the spatial interaction models, a few comments are in order with respect to the first generation gravity models presented thus far from the perspective of the analysis of land use change. All models contain, however simple, some behavioral assumptions concerning the relationship between the location of households (residential land uses), the availability of land for development, the availability and location of jobs (industrial and commercial and other services areas), and accessibility. However, they address the problem of how future population will be allocated to zones given the amount of vacant land in each zone. Hence they do not address completely the issue of land use change; this is exogenous to the models. If the amount of available land changes, the models can assess its impacts on the distribution of population to zones but not vice versa. In addition, the models do not contain any equilibrium mechanism and, hence, they do not provide any guidance as to how the interactions of changes in population, employment opportunities and available, developable land will lead to particular spatial patterns.

A contemporary contributor to the spatial interaction modeling tradition is A.G. Wilson (see, for example, Wilson 1967, 1970, 1974, 1985) who avoids the term "gravity" and uses instead the term "spatial interaction models". Here we keep the term "gravity" as a convenient shorthand. Moreover, despite conceptual and operational modifications and improvements, all model versions are essentially similar to the original gravity formula. The gravity model assumes a study region subdivided into a number of zones which are called origin and destination zones. Origin zones are characterized by activities from which flows originate (e.g. residential areas where employees live) to reach destination zones (e.g. employment areas where the employees work). Each zone of the system can be both an origin and a destination zone. The simplest form of the gravity model which parallels the form of the corresponding model in Physics is the following:

$S_{ij} = k \dfrac{P_i P_j}{d_{ij}^{b}}$           (4.9)

where,

Sij denotes interaction (flow) from origin zone i to destination zone j
Pi is the "size" or "mass" of origin zone i
Pj is the "size" or "mass" of destination zone j
dij is a measure of distance between zones i and j
b an exponent indicating the effect of distance on the interaction between origin and destination zones
k a constant which is empirically determined and adjusts the relationship to actual conditions

The above formula states that the magnitude of the interaction between zone i and zone j, Sij, is proportional to the product of the "sizes" or "masses" of the origin and the destination zones and inversely proportional to a measure of the distance between them. Measures of the "interaction" term include number of trips between zones, volume of goods transported between zones, migration flows, etc. The "sizes" or "masses" of the origin and the destination zones are operationalized variously depending on the application. In the more common applications of the model – in retail and residential location problems – the "size" of the origin zones is expressed by the population of these areas or the income of the population (a proxy of their purchasing power). The "size" of the destination zones is expressed as retail floorspace or revenues of retail stores or number of employees. Usually, it is taken to reflect the "attractiveness" of the destination zones and alternative, multidimensional measures for this term have been proposed in the literature (C. Lee 1973, Wilson 1974, Haynes and Fotheringham 1984).

The denominator of the formula contains the critical expression of the effect of distance on the interaction between origin and destination zones. This is variously known as "friction of space", "impedance effect of distance", "friction against movement", and so on. The literature contains an extensive discussion of the distance function as regards: (a) alternative ways to operationalize the concept of distance in other than metric units – such as in terms of cost, time spent on commuting between origin and destination zones, multidimensional measures combining time, money, and effort spent in commuting between zones, (b) the values of the exponent of the distance function, known also as the "distance decay parameter" – which varies with the purpose of the interaction (e.g. trip purpose) as well as with distance itself and (c) the use of other functional forms of the distance function instead of the one shown above. As to the latter issue, Wilson (1969) suggested a negative exponential function – $f(d_{ij}) = \exp(-\beta d_{ij})$, with β a decay parameter – which reflects the fact that the exponent (i.e. the magnitude of the effect of distance) varies with distance (C. Lee 1973, Wilson 1970, Wilson 1974, Cliff et al. 1974, Cadwallader 1976, Haynes and Fotheringham 1984).
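
The basic gravity formula and the two distance functions discussed above can be compared with a short sketch. It assumes the simple form S_ij = k P_i P_j f(d_ij), with f either a power function d^(-b) or a negative exponential exp(-beta d); the function names, the constant k and all zone data are hypothetical.

```python
import math

def power_decay(b):
    """Distance function f(d) = d**(-b), as in the classical gravity formula."""
    return lambda d: d ** (-b)

def exponential_decay(beta):
    """Distance function f(d) = exp(-beta * d), the negative exponential alternative."""
    return lambda d: math.exp(-beta * d)

def gravity_flows(origin_mass, destination_mass, distance, k, decay):
    """Unconstrained gravity model: S_ij = k * P_i * P_j * f(d_ij)."""
    return [
        [k * p_i * p_j * decay(distance[i][j]) for j, p_j in enumerate(destination_mass)]
        for i, p_i in enumerate(origin_mass)
    ]

# Hypothetical data: two origin zones (population) and two destination zones (jobs).
origin_mass = [10.0, 40.0]
destination_mass = [20.0, 5.0]
distance = [[2.0, 4.0], [4.0, 2.0]]

power_flows = gravity_flows(origin_mass, destination_mass, distance, 0.5, power_decay(2.0))
exp_flows = gravity_flows(origin_mass, destination_mass, distance, 0.5, exponential_decay(0.7))
```

Passing the distance function as an argument keeps the model form fixed while the impedance specification is varied, which mirrors how the literature treats the choice of functional form as a separate calibration question.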

An alternative, simple form of the model shown in equation (4.9) is the following:

$S_{ij} = k\, O_i D_j f(d_{ij})$           (4.10)

where,

Sij, dij and k are defined as above
Oi corresponds to Pi above (O standing for Origins)
Dj corresponds to Pj above (D standing for Destinations)
f(dij) a general symbol for the distance function

Sometimes the origin and destination terms are raised to some power (exponentiated) to reflect the difference in importance of the "masses" of origins and destinations. Oi can be considered as the total "production" of interaction flows out of zone i and Dj the "attraction" of flows by zone j (Wilson 1974). The above, classical form of the gravity model does not ensure that the aggregate flows modeled will sum to the total flows observed in the study region. This is called the additivity condition and it can be expressed mathematically as:

$\sum_{j} S_{ij} = O_i$           (4.11)

$\sum_{i} S_{ij} = D_j$           (4.12)

Drawing on the above, a form of the gravity model which satisfies the additivity condition for the flows of both the origin and the destination zones is the following:

Sij = Ai Bj Oi Dj f(dij)                     (4.13)

where,

Ai = 1 / [Σj Bj Dj f(dij)]                     (4.14)

Bj = 1 / [Σi Ai Oi f(dij)]                     (4.15)

Based on equation (4.13), four alternative forms of the gravity formulation can be distinguished depending on whether information on the interaction sums Oi and/or Dj is available. When either one or both are not known, the unknown Oi and/or Dj are replaced by "attractiveness" terms Wi and Wj respectively (Wilson 1974). The attractiveness terms Wi and Wj can be operationalized in various ways. Common measures are the amount of housing available in an origin zone (perhaps of a given quality) for Wi and the number of jobs in a destination zone for Wj. The four forms of the gravity model are:

(a) unconstrained – neither Oi nor Dj is given. In this case the model takes the form of equation (4.10) where Wi replaces Oi and Wj replaces Dj as follows:

Sij = k Wi Wj f(dij)      (4.16)

(b) production-constrained – Oi is given but not Dj. In this case the model takes the form:

Sij = Ai Oi Wj f(dij)          (4.17)

where,

Ai = 1 / [Σj Wj f(dij)]           (4.18)

(c) attraction-constrained – Dj is given but not Oi. In this case the model takes the form:

Sij = Bj Wi Dj f(dij)           (4.19)

where,

Bj = 1 / [Σi Wi f(dij)]           (4.20)

(d) production-attraction-constrained (or, doubly-constrained) when both Oi and Dj are known. In this case the model takes the form of equation (4.13) and Ai and Bj are given by expressions (4.14) and (4.15).
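The constrained forms can be sketched computationally as follows. This is an illustration under stated assumptions, not the procedure of any particular study: the balancing factors are taken in their usual textbook form (Ai as the reciprocal of the sum over j of Bj Dj f(dij), and Bj as the reciprocal of the sum over i of Ai Oi f(dij)), they are solved by simple alternating iteration, the distance function is an assumed negative exponential, and all data are hypothetical. The doubly-constrained case requires that the Oi and the Dj add up to the same total.

```python
import math

def production_constrained(O, W, cost, f):
    """S_ij = A_i O_i W_j f(c_ij), with A_i = 1 / sum_j W_j f(c_ij)."""
    flows = []
    for i, O_i in enumerate(O):
        weights = [W[j] * f(cost[i][j]) for j in range(len(W))]
        A_i = 1.0 / sum(weights)
        flows.append([A_i * O_i * w for w in weights])
    return flows

def doubly_constrained(O, D, cost, f, max_iter=500, tol=1e-12):
    """S_ij = A_i B_j O_i D_j f(c_ij); A and B found by alternating updates."""
    n, m = len(O), len(D)
    F = [[f(cost[i][j]) for j in range(m)] for i in range(n)]
    A, B = [1.0] * n, [1.0] * m
    for _ in range(max_iter):
        new_A = [1.0 / sum(B[j] * D[j] * F[i][j] for j in range(m))
                 for i in range(n)]
        new_B = [1.0 / sum(new_A[i] * O[i] * F[i][j] for i in range(n))
                 for j in range(m)]
        change = max(abs(x - y) for x, y in zip(A + B, new_A + new_B))
        A, B = new_A, new_B
        if change < tol:  # balancing factors have stopped moving
            break
    return [[A[i] * B[j] * O[i] * D[j] * F[i][j] for j in range(m)]
            for i in range(n)]

def f(c):
    # Assumed negative exponential distance function, beta = 0.5.
    return math.exp(-0.5 * c)

# Hypothetical data; sum(O) equals sum(D) as the doubly-constrained form needs.
O = [100.0, 200.0]
D = [120.0, 180.0]
W = [1.0, 2.0]          # assumed attractiveness of the two destination zones
cost = [[1.0, 2.0],
        [2.0, 1.0]]

S_prod = production_constrained(O, W, cost, f)
S_both = doubly_constrained(O, D, cost, f)
print([sum(row) for row in S_prod])          # row sums reproduce O
print([sum(row) for row in S_both])          # row sums approximate O
print([sum(col) for col in zip(*S_both)])    # column sums reproduce D
```

The attraction-constrained form is the mirror image of the production-constrained one, with the roles of origins and destinations exchanged.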

The above summary presentation of the basic gravity (or, more generally, spatial interaction) model is discussed below from the perspective of the analysis of land use change. The purpose of gravity models is basically: (a) to simulate the flows between origin and destination zones and (b) to predict these flows when changes in the origins and/or destinations occur and/or when the accessibility between origins and destinations changes (mostly through transportation network improvements). Another stated purpose of the model is the explanation of the interaction observed between origin and destination zones, but this issue will be covered below in the discussion of the model’s underlying theory. Land use change can, thus, be modeled as resulting from accessibility changes, changes in the destination zones and/or changes in the origin zones. For example, improved accessibility may lead to increases in residential land in certain zones and to decreases in residential land in some other zones. Changes in income of the population living in the origin zones may generate more flows towards the shopping areas and, hence, produce land use change in the destination zones (increase in shopping floorspace). Changes in the distribution of employment centers in the study region (in destination zones) may induce changes in the distribution of households in the origin zones which may translate into changes in the proportions of residential land in each zone. Moreover, these land use changes are assessed by taking into account the constraints on the availability of suitable land in each zone. It is noted that most of the applications and uses of the model refer to urban/metropolitan areas (and not to agricultural, forestry, or open space uses).
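The predictive use described above can be illustrated with a small scenario comparison. The sketch below assumes a production-constrained model with a negative exponential distance function; the zone data, the value of beta and the travel costs are hypothetical. It compares the flows attracted by two shopping zones before and after a transport improvement that lowers the cost of travel from origin zone 0 to shopping zone 1.

```python
import math

def destination_inflows(O, W, cost, beta):
    """Total flows attracted by each destination zone under a
    production-constrained model with exp(-beta * cost) deterrence."""
    inflow = [0.0] * len(W)
    for i, O_i in enumerate(O):
        weights = [W[j] * math.exp(-beta * cost[i][j]) for j in range(len(W))]
        total = sum(weights)
        for j, w in enumerate(weights):
            inflow[j] += O_i * w / total   # share of zone i's trips going to j
    return inflow

# Hypothetical data (illustration only).
O = [1000.0, 1500.0]            # shopping trips produced by two residential zones
W = [300.0, 300.0]              # attractiveness (e.g. floorspace) of two centers
beta = 0.1                      # assumed distance decay parameter
base_cost = [[10.0, 20.0],
             [20.0, 10.0]]
improved_cost = [[10.0, 12.0],  # link from zone 0 to center 1 is improved
                 [20.0, 10.0]]

before = destination_inflows(O, W, base_cost, beta)
after = destination_inflows(O, W, improved_cost, beta)
print(before)
print(after)
```

The total number of trips is unchanged, but center 1 gains trade at the expense of center 0. In an application, a shift of this kind would be translated into floorspace or land requirements outside the interaction model itself.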

Gravity models are spatially explicit, the degree of spatial representation they offer depending on the number of zones into which the study region is subdivided. There has been considerable debate about the proper number and shape of zones and the effects of the zoning system used on the results of the model (see, for example, Broadbent 1970, C. Lee 1973, Openshaw 1977, Wilson 1974, Batty 1976). The models are static or quasi-static (or, comparative static), at best, which means that they do not account for the dynamics which underlies the observed interactions. In terms of level of detail of the land uses considered as well as of the spatial behavior modeled, the most common forms of gravity models concern two main types of land use – e.g. residential and commercial, residential and employment, residential and recreation. However, to make the gravity model more sensitive to the real world variability of human behavior, an important stream of research effort has been devoted to producing disaggregate versions of the models (depending on the availability of data). For example, residential (origin) areas are disaggregated by income group or types/prices of housing; employment (destination) areas are disaggregated by different wage levels, and types of products; and, interaction has been disaggregated by various modes of transport, trip purposes, and stages (C. Lee 1973, Wilson 1974, Gordon and Pitfield 1982, Batten and Boyce 1986). In the same spirit of improving the ability of the model to replicate real world situations, various model versions employ different expressions for the attractiveness of a destination region, different measures of the "distance" term, different configurations of the transport system, etc. (Wilson 1974, Caldwallader 1975, 1976, Batten and Boyce 1986, Haynes and Fotheringham 1984).

In terms of underlying theory, the gravity model has received heavy criticism as it reflects a social physics conception of human behavior in analogy to the Newtonian physics prototype model and lacks a grounding in theories of urban (or any other regional, environmental) system behavior (for example, see C. Lee 1973, D. D. B. Lee 1973, Romanos 1976, Sayer 1976, 1979a, 1979b). In other words, it represents a mechanistic and deterministic view of aggregate human behavior interpreted according to the laws governing the motion of particles. It has been argued that this model simply represents and reproduces empirical regularities and does not provide a theoretical explanation of the factors accounting for interaction in addition to accessibility (Romanos 1976, Sayer 1976, 1979a, 1979b). Several attempts have been made to remove the analogy with Physics and provide alternative bases for the explanations offered by the gravity model. Wilson (1967, 1970) derived the gravity model starting from concepts of statistical mechanics and applying entropy maximizing principles which draw from the Second Law of Thermodynamics (the Entropy Law). Entropy measures the probability of a system being in a particular state. The entropy of a system is proportional to the number of assignments which correspond to a particular state. The entropy maximizing procedure Wilson developed seeks to reveal the most probable state (of interaction) of the urban system which corresponds to the largest number of possible (observed) microstates (Batten and Boyce 1986). In this way Wilson arrived at the same operational form of the gravity model, avoiding the problem of aggregation by starting at the macro-level rather than at the micro-level (Haynes and Fotheringham 1984). Another interpretation of the entropy approach is that it offers a measure of uncertainty or lack of information in the system and, hence, the model can be cast in probabilistic form. 
However, even the entropy-based derivation of the model does not avoid the analogy with Physics (the Law of Entropy is a law from Physics); it simply offers statistical explanations and ignores the body of social theories which explain particular spatial interaction phenomena (although there are arguments to the contrary; see, for example, Batten and Boyce 1986). Drawing on this original effort, several other efforts ensued to refine the theoretical basis of the model along the same entropy-maximizing lines (see, for example, Batten and Boyce 1986).
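The logic of the entropy-maximizing derivation can be summarized in a compact form. The statement below is a sketch of the usual textbook formulation and not a reproduction of Wilson's original notation: the total travel expenditure C is an assumed symbol, β is the Lagrange multiplier attached to the cost constraint, and the balancing factors absorb the multipliers of the origin and destination constraints.

```latex
\begin{aligned}
\max_{\{S_{ij}\}} \quad & -\sum_{i}\sum_{j} S_{ij}\,\bigl(\ln S_{ij} - 1\bigr) \\
\text{subject to} \quad & \sum_{j} S_{ij} = O_i, \qquad
                          \sum_{i} S_{ij} = D_j, \qquad
                          \sum_{i}\sum_{j} S_{ij}\, d_{ij} = C \\
\text{first-order conditions:} \quad & -\ln S_{ij} - \lambda_i - \mu_j - \beta\, d_{ij} = 0 \\
\text{hence} \quad & S_{ij} = \exp(-\lambda_i)\,\exp(-\mu_j)\,\exp(-\beta\, d_{ij})
                     = A_i B_j O_i D_j \exp(-\beta\, d_{ij}), \\
\text{with} \quad & A_i O_i = \exp(-\lambda_i), \qquad B_j D_j = \exp(-\mu_j)
\end{aligned}
```

The negative exponential distance function thus emerges from the cost constraint rather than being imposed beforehand.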

Other scholars attempted to derive the gravity model on the basis of economic principles of utility maximization (Niedercorn and Bechdolt 1969, Golob and Beckmann 1971 cited in Batten and Boyce 1986, 372, Anas 1983). Employing the theory of consumer behavior, an optimal allocation of origins to destinations is obtained by postulating a utility function which reflects the relative preferences of people at the origin zones for the attributes of the destination zones. Assuming a collective preference (utility) function, the gravity model form is obtained by maximizing this function subject to a budget constraint (Batten and Boyce 1986, 372). However, as Haynes and Fotheringham (1984) note, this derivation ran into the problem of applying individual level explanations of behavior to a model which describes aggregate outcomes. Other avenues for deriving the gravity model in an effort to refine its explanatory capability are described in Batten and Boyce (1986) among others.
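The structure of such a utility-based derivation can be illustrated with a simple worked case. The logarithmic form of the utility function, the budget M_i available to origin zone i for interaction, and the cost r per unit of distance are assumptions made for this illustration only; the cited derivations consider other specifications as well.

```latex
\begin{aligned}
\max_{\{S_{ij}\}} \quad & U_i = \sum_{j} D_j \ln S_{ij}
   \qquad \text{subject to} \qquad r \sum_{j} d_{ij}\, S_{ij} = M_i \\
\text{first-order conditions:} \quad & \frac{D_j}{S_{ij}} = \lambda\, r\, d_{ij}
   \;\Rightarrow\; S_{ij} = \frac{D_j}{\lambda\, r\, d_{ij}} \\
\text{budget constraint:} \quad & \sum_{j} \frac{D_j}{\lambda} = M_i
   \;\Rightarrow\; \lambda = \frac{\sum_{k} D_k}{M_i} \\
\text{hence} \quad & S_{ij} = \frac{M_i}{r}\cdot\frac{D_j}{\sum_{k} D_k}\cdot\frac{1}{d_{ij}}
\end{aligned}
```

The result has the gravity form: an origin term, a destination term and an inverse function of distance, here with a distance exponent of one.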

In the light of the land use theories presented in Chapter 3, the gravity formulation appears to miss several of the determinants of land use and its change. The diversity and multiplicity of forces which come into play and shape land use patterns is drastically reduced as the model applies a predetermined functional form to replicate the revealed (observed) patterns (in modeling jargon, the model is fit to the data). In this way, it collapses the intricate web of causal processes into the neutral explanatory mold of "interaction", masking, thus, the real underlying causal mechanisms of urban spatial structure and change. It ignores the contingent nature of the observed interactions as these are influenced by the particular spatial structure of the study area as well as by the socio-economic, institutional, political and bio-physical forces at play. As Sayer (1979a) has argued these models mistake the mechanisms of change for the effects of change (the interactions and the land use patterns modeled). Evidently, they are a-historical models not simply because they are not dynamic but because they ignore the historical circumstances within which land use decisions are made and changed. As Sayer (1979a, 857) puts it: "(the models) turn development and history into something that happens to us, rather than something we make."

Another point to be noted is that, given that they are models of aggregate behavior whose performance has proved to be satisfactory at high levels of spatial aggregation, their application to disaggregate data casts even more doubt on their suitability as operational devices to model human behavior. At lower levels, the diversity of human behavior and of the physical setting of human activities is much higher than at the macro-level. Similarly, the explanatory factors and mechanisms of change are much more variegated, idiosyncratic and, in general, different (at least in importance and priority) from those which are valid at higher levels. The operational form of the model which applies to the aggregate level may not be suitable to particular household groups, industrial sectors (hence, land use types), modes of transport, not to mention socio-cultural and environmental settings other than those of the urban areas of the industrialized countries where most of its applications are made. At disaggregate levels, a variety of social theories exist to analyze and explain meaningfully human behavior which the gravity formulation simply ignores and, hence, cannot accommodate in its overall structure. This accounts for the weak and unsatisfactory explanatory ability of the model.

In terms of specification, despite the efforts to disaggregate the characteristics of the origin and destination zones as well as the modes of interaction between them (depending on the application), the model is restricted to representing the interaction between one pair of land uses at a time; hence, it cannot provide an overall picture of the web of interactions among different types of land uses at any point in time. As regards its disaggregate versions, it is observed that, in addition to the theoretical problems mentioned above, the models do not seem to take into account the interactions between the disaggregate groupings (e.g. between the location decisions of high, middle and low income groups) – the results of the disaggregate versions are simply added to obtain the final numbers of flows and the zonal distribution of whatever activity is being modeled. The policy variables which can be introduced in the context of the model’s use for impact assessment are restricted to the land use types being represented. One of the heaviest uses of the model (especially when used in the context of integrated models which are discussed in a separate section below) concerns the impacts of policy intervention in transportation, the impacts of new retail centers and improvements in residential areas. Constraints on availability of land (or, housing units) within each zone can be introduced. In this way, the model provides an avenue for simulating the impacts of various factors which in one way or another impinge on the availability of land which is suitable for particular purposes (e.g. environmental deterioration).

Despite efforts to improve the functional form of the distance function, the overall results are conditioned by the particular (multiplicative) formulation of the model. For the model to be used in impact assessment or for obtaining conditional predictions of future land use patterns, it has first to be fit (calibrated) to actual data. The model is then used to obtain forecasts using the estimated coefficients under the assumption that they will remain constant and will be the same in the projected date in the future. This is a very restrictive assumption as it implies that the same socio-economic, environmental and other conditions which gave rise to the observed data used to calibrate the model will not change and will apply when changes in the urban or regional system are introduced. This runs counter to the logic of policy interventions which is exactly to change the existing conditions and forms of behavior to achieve better (e.g. sustainable) land use patterns and forms of interaction.
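The calibration step can be sketched as follows. This is one simple approach among several used in practice, shown under stated assumptions: a production-constrained model with a negative exponential distance function, a single parameter beta, and the observed mean trip cost as the statistic to be reproduced. Bisection is used because, in this model, the modeled mean trip cost falls as beta rises. All data are hypothetical, and the "observed" mean is generated from a known beta so that the recovery can be checked.

```python
import math

def mean_trip_cost(O, W, cost, beta):
    """Flow-weighted mean travel cost of a production-constrained model."""
    total_flow = 0.0
    total_cost = 0.0
    for i, O_i in enumerate(O):
        weights = [W[j] * math.exp(-beta * cost[i][j]) for j in range(len(W))]
        denom = sum(weights)
        for j, w in enumerate(weights):
            flow = O_i * w / denom
            total_flow += flow
            total_cost += flow * cost[i][j]
    return total_cost / total_flow

def calibrate_beta(O, W, cost, observed_mean, lo=0.0, hi=5.0, steps=100):
    """Bisection search for the beta whose modeled mean cost matches the data."""
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if mean_trip_cost(O, W, cost, mid) > observed_mean:
            lo = mid    # modeled trips too long: distance decay must be stronger
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical data (illustration only).
O = [1000.0, 1500.0]
W = [300.0, 500.0]
cost = [[5.0, 15.0],
        [12.0, 6.0]]

true_beta = 0.3
observed_mean = mean_trip_cost(O, W, cost, true_beta)  # stands in for survey data
beta_hat = calibrate_beta(O, W, cost, observed_mean)
print(beta_hat)
```

A forecast then reuses beta_hat with changed O, W or cost values, which is precisely where the assumption of constant coefficients enters.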

The gravity model is data hungry especially in its disaggregated versions. Given that it is spatially explicit, the higher the level of spatial and land use detail required, the greater the demand for data which may be difficult to meet except in exceptional cases of complete record keeping systems. Otherwise, many data requirements may be compromised, proxy variables may be used and the results may not be those anticipated by the original modeling intent.

The "family of spatial interaction models", an expression commonly used to denote the basic model and its variants, has found numerous applications in various thematic areas – retail trade, market area and commodity flow analysis, transportation analysis and planning, residential location, migration, tourism and recreation analysis, at various spatial levels – urban, interurban, regional, interregional, and by various types of public and private agencies (e.g. transportation planning ministries and boards, planning agencies). The use of these models for the analysis of land use change is rather secondary and indirect compared to the other thematic uses. An exception is their use in integrated land use-transportation models where the distribution of land uses and of transportation flows are analyzed simultaneously with the main purpose of assessing the impacts of transport/accessibility on land use (see the section on integrated models).

Closing this section on spatial interaction models, a brief evaluation of the ability of these models to deal comprehensively with analysis of land use change is undertaken. Spatial interaction models can deal with only two land uses at a time; hence, their capacity to cover the complete pattern of land uses in urban, rural and other regional contexts appears to be limited. The point is that the incidence of concentrations of particular land use types in certain zones which are included in the model (e.g. residential areas) may be related to other uses which are present in these zones but which are not represented in the model. The land use changes which result from accessibility and other changes discussed before in the context of the gravity model may be modified and conditioned by the presence of these other uses.

Most important, however, is the manner in which these models conceptualize land, land use and its change, which relates to their theoretical basis. Although they are spatially explicit and account for the distribution of land uses in the zones of the study region, they reduce the modeled land use activity to the center of each zone, thereby disregarding the actual variability of land use intensity within the zone which may affect the resulting changes in important ways. The influence of the shape and number of the zones, and of the distribution of land uses within each zone, on the results of the spatial interaction models has been examined since the early years of their contemporary evolution (Openshaw 1977). It appears questionable, therefore, if these models can represent satisfactorily extensive land uses such as agriculture and forestry especially when the purpose of the modeling exercise is to obtain spatially differentiated land use impacts as well as the environmental impacts of these land use changes. In fact, no applications of spatial interaction models to such uses are known to this author.

In addition, only one characteristic of the land use types modeled usually enters the analysis – e.g. population of the residential areas, or the income of the population, or the revenues of the shopping areas, or their floorspace. The many other environmental and socio-economic characteristics associated with these land uses (and, more importantly, with the land users) are not accounted for and the models base their explanations as well as their results on a very limited set of partial aspects of a few (and not necessarily the most important in all contexts and cases) determinants of land use change. Despite efforts to develop multidimensional measures of the origin and destination terms, these models do not capture, in general, the multidimensional character of land use and its change as Chapter 3 on theories of land use change attempted to reveal. In a broader perspective, their weak theoretical foundations and reductionist (deductive) mode of analysis deprive them of the ability to represent the complex web of interactions among the bio-physical and socio-economic drivers of land use change. As land use changes result from several causes other than those accounted for by the spatial interaction models, these models have a limited ability to address a host of other (policy) questions related to other determinants of land use change; for example, the land use impacts of climatic change. In this and similar situations, the only way to use these models to analyze the impacts of determinants other than those they account for directly is to assess, outside of the model, their impacts on the origin and/or destination zones and/or accessibility and then use these estimates in the model as usual. However, the question is whether it is theoretically sound and acceptable to manipulate these estimates by means of the functional form of the spatial interaction model. 
In conclusion, much more research focusing especially on broader conceptualizations of land use and its change is needed to examine if, how, and to what extent spatial interaction models can address the variety and multitude of questions related to land use change.