1. Definitional Issues
Among the first empirical or data-related questions to be settled in studying migration are: what is the appropriate time period for the analysis; what are the administrative units that make up origin and destination places of migration (for studies that involve areal units rather than individual-level observations); and is the focus on gross or net migration flows? In this subsection we discuss the time periods covered and the nature of the spatial or areal units analyzed. Interested readers are referred to the excellent evaluation of federal migration data in Isserman, Plane and McMillen (1982), which includes a discussion of data accuracy and timeliness issues.
Obviously, the longer the period over which migration flows are calculated, the larger the total number of migrants, everything else equal. Over any given weeklong period, fewer people will move between two places, or into and out of a place, than will move over a one-year period, for example.
In the United States, where migration data are somewhat difficult to come by because there is no national registry, migration analysts usually cannot determine the period of migration themselves or through the choice of experimental design. Instead, they are forced to work with whatever data they have available, and the time periods at which the data happen to have been collected. Usually this will be a secondary data source, and the data will not have been collected for the express purpose of studying migration. Examples include the decennial Census of Population and the Current Population Survey (CPS), which is administered each month to a random sample of respondents.
An interesting fact here is that the measured number of migrants increases with the length of the period considered, but the measured intensity of migration per unit of time declines as the period lengthens. The reason for this result is that, over a longer period, people may have died since the previous enumeration, they may have moved out and back in again (return migration) and hence not appear on the census rolls as migrants, or they may have moved multiple times while only one corresponding move was recorded (Isard 1976, 53).
1.2. Spatial Unit of Analysis for Migration
For present purposes, "movers" are all of those people who lived in a different house in the United States one year prior to the most recent survey. The U.S. Census Bureau defines "mobility status" (5) as follows: "The U.S. population is classified according to mobility status on the basis of a comparison between the place of residence of each individual at the time of the survey or census and the place of residence at a specified earlier date. Nonmovers are all persons who were living in the same house or apartment at the end of the period as at the beginning of the period. Movers are all persons who were living in a different house at the end of the period from that in which they were living at the beginning of the period."
The larger the spatial unit of analysis used to study migration, the smaller the number of migrants. In the extreme case, if we use the entire United States as our administrative boundary unit, then we will not measure any internal migration in a given year. In this case, it only makes sense to study international migration into or out of the country as a whole. One logical choice of spatial units is census regions of the country (the West, Midwest, Northeast and the South). Another is states within the regions; yet another is counties within states. The smaller the administrative unit that is chosen, the larger the number of migrants.
This inverse relationship between the size of the spatial unit of analysis and the number of migrants measured over any given period is evident in recent data collected in the Current Population Survey. In March 1997, there were about 262,976,000 people living in the United States. Of these, 219,585,000 people, or 83.5% of the total population, were considered "nonmovers": they were still living in the same house in 1997 as in 1996. A total of 42,088,000 people, or 16.0% of the population, were considered "movers." Another 1,303,000 people, or 0.5%, had moved into the United States from abroad. Some of these may have been U.S. citizens returning from overseas assignments for the government or private businesses.
Obviously, some of these movers may have moved across the street, while others moved across the country. To distinguish among these various groups, the government tabulates separate statistics according to where people moved in terms of various administrative units, which represent increasingly larger spatial units (Table 1).
| Table 1: General Mobility Statistics, the U.S., in Thousands (March 1997) | | |
| Category | Movers (thousands) | Share of all movers |
| Total movers | 42,088 | 100.0% |
| Moved within same county | 27,740 | 65.9% |
| Moved to a different county | 14,348 | 34.1% |
| Different county, same state | 7,960 | 18.9% |
| Different state | 6,389 | 15.2% |
| Different state, same region | 3,220 | 7.7% |
| Same region, same division | 1,905 | 4.5% |
| Same region, different division | 1,315 | 3.1% |
| Different state, different region | 3,168 | 7.5% |
| Source: U.S. Census Bureau, with author's calculations. | | |
Note the following important relationship in this table: Nearly twice as many people moved within their county as those who moved across a county line. Also, these are individual-level statistics; people in fact tend to move as family units if they have intact families, and as single individuals otherwise.
If the county is used as the cutoff criterion defining whether or not an individual has migrated, then 14.348 million individuals qualified as migrants over this period. If a state is used as the cutoff criterion, then only 6.389 million individuals qualified as migrants. If the nine census divisions (New England, Middle Atlantic, South Atlantic, East South Central, West South Central, East North Central, West North Central, Mountain and Pacific) are selected as the criterion, only 4.483 million individuals qualified (the 1,315,000 people who changed division within the same region plus the 3,168,000 who changed region). Last, if only the four census regions (reported above) are used as the criterion, a mere 3.168 million people qualified as migrants (a map of the United States showing census divisions and regions is available in any Statistical Abstract of the United States). Clearly, therefore, the stricter the requirement in terms of the administrative unit that has to be crossed before someone qualifies as a migrant, the smaller the number of migrants that is identified, everything else equal.
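The nesting of the boundary criteria can be checked directly from the rows of Table 1. The following is a minimal sketch in Python; the variable names are illustrative and not part of the original tabulation, and the state-level sum differs from the published 6,389 by one thousand because of rounding in the source table.

```python
# Table 1 figures, in thousands of movers (March 1997).
same_state_diff_county = 7_960
same_division_diff_state = 1_905
same_region_diff_division = 1_315
different_region = 3_168

# For each boundary, list the mover categories that cross it.
crossings = {
    "county": [same_state_diff_county, same_division_diff_state,
               same_region_diff_division, different_region],
    "state": [same_division_diff_state, same_region_diff_division,
              different_region],
    "division": [same_region_diff_division, different_region],
    "region": [different_region],
}

migrants = {unit: sum(parts) for unit, parts in crossings.items()}
for unit, count in migrants.items():
    print(f"{unit:>8}: {count:,} thousand migrants")
```

The counts fall monotonically as the boundary that must be crossed becomes larger, which is the point made in the text.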
Isard (1976, 53) pointed out that there are serious problems with using state administrative boundaries to study migration patterns, and this grows out of the problem just identified. In particular, states (and counties) have significantly different land areas: Texas has 267,277 square miles (Alaska has even more, 615,230), while Rhode Island has only 1,231 square miles, or 1/217th the size of Texas (Washington, D.C. consists of only 68 square miles). Likewise, the number of counties differs considerably across states, with Texas having 254 counties and Rhode Island only five, none of which has a functioning county government. These large discrepancies intrinsically tend to skew migration rates (the number of people migrating per beginning-of-period population) upward for the smaller places (states), and downward for the larger ones.
As Isard further suggests (1976, 53), "Ohio and Tennessee have approximately the same area; yet because of Tennessee's elongated shape many more short migrations cross the state line than they do for Ohio." Another concern arises from the population distribution within each state. In some states the major population centers are on the border of adjacent states, e.g., St. Louis, Missouri; Kansas City, Kansas; or Chicago, Illinois. In others, the centers are more toward the geographical center of the state, and thus the populations tend to be more isolated from the influences of adjacent states. Examples of such states are Colorado and California (with Denver and San Francisco).

A few techniques exist for estimating the amount of migration that has taken place in a previous period. It should be noted that these are estimates, since the United States does not keep population "registers," in which individuals are required to register themselves with a central agency within two weeks of moving. Such strict tracking of people's movements would be considered unacceptable in a country such as the United States, where individual freedoms are valued highly and no national identification system is used. In contrast, some European countries track the movements of their residents much more closely, which entails many advantages for studying migration behavior. For example, it is possible to cross-classify data in many different ways so as to provide virtually complete information about migration characteristics over space.
2.1. The Residual or Survival Method
With this method, the number of migrants into or out of a region over a specific period of time ($\theta$) is estimated from a straightforward accounting identity:

$$M_{\theta} = \mathrm{Pop}_{t+\theta} - \mathrm{Pop}_{t} - B_{\theta} + D_{\theta}$$

where $M_{\theta}$ is the change in population in the region due to migration, $\mathrm{Pop}_{t+\theta}$ is the population at the end of the period over which migration occurred and $\mathrm{Pop}_{t}$ the population at the beginning of that period. $B_{\theta}$ is the number of live births over this period, and $D_{\theta}$ is the number of deaths that have occurred. Note that both of the last two variables refer only to the population originally present in the community. We are abstracting from international migration, although it would be straightforward to include this in the calculation. Obviously, the calculated migration number can be positive, negative or even zero. An example of the use of this method is provided in the exercises and discussion questions at the end of this section.
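The accounting identity (net migration equals end-of-period population minus beginning-of-period population, minus births, plus deaths) can be sketched in a few lines of Python. The function name and the county figures below are hypothetical and serve only to illustrate the arithmetic.

```python
def residual_net_migration(pop_start, pop_end, births, deaths):
    """Net migration implied by the residual identity:
    M = pop_end - pop_start - births + deaths."""
    return pop_end - pop_start - births + deaths


# Hypothetical community: population grew by 2,500 while natural
# increase (births minus deaths) accounted for only 300 of that growth.
net = residual_net_migration(pop_start=50_000, pop_end=52_500,
                             births=1_200, deaths=900)
print(net)  # 2200: positive, so net in-migration
```

A negative result would indicate net out-migration, and zero would mean that in- and out-flows cancelled over the period.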
Notice that some noise or random error is introduced into this estimation procedure. For example, in most census enumerations it would be very costly to distinguish between births and deaths of the migrant population as distinct from the original population. Consequently, separate statistics are usually not compiled, and the result is an underestimation of the actual in- or out-migration that took place. In particular, a birth that occurred in a migrant family would be credited to the existing population and not count as migration. The larger the amount of migration taking place, relative to the resident population, the more serious this bias. One solution to dealing with this problem has been the use of death and birth rates obtained from other studies where migration did not take place. This, too, has shortcomings, since it is not known how transferable the rates are from the other studies and places (or populations).
Sometimes even rudimentary statistics on births and deaths are not available for a particular community. In such situations researchers have resorted to using expected survival rates, which are applied to actual population data to obtain estimates for natural population increases due to births and deaths. A fundamental problem with these accounting identities is that they do not provide any information on where people moved to, or where they came from.
2.2. Data from the Census of Population
The decennial U.S. census collects data on where people were born (nativity data) or where they resided in a prior year, such as five years ago. Using such statistics it is possible to calculate either net or gross migration flows into or out of a given state. For example, using the place of birth method, at the state level, we have information on where state residents were born, and how many survived since the last census was taken (say, $\theta$ years or months ago). Therefore, for each state we can calculate the number who were still living in the same state at time $t+\theta$, and the number living in each of the other 49 states. More specifically, a matrix can be calculated showing the flow of out-migrants to each of the other states (i.e., their migration destination) as well as the flow of in-migrants from all other states (i.e., their place of origin). This is the same procedure as that used in Section I for the four U.S. census regions. In this manner, if $\theta=10$ years (in a decennial census), migration flows can also be compared over time, that is, across census decades. Use of this method is problematic in areas where there are large numbers of individuals who were born in foreign countries. Usually this is more of a problem in metropolitan than in rural communities.
Another common method of estimating migration flows, the residence method, uses information on where the respondent lived in a prior year. Currently the question is phrased to cover a time lag of five years. This is then compared with data on where the individual currently resides. Note that this method fails to capture all persons less than five years of age or those who have died within the last five years, which again introduces a potential error into the migration estimate. As was true with the methods discussed in subsection 2.1., neither of the two methods discussed here can capture moves made in between the census periods, including return migrations.
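As a rough illustration of the residence method, the sketch below tallies an origin-destination table from pairs of (residence five years ago, current residence). The records are invented for illustration and the helper names are assumptions, not census terminology.

```python
from collections import Counter

# Hypothetical respondents: (state five years ago, current state).
records = [
    ("OH", "OH"), ("OH", "KY"), ("OH", "KY"),
    ("KY", "KY"), ("KY", "OH"), ("KY", "TN"),
    ("TN", "KY"), ("TN", "TN"),
]

# flows[(origin, destination)] = number of persons.
flows = Counter(records)


def gross_in(state):
    """Persons now in `state` who lived elsewhere five years ago."""
    return sum(n for (o, d), n in flows.items() if d == state and o != state)


def gross_out(state):
    """Persons who lived in `state` five years ago and now live elsewhere."""
    return sum(n for (o, d), n in flows.items() if o == state and d != state)


for s in ("KY", "OH", "TN"):
    print(s, gross_in(s), gross_out(s), gross_in(s) - gross_out(s))
```

Someone who left and returned within the five years appears on the diagonal, e.g. ("OH", "OH"), so such moves are invisible to this tally, as the text notes.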
2.3. Estimating the Components of Migration
Even with Census of Population data, it is possible to identify some of the characteristics of those who have migrated from one place to another. A precondition for this is that the underlying data contain sufficient detail, as illustrated with the next formula:
$$M_{j,\theta} = \mathrm{Pop}_{j,t+\theta} - \mathrm{Pop}_{j,t} - B_{j,\theta} + D_{j,\theta}$$

Each of these letters is as defined previously, except that we have introduced a subscript to denote one of $j=1,2,\ldots,J$ population characteristics. These may include such groups as income groups, age groups and racial groups. In the case of race, for example, $B_{j,\theta}$ might refer to births of whites for $j=1$, births of African Americans for $j=2$, births of Hispanics for $j=3$, and so forth. One common age-related component analysis uses five-year intervals of age, because data are often collected on this basis in census instruments.
This type of component or differential analysis has been used to study the characteristics of people leaving rural areas for urban areas (Baker 1933, 1936; Beck 1934) or the South (Greenwood 1985). The total migration flows are obtained by aggregating over each of the individual components.
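The aggregation over components can be sketched as follows, applying the residual identity to each group and summing the results. The group labels and all figures are hypothetical.

```python
# Hypothetical data for each population group j:
# (population at start, population at end, births, deaths).
groups = {
    "group 1": (10_000, 10_400, 300, 100),
    "group 2": (5_000, 4_900, 120, 80),
    "group 3": (2_000, 2_150, 60, 20),
}


def residual(pop_start, pop_end, births, deaths):
    """Residual identity applied to a single component."""
    return pop_end - pop_start - births + deaths


# Net migration for each component, then the aggregate flow.
net_by_group = {name: residual(*values) for name, values in groups.items()}
total_net = sum(net_by_group.values())

print(net_by_group)
print(total_net)
```

In this made-up case one group shows net out-migration while the total is positive, which is the kind of detail that the aggregate figure alone would hide.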
When the residual estimation method is used to obtain migration components, survival rates for different age and sex cohorts are employed. These rates can be obtained either from tables showing age distributions from the census for the given period, $\theta$, or from life tables compiled, for example, by insurance companies. The results on the expected number of persons by age and sex in a community are then compared with actual numbers enumerated at $t+\theta$ to arrive at the residual, which represents the migration component. Thus, we have:

$$M_{j,\theta} = \mathrm{Pop}_{j,t+\theta} - r_j \, \mathrm{Pop}_{j,t}$$

Here $r_j$ represents the survival rate for the $j$th component of population (e.g., the group of 25-30 year-olds).
These calculations represent a forward-projected survival rate, which fails to capture in- or out-migrants who subsequently died within the time span $\theta$, or migration of those younger than $\theta$ years, who had not yet been born at time $t$. To address this problem, a "reverse-projection" can be used with appropriate, component-specific survival rates, $r_j$.
To do this, start with the number of individuals in each cohort at time $t+\theta$, and divide this number by the component-specific survival rate $r_j$. This provides a backward-projected expected number of individuals at time period $t$. If the actual (enumerated) number at time $t$ is larger than the calculated number, net out-migration must have occurred over the period of analysis.
As an illustration, consider the group of 25-30 year olds, and suppose that rj=95% for this group. Suppose the enumerated number of individuals in 1990 is 1,000, and that it is 800 in the year 2000. Dividing the latter number by 0.95 yields 842 for 1990. In other words, this is the theoretically expected initial number of 25-30 year olds in 1990, given that the survival rate for this particular cohort is 95%. Given that the enumerated number is 1,000, however, we know that (1,000 - 842=) 158 individuals must have migrated out of the area on balance (i.e., this is a net out-migration number).
| Illustration of the backward-projection method | | |
| | Year 1990 | Year 2000 |
| Actual population | 1,000 | 800 |
| Expected survival rate | 95% | |
| Expected population | 800/0.95 = 842 (backward-projected) | |
| Migration component | 1,000 - 842 = 158 | |
As Isard points out (1976, 62), "[a]lthough the forward method implies that no persons who died during period ($\theta$) have migrated, the reverse method implies that all those in the estimating cohort who died during period ($\theta$) are migrants." Thus, one estimate provides an upward-biased estimate, while the other is a downward-biased estimate. Together, these estimates present bounds on what is likely to have happened in reality, and they should be reported as such. Alternatively, a simple average of the two numbers could be reported.
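The two survival-rate estimates and their average can be sketched with the figures from the illustration (1,000 persons in 1990, 800 in 2000, survival rate of 95%). The function names are illustrative, and the sign convention (negative values mean net out-migration) is an assumption made here for the sketch.

```python
def forward_estimate(pop_start, pop_end, survival_rate):
    """Actual end population minus expected survivors of the start population."""
    return pop_end - survival_rate * pop_start


def reverse_estimate(pop_start, pop_end, survival_rate):
    """Backward-projected start population minus actual start population."""
    return pop_end / survival_rate - pop_start


pop_1990, pop_2000, r = 1_000, 800, 0.95

forward = forward_estimate(pop_1990, pop_2000, r)   # 800 - 950
reverse = reverse_estimate(pop_1990, pop_2000, r)   # 842.1 - 1,000
average = (forward + reverse) / 2

print(f"forward: {forward:.1f}, reverse: {reverse:.1f}, average: {average:.1f}")
```

Both estimates are negative, so both methods indicate net out-migration; the reverse figure corresponds to the 158 net out-migrants in the illustration, and the forward figure gives the other bound.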
3. Other Measures of Migration
So-called transition probabilities, the building blocks of Markov processes, are used by demographers to study population migration, and especially to make population projections or forecasts. In essence, these are migration rates, obtained by dividing the number of migrants by a base population, such as the population in the place of migration origin. Two other important measures of migration presented here, the gross out-migration rate and the gross in-migration rate, were briefly examined in Section II. Also discussed in Section II was the net migration rate, defined as the gross in-migration rate minus the gross out-migration rate. A related concept is that of migration efficiency, which is discussed below.
Suppose we are considering flows of migrants, $M_{ij}$, between places of origin $i=1,2,\ldots,I$ and all migration targets $j=1,2,\ldots,I$, over the period $t$ to $t+\theta$. The initial population, at time $t$, in place $i$ is equal to $\mathrm{Pop}_i$. Then we can define the transition probability for this population, or its propensity to migrate, $p_{ij}$, as:

$$p_{ij} = \frac{M_{ij}}{\mathrm{Pop}_i}$$
Note that we could also express birth and death rates on a per person basis, i.e., by dividing by the initial population of the region, to calculate net changes in population over time.
For the sake of simplicity, consider the four census regions of the country: the Northeast, Midwest, West and South. Then we can calculate the propensity of individuals to migrate from any one region to the other three regions using a matrix of transition propensities, $P$. Notice that the diagonal elements of this matrix, $p_{ii}$, are simply the share of the population in each region who do not migrate but survive from the period $t$ to $t+\theta$ (the nonmovers). The off-diagonal elements are the $p_{ij}$ values calculated as shown in the above equation.
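A minimal sketch of how such a matrix might be built from a table of flows is given below, dividing each flow from origin i to destination j by the initial population of i. The three places and all counts are hypothetical, and deaths are ignored so that each origin's initial population equals its row total.

```python
places = ["A", "B", "C"]

# Hypothetical flows: flows[i][j] = persons in place i at time t who are
# in place j at the end of the period (the diagonal holds the nonmovers).
flows = [
    [900, 60, 40],
    [30, 950, 20],
    [10, 40, 450],
]

# Initial population of each origin is its row total (no deaths assumed).
initial_pop = [sum(row) for row in flows]

# Transition propensities: each flow divided by the origin's population.
P = [[m / pop for m in row] for row, pop in zip(flows, initial_pop)]

for name, row in zip(places, P):
    print(name, [round(p, 3) for p in row])
```

Each row sums to one by construction, and the diagonal entries are the shares of nonmovers.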
| Table 2: Markov Transition Matrix for U.S. Census Regions, 1996 | | | | | |
| Population | From/To | Northeast | Midwest | West | South |
| 51,580,000 | Northeast | 0.98691 | 0.00246 | 0.00353 | 0.00701 |
| 62,082,000 | Midwest | 0.00190 | 0.98752 | 0.00335 | 0.00723 |
| 58,523,000 | West | 0.00121 | 0.00402 | 0.98676 | 0.00801 |
| 93,098,000 | South | 0.00271 | 0.00516 | 0.00431 | 0.98783 |
| Source: Authors' calculations. | | | | | |
Each row of the matrix in Table 2 represents one of the four regions as a migration origin, and the four shaded columns show how the population from that origin is distributed across the destination regions. This is similar to a make- and use-matrix familiar from input-output analysis. Because of the way probabilities are defined, all of the elements of the matrix are less than or equal to 1. If one of the pij's or pii's were equal to 1, that would mean that the region in question lost all of its population to other regions, or that it lost none of its original population (pii=1). Note also that the rows of the P matrix have to sum to 1, to account for or allocate all of the population correctly.
Now, we need a 1x4 vector of the initial population in each of the four regions, i.e., the population at time $t$. Multiplying this vector by the 4x4 square matrix returns another 1x4 vector, which represents the (post-migration) population in each of the four regions in the period $t+\theta$. This is a typical Markov transition process. If there is no net population growth, then we end up with the same number of people in the post-migration period, $t+\theta$, but this population is distributed in a new way across the four regions.
| From/To | Northeast | Midwest | West | South | Total population | |
| Original population | 51,580,000 | 62,082,000 | 58,523,000 | 93,098,000 | 265,283,000 | |
| Northeast | 0.98691 | 0.00246 | 0.00353 | 0.00701 | 1.00 | Row sum |
| Midwest | 0.00190 | 0.98752 | 0.00335 | 0.00723 | 1.00 | |
| West | 0.00121 | 0.00402 | 0.98676 | 0.00801 | 1.00 | |
| South | 0.00271 | 0.00516 | 0.00431 | 0.98783 | 1.00 | |
| Final population | 51,345,882 | 62,149,752 | 58,539,460 | 93,244,195 | 265,279,289 | |
| | | | | | | Difference* |
| Change in population | -234,118 | 67,752 | 16,460 | 146,195 | 230,407 | -3,711 |
| Source: Authors' calculations. | | | | | | |
In the above example, the row labeled "Original population" is the (transposed) initial population vector for the Northeast, Midwest, West and South, respectively (from the first column in Table 2). The 4x4 block below it contains the transition probabilities reported in Table 2. The row labeled "Final population" is the new, reallocated population after the migration transition has taken place.
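The vector-times-matrix step can be reproduced with the Table 2 figures. The sketch below uses plain Python lists rather than a matrix library; the helper name `project` is illustrative.

```python
regions = ["Northeast", "Midwest", "West", "South"]

# Initial population at time t (the 1x4 row vector from Table 2).
pop_t = [51_580_000, 62_082_000, 58_523_000, 93_098_000]

# Transition matrix from Table 2: row = origin, column = destination.
P = [
    [0.98691, 0.00246, 0.00353, 0.00701],
    [0.00190, 0.98752, 0.00335, 0.00723],
    [0.00121, 0.00402, 0.98676, 0.00801],
    [0.00271, 0.00516, 0.00431, 0.98783],
]


def project(pop, matrix):
    """Row vector times matrix: new_pop[j] = sum over i of pop[i] * matrix[i][j]."""
    n = len(pop)
    return [sum(pop[i] * matrix[i][j] for i in range(n)) for j in range(n)]


pop_next = project(pop_t, P)

for name, before, after in zip(regions, pop_t, pop_next):
    change = round(after) - before
    print(f"{name:<9} {before:>12,} -> {round(after):>12,} ({change:+,})")
```

Rounded to whole persons, the result matches the final-population row of the worked example; the small shortfall in the total arises because the published probabilities are rounded and the rows do not sum to exactly one.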
The migration propensities pij can be interpreted as the probability that individuals from a given region will migrate. Researchers study these propensities over time to see whether people are more or less likely to move, for example, over the business cycle. In fact, one might suspect that in downturns people are more likely to move, especially if the business cycle hits different regions of the country more or less severely and at different times. Also, the diagonal of the transition matrix contains interesting information in itself. By looking at the magnitudes of the numbers, we can infer which regions are the least likely to lose population. Presumably, people in these regions have the strongest ties to their local communities, or do not have better economic opportunities elsewhere.
3.2. Gross and Net Migration Rates
Using notation similar to that in Plane and Rogerson (1994, 95), let $M_{ij}$ denote the number of migrants leaving place $i$ and moving to place $j$, where $j$ runs from 1 to $n$. Since the number of places potentially receiving people has to equal the number of places potentially sending people, the counter for $i$ also runs from 1 to $n$. In fact, our migration story can be made visual by imagining an $n \times n$ matrix placed into a coordinate system with places $i$ on one axis and places $j$ measured on the other axis. The 45-degree line would consist of zeros since a place cannot receive migrants from itself. However, it can receive migrants from any other place, and it can send emigrants to any other place in this system.
We can then define gross out-migration (GOM) from place $h$ to all other places except for $h$ as follows:

$$GOM_h = \sum_{j \neq h} M_{hj}$$

and, likewise, define gross in-migration (GIM) from all $n-1$ other places as:

$$GIM_h = \sum_{i \neq h} M_{ih}$$

Now, we can define net migration into or out of a given place, $h$:

$$NM_h = GIM_h - GOM_h$$
Clearly, NMh is positive if a place receives more migrants from other places than it loses to those places, and NMh is negative if a place receives fewer immigrants than it loses.
It is critical to define precisely what we mean by rates of in-, out- or net-migration. More specifically, it is important to choose carefully the population base with respect to which the rate is calculated, that is, the denominator. This is considered to be the population universe that is "at risk" of migrating. For the out-migration rate from community $i$, the existing population is clearly the appropriate one to use: $gomr_i = GOM_i / \mathrm{Pop}_i$. However, as Plane and Rogerson point out (1994, 97), when calculating rates of in-migration, the existing population is already in place and, therefore, does not qualify as being "at risk" of migrating into the region. Instead, the population outside the region should, theoretically, be used as the base. This detail is usually ignored in applied studies, and instead the same population is used in the rate calculation for in-migration as for out-migration: $gimr_i = GIM_i / \mathrm{Pop}_i$.
From this, it is straightforward to calculate the net migration rate:

$$nmr_i = gimr_i - gomr_i$$
Another concern is the point in time at which the base population is measured. Usually, the beginning period (pre-migration) is used, but a more accurate procedure would involve using the population at the midpoint of the period over which the migration flow is calculated.
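The gross and net measures can be sketched from a small flow matrix: gross out-migration is the row total for a place, gross in-migration is the column total, and each rate divides by that place's own beginning-of-period population, following the applied convention described above. The three places, the flows and the populations are hypothetical.

```python
# Hypothetical flows: M[i][j] = migrants from place i to place j.
# The diagonal is zero because a place cannot send migrants to itself.
M = [
    [0, 120, 80],
    [60, 0, 40],
    [30, 90, 0],
]
pop = [10_000, 8_000, 5_000]  # beginning-of-period populations
n = len(M)

gom = [sum(M[h][j] for j in range(n) if j != h) for h in range(n)]  # row totals
gim = [sum(M[i][h] for i in range(n) if i != h) for h in range(n)]  # column totals
nm = [gim[h] - gom[h] for h in range(n)]

gomr = [gom[h] / pop[h] for h in range(n)]
gimr = [gim[h] / pop[h] for h in range(n)]
nmr = [gimr[h] - gomr[h] for h in range(n)]

for h in range(n):
    print(h, gom[h], gim[h], nm[h], round(nmr[h], 4))
```

In a closed system such as this one, the net migration figures sum to zero across places, since every out-migrant from one place is an in-migrant somewhere else.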
The concept of migration efficiency has been used to examine migration turnover in different countries or states, or the same countries (states) over time. Using the concepts of gross out- and in-migration for a community i defined earlier, i.e., GIMi and GOMi, migration efficiency for the community is defined as:
$$ME_i = 100 \times \frac{GIM_i - GOM_i}{GIM_i + GOM_i}$$
If the number of out-migrants is exactly equal to the number of in-migrants, then the migration efficiency is zero. Theoretically, the migration efficiency would be -100 if there were only out-migration from a community, and +100 if there were only in-migration. In order to compare communities with one another, absolute values of migration efficiency are used.
One way of understanding how this measure works is as follows: Each state has a net migration stream (in or out) over a given period, which is generated by outflows and inflows of migrants. The question is, how many in- and out-migrants are needed to generate this net flow? Think of the in-migrants as replacing the out-migrants. The larger the total number of migrants relative to the net number, the less efficient is the state in terms of migration. In other words, the total number of (in- and out-) migrants needed per net migrant is larger in less efficient states.
| Example: Comparing Minnesota and New York | |
| Migration efficiency for MN | 1:146.05 |
| Migration efficiency for NY | 1:2.77 |
| NY is much more efficient than Minnesota: 146 total migrants are needed in MN to generate one net migrant, whereas in NY only 3 (2.77) total migrants are needed per net migrant. | |
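The efficiency measure and the "total migrants per net migrant" ratio can be sketched as follows. The formula used here (net flow as a percentage of total flow) is an assumption chosen to match the properties stated in the text: zero when in- and out-flows are equal, and -100 or +100 when the flow runs in only one direction. The flow figures are hypothetical.

```python
def migration_efficiency(gim, gom):
    """Net migration as a percentage of total (in plus out) migration."""
    return 100 * (gim - gom) / (gim + gom)


def migrants_per_net_migrant(gim, gom):
    """Total migrants needed to generate one net migrant.
    Undefined (division by zero) when in- and out-flows are equal."""
    return (gim + gom) / abs(gim - gom)


# Hypothetical community with 600 in-migrants and 400 out-migrants.
eff = migration_efficiency(gim=600, gom=400)
ratio = migrants_per_net_migrant(gim=600, gom=400)
print(eff, ratio)  # 20.0 5.0

# The ratio is the reciprocal of the absolute efficiency expressed as a
# share, so an efficiency of 36.1% corresponds to about 2.77 migrants.
print(round(100 / 36.1, 2))
```

The last line reproduces the 1:2.77 figure quoted for New York from its 36.1% efficiency.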
Migration efficiencies calculated for each state for the period 1985-90, and converted to absolute values, reveal that the state of New York has the highest efficiency of all states: 36.1% (Table 3). The next-highest state is Nevada, with a migration efficiency of 35.9%. Note that one of these highest-ranked states had a net outflow of population, while the other experienced a net inflow. The state with the third-highest population turnover due to net migration was Louisiana, with 35.7%.
| Table 3: Migration Efficiency by State and Rank Based on Absolute Value, 1985-1990 | | | | | | | |
| State | Net Migration | Migration efficiency | Rank | State | Net Migration | Migration efficiency | Rank |
| Alabama | 35,869 | 5.8 | 40 | Montana | (52,604) | 23.7 | 7 |
| Alaska | (48,485) | 18.7 | 17 | Nebraska | (39,950) | 12.4 | 29 |
| Arizona | 216,177 | 20.0 | 14 | Nevada | 172,852 | 35.9 | 2 |
| Arkansas | 24,247 | 5.3 | 43 | New Hampshire | 62,060 | 19.4 | 16 |
| California | 173,586 | 4.6 | 45 | New Jersey | (193,533) | 14.5 | 23 |
| Colorado | (77,998) | 7.7 | 36 | New Mexico | (11,457) | 2.9 | 49 |
| Connecticut | (51,843) | 8.2 | 35 | New York | (820,886) | 36.1 | 1 |
| Delaware | 25,881 | 15.9 | 19 | North Carolina | 280,882 | 23.1 | 9 |
| District of Columbia | (54,411) | 20.0 | 13 | North Dakota | (50,947) | 31.2 | 6 |
| Florida | 1,071,682 | 33.6 | 4 | Ohio | (141,179) | 10.2 | 32 |
| Georgia | 302,597 | 23.2 | 8 | Oklahoma | (127,760) | 18.6 | 18 |
| Hawaii | (20,256) | 5.7 | 41 | Oregon | 82,572 | 12.8 | 26 |
| Idaho | (19,579) | 6.6 | 37 | Pennsylvania | (77,689) | 5.3 | 44 |
| Illinois | (342,144) | 20.4 | 12 | Rhode Island | 12,268 | 6.1 | 39 |
| Indiana | 3,128 | 0.4 | 51 | South Carolina | 109,341 | 15.9 | 20 |
| Iowa | (94,372) | 19.5 | 15 | South Dakota | (22,443) | 14.0 | 25 |
| Kansas | (23,450) | 4.1 | 46 | Tennessee | 131,462 | 15.1 | 22 |
| Kentucky | (20,124) | 3.5 | 47 | Texas | (331,369) | 12.5 | 28 |
| Louisiana | (250,654) | 35.7 | 3 | Utah | (36,162) | 9.3 | 34 |
| Maine | 33,318 | 14.4 | 24 | Vermont | 16,985 | 12.8 | 27 |
| Maryland | 100,890 | 10.5 | 31 | Virginia | 227,872 | 15.2 | 21 |
| Massachusetts | (96,732) | 9.8 | 33 | Washington | 216,270 | 20.9 | 11 |
| Michigan | (132,999) | 12.3 | 30 | West Virginia | (73,655) | 22.9 | 10 |
| Minnesota | 4,362 | 0.7 | 50 | Wisconsin | (35,854) | 5.5 | 42 |
| Mississippi | (27,130) | 6.6 | 38 | Wyoming | (56,693) | 31.3 | 5 |
| Missouri | 28,057 | 3.2 | 48 | | | | |
The states with the lowest migration efficiency, ranked in terms of absolute value, were Indiana (0.4%), Minnesota (0.7%) and New Mexico (2.9%). Indiana and Minnesota experienced net in-migration of population, while New Mexico had a net loss due to out-migration. Thus, there is no simple relationship between the direction of the net migration flow and the efficiency of migration.
Another interesting observation is that Kentucky and Louisiana rank at opposite ends of the distribution in terms of migration efficiency or turnover: Kentucky is forty-seventh while Louisiana is third. Earlier, in Section III, we saw in Table 2 that Louisiana and Kentucky were ranked second and fourth in the nation in the share of population born within the state (with 79.0% and 77.4%, respectively). Thus, there is also no simple relationship between population born in the state and migration efficiency.
Many communities experiencing rapid population growth are eager to obtain forecasts of future in-migration in order to better anticipate and plan for future growth. For example, as discussed in Section I, the city of Tooele, Utah, is facing enormous population growth, and recently released population forecasts under three different scenarios: a high-, medium- and low-growth situation. Increases in population under the different scenarios due to in-migration have important implications for city planners and existing residents in terms of building roads, sewer lines and schools as well as for necessary zoning regulations. Here we review three widely used methods for forecasting population growth into the future, following Isard (1976, 64 ff).
One of the simplest techniques for forecasting population growth (or decline) is to apply past growth rates to the most current population data available, or to use a linear or nonlinear trend line. Either method essentially assumes that past conditions will continue into the future. Alternatively, we could work directly with population migration data to forecast total population of a community. The caveat that two points in time are hardly sufficient to establish a trend bears repeating here, as does the fact that nobody can see with certainty into the future. However, the larger the number of points from the past, the more reliable the forecast into the future generally is.
A number of different functional forms are available for forecasting purposes using trend analysis. The most straightforward is a simple linear trend line:
$$M_{i,t} = \alpha + \beta t$$

where migration over time is regressed against a time variable. For example, the time variable may start with the year 1900 and increase in increments of 10. Suppose corresponding migration data are available for a community from the years 1900 to 2000, also in ten-year increments. To obtain a forecast for migration in the year 2020, we would first carry out a linear regression to obtain estimates of parameters $\alpha$ and $\beta$. Then we would simply plug the number 2020 into the equation for $t$, and calculate the corresponding forecast number of migrants, $M_{i,2020}$.
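The regression-and-extrapolation steps can be sketched with an ordinary least squares fit written out by hand. The migration counts below are invented for illustration, and `fit_linear_trend` is a hypothetical helper name.

```python
def fit_linear_trend(t_values, m_values):
    """Ordinary least squares fit of M = intercept + slope * t.
    Returns (intercept, slope)."""
    n = len(t_values)
    t_mean = sum(t_values) / n
    m_mean = sum(m_values) / n
    sxy = sum((t - t_mean) * (m - m_mean) for t, m in zip(t_values, m_values))
    sxx = sum((t - t_mean) ** 2 for t in t_values)
    slope = sxy / sxx
    intercept = m_mean - slope * t_mean
    return intercept, slope


# Census years 1900, 1910, ..., 2000 and hypothetical migrant counts.
years = list(range(1900, 2001, 10))
migrants = [120, 150, 170, 210, 260, 240, 300, 330, 350, 400, 420]

intercept, slope = fit_linear_trend(years, migrants)
forecast_2020 = intercept + slope * 2020

print(f"slope per year: {slope:.2f}, forecast for 2020: {forecast_2020:.0f}")
```

The forecast simply extends the fitted line two decades beyond the last observation, so it inherits the assumption that past conditions continue unchanged.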
If a scatter or x-y plot of the data reveals nonlinear relationships, then we might fit a quadratic functional form:
$M_{i,t} = a + \beta_1 t + \beta_2 t^2$
which uses up one additional degree of freedom, or a log-linear form (strictly a power function of $t$, though often loosely called exponential), such as:
$\log M_{i,t} = \log a + \beta \log t$
which is derived by taking logs of the equation $M_{i,t} = a t^{\beta}$.
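The two nonlinear forms can be fitted with the same least squares machinery. The sketch below makes several assumptions of its own: time is re-indexed as a period number (1, 2, ...) so that the logarithm of $t$ is well behaved, and the data are generated from a known power law so that the recovered parameters can be checked.

```python
import numpy as np

# Hypothetical period index and migration series.
# The series is generated from M = 100 * t**0.5 so the fit can be verified.
t = np.arange(1, 12, dtype=float)
migrants = 100.0 * t ** 0.5

# Quadratic trend: M = a + b1*t + b2*t^2 (polyfit returns [b2, b1, a]).
b2, b1, a_quad = np.polyfit(t, migrants, deg=2)

# Log-log trend: log M = log a + beta * log t, fitted as a straight line in logs.
slope, intercept = np.polyfit(np.log(t), np.log(migrants), deg=1)
a_pow, beta_pow = np.exp(intercept), slope

# Forecast two periods beyond the last observation.
t_next = 13.0
quad_forecast = a_quad + b1 * t_next + b2 * t_next ** 2
pow_forecast = a_pow * t_next ** beta_pow
print(f"quadratic: {quad_forecast:.1f}, log-log: {pow_forecast:.1f}")
```

Because the log-log fit minimizes errors in logarithms rather than in levels, the two forms weight large and small observations differently, which is worth keeping in mind when comparing them.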
Using an exponential decay function to model population decline in a community may be more satisfactory than using a linear function, since the population is unlikely to drop to zero altogether, and it cannot become negative. Hence, an asymptotic approach to the x- or time-axis may be appropriate. Flows of migrants can turn from being positive to negative in a community, and as the population drops to a very small number, out-migration will similarly decline to zero.
This illustrates the interesting point that migration, as a flow variable, is just the derivative over time of population, as a stock variable (ignoring deaths and births, which do not affect the basic argument). Many counties in rural eastern Kentucky, for example, experienced rapid inflows of population during the oil crisis. As the crisis subsided and energy prices returned to more normal levels, in-migration stabilized and net migration flows became zero. Eventually, migration rates turned negative with more people leaving than entering these communities. Today, many of these communities continue to lose people, but at much lower rates than was the case in the years immediately following the peak of the energy crisis.
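The stock-flow relationship can be written out explicitly. The notation here is assumed for illustration rather than taken from the text: $P_t$ is the population at the start of period $t$, $B_t$ and $D_t$ are births and deaths, and $M_t$ is net migration during the period.

$$P_{t+1} - P_t = B_t - D_t + M_t$$

If births and deaths offset each other, this reduces to $M_t = P_{t+1} - P_t$, or $M(t) = dP/dt$ in continuous time. Positive net migration corresponds to a rising population, zero net migration to a plateau, and negative net migration to a decline, which is the sequence described for eastern Kentucky.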
Forecasting population change due to migration is more complex than forecasting change due to births and deaths. The latter variables, by and large, are fairly stable over time and they (Isard 1976, 63) "usually change character slowly, reflecting the net effect of change in a complex matrix of social, economic, and political forces. ... In contrast, the migration element of population growth in an open area is much more volatile and marginal. It tends to be linked more directly, and to respond much more quickly, to a few dominant economic and other forces." As an illustration, consider the impact of the California "gold rush" of the late 1840s on the migration of people to the West Coast. At that time a large number of people moved to California hoping to benefit from the discovery of new gold deposits.
4.2. Ratio and Related Techniques
In the ratio method, migration is linked to growth in the population of a region. The link is usually a rate, which may itself change over time as a function of changes in other variables or conditions. For example, 15% of the growth in the population of an area may be the result of migration. Or, if an accurate national forecast of population growth is available, then the analyst might reasonably assume that a certain portion of that growth will accrue in the region of interest. This is similar to the step-down techniques sometimes used in regional input-output tables, where the same relationship between inputs and outputs is assumed to hold at the regional and national levels (e.g., Goetz and Debertin 1993).
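As a rough sketch of the arithmetic behind a ratio or step-down forecast, the example below allocates a share of forecast national growth to a region and then attributes a fixed fraction of that growth to migration. All three input numbers are hypothetical; only the 15% figure echoes the example in the text.

```python
# Hypothetical inputs for a step-down (ratio) forecast.
national_growth_forecast = 2_500_000   # forecast national population growth
regional_share = 0.012                 # assumed share of national growth in the region
migration_share_of_growth = 0.15       # assumed share of regional growth due to migration

# Step down from the national forecast to the region.
regional_growth = national_growth_forecast * regional_share

# Apply the migration ratio to the regional growth figure.
migration_component = regional_growth * migration_share_of_growth

print(f"regional growth: {regional_growth:,.0f}")
print(f"of which migration: {migration_component:,.0f}")
```

In practice both ratios may themselves be modeled as changing over time, as the text notes.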
The quality of migration forecasts based on ratio techniques can be increased significantly if good forecasts are available for the variables that influence migration. Forecasts of income or employment growth in an area are one example of this possibility. Yet more sophisticated forecasts may take into consideration changes in the demand for specific occupations. To illustrate, the fact that Toyota Motor Manufacturing established a car assembly plant in Kentucky, an engine plant in West Virginia, and a truck assembly plant in Indiana, is likely to have influenced the decisions of new migrants to these areas as well as local residents who would have potentially migrated elsewhere had these plants not been established in the respective communities.
The net migration projections from the Census Bureau presented in the previous section are important because they are used by so many public and private agencies to get a glimpse of what the future may hold. The methods used to obtain these projections are discussed in Campbell (1997, 6), and presented here in a condensed manner to provide a flavor of how they are generated. The appendix contains the full discussion of the methodology used to calculate domestic migration flows.
Overview: In the last three sets of State population projections issued by the Census Bureau, we used a modified multi-state projection system. Multi-state projection or demographic accounting systems overcome many of the limitations of a net migration approach. State-to-State migration data were used to model migration flows between States explicitly. The rate of moving from one origin State to one destination State was calculated and applied to the base population of the origin State. Using this approach in a projection system, the potential number of in-migrants to a State were linked to the geographic as well as the age, sex, and race/ethnic distribution of the population. The use of State-to-State migration rates also ensured that the total for the nation of all projected internal out- and in-migration was zero, a necessary ingredient of any multi-state model.
Note the remarkable level of detail with respect to the sociodemographic characteristics of the population of migrants in this projection method.
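The core of the origin-destination approach can be illustrated with a toy example. The three states, their populations and the migration rates below are all hypothetical, and the sketch ignores the age, sex and race/ethnic detail of the actual Census Bureau system; it shows only why applying rates to origin populations makes national net internal migration sum to zero.

```python
import numpy as np

# Hypothetical three-state system.
states = ["A", "B", "C"]
base_pop = np.array([500_000.0, 1_200_000.0, 300_000.0])

# rates[i, j] = assumed share of state i's population moving to state j.
rates = np.array([
    [0.000, 0.010, 0.004],
    [0.006, 0.000, 0.002],
    [0.008, 0.012, 0.000],
])

# Each origin's rates are applied to that origin's base population.
flows = base_pop[:, None] * rates

out_migrants = flows.sum(axis=1)   # row sums: people leaving each state
in_migrants = flows.sum(axis=0)    # column sums: people arriving in each state
net = in_migrants - out_migrants
projected = base_pop + net

for name, n, p in zip(states, net, projected):
    print(f"{name}: net {n:+,.0f}, projected population {p:,.0f}")
print(f"national net internal migration: {net.sum():,.0f}")
```

Every move counted as an out-migrant from one state is counted as an in-migrant to another, so the national total is zero by construction.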
4.4. A Cautionary Note About Forecasts
Citing a study by Rogers (1990), Plane and Rogerson (1994) caution against the use of net migration rates in making population forecasts for an area. The reason, as they point out and the study by Rogers illustrates, relates to the use of the existing population in the county or region as the population base that is "at risk" of migrating, i.e., the denominator used to calculate the net rate. Recall that this net rate is the difference between in-migrants and out-migrants, divided by that base population.
The problem arises because the "at risk" group of (potential) in-migrants is all people outside the region, not the existing population for which we are trying to make a forecast. To the extent that the composition of this population outside the region changes over time, as a result of prior migration, the current in-migration rate will not be accurate and the forecast will be off. Plane and Rogerson point out (195) that net migration rates are frequently used, incorrectly, to make forecasts for individual regions that make up a country. If migration rates differ across the individual regions that make up a country, then overall, systemwide identities can no longer hold as the base populations in the different regions change.
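A small numerical sketch shows how the systemwide identity breaks down. The two regions, their populations and their net migration rates are hypothetical, chosen so that net flows balance in the first period.

```python
# Hypothetical two-region country with fixed net migration rates.
net_rates = {"A": 0.02, "B": -0.005}
pop = {"A": 1000.0, "B": 4000.0}

imbalances = []
for period in range(3):
    # Net flow = fixed net rate times the region's own current population.
    net_flows = {r: net_rates[r] * pop[r] for r in pop}
    # With internal migration only, these net flows should sum to zero.
    imbalances.append(sum(net_flows.values()))
    pop = {r: pop[r] + net_flows[r] for r in pop}

for period, gap in enumerate(imbalances):
    print(f"period {period}: sum of net flows = {gap:+.4f}")
```

The net flows cancel in the first period, but once the base populations change, the fixed rates imply more people arriving in region A than leaving region B, so the forecast creates migrants out of nothing.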
5. Data Sources for Migration Studies
As mentioned earlier, obtaining data to study migration behavior is a challenge in a nation without a population registry. Detailed official data are compiled and stored in the United States when a new citizen is born and when a resident dies. Thus, all births and deaths are officially recorded. What happens in between these landmark events of an individual's life is an altogether private matter, not tracked formally by any agency.
A number of different data sets on migration are available from secondary sources, which are reviewed in this section. In addition to different population censuses or surveys, including the Public Use Microdata Sample (PUMS) extracted from the decennial census, migration data can also be compiled from Internal Revenue Service tax records.
5.1. U.S. Census Bureau Web Site
The U.S. Census Bureau keeps a number of data sets related to geographic mobility and migration on the World Wide Web (see http://www.census.gov/population/www/socdemo/migrate.html). The four separate migration-related data sets are the Current Population Survey (CPS); the Survey of Income and Program Participation (SIPP); Population Estimates and Projections; and the 1990 Decennial Census of the Population. Some of these include raw data, at the state or county levels, while others contain results of data analyses that have been compiled into tables. Each of these is discussed in turn. Note that none of these surveys or censuses is carried out exclusively for the purpose of collecting data on migration.
5.1.1. The Current Population Survey
The Current Population Survey is administered monthly, and migration results are compiled annually with varying degrees of detail. At present, data summaries are available at the Web site from the CPS for 1997, 1996 and 1994, showing annual rates of moving within counties, across counties and states and from abroad (with the same breakdown as in Table 1 above). Data include basic sociodemographic characteristics of movers and nonmovers stratified by the type of move involved.
In addition, historical data are available showing annual geographical mobility rates by type of movement for 1947-97 (these were used to create Figure 2 in Section III.2); annual in-migration, out-migration, net migration, and movers from abroad for regions between 1980 and 1997; and in-migration, out-migration and net migration for metropolitan areas between 1985 and 1997. Geographical mobility data by housing tenure are reported for the years 1986 to 1997.
5.1.2. The Survey of Income and Program Participation
A subset (or module) of this questionnaire deals with the respondent's migration history, which includes such information as the previous residence and when the most recent move took place. This is the data set analyzed by Hansen (1998) in her report on the seasonality of moves and duration of residence. Shumway (1993) uses this data set to study the effect of migration, among other variables, on the amount of time individuals remain unemployed after losing their jobs.
According to information on the SIPP Web site's section on methodology (http://www.sipp.census.gov/sipp/chap1-4.htm):
SIPP is a multipanel longitudinal survey of adults, measuring their economic and demographic characteristics over a period of 2 1/2-years. The adults followed in each panel of the survey are determined by a nationally representative survey of households in the civilian noninstitutionalized population. The first panel began in October 1983 with the adults in 19,878 interviewed households. ... Persons selected into the SIPP sample continue to be interviewed once every 4 months ... If persons initially interviewed move from their original address to another address, they are interviewed at the new address.
Further information about the SIPP, including access to the data, data applications and publications and analyses, is provided at the Web site.
5.1.3. Population Estimates and Projections
The third source of migration data provides net domestic migration and net international migration components, in conjunction with county population estimates. More specifically, the Population Estimates Program
...produces for counties each year: total population estimates and county estimates by age, sex, race, and Hispanic origin. The release of total population estimates in the winter also includes demographic components of change. In the summer, the Program releases the estimates by age, sex, race and Hispanic origin. The reference date for county population estimates is July 1.
This source includes estimates of the demographic components of population change between April 1, 1990 and July 1, 1997. Subsets are available on counties ranked by the percent and numerical population changes over this period. A county-level map of the net domestic migration rates (measured as net change per 100 persons) between 1996 and 1997 is no longer available online. However, a map of the domestic migration rates between 1997 and 1998 can be accessed at:
http://www.census.gov/population/estimates/county/dom9798.gif
Also included at this Web site are compressed data files, which can be downloaded.
5.1.4. The 1990 Census
Every 10 years the federal government enumerates the U.S. population, as mandated by the Constitution. This is the source of the percentage of the population born in each state reported in Section III, as well as selected data on in-migrants, out-migrants, net migrants and movers from abroad.
This information is compiled from a question on the Census Form that asks the individual where he or she lived in 1985. Information on separate tabulations related to the place of residence in 1985 is also available on this Web site, including the publication "Selected Place of Birth and Migration Statistics" (CPH-L-121).
A number of migration studies (e.g., White and Mueser 1994, Ngarambé and Goetz 1998) have taken advantage of the Public Use Microdata Samples data set, which is available as far back as 1970. The statistics contained in this set are extracted from the decennial census and, as the name suggests, are available for use by the public. The Web site for PUMS is http://www.ciesin.org/datasets/pums/90.pums.html. As the site explains,
The 1990 PUMS contain individual- and household-level information from the long form questionnaires distributed to a sample of the population enumerated in the census. The PUMS are available in samples that represent: 5% of the Population and Housing of the U.S. which identifies all States and various subdivisions within them, including most counties with 100,000 or more inhabitants. 1% of the Population and Housing of the U.S. which identifies all metropolitan territory and most [MSAs] with 100,000 or more inhabitants individually, and groups of [MSAs] elsewhere.
The PUMS data sets are compiled in this manner because there are strict confidentiality rules (Title 13, U.S. Code) making it illegal for state or federal agencies to divulge information about any one individual. The data provided through the PUMS have been purged of any information identifying individuals or their households to preserve their confidentiality.
5.3. The County-to-County Migration File (STP28)
This file was used by Schachter, Jensen and Cornwell (1998) to study migration behavior in Pennsylvania. The underlying data set is available on the Internet at the Census Bureau Web site. The county-to-county data set is a special tabulation from the 1990 Census of Population, and it is based on a question on the long form of the survey concerning where household members who are five years old or older lived five years ago. The three authors use this information to develop a "migration interchange," which represents the number of in-migrants divided by out-migrants. In order to describe the flows of net migrants based on poverty and education measures, Schachter, Jensen and Cornwell explain that they
...calculate three distinct indexes of unequal exchange. The first, the Poverty Interchange Index (PER), is calculated as the percentage of inmigrants who are poor divided by the percentage of all outmigrants who are poor. The second, the Least Educated Poor Index (LEP), is calculated as the percentage of poor inmigrants with a high school education or less divided by the percentage of poor outmigrants with a high school education or less ... The third, the Brain Drain Index (BDI), is calculated as the percentage of outmigrants with some college education or more divided by the percentage of inmigrants with some college education or more. (41)
Using these measures, the authors are able to draw conclusions about the characteristics of migrants moving into different areas.
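To make the definitions concrete, the following sketch computes the migration interchange and the three indexes for a single county. The migrant counts are invented for illustration, and the variable names are not from Schachter, Jensen and Cornwell.

```python
# Hypothetical migrant counts for one county.
inmigrants = {"total": 1000, "poor": 200, "poor_hs_or_less": 150, "some_college": 300}
outmigrants = {"total": 800, "poor": 120, "poor_hs_or_less": 60, "some_college": 400}

# Migration interchange: in-migrants divided by out-migrants.
interchange = inmigrants["total"] / outmigrants["total"]

# PER: share of in-migrants who are poor / share of out-migrants who are poor.
per = (inmigrants["poor"] / inmigrants["total"]) / (
    outmigrants["poor"] / outmigrants["total"])

# LEP: share of poor in-migrants with high school or less /
#      share of poor out-migrants with high school or less.
lep = (inmigrants["poor_hs_or_less"] / inmigrants["poor"]) / (
    outmigrants["poor_hs_or_less"] / outmigrants["poor"])

# BDI: share of out-migrants with some college or more /
#      share of in-migrants with some college or more.
bdi = (outmigrants["some_college"] / outmigrants["total"]) / (
    inmigrants["some_college"] / inmigrants["total"])

print(f"interchange = {interchange:.2f}, PER = {per:.2f}, "
      f"LEP = {lep:.2f}, BDI = {bdi:.2f}")
```

Values above 1 for all three indexes, as in this made-up county, would indicate that it is gaining poor and less educated migrants relative to those it loses, while losing college-educated migrants relative to those it gains.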
The National Longitudinal Survey of Youth (NLSY) is an ongoing, nationally representative survey of young men and women, conducted by the Center for Human Resources Research at Ohio State University. This rich data set contains information on current employment and work, personal characteristics, and the migration behavior of those surveyed.
Young adults who do not enter the post-secondary education system tend to have a more difficult time entering the workforce than those who do. Consequently, it is especially interesting to see how migration affects the employment prospects of this group of young individuals. Bailey (1994, 305) concludes, using this data set, that
[m]embers of the sample remained unemployed for longer periods of time ... [if they move] when unemployed [... and, conversely ... y]oung adults returned to work more quickly ... if they do not leave the labor market of residence.
Bailey explains that this may be an information problem, in the sense that young adults do not have sufficient information (or experience) that would allow them to migrate and identify job opportunities in a timely manner.
5.5. IRS Migration Data
Whenever individuals move, the IRS is notified of a change of address. This information is perhaps the closest one can come in the United States to a population registry. Although they contain limited information beyond the income of the individual, IRS data have been used to study migration behavior of tax filers. One problem with this data set is that only about 90% of the population is believed to file taxes with the IRS in any given year. Therefore, this is not an unbiased sample covering the entire U.S. population.
5.6. Data from Moving Companies
Data from moving companies (the American Moving Conference or AMC) can provide important insights into migration. One of the larger moving companies recently reported, for example, that it had moved more people out of than into Oregon for the first time in recent history.
Gober, Jeffery and McHugh (1996) use records from moving companies to determine whether such data are helpful in estimating net migration flows. They conclude that even though these numbers have to be interpreted cautiously (in part because different companies have different regional market shares and are therefore not necessarily representative of the entire population of migrants), this data source is helpful because results are available before official estimates have been released (as the Oregon example illustrates).
6. Migration-Related Sites on the WWW
A number of sites devoted to migration or some aspect of population mobility have sprung up on the Web. Some of these focus on internal migration, or U.S. migration, while others are more concerned with international population movements. A limited, somewhat representative sample of such sites is provided in this section.
6.1. A Really Neat Site: Check it Out
A fascinating set of animated GIFs was developed by Zach Shelton to show the expansion of the U.S. population from the East Coast across the country, starting back in 1790. Shelton uses a sequence of county-level maps showing changes in population density in persons per square mile over 21 periods (1790-1990). The URL is: http://mprepserv.somw.siu.edu/zshelton/USpop/animation.html.
6.2. People and Place
The address for this site is: http://www.swin.edu.au/sbs/pub/pnp/welcome.html#issues. The following is from the Web site banner: "People and Place is published quarterly by the Australian Forum for Population Studies. People and Place presents key information on migration patterns, the labor market, urban growth, and the environment and related topics." Tables of Contents for current and past issues of the publication are provided at the Web site, along with subscription information.
6.3. The European Forum for Migration Studies (efms)
The address for this site is: http://www.uni-bamberg.de:80/~ba6ef3/efmshome.htm, and the following text is from the site's welcome page: "The European Forum for Migration Studies (efms) is an academic research center at the University of Bamberg. Its work in the areas of migration, integration and migration policies encompasses research, documentation, consultative services, training and providing information to the public. This site provides information on the institute, its research projects, publications and other activities. The site offers access to resources of the institute such as relevant statistics, documents and an online documentation database."
6.4. Links to Web Sites of Migration Researchers
| Michael Greenwood | www.colorado.edu/Economics/people/greenwood.htm |
| David A. Plane | www.u.arizona.edu/~plane/ |
| Philip E. Graves | spot.colorado.edu/~gravesp/ |
| William H. Frey | www.frey-demographer.org/index.html |
| Oded Stark | http://www.sv.uio.no/sosoek/katalog/stark.html |
| Alan M. Schlottmann | http://econ.bus.utk.edu/Schlottmann.html |
| Walter J. Wadycki | http://www.uic.edu/cba/cba-depts/ids/wadycki.html |
| Brian J. Cushing | http://www.rri.wvu.edu/vita/cushing.htm |
| Henry W. Herzog | http://econ.bus.utk.edu/Herzog.html |
| Daniel R. Vining | http://www.ssc.upenn.edu/rsci/vining.html |
7. Summary
This section has reviewed a number of practical issues that face applied migration researchers. Data problems are perhaps the foremost concern, since none of the public statistical agencies collect data for the express purpose of studying migration. Researchers may have to conduct their own primary surveys if they want to have migration information for specific periods, populations or migrant characteristics.
As a result of the lack of high-quality and detailed migration data, numerous techniques have been devised to estimate flows of migrants. Methods have also been developed to forecast future flows of migrants into and out of communities. A number of migration-related data sets are available on the Web, and Web sites dedicated to migration issues in the United States and around the world have appeared in recent years.
Exercises and Discussion Questions for Section IV
1. Discuss how the selection of the time period and spatial unit of analysis affects the number of migrants.
2. Using the following 1985 population numbers (in 1,000s) and the number of out-migrants in Table 2 of Section III, do you find any empirical support for the caveats raised by Isard above?
| California | 26,353 | Colorado | 3,232 |
| Delaware | 626 | Illinois | 11,539 |
| Kansas | 2,448 | Missouri | 5,036 |
| Ohio | 10,744 | Rhode Island | 967 |
| Tennessee | 4,766 | Texas | 16,382 |
3. Consider the following initial population distribution in the urban core, suburbs and rural areas of a hypothetical place, and the transition matrix shown.
The numbers in the 1x3 vector (100,000, 10,000 and 1,000) refer to the populations of urban, suburban and rural areas, respectively, in the pre-transition period. The proportions in the 3x4 matrix are the transition probabilities into urban, suburban, rural and overseas regions. For example, the value 0.08 means that 8% of the suburban residents are expected to move overseas, compared with 5% of urban and 4% of rural residents. Also, none of the suburban residents will move into urban areas. Use this information to calculate the population distribution after the transition has taken place, assuming that net births are zero.
4. In Table 2 above, which region of the United States lost the largest share of its population during 1996?
5. What happens to the off-diagonal and diagonal elements of a Markov (transition) matrix as the time period considered for migration approaches infinity?
6. Consider the following migration rate data from the hypothetical region, Migratonia. Your assignment is to forecast the migration rate 5 years into the future, by running an ordinary least squares regression of the migration rate on the time trend. You should fit both a linear trend and an exponential trend to the data, and discuss the relative merits of each functional form.

7. Using data from the following table, estimate the net domestic migration rate into (or out of) your three favorite states. How important is the net domestic migration rate in these states, compared with the contribution of net births and foreign immigrants, in accounting for population change in each of your chosen states?
Basic Data for Estimating Net Domestic Migration
| States | Resident Population, 1990 | Resident Population, 1995 | Births, 1990-95 | Deaths, 1990-95 | International migration, 1990-95 | Federal citizen movement, 1990-95* |
| UNITED STATES | 248,718,291 | 262,755,270 | 21,260,274 | 11,652,006 | 3,966,322 | 462,389 |
| ALABAMA | 4,040,389 | 4,252,982 | 326,665 | 213,636 | 7,187 | 6,021 |
| ALASKA | 550,043 | 603,617 | 59,500 | 11,994 | 5,297 | 7,466 |
| ARIZONA | 3,665,339 | 4,217,940 | 365,385 | 167,309 | 50,490 | 6,894 |
| ARKANSAS | 2,350,624 | 2,483,769 | 183,401 | 135,350 | 3,527 | 1,872 |
| CALIFORNIA | 29,758,213 | 31,589,153 | 3,116,085 | 1,147,804 | 1,379,703 | 77,326 |
| COLORADO | 3,294,473 | 3,746,585 | 284,412 | 121,338 | 29,099 | 12,148 |
| CONNECTICUT | 3,287,116 | 3,274,662 | 249,679 | 150,593 | 36,963 | 3,149 |
| DELAWARE | 666,168 | 717,197 | 56,620 | 32,007 | 4,243 | 1,433 |
| DISTRICT OF COLUMBIA | 606,900 | 554,256 | 57,280 | 38,014 | 15,550 | 2,311 |
| FLORIDA | 12,938,071 | 14,165,570 | 1,016,673 | 746,113 | 256,114 | 28,738 |
| GEORGIA | 6,478,149 | 7,200,882 | 584,856 | 286,231 | 41,600 | 21,324 |
| HAWAII | 1,108,229 | 1,186,815 | 104,388 | 37,738 | 33,137 | |
|