West Virginia University


Edward M. Bergman

Edward J. Feser

Industrial and Regional Clusters: Concepts and Comparative Applications
Edward M. Bergman and Edward J. Feser


Cluster Morphology of Regions: Analytic Options

3.1 Introduction

In this chapter, we examine a set of methods for identifying and analyzing industry clusters. There are a variety of tools available for the task, from simple measures of specialization (location quotients) to input-output based techniques. We begin by making a distinction between highly stylized studies of pre-determined sectors (often in the Porterian tradition) and studies that attempt to infer the identity of clusters embedded within a very diverse and reasonably comprehensive set of regional industries. The first kind, what we label "micro-level cluster applications," is typically driven by specific regional interests or policy concerns. In micro-level applications, clusters are defined as groups of firms that produce similar products (i.e., industries), but that hold key complementary informal and formal ties. The clusters may include some limited supplier chain characteristics, but in such studies, explicating value-chains is less important than characterizing ties between similar producers. Such industry-focused, firm-level studies are likely the best-known type of industry cluster application (the many studies of industrial districts around the world are of this variety).

Most regions interested in pursuing industry cluster analysis fall into one of three categories: 1) they have become aware of their leading industries but desire an understanding of how ties among firms within those industries might be strengthened and turned to competitive advantage; 2) they are aware of their principal industries, but want to identify unseen complementarities and potential strategic alliances between those and wholly different--or perhaps as yet undeveloped--regional industries; 3) they have little knowledge of their core regional strengths and potentials, apart from what can be gleaned from single-sector trends. Micro-level studies pursued singly (i.e., not in concert with other methods) apply most readily to cases in the first category.

For the second and third categories, techniques that permit a comprehensive investigation of virtually all sectors in the regional economy are needed. We label analysis based on such techniques "meso-level cluster applications," following terminology adopted by the OECD. Meso-level applications may very well be followed by intensive micro-level analyses of relationships between firms in identified clusters. Indeed, a two-stage industry cluster analysis is probably ideal, resources permitting. Nevertheless, meso-level cluster studies, even in the absence of micro analysis, can generate unique and policy-relevant intelligence about the regional economy.

3.2 Micro-oriented Cluster Applications

Linking the theory (of Chapter 2) and application (as discussed here) are the motives and policy interests that drive inquiries and support regional studies of any kind. While a strong epistemological base is necessary in policymaking as a legitimate foundation for conducting empirical research, core policy interests often play a strong role in determining the nature and quality of analysis. What this means is that many cluster studies (both the definitions of clusters and the methods used to identify them) are based on political concerns or pre-determined policy options rather than established theoretical models.

Such "interest-based" empirical applications have always been the case in the field of economic development. Witness early state-level pursuit of exogenous policy levers such as "growth poles," "counter-cyclical industrial portfolios," "industrial targeting and recruiting," and the wide range of related initiatives designed to propel peripheral areas into prosperity or stave off decline in more developed regions. Such approaches invariably reflected core local interests, usually some representative derivative of basic local production factors (labor and capital) and the nation-state. As the tides have turned toward more endogenous views of regional development (e.g., the creation of local state and development partnerships, business entrepreneurship strategies, incubators, programs to build social capital, human capital and technology initiatives, and industry clusters) to cope with global risks and opportunities, different political interests, as well as communities of scholars, seek different kinds of empirical applications.

North American regional development policy as a supporting interest comes relatively late and comparatively uninformed to the strategic consideration of industry clusters. More advanced are regional bodies in which industry clusters (or the industrial district variant) have been supported and studied longer (e.g., Alpine-Adriatic Europe, particularly northern Italy). Firms and industries (particularly associations), including increasingly those in the U.S., that seek agility in a turbulent global economy, a keen understanding of core competencies, and greater advantage from localized technological spillovers have shown considerable interest in the industry cluster concept.

Such interests were clearly stimulated by early forays during the late 1980s into the topic by management strategist Michael Porter (1990) and his many emulators (see Chapter 2). Porter’s first-to-market success in showing how clusters support firms’ effective strategic options blazed a simple analytic approach that, using another newly popular concept, very nearly became the "path dependent" default method of analysis. This despite the fact that Porter’s analytical methods were opaque at best (Enright 1997). The upshot is that regional development policymakers coming late to the concept usually encounter a business- or industry-flavored approach to identifying and analyzing clusters that we title micro.

Micro-level studies begin with some of the same theoretical insights presented in Chapter 2 of why firms successfully co-locate with other firms in industry clusters.1 These concepts are presented in somewhat stylized form, permitting greater focus to be placed on how similar-sector firms cooperatively share production capacities, markets, labor and technologies, reserving for such Italianate arrangements the term ‘cluster.’ The underlying cooperative behavior is seen as a current that follows barely-visible local channels, such that:

The "current" of a working production system [is] less easily detected and is often embedded in trade, professional, . . .and civic associations, and in informal socialization processes. . .[such]. . .that a cluster is a "geographically bounded concentration of interdependent businesses with active channels for business transactions, dialogue, and communications, and that collectively shares common opportunities and threats (Rosenfeld 1997, p. 10)."

Rosenfeld also describes this collectivity as ". . .a critical mass of firms in a region of the same, closely related or complementary sectors (emphasis added)." The relevant point here is that such clusters typically consist of very similar types of firms selling similar consumer or household design-intensive products. In other words, single-industry clusters set the standard for studies under consideration by development policy officials that face a large portfolio of very different, interacting industries.

Italianate industry clusters often consist of commodity or raw material inputs that are transformed by cooperating producers employing similar production technologies and cooperative cultures. The relatively short supply chains are of comparatively less importance to this definition of clusters than the factors presented by Rosenfeld. Less significant too for the success of such clusters is the underlying technological system that supports these highly effective production regimes, or the appreciation that the technological origins of production methods that support Italian consumer good clusters differ radically from the producer good clusters to be found elsewhere in the Italian economy (Debresson 1996).

The richly detailed accounts of these uniquely successful industrial groupings are instantly familiar and compelling,2 particularly to politicians and policymakers desperately seeking immediate solutions to regional economic problems, while they are also of occasional use to theorists who wish to illustrate far more complex concepts.3 As in other ethnographic inquiries, the studies are so uniquely etched that enduring lessons and generalizations prove difficult to distill or to apply in other regional economies.

Further, such studies, by definition, limit attention to physically detectable evidence of "currents" flowing among similar sector firms that are best uncovered up close and at fairly small geographic scales by labor-intensive investigations (e.g., on-site interviews, Delphi techniques, or focus groups). Not surprisingly, this approach restricts its view to a single visible collection of similar sector firms, thereby overlooking linkages that some of its members may have with regionally co-located firms from very different sectors, or the robust clustering of other sectors. A micro-level study then tends to document one cluster per region, usually that of its policy client. Apparent indifference to the presence of additional clusters, particularly those based on alternate criteria or detectable only from a wider spatial view or from data-intensive sources, is due mainly to micro-oriented investigations of an a priori cluster definition. An implication is that significant instances of region-wide industrial clustering go unrecognized by micro studies. At the same time, the labor-intensive method of study all but precludes a region-wide investigation of all industrial clusters that might form the basis for "seeing regional economies whole."

Recognizing that regional development interests are eager to learn about all components of a local economy for which they are responsible, micro study analysts sometimes precede or accompany their proposals for detailed study of single industries by employing certain simple single-industry techniques drawn from regional analysis, which are then applied repetitively to commonly available multi-industry data. Location quotients are the most frequently applied method to identify unusually high relative concentrations of industrial activity, which in these studies are taken as evidence of "industrial clusters." The cluster studies that apply this simple technique to widely available employment data are generally indifferent to the fact that high concentrations are, in the hands of other analysts, interpreted as inferential evidence of local export production (economic base theory). Worse, and somewhat perversely, such studies often appear completely unaware that employment concentrations per se are indistinguishable proxies for total industry output, regardless of whether that production is concentrated in one huge branch plant or distributed within a "cluster" of cooperating establishments and firms.

Micro studies also tend to revolve around the needs of the focal industries to survive or thrive in their settings, and study designs are therefore geared to learning what is needed for members to act decisively in their specific economic and regional environments. These studies attempt to provide useful specificity, detail, and subtlety of how connections are made, networks are maintained, and interpersonal assets are translated into cluster advantages of utmost importance to the sponsoring clients. These interests may align or be at odds with host regions that wish to restructure their economies away from the most vulnerable to the most promising clusters. At the same time, an uninformed application of standard techniques drawn uncritically from the regional scientist’s toolbox offers little in the way of improvements that would benefit an overall regional perspective.

Micro-oriented studies of regional industry clusters are appropriate in some circumstances. When an analyst is beginning with a definitive set of industries that constitute the policy interest, the kinds of qualitative and labor-intensive research needed to truly identify evidence of clustering behavior are called for. There are virtually no secondary sources of information on cooperative relationships between local companies; input-output data can only provide hints of such relationships, or perhaps the most likely suspects among which such relationships might be organized.

The following section focuses on methods designed to distill the industrial complexity of a given region in such a manner as to identify regional clusters or potential regional industry clusters. The techniques are quantitative and, for the most part, data intensive. This kind of analysis may very well be followed by a qualitative examination of specific identified clusters. Indeed, it probably makes most sense to conceive of regional cluster analysis as a two-stage process: 1) an initial scan of the regional economy, using detailed quantitative sources; 2) then a detailed, perhaps painstaking, investigation of specific industrial features/groupings identified in the scan. The two-part approach implies that the analyst is beginning with a "clean slate," that is, no restrictions or a priori predilections of the sectors that are of most import.

3.3 Methods of Meso Industry Cluster Analysis

This section describes several ways of identifying industry clusters, with most of the detailed focus placed on input-output based methodologies. The discussion is presented from the perspective of an analyst considering issues of study design and methods.

Exhibit 3.1 lists six basic analytical approaches, ordered roughly in terms of how commonly they have been used: expert opinion, location quotients, trade-based input-output analysis, innovation-based input-output analysis, network analysis, and surveys. The following sections summarize each approach, save innovation-based input-output analysis. The latter is based on innovation survey data available in only a few countries.

3.3.1 Expert Opinion

Probably the most common approach to identifying regional clusters is the use of interviews, focus groups, Delphi survey techniques, and other means of gathering key informant information. Regional experts--industry leaders, public officials, and other key decision makers--are important sources of information about regional economic trends, characteristics, strengths and weaknesses; they are the "agents who know the region’s industries in terms of basic practice, supply chains, current investment patterns and potential opportunities for new products. . .(Stough, Stimson and Roberts 1997, p. 2)." Industry association reports, newspaper articles, and other published documents that are anecdotal or otherwise not based on systematic empirical analysis also fall under the category of "expert opinion."

While gathering expert opinion data can be relatively cost and time effective, as well as yield rich contextual information about the region’s economy, it is rarely done systematically enough that findings can be generalized. It is easy for researchers to overestimate the accuracy of strongly held opinions among key stakeholders and to forget the multitude of potential biases affecting each expert’s views, as well as each expert’s limited field of experience within the broader economy. Moreover, there have been few attempts to use expert opinion in comprehensive assessments of the regional economy (the meso-analytic approach).

Expert opinion is most commonly used in the kinds of micro studies described in section 3.2. There the threat of bias is particularly strong since the researcher is embarking on the analysis with a pre-determined sense of the most important regional sectors, actors, and relationships. Unfortunately, the literature on clusters pays scant attention to valid expert data collection techniques. There has also been comparatively little research on ways to marry expert opinion data with secondary economic data, an important feature for meso-level cluster studies. For example, if we envision a two-stage cluster analysis with a quantitative regional "scan" preceding a qualitative investigation (including the collection of expert opinions), how does one effectively merge findings from the two stages in a way that generates insight greater than the sum of the parts?

Among the few to take up that question, as well as to design an approach for scanning a range of sectors using expert opinion data, are Roberts and Stimson (1998). They describe a tool, which they title multi-sectoral qualitative analysis (MSQA), for helping identify "core competencies, economic possibilities, strategic markets, and economic risk (1998, p. 470)." The method entails a simple categorical scoring of regional sectors along a set of performance criteria (a total of 34 in their application to Far North Queensland, Australia). The ranking of each sector as "strong," "average," or "weak" was based on "I/O table data, focus and industry leader group discussions, reviews of 30 economic reports and studies of the FNQ region, and local knowledge (1998, p. 476)." The performance of each sector is then compared by attaching weights to the scores and summing them. Roberts and Stimson suggest several different indexes that can be generated from the results.
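The weighting-and-summing step can be illustrated with a short sketch. It is not Roberts and Stimson's own procedure: the numeric coding of the three ratings, the criteria names, the weights, and the sector ratings are all hypothetical values chosen only to show the arithmetic.

```python
# Assumed numeric coding of the categorical ratings (not taken from the source).
RATING_VALUES = {"weak": 1, "average": 2, "strong": 3}


def msqa_index(ratings, weights):
    """Weighted sum of one sector's ratings across the performance criteria."""
    return sum(weights[criterion] * RATING_VALUES[rating]
               for criterion, rating in ratings.items())


# Hypothetical criteria weights and sector ratings, for illustration only.
weights = {"export_growth": 0.5, "labor_skills": 0.3, "supplier_base": 0.2}
sectors = {
    "tourism": {"export_growth": "strong",
                "labor_skills": "average",
                "supplier_base": "average"},
    "mining": {"export_growth": "average",
               "labor_skills": "weak",
               "supplier_base": "strong"},
}

scores = {name: msqa_index(ratings, weights) for name, ratings in sectors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)  # best-performing first
```

Changing the weights is the natural way to test how sensitive the resulting ranking is to the analyst's priorities.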

The potential of the MSQA approach for utilizing expert opinion in cluster analyses is revealed more clearly in Stough, Stimson and Roberts (1997). In an application to Northern Virginia, the authors utilized a survey of regional experts (". . .selected from industrial directories and from economic development agency bases to ensure that they represented senior officials from the region’s major industries (1997, p. 6)"). Respondents evaluated the region’s competitiveness on 35 dimensions from their own firms’ perspective and from the point of view of any general regional business. Small group meetings were then held where respondents were first asked to interpret, elaborate on, or modify findings from the survey. Participants then "identified new business opportunities for the future of their sectors and then assessed the risk associated with developing these options. Out of this exercise it was possible to create alternative proposals for deepening, and stretching and leveraging the sectors (1997, p. 6)." Stough, Stimson and Roberts identify a set of future Northern Virginia industry clusters from the results.

It should be emphasized that Stough, Stimson and Roberts’ cluster findings are more consistent with a single-industry definition of clusters (as in micro studies) than with a broader value-chain definition. Nevertheless, the MSQA technique is suggestive of ways that more systematically collected expert opinion can be incorporated in meso-level cluster analysis.

3.3.2 Location Quotients

A very common, though limited and misunderstood, means of identifying regional industry clusters is the location quotient (LQ). The location quotient is simply a ratio of employment shares: regional industry i’s share of total regional employment over national industry i’s share of total national employment. An LQ of 1.0 indicates that the regional economy has the same share of employment in industry i as the nation as a whole.4 (Note that any other measure of economic activity and/or reference area could be used depending on the analysis.) Location quotients exceeding 1.25 are usually taken as initial evidence of a regional specialization in a given sector. The many potential conceptual and measurement pitfalls in using location quotients have been described in detail by others (see, for example, Isard et al. 1998, pp. 24-6).5 Here we focus on the value they have for industry cluster analysis.
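As a minimal sketch of the ratio just described, the following computes employment-based location quotients and flags sectors above the 1.25 rule of thumb. The industry names and employment counts are hypothetical, and the function name is ours.

```python
def location_quotients(regional_emp, national_emp):
    """Return {industry: LQ} from two {industry: employment} mappings."""
    regional_total = sum(regional_emp.values())
    national_total = sum(national_emp.values())
    quotients = {}
    for industry, employment in regional_emp.items():
        regional_share = employment / regional_total
        national_share = national_emp[industry] / national_total
        quotients[industry] = regional_share / national_share
    return quotients


# Hypothetical employment counts, for illustration only.
regional = {"textiles": 300, "furniture": 100, "services": 600}
national = {"textiles": 1_000, "furniture": 1_000, "services": 8_000}

lq = location_quotients(regional, national)

# Sectors exceeding the 1.25 cutoff are candidate regional specializations.
specialized = sorted(name for name, value in lq.items() if value > 1.25)
```

In this made-up region textiles holds 30 percent of regional employment against 10 percent nationally, so its quotient is roughly 3 and it is the only sector flagged. Note that the calculation uses no information on links between sectors.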

Applied in the traditional manner, location quotients say absolutely nothing about regional industry clusters. They are an industry-based technique and therefore offer no insight on interdependencies between sectors. Industry cluster studies that rely solely on location quotients to identify clusters are simply sector studies in disguise. Location quotients in concert with other techniques may contribute to a meso-level cluster analysis, however.

Top-down Versus Bottom-up Industry Cluster Analysis. There are two basic types of meso-level industry cluster analyses: top-down and bottom-up (see Exhibits 3.2 and 3.3). In the bottom-up approach, the analyst seeks to identify industry clusters by beginning with individual sectors and then finding linkages with other industries and related non-business institutions. In essence, the analyst builds a picture of regional industrial interdependence from the ground up, one sector at a time. The bottom-up approach is particularly appropriate in small regions with only a few industries, or in those places with only a few sectors with non-trivial employment. Top-down industry cluster methods attempt to identify industry clusters through various data reduction techniques (statistical cluster analysis, factor analysis, and the like). They are appropriate when there is sufficient industrial diversity in the regional economy to preclude a sector-by-sector "piecing together" of the picture of regional economic interdependence. What top-down methods surrender in terms of control over the analysis they gain in terms of their capacity to make sense of complexity.

Location quotients can be used in bottom-up analyses as one of several simple measures of sector performance. The full set of regional industries might be ordered alternatively by size (measured in employment, value-added, income, or other terms), number of establishments, growth rates, specialization (location quotients), change in specialization (rate of change in the location quotient), share of total regional activity, share of total national activity, change in regional and national shares, and so on. Several categories of sectors might then be selected to begin the analysis, e.g., largest sectors, major specializations, growth industries (or combinations, such as growing specializations). Input-output data (see below) or other data on formal and informal linkages may then be used to map out value chains (suppliers and buyers of the target sectors).
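The screening step described above might be sketched as follows. The sector records, the 1.25 cutoff, and the choice of three screening categories are illustrative assumptions, not a prescribed procedure.

```python
def screen_sectors(sectors, top_n=2, lq_cutoff=1.25):
    """Pick starting sectors for a bottom-up analysis using simple measures."""
    by_size = sorted(sectors, key=lambda s: s["employment"], reverse=True)
    return {
        "largest": [s["name"] for s in by_size[:top_n]],
        "specializations": [s["name"] for s in sectors if s["lq"] > lq_cutoff],
        "growing_specializations": [
            s["name"] for s in sectors
            if s["lq"] > lq_cutoff and s["growth"] > 0
        ],
    }


# Hypothetical regional sectors: employment, location quotient, growth rate.
sectors = [
    {"name": "textiles", "employment": 12_000, "lq": 3.1, "growth": -0.04},
    {"name": "furniture", "employment": 8_000, "lq": 2.4, "growth": 0.02},
    {"name": "electronics", "employment": 3_000, "lq": 0.9, "growth": 0.10},
    {"name": "food processing", "employment": 9_000, "lq": 1.1, "growth": 0.01},
]

targets = screen_sectors(sectors)
```

The selected sectors are only starting points; linkage data are still needed to trace their suppliers and buyers.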

Ultimately, location quotients are only useful in concert with methods that utilize, in some form, information on industrial interdependence. Even then, they can only play a minor role in identifying clusters. Spatial and economic interdependence are the two key features of the regional industry cluster concept. We now turn to the principal means of studying industrial interdependence: input-output techniques.

3.3.3 Identifying Clusters via Input-Output

Regional scientists have long used a range of methodologies, including graph theory, triangularization, and factor/principal components analysis for sorting industries into groups based on input-output (IO) linkages. Czamanski and Ablas (1979) provide a useful review of early contributions. A more recent study uses statistical cluster analysis to group sectors for Alberta, Canada (Roberts 1992). U.S. Census researchers also recently used statistical cluster analysis to combine SIC sectors into groups that presumably shared the same production technologies (Abbott and Andrews 1990). Feser and Bergman (1999) use factor analysis of the U.S. input-output table to construct U.S. value-chain "templates" for use in the descriptive analysis of potential trading patterns in North Carolina (discussed in more detail below; see also Bergman 1998). Other examples of input-output based applications include Scott and Bergman (1997), Hewings et al. (1998), and Roelandt and den Hertog (1999).

An important input-output approach applied in a number of OECD countries is based on analysis of innovation interaction matrices rather than (or sometimes in concert with) traditional production flow matrices. Debresson (1996) offers a comprehensive source for techniques and examples of such analyses. Innovation matrices, derived from surveys (e.g., the Community Innovation Survey of Eurostat), describe flows of innovations between innovation-producers and innovation-users. As noted by Roelandt and den Hertog (1999, p. 5), the principal advantage of innovation matrices is "their focus on actual innovation interdependency and actual interaction between industry groups when innovating." Disadvantages are the costliness of data collection and conceptual difficulties in survey design. A survey similar to Eurostat’s Community Innovation Survey has not been conducted for the United States.

Acknowledging the considerable advances made by the innovation survey approach, we concentrate here on the analysis of production flows. We begin by describing a set of general steps in input-output cluster analyses, and particularly conceptual decisions that have to be made along the way. We then provide an example of an input-output industry cluster analysis, our own study of potential clusters in North Carolina. We then briefly contrast our approach with that of several others, mainly to highlight major methodological differences.

Analytical Steps. There are five major steps to conducting an input-output based industry cluster analysis:

  1. Define industry clusters (existing or potential/emerging, localized or non-localized);
  2. Determine whether a top-down or bottom-up method is appropriate;
  3. If top-down, identify an analytical method (statistical cluster analysis, factor analysis, other);
  4. Collect data;
  5. Apply and interpret analysis.

The first step essentially entails framing the policy issue (or set of issues) the cluster analysis is intended to inform. In Chapter 2 we make a distinction between potential (possibly emerging) and existing clusters. We also emphasize that industry clusters may manifest themselves at different spatial scales. Choices regarding existing/potential and spatial scale may determine the kind of input-output data that are most appropriate for the analysis.

Whether or not an analyst should use a regional or national input-output table to identify regional clusters is usually regarded as obvious: a regional table should be used since only it provides information about regional trading patterns. But, in actuality, the decision is not so simple. It is true that only regional input-output tables provide information about existing trading patterns between sectors currently in the region (the same is true of regionalized national input-output tables). But because such tables provide no insight regarding interdependence of industries absent in the study area, they cannot be used to explicate possible development paths or avenues for regional diversification. For that purpose, a national table must be used, or, if such existed, a "global" input-output table. Using a "global" table, one could identify industrial interdependency among sectors regardless of location and then investigate, perhaps with the help of a regionalized table, possible linkages between and among those sectors in the region. Since there is no such thing as a global table, a national table (particularly in highly diverse economies such as the United States) constitutes a workable substitute.

Once a decision regarding regional- or national-level analysis (or perhaps a combination) is reached, the analyst must decide whether to utilize a top-down or bottom-up methodology. Some regions are so small or contain so few sectors that use of a data-reduction technique is unwarranted. Connections between sectors can be identified by constructing simple measures of input usage and sales (several are defined below). In section 3.3.4, we briefly summarize some graphical network analysis techniques that are particularly appropriate for bottom-up applications. They permit the visual description of cross-sectoral linkages and can be combined (using a variety of visual dimensions) with descriptive data on regional industries to effectively "overlay" information on interdependence with indicators of regional industry performance.
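To give a sense of what simple measures of input usage and sales can look like in a bottom-up application, the sketch below computes purchase shares and sales shares from a small interindustry transactions matrix. These two measures, the 0.5 cutoff, and the three-sector data are illustrative assumptions and are not necessarily the measures defined later in the text.

```python
def purchase_shares(Z):
    """P[i][j]: share of industry j's intermediate purchases bought from i."""
    n = len(Z)
    col_totals = [sum(Z[i][j] for i in range(n)) for j in range(n)]
    return [[Z[i][j] / col_totals[j] if col_totals[j] else 0.0
             for j in range(n)] for i in range(n)]


def sales_shares(Z):
    """S[i][j]: share of industry i's intermediate sales going to j."""
    row_totals = [sum(row) for row in Z]
    return [[x / total if total else 0.0 for x in row]
            for row, total in zip(Z, row_totals)]


def strong_links(Z, names, cutoff=0.5):
    """Seller-buyer pairs where either share reaches the cutoff."""
    P, S = purchase_shares(Z), sales_shares(Z)
    n = len(Z)
    return [(names[i], names[j])
            for i in range(n) for j in range(n)
            if i != j and max(P[i][j], S[i][j]) >= cutoff]


# Hypothetical transactions: Z[i][j] is the value sold by industry i to j.
names = ["fibers", "textiles", "apparel"]
Z = [[0, 40, 10],
     [20, 0, 30],
     [5, 10, 0]]

P = purchase_shares(Z)
S = sales_shares(Z)
links = strong_links(Z, names)
```

Because both measures are relative, a small sector can register a strong link with a large one; the apparel-to-textiles pair appears here for that reason.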

Step three involves identifying a data reduction method (for top-down applications). The two most common in industry cluster studies are statistical cluster analysis and factor analysis. A principal difference between the two is that the former yields mutually exclusive groups of industries. Though this aids interpretation, it is frequently unrealistic. Due to complex trading patterns, industries tend to trade with sectors that belong to multiple clusters (though their links to each cluster vary in strength). Factor analysis can accommodate, and even provide ways to explore, this complexity. All data reduction techniques, which are themselves primarily exploratory methods, involve numerous user-defined assumptions. With today’s user-friendly statistical software, it is easy to produce a cluster or factor analysis in seconds with minimal user input other than the base data. However, default assumptions embedded in canned software routines should be carefully examined and modified as appropriate.
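The practical difference between mutually exclusive and overlapping groupings can be seen in a small sketch. The loadings are hypothetical, the 0.5 cutoff is an arbitrary assumption, and assigning each industry to its single strongest factor only mimics the exclusive output of a clustering routine; it is not a cluster analysis.

```python
# Hypothetical rotated factor loadings (one row per industry, one value per factor).
loadings = {
    "yarn mills": [0.85, 0.10],
    "apparel": [0.78, 0.05],
    "plastics": [0.62, 0.55],
    "motor vehicle parts": [0.08, 0.91],
}


def exclusive_groups(loadings):
    """Assign each industry to the one factor with its largest absolute loading."""
    return {name: max(range(len(row)), key=lambda k: abs(row[k]))
            for name, row in loadings.items()}


def overlapping_groups(loadings, cutoff=0.5):
    """List every factor on which an industry's absolute loading reaches the cutoff."""
    return {name: [k for k, value in enumerate(row) if abs(value) >= cutoff]
            for name, row in loadings.items()}


exclusive = exclusive_groups(loadings)
overlapping = overlapping_groups(loadings)
```

Here plastics is forced into the first group under the exclusive rule, while the loadings show it tied almost as strongly to the second.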

Procedures involved in data collection and analysis/interpretation obviously vary from case to case. Definitional considerations and data collection issues in input-output analysis, particularly for the U.S. case, are reviewed in Miller and Blair (1985).

A Note on Data Sources. The principal source of input-output data in the United States is the Benchmark Input-Output Accounts of the United States, produced twice every decade in years ending in 2 and 7. The latest table available at this writing was for 1992; 1997 is scheduled to be released in 2000. Regionalized tables for the U.S. are available from the Bureau of Economic Analysis, or from several proprietary sources. Minnesota Implan Group, Inc., for example, produces relatively inexpensive economic impact analysis software from which regionalized tables can be extracted. Regionalization techniques used in Implan software, or by any other vendor of regional analysis software (e.g., Regional Economic Models, Inc.), are well-known and can be replicated given the necessary data. Miller and Blair (1985) and Isard et al. (1998) outline various methods for regionalizing national IO tables in detail. Survey-based tables for specific regions in the U.S. are very rare. A very recent description of socioeconomic data series useful in regional analysis (including IO) is Cortright and Reamer (1998).

Example. Here we illustrate a top-down meso-level analysis designed to identify potential clusters and sectoral interdependencies. The study was initially conducted in support of a technology diffusion program at the state level and is reported in detail in Bergman, Feser, and Sweeney (1996), Feser and Bergman (1999), and Bergman (1998). The policy agency wanted to target specific manufacturing sectors for technology adoption assistance such that within industry value-chains, internal pressures for the diffusion of advanced production technologies would be created. The agency was also interested in identifying elements of value-chains that could be singled out for a variety of industrial development strategies (see Appendix 1). With those considerations in mind, we first analyzed U.S. input-output patterns to identify a set of industry cluster "templates," national-level manufacturing value chains. We then used the chains in combination with confidential establishment-level employment and wage data to characterize the presence of the chains in the state (North Carolina). Sub-state-level analyses and simple mapping of establishments in each cluster gave some indication of regional clustering patterns. Chapter 4 uses findings from the study to illustrate a range of techniques and exploratory methods for further analyzing regional industrial interdependence.

Our approach applies principal components analysis to a matrix of national interindustry linkages (derived from the 1987 U.S. IO table) to derive clusters. Principal components factor analysis exploits the common statistical variation among multiple variables to generate a reduced number of "principal components" that represent linear combinations of the original set of variables. Measures of interindustry direct and indirect linkages computed from the input-output accounts for each sector are treated as variables. The derived components are then rotated to a varimax solution to facilitate interpretation. The methodological details behind factor analysis are beyond the scope of this monograph; Tinsley and Tinsley (1987) provide a summary introduction.
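As a rough illustration of the procedure just described, the following Python sketch extracts principal component loadings from a correlation matrix and rotates them to a varimax solution. It is a minimal sketch under stated assumptions, not the code used in the study: the input data are random stand-ins for linkage measures, and the function names pca_loadings and varimax are hypothetical.

```python
import numpy as np

def pca_loadings(R, k):
    """Loadings of the first k principal components of a correlation matrix R."""
    vals, vecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]          # indices of the k largest
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

def varimax(A, max_iter=200, tol=1e-8):
    """Orthogonally rotate a loading matrix A (variables x components) to varimax."""
    p, k = A.shape
    R = np.eye(k)                               # rotation matrix, starts as identity
    crit = 0.0
    for _ in range(max_iter):
        B = A @ R
        G = A.T @ (B**3 - B @ np.diag((B**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(G)
        R = u @ vt                              # nearest orthogonal matrix to G
        new_crit = s.sum()
        if new_crit <= crit * (1 + tol):        # stop when the criterion stalls
            break
        crit = new_crit
    return A @ R

# Illustrative data only (assumption): 200 observations on 6 variables,
# with two correlated pairs so that two components are meaningful.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6))
data[:, 1] += data[:, 0]
data[:, 3] += data[:, 2]

R = np.corrcoef(data, rowvar=False)
raw = pca_loadings(R, k=2)
rotated = varimax(raw)
```

Because the rotation is orthogonal, each variable's communality (the row sum of its squared loadings) is unchanged; only the distribution of loadings across components shifts, which is what makes the rotated solution easier to interpret.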

The input into the factor analysis is a matrix of interindustry linkages between all sectors in the U.S. manufacturing economy. There are a variety of ways such matrices can be developed. As an initial approach, one can group only those industries with non-zero employment in the study region based on those sectors’ estimated patterns of commodity use and production, as revealed by the U.S. make and use tables. This involves scaling the use and make tables with study area wage data, followed by conducting a factor analysis on the resulting matrices. Note that no assumptions are made regarding where, in geographic terms, study region industries purchase their inputs or sell their outputs.

The 1987 U.S. use matrix (U), of dimension 478 x 519, reports the dollar value of each of 519 commodities used by each of 478 producing U.S. I-O industries (see End Note 6). To focus only on manufacturing, U can be reduced to a 362 x 519 manufacturing use matrix (UM). Given 362 x 1 vectors of total manufacturing wages by industry for the U.S. (wUS,M) and study region (wNC,M), a 362 x 519 scaled use matrix (UNC) can be derived that reports the estimated dollar value of 519 commodities used by 362 study region I-O industries:

Each cell entry in UM,W is the ratio of output of commodity i purchased by U.S. I-O industry j to the total wages paid by industry j. Applying factor analysis to the resulting n x 519 data matrix clusters industries based on commodity use patterns. Once industries with no wages in the study region are dropped, the reduced 328 x 519 UNC matrix is identical, in terms of the factor analysis, to a 328 x 519 UM matrix (where the industries without a presence in the study region are removed); the use of study region wages to adjust the use matrix provides a simple means of performing this basic adjustment. Repeating similar matrix operations and factor analysis for the make matrix generates clusters based on commodity production patterns.
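The scaling described above can be written compactly. The following is a reconstruction from the definitions in the text rather than a reproduction of the original formula; the hat notation (a vector placed on the diagonal of a square matrix) is an assumption introduced here.

```latex
U_{M,W} = \hat{w}_{US,M}^{-1}\, U_{M},
\qquad
U_{NC} = \hat{w}_{NC,M}\, U_{M,W}
```

Under this reading, each row of the manufacturing use matrix is first divided by the industry's U.S. wages and then multiplied by its study region wages, so that industries with zero study region wages yield rows of zeros.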

While such an approach reveals differences in clustering based on commodity use and production patterns, it provides no means of jointly evaluating interindustry linkages to derive one set of clusters. It thus makes the final derivation of clusters considerably more complicated and the interpretation of any final result more difficult. Roepke, Adams, and Wiseman (1974) suggest a different approach. First, a standard 478 x 478 interindustry transactions matrix (T) is derived from an adjusted use matrix UA, a 516 x 1 vector of commodity outputs (OC), and a 516 x 478 commodity by industry make matrix (M) (see End Note 7):

Each cell (aij) in T gives the dollar value of goods and services sold by row industry i to column industry j. Since industries may be related by both input and output patterns, a symmetric matrix LT is derived from T such that,

Each column in LT gives the pattern of total (input and output) linkage between the given column industry and every other (row) industry. Eliminating non-manufacturing industries from the rows and columns of LT and subjecting the resulting data matrix to factor analysis generates a set of industry clusters.
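The two steps above can be sketched in Python on toy data. This is a hedged illustration, not the authors' formulas: it assumes the industry-based technology assumption is applied as T = M' diag(OC)^-1 UA', that commodity outputs equal the row sums of the make matrix, and that the symmetric matrix is formed as LT = T + T'. All numbers are invented for illustration.

```python
import numpy as np

# Toy dimensions: 3 industries and 4 commodities (the chapter uses 478 and 516).
# UA[i, c]: dollars of commodity c used by industry i (industry x commodity).
UA = np.array([[0.0, 2.0, 1.0, 0.0],
               [3.0, 0.0, 0.0, 1.0],
               [1.0, 1.0, 0.0, 0.0]])

# M[c, j]: dollars of commodity c made by industry j (commodity x industry).
M = np.array([[10.0, 0.0, 0.0],
              [0.0, 8.0, 2.0],
              [0.0, 0.0, 5.0],
              [1.0, 1.0, 2.0]])

# Assumption: commodity outputs are the row sums of the make matrix.
OC = M.sum(axis=1)

# Share of each commodity's output supplied by each industry (fixed proportions).
shares = M / OC[:, None]

# T[i, j]: estimated dollars sold by row industry i to column industry j.
T = shares.T @ UA.T

# Assumed symmetrization: total (input plus output) linkage between i and j.
LT = T + T.T
```

A quick consistency check on this construction is that each column of T sums to the column industry's total commodity purchases in UA, since every dollar of a commodity used is allocated across the industries that make it.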

The drawback of the Roepke, Adams, and Wiseman approach is that evidence of indirect linkages (e.g., relationships between sectors based on links between second- and third-tier buyers and suppliers) will be largely absent from the groupings. A third approach employs a slightly different interindustry linkage measure. Czamanski (1974) demonstrates that given, for each industry, total intermediate good purchases (p) and sales (s), the type of functional relationship between any two industries, i and j, may be expressed in terms of four coefficients (where a is defined as above):

xij = aij / pj,   xji = aji / pi,   yij = aij / si,   yji = aji / sj

Each coefficient is an indicator of dependence between i and j, in terms of relative purchasing and sales links:

xij, xji: intermediate good purchases by j (i) from i (j) as a proportion of j’s (i’s) total intermediate good purchases. A large value for xij, for example, suggests that industry j depends on industry i as a source for a large proportion of its total intermediate inputs.
yij, yji: intermediate good sales from i (j) to j (i) as a proportion of i’s (j’s) total intermediate good sales. A large value for yij, for example, suggests that i depends on industry j as a market for a large proportion of its total intermediate good sales.

Selecting the largest of the four coefficients for each pair of manufacturing industries yields a symmetric data matrix LU, which, when subjected to principal components analysis, generates clusters that at least partially capture indirect linkages between industries.
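The construction of LU can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the transactions matrix is a toy example, total purchases (p) and sales (s) are taken as the column and row sums of that same matrix, and the function name czamanski_lu is hypothetical.

```python
import numpy as np

def czamanski_lu(T):
    """Return X, Y and the symmetric matrix LU for a transactions matrix T.

    T[i, j] is the dollar value sold by industry i to industry j.
    """
    T = np.asarray(T, dtype=float)
    p = T.sum(axis=0)   # total intermediate purchases of each column industry
    s = T.sum(axis=1)   # total intermediate sales of each row industry

    # x_ij = a_ij / p_j and y_ij = a_ij / s_i; zero totals give zero shares.
    X = np.divide(T, p[None, :], out=np.zeros_like(T), where=(p > 0)[None, :])
    Y = np.divide(T, s[:, None], out=np.zeros_like(T), where=(s > 0)[:, None])

    # For each pair (i, j), keep the largest of x_ij, x_ji, y_ij, y_ji.
    LU = np.maximum.reduce([X, X.T, Y, Y.T])
    return X, Y, LU

# Invented three-industry example.
T = np.array([[0.0, 4.0, 1.0],
              [2.0, 0.0, 3.0],
              [2.0, 6.0, 0.0]])
X, Y, LU = czamanski_lu(T)
```

In this toy case the strongest link between industries 0 and 1 is y01 = 4/5, meaning industry 0 relies on industry 1 for 80 percent of its intermediate sales, and that value is what enters LU for the pair.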

In this case, functional linkages between pairs of industries are investigated in isolation. Correlation analysis, by contrast, permits the assessment of linkages between pairs of industries based on their total patterns of sales and purchases across multiple industries. Each column (x) in a matrix of x’s, X, gives the intermediate input purchasing pattern of the column industry. Each column (y) in a matrix of y’s, Y, gives the intermediate output sales pattern of the column industry. Four correlations describe the similarities in input-output structure between two industries l and m:

r(xl×xm) measures the degree to which industries l and m have similar input purchasing patterns;
r(yl×ym) measures the degree to which l and m possess similar output selling patterns, i.e. the degree to which they sell goods to a similar mix of intermediate input buyers;
r(xl×ym) measures the degree to which the buying pattern of industry l is similar to the selling pattern of industry m, i.e. the degree to which industry l purchases inputs from industries that m supplies;
r(yl×xm) measures the degree to which the buying pattern of industry m is similar to the selling pattern of industry l, i.e. the degree to which industry m purchases inputs from industries that l supplies.

When working with a reduced set of industries (e.g., only manufacturing sectors), the four correlations can be calculated for each pair of industries using alternative specifications of X and Y. One specification consists of buying and selling patterns for each member of the reduced set of industries across all other industries in the reduced set itself. Another specification consists of buying and selling patterns for each member of the reduced set of industries across all other industries, both in and out of the reduced set. In the case of an analysis of the manufacturing sector alone, interindustry correlations calculated using the second specification of X and Y also account for similarities in manufacturing industries’ sales/purchase patterns to/from non-manufacturing industries (e.g. construction, wholesaling, services).

Deriving the correlations from the first set of X and Y matrices and selecting the largest of the four between each pair of industries yields a symmetric matrix, LV. Each column of LV describes the pattern of linkage between the column industry and all other industries in the study set. Factor analysis can then be used to identify groups of related industries.
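The derivation of LV can be sketched in Python as below. It is an illustration under assumptions, not the study's code: the transactions matrix is random, X and Y are computed from its own column and row totals (all assumed positive), and the function name linkage_correlations is hypothetical.

```python
import numpy as np

def linkage_correlations(T):
    """Largest of the four purchase/sales pattern correlations for each industry pair."""
    T = np.asarray(T, dtype=float)
    n = T.shape[0]

    # Column j of X: purchasing pattern of industry j (shares of its total purchases).
    X = T / T.sum(axis=0, keepdims=True)
    # Column i of Y: sales pattern of industry i (shares of its total sales).
    Y = (T / T.sum(axis=1, keepdims=True)).T

    # Correlate every x and y column with every other in one pass.
    C = np.corrcoef(np.hstack([X, Y]), rowvar=False)
    r_xx, r_xy = C[:n, :n], C[:n, n:]   # r(x_l, x_m) and r(x_l, y_m)
    r_yx, r_yy = C[n:, :n], C[n:, n:]   # r(y_l, x_m) and r(y_l, y_m)

    # Elementwise maximum of the four correlations gives the symmetric LV.
    return np.maximum.reduce([r_xx, r_yy, r_xy, r_yx])

# Invented five-industry transactions matrix with strictly positive entries.
rng = np.random.default_rng(1)
T = rng.uniform(1.0, 10.0, size=(5, 5))
LV = linkage_correlations(T)
```

The result is symmetric because r(xl, ym) for the pair (l, m) is the same quantity as r(ym, xl) for the pair (m, l), so taking the maximum over all four correlations treats both orderings of a pair identically.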

For each factor (group of industries), the analysis generates a set of loadings, which represent the correlations of the variables with the factor. The loadings provide a measure of the relative strength of the linkage between a given industry and a derived factor, where the highest loading industries on a given factor are treated as members of an industrial cluster. Standard procedure in factor analysis is often to treat only loadings greater than 0.5 (in absolute value terms) as significant or worthy of interpretation. This approach, however, does not provide a means of interpreting gradations in loadings. For example, industries with loadings exceeding 0.75 on a given cluster might be regarded as closely linked to that cluster, while industries with loadings from 0.5 to 0.75 and from 0.35 to 0.50 may be viewed as only moderately and weakly linked, respectively. For the reasons described below, analysts should adopt a combination of rules of this type. Because any approach to delineating cluster industries from factor analysis output is necessarily partially arbitrary, loadings should also be reported to allow study users to draw their own conclusions.

In interpreting the factor analytic results to identify specific industrial clusters, analysts typically face several competing objectives. First, they want to derive a set of clusters based on the most significant linkages as revealed in the IO data matrix. According to that objective, the concern is to identify the industries with the tightest linkages to each cluster (i.e., the highest loading industries for each factor), regardless of whether or not some of those industries are also tightly linked to another cluster. Frequently a second objective is to identify, to the degree possible, a set of mutually exclusive clusters in the sense that each sector would be assigned to only one cluster. Such a result facilitates cross-cluster comparisons of size and growth rates using regional economic data sources. A common third objective is to investigate the linkages both between clusters and between industries within each cluster. Such linkages are sometimes revealed by an examination of sectors that are only moderately or weakly related to each cluster, an examination that competes with the first objective.

Such multiple objectives can be met, at least partially, by distinguishing membership in each cluster according to the strength of linkage as suggested by the loading. We derived, for example, a set of "primary" and "secondary" industries. Although there are alternative means of doing this, we suggest the following definitions based on our experience. Primary industries for a given cluster are those sectors that achieve their highest loading on that factor and whose highest loading is 0.60 or higher. Secondary industries for a given cluster are those sectors that achieve loadings on the cluster equal to or greater than 0.35 but less than 0.60. For some clusters, the set of secondary industries will include industries with loadings exceeding 0.60 but that achieve their highest loading on a different cluster.
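The membership rules just defined can be expressed as a short Python sketch. The loadings below are invented for illustration, the function name classify_members is hypothetical, and the use of absolute values of the loadings is an assumption carried over from the conventional 0.5 rule rather than something the definitions state.

```python
import numpy as np

def classify_members(loadings, primary_cut=0.60, secondary_cut=0.35):
    """Flag primary and secondary cluster membership from a loadings matrix.

    loadings: industries (rows) x clusters (columns).
    Returns two boolean arrays of the same shape.
    """
    A = np.abs(np.asarray(loadings, dtype=float))  # assumption: use absolute loadings
    n, k = A.shape
    rows = np.arange(n)
    best = A.argmax(axis=1)                        # cluster with each industry's top loading

    # Primary: highest loading falls on this cluster and is at least 0.60.
    primary = np.zeros((n, k), dtype=bool)
    primary[rows, best] = A[rows, best] >= primary_cut

    # Secondary: loading of at least 0.35 on a cluster where the industry is not primary.
    secondary = (A >= secondary_cut) & ~primary
    return primary, secondary

# Invented loadings for four industries on two clusters.
loadings = np.array([[0.82, 0.10],   # primary in cluster 0 only
                     [0.65, 0.70],   # primary in cluster 1, secondary in cluster 0
                     [0.40, 0.20],   # secondary in cluster 0, primary nowhere
                     [0.30, 0.25]])  # below both cutoffs
primary, secondary = classify_members(loadings)
```

The second row shows the case noted in the text: a loading of 0.65 exceeds the primary cutoff, yet the industry is only a secondary member of cluster 0 because its highest loading falls on cluster 1. Restricting attention to the primary flags gives at most one cluster per industry.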

Based on those definitions, as a general rule, primary industries are those that are most tightly linked to a given cluster while secondary industries are those that are less-tightly or moderately linked. Considering only primary industries yields a set of mutually exclusive industrial clusters that can be used for cross-comparison purposes. But some caution should still be exercised in interpreting the clusters derived on this basis since some "secondary" industries will actually be more tightly linked to a given cluster than a few of the primary industries in the same cluster. Often the advantages of deriving a set of mutually exclusive clusters will be viewed as significant enough to warrant the pragmatic approach.

Our analysis identified 23 clusters in the U.S. manufacturing sector (see Exhibit 3.4). Basic summary data on the 23 clusters are provided in Exhibits 3.5 and 3.6. Exhibit 3.5 presents the breakdown of the clusters when both primary and secondary sectors are included in the cluster definition; the clusters in Exhibit 3.6 are constituted solely of primary sectors. The clusters consist of heavy manufacturing (e.g., metalworking, vehicle manufacturing, chemicals and rubber, nonferrous metals), light manufacturing (e.g., electronics and computers, knitted goods, fabricated textiles, wood products, leather goods, printing and publishing), five separate food-related clusters, and several clusters closely related to other major clusters (e.g., brake and wheel products and platemaking and typesetting). With the exception of the growth in importance of key high tech clusters (electronics and computers and aerospace), the set of clusters is roughly similar to results found in earlier cluster studies conducted using input-output data from the 1960s and 1970s. Also reported in the tables are the number of 3- and 4-digit SIC sectors that make up each cluster (column 3 in each exhibit), as well as the number of different 2-digit SIC sectors represented (column 4).

In addition to relative size, the exhibits highlight two key features of the clusters. First, the number of component sectors in each cluster varies dramatically, from 116 in the metalworking cluster to just 4 in the tobacco products cluster (when both primary and secondary industries are included in the cluster definitions). Clusters with the largest number of component sectors sometimes include multiple final market product chains, whereas smaller clusters (tobacco, dairy products, meat products, etc.) generally describe only a single major final market product chain. Second, most clusters are composed of sectors from a variety of 2-digit level SIC industries. Sectors from 10 different 2-digit SIC industries are represented in the metalworking cluster, for example; sectors from 16 different 2-digit SIC categories make up the vehicle manufacturing cluster. Therefore, although the 23 clusters are similar in number to the 20 official 2-digit SIC classifications, they are, in fact, very different in composition. Template clusters defined on the basis of interindustry linkages generate a unique picture of the manufacturing economy when used in subsequent economic analyses. See Bergman, Feser and Sweeney (1996) and Feser and Bergman (1999) for a description of the basic makeup and characteristics of the largest of the 23 U.S. clusters.

Exhibit 3.7 provides the detailed sectoral makeup of the 23 clusters. The columns labeled Cluster ID provide a rough indication of some of the linkages between the vehicle manufacturing cluster and the remaining 22 clusters, though a complete analysis is possible only with primary input-output data and detailed intersectoral comparisons. The cluster to which a given sector is most tightly linked is given in column L1. L2 and L3 report additional clusters, if any, to which the sector is also moderately linked based on our criteria.

For example, as might be expected from the high metal content of most transportation equipment industries, 20 of 58 total primary and secondary industries in the vehicle manufacturing cluster are also members of the metalworking cluster. Other sectors are members of an additional 10 clusters, with the chemicals and rubber (including plastics), printing and publishing, fabricated textile products, and electronics and computers clusters the most significant in terms of number of cross-cluster linkages. Not surprisingly, the vehicle manufacturing cluster is also closely linked to the brake and wheel products cluster, which itself shares most of its component industries with the former as well as the metalworking cluster.

For 44 of the 362 manufacturing sectors, sectoral interdependencies are too weak to qualify them as a primary industry in any cluster. Therefore, another category of industries remains that requires attention here. The last row of Exhibit 3.6 reports the total number of U.S. companies, establishments, employees, and value-added represented by such industries in 1992. At over 11 percent of total manufacturing value-added, these "independent" industries constitute a significant share of U.S. manufacturing production. Exhibit 3.8 lists the industries that failed to load as a primary industry on any cluster along with their maximum factor loading and the cluster on which this loading was achieved (see End Note 8). The most significant of the independent industries are pharmaceuticals (SIC 283), paper and paperboard mills (262-3), photographic equipment and supplies (386), and toilet preparations (2844).

Additional Points and Clarifications. In Chapter 4, we demonstrate how the cluster templates can be used to "see regional economies whole." Our example is specific to the policy needs of the technology agency that commissioned it. Nevertheless, the national templates can be used for studies in any U.S. region where the overriding concern is not knowledge of actual local trading patterns but rather a means of identifying potential cluster firms. They can also be used in conjunction with bottom-up methods. Exhibit 3.9 maps out supplier linkages to the non-upholstered household furniture sector and, using the templates, illustrates how different industries in the chain are linked to different manufacturing clusters.

A number of clarifying points are in order regarding the top-down input-output illustration. First, although the use of the national table yields clusters with very specific uses, the basic techniques to derive the clusters (measures of interindustry linkages and factor analysis) can be employed in a variety of circumstances (e.g., with regional input-output tables).

Second, although the derived industry clusters are obviously based on formal trading patterns, the construction of the linkage measures in combination with the factor analysis means that many indirect trading patterns are considered. The clusters may be viewed, in one sense, as an excellent first guess of what sectors are likely to engage in both formal and informal kinds of cooperative behavior, provided we believe cooperative relationships are most likely to occur between firms in sectors with rough technological affinities. This is another instance in which IO-based approaches can provide support to micro or more qualitative analyses.

Third, early regional science research on industrial complexes (see definitions in Chapter 2) has already demonstrated that it is a mistake to attempt to replicate the national industrial mix at the regional level. The templates do not provide a blueprint for how any region should develop, but rather serve as an analytical device to further analyze regional industrial interdependence. This will become clearer in Chapter 4.

3.3.4 Network Analysis

A relatively novel way of identifying industry clusters is through network analysis of linkages between firms or sectors. The most obvious data sources are trade or innovation-based input-output tables; however, surveys of regional experts or other qualitative sources of connections between regional industries can also be used. Indeed, qualitative analysis of industry clusters using techniques perfected in the social network analysis literature (see Wasserman and Faust 1994) is promising, though to our knowledge it has not been attempted. Debresson (1996, pp. 167-173) provides a short discussion of techniques for identifying clusters by directed graph (see also Debresson and Hu 1999).

The power of even simple descriptive network techniques can be illustrated using the vehicle manufacturing template from Section 3.3.3. To completely analyze linkages among the sectors that comprise the cluster, one could examine the base correlation matrices used in the factor analysis. Although this would provide the most comprehensive picture, the detail involved in summarizing relationships among 58 sectors precludes such an approach (there are 6,728 distinct linkages in total). An alternative is to use the indicators of dependence defined above (xij, xji, yij, yji) to identify the major relationships tying the cluster together. We used simple network graphing software to diagram key intracluster purchasing linkages in the vehicle manufacturing cluster.

Exhibit 3.10 is the result. Arrows are drawn between significant trading partners (i.e., the direction of an arrow between sectors i and j indicates that sector j purchases a significant share of its inputs from industry i, where "significant" is defined as exceeding a threshold based on the distribution of linkages between all sectors in the cluster). (SIC codes are defined according to the 1987 SIC system.) What the figure highlights is the core role of SIC 308, miscellaneous plastics products, in the U.S. vehicle manufacturing value chain. Also indicated are other sectors that serve as suppliers to multiple cluster industries.
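The thresholding step behind a diagram of this kind can be sketched in Python. This is an illustration only: the purchase-share matrix and sector labels are invented, the function name significant_edges is hypothetical, and the cutoff (an upper quantile of the off-diagonal purchase shares) is an assumed stand-in for the distribution-based threshold, which the text does not specify.

```python
from collections import Counter

import numpy as np

def significant_edges(X, labels, quantile=0.75):
    """List directed supplier -> buyer edges whose purchase share exceeds a cutoff.

    X[i, j] is the share of sector j's intermediate purchases bought from sector i.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    cutoff = np.quantile(X[off_diagonal], quantile)   # assumed threshold rule
    edges = [(labels[i], labels[j], float(X[i, j]))
             for i in range(n) for j in range(n)
             if i != j and X[i, j] > cutoff]
    return cutoff, edges

# Invented purchase shares for four hypothetical sectors.
labels = ["A", "B", "C", "D"]
X = np.array([[0.00, 0.30, 0.25, 0.28],
              [0.02, 0.00, 0.05, 0.01],
              [0.03, 0.04, 0.00, 0.20],
              [0.01, 0.02, 0.03, 0.00]])

cutoff, edges = significant_edges(X, labels)

# Count how many buyers each supplier serves; high counts mark hub suppliers.
supplier_counts = Counter(src for src, _, _ in edges)
```

In this toy example every retained arrow originates from sector A, which is the kind of pattern that would mark a sector as a core supplier in a diagram of the cluster. The edge list can be passed to any graph drawing tool.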

The principal challenge of graphical network analysis techniques for identifying regional industry clusters is finding ways to interpret the revealed complexity. Software for the purpose is still limited. What is available is geared toward social network analysis, though even sociologists suffer from a lack of good software. Freeman (1999), for example, provides a recent review of molecular modeling software that can be used, imperfectly, to generate images of social networks (available at eclectic.ss.uci.edu/~lin/chem.html). Developing better graphical techniques and associated software is a potential area of research for industry cluster analysts.

3.3.5 Surveys

In principle, one could survey regional firms to identify local and non-local trading patterns, cooperative alliances, and so on. Not surprisingly, however, survey-based methods for analyzing industry clusters are very rare. Surveys are expensive, and the level of detail required in the survey instrument to fully explicate cross-firm trading patterns and informal linkages is almost always prohibitive. There does seem to be potential for marrying limited surveying with other quantitative methods. To our knowledge, there have been few if any attempts to do this.

3.4 Summary

This chapter summarizes a range of techniques for identifying regional industry clusters. We began by characterizing micro-level cluster analyses, usually of the industrial district variety, which are labor-intensive examinations of cooperative behavior between firms in the same or closely similar industries. We then focused most attention on methods that attempt to identify clusters from a comprehensive analysis of the regional economy. Such approaches we labeled "meso-level analyses."

Industry cluster analysis is a relatively new trade, despite its modern origins in regional science in the 1960s and 1970s. Only since the early 1990s have industry cluster applications become numerous enough to begin to discern trends in methods and approaches. Yet most cluster studies retain a highly idiosyncratic element, often dictated as they are by place-specific policy concerns, resource constraints, data limitations, and varying interpretations of the theoretical literature. Over time, a more systematic and widely-held set of definitions and analytic techniques will probably emerge. Until then, would-be industry cluster analysts should acquaint themselves with the literature. The many citations contained in this chapter are a good start.

End Notes

  1. This sub-section draws upon work previously published in Bergman (1998).
  2. Unsuccessful groupings of similar industries, lacking inherent interest to study sponsors, remain relatively unresearched, leading to selection bias in available scholarship. Absent studies that investigate why certain firm clusters are unsuccessful, we cannot be confident of which factors are responsible for cluster success and which are simply common to clusters everywhere. The restricted study of successful clusters is due in part to Porterian-type analyses that were specifically intended to identify the factors most closely associated with "competitive clusters."
  3. "As I mentioned at the beginning of this lecture, in 1895 the teenaged Miss Evans made a bedspread as a gift. The recipients and their neighbors were delighted with the gift, and over the next few years Miss Evans made a number of tufted items, discovering in 1900 a trick of locking the tufts into the backing. . .[two paragraph expansion traces origins of carpet cluster]. . .And so the little Georgia City (of Dalton) emerged as America’s carpet capital" (Krugman, 1991, pp. 60-61).
  4. Isard et al. (1998, pp. 26-30) also review two related measures of specialization/localization: the coefficient of localization and the localization curve.
  5. There are also policy pitfalls: "We find in the regional literature suggestions that those industries with location quotients greater than unity represent areas of strength within a region and ought, therefore, to be further developed; and, in somewhat contradictory fashion, that those industries with location quotients less than unity ought to be encouraged in order to reduce the drain of imports" (Isard, 1960, p. 494, as quoted in Higgins and Savoie, 1995, p. 156).
  6. One of the "industries" in the use table is an inventory valuation adjustment (I-O code 85.0000) and three "commodities" are not directly produced by business enterprises (noncomparable imports--I-O 80.0000, used and secondhand goods--I-O 81.0002, and rest of the world adjustment to final uses--I-O 83.0001).
  7. This operation invokes the "industry-based technology assumption," which assumes that the total output of a given commodity is provided by industries in fixed proportions. See Miller and Blair (1985). UA is U with noncomparable imports, secondhand goods, and rest of the world adjustment to final uses removed. Those "commodities" are not reported in the make matrix since they are not produced goods.
  8. Note that all of the independent sectors are classified as secondary industries in one or more clusters.

Copyright ©1999, Regional Research Institute, WVU

No portion of this web site can be reproduced on paper or electronically without express permission from the Regional Research Institute.