2 Simplicity versus Efficiency

2.1 Simplicity, Efficiency, and Economy in Forest Surveys

2.1.2 Innovative Statistics

One can find either frustration or excitement in the contemporary information dynamic. We advocate the latter view. In any case, complacency is imprudent. Accordingly, we turn next to some current realms of statistical innovation which have been explored in ecological and/or environmental contexts by the Center for Statistical Ecology and Environmental Statistics at The Pennsylvania State University. These topics are immediately relevant to multi-resource and ecosystem-oriented forest inventory. They further suggest alternative integrative views of forest inventory in spatial context that appeal to GIS and knowledge-based systems.

Meta-analysis, Encountered Data, and Weighted Distributions

Patil calls attention in numerous publications and presentations to a dismal cycle of no information, new information, and non-information. The essence of the message is that scientists and resource specialists, when confronted with a decision-making context, feel discomfort with a perceived inadequacy of data for addressing the question. Accordingly, they declare a need to mount a new data acquisition effort. Assuming that the new data collection effort is sanctioned, and having proceeded to obtain the new data, they again perceive deficiencies, so that the new data effectively become non-information since definitive direction for the pending decision is still not forthcoming. While this is perhaps more fundamentally a human psychological problem of anxiety concerning uncertainty, it nevertheless has a real impact on the decision-making process. It also contributes to a negative view of investments in surveys from the decision maker's perspective.

Whereas the decision maker had reasonably expected to obtain solid quantitative guidance for administrative action, the support received is soft at best.

To the decision maker it appears that the value of past inventory investments has become negligible. It further appears that new inventory investments have questionable utility. The decision maker thus tends to overweight personal intuition and to support inventory operations only to the extent of having a credible claim that inventory has been duly conducted. Clearly there is advantage to both decision makers and inventory specialists in breaking this cycle of negative feedback.

Although it may seem to be heresy, consider the scenario whereby inventory is obligated to render decision support on the basis of existing data, with new surveys only taking place when decision makers deem such guidance to be inadequate. Let there be a further inventory obligation to acquire supplemental data only to address particular deficiencies in existing data. Inventory must therefore combine evidence from disparate sampling operations in meeting its charge. This is known in statistical circles as Meta-Analysis, which now holds respectable stature in the discipline. PATIL (1991) provides a cogent view and review of meta-analysis in the context of statistical ecology and environmental statistics.

Since the area of present interest is unlikely to be framed exactly as in previous surveys, there is effectively no proper retrospective sampling frame. In searching the data archives, one will encounter some samples from each of several previous surveys having pertinence to the present problem. Part of the area may have been intensively sampled in one survey, and a different part lightly sampled in another survey. In some instances, the coverage of prior surveys will partially overlap. It is likewise inevitable that the prior surveys will have been made at different times, making some data more current than other data. It thus becomes necessary to weight available samples differentially in arriving at a combined estimate. This is a case of Encountered Data, and the set of weights constitutes a Weighting Function. The retrospective samples and their associated weights give rise to a Weighted Distribution.

Just as unequal selection probabilities for a proper sampling frame are used inversely for estimation, so also the weights are used inversely, with appropriate normalization, in arriving at a combined meta-analytical estimator. Since meta-analysis lends increased utility to past inventory investments, it would seem that inventory specialists should exhibit willingness to engage in it even without managerial coercion.
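The inverse use of weights with normalization can be illustrated with a minimal sketch. It assumes each encountered observation carries a positive weight proportional to the relative chance with which it was encountered; the function name and the ratio form of the estimator are illustrative assumptions, not a formula given in the text.

```python
from typing import Sequence


def weighted_combined_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Combine encountered observations by weighting each one inversely.

    Hypothetical helper: `weights` are assumed to be positive and proportional
    to the relative chance with which each observation was encountered.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if not values:
        raise ValueError("at least one observation is required")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    inverse = [1.0 / w for w in weights]  # inverse use of the weights
    normalizer = sum(inverse)             # normalization so the coefficients sum to 1
    return sum(v * u for v, u in zip(values, inverse)) / normalizer
```

With equal weights the estimate reduces to the ordinary mean; an observation that was twice as likely to be encountered receives half the influence.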

Encountered data and weighted distributions have a unifying purview that extends well beyond meta-analysis. The unifying quality of weighted distributions was perceived early on by RAO (1965), and revisited by him two decades later (RAO 1985). Encounter sampling with elaborately conceived weighting (detectability) functions forms the basis of transect sampling for fauna as discussed by PATIL et al. (1993). They consider the possibility that encountered and biased data may be more informative under weighted distributions than designed counterparts. PATIL (1984) has discovered weighted distributions as stochastic models in the equilibrium study of populations subject to harvesting and predation. MAHFOUD and PATIL (1981) and PATIL et al. (1986) have identified a Bayesian analogue to the theory of weighted distributions through the relationship of the posterior distribution to prior distribution via the likelihood.
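For orientation, the standard form of a weighted distribution and its Bayesian reading can be written as follows. The symbols f, w, π and L are notation assumed here for illustration; the text itself does not fix a notation.

```latex
% Weighted version of a density f under a nonnegative weighting function w:
f^{w}(x) = \frac{w(x)\, f(x)}{\mathrm{E}_{f}[\,w(X)\,]}

% Bayesian analogue: the posterior is the prior \pi weighted by the likelihood L,
% i.e. a weighted distribution with w(\theta) = L(\theta; x):
\pi(\theta \mid x) = \frac{L(\theta; x)\, \pi(\theta)}{\int L(\theta'; x)\, \pi(\theta')\, d\theta'}
```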

Ranked-Set Sampling

Ranked-set sampling (RSS) fills a void in the repertoire of surveys by offering opportunity for direct exploitation of capability for ordination. Its history also underscores the importance of recognizing generality of concepts that may be introduced in limited context. The Center for Statistical Ecology and Environmental Statistics at The Pennsylvania State University has played a substantial role in solidifying theory for this approach and calling attention to breadth of applicability.

RSS originated with MCINTYRE (1952) for estimation of pasture yields. After some years of inattention, it was similarly applied to forage under forest (HALLS and DELL 1966) and forest regeneration (EVANS 1967). These early explorations shared the narrow context of ranking to select one among a set of locally clustered plots for measurement. This particular context is affected by spatial autocorrelation among the members of local clusters.

RSS is conveniently viewed as a nonparametric sort of double sampling with sets. The first phase determination is a ranking, and the second phase determination is quantification of a subsample selected on the basis of rank. The scenario begins with choice of a set size (M) such that the M members of a set can be ranked with some degree of consistency. Although limited ranking errors do not obviate the method, increasing numbers of such errors will progressively degrade efficiency. The first phase sample is a series of sets, with each member of a set being individually selected (at random) from the population. When M sets of size M have been identified, the members are ranked within each set. One member of each set is then designated for quantification. The highest ranking member is designated from the first set, the second ranking member from the second set, and so on. Such a cycle of selection activity yields M samples for quantification in the second phase. The second phase sample size can be augmented in steps of M by repeating the selection cycle. Repeating the cycle R times will yield an ultimate sample of size MR.
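The selection cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the `rank_key` callable standing in for the (possibly imperfect) ranking judgment, and drawing each set without replacement are choices made here, not details from the text.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def ranked_set_sample(
    population: Sequence[T],
    set_size: int,
    cycles: int,
    rank_key: Callable[[T], float],
    rng: random.Random,
) -> List[Tuple[int, T]]:
    """Return (rank, unit) pairs designated for quantification.

    Hypothetical sketch: `rank_key` plays the role of the first-phase ranking;
    only the designated units would be measured in the second phase.
    """
    designated: List[Tuple[int, T]] = []
    for _ in range(cycles):                      # R repetitions of the cycle
        for rank in range(1, set_size + 1):      # one set per rank, M sets per cycle
            members = rng.sample(population, set_size)           # random set of M units
            ordered = sorted(members, key=rank_key, reverse=True)  # rank 1 = highest
            designated.append((rank, ordered[rank - 1]))          # keep the rank tag
    return designated
```

Each cycle contributes M designated units, so R cycles give MR rank-tagged units; the rank tag is retained because the variance computation needs it.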

The mean is estimated as for a simple random sample of size MR. Multiple cycles are required for estimating the variance of the mean and each sample value must be rank-tagged, since squared deviations are computed about the respective rank means. The divisor for the sum of squared deviations is then M²R(R-1). Estimates of the mean and its variance are unbiased, even in the presence of possible ranking error. The nonparametric character of RSS involves no restrictive assumptions concerning underlying distributions.
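The estimators just described can be written out as a short sketch. The data layout (one row per cycle, one column per rank) and the function name are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def rss_mean_and_variance(cycles: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Estimate the mean and the variance of the mean from rank-tagged RSS data.

    Assumed layout: cycles[r][i] is the quantified value with rank i+1 in cycle r,
    so there are R rows (cycles) and M columns (ranks).
    """
    r = len(cycles)
    if r < 2:
        raise ValueError("at least two cycles are needed to estimate the variance")
    m = len(cycles[0])
    if m < 1 or any(len(row) != m for row in cycles):
        raise ValueError("every cycle must contain the same number of ranks")
    # Mean: same as for a simple random sample of size M*R.
    mean = sum(sum(row) for row in cycles) / (m * r)
    # Rank means: average over cycles for each rank.
    rank_means = [sum(row[i] for row in cycles) / r for i in range(m)]
    # Squared deviations are taken about the respective rank means.
    ss = sum((row[i] - rank_means[i]) ** 2 for row in cycles for i in range(m))
    variance_of_mean = ss / (m * m * r * (r - 1))  # divisor M^2 * R * (R - 1)
    return mean, variance_of_mean
```

For example, two cycles with values (1, 2, 3) and (3, 4, 5) have rank means 2, 3 and 4, a sum of squared deviations of 6, and a divisor of 9 × 2 × 1 = 18.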

In terms of relative efficiency as variance ratio for estimators, RSS performs at least as well as simple random sampling with size MR. Since TAKAHASI and WAKIMOTO (1968) have shown that (M+1)/2 constitutes an upper bound on efficiency for all continuous distributions with finite variance, large set size is advantageous if it does not impair ranking ability. It seems that ranking errors should generally increase with set size for many ecology/forestry applications, if only through the number of ties that must be arbitrarily or randomly broken. Such increases in ranking error tend to counter the advantage of larger set size.
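As a worked check on the (M+1)/2 bound, the variance ratio can be computed exactly for one convenient case. The sketch below assumes a Uniform(0,1) population, perfect ranking, and a single cycle; it relies on the standard result that the i-th order statistic of M uniform draws has variance i(M-i+1)/((M+1)²(M+2)). The function name is hypothetical.

```python
from fractions import Fraction


def rss_efficiency_uniform(m: int) -> Fraction:
    """Exact ratio Var(SRS mean) / Var(RSS mean) for Uniform(0,1), set size m.

    Assumes perfect ranking and one cycle (both samples have size m).
    """
    if m < 1:
        raise ValueError("set size must be at least 1")
    # Variance of the i-th order statistic among m Uniform(0,1) draws.
    order_variances = [
        Fraction(i * (m - i + 1), (m + 1) ** 2 * (m + 2)) for i in range(1, m + 1)
    ]
    var_rss = sum(order_variances) / m**2  # mean of m independent order statistics
    var_srs = Fraction(1, 12) / m          # Uniform(0,1) variance is 1/12
    return var_srs / var_rss
```

Under these assumptions the ratio works out to exactly (M+1)/2 for every set size, so the uniform case sits at the bound; other distributions, or imperfect ranking, would give a smaller ratio.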

As discussed later, remote sensing and GIS provide a huge pool of opportunity for ranking in forest inventory. Ranking from these sources is, however, not without cost. While unit cost of ranking has been considered under certain conditions, GIS and remote sensing will often involve fixed cost for the first set with much smaller variable cost for additional sets. Additional research is needed to improve designability of RSS with fixed cost for ranking.

JOHNSON et al. (1993) have reviewed RSS for vegetation. PATIL, SINHA and TAILLIE (1992) have compared RSS with the regression estimator in double sampling. GORE et al. (1993) have explored some multivariate issues in RSS. PATIL et al. (1993) provide a general framework for RSS with reference to encounter sampling and weighted distributions which were introduced in the previous section.

Composite Sampling

Ranked set sampling effectively increases sample representation while avoiding the more costly determinations for many samples. The same sort of observational economy can be achieved by composite sampling for some types of environmental variables, particularly compositional aspects of soil and some other substrates. Since soils will become increasingly important for ecosystem-oriented forestry, this approach has potential relevance to forestry.

The composite sampling approach involves removing samples to the laboratory and mixing aliquots according to certain schemes for joint analysis. Considerable research in application of composite sampling for analysis of soil contaminants has been conducted in the Center for Statistical Ecology and Environmental Statistics (PATIL, GORE and SINHA 1992). This work includes schemes for detecting more localized concentrations of contaminants/constituents. GORE et al. (1993) consider compositing in conjunction with ranked set samples. This latter combination of approaches provides opportunities for exploiting GIS that remain to be explored.
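The economy of compositing can be illustrated with a deliberately simple sketch. It is not the scheme of the cited authors: it assumes equal aliquots, perfect mixing, non-negative concentrations, and a basic retest rule under which a composite of k samples needs follow-up only if its value exceeds threshold/k. All names are hypothetical.

```python
from typing import List, Sequence


def composite_values(measurements: Sequence[float], group_size: int) -> List[float]:
    """Idealized composite value for each consecutive group of samples.

    Assumes equal aliquots and perfect mixing, so a composite reads as the
    arithmetic mean of the individual sample values in its group.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = [
        measurements[i:i + group_size]
        for i in range(0, len(measurements), group_size)
    ]
    return [sum(g) / len(g) for g in groups]


def composites_to_retest(
    measurements: Sequence[float], group_size: int, threshold: float
) -> List[int]:
    """Indices of composites that could hide a sample above `threshold`.

    With non-negative values, a sample above the threshold forces its composite
    of k samples above threshold / k, so composites at or below that level are cleared.
    """
    flagged = []
    for index, start in enumerate(range(0, len(measurements), group_size)):
        group = measurements[start:start + group_size]
        if sum(group) / len(group) > threshold / len(group):
            flagged.append(index)
    return flagged
```

In this idealized setting, eight samples analyzed as two composites of four require two laboratory determinations up front, with individual analysis reserved for the members of any flagged composite.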