9 A Bumpy Old Road: Size-based Methods in Fisheries Assessment

TONY J. PITCHER

9.1 INTRODUCTION

If we were to take a thousand humans visiting Rome at random and line them up by rows of approximately equal height in St Peter’s Square, we could attempt a size–frequency analysis similar to that used in fish populations. The aim is to estimate mortality and growth from the relative frequency in the size classes. But this short-cut human demographic analysis would fail for two reasons. First, humans stop growing in height in their late teens, and subsequently live for another 40 years or so at the same height, except for a slight shrinkage when very old. Secondly, humans breed continuously and are therefore not born in discrete cohorts. In fact, present size–frequency analyses would fail in almost all mammals for one or other of these reasons. But fish are generally born in discrete, often annual, cohorts, following an annual or seasonal breeding season, and individuals grow in size throughout life towards an asymptotic size. The growth of fish is usually well approximated by a von Bertalanffy curve (see Jobling, Chapter 5, Volume 1). Size–frequency analysis looks for peaks of numbers in the size classes to estimate the mean sizes of successive cohorts at integer intervals of age, and at the relative numbers in these cohorts to estimate total mortality rates.

Size-based methods can be used to obtain estimates of growth and mortality when fish are difficult, or too expensive, to age using hard parts. Growth and mortality themselves are employed in assessments of the exploitation status of fisheries, and so, in one sense, it matters little how they are obtained provided they are accurate (Rosenberg and Beddington 1988). When properly used, size-based methods should lead to the same estimates of these parameters as other techniques, although the sources and impacts of uncertainty are different. In some cases, as shown below, size-based methods have advantages over, or can complement, conventional estimates based on direct ageing.

In fish, length is easier to measure accurately than weight (or strictly, mass), so the term ‘length-based methods’ is in general use. Lengths tend to increase smoothly throughout a fish’s lifespan, except in very old fish that are near their asymptotic size ($L_\infty$ or $W_\infty$; see Jobling, Chapter 5, Volume 1). Huge old cod (Gadus morhua) like this greeted Cabot when he discovered ‘New Founde Lande’ in 1497, but they are hardly ever seen in any fisheries today. In fact, weights are an even better guide to age than lengths, but are rarely used in analyses, mainly because they are harder and slower to measure accurately.

Length–frequency plots are bumpy old things because they reflect actual variance in size with ages as a consequence of individual differences in growth rate, success in food acquisition and assimilation, a range of individual birth dates in one, or sometimes more, cohorts, and inaccuracies in measurement. As we will see, the net result of all these uncertainties makes not only for bumpy old plots, but also means that approximate, ‘quick and dirty’ or ad hoc, length–frequency analyses tend still to be useful checks.


Analytical methods for length–frequency data have a long pedigree in fisheries science, but none of them work as well as one would wish, and as a consequence many fisheries researchers have dabbled in the sport of inventing new methods at some stage or another of their careers.

In fact, length-based analysis still presents a bumpy old road for anyone wishing to employ these methods. Hence, this chapter reviews the underlying principles behind the main methods that have stood the test of time, and especially those that are amenable to the revolution that has quietly occurred over the past decade – the simple use of spreadsheets in the analysis of fisheries data. Emphasizing utility over elegance, spreadsheets mean that almost any competent general fisheries scientist can make a rigorous analysis without training in the use of expensive software.

9.2 AGE AND MORTALITY METHODS

9.2.1 A disappearing act: modes and growth models

The sizes of fish of similar age in a cohort vary about a mean. Fish populations usually comprise several such cohorts, which are mixed together in a sample. If we know the shape of the size distributions of the cohorts, we can try to dissect a mixed sample into its constituent cohorts. Size-frequency analysis thus provides a method of ageing fish without the difficulties of preparing and reading hard parts like scales or otoliths, and without having to kill the fish sampled.

Variation in length of fish of a given age generally follows a statistical normal distribution, although other distributions, like the log normal or the gamma, may also be employed. Length-frequency plots from a sample of a fish population are therefore mixtures of a series of overlapping length distributions (Everitt and Hand 1981). Length–frequency analysis aims to dissect, to ‘decompose’ or ‘deconvolute’, the mixture into its components.

The average sizes and relative abundance of the component cohorts provide measures of growth and mortality. First, if we can follow the mean sizes in a series of samples, we can estimate growth. Secondly, if we can follow the changes in numbers of a cohort with time, provided that the changes accurately mirror changes in the underlying population, we can estimate mortality rates. Variation among individuals in mortality and growth can be thought of as ‘smearing’ the original cohort structure.

Distributions

It is generally assumed that the variation in length of any one cohort follows a normal distribution. The expected frequency, $f$, at length $L$ for a normal distribution of mean $\mu$ and standard deviation $\sigma$, is

$$f_L(\mu, \sigma) = \left[\frac{Nw}{\sqrt{2\pi\sigma^2}}\right] \times \exp\left\{-0.5\left[\frac{L - \mu}{\sigma}\right]^2\right\} \qquad (9.1)$$

where $N$ = sample size, $\mu$ is the mean length, $\sigma$ is the standard deviation of the lengths, $w$ is the class width and $L$ is the mid-point of the class. In fisheries work, the assumption of normality should be tested using a subsample of individuals aged by conventional means. A ‘quick and dirty’ test for normality can be performed using probability paper, or it can be tested more formally using skew and kurtosis coefficients (or very rigorously using goodness-of-fit tests between the data and the expecteds for the normal frequencies in each data-length class). Other distributions are sometimes employed. The log normal, in which the lengths, means and standard deviations are transformed to logs, may be appropriate for weight-frequency analysis and sometimes for length frequencies. The gamma distribution is also sometimes used.

Parameters to estimate

For a mixture of normal distributions we set $N$ to total sample size and obtain the expected frequency at length $L$ as:


$$f_L = \sum_{i=1}^{h} \left\{ \left[\frac{N w p_i}{\sqrt{2\pi\sigma_i^2}}\right] \times \exp\left\{-0.5\left[\frac{L - \mu_i}{\sigma_i}\right]^2\right\} \right\} \qquad (9.2)$$

where $N$ is now total sample size, $p_i$ is the proportion of this total in the ith age group, $\mu_i$ is the mean of the ith age group, and $\sigma_i$ is the standard deviation of the ith age group. For $h$ components, the problem for length–frequency analysis is therefore to estimate the sets of proportions, means and standard deviations. The $p_i$ must sum to one, so we have $(3h - 1)$ parameters to estimate:

$p_1$; $\sigma_1$; $\mu_1$;
$p_2$; $\sigma_2$; $\mu_2$;
$p_3$; $\sigma_3$; $\mu_3$;
.........
$p_h$; $\sigma_h$; $\mu_h$;

Statisticians have shown that mixtures of normal distributions are ‘identifiable’ (Yakowitz 1969): that is, we can in principle determine all the parameters in the mixture, provided that the assumption of normality is valid and provided we know the combined probability exactly. In practice, of course, we only have the data histogram to estimate the latter. There is a more detailed discussion of this point in Macdonald and Pitcher (1979).

The appearance of modes

Cohorts of fish which recruit at different times are consequently separated in mean size. The appearance of separate modes in a size–frequency plot of a sample taken from the whole population has long been interpreted by ecologists as revealing age groups (e.g. Petersen 1891). Within any one cohort there will be a spread of sizes resulting from different birth dates and individual growth rates. This spread may obscure the modal sizes of the separate cohorts.

The procedure of estimating growth and mortality from a series of samples by tracking the modes of each cohort has been termed ‘modal progression analysis’. There are a number of graphical techniques for achieving this aim, but they work well only when definite modes appear in the length–frequency plot. The problem is that modes may be clear and evident things to the human eye, but with only small changes in the fish population parameters, or to the sampling procedure, they can be surprisingly ephemeral.

The conditions under which modes appear have been formally investigated. For two components in a mixture, Behboodian (1970) showed that separate modes will be seen (bimodality) if:

$$|\mu_1 - \mu_2| > 2 \min\{\sigma_1, \sigma_2\} \qquad (9.3)$$

but even then, they will not necessarily be clear to the eye for small sample sizes.

Figure 9.1(a) to (e) illustrates how modes can appear and disappear in length–frequency plots which are the overall sum of underlying normally-distributed cohort components in a mixture. The values were generated from a mixture of four normal components in proportion of abundance 4 : 3 : 2 : 1. The resulting overall length–frequency plot is unimodal in Fig. 9.1(a) but modes begin to appear as the means get further apart in Fig. 9.1(b) and (c). Holding the means at 10 units apart, the modes vanish again when the standard deviations are increased in Fig. 9.1(d) and (e).

Following these rules, three main factors in the fish population conspire to reduce the separability of modes in length–frequency data.

First, if fish grow according to the von Bertalanffy curve, as they approach $L_\infty$, the cohort means get closer together and are therefore less likely to reveal modes. This is known as the ‘pile-up’ effect. The ‘pile-up’ effect is illustrated in Fig. 9.2 and may be investigated using the animated spreadsheet available at <www.fisheries.ubc.ca/projects/lbased.htm>.

Secondly, variance in length-at-age increases with length as fish get larger and approach $L_\infty$, and so age groups are more likely to overlap. For randomly-varying $L_\infty$, Rosenberg and Beddington (1987) show that modes will appear in a two-cohort mixture if:

$$\{L_\infty \exp[-k(t - t_0)]\} \times [\exp(-k) - 1]^2 > 2\sigma_L^2 \{1 - \exp[-k(t - t_0)]\} \qquad (9.4)$$

where $\sigma_L^2$ is the variance at length $L$.

[Figure 9.1: five panels, (a) to (e); horizontal axis: Length.]

Fig. 9.1 Diagrams show the normally distributed cohort components (thin line) in length–frequency mixtures (thick line) generated from an algorithm (as described in the text). (a) The means of the four normally distributed components are 5 units apart and the standard deviations are set to 2.5. The overall envelope, representing the size–frequency plot, is unimodal. The overlap index V = 0.49. (b) The means of the four normally distributed components are increased to 7 units apart with the standard deviations remaining at 2.5. Separate modes are beginning to appear in the overall envelope. The overlap index V = 0.29. (c) The means of the four normally distributed components are further increased to 10 units apart with the standard deviations remaining at 2.5. Separate modes are clearly evident in the overall envelope. The overlap index V = 0.02. (d) The means of the four normally distributed components remain 10 units apart but the standard deviations are increased to 3.5. The modes, especially for the less abundant older age groups, are less evident in the overall envelope. The overlap index V = 0.27. (e) The means of the four normally distributed components remain at 10 units apart while the standard deviations are now further increased to 4.5. Separate modes have disappeared. The overlap index V = 0.43.
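The five panels can be approximated numerically. What follows is a minimal sketch, not the algorithm used to draw the figure: it evaluates the expected frequencies of a four-component normal mixture (Eq. 9.2), counts the local maxima in the resulting envelope, and computes the overlap index V (Eq. 9.5). The sample size N = 1000, the class width w = 1, the first mean of 20 and all function names are assumptions made for illustration; the proportions 4 : 3 : 2 : 1, the spacing of the means and the standard deviations are taken from the caption. The sketch averages the pairwise overlap over the h - 1 adjacent pairs, because that reading reproduces the V values quoted in the caption to two decimals (Eq. 9.5 as printed divides by h); for panel (c) the computed value is slightly negative.

```python
import math

def mixture_frequencies(lengths, props, means, sds, n=1000, w=1.0):
    """Expected frequency in each length class for a normal mixture (Eq. 9.2)."""
    freqs = []
    for length in lengths:
        total = 0.0
        for p, mu, sd in zip(props, means, sds):
            z = (length - mu) / sd
            total += n * w * p / math.sqrt(2 * math.pi * sd ** 2) * math.exp(-0.5 * z ** 2)
        freqs.append(total)
    return freqs

def count_modes(freqs):
    """Number of strict local maxima in a frequency envelope."""
    return sum(
        1 for i in range(1, len(freqs) - 1)
        if freqs[i] > freqs[i - 1] and freqs[i] > freqs[i + 1]
    )

def overlap_index(means, sds, q=1.96):
    """Overlap of adjacent 95% zones, averaged over the h - 1 adjacent pairs."""
    ratios = []
    for i in range(len(means) - 1):
        upper_i = means[i] + q * sds[i]              # upper 95% limit, age group i
        lower_next = means[i + 1] - q * sds[i + 1]   # lower 95% limit, age group i + 1
        width_i = 2 * q * sds[i]                     # width of the 95% zone, age group i
        ratios.append((upper_i - lower_next) / width_i)
    return sum(ratios) / len(ratios)

PROPS = [0.4, 0.3, 0.2, 0.1]     # abundance 4 : 3 : 2 : 1 (from the caption)
LENGTHS = list(range(0, 81))     # assumed class mid-points, class width 1
PANELS = {"a": (5, 2.5), "b": (7, 2.5), "c": (10, 2.5), "d": (10, 3.5), "e": (10, 4.5)}

def panel_summary(spacing, sd, first_mean=20):
    """Return (number of modes, V rounded to 2 decimals) for one panel."""
    means = [first_mean + spacing * i for i in range(4)]
    sds = [sd] * 4
    freqs = mixture_frequencies(LENGTHS, PROPS, means, sds)
    return count_modes(freqs), round(overlap_index(means, sds), 2)

for name, (spacing, sd) in PANELS.items():
    modes, v = panel_summary(spacing, sd)
    print(f"panel ({name}): modes = {modes}, V = {v:.2f}")
```

Counting strict local maxima in noise-free expected frequencies is a generous test: in a real sample of modest size, sampling noise adds spurious bumps and can hide shallow modes, which is the point the text makes about modes being ephemeral.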

[Figure 9.2: main panel plots Frequency against Length; inset plots length against Age, 0 to 10.]

Fig. 9.2 Diagram illustrating the ‘pile-up’ effect. The larger diagram shows normal distributions of length around the mean lengths of 10 successive annual cohorts. Inset: length-at-age projected (horizontally) from each mean length at integer age (circles) on a von Bertalanffy growth curve. Source: diagram taken from an animated spreadsheet available from www.fisheries.ubc.ca/projects/lbased.htm.
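The pile-up effect follows directly from the standard von Bertalanffy curve, L_t = L∞{1 - exp[-k(t - t0)]}: the gap between the mean lengths of successive annual cohorts shrinks by a constant factor exp(-k) each year. A minimal sketch, using purely illustrative parameter values (L∞ = 100, k = 0.3, t0 = 0, none of which come from the chapter or its spreadsheet) and hypothetical function names:

```python
import math

def vb_length(age, l_inf=100.0, k=0.3, t0=0.0):
    """Mean length at age from the von Bertalanffy growth curve."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))

def cohort_gaps(max_age=10, **params):
    """Differences in mean length between successive annual cohorts."""
    means = [vb_length(age, **params) for age in range(1, max_age + 1)]
    return [later - earlier for earlier, later in zip(means, means[1:])]

gaps = cohort_gaps()
for age, gap in enumerate(gaps, start=1):
    print(f"ages {age}-{age + 1}: gap in mean length = {gap:5.2f}")
```

If the standard deviation of length at age stays constant or grows with size, the shrinking gaps mean that the oldest cohorts are the first to fall below the separation needed for distinct modes.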

This function will vary with length, so that separate modes are less likely at greater lengths if $\sigma_L^2$ increases.

Unfortunately, the way in which $\sigma_L^2$ changes with size can be quite complex and there has been no rigorous investigation using actual fish growth. Rosenberg and Beddington (1987) show that if differences in $L_\infty$ are the main source of variation between individual fish, $\sigma_L^2$ increases with fish size. On the other hand, if most variation between fish is in the growth parameter $k$, $\sigma_L^2$ peaks at about half $L_\infty$ and then drops as fish approach the asymptotic size. Empirical data usually show variance in length increasing with size, so that for sizes up to 0.7 $L_\infty$ the assumption of a constant coefficient of variation of length (COV-l) seems reasonable. The simulated length–frequency data used in this paper uses a COV-l which falls almost imperceptibly with increasing length at first, but drops rapidly as $L_\infty$ is approached.

The first two problems above affect older age groups more seriously. But the third mechanism which may obscure modes can affect young and old age groups alike. If recruitment of cohorts is continuous, or extended over a large portion of the year, the variance in length by age group will be large and modes may be obscured from young ages on. Unfortunately, recruitment may occur over an extended season in some tropical fisheries for which length-based methods are otherwise ideal.

If modes are present, they probably reveal cohorts, but if modes are absent, there can still be several cohorts present. This means that the older name for these methods, ‘polymodal analysis’, can be misleading. If modes appear, the simple graphical or approximate computer methods will give good results. If there are no modes, one of the statistical methods will be needed. In their review of length-based methods, Rosenberg and Beddington (1988) recommend the greater use of formal statistical methods, which are not so dependent upon the appearance of modes.

An index of overlap as a guide to the appearance of modes

This section presents an index of overlap which will, in conjunction with a consideration of sample size, indicate whether modes are likely to appear. This in turn will allow the researcher to decide whether simple graphical or ad hoc methods are likely to be adequate.

The first step in the calculation of an index is to decide roughly what the likely means and standard deviations of the proposed age groups are.


The overlap index can then be calculated as follows, a quick task using any modern spreadsheet:

$$V = \sum_{i=1}^{h-1} \left\{ \frac{(\mu_i + q\sigma_i) - (\mu_{i+1} - q\sigma_{i+1})}{(\mu_i + q\sigma_i) - (\mu_i - q\sigma_i)} \right\} \Big/ h \qquad (9.5)$$

where $q$ = 1.96 to give 95% limits, and $h$ = number of age groups. This is then repeated in order to compare several alternative hypotheses.

The index V reflects the average proportion by which the 95% zone (i.e. 19 out of 20 fish of this age) for the age group is overlapped by the 95% zone for the next age group. When V is negative, there is a very wide separation of the age groups. When V is greater than about 0.25, modes disappear. The V for individual age groups can also be usefully examined where the separation of adjacent ages differs across the length–frequency plot. Example values of V are given in Fig. 9.1(a) to (e).

9.2.2 Assumptions of length–frequency analysis

For length-based methods to work, fish must recruit in discrete cohorts. There is no obvious way that length-based methods could provide both growth and mortality estimates for continuously recruiting populations, like the human example given earlier. For a caveat see the Conclusions section of this chapter. Discrete cohorts usually derive from separate spawning seasons, but there may be more than one of these per year. The cohorts must remain discrete as the fish grow older. This requirement may be relaxed to a certain extent for different methods of analysis, but, generally, the methods work better the more discrete the cohorts and the more separate they remain as they get older. This implies that the growth of individuals in a cohort should be similar, i.e. that variability in growth rates among individuals of the same age is not large.

So there are two sources of variation in size within a cohort of fish:
• different birth dates within the spawning season;
• different growth rates among individuals.

The first of these can be accommodated by most length-based methods, provided that the variance is not too large. But the second is a major problem for all the methods, as it tends to destroy cohort structure.

A further assumption is that the length-frequency data in your sample fully represent the length classes in the fish stock. If they do not, then the sample data will need to be adjusted to compensate for the selectivity of the sample gear. A neat series of tests for this employs the relationship among length at maturity, $L_m$, length at maximum yield-per-recruit, $L_{opt}$, and temperature in order to evaluate the validity of the length–frequency sample (Froese and Binohlan 2000). For example, these authors show how a trawl survey of Nile perch (Lates niloticus) taken in Lake Victoria in 1982 missed all fish larger than 100 cm. Since this is far less than $L_{opt}$ = 136 cm, the survey length–frequency data could not be used for assessment of the perch population at that time.

The starting point in all analyses is the length–frequency distribution, adjusted if necessary, with known class width and class boundaries. It is worthwhile taking a lot of care over the class boundaries: lower bound, mid-point and upper bound of the classes are all used in different methods.

9.2.3 Classification of length–frequency analysis methods

Methods may be divided into parametric and non-parametric. Another classification is into simple ad hoc methods, which are often essentially graphical or non-parametric, and rigorous statistical estimation methods, which are usually parametric.

Parametric methods depend upon estimating the means, standard deviations and proportions or numbers in each of the cohorts in the mixed sample. These are the parameters of the size-frequency distributions, hence the term parametric. The size distributions are generally taken as normal, but log normal and gamma distributions may also be employed (Macdonald and Pitcher 1979).

The methods include both graphical (e.g. probability plots) and computational (e.g. mixture analysis) methods, but all make strong assumptions about distributions. In parametric methods, the number of cohorts generally has to be determined by the user, and several scenarios may have to be compared.

Non-parametric methods do not depend upon estimating the parameters of the cohort distributions directly. So they make only weak assumptions about the distribution of sizes within the cohorts, that they are roughly distributed about some modal or central value, and hence are analogous to non-parametric statistics. The modal lengths of each cohort are fixed to lie upon a curve described by a growth model. Generally the von Bertalanffy model is used, but other models, such as a seasonal growth model, can be employed. Hence the non-parametric methods make strong assumptions about growth. In non-parametric methods, the number of cohorts is implicit in the estimates of growth model parameters, and may be revealed when cohorts are sliced into age groups.

The following sections briefly describe graphical methods, most of which are parametric, three non-parametric methods, and two fully statistical parametric distribution mixture methods. Most of them can be carried out using spreadsheets.

from a series of formal single-sample estimations: a clear example is discussed by Sparre and Venema (1992).

Gulland and Rosenberg (1990) outlined simple interpretations that may be made from visual inspection of length–frequency plots. Type A, a single mode that stays in the same place through time, can be produced by gear with high selectivity, such as gill-nets, or by fish, for example yellowfin tuna (Thunnus albacares), that migrate with age. The authors say that not much can be done with type A, although see the Conclusions section of this chapter. Type B, a single mode moving steadily upwards, is typical of single-cohort fisheries such as prawns or squid, which are good candidates for any simple analysis. Type C, with many clear modes like Fig. 9.1(b) or (c), may also be a good subject for the classical techniques described below. Type D, with smeared modes like Fig. 9.1(a) or (e), may be hard to analyse.

The use of probability plots was reinvented several times by fishery workers (Harding 1949; Cassie 1954; Harris 1968). Originally they were done on special probability paper, but today it is easy to set them up on a spreadsheet using the built-in normal distribution function. A series of progressively more sophisticated graphical methods were based on the change of slope of a parabolic