Fitting and suppressing trends

7.5.4 Fitting and suppressing trends

At first sight there seems to be no reason why one should not fit trends of catchability to the data from all fleets in an ad hoc tuning analysis, and this approach has indeed been used in practice in the past. In the present context this leads to the so-called Hybrid method. However, it has been known for some time that indiscriminate fitting of trends of catchability is a dangerous procedure (see, for example, Anon. 1983). Qualitatively it is obvious that allowing much variation in catchability must be undesirable – the whole idea of CPUE tuning rests on the assumption that CPUE is an indicator of abundance, and any variation of catchability weakens that foundation. Indeed, if one allowed catchability to vary with abundance, rather than with time, then in the case where catchability is inversely proportional to abundance CPUE would become independent of abundance, and would therefore lose all utility as an indicator. A similar but less severe problem arises with variations with time. We now know that allowing for an arbitrary exponential trend of catchability with time causes linearized versions of the equations one is trying to solve to become singular, meaning that they acquire a zero eigenvalue, and this makes the solution indeterminate again.
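The two difficulties described above can be illustrated numerically. The following is a minimal sketch only: it assumes the simple relation CPUE = catchability × abundance on which CPUE tuning rests, and all names and numbers (including the constants `c`, `q0` and `b`) are invented for illustration. The second case shows only the basic confounding between a time trend in catchability and an opposite trend in abundance; it does not reproduce the linearized equations referred to in the text.

```python
import math

# Hypothetical abundance series (arbitrary units), declining over five years.
abundance = [1000.0, 800.0, 600.0, 400.0, 200.0]

# Case 1: catchability inversely proportional to abundance, q = c / N.
c = 5.0  # illustrative constant, not taken from the text
cpue_case1 = [(c / n) * n for n in abundance]
# Every value equals c, so CPUE carries no information about abundance.

# Case 2: an exponential time trend in catchability, q(t) = q0 * exp(b * t),
# compensated exactly by an opposite exponential trend in abundance.
q0, b = 0.01, 0.1  # illustrative values
cpue_constant_q = [q0 * n for n in abundance]
cpue_trending_q = [
    (q0 * math.exp(b * t)) * (n * math.exp(-b * t))
    for t, n in enumerate(abundance)
]
# The two CPUE series agree to rounding error, so CPUE alone cannot tell
# "constant q" apart from "trending q with a different abundance trend".
```

In this simplified setting the CPUE series is the same under both hypotheses, which is the intuitive reason why some additional assumption about catchability is needed before trends can be fitted.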

The conclusion from this is therefore that one should not fit trends of catchability unless this is inescapable. Indeed, one may state quite firmly that it is highly undesirable to fit trends to the catchability for all fleets. If one does, the solution will be at best very poorly determined, and at worst effectively a random number. This rules out in practice the Hybrid method in its pure form, which allows variable catchability on all fleets. One should in fact keep catchability constant for as many fleets as possible. Results of simulation tests (Anon. 1988; Sun 1989) show that paradoxically one may in fact obtain more precise results by fixing catchability on all fleets, even when this is an erroneous assumption, than by fitting the trends, even when one only fits the trends where they really exist. This is because the expected reduction of bias by fitting the trends is outweighed in the final prediction error by the associated increase in variance, manifest as sensitivity to noise.

Dynamic Pool Models I: Interpreting the Past

Nevertheless, it would be unreasonable to just fix catchability uncritically by using the Laurec–Shepherd method, for all and any data sets: one needs to start with fixed catchability on ‘at least one reliable fleet’ – meaning one where there is a high probability of little or no trend in catchability. But this can only be based on a prior assumption, or more commonly prejudice, since there is no way to test the assumption made. Indeed, selecting different fleets to be the standard in this way will usually lead to different results – all the trends are just determined with respect to one another, and all ultimately depends therefore on which subset of fleets is taken as the standard. One may, however, have quite strong grounds for the choice: in general a research vessel survey would be preferred to a commercial data series, and one where it is known that great care has been devoted to standardization of gear and fishing practice would be preferred to one which is known to be conducted in a more haphazard way. Nevertheless, this remains a potential weak point of this and all similar methods of analysis, and if one finds a discrepancy between the results of what should be well-standardized data series, one may be left in doubt about the correct result.

This also serves to emphasize that standardization of effort data remains vitally important, since otherwise one can never have any confidence that a data series may be taken as a standard. Indeed, at first sight it seems that one might as well just discard any data from unstandardized fleets, since one would be able to fit any necessary trend of catchability for them, and they will add little or nothing to the determination of terminal F. This would, however, be hasty and undesirable, for two reasons. First, survey data is (one hopes) unbiased with respect to trends of catchability, but it is often quite variable in that it has a high sampling error. Conversely, commercial data may be biased by trends of catchability, but may be based on higher levels of sampling, and hence have lower sampling error, although this is not guaranteed. If this is the case, however, including both sets of data should improve the analysis. The surveys will control the absolute size of terminal F determined, but the commercial data will help to reduce the variability of the estimates. Second, it is often found that surveys give good (precise) results for younger fish, not fully recruited to the commercial fisheries, but poor results for older fish, of which relatively very few are caught. The surveys therefore dominate the results for younger fish, and the commercial data dominate for old fish, where the reduction of estimated variance may be substantial. There is therefore still some point in including even unstandardized commercial data, but it would be much more valuable if it were properly standardized.

7.6 THE EXTENDED SURVIVORS METHOD (XSA)

7.6.1 Introduction

Fully integrated statistical methods such as those described by Quinn and Deriso (1999) and briefly outlined by Sparre and Hart (Chapter 13, this volume) are in principle the most desirable methods to use for the analysis of catch-at-age and CPUE data. The models so fitted are as accurate a representation of the dynamics of populations as possible, and they can allow for errors in all measured quantities. They do, however, usually have a large number of parameters to be estimated, and this makes them computationally relatively quite demanding, and sometimes less robust.

There is in fact a middle way between the known crudities of the ad hoc tuning methods and the fully integrated statistical methods. This is the ‘Survivors’ method, which was devised by Doubleday (1981). The original implementation by Rivard (1980) has, however, not been very widely used, partly because it was written in the APL programming language, which is not universally available, and is quite impenetrable to the uninitiated, and partly because it did not always give reliable results (D. Rivard and S. Gavaris, personal communication). The original implementation also allowed for only one set of survey/CPUE data, which is a considerable disadvantage in practice. A development of the original method (Shepherd 1999), known as Extended Survivors Analysis, overcomes these difficulties, corrects an inconsistency in the original formulation, and has given excellent results in simulation tests.

Chapter 7

The essential idea of these Survivors methods is, first, that they focus on determining the surviving population for each cohort, and use VPA, or in practice the cohort analysis algorithm of Pope (1972), for the estimation of past population abundance. Thus, they are expressed directly in terms of the survivors, the variable for which results are really required, avoiding the error-prone projection through the final year characteristic of ad hoc tuning procedures. Second, they treat the catch data as error-free in the calibration (VPA) phase of the calculation, but allow for deviations from the resulting population estimates when determining survivors. This reduces the number of parameters to be determined considerably, and permits the use of simple iterative algorithms rather than direct search minimization methods. By treating the catch data as exact during a non-critical phase of the calculation (the VPA), one gains a substantial conceptual and computational saving. The imprecision introduced is unlikely to be severe unless the total international catch-at-age data are subject to larger errors than are the data for individual fleets. This is not impossible, but it seems not to be the usual situation. The method is therefore useful and practical, and achieves some of the benefits of the integrated methods, without incurring the major extra costs.

7.6.2 Description of the method

In this section we outline the method as formulated by Doubleday (1981), in a rather simplified manner, since the original description allows for a number of confusing complications which either were not implemented by Rivard (1980), or are rarely used in practice. The aim is to explain the basis of the method, and to prepare the ground for the account of the Extended Survivors method which follows. The notation used is as follows. First, suffix $k$ is used to index cohorts (year classes). Thus

$$k = y - a. \tag{7.14}$$

Suffix $i$ is used to index ages within a cohort, and the range of $i$ within a summation usually runs from the current age $a$ to the maximum observed within the cohort, $a_{\max}$, where

$$a_{\max} = \min(g,\, t - k) \tag{7.15}$$

(where $g$ is as usual the greatest true age group, and $t$ is the final year). The notation cum (short for cumulative) is used to denote the operation of accumulating something over all subsequent ages within a cohort, so

$$\mathrm{cum} = \sum_{i=a}^{a_{\max}}. \tag{7.16}$$

Later, the notation cum′ will be used to denote accumulation over all previous ages within the cohort, i.e. over ages $i$ up to $a - 1$. The mnemonic symbol ECZ denotes Exponential Cumulative Z or total mortality, i.e.

$$\mathrm{ECZ}(y,a) = \exp\{\mathrm{cum}[Z(y,a)]\}. \tag{7.17}$$

Similarly, ECM denotes Exponential Cumulative (Natural) Mortality:

$$\mathrm{ECM}(y,a) = \exp\{\mathrm{cum}[M(y,a)]\}. \tag{7.18}$$

The symbol $P_t(k)$ is used to denote the terminal population at the end of the final year, i.e. the survivors for each cohort

$$P_t(k) = P(y_{\max} + 1,\, a_{\max} + 1), \tag{7.19}$$

where we recall that $a_{\max} = \min(g,\, t - k)$ from equation (7.15) and similarly

$$y_{\max} = \min(k + g,\, t). \tag{7.20}$$
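The cohort notation above can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions, not part of the published method: the function names are hypothetical, quantities such as natural mortality are passed as a function of (year, age), and the values in the usage lines are invented.

```python
import math

def a_max(k, g, t):
    """Oldest age observed for cohort k, as in equation (7.15)."""
    return min(g, t - k)

def y_max(k, g, t):
    """Last year in which cohort k is observed: min(k + g, t)."""
    return min(k + g, t)

def cum(x, y, a, g, t):
    """Accumulate x(year, age) over ages a..a_max within the cohort k = y - a,
    following the cum operation of equation (7.16)."""
    k = y - a
    return sum(x(k + i, i) for i in range(a, a_max(k, g, t) + 1))

def ecm(natural_mortality, y, a, g, t):
    """Exponential Cumulative (natural) Mortality, as in equation (7.18)."""
    return math.exp(cum(natural_mortality, y, a, g, t))

# Illustrative values only: greatest age g = 5, final year t = 2000,
# and a constant natural mortality of 0.2 at every year and age.
g, t = 5, 2000
constant_m = lambda year, age: 0.2
k = 1999 - 2                           # cohort of fish aged 2 in 1999
oldest = a_max(k, g, t)                # 3: the cohort is cut off by the final year
factor = ecm(constant_m, 1999, 2, g, t)  # exp(0.2 + 0.2)
```

For a cohort that is old enough to reach the greatest age before the final year (for example k = 1990 with the same g and t), `a_max` returns g and `y_max` returns k + g instead.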

With this notation, Pope’s cohort analysis equation

$$P(y,a) = \exp[M(y,a)]\,P(y+1,\,a+1) + \exp[M(y,a)/2]\,C(y,a) \tag{7.21}$$

may be rewritten for the final age group in each cohort ($k$), for which the survivors are $P_t(k)$, as

$$P(y,a) = \exp[M(y,a)]\,P_t(k) + \exp[M(y,a)/2]\,C(y,a). \tag{7.22}$$

Then, applying equation (7.21) repeatedly,

$$P(y,a) = \mathrm{ECM}(y,a)\,P_t(k) + P_c(y,a), \tag{7.23}$$

where $P_c(y,a)$ is the contribution to the population arising from the raised and accumulated catches,

$$P_c(y,a) = \sum_{a'=a}^{a_{\max}} \mathrm{ECM}(k+a',\,a')\,C(k+a',\,a') \times \exp[-0.5\,M(k+a',\,a')]. \tag{7.24}$$

Equation (7.23) is a simple explicit expression for the population-at-age in terms of the key variables to be determined (the survivors) and some constants (including the $P_c(y,a)$ array), which depend only on the data and on natural mortality, which is taken to be known or at least given.

The second main ingredient of the method is a model for the relationship between CPUE and the population abundance. Doubleday uses the same model as that which underlies the ad hoc tuning procedures, i.e.

$$u(y,a) = q(a)\,P(y,a) \tag{7.25}$$

where $u$ has no suffix $f$ (for fleet) since for the moment we assume that there is only one fleet or survey. In practice it is convenient to reverse this equation, and write

$$P(y,a) = r(a)\,u(y,a) \tag{7.26}$$

where $r(a)$ denotes the reciprocal of catchability, which is assumed to be a function of age, but to be constant with respect to time. Now, the catch data are treated as exact in the VPA, and the estimates of population so obtained using equation (7.22) are regarded as the best available estimates of the unknown true abundances, and will be treated as error-free estimates thereof. The survey CPUE data, however, also provide, through equation (7.25), a relatively error-prone estimate of these same abundances, once the survey has been calibrated by determining the reciprocal catchability. We seek therefore to determine both the survivors and the incidental variables $r(a)$ by minimizing the discrepancies between the VPA estimates of population, $P_{vpa}$, and those determined from the CPUE, $P_{est}$, which are

$$P_{est}(y,a) = r(a)\,u(y,a). \tag{7.27}$$

Since the errors in $P_{vpa}$ are assumed to be small, the main source of errors will be those in the CPUE data. These are assumed to be log-normal, but of variable size $\sigma^2(y,a)$, so that

$$P_{est}(y,a) = P_{vpa}(y,a)\,\exp[N(0,\,\sigma^2(y,a))]. \tag{7.28}$$

Under the assumptions of normality and independence (etc.) maximum likelihood estimation of the parameters reduces to weighted least squares estimation, and we therefore seek to minimize

$$S = \sum_y \sum_a \frac{[\ln P_{est}(y,a) - \ln P_{vpa}(y,a)]^2}{\sigma^2(y,a)}. \tag{7.29}$$

The magnitude of the errors $\sigma^2(y,a)$ cannot usually be taken as known a priori, and they must usually be estimated from the data. Taking them to be constant with respect to time, so as to avoid estimating as many parameters as we have data points, and substituting equation (7.27), we have

$$S = \sum_y \sum_a \frac{[\ln r(a) + \ln u(y,a) - \ln P_{vpa}(y,a)]^2}{\sigma^2(a)}. \tag{7.30}$$

Differentiating with respect to $\ln r(a)$ and setting this to zero, and rearranging, allows $\ln[r(a)]$ to be written explicitly as

 ln P vpa ( yauya , )( , ) 2 [ ] s () a important technical details need to be handled

ln [ ra () ]= a 2 .

(7.31) more carefully.

 1 s () a a