
7.5.2 Ad hoc methods for tuning VPA

This approach to the analysis of catchability leads directly to practical procedures for tuning VPA. Starting with a trial VPA, one iteratively computes catchability, analyses it by means of regressions and/or averages, uses these estimates to predict terminal F, and repeats the procedure until it has converged. Such methods are generally referred to as ad hoc tuning methods, because they are not directly based on fitting a formal statistical model. They should not, however, be confused with even more ad hoc approaches which have sometimes been used, based on regressions of F on fishing effort, possibly raised in some way to the total fishery, or correlations of some measure of biomass with CPUE. These are fraught with problems and not recommended.

The ad hoc tuning method of analysis, for each fleet and age, proceeds as follows: past estimates of log catchability over the available time period are used to make an estimate of the current value. If one is assuming constant catchability, this estimate is just the average. Otherwise a regression against time is calculated, and used to make the prediction. A conventional predictive regression is appropriate here, since all the errors are in the catchability estimates: the explanatory or independent variable, time, is known exactly. In either case, an estimate of the standard error of the prediction is also made, either from the standard deviation of the observations about the mean, or the standard three-term formula for the standard error of a further prediction from a regression. Note that the standard error of the mean, or the two-term formula for the standard error of the fitted value, are not what is required here. Any prediction based on the estimate will be fully affected by the usual level of error in the observations, so this residual error must be included via the three-term formula.
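As a concrete illustration, the regression prediction and its three-term standard error might be computed as follows. This is a minimal sketch: the function name `predict_log_q` is hypothetical, and it assumes one ln(q) estimate per year for a single fleet and age.

```python
import math

def predict_log_q(years, log_q):
    """Predict next year's ln(q) from a regression against time, with the
    three-term standard error of a further prediction (residual error,
    error of the mean, and error of the slope). Needs at least 3 points."""
    n = len(years)
    t_bar = sum(years) / n
    q_bar = sum(log_q) / n
    sxx = sum((t - t_bar) ** 2 for t in years)
    slope = sum((t - t_bar) * (q - q_bar) for t, q in zip(years, log_q)) / sxx
    intercept = q_bar - slope * t_bar
    # residual variance about the fitted line, on n - 2 degrees of freedom
    s2 = sum((q - intercept - slope * t) ** 2
             for t, q in zip(years, log_q)) / (n - 2)
    t_new = max(years) + 1  # the terminal year being predicted
    prediction = intercept + slope * t_new
    # three-term formula; the two-term version (without the leading 1)
    # would give only the standard error of the fitted value
    se = math.sqrt(s2 * (1 + 1 / n + (t_new - t_bar) ** 2 / sxx))
    return prediction, se
```

For the constant-catchability case the prediction is simply the mean of the ln(q) series, and the same argument gives its prediction standard error as s·sqrt(1 + 1/n), where s is the standard deviation of the observations about the mean.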

Dynamic Pool Models I: Interpreting the Past
Chapter 7

The estimation of these standard errors is important, because one often has several sets of abundance indices. Each may be analysed in this way, each may be used to make a prediction, and these results will invariably be in conflict with one another. Some way of reconciling these conflicting estimates is needed, and a weighted mean, with weights based on the inverse variances of the individual estimates, is an obvious and attractive candidate. This overall estimate can in fact be shown to be the minimum variance estimate, and can be justified as a maximum likelihood estimate under suitable and not unreasonable assumptions. It clearly makes sense, too, to give greatest weight to these estimates which have historically given the most precise predictions. Prior estimates of the quality of the various data sets based, for example, on the number of stations worked, the proportion of the total catch taken, or the area covered, may in fact be quite misleading: the acid test should be the actual performance based on the historic data set.

These methods are often referred to as the Laurec–Shepherd method, where catchability is treated as constant and its average is used, and the Hybrid method when a time trend of catchability is fitted. A Mixed method is also possible where trends are fitted for some fleets but not others. The methods may be available as options in standard computer packages such as the Lowestoft and ICES VPA packages.

As mentioned above, in the past, numerous variations on this preferred procedure have been used. For example, a similar calculation can be carried out without using the logarithmic transformation, or regressing q (or ln(q)) against population size, or fishing effort. It is also possible to perform similar analyses on CPUE data which have been aggregated over fleets in some way. All these variants should now be regarded as relatively inappropriate and inferior, and only of historical interest. Readers interested in these aspects are referred to the reports of the ICES working group on the Methods of Fish Stock Assessment for the period 1983 to 1988 for more details. Some computer packages may still offer these variants as options, but we would recommend that they be avoided. The ICES working group also recommended that other possible variants which are still virgin should be left intact! Simulation tests of various methods have been carried out (Pope and Shepherd 1985; Anon. 1988; Sun 1989) and confirm that the procedures described above work about as well as anything else which was then available, and as well as could be expected, given the inevitable imperfections in the available data.

The full VPA tuning algorithm may therefore be written in pseudo-code as follows:

Pseudo-code for ad hoc tuning of VPA

    Initialize: Set terminal F = 1.0 (for oldest ages and last year)
    Do
        VPA
        For each age
            For each fleet
                For each year
                    Calculate ln(q)
                Next year
                Fit model to ln(q) (by regression)
                Predict terminal ln(q) (and standard error thereof)
                Calculate terminal partial ln(F)
                Raise to terminal total ln(F)
            Next fleet
            Combine to get weighted average terminal total ln(F)
            Retransform
        Next age
    Repeat (VPA et seq.) until converged
    Print results etc.

7.5.3 Practical aspects

In carrying out VPA tuning and similar analyses, a number of practical details have to be decided, which may have significant effects on the results. The most important of these are discussed below, and these issues are still relevant when more advanced methods are used.

7.5.3.1 Inclusion or exclusion of data

Firstly, if one has an abundance of data, one must decide how much of it to use. It is not obvious that it is best to use all of it. Firstly, the data for the youngest and oldest age groups may be poorly sampled, with very high sampling errors, and at best may add little to the analysis, and at worst may confuse it or even cause it to crash. Prior estimates of precision are not usually available, but the occurrence of numerous zeroes is almost always diagnostic of severe sampling problems (for more
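Since numerous zeroes in an index series are, as noted above, almost always diagnostic of severe sampling problems, a simple screening step can flag suspect age groups before tuning. This is an illustrative sketch: the function name and the 25% threshold are hypothetical choices, not recommendations from the text.

```python
def flag_poorly_sampled(index_by_age, max_zero_fraction=0.25):
    """Return the ages whose abundance-index series contains a high
    proportion of zero observations. index_by_age maps an age to its
    list of yearly index values."""
    flagged = []
    for age, series in index_by_age.items():
        zero_fraction = sum(1 for v in series if v == 0) / len(series)
        if zero_fraction >= max_zero_fraction:
            flagged.append(age)
    return flagged
```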

Secondly, there is no doubt that in fisheries things change, for reasons which are not always understood. Old data may therefore no longer be representative of current conditions. There may have been some unrecorded changes in the manner of fishing, by either commercial or research vessels, or in the spatial distribution of the stocks because of climatic or multi-species effects, or the way in which the data is worked up may have changed. In general, therefore, one usually wishes to place greater reliance on recent data than on old data. In practice a VPA can be carried out quite satisfactorily on five to 10 years’ data, and data more than 20 years old may be regarded as ancient history. A convenient way of taking account of this is to apply a tapered weighting to the data before regression or averaging, the terminology being borrowed from spectral analysis.
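A tapered weighting of this kind can be sketched as below. The text recommends a taper but does not prescribe its shape; the tricubic form and 20-year span used here are assumptions chosen for illustration.

```python
def tapered_weights(years, current_year, span=20):
    """Down-weight old data smoothly: full weight for the current year,
    zero weight for data `span` or more years old (tricubic taper)."""
    weights = []
    for year in years:
        age_of_datum = current_year - year
        if age_of_datum >= span:
            weights.append(0.0)  # ancient history: ignored entirely
        else:
            weights.append((1 - (age_of_datum / span) ** 3) ** 3)
    return weights
```

These weights would then be supplied to the averaging or regression step, for example as a weighted mean sum(w·q)/sum(w) of the ln(q) series.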

Finally, it is necessary to decide whether data for all available fleets should be included in the analysis. In principle the method of analysis is itself testing the quality of the various data sets, and allowing for this as far as possible, so that all available data may indeed be used. In practice, however, there is little point in including bad data along with good data – it will be down-weighted, and merely increase the labour of the data preparation and the volume of the output with little or no benefit. Furthermore, although modern tuning methods do attempt to allow for data of varying quality, they cannot do so perfectly. They are not completely impervious to bad data, and may be adversely affected by it – particularly by outliers in the data for the most recent year. It is therefore advisable to remove poor-quality data sets once these have been identified. Here the provision of good diagnostic information is most helpful (see below). No hard and fast rule can be given, but we would

certainly regard any data set which consistently gives log standard errors of one or more, on significant age groups, with grave suspicion – that means that the prediction is not even good to within a factor of 3 either way, meaning that it is barely yielding even an order-of-magnitude estimate.
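The factor-of-3 remark can be checked directly: on a natural-log scale a standard error of 1.0 corresponds to a multiplicative one-standard-error band of e on the untransformed quantity.

```python
import math

log_se = 1.0               # a log standard error of one
factor = math.exp(log_se)  # multiplicative error, either way
print(round(factor, 2))    # 2.72, i.e. roughly a factor of 3
```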

7.5.3.2 F on the oldest age

A second important choice in practice is what to select for F on the oldest age groups. This is, of course, not determined by the tuning procedure, and must be specified in some other way. At one time it was common practice to set these F values arbitrarily, or by trial and error, or simply to choose a value, and not subsequently ever change it. When overall levels of F are high at around 1.0, these choices are adequate: the resultant F values on other ages are little affected, because of the strong convergence of the VPA. At moderate and low levels of F, however, the effects are more serious, and arbitrary or careless choices of F on the oldest ages can lead to very strange results and odd exploitation patterns. This is in fact the key to a practical solution of the problem, which is addressed in a more fundamental way by the integrated and ‘survivors’ methods of analysis discussed later on.

Bearing in mind that the results of separable VPA show that the shape of the exploitation pattern on the older ages is undetermined, and that we have added nothing in the ad hoc tuning methods to determine it, it seems reasonable to choose F on the oldest age group so that the shape of the exploitation pattern is internally consistent – for example, terminally flat, meaning that it levels off on the oldest ages. This can be done by setting each F value to some proportion of the average over a few of the next younger age groups. A proportion of 1.0 and an average over about five age groups will usually achieve terminal flatness, and these are suitable default choices. It is, however, still very important not to choose low values for F on the oldest ages without strong reasons, since this can destroy the convergence property of VPA, and drive one to the ever-present trivial interpretation of low F, and large populations, everywhere. If in doubt, one should err on the side of higher Fs on the


older ages rather than low ones. It occasionally happens that an initial tuning analysis gives surprisingly low Fs. In such a case it would be desirable to test the effect of increasing the oldest age Fs, by increasing the proportion above 1.0, to see if this removes the problem.
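The default rule described above (a proportion of 1.0 applied to the mean F over about five of the next younger age groups) can be sketched as follows; the function name is hypothetical.

```python
def terminal_f_oldest(f_at_younger_ages, proportion=1.0, n_ages=5):
    """F for the oldest age group, set as a proportion of the mean F over
    the last n_ages entries of f_at_younger_ages (ordered youngest to
    oldest, excluding the oldest age itself). proportion=1.0 and
    n_ages=5 are the suggested defaults for terminal flatness."""
    recent = f_at_younger_ages[-n_ages:]
    return proportion * sum(recent) / len(recent)
```

Raising `proportion` above 1.0 is the diagnostic test suggested above when an initial tuning gives surprisingly low Fs.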

7.5.3.3 Diagnostic information

Even if one is using an analysis package which is fairly well automated, it is important that as much statistical examination of the data, and the fit of the model to it, as possible should be carried out. This requires that good statistical diagnostics be made available. There are, of course, a large number of potential diagnostics which could be calculated, including correlation coefficients between variates, tests of normality and so on. In practice the most important and useful ones, when using a regression method, are probably the estimated slope and its standard error, the standard error of the estimated log catchability, and the highlighting of possible outlying observations. It is important to know the size of the slope and its standard error, so that one can judge both the practical and the statistical significance of the slope, when deciding whether or not to fit a trend of catchability, as discussed in Section 7.5.4 below.
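A decision rule combining both kinds of significance might look like this; the two-standard-error criterion and the practical threshold of 2% per year are illustrative assumptions, not values given in the text.

```python
def trend_worth_fitting(slope, se_slope, practical_threshold=0.02):
    """Decide whether to fit a time trend in ln(q) rather than assume
    constant catchability. The slope should be both statistically
    distinguishable from zero (about two standard errors) and large
    enough per year to matter in practice."""
    statistically_significant = abs(slope) > 2.0 * se_slope
    practically_significant = abs(slope) > practical_threshold
    return statistically_significant and practically_significant
```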

The standard error of prediction of ln(q) is the best final arbiter of the quality of the result from a particular data series, at a particular age. It controls both the weight attached to each individual estimate, and the overall error of the final weighted estimate of F, which can and should be calculated. As mentioned above, high prediction standard errors indicate problems with a data set, and should never

be ignored. Similarly, highlighting of possible outliers helps quality control of the data, as it may pick up data processing errors before it is too late, and will bring unusual results to the analyst’s attention for further investigation. These should be regarded as a minimum set of diagnostics. In principle, the more diagnostics examined the better, but it is in practice necessary to strike a balance, because of the psychological disincentive attached to the study of large volumes of output. There is, however, no doubt that there is room for further

development of useful diagnostics. This approach has at present been taken furthest in the integrated statistical methods (see Quinn and Deriso 1999).
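The final weighted estimate of terminal F and its overall standard error, which the text says can and should be calculated, follow directly from inverse-variance weighting. A minimal sketch, assuming the inputs are the per-fleet ln(F) predictions and their prediction standard errors; the function name is hypothetical.

```python
import math

def combine_terminal_log_f(log_f_estimates, standard_errors):
    """Minimum-variance (inverse-variance weighted) combination of the
    fleet-specific terminal ln(F) estimates, with the standard error of
    the combined estimate."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    combined = sum(w * f
                   for w, f in zip(weights, log_f_estimates)) / total_weight
    return combined, math.sqrt(1.0 / total_weight)
```

Note that fleets whose catchability has historically been predicted precisely automatically dominate the combination, which is exactly the behaviour argued for in Section 7.5.2.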