J. Jamrozik, L.R. Schaeffer / Livestock Production Science 67 (2000) 143–153

sufficient for many research tasks and saves the user from software development time and from possible programming errors. However, efficient, routine genetic evaluations from specially developed software can save time for delivery of results and may be necessary when general software cannot accommodate the model or data size.

Iteration on data (Schaeffer and Kennedy, 1986) has been used widely as a method of solving MME. The MME are not constructed explicitly, but data files are read each round of iteration or stored in memory, and diagonal elements, right hand sides, and solutions need to be stored in memory. Iteration on data allows for a variety of techniques to be applied, such as Gauss–Seidel, Jacobi, or combinations of both, sparse matrix techniques (Misztal, 1999), transformations to simplify multiple trait problems (Ducrocq and Besbes, 1993), or parallel processor algorithms for solving large sparse equation systems (Madsen and Larsen, 1998).

The scale of the equations to be solved has dramatically increased with the introduction of test day (TD) models (Ptak and Schaeffer, 1993) for genetic evaluation of dairy cattle. Reents et al. (1995) presented an efficient iteration on data algorithm for a multiple lactation TD model that has been applied in Canada and Germany. Random regression (RR) TD models (Jamrozik et al., 1997) added another level of complexity, mainly through an enormous increase in the size of the MME. Savings in computing requirements for TD models have been suggested by using transformations (Van der Werf et al., 1998). Canonical transformations for the random regression model do not lead to a series of univariate analyses, but to a multiple trait model of reduced rank in which only the variables with significant eigenvalues are evaluated. Then the multiple trait model with missing traits of Ducrocq and Besbes (1993) is applied. Parallel computing techniques have also been attempted (Stranden, 1999).

The introduction of a multiple lactation, multiple trait, random regression test day (TD) model in Canada in February 1999 was possible through the design and development of specialized software. Milk, fat, and protein yields plus somatic cell scores within each of the first three lactations are simultaneously analyzed in the Canadian Test Day Model (CTDM). Random regressions were based on Wilmink's function (Wilmink, 1987) with three parameters per trait, giving 72 equations per cow with TD records and 36 equations per animal without data. For the Holstein breed, CTDM required processing over 21 million TD records on 1.3 million cows in 2 million contemporary groups and 2.2 million animals in total. The total number of equations was more than 135 million.

Several requests were received from Europe by Canadian Dairy Network (CDN) to acquire the computer programs used in the CTDM, but the cost of the programs was a major obstacle. The decision was made, therefore, to publish the gory computational details in this journal so that others may write their own programs if they want. Also, the details given here can serve as the beginning of the history on computing algorithms for random regression models. First attempts, such as given here, are usually replaced with better algorithms over time. Thus, the objectives of this paper were to present the computing details used in the CTDM, to present an outline for an alternative computing procedure that uses less memory and disk space, and to compare the computing requirements of the two algorithms when applied to data of the Canadian Jersey dairy breed.

2. Material and methods

2.1. Model

The CTDM has been described by Schaeffer et al. (2000a) with a discussion of the experiences in using a TD model for routine genetic evaluation. In matrix notation, the multiple lactation, multiple trait, random regression TD model could be written as

\[ \mathbf{y} = \mathbf{H}\mathbf{c} + \mathbf{X}\mathbf{b} + \mathbf{W}\mathbf{p} + \mathbf{Z}\mathbf{a} + \mathbf{e}, \]

where

$\mathbf{y}$ is the vector of observations on $T$ traits in $L$ lactations (ordered traits within lactations),
$\mathbf{c}$ is the vector of fixed contemporary group effects defined as herd-test date-parity subclasses,
$\mathbf{b}$ is the vector of fixed regression coefficients nested within time period–region–age-parity–season subclasses,
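$\mathbf{p}$ is the vector of random regression coefficients for animal permanent environmental (PE) effects,
$\mathbf{a}$ is the vector of random regression coefficients for animal additive genetic effects,
$\mathbf{e}$ is the vector of random residual effects,
$\mathbf{H}$ is an incidence matrix that relates contemporary groups to observations, and
$\mathbf{X}, \mathbf{W}, \mathbf{Z}$ are matrices of covariates involving number of days in milk associated with a cow on a given test date and corresponding to the observations.

Genetic groups for unknown parents are included in the vector $\mathbf{a}$, for simplicity of notation, and in the definition of $\mathbf{A}$, the matrix of additive genetic relationships. In the CTDM, the fixed regressions and the random genetic and PE regressions were modelled after Wilmink's function (Wilmink, 1987), but in general, these regressions do not need to be modelled after the same function. For example, the fixed regressions could be modelled as classification variables for every 10 days in milk because the shapes of the lactation curves would be allowed to take whatever form was appropriate rather than being forced to fit a particular function. Let $R$ be the number of regression coefficients in the function used for the lactation curve, i.e. for Wilmink's function, $R = 3$.

The expectations and covariance matrices are

\[ E\begin{pmatrix} \mathbf{y} \\ \mathbf{p} \\ \mathbf{a} \\ \mathbf{e} \end{pmatrix} = \begin{pmatrix} \mathbf{H}\mathbf{c} + \mathbf{X}\mathbf{b} \\ \mathbf{0} \\ \mathbf{0} \\ \mathbf{0} \end{pmatrix} \]

and

\[ \mathrm{Var}\begin{pmatrix} \mathbf{p} \\ \mathbf{a} \\ \mathbf{e} \end{pmatrix} = \begin{pmatrix} \mathcal{P} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathcal{G} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{R} \end{pmatrix}, \]

where

\[ \mathcal{P} = \mathbf{I} \otimes \mathbf{P}, \qquad \mathcal{G} = \mathbf{A} \otimes \mathbf{G}, \qquad \text{and} \qquad \mathbf{R} = \textstyle\sum^{+} \mathbf{R}_{ij}, \]

where

$\mathbf{A}$ is the additive genetic relationship matrix,
$\mathbf{P}$ and $\mathbf{G}$ are covariance matrices of order $RTL$ for the PE and genetic regression coefficients, respectively, and
$\mathbf{R}_{ij}$ is the covariance matrix for cow $i$ on a given test day $j$.

The rank of $\mathbf{R}_{ij}$ can vary from 1 to $T$ depending on the traits that are missing on a cow. The values in $\mathbf{R}_{ij}$ depend on the lactation number and number of days in milk within that lactation. The corresponding mixed model equations (MME) for this model are

\[
\begin{pmatrix}
\mathbf{H}'\mathbf{R}^{-1}\mathbf{H} & \mathbf{H}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{H}'\mathbf{R}^{-1}\mathbf{W} & \mathbf{H}'\mathbf{R}^{-1}\mathbf{Z} \\
\mathbf{X}'\mathbf{R}^{-1}\mathbf{H} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{W} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{Z} \\
\mathbf{W}'\mathbf{R}^{-1}\mathbf{H} & \mathbf{W}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{W}'\mathbf{R}^{-1}\mathbf{W} + \mathcal{P}^{-1} & \mathbf{W}'\mathbf{R}^{-1}\mathbf{Z} \\
\mathbf{Z}'\mathbf{R}^{-1}\mathbf{H} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{W} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathcal{G}^{-1}
\end{pmatrix}
\begin{pmatrix} \hat{\mathbf{c}} \\ \hat{\mathbf{b}} \\ \hat{\mathbf{p}} \\ \hat{\mathbf{a}} \end{pmatrix}
=
\begin{pmatrix} \mathbf{H}'\mathbf{R}^{-1}\mathbf{y} \\ \mathbf{X}'\mathbf{R}^{-1}\mathbf{y} \\ \mathbf{W}'\mathbf{R}^{-1}\mathbf{y} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{y} \end{pmatrix}.
\]

Let

NPE equal the number of animals with TD records,
NAN equal the total number of animals,
NCG equal the number of contemporary groups, and
NFR equal the number of subclasses of fixed regressions.

Then the total number of equations in MME for this model would be

\[ \mathrm{NEQ} = \mathrm{NCG}\,T + \mathrm{NFR}\,T R + (\mathrm{NPE} + \mathrm{NAN})\,R T L, \]

but because NFR is usually much smaller than NCG, NPE, or NAN, and because NCG does not involve any regression functions, then NEQ can be roughly approximated by

\[ \mathrm{NEQ} = (\mathrm{NPE} + \mathrm{NAN})\,R T L. \]

Suppose that $R = 3$, $L = 3$, NPE = 1 000 000, NAN = 2 000 000, and $T = 4$, then NEQ would be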
108 000 000. If all calculations are performed as double precision (i.e. eight bytes per variable), then storing the solutions, diagonal elements, and right hand sides to MME would require a minimum of 2.6 gigabytes (GB) of memory. If the computer on which the calculations are to be performed has only 2 GB of memory, then not all of these elements can be stored in memory at one time or computations must be done in single precision, or a combination of both. In the United States NAN would be greater than 10 million for their Holstein population.

2.2. Algorithm A

A multiple trait model provides an obvious blocking structure of MME by traits. With the TD model, blocks can be defined on different levels of generality. Two different blocks will be used through the description of Algorithm A, which is the algorithm currently used with the CTDM. Record blocks (RB) are determined by the residual covariance matrix, $\mathbf{R}_{ij}$, on a given test day for an animal. These blocks are of order $T$ or 4 for the CTDM. RB matches the way in which data are stored, i.e. four traits within an animal on a given test day. The contemporary group effects have diagonal blocks of order $T$ as well as $T$ elements in the right hand sides (RHS). That is, $\mathbf{H}'\mathbf{R}^{-1}\mathbf{H}$ is a block diagonal matrix with blocks of order $T$.

The other type of blocking is called an animal block, AB, which is defined by $\mathbf{G}$ and $\mathbf{P}$ of order $RTL$ or 36 in the CTDM. That is, $\mathbf{W}'\mathbf{R}^{-1}\mathbf{W}$ and $\mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z}$ are block diagonal matrices with blocks of order $RTL$. Data associated with an AB or RB are processed at the same time, and all equations pertaining to a block are solved simultaneously. Data processing in blocks is simple and speeds convergence, but to implement a blocking strategy requires specific preparation of data files.

2.2.1. Data files

Two types of data files are required: the pedigree file and the data file with TD yields. The pedigree file, PEDIGREE, contains one record per individual. The PEDIGREE file must be such that animals are ordered and numbered from youngest to oldest. The PEDIGREE file is needed for determining the elements of $\mathbf{A}^{-1}$. Recall that the relationship matrix inverse can be decomposed as

\[ \mathbf{A}^{-1} = \mathbf{T}'\mathbf{D}^{-2}\mathbf{T}, \]

where $\mathbf{T}$ is a triangular matrix with ones on the diagonals and at most two non-zero elements per row in columns corresponding to the parents of an animal with value equal to $-0.5$, and $\mathbf{D}^{-2}$ is a diagonal matrix. In a non-inbred population the diagonal elements are 2 if both parents are known, 4/3 if only one parent is known, and 1 if both parents are unknown. With an inbred population then there are many more possible values for these diagonal elements which can be computed using the methods of Meuwissen and Luo (1992). The variables in the PEDIGREE file are

Animal number (ID),
Its sire ID,
Its dam ID, and
Value from $\mathbf{D}^{-2}$.

The ID numbers of animals were consecutive from 1 to NAN, i.e. youngest to oldest sequence. Genetic groups were assigned for all missing parents and were numbered from NAN + 1 onwards. Obviously, there must be another file that links the consecutive ID number to an animal's registration number, name, and ownership, but such a file is not needed in the computation process. The entire PEDIGREE file is stored in memory during the iteration process, and requires approximately 16 × NAN bytes of memory.

The TD records data file on all cows contains the following information:

Animal ID (matches the ID in PEDIGREE file)
Cow ID (numbering for PE effects)
Contemporary group number
Fixed regression subclass number
Days in milk (DIM)
Parity number
Missing traits code
Accuracy of TD yields code
Yields for milk, fat, protein and somatic cell score.

Levels of all effects have to be numbered consecutively. Missing trait codes tell which traits are missing on that test day. In CTDM, all records have
milk yields present, but other traits may be missing. The missing trait codes specify the correct $\mathbf{R}_{ij}$ to be used in conjunction with parity number and DIM. TD yields are estimates of 24 h yield; a yield estimated from two supervised weighings receives an accuracy of 100. If 24 h yield is estimated from an evening or morning weighing only, then accuracy is 89. If 24 h yields are estimated from one weighing in herds that are milking three times a day, then accuracy would be lower (around 80). These numbers are provided by the milk recording organizations (Schaeffer et al., 2000b). Each record in the yield file requires $20 + 4T$ bytes of storage. With over 21 million records in this file for Canadian Holsteins, storage of the information in memory is impossible, and therefore the file must be re-read during each iteration. In fact, two copies of the data file are needed: one sorted by contemporary group numbers (CG file) and one sorted by cow ID (COW file), and each file needs to be read once during every round of iteration. Reading these files in an efficient manner is facilitated by special input/output (I/O) routines in the C language and writing the data in an unformatted manner.

2.2.2. Iteration scheme

Prior to iteration the diagonal blocks for contemporary groups, animal PE, and animal genetic effects need to be created, inverted, and stored on disk as three separate data files (written in standard FORTRAN 77 as unformatted). Animal genetic and PE diagonal blocks for cows with TD records are functions of DIM on which the cow's records were made. Because there is a very large number of possible combinations of DIM, missing trait codes, and accuracy codes, these diagonal blocks have to be created and stored explicitly. For animals without TD records (i.e. ancestors), the diagonal block for the genetic effect is

\[ a^{ii}\,\mathbf{G}^{-1}, \]

where $a^{ii}$ is the diagonal element of $\mathbf{A}^{-1}$ for animal $i$, which can be created as needed or stored using an implicit representation as shown by Tier and Graser (1991). Algorithm A requires memory storage space for solution vectors for all effects in the model and for a work vector large enough to store the RHS for animal additive genetic effects, and this vector must be double precision because the RHS of the MME for some animals can become quite large in magnitude. Elements of RHS for CG and PE effects are created sequentially while processing the CG file and COW file, respectively. Because the fixed regression effects are relatively small in number of levels, the solutions, diagonal blocks, and RHS for fixed regressions are stored in memory. The PEDIGREE file is also stored in memory, as mentioned previously. Inverses of the residual covariance matrices by parity number, four DIM intervals, missing trait code, and accuracy of TD yields are stored in memory as half-stored $T \times T$ matrices. Inverses for $\mathbf{G}$ and $\mathbf{P}$ are created prior to iteration.

The iteration process proceeds as follows:

1. The CG file is read sequentially.

(a) All records within a CG are stored in memory and the appropriate $\mathbf{R}_{ij}^{-1}$ is selected for each record based on DIM, parity number, missing trait combination, and accuracy of the TD information. Let the model for the $j$th TD record in the $i$th CG and $k$th fixed regression subclass be

\[ \mathbf{y}_{ijk} = \mathbf{c}_i + \mathbf{X}_{ij}\mathbf{b}_k + \mathbf{W}_{ij}\mathbf{p} + \mathbf{Z}_{ij}\mathbf{a} + \mathbf{e}_{ij}. \]

(b) Adjust the observations for the current solutions for fixed regressions, animal PE, and animal additive genetic effects, and accumulate into the RHS for that CG (call it CGRHS),

\[ \mathrm{CGRHS} = \sum_j \mathbf{R}_{ij}^{-1}\left(\mathbf{y}_{ijk} - \mathbf{X}_{ij}\mathbf{b}_k - \mathbf{W}_{ij}\mathbf{p} - \mathbf{Z}_{ij}\mathbf{a}\right). \]

Because Wilmink's function is used for both fixed and random regressions in the CTDM, the values of the covariates that appear in $\mathbf{X}_{ij}$, $\mathbf{W}_{ij}$, and $\mathbf{Z}_{ij}$ are the same. This is not essential, but it does make programming a little easier.

(c) Adjust the observations in each TD record for animal PE and animal additive genetic effects and accumulate into the RHS for the $k$th fixed regression subclass (call them $\mathrm{FRHS}_k$), where

\[ \mathrm{FRHS}_k = \mathrm{FRHS}_k + \mathbf{X}_{ij}'\mathbf{R}_{ij}^{-1}\left(\mathbf{y}_{ijk} - \mathbf{W}_{ij}\mathbf{p} - \mathbf{Z}_{ij}\mathbf{a}\right). \]

(d) After all records in a CG have been processed, read in the inverse of the diagonal block for that CG,
\[ \mathrm{HINV} = \left(\mathbf{H}_i'\mathbf{R}_i^{-1}\mathbf{H}_i\right)^{-1}, \]

and obtain a new solution for that CG,

\[ \mathbf{c}_i = \mathrm{HINV}\cdot\mathrm{CGRHS}. \]

(e) Go through the records for that CG again and adjust the RHS of the fixed regressions for the new CG solution,

\[ \mathrm{FRHS}_k = \mathrm{FRHS}_k - \mathbf{X}_{ij}'\mathbf{R}_{ij}^{-1}\mathbf{c}_i. \]

Continue until all CG have been processed.

2. Compute new solutions for the fixed regressions. The block diagonal inverses are already stored in memory,

\[ \mathbf{b}_k = \left(\mathbf{X}'\mathbf{R}^{-1}\mathbf{X}\right)_k^{-1}\mathrm{FRHS}_k \]

for all $k$ from 1 to NFR.

3. The COW file is processed next. Now let the model for the $j$th TD record on the $i$th cow be denoted as

\[ \mathbf{y}_{ij} = \mathbf{H}_{ij}\mathbf{c} + \mathbf{X}_{ij}\mathbf{b} + \mathbf{W}_{ij}\mathbf{p}_i + \mathbf{Z}_{ij}\mathbf{a}_i + \mathbf{e}_{ij}, \]

with $\mathrm{Var}(\mathbf{e}_{ij}) = \mathbf{R}_{ij}$. For simplicity, the same subscript, $i$, has been used to denote PE and animal genetic effects, but remember the PE effects are referenced by the cow ID in the data file and animal genetic effects are referenced by the animal ID.

(a) Read and store in memory all TD records for a given cow.

(b) Adjust the observations for fixed regressions, CG, and animal genetic effects and accumulate in the RHS for the PE effects (i.e. a 36 by 1 vector).

\[ \mathrm{PERHS} = \sum_j \mathbf{W}_{ij}'\mathbf{R}_{ij}^{-1}\left(\mathbf{y}_{ij} - \mathbf{H}_{ij}\mathbf{c} - \mathbf{X}_{ij}\mathbf{b} - \mathbf{Z}_{ij}\mathbf{a}_i\right). \]

(c) Adjust the observations for CG and fixed regressions and accumulate into the RHS for animal genetic effects, which is the large work vector in memory for all animals,

\[ \mathrm{ARHS}_i = \mathrm{ARHS}_i + \mathbf{Z}_{ij}'\mathbf{R}_{ij}^{-1}\left(\mathbf{y}_{ij} - \mathbf{H}_{ij}\mathbf{c} - \mathbf{X}_{ij}\mathbf{b}\right). \]

(d) Read in the diagonal block inverse for the animal PE effect,

\[ \mathrm{WINV} = \left(\mathbf{W}_i'\mathbf{R}_i^{-1}\mathbf{W}_i + \mathbf{P}^{-1}\right)^{-1}, \]

and compute the new animal PE solution as

\[ \mathbf{p}_i = \mathrm{WINV}\cdot\mathrm{PERHS}. \]

(e) Adjust the animal genetic RHS for the new animal PE solution,

\[ \mathrm{ARHS}_i = \mathrm{ARHS}_i - \mathbf{Z}_{ij}'\mathbf{R}_{ij}^{-1}\mathbf{W}_{ij}\mathbf{p}_i. \]

4. To get new animal genetic solutions, the PEDIGREE file in memory must be processed. Remember that animals are sorted from youngest to oldest, and that this ordering is critical. Let $i$ represent the $i$th animal, $s$ represent the sire of animal $i$, and $d$ represent the dam of animal $i$, and let $a^{km}$ represent elements of $\mathbf{A}^{-1}$ between animals $k$ and $m$.

(a) Adjust the animal genetic RHS for its sire and dam solutions,

\[ \mathrm{ARHS}_i = \mathrm{ARHS}_i - \mathbf{G}^{-1}\left(a^{is}\mathbf{a}_s + a^{id}\mathbf{a}_d\right). \]

(b) If the animal has TD records, read in the inverted diagonal block for animal $i$ as

\[ \mathrm{ZINV} = \left(\mathbf{Z}_i'\mathbf{R}_i^{-1}\mathbf{Z}_i + a^{ii}\mathbf{G}^{-1}\right)^{-1}, \]

or if the animal has no TD records, then

\[ \mathrm{ZINV} = \left(a^{ii}\mathbf{G}^{-1}\right)^{-1}. \]

Calculate a new animal genetic solution vector as

\[ \mathbf{a}_i = \mathrm{ZINV}\cdot\mathrm{ARHS}_i. \]

(c) Adjust the sire and dam genetic RHS for the new animal genetic solution and the solution for its mate as

\[ \mathrm{ARHS}_s = \mathrm{ARHS}_s - \mathbf{G}^{-1}\left(a^{si}\mathbf{a}_i + a^{sd}\mathbf{a}_d\right), \quad \text{and} \]

\[ \mathrm{ARHS}_d = \mathrm{ARHS}_d - \mathbf{G}^{-1}\left(a^{di}\mathbf{a}_i + a^{ds}\mathbf{a}_s\right). \]

5. Solve for new genetic group solutions as

\[ \mathbf{a}_i = \left(a^{ii}\mathbf{G}^{-1}\right)^{-1}\mathrm{ARHS}_i. \]

The iteration process is continued until satisfactory convergence is obtained. The CTDM applies up to 300 iterations, but utilizes solutions from a previous run as starting values in the iteration.
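Step 4 of Algorithm A relies on the elements $a^{ii}$, $a^{is}$, $a^{id}$ and $a^{sd}$ of $\mathbf{A}^{-1}$. A minimal Python sketch of how those elements follow from the decomposition $\mathbf{A}^{-1} = \mathbf{T}'\mathbf{D}^{-2}\mathbf{T}$ is given here for illustration only. It assumes a non-inbred population, ignores genetic groups, and uses 0-based animal indices with `None` for an unknown parent; the function name `pedigree_inverse` and the dense list-of-lists storage are hypothetical choices, not part of the CTDM programs.

```python
def pedigree_inverse(pedigree):
    """Build A^{-1} = T' D^{-2} T for a small, non-inbred pedigree.

    pedigree: list of (sire, dam) index pairs, one per animal (0-based),
    with None marking an unknown parent. Returns a dense n x n matrix
    as a list of lists. Illustrative only: a real program would never
    store A^{-1} densely.
    """
    n = len(pedigree)
    a_inv = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        known = [p for p in (sire, dam) if p is not None]
        # Diagonal of D^{-2}: 2 (both parents known), 4/3 (one), 1 (none).
        d_inv = {2: 2.0, 1: 4.0 / 3.0, 0: 1.0}[len(known)]
        # Row i of T: 1 on the diagonal, -0.5 in each known parent's column.
        row = [(i, 1.0)] + [(p, -0.5) for p in known]
        # Accumulate the rank-one contribution t_i * d_inv * t_i'.
        for j, tj in row:
            for k, tk in row:
                a_inv[j][k] += tj * d_inv * tk
    return a_inv


# Animal 0 (youngest) has sire 1 and dam 2; animals 1 and 2 are founders.
example = pedigree_inverse([(1, 2), (None, None), (None, None)])
```

In this small example the animal's own diagonal is 2, each animal-parent element is -1, and the sire-dam element is 0.5, which are the kinds of values that enter the sire, dam and mate adjustments in steps 4(a) and 4(c).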
2.3. Algorithm B

Inspection of the MME for the CTDM reveals the following structures as a result of partitioning according to lactation number. That is,

\[
\mathbf{W}'\mathbf{R}^{-1}\mathbf{W} + \mathbf{P}^{-1} =
\begin{pmatrix}
\mathbf{W}_1'\mathbf{R}_1^{-1}\mathbf{W}_1 + \mathbf{P}^{11} & \mathbf{P}^{12} & \mathbf{P}^{13} \\
\mathbf{P}^{21} & \mathbf{W}_2'\mathbf{R}_2^{-1}\mathbf{W}_2 + \mathbf{P}^{22} & \mathbf{P}^{23} \\
\mathbf{P}^{31} & \mathbf{P}^{32} & \mathbf{W}_3'\mathbf{R}_3^{-1}\mathbf{W}_3 + \mathbf{P}^{33}
\end{pmatrix},
\]

with a similar structure for $\mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathbf{G}^{-1}$, and

\[
\mathbf{W}'\mathbf{R}^{-1}\mathbf{Z} =
\begin{pmatrix}
\mathbf{W}_1'\mathbf{R}_1^{-1}\mathbf{Z}_1 & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{W}_2'\mathbf{R}_2^{-1}\mathbf{Z}_2 & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{W}_3'\mathbf{R}_3^{-1}\mathbf{Z}_3
\end{pmatrix}.
\]

Note that there are no data connections between lactations, but only connections via the non-zero covariances of PE and genetic effects between lactations. These structures suggest blocking PE and genetic effects on a lactation by lactation basis, rather than all three lactations simultaneously. Let

\[
\mathbf{P}^{-1} =
\begin{pmatrix}
\mathbf{P}^{11} & \mathbf{P}^{12} & \mathbf{P}^{13} \\
\mathbf{P}^{21} & \mathbf{P}^{22} & \mathbf{P}^{23} \\
\mathbf{P}^{31} & \mathbf{P}^{32} & \mathbf{P}^{33}
\end{pmatrix}
\quad \text{and} \quad
\mathbf{G}^{-1} =
\begin{pmatrix}
\mathbf{G}^{11} & \mathbf{G}^{12} & \mathbf{G}^{13} \\
\mathbf{G}^{21} & \mathbf{G}^{22} & \mathbf{G}^{23} \\
\mathbf{G}^{31} & \mathbf{G}^{32} & \mathbf{G}^{33}
\end{pmatrix},
\]

where each partition is of order $RT$.

2.3.1. Data files

The PEDIGREE file is needed as before with no changes. The TD records data files are also the same as before except that the COW file must be sorted by cow ID within parity number. The CG file, sorted by contemporary group numbers, remains the same.

2.3.2. Iteration scheme

Blocks are now defined by animals within lactations. The order of these blocks is $RT$ rather than $RTL$. For the CTDM with $R = 3$, $T = 4$, and $L = 3$, the block size of a half-stored matrix per animal was of length 666, and blocks by animals within lactation would only be of length 78, but three could be necessary for each animal. In terms of disk space this would be only 35% of that required for the larger blocks. If an animal does not have any second or third lactation TD records, then its inverted diagonal blocks for those lactations are not written to disk because those blocks would be equal to $(\mathbf{P}^{22})^{-1}$ and $(\mathbf{P}^{33})^{-1}$, respectively (for PE effects, for example), and there is no need to write multiple copies of these matrices. Thus, the actual savings in disk space would be greater than 65%. Cows, however, need to be coded in the program to know which ones do not have records in second or third lactations.

Memory storage is still required for solutions to all effects in the model, but now the RHS for animal genetic effects only needs to be large enough for all animals for one lactation, i.e. $\mathrm{NAN}\,RT$ rather than $\mathrm{NAN}\,RTL$. The iteration process proceeds as follows:

1. The CG file is read sequentially and calculations are performed exactly the same as in Algorithm A. RHS for fixed regressions are handled in the same manner.

2. New solutions for fixed regressions are calculated as in Algorithm A.

3. The COW file is processed. Remember that this file is now sorted by cow ID within parity number. Let the model for the $j$th TD record on the $i$th cow in lactation $m$ be

\[ \mathbf{y}_{ijm} = \mathbf{H}_{ijm}\mathbf{c} + \mathbf{X}_{ijm}\mathbf{b} + \mathbf{W}_{ijm}\mathbf{p}_{im} + \mathbf{Z}_{ijm}\mathbf{a}_{im} + \mathbf{e}_{ijm}, \]

and $\mathrm{Var}(\mathbf{e}_{ijm}) = \mathbf{R}_{ijm}$.

(a) Read and store in memory all TD records for a cow and determine the appropriate $\mathbf{R}_{ijm}^{-1}$.

(b) Adjust the observations for contemporary groups, fixed regressions, and animal genetic effects and accumulate in the RHS for the PE effects (i.e. within lactation RHS is a 12 by 1 vector),

\[ \mathrm{PERHS} = \sum_j \mathbf{W}_{ijm}'\mathbf{R}_{ijm}^{-1}\left(\mathbf{y}_{ijm} - \mathbf{H}_{ijm}\mathbf{c} - \mathbf{X}_{ijm}\mathbf{b} - \mathbf{Z}_{ijm}\mathbf{a}_{im}\right). \]

(c) Further adjustment to PERHS is needed for the PE effects in the other lactations which are correlated to the PE effects in lactation $m$ for cow $i$,

\[ \mathrm{PERHS} = \mathrm{PERHS} - \sum_{\ell \neq m} \mathbf{P}^{m\ell}\,\mathbf{p}_{i\ell}. \]
(d) Adjust the observations for CG and fixed regressions and accumulate over $j$ into the RHS for animal genetic effects for cow $i$.

\[ \mathrm{ARHS}_{im} = \mathrm{ARHS}_{im} + \mathbf{Z}_{ijm}'\mathbf{R}_{ijm}^{-1}\left(\mathbf{y}_{ijm} - \mathbf{H}_{ijm}\mathbf{c} - \mathbf{X}_{ijm}\mathbf{b}\right). \]

(e) If a cow has TD records in lactation $m$, then read in the inverted diagonal block for cow $i$ PE effects in lactation $m$,

\[ \mathrm{WINV} = \left(\mathbf{W}_{im}'\mathbf{R}_{im}^{-1}\mathbf{W}_{im} + \mathbf{P}^{mm}\right)^{-1}, \]

otherwise,

\[ \mathrm{WINV} = \left(\mathbf{P}^{mm}\right)^{-1}. \]

Compute the new animal PE solution for lactation $m$ as

\[ \mathbf{p}_{im} = \mathrm{WINV}\cdot\mathrm{PERHS}. \]

(f) Adjust the animal genetic RHS for animal $i$ and lactation $m$ for the new PE solution,

\[ \mathrm{ARHS}_{im} = \mathrm{ARHS}_{im} - \mathbf{Z}_{ijm}'\mathbf{R}_{ijm}^{-1}\mathbf{W}_{ijm}\mathbf{p}_{im}. \]

4. The animal genetic solutions for lactation $m$ are obtained by processing the PEDIGREE file in memory.

(a) Adjust the animal's RHS for the sire and dam solutions in all lactations as

\[ \mathrm{ARHS}_{im} = \mathrm{ARHS}_{im} - \sum_{\ell=1}^{L} \mathbf{G}^{m\ell}\left(a^{is}\mathbf{a}_{s\ell} + a^{id}\mathbf{a}_{d\ell}\right). \]

(b) Further adjustment to $\mathrm{ARHS}_{im}$ is needed for the genetic effects in the other lactations which are correlated to the genetic effects in lactation $m$ for cow $i$.

\[ \mathrm{ARHS}_{im} = \mathrm{ARHS}_{im} - \sum_{\ell \neq m} a^{ii}\,\mathbf{G}^{m\ell}\,\mathbf{a}_{i\ell}. \]

(c) If the animal has TD records in lactation $m$ then read in the inverted diagonal block for animal $i$,

\[ \mathrm{ZINV} = \left(\mathbf{Z}_{im}'\mathbf{R}_{im}^{-1}\mathbf{Z}_{im} + a^{ii}\mathbf{G}^{mm}\right)^{-1}, \]

otherwise,

\[ \mathrm{ZINV} = \left(a^{ii}\mathbf{G}^{mm}\right)^{-1}. \]

Calculate a new animal genetic solution vector as

\[ \mathbf{a}_{im} = \mathrm{ZINV}\cdot\mathrm{ARHS}_{im}. \]

(d) Adjust the sire and dam genetic RHS for the new animal genetic solution and for the mate's genetic solution as

\[ \mathrm{ARHS}_{sm} = \mathrm{ARHS}_{sm} - \sum_{\ell=1}^{L} \mathbf{G}^{m\ell}\left(a^{si}\mathbf{a}_{i\ell} + a^{sd}\mathbf{a}_{d\ell}\right), \quad \text{and} \]

\[ \mathrm{ARHS}_{dm} = \mathrm{ARHS}_{dm} - \sum_{\ell=1}^{L} \mathbf{G}^{m\ell}\left(a^{di}\mathbf{a}_{i\ell} + a^{ds}\mathbf{a}_{s\ell}\right). \]

5. Solve for new genetic group solutions in lactation $m$ as

\[ \mathbf{a}_{im} = \left(a^{ii}\mathbf{G}^{mm}\right)^{-1}\mathrm{ARHS}_{im}. \]

2.4. Comparison of algorithms

Algorithms A and B were applied to the national Canadian Jersey dairy data set. Data were 543 769 TD records from the first three lactations of 35 502 cows (= NPE) that calved after January 1, 1988. The total number of animals in the evaluation was 69 946 (= NAN). Contemporary groups, formed on the basis of herd-test date-parity subclasses (with second and third parities combined), numbered 71 038 (= NCG). Seventeen phantom parent groups were formed for unknown sires and dams based on sex of parent and year of birth of offspring. The number of fixed regression subclasses, formed on the basis of region–parity-age at calving-season of calving, was 38 (= NFR). The model for each trait was the same and was described in detail by Schaeffer et al. (2000a). Wilmink's function was utilized so that $R = 3$.

The MME comprised a total of 4 081 096 equations. Starting solutions for all effects were zero for both algorithms prior to iteration. Algorithms were compared on the basis of total computing time per iteration, convergence properties, and memory and disk storage requirements. Convergence was attained when the sum of squares of differences in animal genetic solutions between iterations divided by the sum of squares of animal genetic solutions in the latest iteration (all times 100) was less than 0.00001. This criterion is unitless compared to a comparison of the squared differences between actual and re-generated right hand sides which would require additional storage to re-generate the right hand sides for comparisons.

Algorithms were written and implemented in
standard FORTRAN 77. Programs were run on an HP-UX 9000/800 workstation. All solution and RHS vectors were declared as single precision except for the RHS work vector that was allocated for NAN animal genetic effects, which was declared as double precision, and was critical to achieving convergence in the Holstein breed.

Table 2
Expected storage requirement for Algorithm B as applied to Canadian Holstein data set for different numbers of covariates as random regressions

Number of covariates    Memory space (MB)    Disk storage (MB)
3                        715                  2433
4                        954                  4242
5                       1192                  6552
6                       1430                  9360
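The storage figures quoted in Sections 2.1 and 2.3.2 can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the 2.6 GB figure is read here as three double precision vectors (solutions, diagonal elements, right hand sides), each of length NEQ, which is an interpretation of the text rather than a statement taken from it.

```python
def approx_equations(npe, nan, r, t, l):
    """Approximate NEQ = (NPE + NAN) * R * T * L from Section 2.1."""
    return (npe + nan) * r * t * l


def half_stored_length(order):
    """Number of elements in the half-stored form of a symmetric matrix."""
    return order * (order + 1) // 2


R, T, L = 3, 4, 3  # regression coefficients, traits, lactations (CTDM)

# Worked example from Section 2.1: NPE = 1 000 000, NAN = 2 000 000.
neq = approx_equations(1_000_000, 2_000_000, R, T, L)

# Assumed reading: three vectors of length NEQ at eight bytes per element.
gigabytes = 3 * 8 * neq / 1e9

# Algorithm A stores one block of order R*T*L per animal,
# Algorithm B up to L blocks of order R*T per animal.
block_a = half_stored_length(R * T * L)
block_b = half_stored_length(R * T)
ratio = L * block_b / block_a

print(neq, round(gigabytes, 1), block_a, block_b, round(100 * ratio))
```

Under these assumptions the script gives 108 000 000 equations, about 2.6 GB, block lengths of 666 and 78, and a disk ratio of about 35%, in line with the values stated in the text.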
3. Results