Int. J. Production Economics 67 (2000) 27-36
The use of quality metrics in service centres
Valia T. Petkova*, Peter C. Sander, Aarnout C. Brombacher
Eindhoven University of Technology, Faculty of Technology Management/Section Product and Process Quality, P.O. Box 513,
5600 MB Eindhoven, The Netherlands
Abstract
In industry it is not well realised that a service centre is potentially one of the major contributors to quality
improvement. Service is able to collect vital information about the field behaviour of products in interaction with
customers. If this information is well analysed and communicated, the recurrence of old problems in new products will
drastically be reduced and so will the expenses on recalls, repairs, warranties, and liabilities. In this paper we discuss the
kind of information a service centre has to collect and some quality-related metrics that organisations use, like the field
call rate, or should use, like the hazard function. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Reliability; Service centre; Quality improvement; Quality metrics; Maturity index on reliability (MIR)
1. Introduction
For high-volume consumer products there is
fierce and world-wide competition, in which the
four most important business drivers are:
• Functionality: As a result of fast technological
development, the functionality of products increases sharply. For example, any 10-year-old
photo camera is far behind a recent one in the
same price category in terms of functionality.
• Time to market: Technological development
is so fast that products are outdated in months
instead of years. As a consequence,
the time to market has to be very short,
otherwise a product is already obsolete by the
time it reaches the market. Desktop computers
are a clear example.
* Corresponding author. Tel.: +31-40-247-5944; fax: +31-40-246-7497.
E-mail address: [email protected] (V.T. Petkova).
• Quality and reliability: Customers expect excellent quality even for relatively inexpensive products. In line with this, the warranty period is
extended from half a year to sometimes three
years, or even longer. A short time to market and
excellent quality are conflicting requirements.
For example, a serious test programme needs
time, and if the tests show that improvement
actions are necessary, then that takes even more
time.
• Profitability: The fact that products are obsolete
in months has as a consequence that there is
fast price erosion. One of the most striking
examples is probably the price of desktop computers in relation to their functionality. Because
of this price erosion, the necessary investments,
and the heavy competition, it is not easy to make
a profit on consumer products.
In order to survive as a company producing consumer products, it is vital to be best in class on one
or more business drivers. This means that there
0925-5273/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0925-5273(00)00007-4
Fig. 1. Feedback control loop.
must be a coherent and company-wide approach, in
which every department knows its role and understands that it is not the profitability of the individual department that counts; what counts is the
profitability of the company as a whole. This means
that departments must be judged by their contribution to the company-wide vital business processes.
Functionality, time-to-market, quality and profitability are all the result of business processes, and
these processes must constantly be improved and
kept in line with the changing environment in order
to become or stay best in class and consequently
survive as a company.
The availability of feedback control loops is important for the improvement of business processes.
In principle a feedback control loop has a simple
structure. There is a process and the output of the
process has to fulfil certain criteria. In order to
check whether the output is in accordance with the
specifications, some measurements are done. If the
measurements make clear that there is a difference
between the output and the criteria, some action is
necessary (cf. Fig. 1). For an overview about dealing
with instability of business processes we refer to [1]
in this issue. The control of processes is complicated by the fact that normally there are all kinds of
disturbances, in particular in the input, in the process, in the measurements and even in the action.
In an organisation feedback control loops are
necessary on all levels. On a low level they are used
in production; examples are automatic control
mechanisms and statistical process control. On
a high level, feedback loops have to be used in order
to make sure that departments keep in line with the
overall company goal.
In a previous paper [2], we demonstrated that
service centres are an essential element in the
control loop aimed at quality improvement.
In the present paper we discuss some metrics that
presently are used in service centres and that at best
have some value for logistic purposes, but are not
related to the business drivers. Subsequently, some
metrics are discussed that focus on the quality of
the products.
The structure of the paper is as follows. In Section 2 the new role for service centres is briefly
described, namely the contribution of service
centres to quality improvement. In Section 3 the
information flow is presented that facilitates the
new role of the service centre. This information flow
is structured by the maturity index on reliability (cf.
[3,4]). In Section 4 it is discussed what metrics are
valuable in the process of enhancing the quality
and reliability of consumer products. We consider
two metrics that are currently used in industry, and
we propose two metrics that are much more
tailored to quality improvement. The conclusions
are presented in Section 5.
2. Role of service centres in quality improvement
Up to about 10 years ago companies could see
product quality as something 'nice to have'. Nowadays it is a must, a boundary condition. Without it
there is no reason to enter the market. A unique
characteristic of a service centre is that there the
customer and the manufacturer have their first contact when there is a quality problem. That is, when
there is a misfit between what the customer expects
and what he gets. This misfit is the final result of the
product creation process (PCP). We define the PCP
in a wide sense, i.e. it includes all business processes
that directly affect the final product; in particular
the business processes in marketing, development,
production and service, including the suppliers.
In order to prevent a misfit between the customer's expectations and the product, it makes
sense to 'listen to the voice of the customer'. But it is
not easy to integrate the voice of the customer in
the PCP. Of course, some companies systematically
use Quality Function Deployment (cf. [5]), in particular in the automotive industry. But QFD is
usually only used in the design phase. The customer
turns up at the other end of the PCP and that
customer is hardly ever approached to find out
whether the product really satisfies his needs and
expectations. The only contact between a customer
and the manufacturer is in a service centre when
a customer has a complaint. In particular when the
complaint is covered by the warranty, the service
centre will try to repair the product as fast as
possible and with minimum costs. Service centres
will try to reduce local costs by skipping expensive
and locally non-contributing activities. If a service
centre is not assessed on its contribution to quality
improvement, it has no motive to spend time on
finding the root cause of the customer's problem
and to communicate this to the other parties in the
PCP. Consequently, there is no information flow
from service centres to the other parties. The only
information exchange between designers and service centres concerns the serviceability of the products. As far as service is concerned, replacing the
failed modules or the whole product solves the
problem.
Apart from the general remarks in the foregoing
about the possible contribution of service centres to
quality improvement, there are some special circumstances why just now a different role for service
is most advantageous. In our opinion the most important ones are the following:
1. Nowadays the field problems service centres are
confronted with are of a different nature than
in the past. With the increasing reliability of the
components and the increasing complexity
of the functionality, component-related reliability problems have become a minority of current
field complaints (cf. Fig. 2). As service centres are
close to the customer, they are in a good position
to examine the root cause for all fault categories.
This requires a new approach, because, as mentioned before, up till today a service centre just
replaces components or modules by spare parts
without looking for the root cause. Actually,
a service centre today is usually not capable of
finding the root cause, because it has no substantial knowledge about design and production.
The best way forward is, in our view, to intensify the collaboration between service centres
and development and production by exchange
of information and exchange of people (cf. [2]).
2. Especially in high-volume consumer products
there is a high degree of innovation. The more
innovative new products are, the more difficult it
is to predict the way customers will use them.
Therefore, companies must anticipate unanticipated hidden quality problems and latent
defects shortly after market introduction. As
a consequence, it is of the utmost importance that
especially in this phase there is a detailed and
fast communication between Development, Production, Quality and Reliability, and Service
about all four business drivers. Again, the first
contact with the customer is in the service centre,
so that is a perfect place to start.
Fig. 2. Observed categories of reliability problems [6].
The conclusion is that on a company level it is
essential to see Service as a department that is
crucial in the control loop over the product creation process. Service is able to collect vital information about the field behaviour of products in
interaction with customers. If this information is
well analysed and communicated, the recurrence of
old problems in new products will drastically be
reduced and so will the expenses on recalls, repairs,
warranties, and liabilities.
Now we will focus our attention on the information flow that is essential for the quality control
loop.
3. Information flow
This paper concentrates on the contribution of
Service to quality improvement. Therefore, in this
section we will mainly focus on the information
flow from service centres to Development and Production. We will also summarise the basic idea of
the Maturity Index on Reliability (cf. [3] and [4])
as this index structures the information flow.
The overall aim of this section is to prepare for the
discussion on the metrics in the following section.
3.1. Information flow leaving a service centre
As has been mentioned before, service centres are
in a unique position to collect field failure data and
data about customer use, and to analyse the relation between them. This information is very helpful
in several phases of the PCP:
• field failure data determines the critical parts of
a design in relation to customer use,
• field failure data demonstrates problem areas in
production,
• information about customer use is vital for the
determination of the test programme. In particular, tests are necessary to find out whether the
field failures as collected by Service have effectively been anticipated in design and production,
• after release of a new product, field failures must
be communicated to development and production as soon as possible. Serious quality problems could lead to a disaster like a recall of
a whole generation of products.
3.2. Maturity index on reliability
The basic idea behind the maturity index on
reliability (MIR) is that the quality improvement
loop over the PCP requires a full exchange of
information between all parties. As this is exactly
the conclusion of Section 2, it makes sense to see
how the MIR principle can be used in order to
utilise the unique position a service centre potentially has in the quality improvement loop.
We only give a short description of the five MIR
levels; for a more extensive discussion we refer to
[3,4]. The MIR levels are structured in such a way
that a higher level includes a lower level.
MIR level 0: no quantitative information available
about the field behaviour
The manufacturer has no quantitative evidence
of the field behaviour of the products and, consequently, there is no feedback system from Service to
Development and Production.
MIR level 1: quantitative information available
about the number of failures
There is a basic feedback system that gives
quantitative information indicating
• the performance during production,
• the performance in the field.
This information must be in the form of metrics
that objectively describe the performance of the
process between production and the field (see Section 4).
MIR level 2: quantitative information available
about the origin of the problems
The feedback system contains information about
the origin of the problems. There is quantitative
information about:
• primary causes: design, material, production
process, customer use,
• secondary location of failure, i.e. the location
within the primary cause.
MIR level 3: detailed information available on root-cause level
There is detailed information on root-cause level
for all dominant failures, such that causes of failures
in previous products and processes can be translated into risks in future products and processes.
MIR level 4: continuous improvement via an adaptive
system
The system is adaptive. Techniques and tools are
in place in the organisation to anticipate risks for
new products and processes and to eliminate these
risks where necessary. Local optimisation is replaced by global optimisation.
Obviously service centres are in a position to
collect field information that is valuable for each
MIR level. Collecting that data is, however, not
easy. For MIR level 1 the number of failed products
in the field must be counted. But in order to be
informative, it is essential to record all vital information about a failed product, like type, serial
number, date of production, utilisation, etc. This
data can be translated into business information if
information is also available about the number of
products on the market.
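As an illustration only, the kind of failure record described above could be represented as follows. This is a minimal sketch: the class name `FailureRecord`, its field names, and the helper `failure_fraction` are hypothetical and do not come from the paper. It merely shows how a raw count of failed products turns into a fraction once the number of products on the market is known.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class FailureRecord:
    """One failed product as registered by a service centre (hypothetical layout)."""
    product_type: str
    serial_number: str
    production_date: date
    usage_hours: float  # one possible measure of 'utilisation'


def failure_fraction(records: Iterable[FailureRecord],
                     product_type: str,
                     units_on_market: int) -> float:
    """Fraction of the products on the market, of one type, that were registered as failed."""
    if units_on_market <= 0:
        raise ValueError("units_on_market must be positive")
    failed = sum(1 for r in records if r.product_type == product_type)
    return failed / units_on_market


if __name__ == "__main__":
    demo = [
        FailureRecord("TV-A", "S1", date(1999, 1, 5), 120.0),
        FailureRecord("TV-A", "S2", date(1999, 2, 9), 40.0),
        FailureRecord("VCR-B", "S3", date(1999, 3, 1), 15.0),
    ]
    print(failure_fraction(demo, "TV-A", 1000))
```

The product names and counts in the demo are invented; in practice the number of units on the market is itself uncertain, which is one reason the collection of this data is not easy.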
In this paper we concentrate on the role of service centres in the information flow that is needed
in order to reach MIR level 1. It is quite clear that
service centres are invaluable for the other MIR
levels as well. Information for MIR levels 2, 3 and 4,
for example, can be collected by a service centre
that has the right knowledge about design and
production. This will be the topic of forthcoming
papers.
If all information is collected and effectively used,
a company is able to reach MIR level 4. It is then
a learning organisation where continuous improvement is a matter of course. For MIR level 1, however, it suffices to have useful metrics that describe
the product quality performance in production and
in the field. This is the subject of the next section.
4. Metrics
In this section we will first (in Section 4.1) analyse
the type of information that is needed for MIR level
1. In Section 4.2 we will discuss some metrics that
are presently used in industry. Finally, in Section
4.3, we will come up with operational definitions of
two metrics that are based on Section 4.1 and that
in our opinion are very informative.
4.1. Quality information for MIR level 1
On MIR level 1 there is a basic feedback system
that gives quantitative information indicating
the number of problems during production and the
number of field failures. In order to be informative,
these numbers must be seen in proportion to the
number of products actually produced, respectively, in use. The metrics must do more than just
describe the situation, they must also be able to
detect changes over time. In production it is relatively easy to collect the indispensable information.
It is much harder to get useful quantitative information about field failures.
Before we go into the problem of collecting
quantitative information about failures, in Section
4.1.1 we first summarise two important reliability
concepts. In Sections 4.1.2 and 4.1.3 we concentrate on information about problems in production
and in the field, respectively. Finally, in Section
4.1.4 we give a list of criteria that give insight in the
costs of (non-)quality. For MIR level 1 companies
must translate the quantitative information about
fall-off in production and about field failures into
costs.
4.1.1. Reliability concepts
As high-volume consumer products are seldom
repaired more than once, we will only mention two
reliability concepts that are based on the time to
(first) failure. This means that we do not discuss
reliability models for repairable systems.
4.1.1.1. Reliability. Product reliability is, for
example according to [7], defined as the probability
$R(t)$ that a product starting at time zero will survive
a given time $t$:
\[ R(t) = P(T \geq t) = \int_t^{\infty} f(\tau)\, d\tau, \]
where $T$ is a suitable continuous random variable
representing time to failure with failure probability
density function $f(t)$.
In applications the probability density function
$f(t)$ is usually not known. If an estimate of $f(t)$ is
available, then this estimate gives an estimate of
$R(t)$. If there is no estimate of $f(t)$ available and if all
products are taken into use at the same time, say
$t = 0$, then one of the common non-parametric
estimators of $R(t)$ can be used (cf. [7]). In this paper
we will use
\[ \tilde{R}(t) = \frac{N(t)}{N + 1}, \]
where $N(t)$ is the number of unfailed products on
the market at time $t$, and $N \equiv N(0)$.
In the more realistic situation that 'identical'
products are taken into use at different time points
during a period of, say, half a year, the estimation
procedure is more complex; we come back to this in
Section 4.3.
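The simple estimator above, the number of unfailed products at time t divided by N + 1, can be sketched in a few lines. The function name `reliability_estimate` and the input layout (a list of observed failure ages plus the number of products sold at t = 0) are assumptions made for this illustration, and the sketch only covers the case in which all products are taken into use at the same time.

```python
from typing import Sequence


def reliability_estimate(failure_times: Sequence[float], n_sold: int, t: float) -> float:
    """Non-parametric estimate N(t) / (N + 1) of the reliability R(t).

    failure_times: observed times to failure (same time unit as t)
    n_sold:        N = N(0), number of products taken into use at t = 0
    """
    if n_sold <= 0:
        raise ValueError("n_sold must be positive")
    if len(failure_times) > n_sold:
        raise ValueError("more failures than products sold")
    # A product counts as unfailed at time t if it has not failed before t.
    failed_before_t = sum(1 for ft in failure_times if ft < t)
    n_unfailed = n_sold - failed_before_t
    return n_unfailed / (n_sold + 1)


if __name__ == "__main__":
    # Invented data: 9 products sold, failures observed at 1.0, 2.5 and 4.0 months.
    print(reliability_estimate([1.0, 2.5, 4.0], n_sold=9, t=3.0))
```

Note that with the denominator N + 1 the estimate at t = 0 is N / (N + 1) rather than exactly 1.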
4.1.1.2. Hazard function. If a product is still operating at time $T$, then we are not interested anymore
in the unconditional probability to survive a particular time point $t$. In this case we are interested in
the conditional probability to survive $T + t$ given
that the product already survived $T$. The concept of
hazard function (also called hazard rate, failure rate
or force of mortality, see [8]), covers this idea. The
hazard function $\lambda(t)$ represents the instantaneous
failure probability and is defined in the following
way:
\[ \lambda(t) \equiv \frac{f(t)}{R(t)} = -\frac{1}{R(t)}\, \frac{dR(t)}{dt}. \]
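The two forms of this definition agree because of a standard step, spelled out here for completeness. The last line, the reliability written in terms of the hazard function, is a well-known consequence and assumes that all products function at time zero, i.e. $R(0) = 1$.

```latex
\begin{align*}
\frac{dR(t)}{dt} &= \frac{d}{dt}\int_t^{\infty} f(\tau)\,d\tau = -f(t), \\
\lambda(t) &= \frac{f(t)}{R(t)} = -\frac{1}{R(t)}\,\frac{dR(t)}{dt} = -\frac{d}{dt}\ln R(t), \\
R(t) &= \exp\!\left(-\int_0^{t}\lambda(s)\,ds\right) \qquad \text{(assuming } R(0) = 1\text{)}.
\end{align*}
```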
If all products are taken into use at the same time
$t = 0$, the hazard function can be estimated via $\tilde{\lambda}(t)$
in the following way. First we notice that the definition of the hazard function shows that
\[ \lambda(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, \frac{R(t) - R(t + \Delta t)}{R(t)}. \]
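Next we divide the relevant interval $(0, T)$ in
$k$ subintervals with length $\Delta\tau$, such that $T = k\Delta\tau$.
Now let $\tau_i = i\Delta\tau$, then on the interval $(\tau_i, \tau_i + \Delta\tau)$
the hazard function $\lambda(t)$ can be estimated by
\[ \tilde{\lambda}(\tau_i, \tau_i + \Delta\tau) \equiv \frac{\tilde{R}(\tau_i) - \tilde{R}(\tau_i + \Delta\tau)}{\Delta\tau\, \tilde{R}(\tau_i)} \]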
or, with $M(t)$ being the number of failures on the interval $(0, t)$ and $N(t)$ the number of products on the market
at time $t$, this leads to
\[ \tilde{\lambda}(\tau_i, \tau_i + \Delta\tau) = \frac{M(\tau_i + \Delta\tau) - M(\tau_i)}{\Delta\tau\, N(\tau_i)}. \]
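The interval estimator of the hazard function can be sketched as follows, again only for the case in which all products are taken into use at t = 0. The function name `hazard_estimates` and the input layout are assumptions of this sketch; M is taken as the number of failures observed before a time point and N as the number of products sold minus M.

```python
import math
from typing import List, Sequence


def hazard_estimates(failure_times: Sequence[float], n_sold: int,
                     horizon: float, k: int) -> List[float]:
    """Hazard estimate on each of k equal subintervals of (0, horizon).

    For subinterval i the estimate is
        (M(tau_i + dtau) - M(tau_i)) / (dtau * N(tau_i)),
    with M(t) the number of failures before t and N(t) = n_sold - M(t).
    """
    if k <= 0 or horizon <= 0:
        raise ValueError("k and horizon must be positive")
    dtau = horizon / k
    estimates = []
    for i in range(k):
        lo, hi = i * dtau, (i + 1) * dtau
        m_lo = sum(1 for ft in failure_times if ft < lo)  # M(tau_i)
        m_hi = sum(1 for ft in failure_times if ft < hi)  # M(tau_i + dtau)
        n_lo = n_sold - m_lo                              # N(tau_i)
        if n_lo <= 0:
            estimates.append(math.nan)  # no products left at risk
        else:
            estimates.append((m_hi - m_lo) / (dtau * n_lo))
    return estimates


if __name__ == "__main__":
    # Invented data: 10 products, 4 failures, 4 intervals of one month each.
    print(hazard_estimates([0.5, 1.5, 1.7, 3.2], n_sold=10, horizon=4.0, k=4))
```

In this invented example the first interval has 1 failure among 10 products at risk, the second 2 failures among the remaining 9, and so on.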
The definition of the time t needs some attention.
As f(t) is the probability density function of the
failure time, t is the time since the beginning of the
failure behaviour. For consumer products the interpretation t = time since sales seems reasonable,
even though sometimes the product will not be
bought by the end user.
If all products are sold at time zero, then M(t)
denotes the number of failures at time t in the class
of all products (all sold at time zero). If the products
are sold at different times, as is normally the case,
then the estimation of the hazard function is more
complicated. We come back to this in Section 4.3.
For a more thorough discussion of estimators of
the hazard function we refer to [9,10].
4.1.2. Information about production problems
In production all kinds of product-related problems can occur. Concentrating on quality and reliability it is important to collect data that
demonstrates whether particular production processes need improvement. Therefore, it is important
to record the following characteristics as a function
of time (per process step and on component, module and product level):
• the fraction of scrap,
• the fraction of rework,
• the productivity.
If the performance on these characteristics is
affected by particular circumstances, these circumstances must be recorded as well. Common relevant
circumstances are the following:
• the time of production,
• characteristics of the batch,
• the production speed,
• the operator and/or shift.
As it is in principle simple to collect all this
information and it is also quite clear how this
information must be analysed, in the rest of this
section we concentrate on metrics that are based on
field information.
4.1.3. Information about field problems
Relevant quantitative information about the
field behaviour of products/subsystems/modules/
components is given by the following characteristics (see also [11]):
• the fraction of customer complaints within the
warranty period, or, more generally,
• the fraction of customer complaints within a particular time-interval,
• the fraction of zero-hour failures (dead-on-arrival),
• the hazard function,
• the segmentation of customer complaints over
the categories: design problem, production problem, component problem, product level problem,
customer use, no fault found.
Just as in the case of production problems, the
field behaviour of products is also affected by several
characteristics. The most important ones are the
following:
• the time of production,
• the date the product is put into use,
• the quantity of use (amount of time, number of
cycles, etc.),
• the way of use (whether or not according to the
user specifications),
• the environment in which the product has been
used (for example warm and humid or cold and
dry).
The main difference between the information about
problems in production and problems in the field is
the influence of the factor time. The fall-off in production concerns instantaneous failures and in
principle the performance of production is well
known at any moment. A quantitative analysis of
the number of field failures is much more complicated, because at time t only the number of failures
that occurred before time t is known. This does not
affect the estimation of, for example, the fraction of
zero-hour failures, but it does affect the estimation
of the reliability and the hazard function (see Section 4.3).
The estimation of some characteristics is also
complicated by the fact that it is far from easy, even
during the warranty period, to determine the total
number of sold products, the total number of products still in use, and the total number of customer
complaints.
4.1.4. Costs of (non-)quality
Information about the performance on quality is
not complete without a full view on the costs that
are related with making quality, or, better, making
non-quality. Some metrics for these costs are:
• costs of the design process itself,
• costs of design changes,
• costs of process changes,
• warranty costs,
• costs of service activities,
• product liability costs,
• image costs and costs of losing customers,
• extra costs for the customer.
In this paper we will not go into this in more detail,
we will concentrate on reliability metrics.
4.2. Current metrics
Some of the metrics that are presently used in
industry will be presented below and their advantages and disadvantages will be explained. In Section 4.2.1 the classical field call rate will be
presented and in Section 4.2.2 the warranty call
rate. We will only discuss field data, because collecting and analysing data about fall-off in production is not a serious problem.
A service centre is, of course, the primary source
for quantitative information about field problems.
This does not mean that it is easy to get quantitative information about the field performance on
product level. For example, it is hardly possible to
collect quantitative information about field problems expressed as the percentage of products that
fail within a one-year warranty period. The reason
is, of course, that service centres normally see, even
within the warranty period, only part of all products that fail.
4.2.1. Classical field call rate
The classical field call rate is used to monitor the
number of field failures of a given product and was
developed for logistic purposes. As the hazard function depends on the age of the product, the classical field call rate has, in
those cases where not all products are sold simultaneously, little to do with the number of repairs that
are expected in a certain time interval.
The definition of the classical field call rate
$\mathrm{FCR}_{\text{classical}}$ is very close to the definition of the
natural estimator $\tilde{\lambda}(\tau, \tau + \Delta\tau)$ of the hazard function (see Section 4.1.1). On the interval $(\tau_i, \tau_i + \Delta\tau)$
the $\mathrm{FCR}_{\text{classical}}$ is estimated by
\[ \mathrm{FCR}_{\text{classical}}(\tau_i, \tau_i + \Delta\tau) = \frac{M(\tau_i + \Delta\tau) - M(\tau_i)}{\Delta\tau\, N(\tau_i)}, \]
where $M(t)$ is the number of failures up to time $t$ and $N(t)$
the number of products on the market at time $t$.
The difference is in the meaning of the time $t$. In
$\tilde{\lambda}(\tau, \tau + \Delta\tau)$ time $t$ is measured from the moment the
failure behaviour starts. For consumer products
this will usually be close to the time since sales. In
the estimator of $\mathrm{FCR}_{\text{classical}}$, however, $t$ is the time
since market introduction of the product. It is important to note that the expression does not take into
account the age of a product at the moment of
failure, it just uses the total number of failures.
Furthermore, it is important to realise that not all
products are sold at time t = 0 and this reduces the
value of the estimator of the classical field call rate,
just as it influences the meaning of the estimator of
the hazard function.
Another disadvantage of the classical field call
rate is the fact that in the early phases of the life of
a product and in the last phases, the number of
products actually in use on the market, N(t), has
a high level of uncertainty.
The FCR is used to monitor the number of field
failures of a given product and was developed for
logistic purposes, like the estimation of the number
of spare parts that will be necessary at a given
moment in time and at a given location. As the
hazard function usually is a function of the age of
the product, the FCR has, in those cases where not
all products are sold simultaneously, little to do
with the number of repairs that are expected in
a certain time interval.
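A small numerical sketch may make this point concrete. It is not taken from the paper: the function name `classical_fcr`, the cohort sizes, and the age-dependent hazard values are all invented, and expected (fractional) failure counts are used instead of observed ones. The sketch computes the classical field call rate per calendar month when equal batches are sold in successive months and the hazard is high in the first month of use.

```python
from typing import Sequence


def classical_fcr(cohort_sizes: Sequence[float],
                  age_hazard: Sequence[float],
                  month: int) -> float:
    """Classical field call rate on the calendar interval (month, month + 1).

    cohort_sizes[j]: number of products sold at the start of calendar month j
    age_hazard[a]:   assumed failure probability during month a of a product's life
    Returns expected failures in the interval divided by the expected number
    of products on the market at the start of the interval (dtau = 1 month).
    """
    failures = 0.0
    on_market = 0.0
    for sold_at, size in enumerate(cohort_sizes):
        age = month - sold_at
        if age < 0:
            continue  # this cohort has not been sold yet
        survivors = float(size)
        for a in range(age):
            survivors *= 1.0 - age_hazard[a]
        on_market += survivors
        failures += survivors * age_hazard[age]
    return failures / on_market


if __name__ == "__main__":
    sizes = [1000, 1000, 1000]
    hazard_by_age = [0.10, 0.02, 0.02]
    for m in range(3):
        print(m, classical_fcr(sizes, hazard_by_age, m))
```

In calendar month 1 the market contains 900 one-month-old products and 1000 new ones, so the field call rate is 118/1900, roughly 0.062. That value equals neither the assumed hazard of a new product (0.10) nor that of a one-month-old product (0.02), which is the sense in which the metric mixes product ages.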
4.2.2. Warranty call rate
A different approach is the so-called warranty
package method. The warranty package method is
used especially for financial purposes. The definition of the warranty call rate $\mathrm{WCR}_{\text{warranty}}$ calculated according to the warranty package
method is very close to the definition of $\mathrm{FCR}_{\text{classical}}$.
On the interval $(\tau_i, \tau_i + \Delta\tau)$ the warranty call rate
is estimated by
\[ \mathrm{WCR}_{\text{warranty}}(\tau_i, \tau_i + \Delta\tau) = \frac{M_w(\tau_i + \Delta\tau) - M_w(\tau_i)}{\Delta\tau\, N_w(\tau_i)}, \]
where $M_w(t)$ is the number of failures up to time $t$ of
products within warranty and $N_w(t)$ the number of
products within warranty on the market at time $t$.
As the name of the model indicates, the main
focus in this model is on warranty aspects of products: what fraction of the products fails during the
warranty period. Therefore, although the formula
is mathematically close to the formula for the classical field call rate, this metric uses a kind of moving
time-window and therefore will lead to conceptually different results. And, again, the expression
does not take into account the age of a product at
the moment of failure.
From the foregoing it will be clear that the field
call rate and the warranty call rate are not very
informative. Furthermore, as products are normally taken into use at different time points, a more
sophisticated estimator of the hazard function is
necessary than the one given in Section 4.1.1. This
is the subject of Section 4.3.
4.3. New metrics
In Section 4.3.1 we present an estimator of the
reliability and in Section 4.3.2 an estimator of the
hazard function. Both estimators make full use of
all available data.
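As a preview of how such estimators pool the data, the following minimal sketch works on counts per sales cohort. It follows the notation introduced in Section 4.3.1: n[(j, i)] is the number of products sold at time point j that still function at time point i, and d[(j, i)] is the number of those products that fail in interval i. The function names, the dictionary layout, and the example counts are assumptions of this sketch, and the summation limits are chosen so that only data observed up to the evaluation time point j are used.

```python
from typing import Dict, Tuple

Counts = Dict[Tuple[int, int], int]


def conditional_survival(n: Counts, d: Counts, j: int, i: int) -> float:
    """Pooled estimate at time point j of P(T >= i units | T >= i - 1 units)."""
    if not 1 <= i <= j:
        raise ValueError("need 1 <= i <= j")
    cohorts = range(j - i + 1)  # cohorts old enough to have been observed for i units
    failed = sum(d[(k, k + i)] for k in cohorts)
    at_risk = sum(n[(k, k + i - 1)] for k in cohorts)
    return 1.0 - failed / at_risk


def reliability(n: Counts, d: Counts, j: int, i: int) -> float:
    """Product-limit estimate at time point j of surviving i time units."""
    r = 1.0
    for m in range(1, i + 1):
        r *= conditional_survival(n, d, j, m)
    return r


def hazard(n: Counts, d: Counts, j: int, i: int) -> float:
    """Estimate at time point j of failing right after surviving i units (0 <= i <= j - 1)."""
    if not 0 <= i <= j - 1:
        raise ValueError("need 0 <= i <= j - 1")
    cohorts = range(j - i)
    failed = sum(d[(k, k + i + 1)] for k in cohorts)
    survived = sum(n[(k, k + i)] for k in cohorts)
    return failed / survived


if __name__ == "__main__":
    # Invented counts: 100 products sold at time point 0, 200 at time point 1,
    # everything observed up to time point 2.
    n = {(0, 0): 100, (0, 1): 90, (0, 2): 81, (1, 1): 200, (1, 2): 190}
    d = {(0, 1): 10, (0, 2): 9, (1, 2): 10}
    print(reliability(n, d, j=2, i=2), hazard(n, d, j=2, i=0), hazard(n, d, j=2, i=1))
```

In this invented example the first-month survival is pooled over both cohorts (20 failures out of 300 products), whereas the second-month survival can only use the older cohort (9 failures out of 90), which is how right-censored cohorts still contribute everything they have.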
4.3.1. Reliability
After market introduction the number of products
on the market increases (for some time) and with it
the number of defects. If the hazard function is estimated at time $t$, the data are censored on the right, i.e.
some products have not yet failed, and their failure
times are known only to be beyond their present
running time. This type of data is known in the
literature as type I multiply censored data. Instead of
the estimator $\tilde{\lambda}(\tau, \tau + \Delta\tau)$ given in Section 4.1.1, it is
much better to use the product-limit estimator of the
reliability function, as developed by Kaplan and Meier
[12]. The product-limit estimator is usually given for
the situation that all 'products' start at the same time
$t = 0$ and that the censoring at the right is not the
same for all products. In our situation the products
are put into use at different moments and all products
are censored at the same moment. Of course, it is
possible to transform the failure time data by shifting
all starting points to $\tau = 0$. However, in order to
come up with expressions that are relatively easy to
use in industry, we prefer not to do this, but to use the
data as they are. This leads to the following derivation
of the reliability.
We divide the time axis into intervals of, say, one month. Let $\tau_i$ denote the endpoint of the $i$th interval $I_i$, with $\tau_0 = 0$ (see Fig. 3). To keep things simple, we act as if sales only take place at the time points $\tau_0, \tau_1, \tau_2, \ldots$ Furthermore, we suppose that all products are functioning at the time of sales, but that they can fail a split second afterwards.

Fig. 3. Time axis with intervals.

Define $n_{ji}$ as the number of products functioning at $\tau_i$ of all products sold at $\tau_j$, $d_{ji}$ as the number of products failing in interval $I_i$ of all products sold at $\tau_j$, and $T$ as the age of a product at the time of failing. This implies that a product that is sold at $\tau_j$ survives $i$ time units if it is still functioning at $\tau_{j+i}$; for products sold at $\tau_j$ we write this event as $T \ge \tau_{j+i}$.

The Kaplan–Meier estimator of the reliability of a product to survive $i$ time units, say $R(i\ \text{units})$, is based on the factorisation

$$R(i\ \text{units}) = P(T \ge i \mid T \ge i-1)\, P(T \ge i-1 \mid T \ge i-2) \cdots P(T \ge 2 \mid T \ge 1)\, P(T \ge 1),$$

where all ages are measured in time units.

For a product that is sold at time $\tau_j$, the probability $P(T \ge 1\ \text{unit})$ can be estimated by

$$\hat{P}_j(T \ge 1\ \text{unit}) = \hat{P}_j(T \ge \tau_{j+1}) = 1 - \frac{d_{j,j+1}}{n_{j,j}},$$

and $P(T \ge i\ \text{units} \mid T \ge i-1\ \text{units})$, with $i \ge 2$, can be estimated by

$$\hat{P}_j(T \ge i\ \text{units} \mid T \ge i-1\ \text{units}) = \hat{P}_j(T \ge \tau_{j+i} \mid T \ge \tau_{j+i-1}) = 1 - \frac{d_{j,j+i}}{n_{j,j+i-1}}.$$

At $\tau_j$ we estimate $P(T \ge 1\ \text{unit})$ by combining all data for products sold at $\tau_0, \tau_1, \ldots, \tau_{j-1}$. This gives the following estimator $p_{j1}$ of $P(T \ge 1\ \text{unit})$:

$$p_{j1} \equiv 1 - \frac{\text{total number of products sold at or before } \tau_{j-1} \text{ that failed in the first month after sales}}{\text{total number of products sold at or before } \tau_{j-1}} = 1 - \frac{d_{01} + d_{12} + \cdots + d_{j-1,j}}{n_{00} + n_{11} + \cdots + n_{j-1,j-1}}.$$

Analogously, at $\tau_j$ we estimate $P(T \ge i\ \text{units} \mid T \ge i-1\ \text{units})$, with $j \ge i \ge 2$, by

$$p_{ji} \equiv 1 - \frac{\sum_{k=0}^{j-i} d_{k,i+k}}{\sum_{k=0}^{j-i} n_{k,i+k-1}}.$$

Therefore, at time $\tau_j$ the following estimator of the reliability is available:

$$\hat{R}_j(i\ \text{units}) = \prod_{k=1}^{i} p_{jk}, \quad \text{with } i \le j.$$

4.3.2. Hazard function
Let $h(i\ \text{units})$ be the hazard function of a product at $i$ time units after it has been sold. From the analysis in the foregoing section it follows that this hazard function can be estimated at time $\tau_j$ by combining all available information. That means that at time $\tau_j$ the instantaneous failure probability after surviving $i$ time units, $h(i\ \text{units})$, is estimated by

$$\hat{h}_j(i\ \text{units}) = \frac{\text{total number of products that failed immediately after surviving } i \text{ time units}}{\text{total number of products that survived } i \text{ time units}} = \frac{d_{0,i+1} + d_{1,i+2} + \cdots + d_{j-i-1,j}}{n_{0,i} + n_{1,i+1} + \cdots + n_{j-i-1,j-1}}, \quad \text{with } i \le j-1.$$

If this estimator is plotted as a function of $i$, it is possible to compare the graph with the roller-coaster curve. In this way additional information about the failure mechanisms can be collected (see [27]). In a forthcoming paper we will give a practical example of this estimator for high-volume consumer products. Of course, in this way it is also possible to predict the fraction of failures under warranty that can be expected, and this estimate is far superior to the warranty call rate.
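The pooled estimators of Sections 4.3.1 and 4.3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested dictionaries `n` and `d`, the function names, and all counts are hypothetical and chosen for this example, not taken from field data. The sketch also assumes that every failure is reported and that the only censoring is the end of the observation period at $\tau_j$, so that for age $i$ only the cohorts sold at $\tau_0, \ldots, \tau_{j-i}$ contribute.

```python
from math import prod

# Hypothetical cohort counts (illustrative only, not real field data).
# n[k][m]: products sold at tau_k that are still functioning at tau_m
# d[k][m]: products sold at tau_k that fail in interval I_m = (tau_{m-1}, tau_m]
n = {
    0: {0: 100, 1: 95, 2: 93, 3: 92},
    1: {1: 200, 2: 192, 3: 188},
    2: {2: 150, 3: 144},
}
d = {
    0: {1: 5, 2: 2, 3: 1},
    1: {2: 8, 3: 4},
    2: {3: 6},
}


def conditional_survival(n, d, j, i):
    """p_{ji}: pooled estimate at tau_j of P(T >= i units | T >= i-1 units)."""
    cohorts = range(j - i + 1)  # cohorts already observed for i full units
    failed = sum(d[k][k + i] for k in cohorts)
    at_risk = sum(n[k][k + i - 1] for k in cohorts)
    return 1.0 - failed / at_risk


def reliability(n, d, j, i):
    """Estimated R_j(i units): product of p_{jk} for k = 1..i (needs i <= j)."""
    return prod(conditional_survival(n, d, j, k) for k in range(1, i + 1))


def hazard(n, d, j, i):
    """Estimated h_j(i units): failures right after age i over survivors of age i
    (needs i <= j-1)."""
    cohorts = range(j - i)
    failed = sum(d[k][k + i + 1] for k in cohorts)
    survived = sum(n[k][k + i] for k in cohorts)
    return failed / survived


if __name__ == "__main__":
    j = 3  # data are evaluated at tau_3
    for i in range(1, j + 1):
        print(f"R_hat({i} units) = {reliability(n, d, j, i):.4f}")
    for i in range(j):
        print(f"h_hat({i} units) = {hazard(n, d, j, i):.4f}")
```

With these made-up counts, `1 - reliability(n, d, 3, 3)` estimates the fraction of products failing within three months of sale, which is the kind of warranty prediction mentioned above, and the values of `hazard(n, d, 3, i)` for i = 0, 1, 2 are the points one would plot against the roller-coaster curve. Note that in this discrete setting the hazard estimate at age i equals one minus the conditional survival estimate for unit i + 1.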
5. Conclusion
Based on the information in the previous sections, the following conclusions can be drawn:
• In mass production the first contact between a customer and the manufacturer takes place in a service centre; service centres are therefore the most suitable place for collecting information about the behaviour of products under real operating conditions.
• Service centres are focused on logistics and not on their contribution to quality improvement. As a consequence, valuable information is not collected.
• Service centres communicate with other departments, such as production and development, only about serviceability.
• Some metrics used in industry, such as the classical field call rate and the warranty call rate, give hardly any useful information. In particular, these metrics give no information about the quality of the products and processes.
• There are simple metrics, based on the Kaplan–Meier estimator of the reliability and on the hazard function, that give information that is useful for logistics purposes as well as for quality improvement.
It will be clear that service centres have to be prepared for their new assignment. This holds even more strongly when the contribution of service centres to MIR levels 2, 3 and 4 is considered, because a thorough knowledge of design and production is then indispensable. This is the subject of ongoing research.
References
[1] T.P.J. Berden, A.C. Brombacher, P.C. Sander, The building bricks of product quality: An overview of some basic concepts and principles, International Journal of Production Economics 67 (1) (2000) 3–15.
[2] V.T. Petkova, P.C. Sander, A.C. Brombacher, The role of the service centre in improvement processes, Quality and Reliability Engineering International 15 (1999).
[3] A.C. Brombacher, MIR: Covering non-technical aspects of IEC61508 reliability certification, Reliability Engineering and System Safety (1999).
[4] P.C. Sander, A.C. Brombacher, MIR: The use of reliability information flows as a maturity index for quality management, Quality and Reliability Engineering International 15 (1999).
[5] Y. Akao (Ed.), Quality Function Deployment, Productivity Press, Cambridge, 1990.
[6] A.C. Brombacher, Predicting reliability of high volume consumer products: Some experiences 1986–1996, Symposium "The Reliability Challenge" organised by Finn Jensen Consultancy, London, 1996.
[7] E.E. Lewis, Introduction to Reliability Engineering, 2nd
Edition, Wiley, New York, 1996.
[8] H.E. Ascher, H. Feingold, Repairable Systems Reliability:
Modelling, Inference, Misconceptions and their Causes,
Marcel Dekker, New York, 1984.
[9] S.H. Lo et al., Density and hazard rate estimation for censored data via strong representation of the Kaplan–Meier estimator, Probability Theory and Related Fields 80 (1989) 461–473.
[10] W. Stute, Strong and weak representations of cumulative hazard function and Kaplan–Meier estimators on increasing sets, Journal of Statistical Planning and Inference (1994) 315–329.
[11] J.W. Wesner, J.M. Hiatt, D.C. Trimble, Winning with
Quality, Addison-Wesley, New York, 1995.
[12] E.L. Kaplan, P. Meier, Nonparametric estimation from incomplete observations, Journal of the American Statistical Association 53 (1958) 457–481.
The use of quality metrics in service centres
Valia T. Petkova*, Peter C. Sander, Aarnout C. Brombacher
Eindhoven University of Technology, Faculty of Technology Management/Section Product and Process Quality, P.O. Box 513,
5600 MB Eindhoven, The Netherlands
Abstract
In industry it is not well realised that a service centre is potentially one of the major contributors to quality
improvement. Service is able to collect vital information about the "eld behaviour of products in interaction with
customers. If this information is well analysed and communicated, the recurrence of old problems in new products will
drastically be reduced and so will the expenses on recalls, repairs, warranties, and liabilities. In this paper we discuss the
kind of information a service centre has to collect and some quality-related metrics that organisations use, like the "eld
call rate, or should use, like the hazard function. ( 2000 Elsevier Science B.V. All rights reserved.
Keywords: Reliability; Service centre; Quality improvement; Quality metrics; Maturity index on reliability (MIR)
1. Introduction
For high-volume consumer products there is
a "erce and world-wide competition, in which the
four most important business drivers are:
f Functionality: As a result of a fast technological
development, the functionality of products increases sharply. For example, any 10-year-old
photo camera is as to functionality far behind
a recent one in the same price category.
f Time to market: The technological development
is so fast that products are outdated in months
instead of in years. This has as a consequence
that the time to market has to be very short,
otherwise a product is already obsolete by the
time it reaches the market. Desktop computers
are a clear example.
* Corresponding author. Tel.: #31-40-247-5944; fax: #3140-246-7497.
E-mail address: [email protected] (V.T. Petkova).
f Quality and reliability: Customers expect excellent quality even for relatively inexpensive products. In line with this the warranty period is
extended from half a year to sometimes three
years, or even longer. A short time to market and
excellent quality are con#icting requirements.
For example, a serious test programme needs
time, and if the tests show that improvement
actions are necessary, then that takes even more
time.
f Proxtability: The fact that products are obsolete
in months, has as a consequence that there is
a fast price erosion. One of the most striking
examples is probably the price of desktop computers in relation to their functionality. Because
of this price erosion, the necessary investments,
and the heavy competition, it is not easy to make
a pro"t on consumer products.
In order to survive as a company producing consumer products, it is vital to be best in class on one
or more business drivers. This means that there
0925-5273/00/$ - see front matter ( 2000 Elsevier Science B.V. All rights reserved.
PII: S 0 9 2 5 - 5 2 7 3 ( 0 0 ) 0 0 0 0 7 - 4
28
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
Fig. 1. Feedback control loop.
must be a coherent and company wide approach, in
which every department knows its role and understands that it is not the pro"tability of the individual department that counts; what counts is the
pro"tability of the company as a whole. This means
that departments must be judged by their contribution to the company wide vital business processes.
Functionality, time-to-market, quality and pro"tability are all the result of business processes, and
these processes must constantly be improved and
kept in line with the changing environment in order
to become or stay best in class and consequently
survive as a company.
The availability of feedback control loops is important for the improvement of business processes.
In principle a feedback control loop has a simple
structure. There is a process and the output of the
process has to ful"l certain criteria. In order to
check whether the output is in accordance with the
speci"cations, some measurements are done. If the
measurements make clear that there is a di!erence
between the output and the criteria, some action is
necessary (cf. Fig. 1). For an overview about dealing
with instability of business processes we refer to [1]
in this issue. The control of processes is complicated by the fact that normally there are all kinds of
disturbances, in particular in the input, in the process, in the measurements and even in the action.
In an organisation feedback control loops are
necessary on all levels. On a low level they are used
in production, examples are automatic control
mechanisms and statistical process control. On
a high-level feedback loops have to be used in order
to make sure that departments keep in line with the
overall company goal.
In a previous paper [2], we demonstrated that
service centres are an essential element in the
control loop aimed at quality improvement.
In the present paper we discuss some metrics that
presently are used in service centres and that at best
have some value for logistic purposes, but are not
related to the business drivers. Subsequently, some
metrics are discussed that focus on the quality of
the products.
The structure of the paper is as follows. In Section 2 the new role for service centres is shortly
described, namely the contribution of service
centres to quality improvement. In Section 3 the
information #ow is presented that facilitates the
new role of the service centre. This information #ow
is structured by the maturity index on reliability (cf.
[3,4]). In Section 4 it is discussed what metrics are
valuable in the process of enhancing the quality
and reliability of consumer products. We consider
two metrics that are currently used in industry, and
we propose two metrics that are much more
tailored to quality improvement. The conclusions
are presented in Section 5.
2. Role of service centres in quality improvement
Up to about 10 years ago companies could see
product quality as something &nice to have'. Nowadays it is a must, a boundary condition. Without it
there is no reason to enter the market. A unique
characteristic of a service centre is that there the
customer and the manufacturer have their "rst contact when there is a quality problem. That is, when
there is a mis"t between what the customer expects
and what he gets. This mis"t is the "nal result of the
product creation process (PCP). We de"ne the PCP
in a wide sense, i.e. it includes all business processes
that directly a!ect the "nal product; in particular
the business processes in marketing, development,
production and service, including the suppliers.
In order to prevent a mis"t between the customer's expectations and the product, it makes
sense to &listen to the voice of the customer'. But it is
not easy to integrate the voice of the customer in
the PCP. Of course, some companies systematically
use Quality Function Deployment (cf. [5]), in particular in the automotive industry. But QFD is
usually only used in the design phase. The customer
turns up at the other end of the PCP and that
customer is hardly ever approached to "nd out
whether the product really satis"es his needs and
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
expectations. The only contact between a customer
and the manufacturer is in a service centre when
a customer has a complaint. In particular when the
complaint is covered by the warranty, the service
centre will try to repair the product as fast as
possible and with minimum costs. Service centres
will try to reduce local costs by skipping expensive
and locally non-contributing activities. If a service
centre is not assessed on its contribution to quality
improvement, it has no motive to spend time on
"nding the root cause of the customer's problem
and to communicate this to the other parties in the
PCP. Consequently, there is no information #ow
from service centres to the other parties. The only
information exchange between designers and service centres concerns the serviceability of the products. As far as service is concerned, replacing the
failed modules or the whole product solves the
problem.
Apart from the general remarks in the forgoing
about the possible contribution of service centres to
quality improvement, there are some special circumstances why just now a di!erent role for service
is most advantages. In our opinion the most important ones are the following:
1. Nowadays the "eld problems service centres are
confronted which are of a di!erent nature than
in the past. With the increasing reliability of the
components and the also increasing complexity
of the functionality, component-related reliability problems have become a minority of current
"eld complaints (cf. Fig. 2). As service centres are
close to the customer, they are in a good position
to examine the root cause for all fault categories.
This requires a new approach, because, as mentioned before, up till today a service centre just
replaces components or modules by spare-parts
without looking for the root cause. Actually,
a service centre today is usually not capable of
"nding the root cause, because it has no substantial knowledge about design and production.
The best way forward is, in our view, to intensify the collaboration between service centres
and development and production by exchange
of information and exchange of people (cf. [2]).
2. Especially in high-volume consumer products
there is a high degree of innovation. The more
29
Fig. 2. Observed categories of reliability problems [6].
innovative new products are, the more di$cult it
is to predict the way customers will use them.
Therefore, companies must anticipate unanticipated hidden quality problems and latent
defects shortly after market introduction. As
a consequence, it is of most importance that
especially in this phase there is a detailed and
fast communication between Development, Production, Quality and Reliability, and Service
about all four business drivers. Again, the "rst
contact with the customer is in the service centre,
so that is a perfect place to start.
The conclusion is that on a company level it is
essential to see Service as a department that is
crucial in the control loop over the product creation process. Service is able to collect vital information about the "eld behaviour of products in
interaction with customers. If this information is
well analysed and communicated, the recurrence of
old problems in new products will drastically be
reduced and so will the expenses on recalls, repairs,
warranties, and liabilities.
Now we will focus our attention on the information #ow that is essential for the quality control
loop.
3. Information 6ow
This paper concentrates on the contribution of
Service to quality improvement. Therefore, in this
section we will mainly focus on the information
#ow from service centres to Development and Production. We will also summarise the basic idea of
30
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
the Maturity Index on Reliability (cf. [3] and [4])
as this index structures the information #ow.
The overall aim of this section is to prepare for the
discussion on the metrics in the following section.
MIR level 1: quantitative information available
about the number of failures
There is a basic feedback system that gives
quantitative information indicating
3.1. Information yow leaving a service centre
f the performance during production,
f the performance in the "eld.
As has been mentioned before, service centres are
in a unique position to collect "eld failure data and
data about customer use, and to analyse the relation between them. This information is very helpful
in several phases of the PCP:
This information must be in the form of metrics
that objectively describe the performance of the
process between production and the "eld (see Section 4).
f "eld failure data determines the critical parts of
a design in relation to customer use,
f "eld failure data demonstrates problem areas in
production,
f information about customer use is vital for the
determination of the test programme. In particular, tests are necessary to "nd out whether the
"eld failures as collected by Service have e!ectively been anticipated in design and production.
f After release of a new product, "eld failures must
be communicated to development and production as soon as possible. Serious quality problems could lead to a disaster like a recall of
a whole generation of products.
3.2. Maturity index on reliability
The basic idea behind the maturity index on
reliability (MIR) is that the quality improvement
loop over the PCP requires a full exchange of
information between all parties. As this is exactly
the conclusion of Section 2, it makes sense to see
how the MIR principle can be used in order to
utilise the unique position a service centre potentially has in the quality improvement loop.
We only give a short description of the "ve MIR
levels, for a more extensive discussion we refer to
[3,4]. The MIR levels are structured in such a way
that a higher level includes a lower level.
MIR level 0: no quantitative information available
about the ,eld behaviour
The manufacturer has no quantitative evidence
of the "eld behaviour of the products and, consequently, there is no feedback system from Service to
Development and Production.
MIR level 2: quantitative information available
about the origin of the problems
The feedback system contains information about
the origin of the problems. There is quantitative
information about:
f primary causes: design, material, production
process, customer use,
f secondary location of failure, i.e. the location
within the primary cause.
MIR level 3: detailed information available on rootcause level
There is detailed information on root-cause level
for all dominant failures, such that causes of failures
in previous products and processes can be translated into risks in future products and processes.
MIR level 4: continuous improvement via an adaptive
system
The system is adaptive. Techniques and tools are
in place in the organisation to anticipate risks for
new products and processes and to eliminate these
risks where necessary. Local optimisation is replaced by global optimisation.
Obviously service centres are in a position to
collect "eld information that is valuable for each
MIR level. Collecting that data is, however, not
easy. For MIR level 1 the number of failed products
in the "eld must be counted. But in order to be
informative, it is essential to record all vital information about a failed product, like type, series
number, date of production, utilisation, etc. This
data can be translated into business information if
also information is available about the number of
products on the market.
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
In this paper we concentrate on the role of service centres in the information #ow that is needed
in order to reach MIR level 1. It is quite clear that
service centres are invaluable for the other MIR
levels as well. Information for MIR levels 2, 3 and 4,
for example, can be collected by a service centre
that has the right knowledge about design and
production. This will be the topic of forthcoming
papers.
If all information is collected and e!ectively used,
a company is able to reach MIR level 4. It is then
a learning organisation where continuous improvement is a matter of course. For MIR level 1, however, it su$ces to have useful metrics that describe
the product quality performance in production and
in the "eld. This is the subject of the next section.
4.1.4 we give a list of criteria that give insight in the
costs of (non-)quality. For MIR level 1 companies
must translate the quantitative information about
fall-o! in production and about "eld failures into
costs.
4.1.1. Reliability concepts
As high-volume consumer products are seldom
repaired more than once, we will only mention two
reliability concepts that are based on the time to
("rst) failure. This means that we do not discuss
reliability models for repairable systems.
4.1.1.1. Reliability. Product reliability is, for
example according to [7] de"ned as the probability
R(t) that a product starting at time zero will survive
a given time t:
4. Metrics
P
R(t)"P(¹*t)"
In this section we will "rst (in Section 4.1) analyse
the type of information that is needed for MIR level
1. In Section 4.2 we will discuss some metrics that
are presently used in industry. Finally, in Section
4.3, we will come up with operational de"nitions of
two metrics that are based on Section 4.1 and that
in our opinion are very informative.
4.1. Quality information for MIR level 1
On MIR level 1 there is a basic feedback system
that gives quantitative information indicating
the number of problems during production and the
number of "eld failures. In order to be informative,
these numbers must be seen in proportion to the
number of products actually produced, respectively, in use. The metrics must do more than just
describe the situation, they must also be able to
detect changes over time. In production it is relatively easy to collect the indispensable information.
It is much harder to get useful quantitative information about "eld failures.
Before we go into the problem of collecting
quantitative information about failures, in Section
4.1.1 we "rst summarise two important reliability
concepts. In the Sections 4.1.2 and 4.1.3 we concentrate on information about problems in production
and in the "eld, respectively. Finally, in Section
31
=
f (q) dq,
t
where ¹ is a suitable continuous random variable
representing time to failure with failure probability
density function f (t).
In applications the probability density function
f (t) is usually not known. If an estimate of f (t) is
available, then this estimate gives an estimate of
R(t). If there is no estimate of f (t) available and if all
products are taken into use at the same time, say
t"0, then one of the common non-parametric
estimators of R(t) can be used (cf. [7]). In this paper
we will use
N(t)
,
RI (t)"
N#1
where N(t) is the number of unfailed products on
the market at time t, and N,N(0).
In the more realistic situation that &identical'
products are taken into use at di!erent time points
during a period of, say, half a year, the estimation
procedure is more complex; we come back to this in
Section 4.3.
4.1.1.2. Hazard function. If a product is still operating at time ¹, then we are not interested anymore
in the unconditional probability to survive a particular time point t. In this case we are interested in
32
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
the conditional probability to survive ¹#t given
that the product already survived ¹. The concept of
hazard function (also called hazard rate, failure rate
or force of mortality, see [8]), covers this idea. The
hazard function j(t) represents the instantaneous
failure probability and is de"ned in the following
way:
1 dR(t)
f (t)
"!
.
j(t),
R(t) dt
R(t)
If all products are taken into use at the same time
t"0, the hazard function can be estimated via jI (t)
in the following way. First we notice that the de"nition of the hazard function shows that
R(t)!R(t#*t)
1
j(t)"
lim
.
*t
R(t)
t?0
Next we divide the relevant interval (0, ¹) in
k subintervals with length *q, such that ¹"k*q.
Now let q "i*q, then on the interval (q , q #*q)
i
i i
the hazard function j(t) can be estimated by
RI (q )!RI (q #*q)
i
jI (q , q #*t), i
i i
*qRI (q )
i
or, with M(t) being number of failures on the interval (0, t), N(t) the number of products on the market
at time t; this leads to
M(q #*t)!M(q )
i
i .
jI (q , q #*t)"
i i
*qN(q )
i
The de"nition of the time t needs some attention.
As f (t) is the probability density function of the
failure time, t is the time since the beginning of the
failure behaviour. For consumer products the interpretation: t"time since sales, seems reasonable,
even though sometimes the product will not be
bought by the end user.
If all products are sold at time zero, then M(t)
denotes the number of failures at time t in the class
of all products (all sold at time zero). If the products
are sold at di!erent times, as is normally the case,
then the estimation of the hazard function is more
complicated. We come back to this in Section 4.3.
For a more thorough discussion of estimators of
the hazard function we refer to [9,10].
4.1.2. Information about production problems
In production all kinds of product-related problems can occur. Concentrating on quality and reliability it is important to collect data that
demonstrates whether particular production processes need improvement. Therefore, it is important
to record the following characteristics as a function
of time (per process step and on component, module and product level):
f the fraction of scrap,
f the fraction of rework,
f the productivity.
If the performance on these characteristics is
a!ected by particular circumstances, these circumstances must be recorded as well. Common relevant
circumstances are the following:
f
f
f
f
the time of production,
characteristics of the batch,
the production speed,
the operator and/or shift.
As it is in principle simple to collect all this
information and it is also quite clear how this
information must be analysed, in the rest of this
section we concentrate on metrics that are based on
"eld information.
4.1.3. Information about xeld problems
Relevant quantitative information about the
"eld behaviour of products/subsystems/modules/
components is given by the following characteristics (see also [11]):
f the fraction of customer complaints within the
warranty period, or, more general,
f the fraction of customer complaints within a particular time-interval,
f the fraction of zero-hour failures (dead-on-arrival)
f the hazard function,
f the segmentation of customer complaints over
the categories: design problem, production problem, component problem, product level problem,
customer use, no fault found.
Just as in the case of production problems, also the
"eld behaviour of products is a!ected by several
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
characteristics. The most important ones are the
following:
f the time of production,
f the date the product is put into use,
f the quantity of use (amount of time, number of
cycles, etc.),
f the way of use (whether or not according to the
user speci"cations),
f the environment in which the product has been
used (for example warm and humid or cold and
dry).
The main di!erence between the information about
problems in production and problems in the "eld is
the in#uence of the factor time. The fall-o! in production concerns instantaneous failures and in
principle the performance of production is well
known at any moment. A quantitative analysis of
the number of "eld failures is much more complicated, because at time t only the number of failures
that occurred before time t is known. This does not
a!ect the estimation of, for example, the fraction of
zero-hour failures, but it does a!ect, the estimation
of the reliability and the hazard function (see Section 4.3).
The estimation of some characteristics is also
complicated by the fact that it is far from easy, even
during the warranty period, to determine the total
number of sold products, the total number of products still in use, and the total number of customer
complaints.
4.1.4. Costs of (non-)quality
Information about the performance on quality is
not complete without a full view on the costs that
are related with making quality, or, better, making
non-quality. Some metrics for these costs are:
f
f
f
f
f
f
f
f
costs of the design process itself,
costs of design changes,
costs of process changes,
warranty costs,
costs of service activities,
product liability costs,
image costs and costs of losing customers,
extra costs for the customer.
In this paper we will not go into this in more detail,
we will concentrate on reliability metrics.
33
4.2. Current metrics
Some of the metrics that are presently used in
industry will be presented below and their advantages and disadvantages will be explained. In Section 4.2.1 the classical "eld call rate will be
presented and in Section 4.2.2 the warranty call
rate. We will only discuss "eld data, because collecting and analysing data about fall-o! in production is not a serious problem.
A service centre is, of course, the primary source
for quantitative information about "eld problems.
This does not mean that it is easy to get quantitative information about the "eld performance on
product level. For example, it is hardly possible to
collect quantitative information about "eld problems expressed as the percentage of products that
fail within a one-year warranty period. The reason
is, of course, that service centres normally see, even
within the warranty period, only part of all products that fail.
4.2.1. Classical xeld call rate
The classical "eld call rate is used to monitor the
number of "eld failures of a given product and was
developed for logistic purposes. As the hazard function depends on the age of the product it has, in
those cases where not all products are sold simultaneously, little to do with the number of repairs that
are expected in a certain time interval.
The de"nition of the classical "eld call rate
FCR
is very close to the de"nition of the
#-!44*#!natural estimator jI (q, q#*q) of the hazard function (see Section 4.1.1). On the interval (q , q #*q)
i i
the FCR
is estimated by
#-!44*#!M(q #*t)!M(q )
i
i ,
(q , q #*t)"
FCR
#-!44*#!- i i
*qN(q )
i
where M(t) is the number of failures at time t, N(t)
the number of products on the market at time t.
The di!erence is in the meaning of the time t. In
jI (q, q#*t) time t is measured from the moment the
failure behaviour starts. For consumer products
this will usually be close to the time since sales. In
the estimator of FCR
, however, t is the time
#-!44*#!since market introduction of the product. It is important to note that the expression does not take into
34
V.T. Petkova et al. / Int. J. Production Economics 67 (2000) 27}36
account the age of a product at the moment of
failure, it just uses the total number of failures.
Furthermore, it is important to realise that not all
products are sold at time t"0 and this reduces the
value of the estimator of the classical "eld call rate,
just as it in#uences the meaning of the estimator of
the hazard function.
Another disadvantage of the classical "eld call
rate is the fact that in the early phases of the life of
a product and in the last phases, the number of
products actually in use on the market, N(t), has
a high level of uncertainty.
The FCR is used to monitor the number of "eld
failures of a given product and was developed for
logistic purposes, like the estimation of the number
of spare parts that will be necessary at a given
moment in time and at a given location. As the
hazard function usually is a function of the age of
the product, the FCR has, in those cases where not
all products are sold simultaneously, little to do
with the number of repairs that are expected in
a certain time interval.
4.2.2. Warrantee call rate
A different method uses the so-called warrantee
package method. The warrantee package method is
used especially for financial purposes. The definition of the warrantee call rate $\mathrm{WCR}_{\mathrm{warrantee}}$
calculated according to the warrantee package
method is very close to the definition of $\mathrm{FCR}_{\mathrm{classical}}$.
On the interval $(\tau_i, \tau_i+\Delta t)$ the warrantee call rate
is estimated by
$$\mathrm{WCR}_{\mathrm{warrantee}}(\tau_i, \tau_i+\Delta t) = \frac{M_w(\tau_i+\Delta t) - M_w(\tau_i)}{\Delta t \, N_w(\tau_i)},$$
where $M_w(t)$ is the number of failures at time t of
products within warrantee, $N_w(t)$ the number of
products within warrantee on the market at time t.
As the name of the model indicates, the main
focus in this model is on warrantee aspects of products: what fraction of the products fails during the
warrantee period. Therefore, although the formula
is mathematically close to the formula for the classical "eld call rate, this metric uses a kind of moving
time-window and therefore will lead to conceptually different results. And, again, the expression
does not take into account the age of a product at
the moment of failure.
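As a small illustration, both call rates can be evaluated directly from service data, since they share the same form: failures registered in the interval, divided by the interval length times the number of products on the market (or within warrantee) at the start of the interval. The sketch below assumes that the failure counts are recorded cumulatively; the function name `call_rate` and all numbers are hypothetical.

```python
def call_rate(failures_start, failures_end, population_start, dt):
    """Call rate on an interval of length dt.

    failures_start, failures_end: cumulative failure counts at the start
        and at the end of the interval (assumed cumulative).
    population_start: products on the market (FCR) or products within
        warrantee (WCR) at the start of the interval.
    """
    if population_start <= 0 or dt <= 0:
        raise ValueError("population_start and dt must be positive")
    return (failures_end - failures_start) / (dt * population_start)


# Hypothetical month: 30 new warrantee failures, 10,000 products within warrantee.
wcr = call_rate(failures_start=120, failures_end=150,
                population_start=10_000, dt=1.0)
print(f"warrantee call rate: {wcr:.4f} per product per month")
```

Note that no product age enters the calculation, which is exactly the limitation discussed above.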
From the foregoing it will be clear that the field
call rate and the warrantee call rate are not very
informative. Furthermore, as products are normally taken into use at different time points, a more
sophisticated estimator of the hazard function is
necessary than the one given in Section 4.1.1. This
is the subject of Section 4.3.
4.3. New metrics
In Section 4.3.1 we present an estimator of the
reliability and in Section 4.3.2 an estimator of the
hazard function. Both estimators make full use of
all available data.
4.3.1. Reliability
After market introduction the number of products
on the market increases (for some time) and with it
the number of defects. If the hazard function is estimated at time t, the data are censored on the right, i.e.
some products have not yet failed, and their failure
times are known only to be beyond their present
running time. This type of data is known in the
literature as type I multiply censored data. Instead of
the estimator $\tilde{\lambda}(\tau, \tau+\Delta t)$ given in Section 4.1.1, it is
much better to use the product-limit estimator of the
reliability function, as developed by Kaplan and Meier
[12]. The product-limit estimator is usually given for
the situation that all 'products' start at the same time
t = 0 and that the censoring at the right is not the
same for all products. In our situation the products
are put into use at different moments and all products
are censored at the same moment. Of course, it is
possible to transform the failure time data by shifting
all starting points to $\tau = 0$. However, in order to
come up with expressions that are relatively easy to
use in industry, we prefer not to do this, but to use the
data as they are. This leads to the following derivation
of the reliability.
We divide the time axis into intervals of, say, one
month. Let $\tau_i$ denote the endpoint of the ith interval $I_i$, with $\tau_0 = 0$ (see Fig. 3). To keep things
simple, we act as if sales only take place at the time
points $\tau_0, \tau_1, \tau_2, \ldots$. Furthermore, we suppose that
all products are functioning at the time of sales, but
they can fail a split second afterwards.
Define $n_{ji}$ as the number of products functioning
at $\tau_i$ of all products sold at $\tau_j$, $d_{ji}$ the number of
products failing in interval $I_i$ of all products sold at
$\tau_j$, and $T$ the age of a product at the time of failing.
This implies that a product that is sold at $\tau_j$ survives i time units if for that product holds: $T \ge \tau_{j+i}$.
Fig. 3. Time axis with intervals.
The Kaplan-Meier estimator of the reliability of
a product to survive i time units, say R(i units), is
given by
$$R(i\ \text{units}) = P(T \ge i\ \text{units} \mid T \ge i-1\ \text{units})\, P(T \ge i-1\ \text{units} \mid T \ge i-2\ \text{units}) \cdots P(T \ge 2\ \text{units} \mid T \ge 1\ \text{unit})\, P(T \ge 1\ \text{unit}).$$
For a product that is sold at time $\tau_j$ the probability $P(T \ge 1\ \text{unit})$ can be estimated by
$$\hat{P}_j(T \ge 1\ \text{unit}) = \hat{P}_j(T \ge \tau_{j+1}) = 1 - \frac{d_{j,j+1}}{n_{j,j}}$$
and $P(T \ge i\ \text{units} \mid T \ge i-1\ \text{units})$, with $i \ge 2$, can
be estimated by
$$\hat{P}_j(T \ge i\ \text{units} \mid T \ge i-1\ \text{units}) = \hat{P}_j(T \ge \tau_{j+i} \mid T \ge \tau_{j+i-1}) = 1 - \frac{d_{j,j+i}}{n_{j,j+i-1}}.$$
At $\tau_j$ we estimate $P(T \ge 1\ \text{unit})$ by combining
all data for products sold at $\tau_0, \tau_1, \ldots, \tau_{j-1}$. This
gives the following estimator $p_{j1}$ of $P(T \ge 1\ \text{unit})$:
$$p_{j1} = 1 - \frac{\text{total number of products sold at or before } \tau_{j-1} \text{ that failed in the first month after sales}}{\text{total number of products sold at or before } \tau_{j-1}} = 1 - \frac{d_{01} + d_{12} + \cdots + d_{j-1,j}}{n_{00} + n_{11} + \cdots + n_{j-1,j-1}}.$$
Analogously, at $\tau_j$ we estimate $P(T \ge i\ \text{units} \mid T \ge i-1\ \text{units})$, with $j \ge i \ge 2$, by
$$p_{ji} = 1 - \frac{\sum_{k=0}^{j-i} d_{k,i+k}}{\sum_{k=0}^{j-i} n_{k,i+k-1}}.$$
Therefore, at time $\tau_j$ the following estimator of the
reliability is available:
$$\hat{R}_j(i\ \text{units}) = \prod_{l=1}^{i} p_{jl}, \quad \text{with } i \le j.$$
4.3.2. Hazard function
Let h(i units) be the hazard function of a product
at i time units after it has been sold. From the
analysis in the foregoing section it follows that this
hazard function at time $\tau_j$ can be estimated by
combining all available information. That means
that at time $\tau_j$ the instantaneous failure probability
after surviving i time units, h(i units), is estimated by
$$\hat{h}_j(i\ \text{units}) = \frac{\text{total number of products that failed immediately after surviving } i \text{ time units}}{\text{total number of products that survived } i \text{ time units}} = \frac{d_{0,i+1} + d_{1,i+2} + \cdots + d_{j-i-1,j}}{n_{0,i} + n_{1,i+1} + \cdots + n_{j-i-1,j-1}}, \quad \text{with } i \le j-1.$$
If this estimator is plotted as a function of i, it is
possible to compare the graph with the roller-coaster curve. In this way additional information
about the failure mechanisms can be collected (see
[27]). In a next paper we will give a practical
example of this estimator in high-volume consumer
products. Of course, in this way it is also possible to
predict the fraction of failures under warranty that
can be expected, and this estimate is far superior to
the warrantee call rate.
5. Conclusion
Based on the information in the previous sections the following conclusions can be made:
• In mass production the first contact between
a customer and the manufacturer is in a service
centre, therefore service centres are most suitable
for collecting information about the behaviour of
products in real operating conditions.
• Service centres are focused on logistic and not on
their contribution to quality improvement. As
a consequence valuable information is not collected.
• Service centres only communicate with the other
departments, like production and development,
about serviceability.
• In industry some metrics, like the classical field
call rate and the warrantee call rate, are used that
hardly give any sensible information. In particular these metrics do not give any information
about the quality of the products and processes.
• There are simple metrics based on the Kaplan-Meier estimator of the reliability and the
hazard function that give information that is
useful for logistics purposes as well as for quality
improvement.
It will be clear that service centres have to be
prepared for their new assignment. This holds even
more strongly when the contribution from service centres
to the MIR levels 2, 3 and 4 is analysed, because
then a thorough knowledge of design and production is indispensable. But this is the subject of
ongoing research.
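To indicate how little is needed to apply the metrics of Section 4.3 in practice, the following is a minimal sketch of the pooled reliability and hazard estimators. It assumes that the counts are kept in two dictionaries keyed by (j, i): `n[(j, i)]` for the number of products sold at time point j that still function at time point i, and `d[(j, i)]` for the number of products sold at time point j that fail in interval i. The function names and the sales and failure figures are hypothetical.

```python
def p_hat(n, d, j, i):
    """Pooled estimate at time point j of P(T >= i units | T >= i-1 units).

    Uses every sales cohort k that has been observed for at least i units,
    i.e. k = 0, ..., j - i.
    """
    cohorts = range(j - i + 1)
    failed = sum(d[(k, i + k)] for k in cohorts)
    at_risk = sum(n[(k, i + k - 1)] for k in cohorts)
    return 1.0 - failed / at_risk


def reliability(n, d, j, i):
    """Estimate at time point j of the probability to survive i units (i <= j)."""
    r = 1.0
    for step in range(1, i + 1):
        r *= p_hat(n, d, j, step)
    return r


def hazard(n, d, j, i):
    """Estimate at time point j of the hazard after surviving i units (i <= j-1)."""
    cohorts = range(j - i)
    failed = sum(d[(k, i + k + 1)] for k in cohorts)
    survived = sum(n[(k, i + k)] for k in cohorts)
    return failed / survived


# Hypothetical data observed at time point j = 3, with sales at points 0, 1 and 2.
n = {(0, 0): 100, (0, 1): 90, (0, 2): 81,
     (1, 1): 200, (1, 2): 180,
     (2, 2): 100}
d = {(0, 1): 10, (0, 2): 9, (0, 3): 3,
     (1, 2): 20, (1, 3): 18,
     (2, 3): 10}

print(reliability(n, d, j=3, i=2))  # survival of two time units
print(hazard(n, d, j=3, i=1))       # hazard after surviving one time unit
```

In this form the hazard after i units is one minus the pooled conditional survival probability for unit i+1, so both metrics can be computed from the same two tables of counts.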
References
[1] T.P.J. Berden, A.C. Brombacher, P.C. Sander, The building bricks of product quality: An overview of some basic
concepts and principles, International Journal of Production Economics 67 (1) (2000) 3-15.
[2] P.T. Petkova, P.C. Sander, A.C. Brombacher, The role of
the service centre in improvement processes, Quality and
Reliability Engineering International 15 (1999).
[3] A.C. Brombacher, MIR: Covering non-technical aspects of
IEC61508 reliability certification, Reliability Engineering
and System Safety (1999).
[4] P.C. Sander, A.C. Brombacher, MIR: The use of reliability
information flows as a maturity index for quality management, Quality and Reliability Engineering International 15
(1999).
[5] Y. Akao (Ed.), Quality Function Deployment. Productivity Press, Cambridge, 1990.
[6] A.C. Brombacher, Predicting reliability of high volume
consumer products: Some experiences 1986-1996, Symposium "The Reliability Challenge" organised by Finn
Jensen Consultancy, London, 1996.
[7] E.E. Lewis, Introduction to Reliability Engineering, 2nd
Edition, Wiley, New York, 1996.
[8] H.E. Ascher, H. Feingold, Repairable Systems Reliability:
Modelling, Inference, Misconceptions and their Causes,
Marcel Dekker, New York, 1984.
[9] S.H. Lo et al., Density and hazard rate estimation for
censored data via strong representation of the Kaplan-Meier
estimator, Probability Theory and Related Fields 80
(1989) 461-473.
[10] W. Stute, Strong and weak representations of cumulative
hazard function and Kaplan-Meier estimators on increasing sets, Journal of Statistical Planning and Inference (1994)
315-329.
[11] J.W. Wesner, J.M. Hiatt, D.C. Trimble, Winning with
Quality, Addison-Wesley, New York, 1995.
[12] E.L. Kaplan, P. Meier, Nonparametric estimation from
censored incomplete observations, Journal of the American Statistical Association 53 (1958) 457-481.