21.5.3 Statistical analysis of metrics data
Analysis of metrics data provides opportunities for comparing a series of project metrics. These certainly include comparison of metrics results against [...]
Table 21.11: US and Japanese software industries – metrics

Name                                              Calculation formula
Mean productivity (similar to DevP, Table 21.5)   KNLOC / WorkY
Failure density (similar to SSFD, Table 21.8)     NYF / KNLOC
Code reuse (similar to CRe, Table 21.5)           ReKNLOC / KNLOC

Key:
■ KNLOC = thousands of non-comment lines of code.
■ WorkY = human work-years invested in the software development.
■ ReKNLOC = thousands of reused non-comment lines of code.
Source: Based on Cusumano (1991)
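The three formulas of Table 21.11 can be sketched in a few lines of Python. This is only an illustration: the `ProjectData` container, its field names and the sample figures are hypothetical, and NYF is treated simply as the failure count in the numerator of the failure density formula, since the key above does not define it.

```python
from dataclasses import dataclass


@dataclass
class ProjectData:
    """Raw counts for one project (hypothetical container, not from the source)."""
    knloc: float         # KNLOC: thousands of non-comment lines of code
    work_years: float    # WorkY: human work-years invested in development
    failures: int        # NYF: failure count used in the failure density formula
    reused_knloc: float  # ReKNLOC: thousands of reused non-comment lines of code


def mean_productivity(p: ProjectData) -> float:
    """KNLOC / WorkY"""
    return p.knloc / p.work_years


def failure_density(p: ProjectData) -> float:
    """NYF / KNLOC"""
    return p.failures / p.knloc


def code_reuse(p: ProjectData) -> float:
    """ReKNLOC / KNLOC"""
    return p.reused_knloc / p.knloc


# Made-up figures, chosen only to make the arithmetic easy to follow.
sample = ProjectData(knloc=120.0, work_years=10.0, failures=240, reused_knloc=30.0)

print(mean_productivity(sample))  # KNLOC per work-year
print(failure_density(sample))    # failures per KNLOC
print(code_reuse(sample))         # fraction of the code that is reused
```

Keeping each metric as a separate function makes it easy to apply the same calculation to several projects or organizations before comparing them.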
Table 21.12: US and Japanese software industries – comparison of three software quality metrics

Software quality metrics   United States   Japan
Mean productivity          [...]           12447
Failure density            [...]           [...]
[...] comparison relates to the effectiveness with which the metrics themselves fulfill their respective aims. The following questions are just a sample of those that can be asked with respect to the metrics portion of the SQA process.
■ Are there significant differences between the HD teams’ quality of service?
■ Do the metrics results support the assumption that application of the new version of a development tool contributes significantly to software quality?
■ Do the metrics results support the assumption that reorganization has contributed significantly to a team’s productivity?
For the metrics data to be a valuable part of the SQA process, statistical analysis of the metrics results is required. Statistical tools provide us with two levels of support, based on the type of statistics used:
■ Descriptive statistics
■ Analytical statistics.
Descriptive statistics
Descriptive statistics, such as the mean, median and mode, together with graphic presentations such as histograms, cumulative distribution graphs, pie charts and control charts (which also show the indicator values), enable us to quickly identify trends in the metrics values. Exaggerated trends, identified by the appearance of acute deviations from target values, may indicate the need for corrective actions or, alternatively, the need to continue or expand application of a successful innovation. Because these statistics are so basic to quality assurance, most popular statistical software packages provide a rather complete menu of graphic presentations, including those mentioned above. It should be stressed, however, that descriptive statistics, no matter how sophisticated, are not intended to analyze the statistical significance of the events, that is, how much the observed trends are the result of chance rather than substantive processes.
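As a minimal sketch of the descriptive level, the following Python fragment summarizes a series of metric values and flags acute deviations from a target. The weekly values, the target (indicator) value and the tolerance are all assumptions made up for the illustration; only the standard library `statistics` module is used.

```python
import statistics

# Hypothetical weekly values of one quality metric (e.g., failures per KNLOC).
weekly_values = [2.0, 2.5, 2.0, 3.0, 2.5, 2.0, 6.0]
TARGET = 2.5     # assumed indicator (target) value
TOLERANCE = 2.0  # assumed deviation from the target that counts as "acute"

# The three basic descriptive statistics named in the text.
summary = {
    "mean": statistics.mean(weekly_values),
    "median": statistics.median(weekly_values),
    "mode": statistics.mode(weekly_values),
}

# Weeks (numbered from 1) whose value deviates acutely from the target.
acute = [
    (week, value)
    for week, value in enumerate(weekly_values, start=1)
    if abs(value - TARGET) > TOLERANCE
]

print(summary)
print(acute)
```

A flagged week only shows that a deviation occurred. Whether it reflects chance or a substantive change is a question this kind of summary cannot answer.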
Analytical statistics
Description is not always enough. To determine whether the observed changes in the metrics are meaningful, whatever their direction, the observed trends must be assessed for their significance. This is the role of analytical statistics (e.g., regression tests, analysis of variance, or more basic tests such as the t-test and chi-square test). However, the application of statistical analysis [...]
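To illustrate the analytical level, the sketch below applies one of the basic tests named above, the t-test, to the kind of question raised earlier about a team's productivity before and after a reorganization. It uses Welch's unequal-variance variant, which is a choice made for this example rather than something the text prescribes. The productivity figures are invented, and the critical value 2.306 is the two-sided 5% value for 8 degrees of freedom taken from a standard t-table.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard errors of the two sample means (sample variance / n).
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical productivity figures (KNLOC per work-year) for five projects
# before and five projects after a reorganization.
before = [10.0, 11.0, 9.0, 10.0, 10.0]
after = [12.0, 13.0, 11.0, 12.0, 12.0]

t_stat, df = welch_t(after, before)

# Two-sided 5% critical value for 8 degrees of freedom (standard t-table).
CRITICAL_T_DF8 = 2.306
significant = abs(t_stat) > CRITICAL_T_DF8

print(round(t_stat, 3), round(df, 1), significant)
```

With real data the degrees of freedom will rarely be a whole number, so in practice the p-value is usually obtained from a statistical package rather than from a printed table.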
What should be noted here is the fundamental difference between statistical analysis of production line metrics by application of classical SPC (statistical process control) methods and that of software development and maintenance. While production line activities are repetitive, development activities, by definition, vary from one project to the next; they are never repetitive in the SPC sense. Although the statistical methods applied may be similar, the subject matter differs, as may the implications of the statistical results.