Measures of Shape

2.3.3 Measures of Shape

The most popular measures of shape, exemplified for the PRT variable of the Cork Stoppers’ dataset (see Table 2.8), are presented next.

2.3.3.1 Skewness

A continuous distribution is symmetrical around the mean, µ, if its density function f satisfies:

$$ f(\mu + x) = f(\mu - x) \quad \text{for all } x. $$

This applies similarly for discrete distributions, substituting the density function by the probability function.

A useful asymmetry measure around the mean is the coefficient of skewness, defined as:

$$ \gamma = \frac{E\left[(X-\mu)^3\right]}{\sigma^3}. \qquad (2.14) $$

This measure uses the fact that any central moment of odd order is zero for symmetrical distributions around the mean. For asymmetrical distributions γ reflects the unbalance of the density or probability values around the mean. The formula uses a σ³ standardization factor, ensuring that the same value is obtained for the same unbalance, independently of the spread. Distributions that are skewed to the right (positively skewed distributions) tend to produce a positive value of γ, since the longer rightward tail will positively dominate the third order central moment; distributions skewed to the left (negatively skewed distributions) tend to produce a negative value of γ, since the longer leftward tail will negatively dominate the third order central moment (see Figure 2.24). The coefficient γ, however, has to be interpreted with caution, since it may produce a false impression of symmetry (or asymmetry) for some distributions. For instance, the probability function p_k = {1/3, 1/2, 1/6} on the values x_k = {−4, 1, 5} has µ = 0 and γ = 0, although it is an asymmetrical distribution.
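The behaviour of γ described above can be checked numerically. The following is a minimal sketch, not taken from the book: the function name gamma_discrete and the two small distributions are illustrative assumptions. It evaluates formula 2.14 for a discrete distribution and shows a zero value for an asymmetrical case, the invariance to location and spread, and a positive value for a long right tail.

```r
# Coefficient of skewness (formula 2.14) for a discrete distribution.
# x: support values; p: probabilities (assumed to sum to 1).
# gamma_discrete is a hypothetical helper name, not a function from the book.
gamma_discrete <- function(x, p) {
  stopifnot(length(x) == length(p), abs(sum(p) - 1) < 1e-9)
  mu    <- sum(p * x)                     # mean
  sigma <- sqrt(sum(p * (x - mu)^2))      # standard deviation
  sum(p * (x - mu)^3) / sigma^3           # standardized third central moment
}

# Illustrative three-point distribution (assumption): asymmetrical,
# yet its third central moment vanishes.
x <- c(-4, 1, 5); p <- c(1/3, 1/2, 1/6)
gamma_discrete(x, p)

# Illustrative distribution with a long right tail (assumption): positive value.
y <- c(0, 1, 5); q <- c(0.5, 0.3, 0.2)
gamma_discrete(y, q)

# Shifting and rescaling the values leaves the coefficient unchanged.
gamma_discrete(10 * y + 3, q)
```

The last two calls should agree up to rounding, which is the effect of the σ³ standardization factor.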

The skewness of a dataset x_1, …, x_n is the point estimate of γ, defined as:

$$ g = \frac{n \sum_{i=1}^{n} (x_i - \bar{x})^3}{(n-1)(n-2)\, s^3}. \qquad (2.15) $$

Note that:

– For symmetrical distributions, if the mean exists, it will coincide with the median. Based on this property, one can also measure the skewness using g = (mean − median)/(standard deviation). It can be proved that −1 ≤ g ≤ 1.
– For asymmetrical distributions with only one maximum (which is then the mode), the median typically lies between the mode and the mean, as shown in Figure 2.24.

Figure 2.24. Two asymmetrical distributions, with the mode, median and mean marked: a) Skewed to the right (usually with γ > 0); b) Skewed to the left (usually with γ < 0).
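The median-based measure g = (mean − median)/(standard deviation) mentioned in the notes above is simple to compute. The sketch below is an illustration only: the name skew_median, the seed and the simulated exponential sample are assumptions, not material from the book or its CD.

```r
# Median-based skewness: g = (mean - median) / standard deviation.
# skew_median is a hypothetical helper name.
skew_median <- function(x) {
  (mean(x) - median(x)) / sd(x)
}

set.seed(1)                    # arbitrary seed, for reproducibility only
right <- rexp(1000)            # simulated right-skewed sample (assumption)

skew_median(right)             # positive: the mean is pulled towards the long tail
skew_median(-right)            # mirrored sample: same magnitude, opposite sign
skew_median(c(1, 2, 3, 4, 5))  # symmetric data: mean equals median, so g is 0
```

For a right-skewed sample the mean exceeds the median, which is the ordering sketched in panel a) of Figure 2.24.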

2.3.3.2 Kurtosis

The degree of flatness of a probability or density function near its center can be characterised by the so-called kurtosis, defined as:

$$ \kappa = \frac{E\left[(X-\mu)^4\right]}{\sigma^4} - 3. \qquad (2.16) $$

The term −3 is introduced so that κ = 0 for the normal distribution. As a matter of fact, the κ measure as it stands in formula 2.16 is often called the coefficient of excess (excess compared to the normal distribution). Distributions flatter than the normal distribution have κ < 0; distributions more peaked than the normal distribution have κ > 0.
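Formula 2.16 can be evaluated numerically for a given density. The following sketch assumes that numerical integration with R's integrate function is accurate enough for these smooth densities; the name kappa_density is hypothetical and the three distributions are chosen only as illustrations.

```r
# Coefficient of excess (formula 2.16) of a continuous distribution,
# obtained by numerical integration of its density over [lower, upper].
# kappa_density is a hypothetical helper name.
kappa_density <- function(dens, lower, upper) {
  # Expected value of g(X) under the density dens
  expect <- function(g) {
    integrate(function(x) g(x) * dens(x), lower, upper)$value
  }
  mu <- expect(function(x) x)             # mean
  m2 <- expect(function(x) (x - mu)^2)    # variance
  m4 <- expect(function(x) (x - mu)^4)    # fourth central moment
  m4 / m2^2 - 3
}

kappa_density(dnorm, -Inf, Inf)                       # normal: close to 0
kappa_density(dunif, 0, 1)                            # uniform: close to -1.2 (flatter)
kappa_density(function(x) dt(x, df = 10), -Inf, Inf)  # Student t, 10 d.f.: close to 1
```

The results are approximations limited by the integration tolerance; the exact values are 0 for the normal, −6/5 for the uniform and 6/(df − 4) for the Student t distribution with df > 4.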

The sample estimate of the kurtosis is computed as:

$$ k = \frac{n(n+1)\, M_4 - 3(n-1)\, M_2^2}{(n-1)(n-2)(n-3)\, s^4}, \qquad (2.17) $$

with:

$$ M_j = \sum_{i=1}^{n} (x_i - \bar{x})^j. $$

Note that the kurtosis measure has the same shortcomings as the skewness measure. It does not always measure what it is supposed to.
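Formula 2.17 translates directly into a few lines of R. The function below is a sketch written from the formula, not the function distributed on the book CD; the name kurtosis and the check on n are assumptions made here.

```r
# Sample kurtosis following formula 2.17.
# Written from the formula; not the function supplied on the book CD.
kurtosis <- function(x) {
  n <- length(x)
  stopifnot(n > 3)            # the denominator vanishes for n = 1, 2, 3
  d  <- x - mean(x)           # deviations from the sample mean
  M2 <- sum(d^2)              # M_2 in formula 2.17
  M4 <- sum(d^4)              # M_4 in formula 2.17
  (n * (n + 1) * M4 - 3 * (n - 1) * M2^2) /
    ((n - 1) * (n - 2) * (n - 3) * sd(x)^4)
}

kurtosis(1:10)                # evenly spaced values: -1.2, flatter than the normal
```

Evenly spaced values behave like a sample from a flat distribution, so a negative result is expected here.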

The skewness and the kurtosis have been computed for the PRT variable of the Cork Stoppers’ dataset as shown in Table 2.8. The PRT variable exhibits a positive skewness indicative of a rightward skewed distribution and a positive kurtosis indicative of a distribution more peaked than the normal one.

There are no functions in the R stats package to compute the skewness and kurtosis. We provide, however, as stated in Commands 2.8, R functions for that purpose in text file format on the book CD (see Appendix F). To use one, copy the function text from the file and paste it into the R console, as in the following example:

> skewness <- function(x){
+   n <- length(x)
+   y <- (x-mean(x))^3
+   n*sum(y)/((n-1)*(n-2)*sd(x)^3)
+ }
> skewness(PRT)
[1] 0.592342

In order to appreciate the obtained skewness and kurtosis, the reader can refer to Figure 2.25 where these measures are plotted for several distributions (see Appendix B). For more details see (Dudewicz EJ, Mishra SN, 1988).

Table 2.8. Skewness and kurtosis for the PRT variable of the cork stopper dataset.

Skewness    Kurtosis

Figure 2.25. Skewness and kurtosis coefficients for several distributions (labelled in the plot: uniform, normal, Student t, gamma, the beta area and the impossible area).
