Mammographic Density Classification Using Multiresolution Histogram Technique – Izzati Muhimmah, Erika R.E. Denton, Reyer Zwiggelaar ISSN 1858-1633 2005 ICTS 21

different risks. On the other hand, a recently published paper by Hadjidemetriou et al. showed that different generic texture images with similar histograms can be discriminated by a multi-resolution approach [5]. Based on these findings, our aim was to investigate whether mammographic density can be classified automatically using a multi-resolution histogram technique.

The remainder of this paper is organized as follows. The proposed multi-resolution histogram features are described in Section 2. The data used to validate the method are described in Section 3, and the validation strategy in Section 4. Section 5 presents the results of the proposed method and discusses our findings. Finally, conclusions appear in Section 6.

2. MULTI-RESOLUTION HISTOGRAM TECHNIQUE

The main aim is to obtain feature vectors that can be used to discriminate between the various mammographic density classes. A feature vector representing a mammogram is derived from a set of histograms $\{h_0, h_1, h_2, h_3\}$ (see Figure 2b). $h_0$ is obtained from the original mammogram, and $h_1$, $h_2$ and $h_3$ are obtained after Gaussian filtering the mammogram with 5x5 kernels and scaling in three stages. For all four histograms, only grey-level information from the breast area is used (the pectoral muscle and background areas are ignored), and the histograms are normalized with respect to this area. With increasing scale, the histograms show a general shift to lower grey-level values and a narrowing of the peaks. It should be noted that these histograms deviate significantly from those described by Hadjidemetriou et al. [5], which start with delta-function peaks that broaden on smoothing.

Subsequently, the set of histograms is transformed into a set of cumulative histograms $\{c_0, c_1, c_2, c_3\}$ (see Figure 2c). The feature vector for each mammogram is constructed from the differences between successive cumulative histograms, $\{c_0 - c_1,\ c_1 - c_2,\ c_2 - c_3\}$; see Figure 2d for an example. Between scales, the differences shift to lower grey-level values, but the overall shape of the data remains more or less constant. The dimensionality of the resulting feature space is 768.

Figure 2. Illustration of the feature selection process: (a) example ROI (m29592L); (b) histograms of (a) and its consecutive multi-resolution images; (c) cumulative histograms of (b); (d) differences of the consecutive cumulative histograms in (c), which form the classification features.
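The feature construction described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the binomial weights used for the 5x5 smoothing kernel, the edge-replication border handling, the 256-bin histograms (chosen to match 8-bit images and the stated 768 = 3 x 256 dimensions), and all function names (`smooth`, `masked_histogram`, `multires_features`) are assumptions introduced here. The mask is assumed to be a non-empty boolean image marking the breast area.

```python
from itertools import accumulate

# Assumed 1-D binomial weights approximating a Gaussian; applied along rows
# and then columns, they give a separable 5x5 kernel. The paper does not
# state the actual kernel weights.
KERNEL_1D = (1, 4, 6, 4, 1)  # weights sum to 16


def _blur_rows(rows):
    """Smooth every row with KERNEL_1D, replicating edge pixels."""
    out = []
    for row in rows:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + d, 0), n - 1)]
                for d, k in zip(range(-2, 3), KERNEL_1D)) / 16.0
            for x in range(n)
        ])
    return out


def smooth(img):
    """One smoothing stage: separable 5x5 filter, no subsampling."""
    blurred = _blur_rows(img)
    transposed = [list(col) for col in zip(*blurred)]
    return [list(row) for row in zip(*_blur_rows(transposed))]


def masked_histogram(img, mask, bins=256):
    """Grey-level histogram over mask pixels, normalized by the mask area."""
    hist = [0.0] * bins
    count = 0
    for row, mask_row in zip(img, mask):
        for value, inside in zip(row, mask_row):
            if inside:
                hist[min(bins - 1, max(0, int(round(value))))] += 1
                count += 1
    return [h / count for h in hist]


def multires_features(img, mask, levels=3, bins=256):
    """Concatenate differences of successive cumulative histograms."""
    hists = [masked_histogram(img, mask, bins)]
    current = img
    for _ in range(levels):
        current = smooth(current)
        hists.append(masked_histogram(current, mask, bins))
    cums = [list(accumulate(h)) for h in hists]
    features = []
    for c_fine, c_coarse in zip(cums, cums[1:]):
        features.extend(a - b for a, b in zip(c_fine, c_coarse))
    return features
```

With three smoothing stages and 256 bins, the returned vector has 3 x 256 = 768 entries. Because every cumulative histogram ends at 1, the last entry of each 256-element block is zero up to rounding.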

3. DATA

The above technique was evaluated on a dataset comprising sixty post-1998 mammograms from the UK NHS breast screening programme (EPIC database), randomly selected to represent Boyd's SCC [1] as classified by an expert radiologist. All mammograms are Fuji UMMA film-screen combinations, medio-lateral views, digitized using a mobile-phone-CCD scanner at 8 bits per pixel. The breast areas were segmented using threshold and morphological operations; see Figure 2a for an example. It should be noted that these images are left/right (LR) mammogram pairs from thirty patients.

4. VALIDATION STRATEGY

Information and Communication Technology Seminar, Vol. 1 No. 1, August 2005

For classification, an automatic method is built based on the feature vectors in combination with a k-nearest neighbor approach. Here we used three neighbors, a Euclidean distance (in [5] an L1 norm was used) and Bayesian probability. However, it is known that mammographic intensities vary with exposure levels and film characteristics [3, 4]. Moreover, within an imaging session, a woman's mammograms are likely to have been captured using similar films and/or exposure levels. Hence, to minimize bias, we used a leave-one-woman-out strategy in training. The results are presented in the form of confusion matrices.
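The leave-one-woman-out idea can be illustrated with a short sketch: when a mammogram is classified, all images of the same woman (for example her other breast) are excluded from the training set. The code below is an assumption-laden illustration, not the authors' implementation. It uses a plain majority vote among the three nearest neighbors, with ties resolved in favor of the closest neighbor, in place of the Bayesian probability step, which the text does not specify. The sample layout `(patient_id, feature_vector, label)` and the function names are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` from `train`, a list of (feature_vector, label).

    Uses Euclidean distance and a majority vote (an assumption; the paper
    mentions a Bayesian probability whose details are not given).
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    best = max(votes.values())
    # On a tie, return the tied label belonging to the closest neighbor.
    for _, label in nearest:
        if votes[label] == best:
            return label


def leave_one_woman_out(samples, k=3):
    """Return (expert_label, predicted_label) pairs for every sample.

    `samples` is a list of (patient_id, feature_vector, label). All images
    of the test image's patient are removed from the training set.
    """
    results = []
    for patient, features, label in samples:
        train = [(f, l) for p, f, l in samples if p != patient]
        results.append((label, knn_predict(train, features, k)))
    return results
```

Excluding the whole patient rather than the single image matters here because a left/right pair is likely to share film and exposure characteristics, so keeping the partner image in the training set could make the nearest neighbor trivially correct.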

5. RESULTS AND DISCUSSION

The results are presented as a confusion matrix in Table 1. They show an agreement of 38.33% with the expert assessment, and 78.33% when minor classification deviations are allowed. This rate of agreement is below the reported state of the art, which comes partially as a surprise, as some of the state-of-the-art work relies on information extracted from single histograms.

Table 1. Comparison between automatic, histogram-based, and expert classification. Within the tables, the proportion of dense tissue is represented as 1: 0%, 2: 0-10%, 3: 11-25%, 4: 26-50%, 5: 51-75%, and 6: 76-100%.

Instead of taking all six classes into account, for mammographic risk assessment it might be more appropriate to consider only high and low density classes, which means that the lower three and the higher three SCC classes are grouped together. Using such an approach, the developed technique shows an agreement of 80% with the expert assessment.

We also applied this methodology to the MIAS database [10], achieving an agreement of 55.17% for SCC and 61.56% for the triple MIAS categories [8]. The latter is similar to the result reported by Masek et al. [7], i.e. 62.42% when using a Euclidean distance. Their method is based on direct distance measures of the average histogram of the original images for each density class. It should be noted that we used less data for training because of the leave-one-woman-out strategy. Moreover, this is in line with our own single-histogram results (using only the histogram of the original image), which were 61.99% for triple MIAS classification and 57.14% for SCC-based classification. These results might indicate that there is little benefit in using the multi-resolution histogram approach.

It should be noted that our methodology deviates slightly from that of Hadjidemetriou et al. [5]. Their implementation of the multi-scale approach includes a subsampling step, which makes a second normalization essential. In our case, we use only the smoothing stage of the multi-scale approach, without the subsampling, so the second normalization step is not used.

Although the multi-resolution histogram technique is claimed to be robust in matching synthetic, Brodatz, and CUReT textures [5], our results could not confirm its applicability to mammographic density classification. We would like to investigate whether this is caused by the nature of mammographic texture patterns and/or by imaging system effects. Thus, additional pre-processing to enhance the contrast between fatty and dense tissue, or to incorporate X-ray imaging protocol information, is an area of future research.
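The three kinds of agreement discussed in this section (exact, with minor deviations allowed, and after grouping into low and high density) can all be computed from a six-class confusion matrix. The sketch below is illustrative only. It assumes that a "minor deviation" means a prediction at most one class away from the expert label, which the text does not define, and the function names and the toy label pairs in the usage note are hypothetical, not data from the study.

```python
def confusion_matrix(pairs, n_classes=6):
    """Build a matrix m[expert][automatic] from (expert, automatic) labels.

    Labels are assumed to be the SCC classes 1..n_classes.
    """
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for expert, automatic in pairs:
        matrix[expert - 1][automatic - 1] += 1
    return matrix


def agreement(matrix, tolerance=0):
    """Percentage of cases within `tolerance` classes of the expert label.

    tolerance=0 gives exact agreement; tolerance=1 is the assumed meaning
    of "minor deviation allowed".
    """
    n = len(matrix)
    total = sum(map(sum, matrix))
    hits = sum(matrix[i][j] for i in range(n) for j in range(n)
               if abs(i - j) <= tolerance)
    return 100.0 * hits / total


def grouped_agreement(matrix, split=3):
    """Percentage agreement after merging classes 1..split (low density)
    and split+1..n (high density)."""
    n = len(matrix)
    total = sum(map(sum, matrix))
    hits = sum(matrix[i][j] for i in range(n) for j in range(n)
               if (i < split) == (j < split))
    return 100.0 * hits / total
```

For example, with the made-up pairs `[(1, 1), (2, 3), (3, 5), (6, 6)]`, exact agreement is 50%, agreement within one class is 75%, and low/high grouped agreement is 75%. On a set of sixty mammograms, the reported 38.33%, 78.33% and 80% would correspond to 23, 47 and 48 cases respectively.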

6. CONCLUSION