Variability Analysis for Different Uncertainty Realizations

Two-Dimensional Probabilistic Risk Assessment Model 1519

the power of statistical tests.(42) There is a tradeoff between selecting a larger number of factor levels, which can produce more highly resolved insights regarding sensitivity, and getting statistically significant results. There is also a tradeoff between the desired number of iterations (e.g., in a Monte Carlo simulation) that are used to populate factor levels and the computational time.

Table III summarizes the levels defined for factors in the case studies. These levels are used in both probabilistic scenarios. Levels are mostly defined based on visual inspection of the CDF for each factor. Each CDF is prepared based on the generated values for a factor in the comingled analysis of variability and uncertainty, since this probabilistic approach gives the widest range of variation for each factor. For storage times at Stages 1 and 3, levels are defined at equal intervals.

Table III. Levels Defined for Factors in the Growth Estimation Part of the E. coli Model

| Factor^a | Number of Levels | Levels^b | Corresponding Percentiles^b |
|---|---|---|---|
| Temp1 | 5 | 7.5–11, 11–14.5, 14.5–18, 18–21.5, >21.5 | ^c |
| Temp2 | 3 | 7.5–13.5, 13.5–19.5, >19.5 | ^c |
| Temp3 | 5 | 7.5–11, 11–14.5, 14.5–18, 18–21.5, >21.5 | ^c |
| Time1 | 12 | 0–24, 24–48, ..., 264–288, >288 | ^c |
| Time2 | 2 | 0–3.5, >3.5 | ^c |
| Time3 | 12 | 0–24, 24–48, ..., 264–288, >288 | ^c |
| MD | 3 | {<6.5, 6.5–8.5, >8.5} | {20th, 80th} |
| LP1 | 4 | {<50, 50–65, 65–95, >95} | {20th, 50th, 80th} |
| LP2 | 4 | {<35, 35–55, 55–90, >90} | {20th, 50th, 80th} |
| LP3 | 4 | {<45, 45–65, 65–95, >95} | {20th, 50th, 80th} |
| GT1 | 4 | {<7, 7–9.5, 9.5–12.5, >12.5} | {20th, 50th, 80th} |
| GT2 | 4 | {<4.5, 4.5–8, 8–12, >12} | {20th, 50th, 80th} |
| GT3 | 4 | {<6.5, 6.5–9.5, 9.5–13, >13} | {20th, 50th, 80th} |

^a The abbreviations used for factors in this table are the same as those defined in Table I.
^b The ranges that define each factor level and the percentiles of the CDF corresponding to the breakpoint between factor levels are given.
^c For this factor equal intervals are used as levels.
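As a rough illustration of how sampled factor values might be assigned to levels, the sketch below bins values either at CDF percentiles or at equal intervals. It is an assumption-laden example, not the authors' procedure: the paper chooses most breakpoints by visual inspection of the CDF, whereas this sketch takes the percentiles directly. The function names, and the choice to place values beyond the last interval in the final level, are hypothetical.

```python
import numpy as np

def levels_from_percentiles(values, percentiles):
    """Assign each value a level index using breakpoints at the given
    percentiles of the empirical distribution (assumed breakpoint rule)."""
    values = np.asarray(values, dtype=float)
    breaks = np.percentile(values, percentiles)
    # digitize returns 0 below the first break and len(breaks) above the last
    return np.digitize(values, breaks), breaks

def levels_from_equal_intervals(values, start, width, n_levels):
    """Assign each value a level index using equal-width intervals.
    Values past the last interval are clipped into the final level
    (an assumption made for this sketch)."""
    values = np.asarray(values, dtype=float)
    idx = np.floor((values - start) / width).astype(int)
    return np.clip(idx, 0, n_levels - 1)

# Hypothetical samples standing in for one factor's generated values
rng = np.random.default_rng(0)
lp1_samples = rng.normal(70.0, 15.0, size=1000)
lp1_levels, lp1_breaks = levels_from_percentiles(lp1_samples, [20, 50, 80])
print(lp1_breaks, np.bincount(lp1_levels))
```

Three percentile breakpoints give four levels, which matches the level counts listed for the LP and GT factors in Table III.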

4.2. Variability Analysis for Different Uncertainty Realizations

4.2.1. Rankings Based on ANOVA

Table IV summarizes the results from variability analysis for different uncertainty realizations. Mean, minimum, and maximum estimated F values from all uncertainty realizations for each factor are given. The percentage of the uncertainty realizations that produced a statistically significant F value for each factor is quantified. The mean rank and the range of ranks are presented. For each uncertainty realization, interactions between storage temperature and storage time at all three stages are evaluated.

The factors were classified into four groups based on their ranks in uncertainty realizations: (1) Group 1: the most important factor with rank 1; (2) Group 2: factors with secondary importance with ranks between 2 and 4; (3) Group 3: factors with minor importance with ranks 5 and 6; and (4) Group 4: factors that were unimportant with ranks between 7 and 13. Factors in the latter group typically were not identified as statistically significant in most uncertainty realizations. Because the rankings of factors typically varied among uncertainty realizations, a given factor might be assigned to different groups in different uncertainty realizations. However, there was typically a group to which each factor was most frequently assigned. The probability of assigning a given factor to each of the four groups was estimated as the number of times the factor was assigned to that group divided by the total number of uncertainty realizations.

Fig. 4 depicts the probabilities associated with selection of factors in each of the four importance groups. Based on the dominant importance groups for factors, Time3 was the most important factor as it was ranked 1 in 72% of uncertainty realizations. A group of three factors (i.e., Temp3, Time1, and Temp1) was within Group 2, with mean ranks between 2.7 and 3.1. Factors in Group 3 included LP3 and LP1, with mean ranks of 6.0 and 6.2, respectively. Group 4 contained seven factors (GT1, GT3, GT2, Temp2, LP2, MD, and Time2) with mean ranks between 11.0 and 12.3.

1520 Mokhtari and Frey

Table IV. Summary of the ANOVA Results for Two-Dimensional Variability Simulation for Different Uncertainty Realizations

| Factor^a | Mean F Value | Range of F Values [min, max]^b | Frequency^c (%) | Mean Rank^d | Range of Rank |
|---|---|---|---|---|---|
| Temp1 | 103.0 | 2.0, 1233.0 | 98 | 3.1 | 1–8 |
| Temp2 | 1.0 | 0.0, 14.6 | 10 | 11.2 | 5–13 |
| Temp3 | 118.9 | 16.9, 689.3 | 100 | 2.7 | 1–4 |
| Time1 | 99.0 | 6.8, 495.4 | 100 | 2.8 | 1–4 |
| Time2 | 0.3 | 0.1, 2.7 | | 12.3 | 10–13 |
| Time3 | 188.4 | 31.6, 1448.8 | 100 | 1.5 | 1–4 |
| MD | 1.1 | 0.1, 5.9 | 6 | 11.7 | 5–13 |
| LP1 | 4.8 | 0.6, 18.1 | 77 | 6.2 | 4–11 |
| LP2 | 1.1 | 0.1, 7.4 | 6 | 11.8 | 5–12 |
| LP3 | 4.6 | 0.3, 14.4 | 87 | 6.0 | 4–12 |
| GT1 | 2.0 | 0.1, 47.3 | 14 | 11.0 | 5–13 |
| GT2 | 1.0 | 0.1, 4.4 | 10 | 11.5 | 5–13 |
| GT3 | 1.4 | 0.1, 9.4 | 11 | 11.1 | 5–13 |
| Time1 × Temp1 | 66.2 | 10.5, 339.8 | 100 | | |
| Time2 × Temp2 | 0.7 | 0.1, 4.5 | 2 | | |
| Time3 × Temp3 | 122.8 | 46.4, 839.9 | 100 | | |

^a The abbreviations used for factors in this table are the same as those defined in Table I.
^b Represents minimum and maximum F values estimated for each factor in 100 uncertainty realizations.
^c The percentage of the uncertainty realizations for which the F values were statistically significant.
^d Arithmetic average of 100 ranks for each factor.

The interaction effect between storage temperature and storage time at Stage 2 was unimportant. The interaction effect between storage time and storage temperature at home had higher importance than the interaction between similar factors at retail stores. The mean F values estimated for these two interaction effects differed by a ratio of approximately 2. Although the interaction effects were not considered in the overall ranking in Table IV, their relative importance can be compared with main effects using their mean F values, similar to an approach presented by Rose et al.(43)

Fig. 4. Probabilities associated with identification of each factor in an importance group based on ANOVA. Group 1: The most important factor; Group 2: Factors with secondary importance; Group 3: Factors with minor importance; and Group 4: Unimportant factors.

Factors contributing to the two statistically significant interaction effects had relatively large mean F values and were categorized in the first two importance groups. The interaction effect between Time3 and Temp3 had a mean F value approximately the same as that of Temp3. Thus, the interaction had comparable importance with Temp3.

4.2.2. Diagnostic Check for Results Based on ANOVA

As a diagnostic check on the results, R² was estimated for the ANOVA results for each uncertainty realization. Thus, 100 values of R² were estimated, representing the amount of variability in the response variable, Y, captured by the ANOVA model at each uncertainty realization. Two ANOVA models were considered based on: (1) only the main effects of all factors; and (2) the main effects of all factors along with interaction effects given in Table IV. The comparison of the two cases provided insight with respect to improvement in the amount of the variability in the response captured by the ANOVA model when interactions were also considered.

Fig. 5. Cumulative probability distribution of R² values in the two-dimensional analysis for a model including only main effects and a model including both main and interaction effects.

Fig. 5 shows CDF graphs for R² values estimated in 100 uncertainty realizations for the two ANOVA models. For an ANOVA model based only on the main effects, the 95% probability range of R² values was between 0.47 and 0.71 with a mean value of 0.62. For an ANOVA model based on all of the main effects and selected interactions, the 95% probability range of R² values was between 0.78 and 0.93 with a mean value of 0.87.
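The R² comparison between a main-effects model and a model with an added interaction term can be illustrated with a small sketch. It assumes two integer-coded factors and synthetic data with a multiplicative response; the names `time_lvl`, `temp_lvl`, and `anova_r2` are hypothetical, and the ANOVA fit is expressed as least squares on indicator columns, which gives the same R² for a fixed-effects model.

```python
import numpy as np

def one_hot(levels):
    """Indicator (dummy) columns for an integer-coded factor."""
    levels = np.asarray(levels)
    cats = np.unique(levels)
    return (levels[:, None] == cats[None, :]).astype(float)

def r_squared(design, y):
    """R^2 = 1 - SSE/SST for a least-squares fit of y on the design columns."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - float(resid @ resid) / float(sst)

def anova_r2(time_lvl, temp_lvl, y):
    """Return (R^2 main effects only, R^2 main effects plus interaction)."""
    y = np.asarray(y, dtype=float)
    a, b = one_hot(time_lvl), one_hot(temp_lvl)
    main = np.hstack([np.ones((len(y), 1)), a, b])
    # One indicator column per (time level, temperature level) cell
    inter = (a[:, :, None] * b[:, None, :]).reshape(len(y), -1)
    full = np.hstack([main, inter])
    return r_squared(main, y), r_squared(full, y)

# Synthetic stand-in for one uncertainty realization (not the paper's data)
rng = np.random.default_rng(0)
time_lvl = rng.integers(0, 4, size=2000)
temp_lvl = rng.integers(0, 3, size=2000)
y = time_lvl * temp_lvl + rng.normal(0.0, 0.5, size=2000)
r2_main, r2_full = anova_r2(time_lvl, temp_lvl, y)
print(f"main effects: {r2_main:.2f}, with interaction: {r2_full:.2f}")
```

Repeating this calculation for each uncertainty realization would yield the two sets of R² values whose distributions are compared in Fig. 5.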
On average, interaction effects between storage time and storage temperature in Stages 1 to 3 therefore accounted for 25% of the total variability in the response variable (the difference between the mean R² values of 0.87 and 0.62). Because the ANOVA model with interaction terms accounted for a substantial amount of variability in the response, ranks based on the F values were deemed to be reliable.

4.2.3. Rankings Based on Pearson and Spearman Correlation Analyses

Pearson and Spearman correlation analyses were selected as conventional sensitivity analysis methods for comparison with ANOVA. Inputs were ranked based on the absolute values of statistically significant correlation coefficients with a significance level of 5%. Table V summarizes mean ranks, range of ranks, and the percentage of the uncertainty realizations that produced a statistically significant correlation coefficient. These methods are not able to provide insight with respect to interactions between inputs. Similar to ANOVA results, inputs were grouped into four categories based on their ranks.

The probabilities associated with selection of inputs in each of the four importance groups are shown in Fig. 6. Inputs within a given group were typically different from those based on the ANOVA results. Some inputs implied to be sensitive based on PCC and SCC results (e.g., GT3 and LP3) were not found to be sensitive based on ANOVA. Among statistically significant inputs, there was not a clear agreement between the ranks based on the ANOVA results and those from Pearson and Spearman correlation analyses. For example, while ANOVA selected Time3 as the most important factor, Temp3 and Time1 were selected as the most important inputs based on PCC and SCC, respectively.

Table VI presents the results from top-down correlation analysis for pairwise comparisons of the results for PCC, SCC, and ANOVA, based on the two-dimensional probabilistic simulation.
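The formula for the top-down correlation is not given in this section. The sketch below assumes it is the Savage-score correlation of Iman and Conover, in which rank i among n inputs receives the score S_i = 1/i + 1/(i+1) + ... + 1/n, so that disagreement among top-ranked inputs is penalized more than disagreement among low-ranked ones. The function names are hypothetical, and ties are not handled.

```python
import numpy as np

def savage_scores(ranks):
    """Savage score for each rank: S_i = sum_{j=i}^{n} 1/j (rank 1 = most important).
    Assumes `ranks` is a permutation of 1..n with no ties."""
    ranks = np.asarray(ranks, dtype=int)
    n = ranks.size
    # tail[i-1] holds sum_{j=i}^{n} 1/j
    tail = np.cumsum(1.0 / np.arange(n, 0, -1))[::-1]
    return tail[ranks - 1]

def top_down_correlation(ranks_a, ranks_b):
    """Pearson correlation between the Savage scores of two rankings."""
    sa, sb = savage_scores(ranks_a), savage_scores(ranks_b)
    return float(np.corrcoef(sa, sb)[0, 1])

# Hypothetical rankings of 13 inputs from two methods
base = list(range(1, 14))
swap_top = [2, 1] + base[2:]          # methods disagree on the top two inputs
swap_bottom = base[:11] + [13, 12]    # methods disagree on the bottom two inputs
print(top_down_correlation(base, swap_top))
print(top_down_correlation(base, swap_bottom))
```

Under this assumed definition, swapping the two most important inputs lowers the coefficient far more than swapping the two least important ones, which is the property that makes the measure suited to comparing rankings of key inputs.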
For each uncertainty realization, ranks based on each pairwise combination of sensitivity analysis methods were compared and the top-down correlation coefficients between ranks were calculated. The top-down correlation results were typically lowest for comparisons of ANOVA to PCC and of ANOVA to SCC, implying that ANOVA leads to substantially different rankings of the importance of inputs compared to the two correlation-based methods. However, stronger agreement was observed when comparing PCC and SCC, including a few uncertainty realizations in which the top-down correlation was greater than 0.8.

Table V. Summary of Pearson and Spearman Correlation Analyses Results for Two-Dimensional Variability Simulation for Different Uncertainty Realizations

| Inputs^a | Pearson: Frequency^b (%) | Pearson: Mean Rank | Pearson: Range of Rank | Spearman: Frequency^c (%) | Spearman: Mean Rank | Spearman: Range of Rank |
|---|---|---|---|---|---|---|
| Temp1 | 93 | 4.4 | 1–12 | 97 | 6.9 | 3–12 |
| Temp2 | 2 | 11.0 | 8–13 | 7 | 11.1 | 9–13 |
| Temp3 | 100 | 1.8 | 1–6 | 100 | 5.1 | 3–8 |
| Time1 | 91 | 4.4 | 1–10 | 100 | 1.7 | 1–5 |
| Time2 | 2 | 10.7 | 7–13 | 6 | 10.9 | 7–13 |
| Time3 | 100 | 3.8 | 1–8 | 100 | 2.0 | 1–6 |
| MD | 3 | 10.8 | 9–13 | 1 | 10.9 | 9–13 |
| LP1 | 81 | 7.0 | 3–13 | 99 | 4.8 | 2–8 |
| LP2 | 5 | 10.5 | 6–13 | 8 | 10.8 | 7–13 |
| LP3 | 99 | 4.9 | 2–11 | 100 | 3.3 | 2–8 |
| GT1 | 80 | 6.8 | 2–13 | 94 | 7.3 | 4–12 |
| GT2 | 1 | 11.3 | 9–13 | 4 | 11.2 | 10–13 |
| GT3 | 99 | 3.7 | 2–9 | 100 | 5.2 | 3–8 |

^a The abbreviations used for factors in this table are the same as those defined in Table I.
^b The percentage of the uncertainty realizations for which the F values were statistically significant.
^c The percentage of uncertainty realizations for which coefficients were statistically significant.

Fig. 6. Probabilities associated with identification of each input in an importance group based on: (a) Pearson correlation; and (b) Spearman correlation. Group 1: The most important input; Group 2: Inputs with secondary importance; Group 3: Inputs with minor importance; and Group 4: Unimportant inputs.

4.2.4. Summary of the Results for the Variability Analysis for Different Uncertainty Realizations

The key insights and findings based on analysis of variability for different uncertainty realizations include:

• There is a substantial range of uncertainty in the relative importance of inputs due to uncertainty in the probability distribution of inputs.
• Although the ranking of inputs varied among uncertainty realizations, inputs could be classified into four groups based on their importance. The importance groups were robust to uncertainty in inputs, and hence can be used confidently for decision making.
• The results based on Pearson and Spearman correlation typically were different compared to those based on ANOVA with respect to identification of key important inputs. However, the two correlation-based methods identified similar results with respect to unimportant inputs compared to ANOVA.

Table VI. Top-Down Correlation Matrix for Input Rankings with Different Sensitivity Analysis Methods

Top-Down Correlation Results^b

| Method^a | ANOVA | PCA | SCA |
|---|---|---|---|
| ANOVA: mean | − | 0.04 | 0.10 |
| ANOVA: 95% probability range | | −0.34, 0.36 | −0.28, 0.45 |
| PCA: mean | | | 0.59 |
| PCA: 95% probability range | | | 0.26, 0.87 |
| PCA: number of times > 0.8 | | | 11 |

^a ANOVA: analysis of variance; PCA: Pearson correlation analysis; SCA: Spearman correlation analysis.
^b For each pair of methods, mean, 95% probability range, and number of times that the top-down correlation coefficient was larger than 0.8 in uncertainty realizations are given.

Table VII. Summary of the ANOVA Results for Comingled Analysis of Variability and Uncertainty R