
arises from the fact that OR and AND gates in fault trees correspond exactly to the minimum and maximum operators, respectively, which we have discussed in the previous chapters. Hence, they also correspond to the choice ($+$) and parallel ($\parallel$) operators of CCC, respectively. Let $P_i$ denote a CCC process describing the time to failure of processor $P_i$, for $i = 1, 2, 3$. Further, let $M_i$ be a CCC process describing the time to failure of memory module $M_i$, for $i = 1, 2$, and similarly let $B$ be the process describing the time to failure of the bus $B$. Then the fault tree model in Figure 7.2 can be expressed by the CCC process

$(P_1 \parallel P_2 \parallel P_3) + (M_1 \parallel M_2) + B$.    (7.1)

To analyze the reliability of the 3P2M model, we conduct several experiments. In each experiment, the time to failure of each component is governed by an Erlang distribution with a particular number of phases (states). The mean values of the Erlang distributions governing each component, however, are kept the same in all experiments, which means that the rates must be adjusted accordingly. The mean failure times of each processor, each memory module, and the bus are set to 5, 3, and 7 years, respectively. Table 7.1 lists the parameters of the Erlang distributions used in the experiments.
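To make the correspondence concrete, the sketch below estimates the failure-time distribution of Equation (7.1) by simulation: each AND gate becomes a maximum and the top-level OR gate a minimum over Erlang-distributed component lifetimes. This is only an illustration under the stated means (5, 3, and 7 years); the helper name erlang and the Monte Carlo approach are ours, not the CCC/CTMC machinery analyzed in this chapter.

```python
# Illustrative Monte Carlo sketch of Equation (7.1): OR = minimum, AND = maximum.
import numpy as np

rng = np.random.default_rng(0)
N = 200_000   # number of simulated system lifetimes
k = 10        # number of phases of each Erlang distribution

def erlang(mean, phases, size):
    # Erlang(rate = phases/mean, phases) sampled as a Gamma distribution,
    # so the mean stays at `mean` regardless of the number of phases.
    return rng.gamma(phases, mean / phases, size=size)

P = erlang(5.0, k, (3, N))   # processors P1, P2, P3
M = erlang(3.0, k, (2, N))   # memory modules M1, M2
B = erlang(7.0, k, N)        # bus B

# (P1 || P2 || P3) + (M1 || M2) + B  <=>  min(max(P1,P2,P3), max(M1,M2), B)
ttf = np.minimum(np.minimum(P.max(axis=0), M.max(axis=0)), B)
print(f"estimated mean time to failure: {ttf.mean():.2f} years")
print(f"estimated P(failure within 3 years): {(ttf <= 3.0).mean():.3f}")
```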

Table 7.1: Erlang Distributions Used in the Experiments

Phase    Processors         Memory Modules     Bus
10       Erl(10/5, 10)      Erl(10/3, 10)      Erl(10/7, 10)
100      Erl(100/5, 100)    Erl(100/3, 100)    Erl(100/7, 100)
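The rates in Table 7.1 follow directly from the requirement that the means stay fixed: an Erlang distribution Erl($\lambda$, $k$) with rate $\lambda$ and $k$ phases has mean $k/\lambda$, so fixing the mean $\mu$ forces $\lambda = k/\mu$. A small sketch that reproduces the entries (the Fraction values print in reduced form):

```python
from fractions import Fraction

# For Erl(rate, k) with mean k/rate, keeping the mean mu fixed while varying
# the number of phases k requires rate = k/mu.
means = {"processors": 5, "memories": 3, "bus": 7}   # mean failure times (years)
for k in (1, 10, 100):
    entries = ", ".join(f"{name} ~ Erl({Fraction(k, mu)}, {k})" for name, mu in means.items())
    print(f"k = {k:3d}: {entries}")
```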

Table 7.2 summarizes the results of the experiments. We have six 3P2M models, in which we vary the number of phases of the Erlang distributions governing the basic events, ranging from 1, which corresponds to exponential distributions, to 100. The second column of the table (Original) gives the number of states in the resulting CTMC models when they are generated without any size reduction whatsoever. The third column (Inter.) corresponds to the number of states in the largest intermediate CTMC model the reduction algorithm encounters while minimizing each of the six 3P2M models. Recall that after carrying out each operation in Equation (7.1), the reduction algorithm can be used to reduce the resulting intermediate model. In this way, the size of the results of subsequent operations can be kept small. The state spaces shown in this column are usually the intermediate results prior to the last operation, which in this case is the last choice operation. The fourth column (Final) corresponds to the number of states in the final CTMC models. Compared to the original state spaces, the final state spaces are orders of magnitude smaller. While the size of an original model grows multiplicatively in the sizes of its components, the size of a reduced representation grows additively in the sizes of its components.
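The multiplicative growth can be seen directly in the standard Kronecker-style constructions for the minimum and maximum of two PH representations $(\alpha, A)$ of size $n$ and $(\beta, B)$ of size $m$: the minimum needs $n \cdot m$ states and the maximum $n \cdot m + n + m$ states under these textbook constructions. The sketch below only illustrates this effect; the function names (ph_min, ph_max, erlang_ph) are ours, and this is not the CCC implementation or the reduction algorithm evaluated here.

```python
# Textbook Kronecker constructions for min/max of two PH representations,
# shown only to illustrate the multiplicative state-space growth.
import numpy as np

def ph_min(alpha, A, beta, B):
    """Race of two PH distributions: absorbed as soon as either is (n*m states)."""
    n, m = len(alpha), len(beta)
    T = np.kron(A, np.eye(m)) + np.kron(np.eye(n), B)   # Kronecker sum
    return np.kron(alpha, beta), T

def ph_max(alpha, A, beta, B):
    """Both must be absorbed: product block plus one block per lone survivor."""
    n, m = len(alpha), len(beta)
    a = -A @ np.ones(n)   # exit rate vector of A
    b = -B @ np.ones(m)   # exit rate vector of B
    top = np.hstack([np.kron(A, np.eye(m)) + np.kron(np.eye(n), B),
                     np.kron(np.eye(n), b.reshape(m, 1)),
                     np.kron(a.reshape(n, 1), np.eye(m))])
    mid = np.hstack([np.zeros((n, n * m)), A, np.zeros((n, m))])
    bot = np.hstack([np.zeros((m, n * m)), np.zeros((m, n)), B])
    return np.concatenate([np.kron(alpha, beta), np.zeros(n + m)]), np.vstack([top, mid, bot])

def erlang_ph(k, rate):
    """Erlang(rate, k) as a chain of k transient states."""
    alpha = np.zeros(k); alpha[0] = 1.0
    A = -rate * np.eye(k) + rate * np.eye(k, k=1)
    return alpha, A

aP, AP = erlang_ph(10, 10 / 5)   # one processor, mean 5 years
aM, AM = erlang_ph(10, 10 / 3)   # one memory module, mean 3 years
_, Tmin = ph_min(aP, AP, aM, AM)
_, Tmax = ph_max(aP, AP, aM, AM)
print(Tmin.shape, Tmax.shape)    # (100, 100) and (120, 120)
```

Composing all components of a 3P2M model this way, without reducing after each operation, is what makes the Original state spaces grow multiplicatively, as noted above.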

Table 7.2: State Space Reduction of the 3P2M Models

Phase    Original    Inter.    Final    Comp. Time (sec.)
                                        SPA    Red.    Proc.

The computation times (in seconds) for transforming, reducing, and composing the models are shown in the last three columns of Table 7.2. In the “SPA” column, we provide the computation time spent to transform the models to ordered bidiagonal representations. The “Red.” column describes the computation time spent by the reduction algorithm in reducing the models, namely eliminating the removable states once SPA is completed. The rest of the computation time is spent on other processing, such as building all components, carrying out the minimum and maximum operations, and storing the results in files; this is listed in the column “Proc.”. Compared to the SPA and reduction times, the processing time is in most cases negligible. SPA, on the other hand, consumes most of the computation time. The computation times required by both SPA and reduction grow extremely fast, faster than $O(n^3)$. This is due to the fact that the implementation uses rational numbers: the storage required by these rational numbers grows over the course of the computation, and hence so does the time needed to process them.
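The blow-up caused by exact rational arithmetic is easy to reproduce in isolation. The sketch below is not the SPA implementation; it merely runs plain Gaussian elimination over Python Fraction values and reports how many digits the pivots accumulate, which is the same effect that makes each later step of an exact computation more expensive.

```python
# Illustration of growing exact-rational representations (not the SPA code):
# repeated eliminations make numerators and denominators longer, so every
# arithmetic operation on them becomes slower as the computation proceeds.
from fractions import Fraction
import random

random.seed(1)
n = 40
M = [[Fraction(random.randint(1, 9), random.randint(1, 9)) for _ in range(n)]
     for _ in range(n)]
for i in range(n):
    M[i][i] += 9 * n   # strong diagonal: keeps all pivots nonzero

for k in range(n - 1):
    for i in range(k + 1, n):
        factor = M[i][k] / M[k][k]
        for j in range(k, n):
            M[i][j] -= factor * M[k][j]
    if k % 10 == 0:
        digits = len(str(M[k + 1][k + 1].numerator))
        print(f"after elimination step {k:2d}: next pivot's numerator has {digits} digits")
```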

With the current implementation of the reduction algorithm, we have reached the limit of its scalability in dealing with 3P2M models or similar models that involve the minimum and maximum of highly structured APH representations. The computation time of SPA for the largest model in Table 7.2 is already more than 60 hours. As a rule of thumb, the algorithm should only be used when the largest intermediate model has no more than around 300,000 states.

Figure 7.3 depicts the distributions of the time (in years) to failure in the six 3P2M models. The reliability of the system is not actually improved by introducing more states into each component's Erlang distribution. Instead, the failure time becomes more “precise”: the probability mass concentrates in the range of 3 to 4 years, approaching 3 years as the number of states becomes larger. This is to be expected, as each memory module fails with a mean time of 3 years, and the top-level event of the fault tree is an OR gate, which corresponds to the minimum operation.
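The concentration effect has a simple explanation: an Erlang distribution with fixed mean $\mu$ and $k$ phases has standard deviation $\mu/\sqrt{k}$, so every component's failure time becomes nearly deterministic as $k$ grows, and the minimum at the top-level OR gate is then governed by the memory subtree with its 3-year mean. A quick numerical check (illustrative only):

```python
import math

# Standard deviation of an Erlang distribution with mean mu and k phases: mu/sqrt(k).
mu_memory = 3.0   # mean failure time of a memory module, in years
for k in (1, 10, 20, 50, 100):
    print(f"k = {k:3d}: memory failure-time std = {mu_memory / math.sqrt(k):.2f} years")
```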

As a concluding remark, we would like to reemphasize the fact that minimal representations of the maximum of Erlang distributions grow exponentially with the number of components. As we have observed in the case study, for a particular number of components, the minimal representation grows additively in the sizes of the components. Hence, when the number of components is small, the size of the resulting minimal representations is actually manageable.

The 3P2M model we use is a modification of the original one, in which we replace a 2/3 VOTING gate on the processors by an AND gate. A VOTING gate can actually always be represented by a combination of AND and OR gates. In the case of