
Table 12.12. Field coefficients. The Betti numbers of K computed over field F and the time for the persistence algorithm. We use a separate implementation for Z_2 coefficients.

F         β_0   β_1   β_2   time (s)
Z_2       1     2     1     0.01
Z_3       1     1     0     0.23
Z_5       1     1     0     0.23
Z_3203    1     1     0     0.23
Q         1     1     0     0.50

over different coefficient sets. Similarly, we can compare sets of P-intervals from different computations to discover torsion in a persistence complex. Note that our algorithm's performance for this data set is about the same over arbitrary finite fields, as the coefficients do not get large. The computation over Q takes about twice as much time and space, since each rational is represented as two integers in GNU MP.
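To illustrate how comparing Betti numbers over different fields exposes torsion, the following minimal sketch computes Betti numbers from boundary-matrix ranks over Z_p. It is not the persistence algorithm timed above. It assumes, for illustration only, the standard cell structure of a Klein bottle (one vertex, two edges a and b, one face with boundary 2a), a space whose Betti numbers happen to agree with those in Table 12.12. The names rank_mod_p, betti_numbers, CELLS, and BOUNDARIES are hypothetical.

```python
def rank_mod_p(matrix, p):
    """Rank of an integer matrix over the finite field Z_p (p prime)."""
    rows = [[x % p for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # Look for a pivot: a not-yet-used row with a nonzero entry here.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        # Eliminate this column from every other row.
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def betti_numbers(cell_counts, boundaries, p):
    """beta_k = n_k - rank(d_k) - rank(d_{k+1}), with d_0 = d_{top+1} = 0.

    cell_counts[k] is the number of k-cells; boundaries[k-1] is the
    matrix of d_k, with one row per (k-1)-cell and one column per k-cell.
    """
    ranks = [0] + [rank_mod_p(d, p) for d in boundaries] + [0]
    return [n - ranks[k] - ranks[k + 1] for k, n in enumerate(cell_counts)]


# Assumed example space: Klein bottle with 1 vertex, 2 edges, 1 face.
CELLS = [1, 2, 1]
BOUNDARIES = [
    [[0, 0]],    # d_1: both edges are loops at the single vertex
    [[2], [0]],  # d_2: the face is glued along a + b + a - b = 2a
]

for p in (2, 3, 5, 3203):
    print(p, betti_numbers(CELLS, BOUNDARIES, p))
```

The entry 2 in the second boundary matrix vanishes modulo 2 but is invertible modulo any odd prime, so the rank, and with it β_1 and β_2, changes only over Z_2. A disagreement of this kind between fields is the signature of torsion.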

12.3.4 Higher Dimensions

We now examine the performance of this algorithm in higher dimensions using the large-scale time-varying data. Again, we give the filtration sizes and timings in Table 12.11. Figure 12.7a displays β_2 for data set J. We observe a large number of two-dimensional cycles (voids), as the co-dimension is 2. Persistence allows us to decompose this graph into the set of P-intervals. Although there are 730,692 P-intervals in dimension 2, most are empty, as the topological attribute is created and destroyed at the same function level. We draw the 502 nonempty P-intervals in Figure 12.7b. Note that the P-intervals represent a compact and general shape descriptor for arbitrary spaces.

For the large data sets, we do not compute persistence over alternate fields, as the computation requires in excess of 2 GB of memory. In the case of finite fields Z_p, we may restrict the prime p to be less than the maximum size of an integer. This is a reasonable restriction, as on most modern machines with 32-bit integers, it implies p < 2^32. Given this restriction, any coefficient will be less than p and representable as a 4-byte integer. The GNU MP exact integer format, on the other hand, requires at least 16 bytes for representing any integer.

[Figure 12.7: (a) graph of β_2 against f; (b) the P-intervals.]

Fig. 12.7. The data set J defines function f, the flow velocity, over the four-dimensional space-time manifold. We show the graph of β_2 as a function of f (top) and the 502 nonempty P-intervals in dimension 2. The amalgamation of these intervals gives the graph.
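The relationship between the P-intervals and the graph of β_2 described in this section can be sketched in a few lines. The example below assumes each P-interval is a half-open pair [birth, death) of function levels. The interval values are made up for illustration and are not taken from data set J, and the function names are hypothetical.

```python
def nonempty(intervals):
    """Keep intervals whose attribute survives past its creation level."""
    return [(birth, death) for birth, death in intervals if death > birth]


def betti_at(intervals, level):
    """Betti number at a level: how many [birth, death) intervals contain it."""
    return sum(1 for birth, death in intervals if birth <= level < death)


# Hypothetical dimension-2 P-intervals; two are empty (birth == death).
intervals = [(10, 10), (10, 40), (20, 30), (25, 25), (35, 90)]

alive = nonempty(intervals)
print(len(alive))                      # number of nonempty intervals

# Stacking ("amalgamating") the intervals recovers the Betti graph.
levels = [10, 20, 30, 40, 90]
print([betti_at(alive, f) for f in levels])
```

Empty intervals contribute nothing at any level, which is why discarding them leaves the graph unchanged while shrinking the descriptor considerably.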

12.4 Topological Simplification

In this section, we first present a case study of the five reordering algorithms described in Chapter 8 and illustrated in Figure 8.7. We then provide experimental evidence of the utility of the algorithms, as well as the rarity of basic and recursive conflicts. We end this chapter with visualizations of persistent complexes.

12.4.1 A Case Study

In this brief picturesque study, we show the effect of the reordering algorithms in the presence of conflicts. Figure 12.8a displays the k-triangles of the data set SOD. This zeolite does not contain any basic conflicts, but it does have 26 recursive conflicts. We are interested in the tip of the region of large overlapping 1-triangles, shown in Figure 12.8b. The remaining panels (c–l) show how this area changes with the different reordering algorithms in Figure 8.7. Note that the differences for the pseudo-triangle algorithm cancel, as each cycle is given its due influence according to its persistence. Consequently, we will use this algorithm as the default method for simplification.

12.4.2 Timings and Statistics

We have implemented all of the reordering algorithms for experimentation. The algorithms have the same basic structure and therefore take about the same time, so we only give the time taken for the pseudo-triangle algorithm in Table 12.13. All timings were done on a Sun Ultra-10 with a 440 MHz UltraSPARC IIi processor and 256 megabytes of RAM, running Solaris 8. Here, each complex is reordered with p equal to the size of the filtration. Generally, the reordering algorithms encounter the same number of conflicts, so we only list the number of basic and recursive conflicts for the pseudo-triangle algorithm in Table 12.13. The time taken for reordering correlates very well with the size of the filtration, as all algorithms make a single pass through the filter. A simplex may move multiple times during reordering, however, because of the recursive nature of the algorithms. The number of recursive conflicts is one indication of the complexity of the reordering. The table shows that the data sets with a large number of recursive conflicts, namely BOG, bearing, TAO, and bone, all have large reordering times.