Higher Dimensions Algorithm for Fields

Fig. 12.10. (a) Terrain; (b) QMS complex. Iran’s Alburz mountain range borders the Caspian Sea (top flat area), and its Zagros mountain range shapes the Persian Gulf (left bottom).

in a filtration. All timings were done on a Sun Ultra-10 with a 440 MHz UltraSPARC IIi processor and 256 megabytes of RAM, running Solaris 8.

12.6.1 Implementation

I have implemented all the algorithms in Chapter 10, except for the algorithm for computing λ mod 2. My implementation differs from the exposition in three ways. The implemented component tree is a standard union-find data structure with the union by rank heuristic, but no path compression (Cormen et al., 1994). Edges are tagged with the union time, and the least common ancestor is found by two traversals up the tree. Although this structure has an O(n log n) construction time and an O(log n) query time, it is very simple to implement and extremely fast in practice. We also use a heuristic to reduce the number of p-linked cycles by storing bounding boxes at the roots of the augmented union-find data structure. Before enumerating p-linked cycles, we check whether the bounding box of the new cycle intersects those of the stored cycles. If not, the cycles cannot be linked, so there is no need for enumeration. Finally, we only simulate the barycentric subdivision by storing a direction with each edge.
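The first two differences can be illustrated with a minimal Python sketch. It is not the implementation timed in this section: the names `ComponentTree`, `merge_time`, and `boxes_intersect` are hypothetical, union times are assumed to arrive in increasing (filtration) order, and bounding boxes are assumed to be axis-aligned and given as pairs of corner tuples.

```python
import math


class ComponentTree:
    """Union-find with union by rank and no path compression.

    Each non-root node stores the time at which the edge to its parent
    was created. Without path compression these tags stay valid, and
    they increase along any path toward the root.
    """

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.time = [None] * n  # time[x] tags the edge x -> parent[x]

    def find(self, x):
        # Plain walk to the root; O(log n) deep thanks to union by rank.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b, t):
        """Merge the components of a and b at time t (assumed increasing)."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the lower-rank root below the other
        self.time[rb] = t
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

    def merge_time(self, a, b):
        """Time at which a and b first shared a component, else None.

        Uses two traversals up the tree: one from a, one from b.
        A node shares a component with itself at all times (-inf).
        """
        # First traversal: tag each ancestor of a with the edge used to reach it.
        reached = {a: -math.inf}
        x = a
        while self.parent[x] != x:
            reached[self.parent[x]] = self.time[x]
            x = self.parent[x]
        # Second traversal: climb from b until the two paths meet.
        x, last = b, -math.inf
        while x not in reached:
            if self.parent[x] == x:
                return None  # different roots: never merged
            last = self.time[x]
            x = self.parent[x]
        return max(reached[x], last)


def boxes_intersect(box_a, box_b):
    """True if two axis-aligned boxes (min_corner, max_corner) overlap or touch."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    return all(la <= hb and lb <= ha
               for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b))
```

In this sketch, a pair of cycles would be passed on to enumeration only when `boxes_intersect` returns true for their bounding boxes. Boxes that merely touch count as intersecting, so the filter stays conservative.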

12.6.2 Timings and Statistics

We use the molecular data from Section 12.1 for experimentation. To compute linking, we first need to compute the canonical basis for each data set. Tables 12.5 and 12.9 in Section 12.2 give the time to compute and canonize 1-cycles. Table 12.17 gives timings and statistics for the linking algorithm. The table shows that the component tree and augmented trees are very fast in practice. It also shows that the bounding box heuristic for reducing the number of p-linked pairs increases the computation time negligibly, if at all. The heuristic is quite successful, moreover, in reducing the number of pairs we have to check for linkage, eliminating 99.8% of the candidates for data set BOG. The differences in total time of computation reflect the basic structure of the data sets, as well as their sizes. TAO has a large computation time, for instance, as the average size of the p-linked surfaces is approximately 266.88 triangles, compared to about 1.88 triangles for data set 1hck.

Table 12.17. Number of 1-cycles, time in seconds to construct the component tree, and the computation time and number of p-linked pairs (alg), p-linked pairs with intersecting bounding boxes (heur), and links.

| data set | cycles | tree (s) | alg (s) | heur (s) | links (s) | pairs: alg | pairs: heur | pairs: links |
|---|---|---|---|---|---|---|---|---|
| hopf | 1,653 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 1 | 1 |
| SOD | 1,108 | 0.00 | 0.00 | 0.01 | 0.04 | 1,108 | 692 | |
| 1grm | 2,005 | 0.00 | 0.01 | 0.01 | 0.01 | 112 | | |
| möbius | 2,710 | 0.00 | 0.01 | 0.01 | 0.01 | | | |
| LTA | 7,176 | 0.02 | 0.06 | 0.12 | 1.77 | 296,998 | 6,320 | |
| 1mbn | 8,036 | 0.01 | 0.04 | 0.04 | 0.04 | 522 | 107 | |
| FAU | 8,293 | 0.01 | 0.12 | 0.07 | 0.07 | 1,255,396 | 34 | |
| KFI | 8,465 | 0.01 | 0.05 | 0.04 | 0.33 | 87,956 | 25,251 | |
| 1qb0 | 9,327 | 0.01 | 0.04 | 0.05 | 0.05 | 765 | 84 | |
| BOG | 10,106 | 0.01 | 0.05 | 0.04 | 0.08 | 170,338 | 305 | |
| 1hiv | 10,032 | 0.02 | 0.04 | 0.05 | 0.15 | 8,709 | 8,426 | |
| 1hck | 15,603 | 0.03 | 0.08 | 0.09 | 0.24 | 12,338 | 11,244 | |
| TAO | 52,902 | 0.12 | 0.38 | 0.42 | 6.83 | 98,543 | 4,455 | |

Discussion. The experiments demonstrate the feasibility of the algorithms for fast computation of linking. The experiments fail to detect any links in the protein data, however.
This is to be expected, as a protein consists of a single component, the primary structure of a protein being a single polypeptide chain of amino acids. Links, on the other hand, exist in different components by definition. Proteins may have “links” on their backbone, resulting from disulphide bonds between different residues. We need other techniques to intelligently detect such links.

13 Applications

In this chapter, we sample some of the potential applications of topology to problems in disparate scientific domains. Some of these questions motivated the theoretical concepts in this book to begin with, so it is reasonable to scrutinize the applicability of the work by revisiting the questions. I am not an expert in any of these domains. Rather, my objective is to demonstrate the utility of the theory, algorithms, and software by giving a few illustrative examples. My hope is that researchers in the fields will find these examples instructive and inspiring, and utilize the tools I have developed for scientific inquiry. Applied work is an on-going process by nature, so I present both current and future work in this chapter, including nonapplied future directions.

13.1 Computational Structural Biology

The field of computational structural biology explores the structural properties of molecules using combinatorial and numerical algorithms on computers. The initial impetus for the work in this book was understanding the topologies of proteins through homology. In this section, I look at three applications of my work to structural biology: feature detection, knot detection, and structure determination.

13.1.1 Topological Feature Detection

In Chapter 6, the small protein gramicidin A motivated our study of persistence, as we were incapable of differentiating between noise and feature in the data captured by homology. The primary topological structure of this protein is a single tunnel. Figure 13.2 illustrates the speed with which one may identify this tunnel using persistent homology. A glance at the topology map of the data set 1grm in Figure 13.1 tells the user that there is a single persistent 1-cycle.