(a) Terrain (b) QMS complex
Fig. 12.10. Iran’s Alburz mountain range borders the Caspian Sea (top flat area), and its Zagros mountain range shapes the Persian Gulf (left bottom).
in a filtration. All timings were done on a Sun Ultra-10 with a 440 MHz UltraSPARC IIi processor and 256 megabytes of RAM, running Solaris 8.
12.6.1 Implementation
I have implemented all the algorithms in Chapter 10, except for the algorithm for computing λ mod 2. My implementation differs from the exposition in three ways. The implemented component tree is a standard union-find data structure with the union by rank heuristic, but no path compression (Cormen et al., 1994). Edges are tagged with the union time, and the least common ancestor is found by two traversals up the tree. Although this structure has an O(n log n) construction time and an O(log n) query time, it is very simple to implement and extremely fast in practice. We also use a heuristic to reduce the number of p-linked cycles by storing bounding boxes at the roots of the augmented union-find data structure. Before enumerating p-linked cycles, we check to see if the bounding box of the new cycle intersects with that of the stored cycles. If not, the cycles cannot be linked, so there’s no need for enumeration. Finally, we only simulate the barycentric subdivision by storing a direction with each edge.
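As a rough illustration of the component tree described above, the following Python sketch implements union by rank without path compression and tags each parent edge with its union time. It is not the implementation used for the timings; the class name `TimedUnionFind` and the method `connect_time` are hypothetical, and the sketch assumes that unions are performed in nondecreasing time order, so that edge labels only grow along any path to a root.

```python
class TimedUnionFind:
    """Union by rank, no path compression; parent edges carry union times.

    Hypothetical sketch: assumes union() is called in nondecreasing time order.
    """

    def __init__(self, n):
        self.parent = list(range(n))  # parent[v] == v marks a root
        self.rank = [0] * n
        self.time = [None] * n        # time[v]: label of the edge v -> parent[v]

    def find(self, v):
        # Plain walk to the root; the tree is never restructured.
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def union(self, u, v, t):
        """Join the components of u and v at time t; return False if already joined."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        if self.rank[ru] > self.rank[rv]:
            ru, rv = rv, ru           # attach the lower-rank root below the other
        self.parent[ru] = rv
        self.time[ru] = t
        if self.rank[ru] == self.rank[rv]:
            self.rank[rv] += 1
        return True

    def connect_time(self, u, v):
        """Time at which u and v first share a component.

        Returns None if they are never joined, or if u == v (no union needed).
        """
        # First traversal: record each ancestor of u together with the label
        # of the edge used to reach it.
        reached = {u: None}
        x = u
        while self.parent[x] != x:
            reached[self.parent[x]] = self.time[x]
            x = self.parent[x]
        # Second traversal: climb from v until an ancestor of u is met.
        x, last = v, None
        while x not in reached:
            if self.parent[x] == x:
                return None           # reached a root of a different tree
            last = self.time[x]
            x = self.parent[x]
        # x is the least common ancestor; the larger of the two edge labels
        # entering it is the time the two components merged.
        labels = [t for t in (reached[x], last) if t is not None]
        return max(labels) if labels else None


uf = TimedUnionFind(5)
uf.union(0, 1, 1)
uf.union(2, 3, 2)
uf.union(1, 3, 3)
print(uf.connect_time(0, 2))
```

Because no path compression is applied, the edges and their labels are never altered after creation, which is what allows a query to read the merge time directly off the two root paths. Union by rank keeps those paths at logarithmic length.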
12.6.2 Timings and Statistics
We use the molecular data from Section 12.1 for experimentation. To compute linking, we first need to compute the canonical basis for each data set. Tables 12.5 and 12.9 in Section 12.2 give the time to compute and canonize 1-cycles. Table 12.17 gives timings and statistics for the linking algorithm. The table shows that the component tree and augmented trees are very fast in practice. It also shows that the bounding box heuristic for reducing the number of p-linked pairs increases the computation time negligibly, if at all. Moreover, the heuristic is quite successful in reducing the number of pairs we have to check for linkage, eliminating 99.8% of the candidates for data set BOG. The differences in total computation time reflect the basic structure of the data sets, as well as their sizes. TAO has a large computation time, for instance, as the average size of the p-linked surfaces is approximately 266.88 triangles, compared to about 1.88 triangles for data set 1hck.

Table 12.17. Number of 1-cycles, time in seconds to construct the component tree, and the computation time and number of p-linked pairs (alg), p-linked pairs with intersecting bounding boxes (heur), and links.

                      time in seconds                   pairs
         cycles   tree    alg   heur  links        alg    heur  links
hopf      1,653   0.00   0.00   0.00   0.00          1       1      1
SOD       1,108   0.00   0.00   0.01   0.04      1,108     692
1grm      2,005   0.00   0.01   0.01   0.01        112
möbius    2,710   0.00   0.01   0.01   0.01
LTA       7,176   0.02   0.06   0.12   1.77    296,998   6,320
1mbn      8,036   0.01   0.04   0.04   0.04        522     107
FAU       8,293   0.01   0.12   0.07   0.07  1,255,396      34
KFI       8,465   0.01   0.05   0.04   0.33     87,956  25,251
1qb0      9,327   0.01   0.04   0.05   0.05        765      84
BOG      10,106   0.01   0.05   0.04   0.08    170,338     305
1hiv     10,032   0.02   0.04   0.05   0.15      8,709   8,426
1hck     15,603   0.03   0.08   0.09   0.24     12,338  11,244
TAO      52,902   0.12   0.38   0.42   6.83     98,543   4,455
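The elimination rate quoted for BOG can be checked from the pair counts in Table 12.17 (170,338 p-linked pairs, of which 305 have intersecting bounding boxes). The sketch below, in Python, also shows one plausible form of the bounding box test itself. The function names and the representation of a box as a pair of corner tuples are assumptions made for illustration, not taken from the software described here.

```python
def boxes_intersect(a, b):
    """Axis-aligned box overlap test.

    Each box is (min_corner, max_corner), with corners given as coordinate
    tuples of equal length (an assumed representation). Boxes that merely
    touch count as intersecting, which keeps the filter conservative.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi))


def fraction_eliminated(pairs_alg, pairs_heur):
    """Share of p-linked pairs discarded by the bounding box filter."""
    return 1.0 - pairs_heur / pairs_alg


# Two unit-sized boxes that overlap, and two that are separated along x.
print(boxes_intersect(((0, 0, 0), (1, 1, 1)), ((0.5, 0.5, 0.5), (2, 2, 2))))
print(boxes_intersect(((0, 0, 0), (1, 1, 1)), ((2, 0, 0), (3, 1, 1))))

# BOG row of Table 12.17: 170,338 pairs before the filter, 305 after.
print(f"{fraction_eliminated(170_338, 305):.1%}")  # prints 99.8%
```

Disjoint bounding boxes imply that the two cycles cannot be linked, so skipping such pairs never discards a true link. Overlapping boxes still require the full enumeration.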
Discussion. The experiments demonstrate the feasibility of the algorithms for fast computation of linking. The experiments fail to detect any links in the
protein data, however. This is to be expected, as a protein consists of a single component, the primary structure of a protein being a single polypeptide chain
of amino acids. Links, on the other hand, exist in different components by definition. Proteins may have “links” on their backbone, resulting from disulphide
bonds between different residues. We need other techniques to intelligently detect such links.
13
Applications
In this chapter, we sample some of the potential applications of topology to problems in disparate scientific domains. Some of these questions motivated
the theoretical concepts in this book to begin with, so it is reasonable to scrutinize the applicability of the work by revisiting the questions. I am not an expert
in any of these domains. Rather, my objective is to demonstrate the utility of the theory, algorithms, and software by giving a few illustrative examples. My
hope is that researchers in the fields will find these examples instructive and inspiring, and utilize the tools I have developed for scientific inquiry. Applied
work is an on-going process by nature, so I present both current and future work in this chapter, including nonapplied future directions.
13.1 Computational Structural Biology
The field of computational structural biology explores the structural properties of molecules using combinatorial and numerical algorithms on computers. The
initial impetus for the work in this book was understanding the topologies of proteins through homology. In this section, I look at three applications of
my work to structural biology: feature detection, knot detection, and structure determination.
13.1.1 Topological Feature Detection
In Chapter 6, the small protein gramicidin A motivated our study of persistence, as we were incapable of differentiating between noise and feature in the
data captured by homology. The primary topological structure of this protein is a single tunnel. Figure 13.2 illustrates the speed with which one may identify
this tunnel using persistent homology. A glance at the topology map of the data set 1grm in Figure 13.1 tells the user that there is a single persistent 1-cycle.