13 Applications

In this chapter, we sample some of the potential applications of topology to problems in disparate scientific domains. Some of these questions motivated the theoretical concepts in this book to begin with, so it is reasonable to scrutinize the applicability of the work by revisiting the questions. I am not an expert in any of these domains. Rather, my objective is to demonstrate the utility of the theory, algorithms, and software by giving a few illustrative examples. My hope is that researchers in the fields will find these examples instructive and inspiring, and utilize the tools I have developed for scientific inquiry. Applied work is an on-going process by nature, so I present both current and future work in this chapter, including nonapplied future directions.

13.1 Computational Structural Biology

The field of computational structural biology explores the structural properties of molecules using combinatorial and numerical algorithms on computers. The initial impetus for the work in this book was understanding the topologies of proteins through homology. In this section, I look at three applications of my work to structural biology: feature detection, knot detection, and structure determination.

13.1.1 Topological Feature Detection

In Chapter 6, the small protein gramicidin A motivated our study of persistence, as we were incapable of differentiating between noise and feature in the data captured by homology. The primary topological structure of this protein is a single tunnel. Figure 13.2 illustrates the speed with which one may identify this tunnel using persistent homology. A glance at the topology map of the data set 1grm in Figure 13.1 tells the user that there is a single persistent 1-cycle. After clicking in the cycle's k-triangle, the user may view the complex from different viewpoints, as shown in Figure 13.2(a, b), and examine the 1-cycle and its spanning surface within the 1-skeleton of the persistent complex (c).

Fig. 13.1. Topology map of gramicidin A (1grm) with cross-hair at (1016, 4768).

Fig. 13.2. Detecting the topological feature of 1grm using CView. The user selects complex (1016, 4768) (a, b) and visualizes the complex's single tunnel (c). Panels: (a) $K^{1016,4768}$, top; (b) $K^{1016,4768}$, side; (c) 1-cycle and surface in 1-skeleton.

Not all molecular structures are as simple as this protein. The zeolite BOG, for example, has a richer topology map, as shown in Figure 13.3. Observe that the structure features two groups of highly persistent 1-cycles. Again, the user may choose to keep both groups of 1-cycles by selecting a point in the appropriate triangular region, as shown in Figure 13.4(a, b). The two sets of tunnels interact to produce a basis of 44 1-cycles. The user may elect to discard the set of 12 1-cycles by increasing persistence, as shown in Figure 13.4(c). The 8 longer-living tunnels (d), however, survive.

Fig. 13.3. Topology map of BOG.

Fig. 13.4. Two views of persistent complexes with index 4385. Increasing persistence from 15,000 to 21,000, we eliminate the first group of tunnels and preserve the second. Panels: (a) $K^{4385,15000}$, view 1; (b) $K^{4385,15000}$, view 2; (c) $K^{4385,21000}$, view 1; (d) $K^{4385,21000}$, view 2.

Zeolites are crystalline solids with very regular frameworks. This regularity of structure translates to simplicity of topology maps. Proteins, on the other hand, do not exhibit regular structure in general. As a consequence, their topology maps are not simple. Figure 13.5 shows the topology map of 1hck, as well as the graph of its persistent $\beta_1$ numbers. We can no longer identify the features immediately, as p-persistent cycles exist for almost every value of p. We were able to distinguish between noise and feature for BOG because there were groups of 1-cycles with persistence significantly higher than the other 1-cycles. These groups are easily recognizable in the histogram of the persistence of 1-cycles for BOG in Figure 13.6(a). We cannot perceive the same grouping in the histogram for 1hck (b), however. Persistence, in other words, is not a silver bullet. Rather, it is yet another tool for exploring the complex structure of proteins.

The examples above all use index-based persistence. Alternatively, one may examine structures using time-based persistence (see Section 6.1 for definitions). Currently, I have implemented algorithms for computing time-based persistent Betti numbers.
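The reasoning in this section can be illustrated with a small sketch. It assumes that the index-based persistence intervals of the 1-cycles are available as (birth, death) index pairs. Selecting a point (l, p) on the topology map then amounts to counting the k-triangles that contain it, that is, the cycles with birth <= l and l + p < death. A large gap in the sorted persistence values suggests a separation between noise and feature, as in the histogram for BOG. The function names and the toy intervals below are hypothetical; they are not part of CView and do not reproduce the data for 1grm, BOG, or 1hck.

```python
from typing import List, Optional, Tuple

# One 1-cycle: (birth index, death index) in the filtration (hypothetical format).
Interval = Tuple[int, int]


def persistent_betti(intervals: List[Interval], l: int, p: int) -> int:
    """Count the cycles whose k-triangle contains the point (l, p).

    A cycle born at index `birth` and destroyed at index `death`
    contributes when birth <= l and l + p < death.
    """
    return sum(1 for birth, death in intervals if birth <= l and l + p < death)


def largest_gap_threshold(intervals: List[Interval]) -> Optional[float]:
    """Return the midpoint of the largest gap between consecutive sorted
    persistence values, or None when there are fewer than two cycles.

    This is only a heuristic for separating noise from feature; it is
    meaningful when the persistence values form well-separated groups.
    """
    persistence = sorted(death - birth for birth, death in intervals)
    if len(persistence) < 2:
        return None
    # Pair each gap with the position of its lower endpoint.
    gaps = [
        (persistence[i + 1] - persistence[i], i)
        for i in range(len(persistence) - 1)
    ]
    _, i = max(gaps)
    return (persistence[i] + persistence[i + 1]) / 2


# Toy data: three short-lived cycles and one long-lived cycle.
intervals = [(10, 15), (20, 22), (30, 38), (5, 100)]

if __name__ == "__main__":
    print(persistent_betti(intervals, 30, 0))   # cycles alive at index 30
    print(persistent_betti(intervals, 30, 20))  # those surviving 20 more steps
    print(largest_gap_threshold(intervals))     # candidate persistence cutoff
```

When the persistence values do not form distinct groups, as the text describes for 1hck, the largest gap is not a reliable cutoff, and the threshold returned by this heuristic should be treated as a starting point for exploration rather than an answer.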