An alignment equation for using mind maps to filter learning queries from Google

  Imran A. Zualkernan American University of Sharjah, UAE izualkernan@aus.edu

  Mohammed A. AbuJayyab American University of Sharjah, UAE mjayyab@gmail.com

  Yaser A. Ghanam American University of Sharjah, UAE yghanam@gmail.com

  Abstract Search engines like Google play a critical role in life-long learning. However, the query capabilities of such engines remain simple and often yield a result set that is too large. In addition, search engines like Google rely on page ranking algorithms that represent the “collective consciousness” of millions of users. Learning about specifics often involves context. This paper shows how mind maps can be used as a contextual mechanism to specify what needs to be learned and to filter and retrieve the relevant sources of learning from the internet. Specifically, a concept of “alignment” is introduced that retains only those information sources that are globally aligned to the mind map. The filter is implemented as a combinatorial optimization algorithm using Simulated Annealing.

1. Introduction

  As sem[...] and specialized repositories of learning objects (e.g., [2]) come of age to include structured information about resources on the internet, search engines like Google [3] remain the most successful search engines for life-long learning. Search engines typically work using page and link sim[...]

  It has been explicitly suggested that Google assigns “implicit” semantics to con[...] page ranking is based on millions of users, the implicit semantics assigned using these algorithms can be termed “social semantics.” However, how does one specify what an individual needs to learn in a specific context? One line of research addresses how effective answers to specific questions from a user can be formulated [...] used to specify “what needs to be learned” in a particular context. Mind maps for brain-storming have also been used as a [...]

  Mind ma[...] and ontologies have also been abstracted from existing documents to discover the underlying latent semantic [...] graphs (another graphical notation similar to mind maps) have also been suggested for use in “semantic [...] multiple techniques such as topological analysis, filtering, search and using a concept map to filter the results from a search engine like Google have also been proposed [13].

  The research presented in this paper relies on the topology of a mind map and the response set retrieved from a search engine like Google to arrive at a filtered set of results that are “aligned” with the mind map, hence providing a user-specific “contextual” search mechanism.

2. Formalization

  Given a general purpose search engine (e.g., Google) that operates over a collection of web pages and a mind map given by a directed graph $G(V, E)$, where each $v_i \in V$ represents a concept in the mind map, the goal of this research is to find a subset of web pages that are “aligned” with the mind map.

  More formally, a search engine returns an ordered response set $rs = \{P_1, \ldots, P_k\}$ representing potentially relevant pages. For a response set, $i < j$ implies that page $P_i$ is more relevant than $P_j$ within the first $k$ search results. A query based on the mind map returns a set of response sets $SRS = \{rs_1, rs_2, rs_3, \ldots, rs_l\}$, where each response set $rs_i$ is generated by a query of the form $v_r \ldots v_i$, where $v_r$ is a designated root node (or the primary starting concept) and all subsequent nodes in the query are on a path from $v_r$ to $v_i$ in graph $G$.

  A set of response sets $SRS$ with a response set bound of $k$ is said to be aligned with a concept map $G$ when it satisfies the alignment equation (1).
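  The query form used in the formalization (the root concept followed by the concepts on a path to the target concept) can be illustrated with a short sketch. It is not taken from the paper's implementation: the function name root_path_queries, the edge list, and the choice to join concept labels with spaces are assumptions made for illustration, and only “java” and “web browser” are concept names that appear in the mind map discussed in this paper.

```python
from collections import deque


def root_path_queries(edges, root):
    """Map each reachable concept to a query string that lists the
    concepts on a path from the root to that concept, root first."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    # Breadth-first search records one parent per concept, which gives
    # a shortest path from the root to every reachable concept.
    parent_of = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in parent_of:
                parent_of[child] = node
                queue.append(child)

    queries = {}
    for node in parent_of:
        path = []
        cur = node
        while cur is not None:  # walk back up to the root
            path.append(cur)
            cur = parent_of[cur]
        queries[node] = " ".join(reversed(path))
    return queries


# Hypothetical fragment of a mind map rooted at "java".
edges = [("java", "applet"), ("applet", "web browser"), ("java", "JVM")]
for concept, query in root_path_queries(edges, "java").items():
    print(f"{concept!r} -> {query!r}")
```

  Each resulting string would be submitted to the search engine, and the first $k$ results kept as the response set for that concept.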

  Alignment requires that the following equation hold for each $rs_i \in SRS$. For each $v_i, v_j, v_k \in G$:

  $$d(v_i, v_j) < d(v_i, v_k) \Rightarrow D(rs_i, rs_j) < D(rs_i, rs_k) \qquad (1)$$

  where $d(v_i, v_j)$ is defined to be the number of hops between $v_i$ and $v_j$. $D$, on the other hand, is measured using the Levenshtein or edit distance [14] between two response sets $rs_i$ and $rs_j$. The edit distance measures the number of changes required to change one sequence into another.

  Intuitively, the alignment equation (1) means that if the distance between a concept A and a concept B in the mind map is less than the distance between the concept A and another concept C, then alignment dictates that the distance between the response set of A and the response set of B should also be less than the distance between the response set of A and the response set of C. This should be true of all such combinations of concepts in the mind map.

  Deriving an aligned SRS from an initial set of response sets is a combinatorial optimization problem that can be solved using Simulated Annealing [15]. Simulated annealing uses an objective function of cost or Energy $E$ to guide the search for a solution. The Energy $E$ for this problem is the number of violations of equation (1); for an ideal solution, the number of violations should be zero. The simulated annealing algorithm for this problem was implemented in Java and used the Google API [16] to connect and retrieve response sets from Google.

3. Evaluation

  The mind map shown in Figure 1 is a subset of the mind map constructed for an overview of the Java technology by Sun Microsystems [17]. The objective of the mind map shown in Figure 1 is to provide a broad overview of java and related technologies. This mind map was used to evaluate the alignment algorithm described earlier. A response set bound of 10 was used for this experiment; only the first ten results from each query were considered.

  Figure 1. A mind map for an overview of java and related technologies

  Starting from 38 initial violations of equation (1), the simulated annealing algorithm converged to a minimum Energy of 2 (meaning 2 violations of equation (1)) after 50,000 iterations. Since the mind map had a total of 24 concepts, with a response set bound of 10, the initial set of response sets resulted in 24*10 = 240 page references. Simulated annealing reduced this set to 121 page references that are actually “aligned” to the mind map. So the total number of pages was narrowed by about half (121/240 = 50.4%). The average number of pages per concept was reduced from 10 to 5.04 (SD = 1.62). The maximum reduction was to 2 pages and the minimum to 8 pages.

  There was a correlation (r = 0.462; p<0.02 using Pearson’s correlation) between the distance from the root and the number of pages left after filtering for each concept. For example, for the root node (see Java in Figure 1), the number of pages was narrowed from 10 to 2, while for a node like “Web browser”, the number of relevant pages was reduced from 10 to 7 only. This is consistent with equation (1), because the closer a node is to the root, the more constraints it needs to satisfy.

  Table 1 shows the original ten pages retrieved for the root concept “java.” Table 1 also shows the page rank (from Google), the number of sites linking in to the page and the traffic rank (from www.alexa.com) for each page.

  The alignment algorithm reduced these ten pages to just two pages (www.java.com and http://www.apl.jhu.edu/~hall/java/). The page http://www.apl.jhu.edu/~hall/java/ had the lowest page rank, the lowest traffic ranking and a low number of sites linking in. However, this page was selected by the algorithm. A quick review of the contents of the page shows that it is exactly what the mind map was designed for; the page contains an overview and links to all aspects of java technology, ranging from FAQs and tutorials to documentation, forums, resources and developer links. The other page chosen was www.java.com, which includes a download of the java virtual machine from Sun Microsystems.
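  To make the Energy values reported in this evaluation concrete, the following minimal sketch shows one way violations of equation (1) could be counted and then reduced by simulated annealing. It is an illustration under stated assumptions, not the Java implementation used for the experiment: hop distance is computed by breadth-first search over edges treated as undirected, $D$ is a Levenshtein distance over lists of page identifiers, a move toggles a single page in a single response set, and cooling is linear. The function names (hops, edit_distance, energy, anneal), the three-concept graph and the page names are all hypothetical.

```python
import math
import random
from collections import deque
from itertools import permutations


def hops(graph, src):
    """Hop counts from src by breadth-first search (edges taken as undirected)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def edit_distance(a, b):
    """Levenshtein distance between two sequences of page identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def energy(graph, srs):
    """Number of concept triples (i, j, k) that violate equation (1)."""
    d = {v: hops(graph, v) for v in graph}
    count = 0
    for i, j, k in permutations(graph, 3):
        if d[i][j] < d[i][k]:
            if not edit_distance(srs[i], srs[j]) < edit_distance(srs[i], srs[k]):
                count += 1
    return count


def anneal(graph, full, steps=500, t0=1.0, seed=0):
    """Drop pages from response sets to lower the energy (assumed move set)."""
    rng = random.Random(seed)
    concepts = sorted(full)
    keep = {v: [True] * len(full[v]) for v in concepts}

    def filtered():
        return {v: [p for p, on in zip(full[v], keep[v]) if on] for v in concepts}

    e = energy(graph, filtered())
    best, best_e = filtered(), e
    for step in range(steps):
        if best_e == 0:
            break
        temperature = t0 * (1 - step / steps) + 1e-9  # linear cooling
        v = rng.choice(concepts)
        idx = rng.randrange(len(full[v]))
        if keep[v][idx] and sum(keep[v]) == 1:
            continue  # never empty a response set
        keep[v][idx] = not keep[v][idx]
        new_e = energy(graph, filtered())
        if new_e <= e or rng.random() < math.exp((e - new_e) / temperature):
            e = new_e
            if e < best_e:
                best, best_e = filtered(), e
        else:
            keep[v][idx] = not keep[v][idx]  # reject the move
    return best, best_e


# Hypothetical chain of three concepts A - B - C with two pages each.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
full = {"A": ["p1", "p2"], "B": ["p1", "p3"], "C": ["p4", "p5"]}
result, final_energy = anneal(graph, full)
print("initial energy:", energy(graph, full), "final energy:", final_energy)
print(result)
```

  Recomputing the full energy after every move keeps the sketch short; with 24 concepts and 50,000 iterations, as in the experiment, an incremental update of only the affected triples would likely be preferable.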

  Table 1. Original response set for Java (k=10)

  Page                                         Page Rank   Sites Linking   Traffic Rank
  http://java.sun.com/                         9           5946            486
  http://www.java.com/ *                       9           1463            1046
  http://www.java.com/en/download/manual.jsp   8           1463            1046
  http://www.anfyteam.com/java/                8           848             36507
  http://javaboutique.internet.com/            7           721             800
  http://www.javaworld.com/                    8           953             12309
  http://java.net/                             8           125             12636
  http://www.microsoft.com/mscorp/java/        8           53509           14
  http://www.developer.com/java/               7           927             14144
  http://www.apl.jhu.edu/~hall/java/ *         6           234             [...]

  * selected by the alignment algorithm

  It would seem that http://java.sun.com should have been the logical choice for a broad overview of java technology. However, the information provided on this site is analogous to that in http://www.apl.jhu.edu/~hall/java/. Indeed, in another run of the algorithm, the reduced set of pages for “java” included http://sun.java.com and http://sun.java.com/getjava (for java virtual machine download).

  Together, these two pages provide both the technology for running Java and a broad overview as intended in the mind map.

4. Conclusion

  This paper introduced the alignment of a mind map with a response set as a way to filter search results from a search engine such as Google. The paper also presented an implementation of this concept using Simulated Annealing. A case study showing how a mind map used in this way can find an appropriate subset of the relevant web pages was also presented. One limitation of the work is that computing both the edit distance and simulated annealing requires significant computational resources. Currently, work is being done to transfer the algorithm to run under a distributed framework.

5. References

  [1] Passin, T. B., Explorer's guide to the semantic web, Manning Publications, 2004.

  [2] http://www.merlot.org/Home.po, Accessed Feb 3, 2006.

  [3] http://www.google.com, Accessed Feb 3, 2006.

  [4] Borodin, A., Roberts, G. O., Rosenthal, J. S., Tsaparas, P., “Link analysis ranking: Algorithms, theory, and experiments,” ACM transactions on internet technology, February 2005, pp. 231–297.

  [5]

  [6]

  [8]

  [9]

  [10]

  [11] Patel, C., Supekar, K., and Lee, Y., “Ontogenie: Extracting ontology instances from WWW,” in Human language technology for the semantic web and web services, ISWC'03, Sanibel Island, Florida, 2003.

  [12]

  [13] Leake, D., Maguitman, A., Reichherzer, T., et al., “Googling from a concept map: Towards automatic concept-map-based query formation,” in Concept maps: Theory, methodology, technology. Proceedings of the first int. conference on concept mapping, A. J. Cañas, J. D. Novak, F. M. González, Eds. Pamplona, Spain, 2004.

  [14] Levenshtein Distance, http://www.merriampark.com/ld.htm, Accessed Feb 3, 2006.

  [15] van Laarhoven, P. J., and Aarts, E. H., Simulated annealing: Theory and applications, Springer, 1987.

  [16] Google web APIs reference, http://www.google.com/apis/reference.html, Accessed Feb 3, 2006.

  [17] http://java.sun.com/developer/onlineTraining/new2java/javamap/Java_Technology_Concept_Map.pdf, Accessed Dec 27, 2006.