Semantic Query-Manipulation and Personalized Retrieval of Health, Food and Nutrition Information


  Procedia Computer Science 19 (2013) 163–170. The 4th International Conference on Ambient Systems, Networks and Technologies (ANT-2013)


  Ahmed Al-Nazer, Tarek Helmy

  Information and Computer Science Department, College of Computer Science & Engineering, King Fahd University of Petroleum & Minerals, Dhahran 31216, Mail Box 413, Saudi Arabia

  * On leave from College of Engineering, Tanta University, Egypt. [g199739540, helmy]@kfupm.edu.sa

  Abstract

  Semantic manipulation of website content is important in many domains, and it is critical in domains such as health and nutrition, where users need to retrieve precise, trusted and relevant health and food information. Even a high-quality, semantic, Web-based search engine is not enough to retrieve precise health- and nutrition-related information, because the huge amount of information scattered throughout the Web means that the retrieved results might not fit the user’s specific needs. Semantic query manipulation and personalization techniques can therefore guide users in retrieving health and nutrition information that is more relevant to, and consistent with, their needs. In this paper, we present our efforts to develop a framework for semantic query manipulation and personalization of health and nutrition information. We propose a user profile ontology based on culture, language, health and nutrition. The profile is used to enrich the query and to personalize the retrieved health and food information so that it is consistent with the user’s needs. Moreover, we propose query templates that are used for semantic manipulation and for mapping the user’s natural language queries into ontology-based queries. We have implemented the proposed framework, and the empirical evaluations show promising improvements in the relevancy of the retrieved results and in the user’s satisfaction.

  

  © 2013 The Authors. Published by Elsevier B.V. Selection and peer-review under responsibility of Elhadi M. Shakshuki

  Keywords: Personalization; query manipulation; food and nutrition; semantic Web; ontology

  1. Introduction

  The semantic representation of Web contents and semantic query manipulation help in retrieving more accurate results. However, they are not the only success factors in retrieving the relevant information for the user, as a huge amount of information is scattered throughout the Web. The retrieved information therefore needs to be filtered and personalized to fit the user’s exact needs; for example, health advice that fits one user based on his/her age, gender and health conditions might not fit another user with different conditions. Personalization techniques will thus help and guide users in retrieving high-quality health and nutrition-related Web contents. For personalization, we need a personal profile for each user that defines his/her interests, preferences, health conditions and culture, and we need to customize the retrieved results accordingly. People do not all share a common cultural background, and each culture has its own tastes [1]. Since we focus on food and nutrition, this matters: some foods are accepted in one culture but not preferred in another.

  The remainder of the paper starts with a survey of related work, followed by a description of the proposed framework architecture. Then, we present the three main components: user profile ontology, query semantic analysis and results personalization. Next, we describe the experimental results and show some use cases. Finally, we conclude the paper and highlight directions for future work.

  2. Related Work

  HealthFinland [2] is an intelligent semantic portal that provides relevant health information retrieved from the Web and from various governmental, non-governmental, business and other organizations. It helps users find relevant health content using basic vocabulary, without the need for technical medical terminology. Its limitation is that it does not address personalized retrieval.

  Personalized Health Information Retrieval System (PHIRS) [3] is a health information recommendation system that addresses user modeling and implements user-profile matching to customize the retrieved health information to the individual’s needs. PHIRS has two limitations: 1) it does not have enough features to identify the relevant health information; and 2) its personalization does not touch on the culture or language of the user.

  CarePlan [4] automatically generates customized, patient-specific healthcare plans and determines the best clinical care plan based on the patient’s medical and personal profiles, medical knowledge, clinical pathways and personalized educational healthcare programs. Its limitations are the lack of full implementation details and of food and nutrition information related to the patient, the educational focus of the profile, and the lack of cultural aspects.

  The authors in [5] propose an adaptive searching mechanism for medical information that retrieves cardiologic medical information from heterogeneous, distributed medical databases that mediate medical decisions on critical health conditions. The mechanism supports a personalized searching process for users based on their personal profiles, but it does not use the semantic Web or cultural attributes in the personalization process.

  The authors in [6] introduce a trusted model as a one-stop access point to personalized health and medical information. The model centralizes personal information management to facilitate the specific information aggregation tasks of individual clients. Experiments were conducted to demonstrate the trade-off between retrieval performance and the degree of privacy preservation in the proposed query mixing strategies. This trade-off did not consider personalization from the user’s cultural point of view.

  A mixed-initiative socio-semantic conversational search and recommendation system for finding health information is presented in [7]. In this system, users can have a live conversation about their health issues, and the system connects relevant users together in the same conversation and provides context-based recommendations. The recommendations are based on the social context only.

  Based on this survey, there is a lack of culture- and language-based personalization for the health, food and nutrition domain that would help in giving better recommendations to users. Hence, we extend the current approaches by building a framework for a cross-cultural and cross-lingual recommendation tool with an ontology-based user profile to retrieve the relevant health and nutrition information that fits the user’s needs.

  3. Query Manipulation and Personalization Framework

  This work is part of a larger project that aims to build a framework to help users find semantic health and nutrition information that fits their needs. The architecture of the project’s framework has three main components: 1) the ontology management component, which maintains the health and nutrition domain ontology; 2) the annotation component, which annotates the health and nutrition data sources based on the domain ontology; and 3) the query manipulation and results personalization component, the focus of this paper, which users interact with and which personalizes the retrieved health and food information. Figure 1 shows the details of the proposed framework for the query manipulation and results personalization component.

  Fig. 1. Proposed framework architecture for query manipulation and results personalization component

  In the proposed framework, the user enters the query using the portal interface in one of two ways: either by going through a wizard to complete the query, or by typing a free-text natural language query. The cross-language service is used to translate the user’s query as needed. Then, the ontology query-template reasoning engine is used to select the appropriate query template for the user’s query. This requires referencing the knowledge repository that hosts the ontologies and the knowledge base, and it helps in enriching the user’s query based on the user profile. Next, the reasoning engine processes the query and returns the personalized search results. The user browses the results, and all interactions are logged in the action log database. The user preferences learner analyzes this log and updates the user profile with new preferences.

4. User Profile Ontology

  This section highlights the user profile ontology and answers the following questions: Why do we need an ontology to represent the user profile? What are the factors that affect food preferences? How do we capture the user’s preferences, and how do we update them in a timely manner?

  4.1. Food and health preferences

  Choosing the right food depends on many factors that sometimes conflict with each other. These factors can be categorized into three groups: food preferences, health conditions, and culture/religion constraints. The food preferences group recognizes that each user has his/her own taste in food and likes some types of food while disliking others. The health condition factor involves any diseases and/or allergies that the user may have; it therefore restricts some types of food and encourages others. The culture/religion factor takes into account that each culture has its own preferred food and that some religions have food restrictions. This becomes obvious when someone travels to countries where s/he encounters tastes and recipes different from those of his/her own culture.

4.2. Preferences capturing

  The basic way to capture the user’s preferences is to provide a form and ask the user to fill it out, but most users do not appreciate the value of taking the time to thoroughly fill out profile forms [8]. So, there should be a mechanism to infer the user’s interests; hence, we show three ways to elicit the profile fields, applied after consulting with the user to assure the user’s privacy. First, we can use the user’s queries, which are a good source for understanding the user’s needs and interests. For example, if the user is always asking about a certain type of food, we may infer that this food type interests him/her. Second, we can monitor the user’s behaviors and interactions, explicit or implicit, and use the results to enhance the user profile. For example, if the user always clicks on a certain data source, this suggests that s/he trusts this source more than others, and therefore the results from this source should be prioritized. Third, we can interface with an external system that holds the user’s information, such as a medical information system, so that changes are reflected immediately in the user profile. The same three ways can be used to dynamically update the user profile and adapt it to the latest user preferences.
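The second elicitation route, inferring source trust from clicks, can be illustrated with a minimal Python sketch. The paper does not specify how clicks are weighted, so the normalized click share used here, the function names and the `source` field are all assumptions made for illustration.

```python
from collections import Counter


def source_priorities(click_log):
    """Turn a list of clicked source names into normalized priority weights.

    Assumption: priority is simply the share of clicks a source received.
    """
    counts = Counter(click_log)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: n / total for source, n in counts.items()}


def prioritize(results, priorities):
    """Order results so that those from frequently clicked sources come first.

    Python's sort is stable, so results with equal priority keep their
    original relative order.
    """
    return sorted(
        results,
        key=lambda r: priorities.get(r["source"], 0.0),
        reverse=True,
    )


# Hypothetical click log and result list.
clicks = ["source_a", "source_b", "source_a"]
weights = source_priorities(clicks)
ranked = prioritize(
    [{"id": 1, "source": "source_b"}, {"id": 2, "source": "source_a"}],
    weights,
)
```

A real learner would probably also decay old clicks over time so that the profile tracks the latest preferences; that refinement is left out of the sketch.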

  4.3. User profile representation

  User profiles are used to grasp the user’s needs and to understand the query’s meaning as it relates to the user [9]. We capture and represent the preferences and interests of the users in order to personalize the retrieved results and give user-specific recommendations. The authors of [10] describe three ways to represent the user profile. The first is the keyword profile, which captures keywords and assigns a weight to each of them. The second is the semantic network profile, in which the keywords are added to a network of nodes where we can explicitly model the relationship between specific words and higher-level concepts. The third is the concept profile, in which the nodes represent abstract topics; this helps in building a deeper concept hierarchy, which can be based on a taxonomy or on ontologies. Since this work is part of a project in which we represent the health and food information in ontological format, we choose the concept profile, which helps in enriching the user’s query and matching it with the domain ontology.

  4.4. User profile ontology

  We created ontologies for the user profile, the culture and the religion. Then, we created the necessary relations between them and the food and health ontologies. The user profile ontology is represented as an ontological concept that consists of many properties, as shown in the first box of Figure 2. For clarification, we visualize the user profile ontology properties in four categories: the user’s basic information, such as name and age; the user’s basic health information, such as weight and blood type; the user’s medical information, such as diseases and allergies; and the usage statistics, such as previous searches and user feedback. Each arrow in the figure represents a relationship between two concepts, which is referred to in RDF terminology as a “triple” [11].
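To make the triple view of the profile concrete, the following Python sketch stores a profile as plain (subject, predicate, object) tuples. The paper does not list its actual property names or identifiers, so `user:u1`, `hasAge`, `hasDisease` and the others are hypothetical; the values are taken from the use case in Section 7.1.

```python
# Each tuple mirrors an RDF-style triple: (subject, predicate, object).
# All identifiers below are hypothetical, not the paper's vocabulary.
profile_triples = [
    ("user:u1", "hasAge", 50),
    ("user:u1", "hasGender", "male"),
    ("user:u1", "hasBloodType", "O+"),
    ("user:u1", "hasDisease", "disease:Diabetes"),
    ("user:u1", "hasDisease", "disease:IronDeficiencyAnemia"),
    ("user:u1", "hasCulture", "culture:MiddleEastern"),
    ("user:u1", "hasReligion", "religion:Muslim"),
]


def objects(triples, subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


diseases = objects(profile_triples, "user:u1", "hasDisease")
```

Because a predicate such as `hasDisease` can occur several times for the same subject, the lookup returns a list rather than a single value.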


5. Semantic Query Manipulation

  This section explains how we understand and process the user’s query. We start by explaining the concept of query template, along with an example. Then, we show the query processing steps. After that, we go into more detail with the query enrichment step and show how we utilize the user profile in expanding the user’s query. Finally, we explain how we match the user’s query with the query templates. After the matching, the semantic query is ready to be sent to the reasoning component to retrieve the results.

  5.1. Query templates

  Since we are not doing natural language processing (NLP), it is necessary to define specific query templates in order to scope the user’s queries and match them to the related ontologies. Query templates, in our research, represent all expected queries from the user; define the concepts that could be extracted from the user’s query; correlate different ontologies that are needed to answer the query; and finally, specify the answer template for each query. Each query template consists of attributes shown in Table 1.

  Table 1. Query template attributes

  Field                            Description                                            Example
  Template-ID                      Template identification                                1
  Ontology-Lookup                  Ontologies needed to answer user’s query               Food
  Ontology-Entities                Ontologies needed to reason and retrieve the results   Relation (disease), user, culture
  Confirmation-Question-Template   Template for the confirmation question                 Does {0} {1} {2}?
  Subjective-Question-Template     Template for the listing question                      List all {0} that {1} {2}
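A query template with these attributes could be modeled as a small record type. The Python sketch below is one possible encoding, not the authors’ implementation: the class name, field names and `render` method are assumptions, the yes/no pattern is assigned to the confirmation template following the question-type definitions in Section 5.2, and the slot values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    """One query template; field names loosely follow Table 1."""
    template_id: int
    ontology_lookup: str
    ontology_entities: tuple
    confirmation_question_template: str
    subjective_question_template: str

    def render(self, question_type, *slots):
        """Fill the positional placeholders {0}, {1}, {2} of the chosen pattern."""
        if question_type == "confirmation":
            pattern = self.confirmation_question_template
        else:
            pattern = self.subjective_question_template
        return pattern.format(*slots)


template = QueryTemplate(
    template_id=1,
    ontology_lookup="Food",
    ontology_entities=("Relation (disease)", "user", "culture"),
    confirmation_question_template="Does {0} {1} {2}?",
    subjective_question_template="List all {0} that {1} {2}",
)

listing = template.render("subjective", "fruits", "contain", "iron")
yes_no = template.render("confirmation", "spinach", "contain", "iron")
```

Keeping both patterns on the same record lets the question type chosen during classification decide only the surface form, while the ontology fields stay shared.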

  5.2. Query processing steps

  After getting the user’s query, we identify the language since each language has its own syntax and way of processing; we consider both English and Arabic languages. Then, a spell checker is used to check the spelling of the query and suggest corrections if needed. After that, the query is classified into either a confirmation question, which has an answer of yes or no, or a subjective question, which has an answer of listing some items. Next, noise words, such as do, does, an, the, etc., are removed in order to have only the words that could be related to the domain ontology. Then, we identify the concepts related to food and health ontology through a populated list of all the ontology’s classes and the knowledge base’s instances. After that, WordNet is used to identify the possible relations between these concepts by finding all that are synonymous with the pre-defined relations. Next, we enrich the query based on the user profile. Then, we match the identified concepts and relations to the best query template. Finally, a semantic annotation that represents the user’s query is produced for retrieval. Figure 3 shows the query processing steps.
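Three of these steps (question classification, noise-word removal and concept identification) can be sketched in a few lines of Python. This is a simplified illustration: language identification, spell checking and the WordNet relation lookup are omitted, and the word lists and the `KNOWN_CONCEPTS` dictionary are hypothetical stand-ins for the populated list of ontology classes and knowledge-base instances.

```python
# Noise words: "do", "does", "an", "the" come from the text; the rest are assumed.
NOISE_WORDS = {"do", "does", "an", "the", "a", "is", "are"}

# Assumed heuristic: a question opening with one of these expects yes/no.
CONFIRMATION_STARTERS = {"do", "does", "is", "are", "can", "should"}

# Hypothetical stand-in for the ontology classes and knowledge-base instances.
KNOWN_CONCEPTS = {
    "fruits": "Food",
    "spinach": "Food",
    "iron": "Nutrient",
    "diabetes": "Disease",
}


def classify(query):
    """Label a query as a 'confirmation' (yes/no) or 'subjective' (listing) question."""
    words = query.lower().strip(" ?").split()
    first = words[0] if words else ""
    return "confirmation" if first in CONFIRMATION_STARTERS else "subjective"


def remove_noise(query):
    """Lowercase, drop question marks, and remove noise words."""
    words = query.lower().replace("?", " ").split()
    return [w for w in words if w not in NOISE_WORDS]


def identify_concepts(tokens):
    """Pair each token that appears in the concept list with its ontology class."""
    return [(t, KNOWN_CONCEPTS[t]) for t in tokens if t in KNOWN_CONCEPTS]


question = "Does spinach contain iron?"
kind = classify(question)
tokens = remove_noise(question)
concepts = identify_concepts(tokens)
```

Classification runs on the raw query before noise removal, because the words that reveal the question type are themselves noise words.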

  Fig. 3. Query processing steps

  5.3. Query enrichment

  Although this is still the query processing phase, personalization starts here, using the user profile ontology to enrich the query. The user profile ontology has defined relations to the food and health ontologies, in addition to the culture and religion ontologies. The properties of the user profile ontology can be used not only to enrich and expand the query, but also to fill the required fields of the query template. This leads to more accurate and relevant results by filtering the mass of result records based on the user profile, health condition, culture and religion.
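As a rough illustration of enrichment, the Python sketch below copies selected profile properties into the query as extra constraints. The dictionary representation, the key names and the rule that explicit query terms take precedence over profile values are assumptions made for this sketch; the paper does not describe the mechanism at this level of detail.

```python
# Profile properties treated as query constraints (assumed key names).
ENRICHMENT_KEYS = ("health_conditions", "culture", "religion")


def enrich_query(query_concepts, profile):
    """Return a copy of the query with constraints added from the profile.

    Values stated explicitly in the query are never overwritten.
    """
    enriched = dict(query_concepts)
    for key in ENRICHMENT_KEYS:
        if key in profile and key not in enriched:
            enriched[key] = profile[key]
    return enriched


profile = {
    "age": 50,
    "health_conditions": ["diabetes", "iron-deficiency anemia"],
    "culture": "Middle Eastern",
}
enriched = enrich_query({"food_type": "fruit"}, profile)
```

Only whitelisted properties are copied, so profile fields that are irrelevant to retrieval (or private) do not leak into the query.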

5.4. Matching user’s query with query templates

  Matching the user’s query to the pre-defined query templates is not black-or-white matching; it is more complicated. Identifying the concepts and relations within the user’s query that are related to the domain ontology is not sufficient to match them with any query template. We try to fill in the most appropriate query template concepts and relations, which were identified in the query-processing phase. However, there are some cases where we have incomplete information and hence we need to depend on other sources to fill the query template. After extracting everything we can from the query, we get aid from the domain ontology to detect the missing information based on what is found. Then, we look at the user profile information, if any, and fill in the missing information from the profile properties. Finally, we can go back to the user and ask him/her explicitly for more information in order to be able to match the query template.
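The fallback order described above (query first, then the domain ontology, then the user profile, and finally the user) can be sketched as a slot-filling cascade. In this Python sketch the slot names and the dictionary inputs are hypothetical; the function only reports which slots remain empty, leaving the follow-up question to the caller.

```python
def fill_slots(required, from_query, from_ontology, from_profile):
    """Fill template slots from the first source that provides each one.

    Sources are tried in the order: query, domain ontology, user profile.
    Returns (filled, missing); `missing` lists the slots the user must be
    asked about explicitly.
    """
    filled, missing = {}, []
    for slot in required:
        for source in (from_query, from_ontology, from_profile):
            if slot in source:
                filled[slot] = source[slot]
                break
        else:
            # No source provided this slot.
            missing.append(slot)
    return filled, missing


filled, missing = fill_slots(
    ["food", "relation", "disease", "culture"],
    from_query={"food": "fruits"},
    from_ontology={"relation": "suitable_for"},
    from_profile={"disease": "diabetes"},
)
```

Here `missing` ends up containing only `culture`, which is the point at which the system would go back to the user.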

6. Results Personalization

  Personalization helps in getting relevant results for the user’s query. As shown in the query-processing steps, personalization starts with the query enrichment step, where we utilize the user profile to expand the query and to fill in the incomplete query templates. Here, we go into more detail with the results personalization steps and show how we capture the user’s feedback.

  6.1. Results personalization steps

  Personalizing the results involves presenting them in the most effective way possible through several steps. The first step is answering the user’s query in the same language s/he asks it in, regardless of the language of the ontology and of the knowledge base that holds the annotated data. The second step is answering the user’s query in appropriate syntax based on the question type; a confirmation question is different from a subjective question, as the user expects a “yes” or “no” answer in the first type, while s/he expects a list of items in the second type. So, the answer is personalized to express the understanding of the query and to be familiar to the user. The third step is ranking the results based on the user’s preferences and interests. While many healthy foods may be recommended by the system, it is sensible to show what the user likes first and what s/he does not like last. Finally, the system filters out the non-relevant food or health information based on the user profile.
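The ranking and filtering steps can be illustrated with a short Python sketch. The three-level ordering (liked, neutral, disliked) and the set-based inputs are assumptions chosen to mirror the description; the paper does not give its actual ranking function.

```python
def personalize(results, liked, disliked, excluded):
    """Filter out excluded items, then show liked items first and disliked last.

    Items that are neither liked nor disliked keep their original relative
    order in the middle, since Python's sort is stable.
    """
    def rank(item):
        if item in liked:
            return 0
        if item in disliked:
            return 2
        return 1

    kept = [r for r in results if r not in excluded]
    return sorted(kept, key=rank)


# Hypothetical result list and preferences.
ordered = personalize(
    ["banana", "apple", "dates", "grapes"],
    liked={"dates"},
    disliked={"banana"},
    excluded={"grapes"},
)
```

Filtering before ranking keeps items ruled out by the profile from ever reaching the user, regardless of how well they would otherwise rank.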

  6.2. User’s feedback

  Continuous feedback collection is required to sharpen the user’s experiences. Feedback is not only explicit, but also implicit, as it can be collected through different measures. Many measures could help in reflecting the implicit feedback, such as time spent in browsing the results, clicks on the data sources, clicks on the result facets related to the search results, etc. All interactions and feedback are recorded and logged in the usage log which is analyzed after each query to know how effective the results are and how we can improve the future recommendations. This is reflected in the user profile ontology.

7. Experimentation and Evaluation

  We developed the interface screens and implemented the semantic calls to the knowledge base. Figure 4 shows snapshots of the main screen and the user profile form. Next, we present a use case that shows a personalization example of using the system. Then, we show the query manipulation experimental results.

  Fig. 4. (a) Portal main screen snapshot; (b) User profile screen snapshot

  7.1. Personalization use case

  The user starts by registering in the system and creating a personal profile. Then, the user enters a query, e.g., “Which fruits are suitable for me?” The system manipulates the query and enriches it with the user profile. In this example, the user profile contains the following: age (50 years), gender (male), blood type (O+), health conditions (diabetes, iron-deficiency anemia), culture (Middle Eastern) and religion (Muslim). Then, the system tries to find the query template that best matches the user’s query. Next, the system searches the knowledge base for fruits that suit a person with a low concentration of iron, as he has malnutrition, and that contain less sugar, as he is diabetic. The system also matches his age and gender, as some foods are not good for older males, and finally it factors in his culture and religion by filtering out the inappropriate results. The results of the search are refined again to match other inferred user preferences; for example, knowing from previous use that the user prefers some specific types of fruits, we give them precedence. After showing the results, the system monitors the user’s interactions while he navigates through the results and collects the user’s feedback to update the profile.

  7.2. Experiment evaluation for semantic query manipulation

  We collected 100 questions from different users and tested them in our system to evaluate how effectively we could semantically interpret the questions. Our target is to calculate how many named entities we can find in these questions by comparing the system performance to the manual annotation of these questions. Table 2 shows the experiment statistics regarding the discovered concepts. It shows the number of relations found between food and health condition, the number of food items and nutrition items found in the queries, and the number of diseases, body functions (e.g., improve vision) and body parts (e.g., heart) discovered. We then calculate the Precision, which is the number of correct concepts found by the system divided by the total number of concepts found by the system, and the Recall, which is the number of correct concepts found by the system divided by the total number of correct concepts found manually. The overall recall is 82.13%, which means that we are able to discover most of the concepts in the questions. The overall precision is 97.15%, which is high because we pre-populate all of the concepts, except the relations, from the knowledge base. This also explains the lower precision for relations, 91.36%, because we use WordNet to discover the synonyms of the relations.

  Table 2. Experimental results statistics for query manipulation

  Concept           Total found   Found correct   Correct concepts   Precision   Recall
                    concepts      concepts        manually
  Relation          81            74              92                 91.36%      80.43%
  Food items        71            71              83                 100.00%     85.54%
  Nutrition items   16            16              19                 100.00%     84.21%
  Diseases          53            53              65                 100.00%     81.54%
  Body functions    10            10              13                 100.00%     76.92%
  Body items        15            15              19                 100.00%     78.95%
  Total             246           239             291                97.15%      82.13%
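The precision and recall figures follow directly from the counts in Table 2. The short Python sketch below recomputes them from those counts using the definitions given in the text; the function and variable names are our own.

```python
# (found by system, found correct, correct by manual annotation), from Table 2.
TABLE_2 = {
    "Relation": (81, 74, 92),
    "Food items": (71, 71, 83),
    "Nutrition items": (16, 16, 19),
    "Diseases": (53, 53, 65),
    "Body functions": (10, 10, 13),
    "Body items": (15, 15, 19),
}


def precision_recall(found, correct, manual):
    """Return (precision, recall) as percentages rounded to two decimals.

    precision = correct / found; recall = correct / manual.
    """
    return round(100 * correct / found, 2), round(100 * correct / manual, 2)


def totals(table):
    """Sum each of the three count columns across all concept rows."""
    return tuple(sum(column) for column in zip(*table.values()))


overall = precision_recall(*totals(TABLE_2))
relation = precision_recall(*TABLE_2["Relation"])
```

Note that the overall figures are computed from the summed counts (micro-averaged), not as the mean of the per-concept percentages.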

8. Conclusion and Future Work

  In this paper, we propose a framework for semantic query manipulation and personalization of health and nutrition information. We present the user profile ontology and its relation to other domain ontologies. Then, we explain the semantic query processing steps and present the result personalization steps. We illustrate a complete scenario to visualize the framework, followed by experimental results. The empirical evaluation shows promising improvements in the relevancy of the retrieved results and in the user’s satisfaction. As future work, and in order to validate the efficiency of the proposed framework, we will publicize the portal and collect users’ satisfaction feedback.

  Acknowledgements

  The authors would like to acknowledge the support provided by King Abdulaziz City for Science and Technology (KACST) through the Science & Technology Unit at King Fahd University of Petroleum & Minerals (KFUPM) for funding this work through project No. 10-INF1381-04 as part of the National Science, Technology and Innovation Plan. Thanks are extended to the project’s consultants, Dr. Jeffrey M. Bradshaw, Dr. Yuri Tijerino and Dr. Andrzej Uszok.

  References

  [1] D. Matsumoto and L. Juang. Culture and Psychology. Cengage Learning, Inc. 5th edition, United States; 2012.

  [2] O. Suominen, E. Hyvönen, K. Viljanen and E. Hukka. HealthFinland-a national semantic publishing network and portal for health information. Web Semantics: Science, Services and Agents on the World Wide Web, ch.7; 4: 2009, pp. 287-297.

  [3] Y. Wang and Z. Liu. Personalized health information retrieval system. AMIA Annual Symposium Proceedings. Washington: DC; 2005, p. 1149.

  [4] S. R. R. Abidi and H. Chen. Adaptable personalized care planning via a semantic web framework. 20th International Congress of the European Federation for Medical Informatics (MIE 2006), Maastricht: Netherlands; 2006.

  [5] S. Chessa, E. de la Vega, C. Vera, M. T. Arredondo, M. Garcia, A. Blanco and R. de las Heras. Adaptive searching mechanisms for a cardiology information retrieval system. Computers In Cardiology 2005; 32, pp. 147-150.

  [6] Y. Li, J. Mostafa and X. Wang. A privacy enhancing infomediary for retrieving personalized health information from the Web. Personal Information Management A SIGIR 2006 Workshop; 2006, pp. 82-85.

  [7] S. Sahay and A. Ram. Socio-semantic health information access. AAAI 2011 Spring Symposium; 2011.

  [8] F. Carmagnola and F. Cena. User identification for cross-system personalisation. Information Sciences: an International Journal; 2009; 179(1-2), pp. 16-32.

  [9] X. Tao, Y. Li and N. Zhong. A personalized ontology model for web information gathering. IEEE Transactions on Knowledge and Data Engineering, IEEE Computer Society Digital Library; 2011; 23(4), pp. 496–511.

  [10] S. Gauch, M. Speretta, A. Chandramouli and A. Micarelli. User profiles for personalized information access. The Adaptive Web, Methods and Strategies of Web Personalization, P. Brusilovsky, A. Kobsa, and W. Nejdl, Eds. Berlin, Germany: Springer-Verlag, 2007, pp. 54–89.

  [11] RDF: Resource Description Framework, http://www.w3.org/RDF/, last visited on: 21-01-2013.