Section B: Workshop Report

Source Book on Digital Libraries 25

Chapter 1, Section B: Workshop Report

Introduction

On October 10-11, 1991, just prior to the 14th International Conference on Research and Development in Information Retrieval, 30 researchers representing academia, industry and government met in Chicago for an NSF-sponsored workshop dealing with the broad areas of information retrieval, document analysis, text understanding, and intelligent document management systems. This workshop was unique in its focus on information as opposed to data, on supporting humans’ fundamental needs to organize document databases and find documents in terms of their information content, and on the use of language processing and knowledge representation techniques to help in these processes. This report summarizes the discussion, highlights the research issues identified, and makes recommendations regarding the NSF’s future involvement in the area of Information Resources and Document Management.

The remainder of the workshop report (i.e., Part I) is divided into two sections and four appendices. In Section 2 we describe the main research areas that are relevant to national needs and thus are recommended as candidates for sponsored research. In Section 3 specific recommendations and concluding remarks are presented. Appendix A.1 provides an overview of related NSF-sponsored workshops, and Appendix A.2 describes the relationships of this report’s recommendations to the HPCC effort and the NREN. Appendix A.3 summarizes the accomplishments of the last two decades in Information Retrieval and related disciplines, and Appendix A.4 identifies emerging technologies that are relevant to the information industry.

Further Background

In 1945, Vannevar Bush, President Roosevelt’s science advisor, who played a key role in the development of the National Science Foundation, wrote of the need to develop and use tools to manage mankind’s accumulated knowledge. [7] Although Bush’s ‘memex’ idea didn’t turn out to be of much practical value, his vision inspired many researchers in the areas of information retrieval, information science, and hypertext. Partly as a result of his insight, the first text retrieval systems were developed in the 1950s.

The NSF was given a broad directive in its enabling act of 1950 “to foster the interchange of scientific information among scientists.” The Office of Science Information Service (OSIS) at the NSF was responsible for providing research funding toward improved methods and techniques for handling scientific information, including mechanized systems, through the Documentation Research Program. The activities of the program were expanded considerably around 1956 under the directorship of Mrs. Helen Brownson. The National Defense Education Act of 1958 directed the Foundation to undertake projects of a fundamental or general nature that may produce new insights, knowledge, or techniques applicable to information systems and services. [6] The projects supported at this time were featured in the semiannual reports on Current Research and Development in Scientific Documentation. [30]

During those early days, a number of conferences were sponsored by the NSF to chart a plan for research activities in the field of scientific documentation. National publicity prior to and following a conference on Coordination of scientific and technical information relating to federal R&D activities (Cleveland, February 3-4, 1958) attracted the attention of Senator Hubert H. Humphrey, who then arranged for the subcommittee of the Senate Committee on Government Operations that he chaired to hold hearings on this topic. [44] An NSF-sponsored conference, held in Washington, D.C., on October 2-3, 1964, was intended to encourage research on Evaluation of Document Searching Systems and Procedures.

Active research programs at the Chemical Abstracts Service and the American Society of Physics were supported in large part by the NSF. OSIS supported research activities in information retrieval not only in the U.S., but on a worldwide basis. For example, the funds for the famous Aslib-Cranfield project were provided by the NSF to Aslib in England and, in turn, Cleverdon received the funds to execute the project. Agencies such as the Ford Foundation as well as the American Society of Metals initiated and supported many studies of library problems. [26] There is a general consensus among those who did research on scientific documentation at that time that “the 1960s were really a good time for IR research.”

In the 1970s, the significance of the area was further emphasized at the NSF by the establishment of, for example, the Division for Information Science and Technology. In the current structure at the NSF, IR and related areas come under the Database and Expert System Program (Director: Dr. M. Zemankova), which is located within the IRIS Division of the Directorate for Computer and Information Science and Engineering.

That research led to the growth of a vast industry, whose current systems support searching for relevant items in large collections. The items may correspond to library catalog entries, bibliographic records, citations with abstracts and/or keywords, or full-text documents. In addition to searching, users can browse the documents or indexes. Some classes of retrieval systems, such as those for Selective Dissemination of Information (SDI), route new documents to those users who have filed a profile that defines their information needs. These alert systems send notifications or documents to interested parties, upon the arrival of news or
