2.6 Managing Linked Content with Link Manager

This section covers the following topics:
■ About Managed Links
■ Configuring Link Manager
■ Managing Links
■ Link Manager Database Tables
■ Link Manager Filters
■ Site Studio Integration

2.6.1 About Managed Links

This section covers the following topics:
■ Link Extraction Process
■ File Formats and Conversion
■ Link Status

2.6.1.1 Link Extraction Process

The Link Manager consists of an extraction engine and a pattern engine. The extraction engine includes a conversion engine (HtmlExport), which converts files that the extraction engine cannot parse natively into a text-based file format (HTML). Link Manager does not use HtmlExport to convert files whose file format contains any of the following strings: hcs, htm, image, text, xml, jsp, and asp. Link Manager handles these text-based files without conversion.

During the indexing cycle, the Link Manager component searches the checked-in content items for URL links, as follows:

1. The extraction engine converts the file using the conversion engine, if necessary.
2. The extraction engine then uses the pattern engine to access the link evaluation rules defined in the LinkManagerPatterns table.
3. The evaluation rules tell the extraction engine how to sort, filter, evaluate, and parse the accepted URL links in the content items.
4. The accepted URL links are inserted or updated in the ManagedLinks table.

Note: Because Link Manager does all of its work during the indexing cycle, it increases the time required to index content items and to rebuild collections. The increase may not be noticeable, however, because most of the time is spent indexing the content item into the collection. The time required also depends on the type and size of the content items involved: a file that must be converted takes longer than a text-based (HTML) file. For more information about file formats, conversion, and link extraction, see Link Extraction Process and File Formats and Conversion.

Note: For information about disabling Link Manager during the rebuild cycle, refer to the sections on LkDisableOnRebuild and LkReExtractOnRebuild in the Oracle Fusion Middleware Idoc Script Reference Guide.
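The four extraction steps described in this section can be sketched in Python. This is only an illustrative model under stated assumptions, not Link Manager's implementation: the function names, the single regular expression standing in for the rules in the LinkManagerPatterns table, and the dictionary standing in for the ManagedLinks table are all hypothetical.

```python
import re

# Hypothetical stand-in for one evaluation rule from LinkManagerPatterns:
# accept the target of every href="..." attribute.
HREF_RULE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def convert_to_html(raw_content):
    """Hypothetical placeholder for the HtmlExport conversion step."""
    return raw_content


def extract_links(doc_id, raw_content, is_text_based, managed_links):
    """Model the indexing-cycle steps for a single checked-in item."""
    # Step 1: convert the file only if it cannot be parsed natively.
    text = raw_content if is_text_based else convert_to_html(raw_content)
    # Steps 2-3: apply the (hypothetical) evaluation rule to find links.
    accepted = HREF_RULE.findall(text)
    # Step 4: insert or update the accepted links for this item.
    managed_links[doc_id] = accepted
    return accepted
```

Calling `extract_links` again for the same `doc_id` replaces the stored links, which mirrors the "inserted or updated" behavior of step 4 in a very simplified way.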

2.6.1.2 File Formats and Conversion

Some file formats (such as Word) must be converted by the conversion engine (HtmlExport) before links can be extracted. Links in text-based files (HTML), however, can be extracted by Link Manager without conversion by HtmlExport. Therefore, Link Manager does not use HtmlExport to convert files whose file format contains any of the following strings: hcs, htm, image, text, xml, jsp, and asp.

Link Manager handles all variations of these file formats. For example, the hcs string matches the dynamic server page strings hcst, hcsp, and hcsf. Likewise, the image string matches all comparable variants, such as imagegif, imagejpeg, imagergb, and imagetiff.

There may be other file types that you do not want converted. In this case, you can use a configuration variable to prevent their conversion. For more information, see the section on LkDisallowConversionFormats in the Oracle Fusion Middleware Idoc Script Reference Guide.

Link Manager recognizes links in the following file formats:
■ Text-based formats (txt, html, xml, jsp, asp, csv, hcst, hcsf, and hcsp)
■ Email (msg and eml)
■ Microsoft Word
■ Microsoft Excel
■ OpenOffice Writer
■ OpenOffice Calc

Caution: With this release, the Link Manager component uses HtmlExport 8, shipped with the current version of Content Server, for file conversion. A link extractor template file is included with the Link Manager component. HtmlExport 8 requires this template. Do not edit this file.

Important: To execute successfully, HtmlExport requires either a virtual or physical video interface adaptor (VIA). For example, most Windows environments have graphics capabilities that provide HtmlExport access to a frame buffer. UNIX systems, however, may not have graphics cards and may not have a running X-Windows Server for use by HtmlExport. For systems without graphics cards, a virtual frame buffer (VFB) can be installed and used.
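The substring matching described in this section can be illustrated with a short Python sketch. The function name and the `disallowed_formats` parameter are hypothetical; the parameter loosely models the effect of LkDisallowConversionFormats, and exact-match comparison against that list is an assumption, not documented behavior.

```python
# Strings listed in the text: a format containing any of these
# is handled without HtmlExport.
NATIVE_FORMAT_STRINGS = ("hcs", "htm", "image", "text", "xml", "jsp", "asp")


def uses_html_export(file_format, disallowed_formats=()):
    """Return True if this sketch would send the format to HtmlExport."""
    fmt = file_format.lower()
    # Substring match, so "hcs" also covers hcst, hcsp, and hcsf.
    if any(s in fmt for s in NATIVE_FORMAT_STRINGS):
        return False
    # Assumption: administrator-excluded formats are compared exactly.
    return fmt not in {d.lower() for d in disallowed_formats}
```

For example, `uses_html_export("hcsp")` is `False` because the format contains `hcs`, while a format that contains none of the listed strings is converted unless it appears in the hypothetical exclusion list.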

2.6.1.3 Link Status

All new and existing links are managed during the indexing cycle. When content items are checked in, the accepted links in these content items are added to or updated in the ManagedLinks table. Additionally, existing links are evaluated for changes resulting from content items being checked in or deleted.

As links are added or monitored, they are marked as either valid or invalid. When one content item in the system references another content item in the system, the resulting link is marked as valid. When an existing link references a content item that has been deleted, the link is reevaluated and its status changes from valid to invalid.

Statuses are recorded as Y (valid) or N (invalid) in the dLkState column of the ManagedLinks table and are displayed in the State column of the Link Info page as Valid or Invalid. For more information on the Link Info page, see the Oracle Fusion Middleware User's Guide for Content Server.
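The status rules above can be modeled with a small Python sketch. Only the dLkState column name and its Y and N values come from the text; the row layout, the `target` key, and the function names are hypothetical, and the real reevaluation logic may consider more than whether the target item exists.

```python
# Mapping from the stored value to the label shown on the Link Info page.
STATE_LABELS = {"Y": "Valid", "N": "Invalid"}


def link_state(target_id, existing_ids):
    """Return 'Y' if the referenced content item exists, else 'N'."""
    return "Y" if target_id in existing_ids else "N"


def reevaluate(managed_links, existing_ids):
    """Refresh dLkState on each (hypothetical) ManagedLinks row in place."""
    for row in managed_links:
        row["dLkState"] = link_state(row["target"], existing_ids)
    return managed_links
```

In this model, deleting the target of a link and then calling `reevaluate` flips that row from Y to N, matching the valid-to-invalid transition described above.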

2.6.2 Configuring Link Manager