Using LaTeX to Sort and Collate Indexes or Glossaries (datagidx package)

6.2 Using LaTeX to Sort and Collate Indexes or Glossaries (datagidx package)

Section 6.1 described how to create an index or glossary using an external indexing application. Some users stumble when it comes to invoking the indexing application. There is an alternative where TeX does the sorting and collating. This bypasses the need to use makeindex, xindy or makeglossaries, but it is less efficient, so your document takes longer to build. This section describes how to do this using the datagidx package, which comes with my datatool bundle (at least version 2.13). The documentation for datagidx is included in the datatool user manual [17].

The datatool package allows you to define databases that you can access in your document. The datagidx package has a special interface to this facility that allows you to define databases for the purposes of indexing. These databases and their terms must be defined in the preamble. In this section, the term “indexing” will be used to refer to either indexes or glossaries, as the same mechanism is used for both tasks.

A new indexing database is defined using:

\newgidx{⟨label⟩}{⟨title⟩}

Definition

where ⟨label⟩ is a label that uniquely identifies this database and ⟨title⟩ is the title to be used when the index (or glossary) is displayed. For example:

\newgidx{index}{Index}

Input

creates a new database labelled index. When the index is displayed, it will have the section heading “Index”.

As in Section 6.1, each term in the index (or glossary) database has an associated location list. This list is initially null. The locations are added to terms used in the document on the second LaTeX run. When you display the index, only those entries with a non-null location list or a cross-reference will be shown. The default location is the page number on which the entry was referenced. The datagidx package knows about the following page numbering styles: arabic, roman, Roman, alph and Alph. If your document has another type of numbering style, or if you want to use a different counter for the location, consult the datagidx section of the datatool manual [17].

Once you have defined the indexing database, you can define terms associated with that database using

\newterm[⟨options⟩]{⟨name⟩}

Definition

where ⟨name⟩ is the term and ⟨options⟩ is a list of ⟨key⟩=⟨value⟩ options. The following keys are available:

• database

Identifies the database in which to store this term. For example:

\newterm[database=index]{eigenvalue}

Input

It can be somewhat cumbersome having to type the database for each new term. Instead you can define the default database using:

\DTLgidxSetDefaultDB{⟨label⟩}

Definition

For example:

↑ Input

\newgidx{index}{Index}
\DTLgidxSetDefaultDB{index}

\newterm{eigenvalue}
\newterm{eigenvector}

↓ Input

• label

A label uniquely identifying this term. If omitted, the label is extracted from ⟨name⟩.

• sort

The sort key. If omitted, this is extracted from ⟨name⟩.

• parent

The parent entry, if this is a sub-term. (The value should be the label identifying the parent, which must already be defined.)

• text

How the entry should appear in the document text. If omitted, ⟨name⟩ is used. If present, ⟨name⟩ indicates how the term should appear in the index/glossary.

• description

An optional associated description.

• plural

The plural form of this term. If omitted, this value is obtained by appending “s” to ⟨name⟩ (or to the value of text, if supplied).

• symbol

An optional associated symbol.

• short

An associated short form, if required. (Defaults to ⟨name⟩ if omitted.)

• long

An associated long form, if required. (Defaults to ⟨name⟩ if omitted.)

• shortplural

The plural of the associated short form. If omitted, the value is obtained by appending “s” to the short form.

• longplural

The plural of the associated long form. If omitted, the value is obtained by appending “s” to the long form.

• see

A cross-reference to a synonym. The value should be the label of another entry. This entry will not have a location list, just the reference to the other term.

• seealso

A cross-reference to a closely related term. Both this term and the cross-referenced term should have a location list.

It’s also possible to add your own custom keys. See the datagidx section of the datatool user guide [17] for further details.
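The keys listed above can be combined in a single definition. The following preamble fragment is a minimal sketch rather than an example from the manual: the terms, the description text and the label charvalue are made up for illustration, and it assumes that datagidx is loaded and that a default database has already been set with \DTLgidxSetDefaultDB.

```latex
% Preamble fragment (illustrative terms only).
% Assumes \usepackage{datagidx} and a default database.

% Irregular plural, so the default of appending "s" is overridden:
\newterm[plural=matrices]{matrix}

% Sub-term: the parent label (matrix) must already be defined.
\newterm[parent=matrix]{diagonal}

% A term with a symbol and a description:
\newterm
 [%
  symbol={$\lambda$},
  description={scalar associated with an eigenvector}
 ]
 {eigenvalue}

% Synonym: no location list, just a reference to eigenvalue.
% The explicit label avoids relying on how a label is
% extracted from a name that contains a space.
\newterm[label=charvalue,see=eigenvalue]{characteristic value}

% Closely related term: keeps its own location list.
\newterm[seealso=eigenvalue]{eigenvector}
```

Note the ordering: parent, see and seealso all refer to labels, so the entries they point to are defined first.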

As with \newglossaryentry, discussed in Section 6.1.2, if the term starts with an accented letter (or a ligature), the letter must be grouped.

Example:

↑ Input

\newterm[label=elite,sort=elite]{{é}lite}
\newterm
 [%
  plural={{œ}sophagi},
  label={oesophagus},
  sort={oesophagus},
  description={tube connecting throat and stomach}
 ]
 {{œ}sophagus}

↓ Input

There is a shortcut command for defining acronyms:

\newacro[⟨options⟩]{⟨short⟩}{⟨long⟩}

Definition

where ⟨short⟩ is the abbreviation and ⟨long⟩ is the long form. The optional argument ⟨options⟩ is the same as for \newterm. This is equivalent to:

\newterm
 [%
  description={\capitalisewords{⟨long⟩}},%
  short={\acronymfont{⟨short⟩}},%
  long={⟨long⟩},%
  text={\DTLgidxAcrStyle{⟨long⟩}{\acronymfont{⟨short⟩}}},%
  plural={\DTLgidxAcrStyle{⟨long⟩s}{\acronymfont{⟨short⟩s}}},%
  sort={⟨short⟩},%
  ⟨options⟩%
 ]%
 {\MakeTextUppercase{⟨short⟩}}

where

\DTLgidxAcrStyle{⟨long⟩}{⟨short⟩}

Definition

formats the full version of the acronym. This defaults to: ⟨long⟩ (⟨short⟩), and

\acronymfont{⟨text⟩}

Definition

formats the short form of the acronym.

\MakeTextUppercase{⟨text⟩}

Definition

This is defined by the textcase package and converts ⟨text⟩ to uppercase.

\capitalisewords{⟨text⟩}

Definition

This is defined by the mfirstuc package and capitalises the first letter of each word in ⟨text⟩.

Example:

\newacro{svm}{support vector machine}

Input
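To make the equivalence concrete, the svm example can be written out by substituting ⟨short⟩ = svm and ⟨long⟩ = support vector machine into the \newterm form given above, with ⟨options⟩ empty. This is only a worked substitution for illustration. It assumes that datagidx is loaded and a default database is set, and it should replace the \newacro line rather than accompany it, since both would define the same term.

```latex
% Worked substitution of \newacro{svm}{support vector machine}.
% Use instead of the \newacro line, not in addition to it.
\newterm
 [%
  description={\capitalisewords{support vector machine}},%
  short={\acronymfont{svm}},%
  long={support vector machine},%
  text={\DTLgidxAcrStyle{support vector machine}{\acronymfont{svm}}},%
  plural={\DTLgidxAcrStyle{support vector machines}{\acronymfont{svms}}},%
  sort={svm}%
 ]%
 {\MakeTextUppercase{svm}}
```

Reading off the fields: the entry is sorted under svm, its name is the short form converted to uppercase, and its description is the long form with each word capitalised.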

Once you have defined the terms in the preamble, you can later use them in the document:

\gls{[⟨format⟩]⟨label⟩}

Definition

\glspl{[⟨format⟩]⟨label⟩}

Definition

\Gls{[⟨format⟩]⟨label⟩}

Definition

\Glspl{[⟨format⟩]⟨label⟩}

Definition

These are similar to those described in Section 6.1.2, but they have a different syntax. Here ⟨format⟩ is the name of a text-block command (such as \textbf) without the initial backslash that should be used to format the location for this reference. This is analogous to the | special character described in Section 6.1.1.
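As a short illustration of this syntax, the fragment below references one term with and without a ⟨format⟩. It is a sketch that assumes the term was defined in the preamble with \newterm{eigenvalue}, so that its label is eigenvalue. The sentences themselves are made up.

```latex
% Document-body fragment; assumes \newterm{eigenvalue}
% was issued in the preamble.

% No format given: the location uses the default formatting.
Each \gls{eigenvalue} is a root of the characteristic polynomial.

% The format sits inside the braces, before the label, and has
% no leading backslash: this location is formatted with \textbf.
\Glspl{[textbf]eigenvalue} are defined formally in this chapter.
```

Unlike an ordinary optional argument, the square brackets go inside the mandatory argument, so \gls[textbf]{eigenvalue} is not the form described here.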

There are also commands associated with acronyms:

\acr{[⟨format⟩]⟨label⟩}

Definition

\acrpl{[⟨format⟩]⟨label⟩}

Definition

\Acr{[⟨format⟩]⟨label⟩}

Definition

\Acrpl{[⟨format⟩]⟨label⟩}

Definition
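Putting the pieces of this section together, a complete document might look like the sketch below. Two things in it are assumptions rather than statements from the text above: the command \printterms with its database key, which is taken from the datatool manual as the way to display the index, and the label svm for the acronym, which assumes the label is derived from the short form. The sentences are placeholders.

```latex
\documentclass{article}
\usepackage{datagidx}

% The database and its terms are defined in the preamble.
\newgidx{index}{Index}
\DTLgidxSetDefaultDB{index}

\newterm{eigenvalue}
\newterm{eigenvector}
\newacro{svm}{support vector machine}

\begin{document}

\Glspl{eigenvalue} and \glspl{eigenvector} describe a linear map.
Each \gls{eigenvector} has a corresponding
\gls{[textbf]eigenvalue}.

% Acronym references (label svm is an assumption, see above).
The \acr{svm} is a classifier. \Acrpl{svm} are widely used.

% Assumption: \printterms and its database key come from the
% datatool manual; they are not described in this section.
\printterms[database=index]

\end{document}
```

Since the locations are only added on the second LaTeX run, the document needs to be processed at least twice before the index shows any entries.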