[section {Keyword index serialization format}]

Here we specify the format used by the doctools v2 packages to
serialize keyword indices as immutable values for transport,
comparison, etc.

[para]

We distinguish between [term regular] and [term canonical]
serializations. While a keyword index may have more than one regular
serialization, exactly one of them is [term canonical].

[para]

[list_begin definitions][comment {-- serializations --}]
[def {regular serialization}]

[list_begin enumerated][comment {-- regular points --}]
[enum]
An index serialization is a nested Tcl dictionary.

[enum]
This dictionary holds a single key, [const doctools::idx], and its
value. This value holds the contents of the index.

[enum]
The contents of the index are a Tcl dictionary holding the title of
the index, a label, and the keywords and references. The relevant keys
and their values are:

[list_begin definitions][comment {-- keywords --}]
[def [const title]]
The value is a string containing the title of the index.

[def [const label]]
The value is a string containing a label for the index.

[def [const keywords]]
The value is a Tcl dictionary, using the keywords known to the index
as keys. The associated values are lists containing the identifiers of
the references associated with that particular keyword.

[para]
Any reference identifier used in these lists has to exist as a key in
the [const references] dictionary, see the next item for its
definition.

[def [const references]]
The value is a Tcl dictionary, using the identifiers for the
references known to the index as keys. The associated values are
2-element lists containing the type and label of the reference, in
this order.

[para]
Any key here has to be associated with at least one keyword,
i.e. it has to occur in at least one of the reference lists which are
the values in the [const keywords] dictionary, see the previous item
for its definition.

[list_end][comment {-- keywords --}]

[enum]
The [term type] of a reference can be one of two values:

[list_begin definitions][comment {-- types --}]
[def [const manpage]]
The identifier of the reference is interpreted as a symbolic file name,
referring to one of the documents the index was made for.

[def [const url]]
The identifier of the reference is interpreted as a URL, referring to
some external location, such as a website.

[list_end][comment {-- types --}]
[list_end][comment {-- regular points --}]
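[para]

As an illustration of the points above, the following is a minimal
sketch of a regular serialization together with a check of the
constraints between [const keywords] and [const references]. It
assumes Tcl 8.5 or later. The sample data (title, file name, url,
labels) and the command [cmd idx_check] are hypothetical and are not
part of the doctools packages.

[example {
# Hypothetical sample index, written directly as a nested dictionary.
set serial {
    doctools::idx {
        title      {Example Index}
        label      {Keyword Index}
        keywords   {
            parsing {parser.man http://example.org/parsing}
            markup  {parser.man}
        }
        references {
            parser.man                 {manpage {Parser manual}}
            http://example.org/parsing {url {Parsing background}}
        }
    }
}

# Returns the empty string if the value satisfies the points listed
# for a regular serialization, and a short error message otherwise.
proc idx_check {serial} {
    if {[dict size $serial] != 1 || ![dict exists $serial doctools::idx]} {
        return "expected the single key doctools::idx"
    }
    set idx [dict get $serial doctools::idx]
    foreach key {title label keywords references} {
        if {![dict exists $idx $key]} {
            return "missing key $key"
        }
    }
    set refs [dict get $idx references]
    set used {}
    # Every identifier listed for a keyword must be a known reference.
    dict for {kw ids} [dict get $idx keywords] {
        foreach id $ids {
            if {![dict exists $refs $id]} {
                return "keyword $kw uses undefined reference $id"
            }
            dict set used $id 1
        }
    }
    # Every reference is a {type label} pair used by some keyword.
    dict for {id spec} $refs {
        if {[llength $spec] != 2} {
            return "reference $id is not a 2-element list"
        }
        lassign $spec type label
        if {$type ni {manpage url}} {
            return "reference $id has unknown type $type"
        }
        if {![dict exists $used $id]} {
            return "reference $id is not used by any keyword"
        }
    }
    return ""
}

puts [expr {[idx_check $serial] eq "" ? "ok" : [idx_check $serial]}]
}]

[para]

The check walks the [const keywords] dictionary first so that the set
of used identifiers is available when the [const references]
dictionary is verified.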

[def {canonical serialization}]

The canonical serialization of a keyword index has the format
specified in the previous item, and additionally satisfies the
constraints below, which make it unique among all the possible
serializations of the keyword index.

[list_begin enumerated][comment {-- canonical points --}]
[enum]

The keys found in all the nested Tcl dictionaries are sorted in
ascending dictionary order, as generated by Tcl's builtin command
[cmd {lsort -increasing -dict}].

[enum]
The references listed for each keyword of the index, if any, are
listed in ascending dictionary order of their [emph labels], as
generated by Tcl's builtin command [cmd {lsort -increasing -dict}].

[list_end][comment {-- canonical points --}]
[list_end][comment {-- serializations --}]
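[para]

The two constraints on canonical serializations can be sketched as
follows. This is only an illustration under the assumption of Tcl 8.5
or later, not the implementation used by the doctools packages. The
commands [cmd idx_sortdict] and [cmd idx_canonical] and the sample data
are hypothetical.

[example {
# Rebuild a dictionary with its keys in ascending dictionary order.
proc idx_sortdict {d} {
    set result {}
    foreach key [lsort -increasing -dict [dict keys $d]] {
        lappend result $key [dict get $d $key]
    }
    return $result
}

# Derive the canonical form of a regular serialization.
proc idx_canonical {serial} {
    set idx  [dict get $serial doctools::idx]
    set refs [dict get $idx references]
    set keywords {}
    dict for {kw ids} [dict get $idx keywords] {
        # Pair each identifier with its label, sort the pairs by
        # label, then keep only the identifiers.
        set pairs {}
        foreach id $ids {
            lappend pairs [list [lindex [dict get $refs $id] 1] $id]
        }
        set sorted {}
        foreach pair [lsort -increasing -dict -index 0 $pairs] {
            lappend sorted [lindex $pair 1]
        }
        dict set keywords $kw $sorted
    }
    dict set idx keywords   [idx_sortdict $keywords]
    dict set idx references [idx_sortdict $refs]
    return [list doctools::idx [idx_sortdict $idx]]
}

# Hypothetical sample: keys and reference lists are not yet sorted.
set serial {
    doctools::idx {
        title      T
        label      L
        keywords   {b {x y} a {x}}
        references {y {url Alpha} x {manpage Beta}}
    }
}
puts [idx_canonical $serial]
}]

[para]

In this sketch the reference list of keyword [const b] is reordered
because the label of [const y] sorts before the label of [const x],
even though the identifiers themselves sort the other way around.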