docbook.params.man.charmap.subset.profile.xml
man.charmap.subset.profile
string
man.charmap.subset.profile
Profile of character map subset
@*[local-name() = 'block'] = 'Miscellaneous Technical' or
(@*[local-name() = 'block'] = 'C1 Controls And Latin-1 Supplement (Latin-1 Supplement)' and
(@*[local-name() = 'class'] = 'symbols' or
@*[local-name() = 'class'] = 'letters')
) or
@*[local-name() = 'block'] = 'Latin Extended-A'
or
(@*[local-name() = 'block'] = 'General Punctuation' and
(@*[local-name() = 'class'] = 'spaces' or
@*[local-name() = 'class'] = 'dashes' or
@*[local-name() = 'class'] = 'quotes' or
@*[local-name() = 'class'] = 'bullets'
)
) or
@*[local-name() = 'name'] = 'HORIZONTAL ELLIPSIS' or
@*[local-name() = 'name'] = 'WORD JOINER' or
@*[local-name() = 'name'] = 'SERVICE MARK' or
@*[local-name() = 'name'] = 'TRADE MARK SIGN' or
@*[local-name() = 'name'] = 'ZERO WIDTH NO-BREAK SPACE'
Description
If the value of the
man.charmap.use.subset parameter is non-zero,
and your DocBook source is not written in English (that
is, if the lang or xml:lang attribute on the root element
in your DocBook source or on the first refentry
element in your source has a value other than
en), then the character-map subset specified
by the man.charmap.subset.profile
parameter is used instead of the full roff character map.
Otherwise, if the lang or xml:lang attribute on the root
element in your DocBook
source or on the first refentry element in your source
has the value en, or if it has no lang or xml:lang attribute, then the character-map
subset specified by the
man.charmap.subset.profile.english
parameter is used instead of
man.charmap.subset.profile.
The difference between the two subsets is that
man.charmap.subset.profile provides
mappings for characters in Western European languages that are
not part of the Roman (English) alphabet (ASCII character set).
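As a rough illustration of the language test described above, the two separate, hypothetical refentry fragments below differ only in their lang attribute. The element content is made up for the example; only the attribute values matter, and both cases assume man.charmap.use.subset is non-zero.

```xml
<!-- Fragment 1: lang="de" is a value other than "en", so the subset given by
     man.charmap.subset.profile is used -->
<refentry lang="de">
  <refnamediv>
    <refname>beispiel</refname>
    <refpurpose>ein Beispielprogramm</refpurpose>
  </refnamediv>
</refentry>

<!-- Fragment 2: lang="en" (or no lang/xml:lang attribute at all), so the
     subset given by man.charmap.subset.profile.english is used instead -->
<refentry lang="en">
  <refnamediv>
    <refname>example</refname>
    <refpurpose>an example program</refpurpose>
  </refnamediv>
</refentry>
```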
The value of man.charmap.subset.profile
is a string representing an XPath expression that matches attribute
names and values for output-character
elements in the character map.
The attributes supported in the standard roff character map included in the distribution are:
character
a raw Unicode character or numeric Unicode
character-entity value (either in decimal or hex); all
characters have this attribute
name
a standard full/long ISO/Unicode character name (e.g.,
"OHM SIGN"); all characters have this attribute
block
a standard Unicode "block" name (e.g., "General
Punctuation"); all characters have this attribute. For the full
list of Unicode block names supported in the standard roff
character map, see "Supported Unicode block names and "class" values" below.
class
a class of characters (e.g., "spaces"). Not all
characters have this attribute; currently, it is used only with
certain characters within the "C1 Controls And Latin-1
Supplement" and "General Punctuation" blocks. For details, see
"Supported Unicode block names and "class" values" below.
entity
an ISO entity name (e.g., "ohm"); not all characters
have this attribute, because not all characters have ISO entity
names; for example, of the 800 or so characters in the standard
roff character map included in the distribution, only around 300
have ISO entity names.
string
a string representing an roff/groff escape-code (with
"@esc@" used in place of the backslash), or a simple ASCII
string; all characters in the roff character map have this
attribute
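To make the attribute list above more concrete, here is a sketch of what a single entry in the character map might look like. This entry is hypothetical and written for illustration only: the attribute names come from the list above, but the string value is an assumption, and the distributed map may namespace-qualify some of these attributes.

```xml
<!-- Hypothetical character-map entry; not copied from the distribution -->
<xsl:output-character
    character="&#x2126;"
    name="OHM SIGN"
    block="Letterlike Symbols"
    entity="ohm"
    string="@esc@(*W"/>
```

A profile clause such as @*[local-name() = 'name'] = 'OHM SIGN' or @*[local-name() = 'block'] = 'Letterlike Symbols' would select an entry like this one. Because the test is on local-name(), it matches the attribute whether or not it carries a namespace prefix.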
The value of man.charmap.subset.profile
is evaluated as an XPath expression at run-time to select a portion of
the roff character map to use. You can tune the subset used by adding
or removing parts. For example, if you need to use a wide range of
mathematical operators in a document, and you want to have them
converted into roff markup properly, you might add the following:
@*[local-name() = 'block'] = 'Mathematical Operators'
That will cause an additional set of around 67 "math"
characters to be converted into roff markup.
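One way to apply such a change is through a customization layer that overrides the parameter. The following is a minimal sketch, assuming the stock manpages stylesheet is reachable at the import URI shown (adjust it to your installation). It repeats the default expression listed at the top of this page and joins the extra block test to it with "or".

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock manpages stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Default profile, with one extra clause appended at the end -->
  <xsl:param name="man.charmap.subset.profile">
    @*[local-name() = 'block'] = 'Miscellaneous Technical' or
    (@*[local-name() = 'block'] = 'C1 Controls And Latin-1 Supplement (Latin-1 Supplement)' and
     (@*[local-name() = 'class'] = 'symbols' or
      @*[local-name() = 'class'] = 'letters')
    ) or
    @*[local-name() = 'block'] = 'Latin Extended-A'
    or
    (@*[local-name() = 'block'] = 'General Punctuation' and
     (@*[local-name() = 'class'] = 'spaces' or
      @*[local-name() = 'class'] = 'dashes' or
      @*[local-name() = 'class'] = 'quotes' or
      @*[local-name() = 'class'] = 'bullets'
     )
    ) or
    @*[local-name() = 'name'] = 'HORIZONTAL ELLIPSIS' or
    @*[local-name() = 'name'] = 'WORD JOINER' or
    @*[local-name() = 'name'] = 'SERVICE MARK' or
    @*[local-name() = 'name'] = 'TRADE MARK SIGN' or
    @*[local-name() = 'name'] = 'ZERO WIDTH NO-BREAK SPACE' or
    @*[local-name() = 'block'] = 'Mathematical Operators'
  </xsl:param>

</xsl:stylesheet>
```

The block name must be spelled exactly as it appears in the list of supported Unicode block names, including the space.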
Depending on which XSLT engine you use, either the EXSLT
dyn:evaluate extension function (for xsltproc or
Xalan) or the saxon:evaluate extension function (for
Saxon) is used to dynamically evaluate the value of
man.charmap.subset.profile at run-time. If you
don't use xsltproc, Saxon, Xalan, or some other XSLT engine that
supports dyn:evaluate, you must either set the
value of the man.charmap.use.subset parameter
to zero and process your documents using the full character map,
or set the value of the
man.charmap.enabled parameter to zero
(so that character-map processing is disabled completely).
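The two fallbacks just described could be expressed in a customization layer roughly as follows. This is a sketch under the assumption that the stock manpages stylesheet is available at the import URI shown; only one of the two options should be active at a time.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock manpages stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Option 1: keep character-map processing, but use the full map
       so that no dynamic XPath evaluation is needed -->
  <xsl:param name="man.charmap.use.subset" select="0"/>

  <!-- Option 2 (use instead of option 1): disable character-map
       processing completely -->
  <!-- <xsl:param name="man.charmap.enabled" select="0"/> -->

</xsl:stylesheet>
```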
An alternative to using
man.charmap.subset.profile is to create your
own custom character map, and set the value of
man.charmap.uri to the URI/filename for
that. If you use a custom character map, you will probably want to
include in it just the characters you want to use, and so you will
most likely also want to set the value of
man.charmap.use.subset to zero.
You can create a
custom character map by making a copy of the standard roff character map provided in the distribution, and
then adding to, changing, and/or deleting from that.
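A customization layer that points at a custom character map might look like the sketch below. The file name my-roff-charmap.xsl is hypothetical, and the import URI is an assumed location for the stock manpages stylesheet; an absolute URI for the map avoids any ambiguity about how a relative name is resolved.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock manpages stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Hypothetical file: an edited copy of the standard roff character map -->
  <xsl:param name="man.charmap.uri">my-roff-charmap.xsl</xsl:param>

  <!-- The custom map already holds only the wanted characters,
       so subset profiling is switched off -->
  <xsl:param name="man.charmap.use.subset" select="0"/>

</xsl:stylesheet>
```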
If you author your DocBook XML source in UTF-8 or UTF-16
encoding and aren't sure what OSes or environments your man-page
output might end up being viewed on, or what version of
nroff/groff those environments might have, you should be careful about
what Unicode symbols and special characters you use in your source and
what parts you add to the value of
man.charmap.subset.profile.
Many of the escape codes used are specific to groff and using
them may not provide the expected output on an OS or environment that
uses nroff instead of groff.
On the other hand, if you intend for your man-page output to be
viewed only on modern systems (for example, GNU/Linux systems, FreeBSD
systems, or Cygwin environments) that have a good, up-to-date groff,
then you can safely include a wide range of Unicode symbols and
special characters in your UTF-8 or UTF-16 encoded DocBook XML source
and add any of the supported Unicode block names to the value of
man.charmap.subset.profile.
For other details, see the documentation for the
man.charmap.use.subset parameter.
Supported Unicode block names and "class" values
Below is the full list of Unicode block names and "class"
values supported in the standard roff stylesheet provided in the
distribution, along with a description of which codepoints from the
Unicode range corresponding to that block name or block/class
combination are supported.
C1 Controls And Latin-1 Supplement (Latin-1 Supplement) (x00a0 to x00ff)
  class values
    symbols
    letters
Latin Extended-A (x0100 to x017f, partial)
Spacing Modifier Letters (x02b0 to x02ee, partial)
Greek and Coptic (x0370 to x03ff, partial)
General Punctuation (x2000 to x206f, partial)
  class values
    spaces
    dashes
    quotes
    daggers
    bullets
    leaders
    primes
Superscripts and Subscripts (x2070 to x209f)
Currency Symbols (x20a0 to x20b1)
Letterlike Symbols (x2100 to x214b)
Number Forms (x2150 to x218f)
Arrows (x2190 to x21ff, partial)
Mathematical Operators (x2200 to x22ff, partial)
Control Pictures (x2400 to x243f)
Enclosed Alphanumerics (x2460 to x24ff)
Geometric Shapes (x25a0 to x25f7, partial)
Miscellaneous Symbols (x2600 to x26ff, partial)
Dingbats (x2700 to x27be, partial)
Alphabetic Presentation Forms (xfb00 to xfb04 only)