



man.charmap.subset.profile
string


man.charmap.subset.profile
Profile of character map subset





@*[local-name() = 'block'] = 'Miscellaneous Technical' or
(@*[local-name() = 'block'] = 'C1 Controls And Latin-1 Supplement (Latin-1 Supplement)' and
 (@*[local-name() = 'class'] = 'symbols' or
  @*[local-name() = 'class'] = 'letters')
) or
@*[local-name() = 'block'] = 'Latin Extended-A'
or
(@*[local-name() = 'block'] = 'General Punctuation' and
 (@*[local-name() = 'class'] = 'spaces' or
  @*[local-name() = 'class'] = 'dashes' or
  @*[local-name() = 'class'] = 'quotes' or
  @*[local-name() = 'class'] = 'bullets'
 )
) or
@*[local-name() = 'name'] = 'HORIZONTAL ELLIPSIS' or
@*[local-name() = 'name'] = 'WORD JOINER' or
@*[local-name() = 'name'] = 'SERVICE MARK' or
@*[local-name() = 'name'] = 'TRADE MARK SIGN' or
@*[local-name() = 'name'] = 'ZERO WIDTH NO-BREAK SPACE'




Description

If the value of the
man.charmap.use.subset parameter is non-zero,
and your DocBook source is not written in English (that
  is, if the lang or xml:lang attribute on the root element
  in your DocBook source or on the first refentry
  element in your source has a value other than
  en), then the character-map subset specified
  by the man.charmap.subset.profile
  parameter is used instead of the full roff character map.

Otherwise, if the lang or xml:lang attribute on the root
  element in your DocBook
  source or on the first refentry element in your source
  has the value en or if it has no lang or xml:lang attribute, then the character-map
  subset specified by the
  man.charmap.subset.profile.english
  parameter is used instead of
  man.charmap.subset.profile.

The difference between the two subsets is that
  man.charmap.subset.profile provides
  mappings for characters in Western European languages that are
  not part of the Roman (English) alphabet (ASCII character set).

The value of man.charmap.subset.profile
is a string representing an XPath expression that matches attribute
names and values for output-character
elements in the character map.

The attributes supported in the standard roff character map included in the distribution are:

  
    character
    
      a raw Unicode character or numeric Unicode
      character-entity value (either in decimal or hex); all
      characters have this attribute
    
  
  
    name
    
      a standard full/long ISO/Unicode character name (e.g.,
      "OHM SIGN"); all characters have this attribute
    
  
  
    block
    
      a standard Unicode "block" name (e.g., "General
      Punctuation"); all characters have this attribute. For the full
      list of Unicode block names supported in the standard roff
      character map, see Supported Unicode block names and "class" values.
    
  
  
    class
    
      a class of characters (e.g., "spaces"). Not all
      characters have this attribute; currently, it is used only with
      certain characters within the "C1 Controls And Latin-1
      Supplement" and "General Punctuation" blocks. For details, see
      Supported Unicode block names and "class" values.
    
  
  
    entity
    
      an ISO entity name (e.g., "ohm"); not all characters
      have this attribute, because not all characters have ISO entity
      names; for example, of the 800 or so characters in the standard
      roff character map included in the distribution, only around 300
      have ISO entity names.
      
    
  
  
    string
    
      a string representing a roff/groff escape code (with
      "@esc@" used in place of the backslash), or a simple ASCII
      string; all characters in the roff character map have this
      attribute
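
To make the attribute list above concrete, here is a sketch of what a single entry in a character map might look like. It is not copied from the distributed map: the u prefix and its namespace URI are hypothetical, and the string value is only a plausible groff escape for the "OHM SIGN" example mentioned above. In this sketch the descriptive attributes carry a namespace prefix, which would explain why the profile expression matches them with local-name() rather than by a plain attribute name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative entry only; the u: namespace URI is a hypothetical placeholder
     and the string value is an assumed escape, not taken from the real map. -->
<xsl:output-character
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:u="http://example.org/ns/unichar"
    u:block="Letterlike Symbols"
    u:name="OHM SIGN"
    u:entity="ohm"
    character="&#x2126;"
    string="@esc@(*W"/>
```

An entry like this has no class attribute, so a profile clause that tests only the block or the name would be the way to select it.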
    
  


The value of man.charmap.subset.profile
is evaluated as an XPath expression at run-time to select a portion of
the roff character map to use. You can tune the subset used by adding
or removing parts. For example, if you need to use a wide range of
mathematical operators in a document, and you want to have them
converted into roff markup properly, you might add the following:

  @*[local-name() = 'block'] = 'Mathematical Operators'

That will cause an additional set of around 67 "math"
characters to be converted into roff markup.
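
As a sketch of how that addition could be applied, the customization layer below redefines man.charmap.subset.profile with the default expression shown at the top of this page, joined by "or" to the extra clause. The import URI is an assumption (the conventional catalog-resolved location of the manpages stylesheet; substitute the path to your own copy), and the block name is written with a space so that it matches the entry in the list of supported block names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed location of the manpages stylesheet; adjust for your setup. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Default profile, copied unchanged, plus one extra clause at the end. -->
  <xsl:param name="man.charmap.subset.profile">
    @*[local-name() = 'block'] = 'Miscellaneous Technical' or
    (@*[local-name() = 'block'] = 'C1 Controls And Latin-1 Supplement (Latin-1 Supplement)' and
     (@*[local-name() = 'class'] = 'symbols' or
      @*[local-name() = 'class'] = 'letters')
    ) or
    @*[local-name() = 'block'] = 'Latin Extended-A' or
    (@*[local-name() = 'block'] = 'General Punctuation' and
     (@*[local-name() = 'class'] = 'spaces' or
      @*[local-name() = 'class'] = 'dashes' or
      @*[local-name() = 'class'] = 'quotes' or
      @*[local-name() = 'class'] = 'bullets')
    ) or
    @*[local-name() = 'name'] = 'HORIZONTAL ELLIPSIS' or
    @*[local-name() = 'name'] = 'WORD JOINER' or
    @*[local-name() = 'name'] = 'SERVICE MARK' or
    @*[local-name() = 'name'] = 'TRADE MARK SIGN' or
    @*[local-name() = 'name'] = 'ZERO WIDTH NO-BREAK SPACE' or
    @*[local-name() = 'block'] = 'Mathematical Operators'
  </xsl:param>

</xsl:stylesheet>
```

Redefining the parameter replaces the default rather than extending it, which is why the whole expression is repeated here.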


Depending on which XSLT engine you use, either the EXSLT
dyn:evaluate extension function (for xsltproc or
Xalan) or the saxon:evaluate extension function (for
Saxon) is used to dynamically evaluate the value of
man.charmap.subset.profile at run-time. If you
don't use xsltproc, Saxon, Xalan, or some other XSLT engine that
supports dyn:evaluate, you must either set the
value of the man.charmap.use.subset parameter
to zero and process your documents using the full character map,
or set the value of the
man.charmap.enabled parameter to zero
(so that character-map processing is disabled completely).
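
For an engine without either extension function, the fallback described above might look like this in a customization layer. This is a minimal sketch: the import URI is an assumption about where the manpages stylesheet lives, and only one of the two parameters needs to be set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed location of the manpages stylesheet; adjust for your setup. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Option 1: skip subset selection and use the full character map. -->
  <xsl:param name="man.charmap.use.subset" select="0"/>

  <!-- Option 2 (alternative): disable character-map processing entirely.
       To use it, remove option 1 and move the line below out of this comment.
  <xsl:param name="man.charmap.enabled" select="0"/>
  -->

</xsl:stylesheet>
```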


An alternative to using
man.charmap.subset.profile is to create your
own custom character map, and set the value of
man.charmap.uri to the URI/filename for
that. If you use a custom character map, you will probably want to
include in it just the characters you want to use, and so you will
most likely also want to set the value of
man.charmap.use.subset to zero.
You can create a
custom character map by making a copy of the standard roff character map provided in the distribution, and
then adding to, changing, and/or deleting from that.
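
A customization layer that points the stylesheets at such a custom map could be sketched as follows. The file path is hypothetical, and the import URI is an assumption about where the manpages stylesheet lives. An absolute URI is used here to sidestep any question of how a relative one would be resolved.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed location of the manpages stylesheet; adjust for your setup. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Hypothetical path to an edited copy of the standard roff character map. -->
  <xsl:param name="man.charmap.uri">file:///path/to/my-charmap.xsl</xsl:param>

  <!-- The custom map already holds only the wanted characters,
       so no subset profile is applied on top of it. -->
  <xsl:param name="man.charmap.use.subset" select="0"/>

</xsl:stylesheet>
```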


If you author your DocBook XML source in UTF-8 or UTF-16
encoding and aren't sure what OSes or environments your man-page
output might end up being viewed on, or what version of
nroff/groff those environments might have, you should be careful about
what Unicode symbols and special characters you use in your source and
what parts you add to the value of
man.charmap.subset.profile.
Many of the escape codes used are specific to groff and using
them may not provide the expected output on an OS or environment that
uses nroff instead of groff.
On the other hand, if you intend for your man-page output to be
viewed only on modern systems (for example, GNU/Linux systems, FreeBSD
systems, or Cygwin environments) that have a good, up-to-date groff,
then you can safely include a wide range of Unicode symbols and
special characters in your UTF-8 or UTF-16 encoded DocBook XML source
and add any of the supported Unicode block names to the value of
man.charmap.subset.profile.



For other details, see the documentation for the
man.charmap.use.subset parameter.

Supported Unicode block names and "class" values
  

  Below is the full list of Unicode block names and "class"
  values supported in the standard roff stylesheet provided in the
  distribution, along with a description of which codepoints from the
  Unicode range corresponding to that block name or block/class
  combination are supported.

  
    
      C1 Controls And Latin-1 Supplement (Latin-1 Supplement) (x00a0 to x00ff)
      class values
        
        
          symbols
        
        
          letters
        
      
    
    
      Latin Extended-A (x0100 to x017f, partial)
    
    
      Spacing Modifier Letters (x02b0 to x02ee, partial)
    
    
      Greek and Coptic (x0370 to x03ff, partial)
    
    
      General Punctuation (x2000 to x206f, partial)
      class values
        
        
          spaces
        
        
          dashes
        
        
          quotes
        
        
          daggers
        
        
          bullets
        
        
          leaders
        
        
          primes
        
      
      
    
    
      Superscripts and Subscripts (x2070 to x209f)
    
    
      Currency Symbols (x20a0 to x20b1)
    
    
      Letterlike Symbols (x2100 to x214b)
    
    
      Number Forms (x2150 to x218f)
    
    
      Arrows (x2190 to x21ff, partial)
    
    
      Mathematical Operators (x2200 to x22ff, partial)
    
    
      Control Pictures (x2400 to x243f)
    
    
      Enclosed Alphanumerics (x2460 to x24ff)
    
    
      Geometric Shapes (x25a0 to x25f7, partial)
    
    
      Miscellaneous Symbols (x2600 to x26ff, partial)
    
    
      Dingbats (x2700 to x27be, partial)
    
    
      Alphabetic Presentation Forms (xfb00 to xfb04 only)
    
  






