

Functions for Encoding Handling

As mentioned earlier, the encoding mechanism of the PostScript language allows a font to contain more than 256 different characters, although only 256 are accessible at any given time. The accessible characters are those named by the elements of the current encoding vector. To maximize flexibility, t1lib allows the current encoding vector to be changed; this is called "reencoding a font". A new encoding vector is defined and made known to the library by creating an encoding file and loading its contents into memory. Before describing the functions needed for this, we briefly describe the format of an encoding file.

An encoding file is an ASCII text file. No assumptions about filename extensions are made. Here are the rules for scanning the file:

As in PostScript, non-existent characters must be named .notdef.

Here's an example of such an encoding file:

Sample encoding file for t1lib!
The first two lines are considered to be comments!
Encoding=ISOLatin1Encoding       
.notdef                          /* '000  000  "00 */ 
.notdef                          /* '001  001  "01 */ 
.notdef                          /* '002  002  "02 */ 
  .                                   .
  .                                   .
  .                                   .
greater                          /* '076  062  "3E */
question                         /* '077  063  "3F */
at                               /* '100  064  "40 */
A                                /* '101  065  "41 */
B                                /* '102  066  "42 */
  .                                   .
  .                                   .
  .                                   .
yacute                           /* '375  253  "FD */
thorn                            /* '376  254  "FE */
ydieresis                        /* '377  255  "FF */
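A file with this layout can also be generated programmatically. The following is a minimal sketch, assuming the layout inferred from the sample above (comment lines first, then an Encoding= line, then one character name per line with a trailing comment). The function name write_encoding_file is hypothetical and not part of t1lib.

```c
#include <stdio.h>

/* Writes an encoding file laid out like the sample above: a free-form
 * comment line, an "Encoding=<scheme>" line, then one character name
 * per line for codes 0..255. NULL entries are written as ".notdef".
 * Returns 0 on success, -1 on a write error.
 * Hypothetical helper, not a t1lib function. */
int write_encoding_file(FILE *out, const char *scheme,
                        const char *const names[256])
{
    if (fprintf(out, "Generated encoding file\n") < 0)
        return -1;
    if (fprintf(out, "Encoding=%s\n", scheme) < 0)
        return -1;

    for (int code = 0; code < 256; code++) {
        const char *name = names[code] ? names[code] : ".notdef";
        /* Trailing comment mirrors the sample: octal, decimal, hex. */
        if (fprintf(out, "%-32s /* '%03o  %03d  \"%02X */\n",
                    name, (unsigned)code, code, (unsigned)code) < 0)
            return -1;
    }
    return 0;
}
```

A caller would fill a 256-entry array with the desired character names, leave unused slots NULL, and pass an open FILE * together with the scheme name.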

Since V. 1.2, t1lib is also able to load encoding files in the format used by dvips. This makes a large set of existing encoding files available to the user. When parsing dvips encoding files, t1lib requires PostScript syntax: white space may be interspersed freely, and line comments are introduced by the character %. The mark characters [ and ] are treated as special tokens and need not be preceded or followed by white space. Similarly, the literal escape character / terminates a preceding token without intervening white space. t1lib tolerates fewer than 256 character name definitions in a dvips encoding file; missing characters are substituted by .notdef until the counter reaches 256. Aside from comments, no PostScript tokens are allowed after the encoding definition in a dvips encoding file is complete.
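The tokenizing rules just described can be illustrated with a small parser. This is only a sketch and not t1lib's own scanner: it assumes the usual dvips layout "/SchemeName [ /name ... ] def", and the names next_token, parse_dvips_encoding, ENC_SIZE and NAME_MAX_LEN are hypothetical.

```c
#include <ctype.h>
#include <string.h>

#define ENC_SIZE 256
#define NAME_MAX_LEN 64

/* Copies the next token from *p into tok (truncated to cap-1 chars).
 * Skips white space and '%' line comments. '[' and ']' are tokens of
 * their own, and '/' starts a new token, ending the preceding one.
 * Returns 1 if a token was read, 0 at end of input. */
static int next_token(const char **p, char *tok, size_t cap)
{
    const char *s = *p;
    size_t n = 0;

    for (;;) {
        while (*s && isspace((unsigned char)*s)) s++;
        if (*s != '%') break;
        while (*s && *s != '\n') s++;          /* skip the comment */
    }
    if (!*s) { *p = s; return 0; }

    if (*s == '[' || *s == ']') {
        tok[n++] = *s++;
    } else {
        do {
            if (n + 1 < cap) tok[n++] = *s;
            s++;
        } while (*s && !isspace((unsigned char)*s) &&
                 *s != '[' && *s != ']' && *s != '/' && *s != '%');
    }
    tok[n] = '\0';
    *p = s;
    return 1;
}

/* Parses "/Scheme [ /name ... ] def". Returns the number of names
 * defined explicitly, or -1 on a syntax error. Missing entries are
 * filled with ".notdef" up to ENC_SIZE. */
int parse_dvips_encoding(const char *text, char scheme[NAME_MAX_LEN],
                         char names[ENC_SIZE][NAME_MAX_LEN])
{
    char tok[NAME_MAX_LEN];
    int count = 0;

    if (!next_token(&text, tok, sizeof tok) || tok[0] != '/' || !tok[1])
        return -1;
    strcpy(scheme, tok + 1);
    if (!next_token(&text, tok, sizeof tok) || strcmp(tok, "[") != 0)
        return -1;

    for (;;) {
        if (!next_token(&text, tok, sizeof tok)) return -1; /* no ']' */
        if (strcmp(tok, "]") == 0) break;
        if (tok[0] != '/' || !tok[1] || count == ENC_SIZE) return -1;
        strcpy(names[count++], tok + 1);
    }
    if (!next_token(&text, tok, sizeof tok) || strcmp(tok, "def") != 0)
        return -1;
    if (next_token(&text, tok, sizeof tok))
        return -1;                 /* only comments may follow "def" */

    for (int i = count; i < ENC_SIZE; i++)
        strcpy(names[i], ".notdef");
    return count;
}
```

Because a t1lib-format file does not begin with a literal name token, this sketch rejects it in the first step, which is what makes a "try dvips first, then fall back" strategy workable.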

With the definitions above, a file that has been scanned successfully as a dvips encoding file cannot also specify a valid t1lib encoding after the PostScript encoding definition is complete (because no valid character name can start with % and because at least a line such as Encoding= would have to follow the PostScript encoding). Hence the two file formats are mutually exclusive, and both can be read using one function. In a first pass t1lib tries to read the file as a dvips encoding file; if that fails, it assumes the file is a t1lib encoding file.

Once such an encoding file of either type has been created, it can be loaded into memory. This is done with the function

 char **T1_LoadEncoding( char *filename)

The function uses the search path definitions read from the configuration file during initialization (see ENCODING=). If no errors occur, an array of pointers to strings is created and initialized, and its start address is returned as a char **. This pointer is intended to be passed to T1_ReencodeFont() in order to reencode a font. If the encoding data structure could not be created, NULL is returned to indicate the error.

The memory allocated by T1_LoadEncoding() is organized in two contiguous blocks. One block is the pointer array of size 257; the other contains the character name strings plus the encoding scheme specification, separated by ASCII zeros. This memory can be returned to the system using the function

 int T1_DeleteEncoding( char **Encoding)

t1lib does not check whether a valid pointer value was passed, so be careful to pass the correct pointer. Passing an invalid pointer will almost always result in a segmentation violation.
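To make the two-block layout concrete, here is a small model of such a structure: a pointer array with 257 entries (256 character names plus the scheme name at index 256) pointing into one block of zero-terminated strings. It is a sketch for illustration only and not t1lib's source; the function names are hypothetical, and the order of the strings inside the block is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Builds a 257-entry pointer array whose entries point into one
 * contiguous block of zero-terminated strings: 256 character names
 * followed by the encoding scheme name. Returns NULL on failure.
 * Hypothetical model, not a t1lib function. */
char **model_build_encoding(const char *const names[256],
                            const char *scheme)
{
    size_t total = strlen(scheme) + 1;
    for (int i = 0; i < 256; i++)
        total += strlen(names[i]) + 1;

    char **vec = malloc(257 * sizeof *vec);   /* block 1: pointers */
    char *pool = malloc(total);               /* block 2: strings  */
    if (!vec || !pool) {
        free(vec);
        free(pool);
        return NULL;
    }

    char *p = pool;
    for (int i = 0; i <= 256; i++) {
        const char *src = (i < 256) ? names[i] : scheme;
        size_t len = strlen(src) + 1;         /* include the zero */
        memcpy(p, src, len);
        vec[i] = p;
        p += len;
    }
    return vec;
}

/* Releases both blocks. vec[0] is the start of the string block. */
void model_delete_encoding(char **vec)
{
    if (!vec)
        return;
    free(vec[0]);
    free(vec);
}
```

The model also shows why an arbitrary pointer cannot be released safely: the deletion routine has to trust that the first array entry marks the start of the string block.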

A newly loaded encoding is applied to an existing font by calling

 int T1_ReencodeFont( int FontID, char **Encoding)

FontID must be a valid font identification and Encoding a pointer returned from a successful call to T1_LoadEncoding().

There are two requirements in order to reencode a font:

  1. The font must already have been loaded into memory.
  2. No size-dependent data exists for this font. If it does, it must be removed explicitly prior to calling T1_ReencodeFont().

It follows that there are two ways to reencode a font. The first is to load a font explicitly and reencode it before any size-dependent data is created. The second is to use an automatically loaded font and delete all of its size-dependent data before reencoding it.

The user may also specify the special pointer NULL as the Encoding argument. This reencodes the font back to its internal encoding vector.

In case of success, the function returns 0, otherwise -1 is returned.
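The calls described so far might be combined as follows. This is a hedged sketch rather than a definitive recipe: reencode_from_file is a hypothetical helper, T1_DeleteAllSizes() is the t1lib call assumed here for removing size-dependent data, and the remark that the vector should stay allocated while a font uses it is likewise an assumption.

```c
#include <stdio.h>
#include <t1lib.h>

/* Loads an encoding file and applies it to an already loaded font.
 * Returns the encoding vector on success, or NULL on failure.
 * Hypothetical helper, not part of t1lib. */
char **reencode_from_file(int font_id, char *enc_file)
{
    char **enc = T1_LoadEncoding(enc_file);
    if (enc == NULL) {
        fprintf(stderr, "could not load encoding %s\n", enc_file);
        return NULL;
    }

    /* Requirement 2: no size-dependent data may exist for the font.
     * Assumption: T1_DeleteAllSizes() is used to remove it. */
    T1_DeleteAllSizes(font_id);

    if (T1_ReencodeFont(font_id, enc) != 0) {
        fprintf(stderr, "reencoding font %d failed\n", font_id);
        T1_DeleteEncoding(enc);
        return NULL;
    }

    /* Assumption: keep the vector allocated while the font uses it;
     * release it later with T1_DeleteEncoding(). */
    return enc;
}
```

Calling T1_ReencodeFont(font_id, NULL) afterwards would switch the font back to its internal encoding vector.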

Reencoding a font takes a considerable amount of time since the mapping tables have to be reorganized. In situations where it is foreseeable a priori that the font will be reencoded using some standard encoding vector, it makes sense to assign that particular encoding vector as the default encoding vector, thereby overwriting the internal encoding vector of each font at load time before the mapping tables are set up. Setting the default encoding can be achieved using

 int T1_SetDefaultEncoding( char **Encoding)

Here, Encoding is assumed to be a valid t1lib encoding vector, e.g., one created by a call to T1_LoadEncoding(). T1_SetDefaultEncoding() has to be called after initialization. It returns 0 if this condition is fulfilled and -1 otherwise; in the latter case T1_errno is set appropriately. Notice that the internal encoding of a font remains accessible by reencoding the font with NULL as the encoding specification (see above). Note further that the default encoding vector is applied only to those fonts that have StandardEncoding as their internal encoding. This prevents fonts such as ZapfDingbats, Symbol, or Sonata from being reencoded automatically at load time, which would surely be inappropriate for such fonts.

It is also possible to query the encoding scheme that the font associated with FontID uses. This is achieved with the function

 char *T1_GetEncodingScheme( int FontID)

The return value is a pointer to a string that describes the encoding scheme in question. There are three possible cases.

Notice that the name of the encoding scheme is also accessible as Encoding[256], where Encoding is the pointer returned by a successful call to T1_LoadEncoding().


2005-01-12