Tutorial: iText by Example

The Chunk Object

Introduction:
A Chunk is the smallest significant part of text that can be added to a document. It is the 'atom' building block of most of the other high-level text objects. All the contents of a Chunk share the same font, font size, style, color, and so on. The following sections describe all Chunk functionality:
Adding lines under, through or above a Chunk:
As you can read in the chapter on Fonts, you can define the style as Font.UNDERLINE or Font.STRIKETHRU to underline the text or strike through it. This works for RTF and HTML, but in PDF, you get a lot more functionality if you use the method Chunk.setUnderline. There are two variations of this method:
  • Chunk.setUnderline(float thickness, float yPosition) draws a line that has the length of the Chunk, is thickness points thick and is positioned yPosition points above the baseline of the Chunk (a negative yPosition puts the line below the baseline).
    Chunk underlined = new Chunk("underlined");
    underlined.setUnderline(0.2f, -2f);
    Chunk strikethru = new Chunk("strike through example");
    strikethru.setUnderline(0.5f, 3f);
  • Chunk.setUnderline(Color color, float thickness, float thicknessMul, float yPosition, float yPositionMul, int cap) lets you define the color of the line. You can set the absolute thickness and y position with thickness and yPosition, but you can also let these values depend on the font size with thicknessMul and yPositionMul. In the example, the same Chunk with the same line definitions is written twice, but with a different font size. As you can see, some lines vary along with the font size, others don't. This all depends on the parameters that were passed to the method setUnderline. Finally, you can define the line cap style (see Table 4.4 in the PDF Reference Manual).
    (Figure: line cap styles)
    Chunk c = new Chunk("Multiple lines");
    c.setUnderline(new Color(0xFF, 0x00, 0x00),
    	0.0f, 0.3f, 0.0f, 0.4f,
    	PdfContentByte.LINE_CAP_ROUND);
    c.setUnderline(new Color(0x00, 0xFF, 0x00),
    	5.0f, 0.0f, 0.0f, -0.5f,
    	PdfContentByte.LINE_CAP_PROJECTING_SQUARE);
    c.setUnderline(new Color(0x00, 0x00, 0xFF),
    	0.0f, 0.2f, 15.0f, 0.0f,
    	PdfContentByte.LINE_CAP_BUTT);
Example: java com.lowagie.examples.objects.chunk.Lines
Demonstrates how to add lines under, through,... a Chunk: see Lines.pdf
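The interplay between the absolute parameters and the multipliers can be sketched in plain Java. This is only a minimal sketch: it assumes that the effective value is the absolute part plus the multiplier times the font size, and the class UnderlineMath and its method effective are hypothetical names that are not part of iText. The numbers are the ones passed to setUnderline in the snippet above; the two font sizes are arbitrary.

```java
/**
 * Hypothetical helper, not part of iText. It ASSUMES that an absolute value
 * and a font-size multiplier combine as: absolute + multiplier * fontSize.
 */
final class UnderlineMath {

    /** Combines an absolute value (in points) with a font-size-dependent part. */
    static float effective(float absolute, float multiplier, float fontSize) {
        return absolute + multiplier * fontSize;
    }

    public static void main(String[] args) {
        float[] fontSizes = {12f, 24f}; // two arbitrary sizes for comparison
        for (float size : fontSizes) {
            // Red line: thickness and position both follow the font size.
            float redThickness = effective(0.0f, 0.3f, size);
            float redY = effective(0.0f, 0.4f, size);
            // Green line: fixed thickness of 5 points, position follows the font size.
            float greenThickness = effective(5.0f, 0.0f, size);
            float greenY = effective(0.0f, -0.5f, size);
            // Blue line: thickness follows the font size, position is fixed at 15 points.
            float blueThickness = effective(0.0f, 0.2f, size);
            float blueY = effective(15.0f, 0.0f, size);
            System.out.println("font size " + size
                + ": red " + redThickness + " @ " + redY
                + ", green " + greenThickness + " @ " + greenY
                + ", blue " + blueThickness + " @ " + blueY);
        }
    }
}
```

Under this assumption, a line that only uses the multipliers grows with the font, while a line that only uses the absolute values looks the same at every font size.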
Sub- and Superscript:
If you want to add a Chunk above or below the current y-position, you can use the method setTextRise. In the example, we marked the baseline with c.setUnderline(new Color(0xC0, 0xC0, 0xC0), 0.2f, 0.0f, 0.0f, 0.0f, PdfContentByte.LINE_CAP_BUTT); to demonstrate how a positive textrise puts text above the baseline and a negative textrise below.
Note that 'underline' definitions are not affected by this value. The underline has to follow the baseline; otherwise, mixing normal text and superscript would cause a discontinuity. As for subscript: it doesn't make much sense to mix it with underlined text.
Example: java com.lowagie.examples.objects.chunk.SubSupScript
Demonstrates the use of sub- and superscript: see SubSupScript.pdf
Changing the background color:
With the methods setBackground(Color color) and setBackground(Color color, float extraLeft, float extraBottom, float extraRight, float extraTop), you can change the background area of a Chunk (for instance to highlight a word). The first method fills the box that surrounds the Chunk. With the second method, you can make the rectangle bigger or smaller on each of its four sides.
Example: java com.lowagie.examples.objects.chunk.Background
How to change the background color of a Chunk: see Background.pdf
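To get a feeling for the four extra parameters, the following plain Java sketch computes the background rectangle from the box that surrounds a Chunk. It is an illustration under assumptions, not the iText implementation: we assume that each extra value moves the corresponding side outwards (so a negative value moves it inwards), and the class BackgroundBox and the coordinates used in main are hypothetical.

```java
import java.util.Arrays;

/**
 * Hypothetical helper, not part of iText. It ASSUMES that each "extra"
 * value pushes the matching side of the box outwards by that many points.
 */
final class BackgroundBox {

    /** Returns {left, bottom, right, top} of the enlarged (or shrunken) box. */
    static float[] expand(float left, float bottom, float right, float top,
                          float extraLeft, float extraBottom,
                          float extraRight, float extraTop) {
        return new float[] {
            left - extraLeft,
            bottom - extraBottom,
            right + extraRight,
            top + extraTop
        };
    }

    public static void main(String[] args) {
        // Made-up box of a Chunk: 60 points wide and 12 points high.
        float[] bigger = expand(100f, 500f, 160f, 512f, 2f, 2f, 2f, 2f);
        float[] smaller = expand(100f, 500f, 160f, 512f, -1f, -1f, -1f, -1f);
        System.out.println("bigger:  " + Arrays.toString(bigger));
        System.out.println("smaller: " + Arrays.toString(smaller));
    }
}
```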
Stroking vs. Filling:
In PDF syntax, characters (glyphs) are regarded as filled shapes. If you want to change the color of a character, you change the color of the font, but in reality you change the 'fill' color. This is demonstrated in the example:
Example: java com.lowagie.examples.objects.chunk.ChunkColor
Changing the color of a Chunk: see ChunkColor.pdf
With the method setTextRenderMode(int mode, float strokeWidth, Color strokeColor), you can also change the 'outline' of the text. As you can read in the chapters on direct content, lines are not 'filled', but 'stroked'. With the parameters strokeWidth and strokeColor, you define the width and the color of the strokes used to draw the character.
The mode parameter can be one of the following values:
  1. PdfContentByte.TEXT_RENDER_MODE_FILL: glyphs will be filled (fontcolor), the strokeWidth and strokeColor don't play a role here.
  2. PdfContentByte.TEXT_RENDER_MODE_STROKE: you will only see the outline of the glyphs, with the given strokeWidth and strokeColor.
  3. PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE: glyphs will be filled with the fontcolor and will have an outline with the given strokeWidth and strokeColor.
  4. PdfContentByte.TEXT_RENDER_MODE_INVISIBLE: the glyphs will be invisible.
This method can also be used to simulate a bold font (for instance if you are using a font that doesn't support Font.BOLD): fill the glyphs and stroke them with a thin outline in the same color.
Example: java com.lowagie.examples.objects.chunk.Rendering
Some special rendering functionality: see Rendering.pdf
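The four modes differ only in whether the glyphs are filled, stroked, both or neither. The plain Java sketch below makes that explicit. It is a hedged illustration: the class RenderModes is hypothetical, and the constants are redefined locally (with the values 0 to 3 that the PDF text rendering modes use) so that the sketch runs without iText on the classpath.

```java
/** Hypothetical helper, not part of iText; the constants are redefined locally. */
final class RenderModes {

    // Values of the first four PDF text rendering modes.
    static final int FILL = 0;
    static final int STROKE = 1;
    static final int FILL_STROKE = 2;
    static final int INVISIBLE = 3;

    /** True when the inside of the glyphs is painted with the font (fill) color. */
    static boolean fills(int mode) {
        return mode == FILL || mode == FILL_STROKE;
    }

    /** True when strokeWidth and strokeColor have a visible effect. */
    static boolean strokes(int mode) {
        return mode == STROKE || mode == FILL_STROKE;
    }

    public static void main(String[] args) {
        for (int mode = FILL; mode <= INVISIBLE; mode++) {
            System.out.println("mode " + mode
                + ": fills=" + fills(mode) + ", strokes=" + strokes(mode));
        }
    }
}
```

Seen this way, simulating bold comes down to choosing the one mode for which both methods return true.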
Skewing text:
The method setSkew(float alpha, float beta) can be used to simulate italic fonts (if you are using a font that doesn't support Font.ITALIC): just take 0f as the value for alpha and 12f as the value for beta.
Alpha is the angle of the baseline in degrees. In the example, we rotate some text 45f and -45f degrees. Beta is the angle, also in degrees, between the characters and the perpendicular to the baseline.
Example: java com.lowagie.examples.objects.chunk.Skew
Skewing Chunks: see Skew.pdf
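What the two angles do to a point of a glyph can be approximated with a little trigonometry. The sketch below is an assumption about the geometry, not code taken from iText: it treats both angles as shear angles, so that the tangent of alpha tilts the baseline and the tangent of beta makes the vertical strokes lean. The class SkewMath and its method shear are hypothetical.

```java
/**
 * Hypothetical helper, not part of iText. It ASSUMES that alpha and beta
 * act as shear angles: tan(alpha) tilts the baseline, tan(beta) slants
 * the vertical strokes.
 */
final class SkewMath {

    /** Maps the point (x, y) to its sheared position {x', y'}. */
    static double[] shear(double alphaDegrees, double betaDegrees, double x, double y) {
        double a = Math.tan(Math.toRadians(alphaDegrees)); // slope of the baseline
        double b = Math.tan(Math.toRadians(betaDegrees));  // lean of the vertical strokes
        return new double[] { x + b * y, a * x + y };
    }

    public static void main(String[] args) {
        // Simulated italic: the top of a 10 point high stem moves to the right.
        double[] italic = shear(0, 12, 0, 10);
        System.out.println("italic stem top: " + italic[0] + ", " + italic[1]);
        // Tilted baseline: a point 10 points along the baseline is lifted up.
        double[] tilted = shear(45, 0, 10, 0);
        System.out.println("tilted baseline: " + tilted[0] + ", " + tilted[1]);
    }
}
```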
Generic functionality:
If all the features summed up above still don't meet your needs, you can create your own functionality using the page event onGenericTag(PdfWriter writer, Document document, Rectangle rect, String text). You can create your own class implementing the PdfPageEvent interface. The best way to do this is to extend the class PdfPageEventHelper and override the onGenericTag method with your custom functionality.
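import com.lowagie.text.Document;
import com.lowagie.text.Rectangle;
import com.lowagie.text.pdf.PdfPageEventHelper;
import com.lowagie.text.pdf.PdfWriter;

public class Generic extends PdfPageEventHelper {
  // Called for every Chunk that was tagged with setGenericTag
  public void onGenericTag(PdfWriter writer, Document document, Rectangle rect, String text) {
    // writer is the current writer to which the Chunk is written
    // document is the document to which the Chunk was added
    // rect is the area surrounding the Chunk
    // text is the text you passed with setGenericTag
  }
}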
Don't forget to set the PageEvent:
writer.setPageEvent(new Generic());
In the first example, we look at the text that is passed with the generic tag (setGenericTag(String text)). If this text equals "ellipse", we draw an ellipse OVER the Chunk. If we passed the word "box", we draw a rectangle UNDER the Chunk.
public void onGenericTag(PdfWriter writer, Document document, Rectangle rect, String text) {
  if ("ellipse".equals(text)) {
    // The direct content is drawn over the text
    PdfContentByte cb = writer.getDirectContent();
    cb.setRGBColorStroke(0xFF, 0x00, 0x00);
    cb.ellipse(rect.left(), rect.bottom() - 5f, rect.right(), rect.top());
    cb.stroke();
    cb.resetRGBColorStroke();
  }
  else if ("box".equals(text)) {
    // The direct content 'under' is drawn beneath the text
    PdfContentByte cb = writer.getDirectContentUnder();
    // The gray fill set on rect is used when the rectangle is drawn
    rect.setGrayFill(0.5f);
    cb.rectangle(rect);
  }
}
Read the chapters on direct content to learn about the PDF syntax that gives you almost unlimited possibilities.
Example: java com.lowagie.examples.objects.chunk.Generic
Using the Generic tag to add styles: see Generic.pdf
In the following example, we use the generic tag to keep a TreeMap of words and we register the page on which the words were used. Once the document is completed, we add an extra page, that gives us a glossary of words. It's a very simple example, but with some changes, you can do a lot of really complex stuff with it.
Example: java com.lowagie.examples.objects.chunk.Glossary
Other use of the Generic tag: register keywords for a glossary: see Glossary.pdf
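The bookkeeping behind such a glossary needs nothing more than a sorted map. The following plain Java sketch shows the idea without any iText classes; it is not the code of the Glossary example. The class GlossarySketch and its methods are hypothetical, and we assume that in a real page event register would be called from onGenericTag with the tag text and the current page number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the glossary bookkeeping; not part of iText. */
final class GlossarySketch {

    // A TreeMap keeps the words in alphabetical order.
    private final Map<String, Integer> pageByWord = new TreeMap<>();

    /**
     * Remembers the page on which a word was used. This sketch keeps only
     * the last page per word; in a page event it would be called from
     * onGenericTag (an assumption, not shown here).
     */
    void register(String word, int pageNumber) {
        pageByWord.put(word, pageNumber);
    }

    /** Returns one glossary line per word, sorted alphabetically. */
    List<String> lines() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : pageByWord.entrySet()) {
            result.add(entry.getKey() + ": page " + entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        GlossarySketch glossary = new GlossarySketch();
        glossary.register("leading", 3);
        glossary.register("chunk", 1);
        glossary.register("phrase", 2);
        for (String line : glossary.lines()) {
            System.out.println(line);
        }
    }
}
```

Once the document is complete, the lines can be written to an extra page in the order in which the map returns them.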
Measuring and Scaling a Chunk:
If you have a Chunk and you want to know its width on a page, you can use the method getWidthPoint(). It gives you the width in points; to get the width in inches or centimeters, you have to convert it yourself (1 inch = 72 points = 2.54 cm).
With the method setHorizontalScaling(float scale), you can shrink (scale < 1.0f) or expand (scale > 1.0f) the Chunk. In the example we first print the Chunk in its actual size, then we print it twice at half its size (scale = 0.5f).
Example: java com.lowagie.examples.objects.chunk.Width
How to measure and scale the width of a Chunk: see Width.pdf
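The conversion from points to other units is simple arithmetic: there are 72 points in an inch and 2.54 centimeters in an inch. The plain Java sketch below shows it, together with the effect of a horizontal scaling factor. The class PointUnits is hypothetical, and the method scaledWidth rests on the assumption that horizontal scaling simply multiplies the measured width.

```java
/** Hypothetical helper for unit conversion; not part of iText. */
final class PointUnits {

    static final float POINTS_PER_INCH = 72f;
    static final float CM_PER_INCH = 2.54f;

    /** Converts a measurement in points to inches. */
    static float toInches(float points) {
        return points / POINTS_PER_INCH;
    }

    /** Converts a measurement in points to centimeters. */
    static float toCentimeters(float points) {
        return toInches(points) * CM_PER_INCH;
    }

    /** ASSUMPTION: horizontal scaling multiplies the width by the scale factor. */
    static float scaledWidth(float widthInPoints, float scale) {
        return widthInPoints * scale;
    }

    public static void main(String[] args) {
        float width = 144f; // made-up width, as it could be returned by getWidthPoint()
        System.out.println(width + " pt = " + toInches(width) + " in = "
            + toCentimeters(width) + " cm");
        System.out.println("at scale 0.5f: " + scaledWidth(width, 0.5f) + " pt");
    }
}
```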
When the end of a line is reached:
Phrase
If you keep on adding Chunks to a document, you reach the end of the line at a certain moment. Of course, you haven't yet defined what iText should do when this happens. By default, iText performs a 'carriage return', but what about 'newline'? How much space should iText take? The space between two (base)lines in a text is called 'leading', and you can't define it in class Chunk. You need other classes such as Paragraph for this.
In the following example we use the lesser known (and a little bit superfluous) class Phrase. A Phrase is a series of chunks (that can have different styles) and with a certain leading as extra parameter. Unlike class Paragraph, it doesn't know anything about indentation. The default fontsize is 12, so we take a leading of 16. Take a look at the first page of the example: all Chunks are written on the same line, over and over again. On the second page, we added the same Chunks, but grouped in a Phrase with a leading of 16. When the end of the line is reached, a new line is started.
Example: java com.lowagie.examples.objects.chunk.EndOfLine
What happens when the end of the line is reached?: see EndOfLine.pdf
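The effect of the leading on the position of the lines can be sketched with a one-line formula: every new line starts one leading below the previous baseline. The plain Java sketch below is an illustration only. The class LeadingMath and the starting y-coordinate are hypothetical, and the situation without a Phrase is modelled as a leading of 0, which is an assumption made to mimic the first page of the example.

```java
/** Hypothetical helper, not part of iText. */
final class LeadingMath {

    /** Y-coordinate of the baseline of line number lineIndex (0 is the first line). */
    static float baselineY(float firstBaseline, float leading, int lineIndex) {
        return firstBaseline - lineIndex * leading;
    }

    public static void main(String[] args) {
        float firstBaseline = 800f; // made-up starting position in points
        for (int line = 0; line < 3; line++) {
            // Leading 0: every line lands on the same baseline and overwrites the previous one.
            float withoutLeading = baselineY(firstBaseline, 0f, line);
            // Leading 16 with font size 12: each line moves 16 points down.
            float withLeading = baselineY(firstBaseline, 16f, line);
            System.out.println("line " + line + ": " + withoutLeading + " vs " + withLeading);
        }
    }
}
```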
SplitCharacter
The default behaviour of iText is to try to add as many 'complete words' to a line as possible. iText will split sentences when a space or a hyphen is encountered. But what if the Chunk is longer than the page width? In that case, iText will split the Chunk just before the first character that doesn't fit the page. In some cases, you don't want this to happen. You can avoid this by implementing the interface SplitCharacter and adding an instance of your class to the Chunk with setSplitCharacter(SplitCharacter splitCharacter). You have to write your own isSplitCharacter, which may seem rather complex at first sight. This is the default implementation:
public boolean isSplitCharacter(int start, int current, int end, char[] cc, PdfChunk[] ck) {
   char c;
   if (ck == null)
       c = cc[current];
   else
       c = ck[Math.min(current, ck.length - 1)].getUnicodeEquivalent(cc[current]);
   if (c <= ' ' || c == '-') {
       return true;
   }
   if (c < 0x2e80)
       return false;
   return ((c >= 0x2e80 && c < 0xd7a0)
   || (c >= 0xf900 && c < 0xfb00)
   || (c >= 0xfe30 && c < 0xfe50)
   || (c >= 0xff61 && c < 0xffa0));
}
You can copy this method into your own SplitCharacter implementation and make this small change to have iText split words on dots and slashes too:
if (c <= ' ' || c == '-' || c == '.' || c == '/') {
    return true;
}
Example: java com.lowagie.examples.objects.chunk.SplitChar
Defining Split Characters: see SplitChar.pdf
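To try out the modified rule without creating a PDF, you can apply the same character test to an ordinary String. The plain Java sketch below does that. It is a simplification and not a drop-in SplitCharacter implementation: it leaves out the PdfChunk[] lookup of the real interface, and the class SplitRule and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone version of the split rule; not part of iText. */
final class SplitRule {

    /** Same test as above, extended with dots and slashes. */
    static boolean isSplit(char c) {
        if (c <= ' ' || c == '-' || c == '.' || c == '/') {
            return true;
        }
        if (c < 0x2e80) {
            return false;
        }
        // The same character ranges as in the default implementation.
        return (c >= 0x2e80 && c < 0xd7a0)
            || (c >= 0xf900 && c < 0xfb00)
            || (c >= 0xfe30 && c < 0xfe50)
            || (c >= 0xff61 && c < 0xffa0);
    }

    /** Returns the indexes of all characters at which the text may be split. */
    static List<Integer> splitPositions(String text) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (isSplit(text.charAt(i))) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "com.lowagie.text/pdf"; // made-up sample text
        System.out.println(text + " can be split at " + splitPositions(text));
    }
}
```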
Hyphenation
To conclude this section, it is also possible to let iText hyphenate your Chunks automatically. To achieve this, you have to create an instance of the class HyphenationAuto:
// Hyphenating a British text
Chunk ckEN = new Chunk(textEN);
HyphenationAuto autoEN = new HyphenationAuto("en", "GB", 2, 2);
ckEN.setHyphenation(autoEN);
// Hyphenating a Dutch text
Chunk ckNL = new Chunk(textNL);
HyphenationAuto autoNL = new HyphenationAuto("nl", null, 2, 2);
ckNL.setHyphenation(autoNL);
In the example, the same text (the opening of Charles Dickens's epic A Tale of Two Cities) is hyphenated once according to UK hyphenation rules and once according to US hyphenation rules. As you can see, there is a slight difference.
Important remark: you need the itext-hyph-xml.jar in your CLASSPATH if you want to use this functionality. In this jar, you'll find files like:
  • en_GB.xml
  • en_US.xml
  • fr.xml
  • nl.xml
These files contain all the hyphenation rules that are specific to a certain language (GB English, US English, French, Dutch, ...). As you can see, the first two parameters of the HyphenationAuto constructor correspond with parts of the filename; the third and fourth parameters specify how many characters may be 'orphaned' at the start and at the end of a word, respectively.
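The relation between the constructor parameters and the filenames listed above can be sketched as follows. This plain Java sketch is inferred from those filenames only; the way iText actually looks up the file may differ. The class HyphFileName and its method resourceName are hypothetical.

```java
/**
 * Hypothetical helper, not part of iText. It ASSUMES the naming pattern
 * language + "_" + country + ".xml", or language + ".xml" without a country.
 */
final class HyphFileName {

    /** Builds the expected filename for a language and an optional country. */
    static String resourceName(String language, String country) {
        if (country == null || country.isEmpty()) {
            return language + ".xml";
        }
        return language + "_" + country + ".xml";
    }

    public static void main(String[] args) {
        System.out.println(resourceName("en", "GB"));
        System.out.println(resourceName("nl", null));
    }
}
```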
Note that the XML files were not created by the iText developers. The XML files were created for Apache FOP. We have downloaded them and put them in a separate jar for your convenience. Some of them may be GPL or not usable for commercial purposes, so read the licenses and decide what to keep.
If you can't find the hyphenation pattern you are looking for, you can create your own as described at the FOP site. Put your xml file as a resource in the package com.lowagie.text.pdf.hyphenation.hyph or put the xml file in some directory and call Hyphenator.setHyphenDir().
To be continued...:
There's a lot more to tell about Chunk; see Anchors and Actions.


