NAME

Text::Layout - Pango style markup formatting

This module will cooperate with PDF::API2, PDF::Builder, Cairo, and Pango.

SYNOPSIS

Text::Layout provides methods for Pango style text formatting. Where possible the methods have identical names and (near) identical behaviour as their Pango counterparts.

See https://developer.gnome.org/pango/stable/pango-Layout-Objects.html.

The package uses Text::Layout::FontConfig (included) to organize fonts by description.

If module HarfBuzz::Shaper is installed, Text::Layout can use it for text shaping.

Example, using PDF::API2 integration:

    use PDF::API2;              # or PDF::Builder
    use Text::Layout;

    # Create a PDF document.
    my $pdf = PDF::API2->new;   # or PDF::Builder->new
    $pdf->default_page_size("a4");      # ISO A4

    # Set up page and get the text context.
    my $page = $pdf->page;
    my $ctx  = $page->text;

    # Create a markup instance.
    my $layout = Text::Layout->new($pdf);

    # This example uses PDF corefonts only.
    Text::Layout::FontConfig->register_corefonts;

    $layout->set_font_description(Text::Layout::FontConfig->from_string("times 40"));
    $layout->set_markup( q{The <i><span foreground="red">quick</span> <span size="20"><b>brown</b></span></i> fox} );

    # Center text.
    $layout->set_width(595);    # width of A4 page
    $layout->set_alignment("center");

    # Render it.
    $layout->show( 0, 600, $ctx );
    $pdf->save("out.pdf");

All PDF::API2 graphic and text methods can still be used; they do not interfere with the layout methods.

NOTES FOR PDF::API2/Builder USERS

Baselines

PDF::API2 and PDF::Builder render texts using the font baseline as origin.

This module typesets text in an area of possibly limited width and height. The origin is the top left of this area. Currently this area contains only a single line of text. This will change in the future when line breaking and paragraph formatting are implemented.

PDF::API2 and PDF::Builder coordinates have origin bottom left. This module produces information with respect to top left coordinates.

IMPORTANT NOTES FOR PANGO USERS

Coordinate system

Pango, layered upon Cairo, uses a coordinate system with its origin at the top left. For western text, the writing direction is increasing x and increasing y.

PDF::API2 uses the coordinate system defined in the PDF specification, with its origin at the bottom left. For western text, the writing direction is increasing x and decreasing y.
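The difference matters when positions are carried over from Pango or Cairo code. As a minimal sketch, assuming an A4 page height of 842 points and a hypothetical helper named pango_to_pdf_y (not part of this module), a y coordinate measured from the top can be flipped as follows. The x coordinate needs no conversion.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Layout: convert a y coordinate
# measured downwards from the top of the page (Pango/Cairo style) to a
# y coordinate measured upwards from the bottom (PDF style).
sub pango_to_pdf_y {
    my ( $y_from_top, $page_height ) = @_;
    return $page_height - $y_from_top;
}

my $page_height = 842;    # assumed: A4 height in PDF points
my $pdf_y = pango_to_pdf_y( 100, $page_height );
print "$pdf_y\n";         # prints 742
```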

Pango Conformance Mode

Text::Layout can operate in one of two modes: convenience mode (enabled by default), and Pango conformance mode. The desired mode can be selected by calling the method set_pango_mode().

Pango coordinates

Pango uses two kinds of coordinate units: Pango units and device units. Pango units are 1024 (PANGO_SCALE) times the device units.

Several methods have two variants, e.g. get_size() and get_pixel_size(). The pixel-variant uses device units while the other variant uses Pango units.

In convenience mode, this module assumes no scaling. All units are PDF device units (1/72 inch).

Pango device units

Device units are used for font rendering. Pango device units are 96dpi while PDF uses 72dpi.

In convenience mode this is ignored. E.g. a Times 20 font will be of equal size in the two systems.

In Pango conformance mode you would need to specify a font size of 15360 to get a 20pt font.
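The figure 15360 follows from the two scale factors described above: 1024 Pango units per device unit, and 72/96 to go from PDF points to Pango device units. The sketch below spells out that arithmetic. The helper names are hypothetical and not part of this module.

```perl
use strict;
use warnings;

my $PANGO_SCALE = 1024;    # Pango units per device unit
my $PANGO_DPI   = 96;      # Pango device units per inch
my $PDF_DPI     = 72;      # PDF device units (points) per inch

# Hypothetical helper: the size value to specify in Pango conformance
# mode to obtain a font of the given size in PDF points.
sub pango_size_for_points {
    my ($points) = @_;
    return $points * $PANGO_SCALE * $PDF_DPI / $PANGO_DPI;
}

# Hypothetical helper: the inverse conversion.
sub points_for_pango_size {
    my ($size) = @_;
    return $size / $PANGO_SCALE * $PANGO_DPI / $PDF_DPI;
}

print pango_size_for_points(20),    "\n";    # prints 15360
print points_for_pango_size(15360), "\n";    # prints 20

# The same 96/72 ratio applies to sizes given in points in markup.
print 45 * $PANGO_DPI / $PDF_DPI, "\n";      # prints 60
```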

SUPPORTED MARKUP

Text::Layout recognizes most of the Pango markup as provided by the Pango library version 1.50 or newer. However, not everything is supported.

Span attributes

font="DESC" font_desc="DESC"

Specifies a font to be used, e.g. Serif 20.

font_face="FAM" face="FAM"

Specifies a font family to be used.

font_family="FAM"

Same as font_face="FAM".

size=FNUM size=FNUMpt size=FNUM%

Font size in 1024ths of a point (conformance mode), or in points (e.g. '12.5pt'), or a percentage (e.g. '200%'), or one of the relative sizes 'smaller' or 'larger'.

Note that in Pango conformance mode, the actual font size is 96/72 times as large. So "45pt" gives a 60pt font.

style="STYLE" font_style="STYLE"

Specifies the font style, e.g. italic.

weight="WEIGHT" font_weight="WEIGHT"

Specifies the font weight, e.g. bold.

foreground="COLOR" fgcolor="COLOR" color="COLOR"

Specifies the foreground colour, e.g. black.

background="COLOR" bgcolor="COLOR"

Specifies the background colour, e.g. white.

underline="TYPE"

Enables underlining. TYPE must be none, single, or double.

underline_color="COLOR"

The colour to be used for underlining, if enabled.

overline="TYPE"

Enables overlining. TYPE must be none, single, or double.

overline_color="COLOR"

The colour to be used for overlining, if enabled.

strikethrough="ARG"

Enables or disables strikethrough. ARG must be true or 1 to enable, and false or 0 to disable.

strikethrough_color="COLOR"

The colour to be used for striking, if enabled.

rise=NUM

In convenience mode, lowers the text by NUM/1024 of the font size. May be negative to raise the text.

In Pango conformance mode, raises the text by NUM units from the baseline. May be negative to lower the text.

Note: In Pango conformance mode, rise does not accumulate. Use baseline_shift instead.

rise=NUMpt rise=NUM% rise=NUMem rise=NUMex

Raises the text from the baseline. May be negative to lower the text.

Units are points if postfixed by pt, and a percentage of the current font size if postfixed by %.

One em equals the current font size; one ex equals half the font size.

Note: This is not (yet?) part of the Pango markup standard.

baseline_shift=NUM baseline_shift=NUMpt baseline_shift=NUM%

Like rise, but accumulates.

Also supported but not part of the official Pango Markup specification.

href="URL"

Creates a clickable target that activates the URL.
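As an illustration of how the span attributes above combine, the following sketch builds a markup string from attribute pairs. The span helper is hypothetical and not part of this module. It assumes that the text and the attribute values contain no quotes or markup characters, so no escaping is done.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Layout: wrap text in a <span>
# element with the given attributes. Attribute names are sorted so the
# output is predictable.
sub span {
    my ( $text, %attr ) = @_;
    my $attrs = join "", map { qq{ $_="$attr{$_}"} } sort keys %attr;
    return "<span$attrs>$text</span>";
}

my $markup = "The "
  . span( "quick", foreground => "red", style => "italic" )
  . " brown fox";
print "$markup\n";
# prints: The <span foreground="red" style="italic">quick</span> brown fox
```

The resulting string can be passed to set_markup().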

Img (image) attributes

This is an extension to Pango markup.

Note that image markup elements may only occur as closed elements, i.e., <img/>.

id="ID"

Implementation dependent.

src="URI"

Source filename or URL for the image.

width="WIDTH" (short: w="WIDTH")

The width the image should be considered to occupy, regardless of its actual dimensions.

height="HEIGHT" (short: h="HEIGHT")

The height the image should be considered to occupy, regardless of its actual dimensions.

x="XDISP"

Horizontal displacement of the image relative to the <img/> element.

y="YDISP"

Vertical displacement of the image relative to the <img/> element.

border="THICK"

Provide a border around the element. THICK denotes its thickness.

Strut attributes

This is an extension to Pango markup.

A strut is a markup element that has bounding box dimensions but no ink dimensions.

Note that strut markup elements may only occur as closed elements, i.e., <strut/>.

label="LABEL"

An optional identifying label.

width="WIDTH" (short: w="WIDTH")

The width of the strut. Default value is zero.

Width may be expressed in points, em (font size) or ex (half of font size).

ascender="ASC" (short: a="ASC")

The ascender of the strut. Optional.

May be expressed in points, em (font size) or ex (half of font size).

descender="DESC" (short: d="DESC")

The descender of the strut. Optional.

May be expressed in points, em (font size) or ex (half of font size).
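The unit rules for strut dimensions can be sketched as a small conversion routine. The helper below is hypothetical and not part of this module. It assumes that a bare number, like a number followed by pt, means points.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve a strut dimension such as "12", "12pt",
# "1.5em" or "2ex" to points, given the current font size in points.
# Assumption: a bare number means points.
sub length_to_points {
    my ( $value, $font_size ) = @_;
    my ( $num, $unit ) = $value =~ /^([-+]?\d*\.?\d+)(pt|em|ex)?$/
      or die "Unrecognized length: $value\n";
    $unit //= "pt";
    $num += 0;                                  # force a plain number
    return $num              if $unit eq "pt";
    return $num * $font_size if $unit eq "em";  # em: the font size
    return $num * $font_size / 2;               # ex: half the font size
}

print length_to_points( "2em", 10 ), "\n";    # prints 20
print length_to_points( "3ex", 10 ), "\n";    # prints 15
```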

Shortcuts

Equivalent span attributes for shortcuts.

b

weight=bold

big

size=larger

emp

style=italic

i

style=italic

s

strikethrough=true

small

size=smaller

strong

weight=bold

sub

size=smaller rise=-30%

sup

size=smaller rise=30%

tt

face=monospace

u

underline=single
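To make the equivalence concrete, the sketch below rewrites shortcut elements into explicit span elements using the table above. The expand_shortcuts helper is hypothetical and handles only plain shortcut tags without attributes. It illustrates the mapping and is not a description of how the module parses markup.

```perl
use strict;
use warnings;

# Shortcut elements and their equivalent span attributes.
my %shortcut = (
    'b'      => q{weight="bold"},
    'big'    => q{size="larger"},
    'emp'    => q{style="italic"},
    'i'      => q{style="italic"},
    's'      => q{strikethrough="true"},
    'small'  => q{size="smaller"},
    'strong' => q{weight="bold"},
    'sub'    => q{size="smaller" rise="-30%"},
    'sup'    => q{size="smaller" rise="30%"},
    'tt'     => q{face="monospace"},
    'u'      => q{underline="single"},
);

# Hypothetical helper: replace shortcut elements by explicit spans.
sub expand_shortcuts {
    my ($markup) = @_;
    my $names = join "|", sort keys %shortcut;
    $markup =~ s{<($names)>}{<span $shortcut{$1}>}g;    # opening tags
    $markup =~ s{</(?:$names)>}{</span>}g;              # closing tags
    return $markup;
}

print expand_shortcuts("The <b>quick</b> <i>brown</i> fox"), "\n";
# prints: The <span weight="bold">quick</span> <span style="italic">brown</span> fox
```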

METHODS

new( $pdf )

Creates a new layout instance for PDF::API2. This is for convenience. It is equivalent to

    use Text::Layout::PDFAPI2;
    $layout = Text::Layout::PDFAPI2->new($pdf);

For other implementations, only the explicit form shown above can be used.

The argument is the context for text formatting. In the case of PDF::API2 this will be a PDF::API2 object.

copy

Copies (clones) a layout instance.

The content is copied deeply, the context and fonts are copied by reference.

get_context

Gets the context of this layout.

context_changed

Not supported.

get_serial

Not supported.

set_text( $text )

Puts a string in this layout instance. No markup is processed.

Note that if you have used set_markup() on this layout before, you may want to call set_attributes() to clear the attributes set on the layout from the markup as this function does not clear all attributes.

get_text

Gets the content of this layout instance as a single string. Markup is ignored.

Returns undef if no text has been set.

get_character_count

Returns the number of characters in the text of this layout.

Essentially the same as the length of get_text().

Returns undef if no text has been set.

set_markup( $string )

Puts a string in this layout instance.

The string can contain Pango-compatible markup. See https://developer.gnome.org/pygtk/stable/pango-markup-language.html.

Implementation note: Although all markup is parsed, not all is implemented.

set_markup_with_accel

Not supported.

set_attributes
get_attributes

Not yet implemented.

set_font_description( $description )

Sets the default font description for the layout. If no font description is set on the layout, the font description from the layout's context is used.

$description is a Text::Layout::FontConfig object.

get_font_description

Gets the font description for the layout.

Returns undef if no font has been set yet.

set_width( $width )

Sets the width to which the lines of the layout should be aligned, wrapped, or ellipsized. A value of zero or less means unlimited width. The width is in Pango units.

Implementation note: Only alignment is implemented.

get_width

Gets the width in Pango units for this instance, or zero if unlimited.

set_height( $height )

Sets the height in Pango units for this instance.

Implementation note: Height restrictions are not yet implemented.

get_height

Gets the height in Pango units for this instance, or zero if no height restrictions apply.

set_wrap( $mode )

Sets the wrap mode; the wrap mode only has effect if a width is set on the layout with set_width(). To turn off wrapping, set the width to zero or less.

Not yet implemented.

get_wrap

Returns the current wrap mode.

Not yet implemented.

is_wrapped

Queries whether the layout had to wrap any paragraphs.

set_ellipsize( $mode )

Sets the type of ellipsization being performed for the layout.

Not yet implemented.

get_ellipsize

Gets the type of ellipsization being performed for the layout.

is_ellipsized

Queries whether the layout had to ellipsize any paragraphs.

Not yet implemented.

set_indent( $value )

Sets the width in Pango units to indent for each paragraph.

A negative value of indent will produce a hanging indentation. That is, the first line will have the full width, and subsequent lines will be indented by the absolute value of indent.

The indent setting is ignored if layout alignment is set to center.

Not yet implemented.
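Since set_indent() is not yet implemented, the following is only a sketch of the rule described above. The line_indent helper is hypothetical. It assumes, as in Pango, that a positive value indents the first line of a paragraph only.

```perl
use strict;
use warnings;

# Hypothetical helper: left indent for a given line of a paragraph
# (line numbers start at 0), following the description above.
sub line_indent {
    my ( $indent, $line_no, $alignment ) = @_;
    return 0 if $alignment eq "center";        # indent is ignored
    if ( $indent >= 0 ) {
        return $line_no == 0 ? $indent : 0;    # assumed: first line only
    }
    return $line_no == 0 ? 0 : -$indent;       # hanging indentation
}

print join( " ", map { line_indent( -20, $_, "left" ) } 0 .. 2 ), "\n";
# prints: 0 20 20
```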

get_indent

Gets the current indent value in Pango units.

set_spacing( $value )

Sets the amount of spacing, in Pango units, between lines of the layout.

When placing lines with spacing, things are arranged so that

    line2.top = line1.bottom + spacing

Note: By default the line height (as determined by the font) for placing lines is used. The spacing set with this function is only taken into account when the line-height factor is set to zero with set_line_spacing().

Not yet implemented.

get_spacing

Gets the current amount of spacing, in Pango units.

set_line_spacing( $factor )

Sets a factor for line spacing. Typical values are: 0, 1, 1.5, 2. The default value is 0.

If factor is non-zero, lines are placed so that

    baseline2 = baseline1 + factor * height2

where height2 is the line height of the second line (as determined by the font(s)). In this case, the spacing set with set_spacing() is ignored.

If factor is zero, spacing is applied as before.

Not yet implemented.
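Because line spacing is not yet implemented, the two placement rules can only be illustrated as arithmetic. The helper names below are hypothetical. Positions are measured downwards from the top of the layout, in whatever unit the layout uses.

```perl
use strict;
use warnings;

# Top of the next line when plain spacing is used (factor is zero):
#   line2.top = line1.bottom + spacing
sub next_line_top {
    my ( $line1_bottom, $spacing ) = @_;
    return $line1_bottom + $spacing;
}

# Baseline of the next line when a non-zero factor is used:
#   baseline2 = baseline1 + factor * height2
sub next_baseline {
    my ( $baseline1, $factor, $height2 ) = @_;
    return $baseline1 + $factor * $height2;
}

print next_line_top( 24, 2 ), "\n";          # prints 26
print next_baseline( 18, 1.5, 24 ), "\n";    # prints 54
```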

get_line_spacing

Gets the current line spacing factor.

set_justify( $state )

Sets whether each complete line should be stretched to fill the entire width of the layout. This stretching is typically done by adding whitespace.

Not yet implemented.

get_justify

Gets whether each complete line should be stretched to fill the entire width of the layout.

set_auto_dir( $state )
get_auto_dir

Not supported.

set_alignment( $align )

Sets the alignment for the layout: how partial lines are positioned within the horizontal space available.

$align must be one of left, center, or right.

get_alignment

Gets the alignment for this instance.

set_tabs( $stops )
get_tabs

Not yet implemented.

set_single_paragraph_mode( $state )
get_single_paragraph_mode

Not yet implemented.

get_unknown_glyphs_count

Counts the number of unknown glyphs in the layout.

Not yet implemented.

get_log_attrs
get_log_attrs_readonly

Not implemented.

index_to_pos( $index )

Converts from a character index within the layout to the onscreen position corresponding to the grapheme at that index, which is represented as a rectangle.

Not yet implemented.

index_to_line_x( $index )

Converts from a character index within the layout to line and X position.

Not yet implemented.

xy_to_index( $x, $y )

Converts from $x,$y position to a character index within the layout.

Not yet implemented.

get_extents

Computes the logical and ink extents of the layout.

Logical extents are usually what you want for positioning things.

Return value in scalar context is a hash ref with 4 values: x, y, width, and height describing the logical extents of the layout. In list context an array of two hashrefs is returned. The first reflects the ink extents, the second the logical extents.

In the extents, x will reflect the offset when text is centered or right aligned. It will be zero for left aligned text. For right aligned text, it will be the width of the layout.

y will reflect the offset when text is centered vertically or bottom aligned. It will be zero for top aligned text.

See also get_pixel_extents below.

Implementation note: If the PDF::API2 support layer cannot calculate ink, this function returns two identical extents.
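The context-dependent return value is easy to get wrong, so here is a small sketch of both calling styles. To keep it self-contained it uses a stand-in class, My::StubLayout, with made-up numbers. With a real layout object only the two get_extents calls would be needed.

```perl
use strict;
use warnings;

# Stand-in for a layout object, for illustration only. It mimics the
# context-dependent return value described above with made-up numbers.
package My::StubLayout {
    sub new { return bless {}, shift }

    sub get_extents {
        my $ink     = { x => 2, y => 3, width => 96,  height => 14 };
        my $logical = { x => 0, y => 0, width => 100, height => 20 };
        return wantarray ? ( $ink, $logical ) : $logical;
    }
}

my $layout = My::StubLayout->new;

# Scalar context: a hashref with the logical extents only.
my $logical = $layout->get_extents;
print "logical width: $logical->{width}\n";    # prints logical width: 100

# List context: ink extents first, then logical extents.
my ( $ink, $log ) = $layout->get_extents;
print "ink width: $ink->{width}\n";            # prints ink width: 96
```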

get_pixel_extents

Same as get_extents, but using device units.

The returned values can be used directly to draw the box around the text, assuming $pdf_text and $pdf_gfx are the PDF text and graphics contexts:

    $layout->render( $x, $y, $pdf_text );
    $box = $layout->get_pixel_extents;
    $pdf_gfx->translate( $x, $y );
    $pdf_gfx->rect( @$box{ qw( x y width height ) } );
    $pdf_gfx->stroke;

get_size

Returns the width and height of this layout.

In list context, width and height are returned as a two-element list. In scalar context a hashref with keys width and height is returned.

get_pixel_size

Same as get_size(), but using device units.

get_iter

Returns the layout for the first line.

Implementation note: This is a dummy; it returns the layout itself. It is provided so you can write $layout->get_iter()->get_baseline() to be compatible with the official Pango API.

get_baseline

Gets the Y position of the baseline of the first line in this layout.

Implementation note: Position is relative to top left, so due to the PDF coordinate system this is a negative number.

Note: The Python API only supports this method on iteration objects. See get_iter().

METHODS NOT IMPLEMENTED

get_line_count
get_line( $index )
get_line_readonly( $index )
get_lines
get_lines_readonly
line_get_extents
line_get_pixel_extents
line_index_to_x
line_x_to_index
line_get_x_ranges
line_get_height
get_cursor_pos
move_cursor_visually

ADDITIONAL METHODS

The following methods are not part of the Pango API.

set_font_size( $size )

Sets the size for the current font.

get_font_size

Returns the size of the current font.

get_bbox

Returns the bounding box of the text, w.r.t. the (top-left) origin.

bb = ( bl, x, y, width, height )

bb[0] = baseline distance from the top.

bb[1] = displacement from the left, nonzero for centered and right aligned text

bb[2] = displacement from the top, usually zero

bb[3] = advancewidth

bb[4] = height

Note that the bounding box will in general be equal to the font bounding box except for the advancewidth.

NOTE: Some fonts do not include accents on capital letters in the ascender.

If an argument is supplied and true, get_bbox() will attempt to calculate the ink extents as well, and add these as another set of 4 elements.

In list context returns the array of values, in scalar context an array ref.
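Positional arrays are easy to misread, so it can help to give the five documented elements names. The bbox_to_hash helper below is hypothetical, and the sample numbers are made up. With a real layout the call would be bbox_to_hash( $layout->get_bbox ).

```perl
use strict;
use warnings;

# Hypothetical helper: give names to the five documented elements of a
# bounding box as returned by get_bbox() in list context.
sub bbox_to_hash {
    my @bb = @_;
    my %h;
    @h{qw( baseline x y width height )} = @bb[ 0 .. 4 ];
    return \%h;
}

# Made-up sample values standing in for $layout->get_bbox.
my $bb = bbox_to_hash( 30, 0, 0, 250, 40 );
print "$bb->{width} x $bb->{height}\n";    # prints 250 x 40
```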

get_struts

Returns the list of the struts in the layout, if any.

Each element of the list is a hash, with key/value pairs for all the attributes of the corresponding <strut/> markup item.

Additionally, in each element there's a key _x that contains the horizontal displacement of the strut relative to the start of the layout.

In list context returns the array of values, in scalar context an array ref.

align_struts( $other )

Aligns the fragments in both layouts to each other, based on the struts.

This will adjust the widths of the struts of both participants.

spread_struts

Evenly distributes the available space over the struts, if any.

show( $x, $y, $text )

Transfers the content of this layout instance to the designated graphics environment.

Use this instead of Pango::Cairo::show_layout().

For PDF::API2, $text must be an object created by the $page->text method.

set_pango_mode( $enable )

Enable/disable Pango conformance mode. See "Pango Conformance Mode".

Returns the internal Pango scaling factor if enabled.

get_pango

See "Pango Conformance Mode".

Returns the internal Pango scaling factor if conformance mode is enabled, otherwise it returns 1 (one).

SEE ALSO

Description of the Pango Markup Language: https://docs.gtk.org/Pango/pango_markup.html#pango-markup.

Documentation of the Pango Layout class: https://docs.gtk.org/Pango/class.Layout.html.

PDF::API2, PDF::Builder, HarfBuzz::Shaper, Font::TTF.

AUTHOR

Johan Vromans, <JV at CPAN dot org>

SUPPORT

Development of this module takes place on GitHub: https://github.com/sciurius/perl-Text-Layout.

You can find documentation for this module with the perldoc command.

  perldoc Text::Layout

Please report any bugs or feature requests using the issue tracker on GitHub.

LICENSE

Copyright (C) 2019,2024 Johan Vromans

This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.