TITLE

Synopsis 26 - Documentation

AUTHOR

Damian Conway <damian@conway.org>

VERSION

Maintainer:

Damian Conway

Date:

9 Apr 2005

Last Modified:

14 Feb 2007

Perldoc

Perldoc is an easy-to-use markup language with a simple, consistent underlying document object model. Perldoc can be used for writing language documentation, for documenting programs and modules, as well as for other types of document composition.

Perldoc allows for multiple syntactic dialects, all of which map onto the same set of standard document objects. The standard dialect is named "Pod".

The Pod Dialect

Pod is an evolution of Perl 5's Plain Ol' Documentation (POD) markup. Compared to Perl 5 POD, Perldoc's Pod dialect is much more uniform, somewhat more compact, and considerably more expressive. The Pod dialect also differs in that it is a purely descriptive mark-up notation, with no presentational components.

General syntactic structure

Pod documents are specified using directives, which are used to declare configuration information and to delimit blocks of textual content. Every directive starts with an equals sign (=) in the first column.

The content of a document is specified within one or more blocks. Every Pod block may be declared in any of three equivalent forms: delimited style, paragraph style, or abbreviated style.

Anything in a document that is neither a Pod directive nor contained within a Pod block is treated as "ambient" material. Typically this would be the source code of the program that the Pod is documenting. Pod parsers still parse this text into the internal representation of the file (representing it as a Perldoc::Block::Ambient block), but renderers will usually ignore such blocks.

In Perl 5's POD format, once a POD directive is encountered, the parser treats everything that follows as POD until it reaches an explicit =cut directive, at which point it switches back to ambient text. The Perl 6 Pod format is different: a Pod parser always reverts to "ambient" at the end of each Pod directive or block. To keep the parser in Pod mode, you must enclose the desired Pod region in a pod block:

=begin pod

=head1 A heading

This is Pod too. Specifically, this is a simple C<para> block

    $this = pod('also');  # Specifically, a code block

=end pod

Alternatively, you can indicate that an entire file contains only Pod by giving it a .pod suffix.

Delimited blocks

Delimited blocks are bounded by =begin and =end markers, both of which are followed by a valid identifier[1], which is the typename of the block. Typenames that are entirely lowercase (for example: =begin head1) or entirely uppercase (for example: =begin SYNOPSIS) are reserved.
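The reservation rule for typenames can be sketched in a few lines. The following is a minimal illustration in Python, not part of any Perldoc API; the function name is hypothetical, and it assumes that digits in a typename (as in head1) do not affect whether the name counts as "entirely lowercase".

```python
def is_reserved_typename(name):
    """Return True if all cased characters in `name` are lowercase,
    or all are uppercase. Digits and underscores are ignored."""
    return name.islower() or name.isupper()


print(is_reserved_typename("head1"))     # True
print(is_reserved_typename("SYNOPSIS"))  # True
print(is_reserved_typename("Xhtml"))     # False
```

Mixed-case names such as Name or Contact are therefore the ones left available for user-defined blocks.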

After the typename, the rest of the =begin marker line is treated as configuration information for the block. This information is used in different ways by different types of blocks, and is specified using Perl6ish :key<value> or key=>value pairs (which must, of course, be constants since Perldoc is a specification language, not a programming language). See Synopsis 2 for a summary of the Perl 6 pair notation.

The configuration section may be extended over subsequent lines by starting those lines with an = in the first column followed by a whitespace character.

The lines following the opening delimiter and configuration are the data or contents of the block, which continue until the block's =end marker line. The general syntax is:

=begin BLOCK_TYPE  OPTIONAL CONFIG INFO
=                  OPTIONAL EXTRA CONFIG INFO
BLOCK CONTENTS
=end BLOCK_TYPE

For example:

=begin table  :caption<Table of Contents>
    Constants           1
    Variables           10
    Subroutines         33
    Everything else     57
=end table
=begin Name  :required
=            :width(50)
The applicant's full name
=end Name
=begin Contact  :optional
The applicant's contact details
=end Contact

Note that no blank lines are required around the directives; blank lines within the contents are always treated as part of the contents. This is a universal feature of Pod.
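To make the delimited syntax concrete, here is a rough Python sketch of how a parser might read one delimited block. It is an illustration only: the function name and the returned tuple are assumptions, the configuration is kept as raw text rather than parsed into pairs, and only nesting of blocks with the same typename is tracked.

```python
import re

BEGIN = re.compile(r"^=begin\s+(\S+)\s*(.*)$")


def parse_delimited(lines):
    """Parse one delimited block that starts at lines[0].

    Returns (typename, config, contents): config is the raw
    configuration text, contents is the list of content lines.
    """
    m = BEGIN.match(lines[0])
    if m is None:
        raise ValueError("expected a =begin marker")
    typename = m.group(1)
    config = [m.group(2).strip()]
    i = 1
    # Configuration continues on lines starting with '=' plus whitespace.
    while i < len(lines) and re.match(r"^=\s", lines[i]):
        config.append(lines[i][1:].strip())
        i += 1
    end = re.compile(r"^=end\s+" + re.escape(typename) + r"\s*$")
    contents = []
    depth = 1
    while i < len(lines):
        line = lines[i]
        inner = BEGIN.match(line)
        if inner and inner.group(1) == typename:
            depth += 1          # a nested block of the same type
        elif end.match(line):
            depth -= 1
            if depth == 0:
                return typename, " ".join(c for c in config if c), contents
        contents.append(line)   # blank lines are kept as contents too
        i += 1
    raise ValueError("missing =end " + typename)


block = [
    "=begin Name  :required",
    "=            :width(50)",
    "The applicant's full name",
    "=end Name",
]
print(parse_delimited(block))
```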

Note also that in the following specifications, a "blank line" is a line that is either empty or that contains only whitespace characters. That is, a blank line matches the Perl 6 pattern: /^^ \h* $$/. Pod uses blank lines, rather than empty lines, as delimiters, in keeping with the principle of least surprise.
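As a small illustration of the blank-line rule, the check might look like this in Python. This is a sketch only: the function name is hypothetical, and [ \t] is used as an approximation of Perl 6's \h (horizontal whitespace), which covers more Unicode characters.

```python
import re

# Approximation of the Perl 6 pattern /^^ \h* $$/ for a single line.
BLANK = re.compile(r"[ \t]*")


def is_blank(line):
    """True if the line is empty or holds only spaces and tabs."""
    return BLANK.fullmatch(line.rstrip("\n")) is not None


print(is_blank(""))        # True
print(is_blank("  \t "))   # True
print(is_blank("  text"))  # False
```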

Paragraph blocks

Paragraph blocks are introduced by a =for marker and terminated by the next Pod directive or the first blank line (which is not considered to be part of the block's contents). The =for marker is followed by the name of the block and optional configuration information. The general syntax is:

=for BLOCK_TYPE  OPTIONAL CONFIG INFO
=                OPTIONAL EXTRA CONFIG INFO
BLOCK DATA

For example:

=for table  :caption<Table of Contents>
    Constants           1
    Variables           10
    Subroutines         33
    Everything else     57
=for Name  :required
=          :width(50)
The applicant's full name
=for Contact  :optional   
The applicant's contact details

Abbreviated blocks

Abbreviated blocks are introduced by an '=' sign in the first column, which is followed immediately by the typename of the block. The rest of the line is treated as block data, rather than as configuration. The content terminates at the next Pod directive or the first blank line (which is not part of the block data). The general syntax is:

=BLOCK_TYPE  BLOCK DATA
MORE BLOCK DATA

For example:

=table
    Constants           1
    Variables           10
    Subroutines         33
    Everything else     57
=Name     The applicant's full name
=Contact  The applicant's contact details

Note that abbreviated blocks cannot specify configuration information. If configuration is required, use a =for or =begin/=end instead.
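The paragraph and abbreviated forms differ only in where the configuration and the first line of data come from. The Python sketch below shows one way to read either form into the same (typename, config, data) triple. The function name and return shape are assumptions made for illustration, and delimited blocks are deliberately not handled here.

```python
import re

FOR = re.compile(r"^=for\s+(\S+)\s*(.*)$")
ABBREV = re.compile(r"^=([A-Za-z_]\w*)\s*(.*)$")


def parse_short_block(lines):
    """Parse a paragraph (=for) or abbreviated (=TYPE) block at lines[0]."""
    first, rest = lines[0], list(lines[1:])
    data = []
    m = FOR.match(first)
    if m:
        typename, config = m.group(1), [m.group(2).strip()]
        # Extra configuration lines start with '=' plus whitespace.
        while rest and re.match(r"^=\s", rest[0]):
            config.append(rest.pop(0)[1:].strip())
    else:
        m = ABBREV.match(first)
        if m is None:
            raise ValueError("not a Pod block marker")
        typename, config = m.group(1), []
        if m.group(2):
            data.append(m.group(2))  # rest of the marker line is data
    for line in rest:
        # The block ends at the first blank line or the next directive.
        if line.strip() == "" or line.startswith("="):
            break
        data.append(line)
    return typename, " ".join(c for c in config if c), data


print(parse_short_block(["=for head1", "Top Level Heading", ""]))
print(parse_short_block(["=head1 Top Level Heading", ""]))
```

Both calls yield the same triple, which is the sense in which the forms are interchangeable.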

Block equivalence

The three block specifications (delimited, paragraph, and abbreviated) are treated identically by the underlying documentation model, so you can use whichever form is most convenient for a particular documentation task. In the descriptions that follow, the abbreviated form will generally be used, but should be read as standing for all three forms equally.

For example, although Headings shows only:

=head1 Top Level Heading

this automatically implies that you could also write that block as:

=for head1
Top Level Heading

or:

=begin head1
Top Level Heading
=end head1

Standard configuration options

Pod predefines a small number of standard configuration options that can be applied uniformly to built-in block types. These include:

:nested

This option specifies that the block is to be nested within its current context. For example, nesting might be applied to block quotes, to textual examples, or to commentaries. In addition the =code, =item, =input, and =output blocks all have implicit nesting.

Nesting of blocks is usually rendered by adding extra indentation to the block contents, but may also be indicated in other ways: by boxing the contents, by changing the font or size of the nested text, or even by folding the text (so long as a visible placeholder is provided).

Occasionally it is desirable to nest content by more than one level:

=begin para :nested
=begin para   :nested
=begin para     :nested
"We're going deep, deep, I<deep> undercover!"
=end para
=end para
=end para

This can be simplified by giving the :nested option a positive integer value:

=begin para :nested(3)
"We're going deep, deep, I<deep> undercover!"
=end para

You can also give the option a value of zero, to defeat any implicit nesting that might normally be applied to a paragraph. For example, to specify a block of code that should appear without its usual nesting:

=comment Don't nest this code block in the usual way...
=begin code :nested(0)
         1         2         3         4         5         6
123456789012345678901234567890123456789012345678901234567890
|------|-----------------------|---------------------------|
  line        instruction                comments
 number           code
=end code

Note that :!nested could also be used for this purpose:

=begin code :!nested

:numbered

This option specifies that the block is to be numbered. The most common use of this option is to create numbered headings and ordered lists, but it can be applied to any block.

It is up to individual renderers to decide how to display any numbering associated with other types of blocks.

:term

This option specifies that a list item is the definition of a term. See Definition lists.

:formatted

This option specifies that the contents of the block should be treated as if they had one or more formatting codes placed around them.

For example, instead of:

=for comment
    The next para is both important and fundamental,
    so doubly emphasize it...
=begin para
B<I<
Warning: Do not immerse in water. Do not expose to bright light.
Do not feed after midnight.
>>
=end para

you can just write:

=begin para :formatted<B I>
Warning: Do not immerse in water. Do not expose to bright light.
Do not feed after midnight.
=end para

The internal representations of these two versions are exactly the same, except that the second one retains the :formatted option information as part of the resulting block object.

Like all formatting codes, codes applied via a :formatted option are inherently cumulative. For example, if the block itself is already inside a formatting code, that formatting code will still apply, in addition to the extra "basis" and "important" formatting specified by :formatted<B I>.
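The equivalence between :formatted<B I> and explicitly wrapping the contents in B<I<...>> can be sketched as a simple text transformation. This Python snippet is illustrative only: the function name is hypothetical, and it assumes the wrapped text contains no angle brackets that would clash with single-angle delimiters.

```python
def apply_formatted(codes, text):
    """Wrap `text` in each code listed in `codes`, outermost first,
    so "B I" produces B<I<...>>."""
    for code in reversed(codes.split()):
        text = code + "<" + text + ">"
    return text


print(apply_formatted("B I", "Do not feed after midnight."))
# B<I<Do not feed after midnight.>>
```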

:like

This option specifies that a block or config has the same formatting properties as the type named by its value. This is useful for creating related configurations. For example:

=config head2  :like<head1> :formatted<I>

:allow

This option expects a list of formatting codes that are to be recognized within any V<> codes that appear in (or are implicitly applied to) the current block. The option is most often used on =code blocks to allow mark-up within those otherwise verbatim blocks, though it can be used in any block that contains verbatim text. See Formatting within code blocks.

Blocks

Pod offers notations for specifying a range of standard block types...

Headings

Pod provides an unlimited number of levels of heading, specified by the =headN block marker. For example:

=head1 A Top Level Heading
=head2 A Second Level Heading
=head3 A third level heading
=head86 A "Missed it by I<that> much!" heading

While Pod parsers are required to recognize and distinguish all levels of heading, Pod renderers are only required to provide distinct renderings of the first four levels of heading (though they may, of course, provide more than that). Headings at levels without distinct renderings would typically be rendered like the lowest distinctly rendered level.

Numbered headings

You can specify that a heading is numbered using the :numbered option. For example:

=for head1 :numbered
The Problem
=for head1 :numbered
The Solution
=for head2 :numbered
Analysis
=for head3 
Overview
=for head3
Details
=for head2 :numbered
Design
=for head1 :numbered
The Implementation

which would produce:

1. The Problem

2. The Solution

2.1. Analysis

Overview

Details

2.2. Design

3. The Implementation

It is usually better to preset a numbering scheme for each heading level, in a series of configuration blocks:

=config head1 :numbered
=config head2 :numbered
=config head3 :!numbered
=head1 The Problem
=head1 The Solution
=head2   Analysis
=head3     Overview
=head3     Details
=head2   Design
=head1 The Implementation

Alternatively, as a short-hand, if the first whitespace-delimited word in a heading consists of a single literal # character, the # is removed and the heading is treated as if it had a :numbered option:

=head1 # The Problem
=head1 # The Solution
=head2   # Analysis
=head3       Overview
=head3       Details
=head2   # Design
=head1 # The Implementation

Note that, even though renderers are not required to distinctly render more than the first four levels of heading, they are required to correctly honour arbitrarily nested numberings. That is:

=head6 # The Rescue of the Kobayashi Maru

should produce something like:

2.3.8.6.1.9. The Rescue of the Kobayashi Maru
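One way a renderer might compute nested heading numbers is to keep a counter per level. The Python sketch below is a hedged illustration: the class and method names are invented, and it assumes that only numbered headings advance the counters (the specification shows unnumbered headings interleaved with numbered ones but does not spell this out).

```python
class HeadingNumberer:
    """Track one counter per heading level and build dotted numbers."""

    def __init__(self):
        self.counters = []  # counters[i] is the current number at level i + 1

    def render(self, level, text, numbered=False):
        parts = text.split(None, 1)
        # A leading '#' word is shorthand for the :numbered option.
        if parts and parts[0] == "#":
            numbered = True
            text = parts[1] if len(parts) > 1 else ""
        text = text.strip()
        if not numbered:
            return text
        while len(self.counters) < level:
            self.counters.append(0)
        self.counters[level - 1] += 1
        del self.counters[level:]  # deeper levels restart under a new parent
        prefix = ".".join(str(n) for n in self.counters) + "."
        return prefix + " " + text


numberer = HeadingNumberer()
headings = [
    (1, "# The Problem"),
    (1, "# The Solution"),
    (2, "# Analysis"),
    (3, "Overview"),
    (3, "Details"),
    (2, "# Design"),
    (1, "# The Implementation"),
]
rendered = [numberer.render(level, text) for level, text in headings]
for line in rendered:
    print(line)
```

A heading that skips a level (for example a head3 directly under a head1) would show a 0 for the missing level in this sketch; a real renderer would need its own policy for that case.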

Ordinary paragraph blocks

Ordinary paragraph blocks consist of text that is to be formatted into a document at the current level of nesting, with whitespace squeezed, lines filled, and any special inline mark-up applied.

Ordinary paragraphs consist of one or more consecutive lines of text, each of which starts with a non-whitespace character at column 1. The paragraph is terminated by the first blank line or block directive. For example:

=head1 This is a heading block
This is an ordinary paragraph.
Its text  will   be     squeezed     and
short lines filled. It is terminated by
the first blank line.
This is another ordinary paragraph.
Its     text    will  also be squeezed and
short lines filled. It is terminated by
the trailing directive on the next line.
=head2 This is another heading block

Within a =pod, =item, =nested, or =END block, ordinary paragraphs do not require an explicit marker or delimiters, but there is also an explicit para marker (which may be used anywhere):

=para
This is an ordinary paragraph.
Its text  will   be     squeezed     and
short lines filled.

and likewise the longer =for and =begin/=end forms. For example:

=begin para
    This is an ordinary paragraph.
    Its text  will   be     squeezed     and
    short lines filled.
=end para

As the previous example implies, when any form of explicit para block is used, any whitespace at the start of each line is removed, so the paragraph text no longer has to begin at column 1.

Code blocks

Code blocks are used to specify pre-formatted text (typically source code), which should be rendered without rejustification, without whitespace-squeezing, and without recognizing any inline formatting codes. Code blocks also have an implicit nesting associated with them. Typically these blocks are used to show examples of code, mark-up, or other textual specifications, and are rendered using a fixed-width font.

A code block may be implicitly specified as one or more lines of text, each of which starts with a whitespace character. The block is terminated by a blank line. For example:

This ordinary paragraph introduces a code block:

    $this = 1 * code('block');
    $which.is_specified(:by<indenting>);

Implicit code blocks may only be used within =pod, =item, =nested, or =END blocks.

There is also an explicit =code block (which can be specified within any other block type, not just =pod, =item, etc.):

The C<loud_update()> subroutine adds feedback:

=begin code

sub loud_update ($who, $status) {
    say "$who -> $status";

    silent_update($who, $status);
}

=end code

As the previous example demonstrates, within an explicit =code block the code can start at the first column. Furthermore, lines that start with whitespace characters have that whitespace preserved exactly (in addition to the implicit nesting of the code). Explicit =code blocks may also contain empty lines.

Formatting within code blocks

Although =code blocks automatically disregard all formatting codes, occasionally you may still need to specify some formatting within a code block. For example, you may wish to emphasize a particular keyword in an example (using a B<> code). Or you may want to indicate that part of the example is metasyntactic (using the R<> code). Or you might need to insert a non-ASCII character (using the E<> code).

You can specify a list of formatting codes that should still be recognized within a code block using the :allow option. The value of the :allow option must be a list of the (single-letter) names of one or more formatting codes. Those codes will then remain active inside the code block. For example:

=begin code :allow< B R >
sub demo {
    B<say> 'Hello R<name>';
}
=end code

would be rendered:

sub demo {
    say 'Hello name';
}
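The effect of :allow can be approximated by a filter that recognizes only the listed codes and leaves everything else as literal text. The following Python sketch is a simplification under stated assumptions: it handles only single-angle, non-nested codes, the function name is hypothetical, and it strips the markup rather than applying real formatting (a renderer would instead embolden or italicize the contents).

```python
import re

CODE = re.compile(r"([A-Z])<([^<>]*)>")


def strip_allowed(line, allow):
    """Replace each allowed code with its contents; keep other codes verbatim."""
    def replace(match):
        if match.group(1) in allow:
            return match.group(2)
        return match.group(0)
    return CODE.sub(replace, line)


print(strip_allowed("    B<say> 'Hello R<name>';", {"B", "R"}))
#     say 'Hello name';
print(strip_allowed("    B<say> 'Hello R<name>';", {"B"}))
#     say 'Hello R<name>';
```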

Although code blocks are verbatim by default, it can still occasionally be useful to explicitly :allow the verbatim formatting code (V<>). That's because, although the contents of an explicit =code block are allowed to start in column 1, they are not allowed to start with an equals sign in that first column[2]. So, if an = is needed in column 1, it must be declared verbatim:

=begin code :allow<V>

V<=> in the first column is always a Perldoc directive

=end code

I/O blocks

Pod also provides blocks for specifying the input and output of programs.

The =input block is used to specify pre-formatted keyboard input, which should be rendered without rejustification or squeezing of whitespace.

The =output block is used to specify pre-formatted terminal or file output which should also be rendered without rejustification or whitespace-squeezing.

Note that, like =code blocks, both =input and =output blocks have an implicit level of nesting. They are also like =code blocks in that they are typically rendered in a fixed-width font, though ideally all three blocks would be rendered in distinct font/weight combinations (for example: regular serifed for code, bold sans-serif for input, and regular sans-serif for output).

Unlike =code blocks, both =input and =output blocks honour any nested formatting codes. This is particularly useful since a sample of input will often include prompts (which are, of course, output). Likewise a sample of output may contain the occasional interactive component. Pod provides special formatting codes (K<> and T<>) to indicate embedded input or output, so you can use the block type that indicates the overall purpose of the sample (i.e. is it demonstrating an input operation or an output sequence?) and then use the "contrasting" formatting code within the block.

For example, to include a small amount of input in a sample of output:

=begin output
    Name:    Baracus, B.A.
    Rank:    Sgt
    Serial:  1PTDF007

    Do you want additional personnel details? K<y>

    Height:  180cm/5'11"
    Weight:  104kg/230lb
    Age:     49

    Print? K<n>
=end output

Lists

Lists in Pod are specified as a series of contiguous =item blocks. No special "container" directives or other delimiters are required to enclose the entire list. For example:

The seven suspects are:
=item  Happy
=item  Dopey
=item  Sleepy
=item  Bashful
=item  Sneezy
=item  Grumpy
=item  Keyser Soze

List items have one implicit level of nesting:

The seven suspects are:

• Happy

• Dopey

• Sleepy

• Bashful

• Sneezy

• Grumpy

• Keyser Soze

Lists may be multi-level, with items at each level specified using the =item1, =item2, =item3, etc. blocks. Note that =item is just an abbreviation for =item1. For example:

=item1  Animal
=item2     Vertebrate
=item2     Invertebrate
=item1  Phase
=item2     Solid
=item2     Liquid
=item2     Gas
=item2     Chocolate

which would be rendered something like:

• Animal

– Vertebrate

– Invertebrate

• Phase

– Solid

– Liquid

– Gas

– Chocolate

It is an error for a "level-N+1" =item block (e.g. an =item2, =item3, etc.) to appear anywhere except where there is a preceding "level-N" =item. That is, an =item3 can only be specified if an =item2 appears somewhere before it, and that =item2 can only appear if there is a preceding =item1.
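The level constraint can be checked mechanically. The Python sketch below follows the literal wording above (an item at level N+1 is valid once some level-N item has appeared earlier in the same list). The function name is hypothetical, and a stricter parser might instead require the immediately preceding item to be at level N or deeper.

```python
def first_invalid_item(levels):
    """Given the levels of consecutive =itemN blocks in one list,
    return the index of the first misplaced item, or None if all are valid."""
    seen = set()
    for index, level in enumerate(levels):
        if level > 1 and (level - 1) not in seen:
            return index
        seen.add(level)
    return None


print(first_invalid_item([1, 2, 2, 1, 2, 2, 2, 2]))  # None
print(first_invalid_item([1, 3]))                    # 1
print(first_invalid_item([2]))                       # 0
```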

Note that item blocks within the same list are not physically nested. That is, lower-level items should not be specified inside higher-level items:

=comment WRONG...
=begin item1          --------------
The choices are:                    | 
=item2 Liberty        ==< Level 2   |==<  Level 1
=item2 Death          ==< Level 2   |
=item2 Beer           ==< Level 2   |
=end item1            --------------
=comment CORRECT...
=begin item1          ---------------
The choices are:                     |==< Level 1
=end item1            ---------------
=item2 Liberty        ==================< Level 2
=item2 Death          ==================< Level 2
=item2 Beer           ==================< Level 2

Ordered lists

An item is part of an ordered list if the item has a :numbered configuration option:

=for item1 :numbered
Visito
=for item2 :numbered
Veni
=for item2 :numbered
Vidi
=for item2 :numbered
Vici

This would produce something like:

1. Visito

1.1. Veni

1.2. Vidi

1.3. Vici

although the numbering scheme is entirely at the discretion of the renderer, so it might equally well be rendered:

1. Visito

1a. Veni

1b. Vidi

1c. Vici

or even:

A: Visito

  (i) Veni

 (ii) Vidi

(iii) Vici

Alternatively, if the first word of the item consists of a single # character, the item is treated as having a :numbered option:

=item1  # Visito
=item2     # Veni
=item2     # Vidi
=item2     # Vici

To specify an unnumbered list item that starts with a literal #, either make it verbatim:

=item V<#> introduces a comment

or explicitly mark the item itself as being unnumbered:

=for item :!numbered
# introduces a comment

The numbering of successive =item1 list items increments automatically, but is reset to 1 whenever any other kind of non-ambient Perldoc block appears between two =item1 blocks. For example:

The options are:
=item1 # Liberty
=item1 # Death
=item1 # Beer
The tools are:
=item1 # Revolution
=item1 # Deep-fried peanut butter sandwich
=item1 # Keg

would produce:

The options are:

1. Liberty

2. Death

3. Beer

The tools are:

1. Revolution

2. Deep-fried peanut butter sandwich

3. Keg

The numbering of nested items (=item2, =item3, etc.) only resets (to 1) when the higher-level item's numbering either resets or increments.

To prevent a numbered =item1 from resetting after a non-item block, you can specify the :continued option:

=for item1
# Retreat to remote Himalayan monastery

=for item1
# Learn the hidden mysteries of space and time

I<????>

=for item1 :continued
# Prophet!

which produces:

1. Retreat to remote Himalayan monastery

2. Learn the hidden mysteries of space and time

????

3. Prophet!
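The reset and :continued rules for level-1 items can be summarized in a short routine. This is a hedged Python sketch, not a description of any actual renderer: the Block type, its field names, and the "N." numbering style are all assumptions, and nested levels are left out for brevity.

```python
from typing import NamedTuple


class Block(NamedTuple):
    kind: str                # e.g. "item1", "para", "ambient" (illustrative names)
    text: str
    continued: bool = False  # True when the item carries :continued


def number_items(blocks):
    """Number =item1 blocks, restarting after any intervening non-item,
    non-ambient block unless the next item is marked :continued."""
    rendered = []
    count = 0
    interrupted = False
    for block in blocks:
        if block.kind == "item1":
            if interrupted and not block.continued:
                count = 0
            count += 1
            interrupted = False
            rendered.append(str(count) + ". " + block.text)
        elif block.kind == "ambient":
            continue  # ambient material is not rendered and does not reset
        else:
            interrupted = True
            rendered.append(block.text)
    return rendered


steps = [
    Block("item1", "Retreat to remote Himalayan monastery"),
    Block("item1", "Learn the hidden mysteries of space and time"),
    Block("para", "????"),
    Block("item1", "Prophet!", continued=True),
]
for line in number_items(steps):
    print(line)
```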

Definition lists

To create term/definition lists, specify the term as a configuration value of the item, and the definition as the item's contents:

=for item  :term<MAD>
Affected with a high degree of intellectual independence.

=for item  :term<MEEKNESS>
Uncommon patience in planning a revenge that is worth while.

=for item  :term<MORAL>
Conforming to a local and mutable standard of right.
Having the quality of general expediency.

An item that's specified as a term can still be numbered:

=for item :numbered :term<SELFISH>
Devoid of consideration for the selfishness of others. 

=for item :numbered :term<SUCCESS> 
The one unpardonable sin against one's fellows.

Unordered lists

List items that do not specify either the :numbered or :term options are unordered. Typically, such lists are rendered with bullets. For example:

=item1 Reading
=item2 Writing
=item3 'Rithmetic

might be rendered:

•  Reading

—  Writing

¤  'Rithmetic

As with numbering styles, the bulleting strategy used for different levels within a nested list is entirely up to the renderer.

Multi-paragraph list items

Use the delimited form of the =item block to specify items that contain multiple paragraphs. For example:

Let's consider two common proverbs:
=begin item :numbered
I<The rain in Spain falls mainly on the plain.>
This is a common myth and an unconscionable slur on the Spanish
people, the majority of whom are extremely attractive.
=end item
=begin item :numbered
I<The early bird gets the worm.>
In deciding whether to become an early riser, it is worth
considering whether you would actually enjoy annelids
for breakfast.
=end item
As you can see, folk wisdom is often of dubious value.

which produces:

Let's consider two common proverbs:

  1. The rain in Spain falls mainly on the plain.

    This is a common myth and an unconscionable slur on the Spanish people, the majority of whom are extremely attractive.

  2. The early bird gets the worm.

    In deciding whether to become an early riser, it is worth considering whether you would actually enjoy annelids for breakfast.

As you can see, folk wisdom is often of dubious value.

Nesting blocks

Any block can be nested by specifying a :nested option on it:

=begin para :nested
    We are all of us in the gutter,E<NL>
    but some of us are looking at the stars!
=end para

However, qualifying each nested paragraph individually quickly becomes tedious if there are many in a sequence, or if multiple levels of nesting are required:

=begin para :nested
    We are all of us in the gutter,E<NL>
    but some of us are looking at the stars!
=end para
=begin para :nested(2)
        -- Oscar Wilde
=end para

So Pod provides a =nested block that marks all its contents as being nested:

=begin nested
We are all of us in the gutter,E<NL>
but some of us are looking at the stars!
=begin nested
-- Oscar Wilde
=end nested
=end nested

Nesting blocks can contain any other kind of block, including implicit paragraph and code blocks.

Tables

Simple tables can be specified in Perldoc using a =table block. The table may be given an associated description or title using the :caption option.

Columns are separated by whitespace, vertical lines (|), or border intersections (+). Rows can be specified in one of two ways: either one row per line, with no separators; or multiple lines per row with explicit horizontal separators (whitespace, intersections (+), or horizontal lines: -, =, _) between every row. Either style can also have an explicitly separated header row at the top.

Each individual table cell is separately formatted, as if it were a nested =para.

This means you can create tables compactly, line-by-line:

=table
    The Shoveller   Eddie Stevens     King Arthur's singing shovel   
    Blue Raja       Geoffrey Smith    Master of cutlery              
    Mr Furious      Roy Orson         Ticking time bomb of fury      
    The Bowler      Carol Pinnsler    Haunted bowling ball           

or line-by-line with multi-line headers:

=table
    Superhero     | Secret          | 
                  | Identity        | Superpower 
    ==============|=================|================================
    The Shoveller | Eddie Stevens   | King Arthur's singing shovel   
    Blue Raja     | Geoffrey Smith  | Master of cutlery              
    Mr Furious    | Roy Orson       | Ticking time bomb of fury      
    The Bowler    | Carol Pinnsler  | Haunted bowling ball           

or with multi-line headers and multi-line data:

=begin table :caption('The Other Guys')
                Secret                                         
Superhero       Identity          Superpower                     
=============   ===============   ===================
The Shoveller   Eddie Stevens     King Arthur's
                                  singing shovel   
Blue Raja       Geoffrey Smith    Master of cutlery              
Mr Furious      Roy Orson         Ticking time bomb
                                  of fury      
The Bowler      Carol Pinnsler    Haunted bowling ball           
=end table
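For the simplest table style (one row per line, columns separated only by whitespace), a parser has to work out where the columns are, since cells such as "Blue Raja" contain spaces of their own. One plausible approach, sketched here in Python, is to treat a character position as a column separator only if it is blank in every row. The function name is hypothetical, and header rows, | and + separators, and multi-line rows are not handled.

```python
def split_columns(rows):
    """Split whitespace-aligned rows into cells, using the positions
    that are blank in every row as column separators."""
    width = max(len(row) for row in rows)
    padded = [row.ljust(width) for row in rows]
    is_gap = [all(row[i] == " " for row in padded) for i in range(width)]
    # Collect the [start, end) span of each run of non-gap positions.
    spans = []
    start = None
    for i, gap in enumerate(is_gap + [True]):
        if not gap and start is None:
            start = i
        elif gap and start is not None:
            spans.append((start, i))
            start = None
    return [[row[a:b].strip() for a, b in spans] for row in padded]


heroes = [
    ("The Shoveller", "Eddie Stevens", "King Arthur's singing shovel"),
    ("Blue Raja", "Geoffrey Smith", "Master of cutlery"),
    ("Mr Furious", "Roy Orson", "Ticking time bomb of fury"),
    ("The Bowler", "Carol Pinnsler", "Haunted bowling ball"),
]
# Lay the cells out in aligned columns, as they appear in the =table block.
rows = [a.ljust(16) + b.ljust(18) + c for a, b, c in heroes]
for cells in split_columns(rows):
    print(cells)
```

Note that this heuristic can split a column by accident when very few rows are supplied and their inner spaces happen to line up.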

Named blocks

Blocks whose names are not recognized as Pod built-ins are assumed to be destined for specialized renderers or parser plug-ins. For example:

=begin Xhtml
<object type="video/quicktime" data="onion.mov">
=end Xhtml

or:

=Image http://www.perlfoundation.org/images/perl_logo_32x104.png

Named blocks are converted by the Perldoc parser to block objects; specifically, to objects of a subclass of the standard Perldoc::Block::Named class.

For example, the blocks of the previous example would be converted to objects of the classes Perldoc::Block::Named::Xhtml and Perldoc::Block::Named::Image respectively. Both of those classes would be automatically created as subclasses of the Perldoc::Block::Named class (unless they were already defined via a prior =use directive).

The resulting object's .typename method retrieves the short name of the block type: 'Xhtml', 'Image', etc. The object's .config method retrieves the list of configuration options (if any). The object's .contents method retrieves a list of the block's verbatim contents.
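The automatic creation of subclasses for unrecognized block names can be modelled in a few lines. The Python sketch below is only an analogy: Named stands in for Perldoc::Block::Named, the registry and block_class helper are invented for illustration, and the real classes would be Perl 6 classes created by the parser.

```python
class Named:
    """Hypothetical Python stand-in for Perldoc::Block::Named."""

    def __init__(self, config, contents):
        self.config = config      # configuration options, if any
        self.contents = contents  # the block's verbatim contents

    @property
    def typename(self):
        return type(self).__name__


_registry = {}


def block_class(typename):
    """Return the subclass for `typename`, creating it on first use."""
    if typename not in _registry:
        _registry[typename] = type(typename, (Named,), {})
    return _registry[typename]


image = block_class("Image")(
    {}, ["http://www.perlfoundation.org/images/perl_logo_32x104.png"]
)
print(image.typename)            # Image
print(isinstance(image, Named))  # True
```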

Named blocks for which no explicit class has been defined or loaded are usually not rendered by the standard renderers.

Note that all block names consisting entirely of lower-case or entirely of upper-case letters are reserved. See Semantic blocks.

Comments

Comments are Pod blocks that are never rendered by any renderer. They are, of course, still included in any internal Perldoc representation, and are accessible via the Perldoc API.

Comments are useful for meta-documentation (documenting the documentation):

=comment Add more here about the algorithm

and for temporarily removing parts of a document:

=item # Retreat to remote Himalayan monastery

=item # Learn the hidden mysteries of space and time

=item # Achieve enlightenment

=begin comment
=item # Prophet!
=end comment

Note that, since the Perl interpreter never executes embedded Perldoc blocks, comment blocks can also be used as (nestable!) block comments in Perl 6:

=begin comment
for my $file (@files) {
    system("rm -rf $file");
}
=end comment

The =END block

The =END block is special in that all three of its forms (delimited, paragraph, and abbreviated) are terminated only by the end of the current file. That is, neither =END nor =for END is terminated by the next blank line, and =end END has no effect within a =begin END block. A warning is issued if an explicit =end END appears within a document.

An =END block indicates the end-point of any ambient material within the document. This means that the parser will treat all the remaining text in the file as Perldoc, even if it is not inside an explicit block. In other words, apart from its special end-of-file termination behaviour, an =END block is in all other respects identical to a =pod block.

Data blocks

Named Perldoc blocks whose typename is DATA are the Perl 6 equivalent of the Perl 5 __DATA__ section. The difference is that =DATA blocks are just regular Pod blocks and may appear anywhere within a source file, and as many times as required. Synopsis 2 describes the new Perl 6 interface for inline data.

Semantic blocks

All other uppercase block typenames are reserved for specifying standard documentation, publishing, or source components. In particular, all the standard components found in Perl and manpage documentation have reserved uppercase typenames.

Standard semantic blocks include:

=NAME
=VERSION
=SYNOPSIS
=DESCRIPTION
=USAGE
=INTERFACE 
=METHOD
=SUBROUTINE
=OPTION
=DIAGNOSTIC
=ERROR
=WARNING
=DEPENDENCY
=BUG
=SEEALSO
=ACKNOWLEDGEMENT
=AUTHOR
=COPYRIGHT
=DISCLAIMER 
=LICENCE
=LICENSE
=TITLE
=SECTION
=CHAPTER
=APPENDIX
=TOC
=INDEX
=FOREWORD
=SUMMARY

The plural forms of each of these keywords are also reserved, and are aliases for the singular forms.

Most of these blocks would typically be used in their full delimited forms:

=begin SYNOPSIS
    use Perldoc::Parser;

    my Perldoc::Parser $parser .= new();

    my $tree = $parser.parse($fh);
=end SYNOPSIS

The use of these reserved keywords is not required; you can still just write:

=head1 SYNOPSIS
=begin code
    use Perldoc::Parser;
    
    my Perldoc::Parser $parser .= new();
    
    my $tree = $parser.parse($fh);
=end code

However, using the keywords adds semantic information to the documentation, which may assist various renderers, summarizers, coverage tools, and other utilities.

Note that there is no requirement that semantic blocks be rendered in a particular way (or at all). Specifically, it is not necessary to preserve the capitalization of the keyword. For example, the =SYNOPSIS block of the preceding example might be rendered like so:

3.  Synopsis

use Perldoc::Parser;
    
my Perldoc::Parser $parser .= new();
    
my $tree = $parser.parse($fh);

Formatting codes

Formatting codes provide a way to add inline mark-up to a piece of text within the contents of (most types of) blocks. Formatting codes are themselves a type of block, and most of them may nest sequences of any other type of block (most often, other formatting codes). In particular, you can nest comment blocks in the middle of a formatting code:

B<I shall say this loudly
=begin comment
and repeatedly
=end comment
and with emphasis.>

All Pod formatting codes consist of a single capital letter followed immediately by a set of angle brackets. The brackets contain the text or data to which the formatting code applies. You can use a set of single angles (<...>), a set of double angles («...»), or multiple single-angles (<<<...>>>).

Within angle delimiters, you cannot use sequences of the same angle characters that are longer than the delimiters:

=comment
    These are errors...

C< $foo<<bar>> >
The Perl 5 heredoc syntax was: C< <<END_MARKER >

You can use sequences of angles that are the same length as the delimiters, but they must be balanced. For example:

C<  $foo<bar>   >
C<< $foo<<bar>> >>

If you need an unbalanced angle, either use different delimiters:

C«$foo < $bar»
The Perl 5 heredoc syntax was: C« <<END_MARKER »

or delimiters with more consecutive angles than your text contains:

C<<$foo < $bar>>
The Perl 5 heredoc syntax was: C<<< <<END_MARKER >>>
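
The "more consecutive angles than your text contains" rule is easy to automate. The following Python sketch is illustrative only and is not part of Perldoc: the helper name wrap_code is hypothetical, and it always pads the contents with one space on each side. It picks delimiters one angle longer than the longest run of angles in the text, which is always safe though not always minimal:

```python
import re

def wrap_code(letter: str, text: str) -> str:
    """Wrap text in a formatting code whose angle delimiters are one
    longer than the longest run of '<' or '>' found inside the text."""
    runs = re.findall(r"<+|>+", text)
    width = max((len(run) for run in runs), default=0) + 1
    return f"{letter}{'<' * width} {text} {'>' * width}"

print(wrap_code("C", "$foo < $bar"))
print(wrap_code("C", "<<END_MARKER"))
```

Balanced text such as $foo<bar> would also be accepted with single angles, so a smarter helper could check for balance first.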

A formatting code ends at the matching closing angle bracket(s), or at the end of the enclosing block or formatting code in which the opening angle bracket was specified, whichever comes first. Pod parsers are required to issue a warning whenever a formatting code is terminated by the end of an outer block rather than by its own delimiter (unless the user explicitly disables the warning).

Significance indicators

Pod provides three formatting codes that flag their contents with increasing levels of significance: U<> (unusual), I<> (important), and B<> (the basis or focus of the surrounding text).

Definitions

The D<> formatting code indicates that the contained text is a definition, introducing a term that the adjacent text elucidates. For example:

There ensued a terrible moment of D<coyotus interruptus>: a brief
suspension of the effects of gravity, accompanied by a sudden
to-the-camera realisation of imminent downwards acceleration.

A definition may be given synonyms, which are specified after a vertical bar and separated by semicolons:

A D<Formatting code|formatting codes;formatters> provides a way
to add inline mark-up to a piece of text.

A definition would typically be rendered in italics or <dfn>...</dfn> tags and will often be used as a link target for subsequent instances of the term (or any of its specified synonyms) within a hypertext.

Example specifiers

Perldoc provides formatting codes for specifying inline examples of input (K<>), output (T<>), code (C<>), and metasyntax (R<>).

Verbatim text

The V<> formatting code treats its entire contents as being verbatim, disregarding every apparent formatting code within it. For example:

The B<V< V<> >> formatting code disarms other codes
such as V< I<>, C<>, B<>, and M<> >.

Note, however, that the V<> code only changes the way its contents are parsed, not the way they are rendered. That is, the contents are still wrapped and formatted like plain text, and the effects of any formatting codes surrounding the V<> code are still applied to its contents. For example, the previous example is rendered:

The V<> formatting code disarms other codes such as I<>, C<>, B<>, and M<> .

You can prespecify formatting codes that remain active within a V<> code, using the :allow option.

Inline comments

The Z<> formatting code indicates that its contents constitute a zero-width comment, which should not be rendered by any renderer. For example:

The "exeunt" command Z<Think about renaming this command?> is used
to quit all applications.

In Perl 5 POD, the Z<> code was widely used to break up text that would otherwise be considered mark-up:

In Perl 5 POD, the ZZ<><> code was widely used to break up text
that would otherwise be considered mark-up.

That technique still works, but it's now easier to accomplish the same goal using a verbatim formatting code:

In Perl 5 POD, the V<Z<>> code was widely used to break up text
that would otherwise be considered mark-up.

Moreover, the C<> code automatically treats its contents as being verbatim, which often eliminates the need for the V<> as well:

In Perl 5 POD, the C<Z<>> code was widely used to break up text
that would otherwise be considered mark-up.

The Z<> formatting code is the inline equivalent of a =comment block.

Links

The L<> code is used to specify all kinds of links, filenames, citations, and cross-references (both internal and external).

A link specification consists of a scheme specifier terminated by a colon, followed by an external address (in the scheme's preferred syntax), followed by an internal address (again, in the scheme's syntax). All three components are optional, though at least one must be present in any link specification.

Usually, in schemes where an internal address makes sense, it will be separated from the preceding external address by a #, unless the particular addressing scheme requires some other syntax. When new addressing schemes are created specifically for Perldoc it is strongly recommended that # be used to mark the start of internal addresses.

Standard schemes include:

http: and https:

A standard web URL. For example:

This module needs the LAME library
(available from L<http://www.mp3dev.org/mp3/>)

If the link does not start with // it is treated as being relative to the location of the current document:

See also: L<http:tutorial/faq.html> and
L<http:../examples/index.html>

file:

A filename on the local system. For example:

Next, edit the global config file (that is, either
L<file:/usr/local/lib/.configrc> or L<file:~/.configrc>).

Filenames that don't begin with a / or a ~ are relative to the current document's location:

Then, edit the local config file (that is, either
L<file:.configrc> or L<file:CONFIG/.configrc>).

mailto:

An email address. Typically, activating this type of link invokes a mailer. For example:

Please forward bug reports to L<mailto:devnull@rt.cpan.org>

man:

A link to the system manpages. For example:

This module implements the standard
Unix L<man:find(1)> facilities.

doc:

A link to some other documentation, typically a module or part of the core documentation. For example:

You may wish to use L<doc:Data::Dumper> to
view the results. See also: L<doc:perldata>.

defn:

A link to the definition of the specified term within the current document. For example:

He was highly prone to D<lexiphania>: an unfortunate proclivity for
employing sesquipedalian words (such as "proclivity",
"sesquipedalian", and indeed "lexiphania").

and later, to link back to the definition:

To treat his chronic L<defn:lexiphania> the doctor prescribed an
immediate glossectomy or, if that proved ineffective, a complete
cephalectomy.

isbn: and issn:

The International Standard Book Number or International Standard Serial Number for a publication. For example:

The Perl Journal was a registered 
serial publication (L<issn:1087-903X>)

To refer to a specific section within a webpage, manpage, or Perldoc document, add the name of that section after the main link, separated by a #. For example:

Also see: L<man:bash(1)#Compound Commands>,
L<doc:perlsyn#For Loops>, and
L<http://dev.perl.org/perl6/syn/S04.html#The_for_statement>

To refer to a section of the current document, omit the external address:

This mechanism is described under L<doc:#Special Features> below.

The scheme name may also be omitted in that case:

This mechanism is described under L<#Special Features> below.

Normally a link is presented as some rendered version of the link specification itself. However, you can specify an alternate presentation by prefixing the link with the desired text and a vertical bar. Whitespace is not significant on either side of the bar. For example:

This module needs the L<LAME library|http://www.mp3dev.org/mp3/>.

You could also write the code
L<in Latin | doc:Lingua::Romana::Perligata>
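
The structure of a link specification (optional display text, scheme, external address, and internal address) can be sketched as a small Python parser. This is an assumption-laden illustration, not the Perldoc parser: the names Link and parse_link are hypothetical, only the schemes listed in this section are recognised, and # is assumed to separate the internal address in every scheme.

```python
from typing import NamedTuple, Optional

# Schemes named in this section; a real parser may recognise more.
KNOWN_SCHEMES = {"http", "https", "file", "mailto", "man",
                 "doc", "defn", "isbn", "issn"}

class Link(NamedTuple):
    text: Optional[str]      # alternate presentation, if any
    scheme: Optional[str]    # e.g. "doc", without the colon
    external: str            # may be empty for same-document links
    internal: Optional[str]  # section name after '#', if any

def parse_link(contents: str) -> Link:
    text = None
    target = contents.strip()
    if "|" in contents:
        # Whitespace is not significant on either side of the bar.
        text, target = (part.strip() for part in contents.split("|", 1))
    scheme = None
    head, colon, rest = target.partition(":")
    if colon and head in KNOWN_SCHEMES:
        scheme, target = head, rest
    external, hash_mark, internal = target.partition("#")
    return Link(text, scheme, external, internal if hash_mark else None)

print(parse_link("in Latin | doc:Lingua::Romana::Perligata"))
print(parse_link("#Special Features"))
```

Checking the scheme against a known set avoids mistaking the first component of a package name such as Data::Dumper for a scheme.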

Placement links

A second kind of link—the P<> or placement link—works in the opposite direction. Instead of directing focus out to another document, it allows you to draw the contents of another document into your own.

In other words, the P<> formatting code takes a URI and (where possible) places the contents of the corresponding document inline in place of the code itself.

P<> codes are handy for breaking out standard elements of your documentation set into reusable components that can then be incorporated directly into multiple documents. For example:

=COPYRIGHT
P<file:/shared/docs/std_copyright.pod>
=DISCLAIMER
P<http://www.MegaGigaTeraPetaCorp.com/std/disclaimer.txt>

might produce:

Copyright

This document is copyright (c) MegaGigaTeraPetaCorp, 2006. All rights reserved.

Disclaimer

ABSOLUTELY NO WARRANTY IS IMPLIED. NOT EVEN OF ANY KIND. WE HAVE SOLD YOU THIS SOFTWARE WITH NO HINT OF A SUGGESTION THAT IT IS EITHER USEFUL OR USABLE. AS FOR GUARANTEES OF CORRECTNESS...DON'T MAKE US LAUGH! AT SOME TIME IN THE FUTURE WE MIGHT DEIGN TO SELL YOU UPGRADES THAT PURPORT TO ADDRESS SOME OF THE APPLICATION'S MANY DEFICIENCIES, BUT NO PROMISES THERE EITHER. WE HAVE MORE LAWYERS ON STAFF THAN YOU HAVE TOTAL EMPLOYEES, SO DON'T EVEN *THINK* ABOUT SUING US. HAVE A NICE DAY.

If a renderer cannot find or access the external data source for a placement link, it must issue a warning and render the URI directly in some form, possibly as an outwards link. For example:

Copyright

See: std_copyright.pod

Disclaimer

See: http://www.MegaGigaTeraPetaCorp.com/std/disclaimer.txt

Space-preserving text

Any text enclosed in an S<> code is formatted normally, except that every whitespace character in it—including any newline—is preserved. These characters are also treated as being non-breaking (except for the newlines, of course). For example:

The emergency signal is: S<
dot dot dot   dash dash dash   dot dot dot>.

would be formatted like so:

The emergency signal is:
dot dot dot   dash dash dash   dot dot dot.

rather than:

The emergency signal is: dot dot dot dash dash dash dot dot dot.

Entities

To include named Unicode or XHTML entities, use the E<> code.

If the contents of the E<> are a number, that number is treated as the decimal Unicode value for the desired codepoint. For example:

Perl 6 makes considerable use of E<171> and E<187>.

You can also use explicit binary, octal, decimal, or hexadecimal numbers (using the Perl 6 notations for explicitly based numbers):

Perl 6 makes considerable use of E<0b10101011> and E<0b10111011>.
Perl 6 makes considerable use of E<0o253> and E<0o273>.
Perl 6 makes considerable use of E<0d171> and E<0d187>.
Perl 6 makes considerable use of E<0xAB> and E<0xBB>.

If the contents are not a number, they are interpreted as a Unicode character name (which is always upper-case), or else as an XHTML entity. For example:

Perl 6 makes considerable use of
E<LEFT-POINTING DOUBLE ANGLE QUOTATION MARK> and
E<RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK>.

or, equivalently:

Perl 6 makes considerable use of E<laquo> and E<raquo>.

Multiple consecutive entities can be specified in a single E<> code, separated by semicolons:

Perl 6 makes considerable use of E<laquo;hellip;raquo>.
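
The resolution rules for E<> contents can be summarised in a short Python sketch. It is only an approximation under stated assumptions: the function name resolve_entities is hypothetical, Unicode names are looked up with Python's unicodedata module, and XHTML entities are approximated by the HTML 4 table in html.entities.

```python
import unicodedata
from html.entities import name2codepoint

# Perl 6 style radix prefixes for explicitly based numbers.
RADIX = {"0b": 2, "0o": 8, "0d": 10, "0x": 16}

def resolve_entities(contents: str) -> str:
    """Turn the contents of one E<> code into the characters it names."""
    chars = []
    for item in contents.split(";"):
        item = item.strip()
        if item.isascii() and item.isdigit():
            # A bare number is a decimal codepoint.
            chars.append(chr(int(item)))
        elif len(item) > 2 and item[:2].lower() in RADIX:
            chars.append(chr(int(item[2:], RADIX[item[:2].lower()])))
        elif item == item.upper():
            # Unicode character names are always upper-case.
            chars.append(unicodedata.lookup(item))
        else:
            chars.append(chr(name2codepoint[item]))
    return "".join(chars)

print(resolve_entities("laquo;hellip;raquo"))
```

An unknown name raises KeyError here; a real renderer would more likely issue a diagnostic.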

Indexing terms

Anything enclosed in an X<> code is an index entry. The contents of the code are both formatted into the document and used as the (case-insensitive) index entry:

An X<array> is an ordered list of scalars indexed by number,
starting with 0. A X<hash> is an unordered collection of scalar
values indexed by their associated string key.

You can specify an index entry in which the indexed text and the index entry are different, by separating the two with a vertical bar:

An X<array|arrays> is an ordered list of scalars indexed by number,
starting with 0. A X<hash|hashes> is an unordered collection of
scalar values indexed by their associated string key.

In the two-part form, the index entry comes after the bar and is case-sensitive.

You can specify hierarchical index entries by separating indexing levels with commas:

An X<array|arrays, definition of> is an ordered list of scalars
indexed by number, starting with 0. A X<hash|hashes, definition of>
is an unordered collection of scalar values indexed by their
associated string key.

You can specify two or more entries for a single indexed text, by separating the entries with semicolons:

A X<hash|hashes, definition of; associative arrays>
is an unordered collection of scalar values indexed by their
associated string key.

The indexed text can be empty, creating a "zero-width" index entry:

X<|puns, deliberate>This is called the "Orcish Manoeuvre"
because you "OR" the "cache".
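
The separators used inside an X<> code nest in a fixed order: the bar splits indexed text from entries, semicolons split entries, and commas split levels within an entry. A minimal Python sketch of that decomposition follows. The name parse_index is hypothetical, and treating the one-part form as a single unsplit entry is an assumption.

```python
from typing import List, Tuple

def parse_index(contents: str) -> Tuple[str, List[List[str]]]:
    """Return (indexed text, entries), where each entry is a list of
    hierarchical levels."""
    if "|" not in contents:
        # One-part form: the text is also the (single) index entry.
        return contents, [[contents]]
    shown, _, entries = contents.partition("|")
    parsed = [
        [level.strip() for level in entry.split(",")]
        for entry in entries.split(";")
    ]
    return shown.strip(), parsed

print(parse_index("hash|hashes, definition of; associative arrays"))
```

An empty indexed text, as in the zero-width example, simply comes back as an empty string with its entries intact.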

Annotations

Anything enclosed in an N<> code is an inline note. For example:

Use a C<for> loop instead.N<The Perl 6 C<for> loop is far more
powerful than its Perl 5 predecessor.> Preferably with an explicit
iterator variable.

Renderers may render such annotations in a variety of ways: as footnotes, as endnotes, as sidebars, as pop-ups, as tooltips, as expandable tags, etc. They are never, however, rendered as unmarked inline text. So the previous example might be rendered as:

Use a for loop instead.† Preferably with an explicit iterator variable.

and later:

Footnotes

† The Perl 6 for loop is far more powerful than its Perl 5 predecessor.

User-defined formatting codes

Perldoc modules can define their own formatting codes, using the M<> code. An M<> code must start with a colon-terminated scheme specifier. The rest of the enclosed text is treated as the (verbatim) contents of the formatting code. For example:

=use Perldoc::TT

=head1 Overview of the M<TT: $CLASSNAME > class
(version M<TT: $VERSION>)

M<TT: get_description($CLASSNAME) >

The M<> formatting code is the inline equivalent of a named block.

Internally an M<> code is converted to an object derived from the Perldoc::FormattingCode::Named class. The name of the scheme becomes the final component of the object's classname. For instance, the M<> code in the previous example would be converted to a Perldoc::FormattingCode::Named::TT object, whose .typename method retrieves the string "TT" and whose .contents method retrieves a list of the formatting code's (verbatim, unformatted) contents.

If the formatting code is unrecognized, the contents of the code (i.e. everything after the first colon) would normally be rendered as ordinary text.

Encoding

By default, Perldoc assumes that documents are Unicode, encoded in one of the three common schemes (UTF-8, UTF-16, or UTF-32). The particular scheme a document uses is autodiscovered by examination of the first few bytes of the file (where possible). If the autodiscovery fails, UTF-8 is assumed, and parsers may treat any non-UTF-8 bytes later in the document as fatal errors.
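
This document does not prescribe how autodiscovery works. One plausible approach, sketched below in Python, is to look for a Unicode byte-order mark and fall back to UTF-8. The function name sniff_encoding and the reliance on byte-order marks alone are assumptions made for illustration.

```python
import codecs

# Longer marks are tested first: the UTF-32 LE mark begins with the
# same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_encoding(first_bytes: bytes) -> str:
    """Guess a document's Unicode scheme from its first few bytes."""
    for bom, name in BOMS:
        if first_bytes.startswith(bom):
            return name
    # Autodiscovery failed, so UTF-8 is assumed.
    return "utf-8"

print(sniff_encoding(b"=begin pod"))
```

A fuller implementation would also need to recognise the "self-encoded" =encoding directives described below.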

At any point in a document, you can explicitly set or change the encoding of its content using the =encoding directive:

=encoding ShiftJIS
=encoding Macintosh
=encoding KOI8-R

The specified encoding is used from the start of the next line in the document. If a second =encoding directive is encountered, the current encoding changes again after that line. Note, however, that the second encoding directive must itself be encoded using the first encoding scheme.

This requirement also applies to an =encoding directive at the very beginning of the file. That is, it must itself be encoded in the default UTF-8, -16, or -32. However, as a special case, the autodiscovery mechanism will (as far as possible) also attempt to recognize "self-encoded" =encoding directives that begin at the first byte of the file. For example, at the start of a ShiftJIS-encoded file you can specify =encoding ShiftJIS in the ShiftJIS encoding.

An =encoding directive also affects any ambient code between the Perldoc blocks. That is, Perl 6 uses =encoding directives to determine the encoding of its source code as well as that of any documentation.

Note that =encoding is a fundamental Perldoc directive, like =begin or =for; it is not an instance of an abbreviated block. Hence there is no paragraph or delimited form of the =encoding directive (just as there is no paragraph or delimited form of =begin).

Block pre-configuration

The =config directive allows you to prespecify standard configuration information that is applied to every block of a particular type.

For example, to specify particular formatting for different levels of heading, you could preconfigure all the heading directives with appropriate formatting schemes:

=config head1              :formatted<B U>  :numbered
=config head2 :like<head1> :formatted<I>
=config head3              :formatted<U>
=config head4 :like<head3> :formatted<I>

The general syntax for configuration directives is:

=config BLOCK_TYPE  CONFIG OPTIONS
=                   OPTIONAL EXTRA CONFIG OPTIONS

Like =encoding, a =config is a directive, not a block. Hence, there is no paragraph or delimited form of the =config directive. Each =config specification is lexically scoped to the surrounding block in which it is specified.

Note that, if a particular block later explicitly specifies a configuration option with the same key, that option overrides the pre-configured option. For example, given the heading configurations in the previous example, to specify a non-basic second-level heading:

=for head2 :formatted<I U>
Details

The :like option causes the current formatting options for the named block type to be (lexically) replaced by the complete formatting information of the block type specified as the :like's value. That other block type must already have been preconfigured. Any additional formatting specifications are subsequently added to that config. For example:

=comment  In the current scope make =head2 an "important" variant of =head1
=config head2 :like<head1> :formatted<I>

Incidentally, this also means you can arrange for an explicit :formatted option to augment an existing =config, rather than replacing it. Like so:

=comment  Mark this =head3 (but only this one) as being important
          (in addition to the normal formatting)...
=for head3 :like<head3> :formatted<I>
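
The interaction between overriding and :like can be modelled as a dictionary merge. The Python sketch below is one reading of the rules above, not a definitive implementation: the function name effective_options is hypothetical, options are modelled as a dict, and only list-valued options such as :formatted are assumed to accumulate under :like.

```python
def effective_options(preconfig, block_type, options, like=None):
    """Return the options in force for one block or =config line.

    preconfig maps a block typename to its current option dict.
    """
    if like is None:
        # No :like, so explicit options override preconfigured ones
        # that have the same key.
        merged = dict(preconfig.get(block_type, {}))
        merged.update(options)
        return merged
    # With :like, start from the other type's complete configuration
    # (a KeyError means it was never preconfigured) ...
    merged = dict(preconfig[like])
    for key, value in options.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            # ... and add list-valued options to what is already there.
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        else:
            merged[key] = value
    return merged

pre = {"head1": {"formatted": ["B", "U"], "numbered": True}}
pre["head2"] = effective_options(pre, "head2", {"formatted": ["I"]}, like="head1")
print(pre["head2"])
print(effective_options(pre, "head2", {"formatted": ["I", "U"]}))
```

Under this reading the first call yields B, U, and I formatting plus numbering for head2, while the second replaces the formatting outright and keeps the numbering.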

Pre-configuring formatting codes

You can also lexically preconfigure a formatting code, by naming it with a pair of angles as a suffix. For example:

=comment  Always allow E<> codes in any (implicit or explicit) V<> code...
=config V<>  :allow<E>
=comment  All inline code to be marked as important...
=config C<>  :formatted<I>

Note that, even though the formatting code is named using single-angles, the preconfiguration applies regardless of the actual delimiters used on subsequent instances of the code.

Modules

Perldoc provides a mechanism by which you can extend the syntax, semantics, or content of your documentation: the =use directive.

Specifying a =use causes a Perldoc processor to load the corresponding Perldoc module at that point, or to throw an exception if it cannot.

Such modules can specify additional content that should be included in the document. Alternatively, they can register classes that handle new types of block directives or formatting codes.

Note that a module loaded via a =use statement can affect the content or the interpretation of subsequent blocks, but not the initial parsing of those blocks. Any new block types must still conform to the general syntax described in this document. Typically, a module will change the way that renderers parse the contents of specific blocks.

A =use directive may be specified with either a module name or a URI:

=use MODULE_NAME  OPTIONAL CONFIG DATA
=                 OPTIONAL EXTRA CONFIG DATA
=use URI

If a URI is given, the specified file is treated as a source of Pod to be included in the document. Any Pod blocks are parsed out of the contents of the =use'd file, and added to the main file's Pod representation at that point.

If a module name is specified, with a language prefix of pod:, then the corresponding .pod file is searched for in the $PERL6DOC "documentation path". If none is found, the corresponding .pm file is then searched for in the library path ($PERL6LIB). If either file is found, the Pod is parsed out of it and the resulting block objects inserted into the main file's representation.

If a module name is specified with any prefix except pod:, or without a prefix at all, then the corresponding .pm file (or another language's equivalent code module) is searched for in the appropriate module library path. If found, the code module is require'd into the Pod parser (usually to add a class implementing a particular Pod extension). If no such code module is found, a suitable .pod file is searched for instead, the contents parsed as Pod, and the resulting block objects inserted into the main file's representation.
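
The two search orders can be summarised in a short Python sketch. It rests on several assumptions that this document does not state: the function name locate_use_target is hypothetical, module names map to paths by replacing :: with /, the fallback .pod search uses the documentation path, only the .pm extension is tried for code modules, and version suffixes are not handled. File existence is passed in as a predicate so that the sketch does not touch the filesystem.

```python
from typing import Callable, Optional, Sequence, Tuple

def locate_use_target(
    name: str,
    doc_path: Sequence[str],
    lib_path: Sequence[str],
    exists: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Return ("pod", file) or ("code", file), or None if nothing is found."""
    prefix, sep, rest = name.partition(":")
    # A single colon marks a language prefix; "::" separates packages.
    if sep and not rest.startswith(":"):
        module = rest
    else:
        prefix, module = "", name
    relative = module.replace("::", "/")

    def find(dirs: Sequence[str], ext: str) -> Optional[str]:
        for directory in dirs:
            candidate = f"{directory}/{relative}{ext}"
            if exists(candidate):
                return candidate
        return None

    if prefix == "pod":
        # Documentation path first, then the library path.
        found = find(doc_path, ".pod") or find(lib_path, ".pm")
        return ("pod", found) if found else None
    # Any other prefix, or none: code module first, then Pod.
    code = find(lib_path, ".pm")
    if code:
        return ("code", code)
    pod = find(doc_path, ".pod")
    return ("pod", pod) if pod else None

files = {"/doc/Test/Tutorial.pod", "/lib/Perldoc/TT.pm"}
print(locate_use_target("pod:Test::Tutorial", ["/doc"], ["/lib"], files.__contains__))
```

A result tagged "pod" means the file's Pod would be parsed and inserted; a result tagged "code" means the module would be loaded into the parser.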

You can use fully and partially specified module names (as with Perl 6 modules):

=use Perldoc::Plugin::XHTML-1.2.1-(*)

Any options that are specified after the module name:

=use Perldoc::Plugin::Image  :Jpeg  prefix=>'http://dev.perl.org'

are passed to the internal require that loads the corresponding module.

Collectively, these alternatives allow you to create standard documentation inserts or stylesheets, to include Pod extracted from other code files, or to specify new types of documentation blocks and formatting codes.

Note that =use is a fundamental Perldoc directive, like =begin or =encoding, so there is no paragraph or delimited form of =use.

SUMMARY

Directives

Directive    Specifies
---------    ----------------------------------------------------
=begin       Start of an explicitly terminated block
=config      Lexical modifications to a block or formatting code
=encoding    Encoding scheme for subsequent text
=end         Explicit termination of a =begin block
=for         Start of an implicitly (blank-line) terminated block
=use         Transclusion of content; loading of a Perldoc module

Blocks

Block typename    Specifies
--------------    ----------------------------------------------
=code             Verbatim pre-formatted sample source code
=comment          Content to be ignored by all renderers
=headN            Nth-level heading
=input            Pre-formatted sample input
=item             First-level list item
=itemN            Nth-level list item
=nested           Nest block contents within the current context
=output           Pre-formatted sample output
=para             Ordinary paragraph
=table            Simple rectangular table
=DATA             Perl 6 data section
=END              No ambient blocks after this point
=RESERVED         Semantic blocks (=SYNOPSIS, =BUGS, etc.)
=Typename         User-defined block

Formatting codes

Formatting code     Specifies
----------------    -------------------------------------------------
B<...>              Basis/focus of sentence (typically rendered bold)
C<...>              Code (typically rendered fixed-width)
D<...|...;...>      Definition (D<defined term|synonym;synonym;...>)
E<...>              Entity name or numeric codepoint
I<...>              Important (typically rendered in italics)
K<...>              Keyboard input (typically rendered fixed-width)
L<...|...>          Link (L<display text|destination URI>)
M<...:...>          Module-defined code (M<scheme:contents>)
N<...>              Note (not rendered inline)
P<...>              Placement link
R<...>              Replaceable component or metasyntax
S<...>              Space characters to be preserved
T<...>              Terminal output (typically rendered fixed-width)
U<...>              Unusual (typically rendered with underlining)
V<...>              Verbatim (internal formatting codes ignored)
X<...|..,..;...>    Index entry (X<display text|entry,subentry;...>)
Z<...>              Zero-width comment (contents never rendered)

Notes

1. A valid identifier is a sequence of alphanumerics and/or underscores, beginning with an alphabetic or underscore.

2. Because an = in the first column is always the start of a Pod directive.