Synopsis 26 - Documentation
Damian Conway <damian@conway.org>
Maintainer: Damian Conway
Date: 9 Apr 2005
Last Modified: 14 Feb 2007
Perldoc is an easy-to-use markup language with a simple, consistent underlying document object model. Perldoc can be used for writing language documentation, for documenting programs and modules, as well as for other types of document composition.
Perldoc allows for multiple syntactic dialects, all of which map onto the same set of standard document objects. The standard dialect is named "Pod".
Pod is an evolution of Perl 5's Plain Ol' Documentation (POD) markup. Compared to Perl 5 POD, Perldoc's Pod dialect is much more uniform, somewhat more compact, and considerably more expressive. The Pod dialect also differs in that it is a purely descriptive mark-up notation, with no presentational components.
Pod documents are specified using directives, which are used to
declare configuration information and to delimit blocks of textual content.
Every directive starts with an equals sign (=) in the first column.
The content of a document is specified within one or more blocks. Every Pod block may be declared in any of three equivalent forms: delimited style, paragraph style, or abbreviated style.
Anything in a document that is neither a Pod directive nor contained
within a Pod block is treated as "ambient" material. Typically this
would be the source code of the program that the Pod is documenting. Pod
parsers still parse this text into the internal representation of the
file (representing it as a Perldoc::Block::Ambient block), but
renderers will usually ignore such blocks.
In Perl 5's POD format, once a POD directive is encountered, the parser
considers everything that follows to be POD, until an explicit =cut
directive is encountered, at which point the parser flips between POD
and ambient text. The Perl 6 Pod format is different. A Pod parser
always reverts to "ambient" at the end of each Pod directive or block.
To cause the parser to remain in Pod mode, you must enclose the desired
Pod region in a pod block:

    =begin pod

    =head1 A heading

    This is Pod too. Specifically, this is a simple C<para> block

        $this = pod('also');  # Specifically, a code block

    =end pod

Alternatively you can indicate an entire file contains only Pod, by
giving it a .pod suffix.
Delimited blocks are bounded by =begin and =end markers, both of
which are followed by a valid identifier[1], which is the typename of the block. Typenames
that are entirely lowercase (for example: =begin head1) or entirely
uppercase (for example: =begin SYNOPSIS) are reserved.
After the typename, the rest of the =begin marker line is treated as
configuration information for the block. This information is used in
different ways by different types of blocks, and is specified using
Perl6ish :key<value> or key=>value pairs (which must, of
course, be constants since Perldoc is a specification language, not a
programming language).
See Synopsis 2 for a summary of the Perl 6 pair notation.
The configuration section may be extended over subsequent lines by
starting those lines with an = in the first column followed by a
whitespace character.
The lines following the opening delimiter and configuration are the data
or contents of the block, which continue until the block's =end marker
line. The general syntax is:

    =begin BLOCK_TYPE  OPTIONAL CONFIG INFO
    =                  OPTIONAL EXTRA CONFIG INFO
    BLOCK CONTENTS
    =end BLOCK_TYPE

For example:

    =begin table  :caption<Table of Contents>
        Constants           1
        Variables           10
        Subroutines         33
        Everything else     57
    =end table

    =begin Name  :required
    =            :width(50)
    The applicant's full name
    =end Name

    =begin Contact  :optional
    The applicant's contact details
    =end Contact

Note that no blank lines are required around the directives; blank lines within the contents are always treated as part of the contents. This is a universal feature of Pod.
Note also that in the following specifications, a "blank line" is a line
that is either empty or that contains only whitespace characters. That
is, a blank line matches the Perl 6 pattern: /^^ \h* $$/. Pod uses
blank lines as delimiters, rather than empty lines, in keeping with the
principle of least surprise.
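The blank-line rule can be checked mechanically. The following is a minimal Python sketch, not part of Perldoc: the name is_blank_line is hypothetical, and Perl 6's \h (horizontal whitespace) is approximated here by spaces and tabs only.

```python
import re

# Approximation of the Perl 6 pattern /^^ \h* $$/ : a line that is empty
# or contains only horizontal whitespace (assumed here: spaces and tabs).
BLANK_LINE = re.compile(r"[ \t]*")


def is_blank_line(line: str) -> bool:
    """Return True if the line is blank in the Pod sense."""
    # Drop the line terminator, then require the whole line to be whitespace.
    return BLANK_LINE.fullmatch(line.rstrip("\r\n")) is not None
```

Under this sketch a line holding only stray spaces or tabs delimits a block exactly as an empty line does.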
Paragraph blocks are introduced by a =for marker and terminated by
the next Pod directive or the first blank line (which is not
considered to be part of the block's contents). The =for marker is
followed by the name of the block and optional configuration
information. The general syntax is:

    =for BLOCK_TYPE  OPTIONAL CONFIG INFO
    =                OPTIONAL EXTRA CONFIG INFO
    BLOCK DATA

For example:

    =for table  :caption<Table of Contents>
        Constants           1
        Variables           10
        Subroutines         33
        Everything else     57

    =for Name  :required
    =          :width(50)
    The applicant's full name

    =for Contact  :optional
    The applicant's contact details

Abbreviated blocks are introduced by an '=' sign in the
first column, which is followed immediately by the typename of the
block. The rest of the line is treated as block data, rather than as
configuration. The content terminates at the next Pod directive or the
first blank line (which is not part of the block data). The general
syntax is:

    =BLOCK_TYPE  BLOCK DATA
    MORE BLOCK DATA

For example:

    =table
        Constants           1
        Variables           10
        Subroutines         33
        Everything else     57

    =Name     The applicant's full name
    =Contact  The applicant's contact details

Note that abbreviated blocks cannot specify configuration information. If
configuration is required, use a =for or =begin/=end block instead.
The three block specifications (delimited, paragraph, and abbreviated) are treated identically by the underlying documentation model, so you can use whichever form is most convenient for a particular documentation task. In the descriptions that follow, the abbreviated form will generally be used, but should be read as standing for all three forms equally.
For example, although Headings shows only:

    =head1 Top Level Heading

this automatically implies that you could also write that block as:

    =for head1
    Top Level Heading

or:

    =begin head1
    Top Level Heading
    =end head1

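The equivalence of the three forms can be illustrated with a small parser sketch. This is a minimal Python illustration under stated assumptions, not any real Perldoc implementation: the names parse_block and _until_terminator are hypothetical, the input is assumed to be exactly one well-formed block with its directive in the first column, and configuration is kept as an unparsed string.

```python
def _until_terminator(lines):
    """Collect data lines up to the first blank line or next Pod directive."""
    data = []
    for line in lines:
        if not line.strip() or line.startswith("="):
            break
        data.append(line)
    return data


def parse_block(lines):
    """Parse ONE Pod block (a list of lines) into (typename, config, contents)."""
    first, rest = lines[0], list(lines[1:])
    words = first.split(None, 2)
    if words[0] in ("=begin", "=for"):
        typename = words[1]
        config_parts = words[2:]  # the rest of the marker line is configuration
        # Configuration may continue on lines starting with "=" plus whitespace.
        while rest and rest[0][:2] in ("= ", "=\t"):
            config_parts.append(rest.pop(0)[1:].strip())
        if words[0] == "=begin":
            # Delimited form: contents run up to the matching =end marker.
            body = rest[:rest.index("=end " + typename)]
        else:
            # Paragraph form: contents end at a blank line or directive.
            body = _until_terminator(rest)
    else:
        # Abbreviated form: "=TYPENAME data..." with no configuration.
        head = first[1:].split(None, 1)
        typename, config_parts = head[0], []
        body = _until_terminator(head[1:] + rest)
    return typename, " ".join(config_parts), body
```

With this sketch, the abbreviated, paragraph, and delimited spellings of the heading above all parse to the same triple, which is the property the document model relies on.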
Pod predefines a small number of standard configuration options that can be applied uniformly to built-in block types. These include:
:nested
This option specifies that the block is to be nested within its current
context. For example, nesting might be applied to block quotes, to textual
examples, or to commentaries. In addition, the =code, =item, =input, and =output
blocks all have implicit nesting.
Nesting of blocks is usually rendered by adding extra indentation to the block contents, but may also be indicated in other ways: by boxing the contents, by changing the font or size of the nested text, or even by folding the text (so long as a visible placeholder is provided).
Occasionally it is desirable to nest content by more than one level:

    =begin para :nested
    =begin para :nested
    =begin para :nested
    "We're going deep, deep, I<deep> undercover!"
    =end para
    =end para
    =end para

This can be simplified by giving the :nested option a positive integer
value:

    =begin para :nested(3)
    "We're going deep, deep, I<deep> undercover!"
    =end para

You can also give the option a value of zero, to defeat any implicit nesting that might normally be applied to a paragraph. For example, to specify a block of code that should appear without its usual nesting:

    =comment Don't nest this code block in the usual way...
    =begin code :nested(0)

             1         2         3         4         5         6
    123456789012345678901234567890123456789012345678901234567890
    |------|-----------------------|---------------------------|
      line        instruction                comments
     number           code

    =end code

Note that :!nested could also be used for this purpose:

    =begin code :!nested
:numbered
This option specifies that the block is to be numbered. The most common use of this option is to create numbered headings and ordered lists, but it can be applied to any block.
It is up to individual renderers to decide how to display any numbering associated with other types of blocks.
:term
This option specifies that a list item is the definition of a term.
:formatted
This option specifies that the contents of the block should be treated as if they had one or more formatting codes placed around them.
For example, instead of:

    =for comment
        The next para is both important and fundamental,
        so doubly emphasize it...

    =begin para
    B<I<
        Warning: Do not immerse in water. Do not expose to bright light.
        Do not feed after midnight.
    >>
    =end para

you can just write:

    =begin para :formatted<B I>
        Warning: Do not immerse in water. Do not expose to bright light.
        Do not feed after midnight.
    =end para

The internal representations of these two versions are exactly the same,
except that the second one retains the :formatted option information
as part of the resulting block object.
Like all formatting codes, codes applied via a :formatted option are
inherently cumulative. For example, if the block itself is already
inside a formatting code, that formatting code will still apply, in
addition to the extra "basis" and "important" formatting specified by
:formatted<B I>.
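The effect of :formatted can be pictured as wrapping the block contents in the listed codes, outermost first. The following is a minimal Python sketch of that idea only: the function name apply_formatted is hypothetical, and the sketch ignores the question of choosing wider delimiters when the text itself contains angle brackets.

```python
def apply_formatted(codes: str, text: str) -> str:
    """Wrap text in the given formatting codes, the first code outermost."""
    # Wrap from the innermost code outwards, so "B I" yields B<I<...>>.
    for code in reversed(codes.split()):
        text = f"{code}<{text}>"
    return text
```

For example, apply_formatted("B I", "Warning") builds the same nesting as writing the B<> and I<> codes by hand around the paragraph text.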
:like
This option specifies that a block or config has the same formatting properties as the type named by its value. This is useful for creating related configurations. For example:
=config head2 :like<head1> :formatted<I>
:allow
This option expects a list of formatting codes that are to be recognized within any V<>
codes that appear in (or are implicitly applied to)
the current block. The option is most often used on =code blocks to
allow mark-up within those otherwise verbatim blocks, though it can be
used in any block that contains verbatim text. See Formatting
within code blocks.
Pod offers notations for specifying a range of standard block types...
Pod provides an unlimited number of levels of heading, specified by the
=headN block marker. For example:
=head1 A Top Level Heading
=head2 A Second Level Heading
=head3 A third level heading
=head86 A "Missed it by I<that> much!" heading
While Pod parsers are required to recognize and distinguish all levels of heading, Pod renderers are only required to provide distinct renderings of the first four levels of heading (though they may, of course, provide more than that). Headings at levels without distinct renderings would typically be rendered like the lowest distinctly rendered level.
You can specify that a heading is numbered using the :numbered option. For
example:

    =for head1 :numbered
    The Problem

    =for head1 :numbered
    The Solution

    =for head2 :numbered
    Analysis

    =for head3
    Overview

    =for head3
    Details

    =for head2 :numbered
    Design

    =for head1 :numbered
    The Implementation

which would produce:

    1. The Problem
    2. The Solution
    2.1. Analysis
    Overview
    Details
    2.2. Design
    3. The Implementation

It is usually better to preset a numbering scheme for each heading level, in a series of configuration blocks:

    =config head1 :numbered
    =config head2 :numbered
    =config head3 :!numbered

    =head1 The Problem
    =head1 The Solution
    =head2 Analysis
    =head3 Overview
    =head3 Details
    =head2 Design
    =head1 The Implementation

Alternatively, as a short-hand, if the first whitespace-delimited word
in a heading consists of a single literal # character, the # is
removed and the heading is treated as if it had a :numbered option:

    =head1 # The Problem
    =head1 # The Solution
    =head2 # Analysis
    =head3 Overview
    =head3 Details
    =head2 # Design
    =head1 # The Implementation

Note that, even though renderers are not required to distinctly render more than the first four levels of heading, they are required to correctly honour arbitrarily nested numberings. That is:
=head6 # The Rescue of the Kobayashi Maru
should produce something like:
2.3.8.6.1.9. The Rescue of the Kobayashi Maru
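Nested heading numbers of this kind can be produced with one counter per heading level. The sketch below is a hypothetical Python illustration, not a prescribed algorithm: the function name number_headings and the tuple layout are assumptions, unnumbered headings are assumed not to advance any counter, and the "N.N. Title" output style is only one rendering a renderer might choose.

```python
def number_headings(headings):
    """Render (level, numbered, title) tuples with hierarchical numbers."""
    counters = []  # counters[i] holds the current count for level i + 1
    rendered = []
    for level, numbered, title in headings:
        if not numbered:
            rendered.append(title)
            continue
        # Make sure a counter exists for every level up to this one.
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        # Deeper levels restart under the new heading.
        del counters[level:]
        prefix = ".".join(str(count) for count in counters)
        rendered.append(f"{prefix}. {title}")
    return rendered
```

Because the counter list simply grows with the heading level, the same loop handles a =head6 as readily as a =head2, independently of how many levels the renderer displays distinctly.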
Ordinary paragraph blocks consist of text that is to be formatted into a document at the current level of nesting, with whitespace squeezed, lines filled, and any special inline mark-up applied.
Ordinary paragraphs consist of one or more consecutive lines of text, each of which starts with a non-whitespace character at column 1. The paragraph is terminated by the first blank line or block directive. For example:

    =head1 This is a heading block

    This is an ordinary paragraph.
    Its text will be squeezed and
    short lines filled. It is terminated by
    the first blank line.

    This is another ordinary paragraph.
    Its text will also be squeezed and
    short lines filled. It is terminated by
    the trailing directive on the next line.
    =head2 This is another heading block

Within a =pod, =item, =nested, or =END block, ordinary
paragraphs do not require an explicit marker or delimiters, but there is
also an explicit para marker (which may be used anywhere):

    =para
    This is an ordinary paragraph.
    Its text will be squeezed and
    short lines filled.

and likewise the longer =for and =begin/=end forms. For example:

    =begin para
        This is an ordinary paragraph.
        Its text will be squeezed and
        short lines filled.
    =end para

As the previous example implies, when any form of explicit para block
is used, any whitespace at the start of each line is removed, so
the paragraph text no longer has to begin at column 1.
Code blocks are used to specify pre-formatted text (typically source code), which should be rendered without rejustification, without whitespace-squeezing, and without recognizing any inline formatting codes. Code blocks also have an implicit nesting associated with them. Typically these blocks are used to show examples of code, mark-up, or other textual specifications, and are rendered using a fixed-width font.
A code block may be implicitly specified as one or more lines of text, each of which starts with a whitespace character. The block is terminated by a blank line. For example:

    This ordinary paragraph introduces a code block:

        $this = 1 * code('block');
        $which.is_specified(:by<indenting>);

Implicit code blocks may only be used within =pod, =item,
=nested, or =END blocks.
There is also an explicit =code block (which can be specified within
any other block type, not just =pod, =item, etc.):

    The C<loud_update()> subroutine adds feedback:

    =begin code
    sub loud_update ($who, $status) {
        say "$who -> $status";
        silent_update($who, $status);
    }
    =end code

As the previous example demonstrates, within an explicit =code block
the code can start at the first column. Furthermore, lines that start
with whitespace characters have that whitespace preserved exactly (in
addition to the implicit nesting of the code). Explicit =code blocks may
also contain empty lines.
Although =code blocks automatically disregard all formatting
codes, occasionally you may still need to specify
some formatting within a code block. For example, you may wish
to emphasize a particular keyword in an example (using a B<> code). Or
you may want to indicate that part of the example is metasyntactic
(using the R<> code). Or you might need to insert a non-ASCII
character (using the E<> code).
You can specify a list of formatting codes that should still be
recognized within a code block using the :allow option. The value of
the :allow option must be a list of the (single-letter) names of one
or more formatting codes. Those codes will then remain active inside the
code block. For example:

    =begin code :allow< B R >
    sub demo {
        B<say> 'Hello R<name>';
    }
    =end code

would be rendered:

    sub demo {
        say 'Hello name';
    }

Although code blocks are verbatim by default, it can still occasionally
be useful to explicitly :allow the verbatim formatting code (V<>). That's
because, although the contents of an explicit =code block are allowed to
start in column 1, they are not allowed to start with
an equals sign in that first column[2]. So, if an = is needed in column 1,
it must be declared verbatim:

    =begin code :allow<V>
    V<=> in the first column is always a Perldoc directive
    =end code

Pod also provides blocks for specifying the input and output of programs.
The =input block is used to specify pre-formatted keyboard input,
which should be rendered without rejustification or squeezing of whitespace.
The =output block is used to specify pre-formatted terminal or file
output which should also be rendered without rejustification or
whitespace-squeezing.
Note that, like =code blocks, both =input and =output blocks have an
implicit level of nesting. They are also like =code blocks in that they
are typically rendered in a fixed-width font, though ideally all three blocks
would be rendered in distinct font/weight combinations (for example: regular
serifed for code, bold sans-serif for input, and regular sans-serif for
output).
Unlike =code blocks, both =input and =output blocks honour any
nested formatting codes. This is particularly useful since a sample of
input will often include prompts (which are, of course, output).
Likewise a sample of output may contain the occasional interactive
component. Pod provides special formatting codes
(K<> and T<>) to indicate embedded input or output, so you can use
the block type that indicates the overall purpose of the sample (i.e. is
it demonstrating an input operation or an output sequence?) and then use
the "contrasting" formatting code within the block.
For example, to include a small amount of input in a sample of output:

    =begin output
        Name:    Baracus, B.A.
        Rank:    Sgt
        Serial:  1PTDF007

        Do you want additional personnel details? K<y>

        Height:  180cm/5'11"
        Weight:  104kg/230lb
        Age:     49

        Print? K<n>
    =end output

Lists in Pod are specified as a series of contiguous =item blocks. No
special "container" directives or other delimiters are required to
enclose the entire list. For example:

    The seven suspects are:

    =item  Happy
    =item  Dopey
    =item  Sleepy
    =item  Bashful
    =item  Sneezy
    =item  Grumpy
    =item  Keyser Soze

List items have one implicit level of nesting:
The seven suspects are:
- Happy
- Dopey
- Sleepy
- Bashful
- Sneezy
- Grumpy
- Keyser Soze
Lists may be multi-level, with items at each level specified using the
=item1, =item2, =item3, etc. blocks. Note that =item is just
an abbreviation for =item1. For example:

    =item1  Animal
    =item2     Vertebrate
    =item2     Invertebrate

    =item1  Phase
    =item2     Solid
    =item2     Liquid
    =item2     Gas
    =item2     Chocolate

which would be rendered something like:
• Animal
– Vertebrate
– Invertebrate
• Phase
– Solid
– Liquid
– Gas
– Chocolate
It is an error for a "level-N+1" =item block (e.g. an =item2,
=item3, etc.) to appear anywhere except where there is a preceding
"level-N" =item. That is, an =item3 can only be specified if an
=item2 appears somewhere before it, and that =item2 can only
appear if there is a preceding =item1.
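This ordering rule is easy to check in a single pass. The following Python sketch is illustrative only: the name invalid_items is hypothetical, the input is assumed to be the sequence of item levels from one list in document order, and an item that violates the rule is assumed not to license deeper items after it.

```python
def invalid_items(levels):
    """Return the indices of items that lack a preceding item one level up."""
    seen = set()  # levels that have legitimately appeared so far
    bad = []
    for index, level in enumerate(levels):
        if level > 1 and (level - 1) not in seen:
            bad.append(index)
        else:
            seen.add(level)
    return bad
```

For instance, the level sequence [1, 3] reports index 1, because the =item3 has no =item2 before it.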
Note that item blocks within the same list are not physically nested. That is, lower-level items should not be specified inside higher-level items:

    =comment WRONG...

    =begin item1          --------------
    The choices are:                    |
    =item2 Liberty        ==< Level 2   |==< Level 1
    =item2 Death          ==< Level 2   |
    =item2 Beer           ==< Level 2   |
    =end item1            --------------

    =comment CORRECT...

    =begin item1          ---------------
    The choices are:                     |==< Level 1
    =end item1            ---------------
    =item2 Liberty        ==================< Level 2
    =item2 Death          ==================< Level 2
    =item2 Beer           ==================< Level 2

An item is part of an ordered list if the item has a :numbered
configuration option:

    =for item1 :numbered
    Visito

    =for item2 :numbered
    Veni

    =for item2 :numbered
    Vidi

    =for item2 :numbered
    Vici

This would produce something like:
1. Visito
1.1. Veni
1.2. Vidi
1.3. Vici
although the numbering scheme is entirely at the discretion of the renderer, so it might equally well be rendered:
1. Visito
1a. Veni
1b. Vidi
1c. Vici
or even:
A: Visito
(i) Veni
(ii) Vidi
(iii) Vici
Alternatively, if the first word of the item consists of a single #
character, the item is treated as having a :numbered option:

    =item1  # Visito
    =item2     # Veni
    =item2     # Vidi
    =item2     # Vici

To specify an unnumbered list item that starts with a literal #, either
make it verbatim:

    =item V<#> introduces a comment

or explicitly mark the item itself as being unnumbered:

    =for item :!numbered
    # introduces a comment

The numbering of successive =item1 list items increments
automatically, but is reset to 1 whenever any other kind of non-ambient
Perldoc block appears between two =item1 blocks. For example:

    The options are:

    =item1 # Liberty
    =item1 # Death
    =item1 # Beer

    The tools are:

    =item1 # Revolution
    =item1 # Deep-fried peanut butter sandwich
    =item1 # Keg

would produce:
The options are:
1. Liberty
2. Death
3. Beer
The tools are:
1. Revolution
2. Deep-fried peanut butter sandwich
3. Keg
The numbering of nested items (=item2, =item3, etc.) only resets
(to 1) when the higher-level item's numbering either resets or increments.
To prevent a numbered =item1 from resetting after a non-item block,
you can specify the :continued option:

    =for item1
    # Retreat to remote Himalayan monastery

    =for item1
    # Learn the hidden mysteries of space and time

    I<????>

    =for item1 :continued
    # Prophet!

which produces:
1. Retreat to remote Himalayan monastery
2. Learn the hidden mysteries of space and time
????
3. Prophet!
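The reset and :continued rules for top-level items can be sketched as a small state machine. This Python illustration makes several assumptions: the name number_items and the tuple layout are hypothetical, only numbered =item1 blocks are modelled, and every non-item entry stands for a non-ambient Perldoc block.

```python
def number_items(blocks):
    """Number level-1 items given (kind, text, continued) tuples."""
    count = 0
    interrupted = False  # True once a non-item block follows an item
    rendered = []
    for kind, text, continued in blocks:
        if kind != "item":
            interrupted = True
            rendered.append(text)
            continue
        # An intervening block resets the numbering unless :continued is set.
        if interrupted and not continued:
            count = 0
        interrupted = False
        count += 1
        rendered.append(f"{count}. {text}")
    return rendered
```

Feeding it the three items above, with the I<????> paragraph as the intervening block and the continued flag set on the last item, keeps the numbering running instead of restarting at 1.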
To create term/definition lists, specify the term as a configuration value of the item, and the definition as the item's contents:

    =for item  :term<MAD>
    Affected with a high degree of intellectual independence.

    =for item  :term<MEEKNESS>
    Uncommon patience in planning a revenge that is worth while.

    =for item  :term<MORAL>
    Conforming to a local and mutable standard of right.
    Having the quality of general expediency.

An item that's specified as a term can still be numbered:

    =for item :numbered :term<SELFISH>
    Devoid of consideration for the selfishness of others.

    =for item :numbered :term<SUCCESS>
    The one unpardonable sin against one's fellows.

List items that do not specify either the :numbered or :term options are
unordered. Typically, such lists are rendered with bullets. For example:

    =item1 Reading
    =item2 Writing
    =item3 'Rithmetic

might be rendered:
• Reading
— Writing
¤ 'Rithmetic
As with numbering styles, the bulleting strategy used for different levels within a nested list is entirely up to the renderer.
Use the delimited form of the =item block to specify items that
contain multiple paragraphs. For example:

    Let's consider two common proverbs:

    =begin item :numbered
    I<The rain in Spain falls mainly on the plain.>

    This is a common myth and an unconscionable slur on the Spanish
    people, the majority of whom are extremely attractive.
    =end item

    =begin item :numbered
    I<The early bird gets the worm.>

    In deciding whether to become an early riser, it is worth
    considering whether you would actually enjoy annelids for breakfast.
    =end item

    As you can see, folk wisdom is often of dubious value.

which produces:

    Let's consider two common proverbs:

      1. The rain in Spain falls mainly on the plain.

         This is a common myth and an unconscionable slur on the Spanish
         people, the majority of whom are extremely attractive.

      2. The early bird gets the worm.

         In deciding whether to become an early riser, it is worth
         considering whether you would actually enjoy annelids for breakfast.

    As you can see, folk wisdom is often of dubious value.

Any block can be nested by specifying a :nested option on it:

    =begin para :nested
        We are all of us in the gutter,E<NL>
        but some of us are looking at the stars!
    =end para

However, qualifying each nested paragraph individually quickly becomes tedious if there are many in a sequence, or if multiple levels of nesting are required:

    =begin para :nested
        We are all of us in the gutter,E<NL>
        but some of us are looking at the stars!
    =end para
    =begin para :nested(2)
        -- Oscar Wilde
    =end para

So Pod provides a =nested block that marks all its contents as being
nested:

    =begin nested
    We are all of us in the gutter,E<NL>
    but some of us are looking at the stars!
    =begin nested
    -- Oscar Wilde
    =end nested
    =end nested

Nesting blocks can contain any other kind of block, including implicit paragraph and code blocks.
Simple tables can be specified in Perldoc using a =table block.
The table may be given an associated description or title using the
:caption option.
Columns are separated by whitespace, vertical lines (|), or border
intersections (+). Rows can be specified in one of two ways: either
one row per line, with no separators; or multiple lines per row with
explicit horizontal separators (whitespace, intersections (+), or
horizontal lines: -, =, _) between every row. Either style
can also have an explicitly separated header row at the top.
Each individual table cell is separately formatted, as if it were a
nested =para.
This means you can create tables compactly, line-by-line:

    =table
        The Shoveller   Eddie Stevens     King Arthur's singing shovel
        Blue Raja       Geoffrey Smith    Master of cutlery
        Mr Furious      Roy Orson         Ticking time bomb of fury
        The Bowler      Carol Pinnsler    Haunted bowling ball

or line-by-line with multi-line headers:

    =table
        Superhero     | Secret          |
                      | Identity        | Superpower
        ==============|=================|================================
        The Shoveller | Eddie Stevens   | King Arthur's singing shovel
        Blue Raja     | Geoffrey Smith  | Master of cutlery
        Mr Furious    | Roy Orson       | Ticking time bomb of fury
        The Bowler    | Carol Pinnsler  | Haunted bowling ball

or with multi-line headers and multi-line data:

    =begin table :caption('The Other Guys')

                        Secret
        Superhero       Identity          Superpower
        =============   ===============   ===================
        The Shoveller   Eddie Stevens     King Arthur's
                                          singing shovel

        Blue Raja       Geoffrey Smith    Master of cutlery

        Mr Furious      Roy Orson         Ticking time bomb
                                          of fury

        The Bowler      Carol Pinnsler    Haunted bowling ball

    =end table

Blocks whose names are not recognized as Pod built-ins are assumed to be destined for specialized renderers or parser plug-ins. For example:

    =begin Xhtml
    <object type="video/quicktime" data="onion.mov">
    =end Xhtml

or:
=Image http://www.perlfoundation.org/images/perl_logo_32x104.png
Named blocks are converted by the Perldoc parser to block objects;
specifically, to objects of a subclass of the standard
Perldoc::Block::Named class.
For example, the blocks of the previous example would be converted to
objects of the classes Perldoc::Block::Named::Xhtml and
Perldoc::Block::Named::Image respectively. Both of those classes
would be automatically created as subclasses of the
Perldoc::Block::Named class (unless they were already defined via a
prior =use directive).
The resulting object's .typename method retrieves the short name of
the block type: 'Xhtml', 'Image', etc. The object's .config
method retrieves the list of configuration options (if any). The
object's .contents method retrieves a list of the block's
verbatim contents.
Named blocks for which no explicit class has been defined or loaded are usually not rendered by the standard renderers.
Note that all block names consisting entirely of lower-case or entirely of upper-case letters are reserved. See Semantic blocks.
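The naming rules for named blocks can be summarised in a few lines. This is a hypothetical Python sketch, not part of any Perldoc parser: the names is_reserved and named_block_class are assumptions, and digits in a typename (as in head1) are assumed not to affect whether it counts as entirely lower-case or upper-case.

```python
NAMED_PREFIX = "Perldoc::Block::Named::"


def is_reserved(typename: str) -> bool:
    """Entirely lower-case or entirely upper-case typenames are reserved."""
    # str.islower/isupper ignore digits, so "head1" counts as lower-case.
    return typename.islower() or typename.isupper()


def named_block_class(typename: str) -> str:
    """Return the class name a user-defined named block would map to."""
    if is_reserved(typename):
        raise ValueError(f"{typename!r} is a reserved block typename")
    return NAMED_PREFIX + typename
```

Under these assumptions, named_block_class("Xhtml") yields Perldoc::Block::Named::Xhtml, while a typename such as SYNOPSIS is rejected as reserved.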
Comments are Pod blocks that are never rendered by any renderer. They are, of course, still included in any internal Perldoc representation, and are accessible via the Perldoc API.
Comments are useful for meta-documentation (documenting the documentation):
=comment Add more here about the algorithm
and for temporarily removing parts of a document:

    =item # Retreat to remote Himalayan monastery
    =item # Learn the hidden mysteries of space and time
    =item # Achieve enlightenment
    =begin comment
    =item # Prophet!
    =end comment

Note that, since the Perl interpreter never executes embedded Perldoc
blocks, comment blocks can also be used as (nestable!) block comments
in Perl 6:

    =begin comment

    for my $file (@files) {
        system("rm -rf $file");
    }

    =end comment

=END block
The =END block is special in that all three of its forms
(delimited, paragraph, and abbreviated) are terminated only by the end of the
current file. That is, neither =END nor =for END is terminated by the
next blank line, and =end END has no effect within a =begin END block.
A warning is issued if an explicit =end END appears within a document.
An =END block indicates the end-point of any ambient material within
the document. This means that the parser will treat all the remaining
text in the file as Perldoc, even if it is not inside an explicit block. In
other words, apart from its special end-of-file termination behaviour,
an =END block is in all other respects identical to a =pod block.
Named Perldoc blocks whose typename is DATA are the Perl 6 equivalent of
the Perl 5 __DATA__ section. The difference is that =DATA blocks are
just regular Pod blocks and may appear anywhere within a source file, and as
many times as required.
Synopsis 2 describes the new Perl 6 interface for inline data.
All other uppercase block typenames are reserved for specifying standard documentation, publishing, or source components. In particular, all the standard components found in Perl and manpage documentation have reserved uppercase typenames.
Standard semantic blocks include:

    =NAME
    =VERSION
    =SYNOPSIS
    =DESCRIPTION
    =USAGE
    =INTERFACE
    =METHOD
    =SUBROUTINE
    =OPTION
    =DIAGNOSTIC
    =ERROR
    =WARNING
    =DEPENDENCY
    =BUG
    =SEEALSO
    =ACKNOWLEDGEMENT
    =AUTHOR
    =COPYRIGHT
    =DISCLAIMER
    =LICENCE
    =LICENSE
    =TITLE
    =SECTION
    =CHAPTER
    =APPENDIX
    =TOC
    =INDEX
    =FOREWORD
    =SUMMARY

The plural forms of each of these keywords are also reserved, and are aliases for the singular forms.
Most of these blocks would typically be used in their full delimited forms:

    =begin SYNOPSIS
        use Perldoc::Parser;

        my Perldoc::Parser $parser .= new();

        my $tree = $parser.parse($fh);
    =end SYNOPSIS

The use of these reserved keywords is not required; you can still just write:

    =head1 SYNOPSIS
    =begin code
        use Perldoc::Parser;

        my Perldoc::Parser $parser .= new();

        my $tree = $parser.parse($fh);
    =end code

However, using the keywords adds semantic information to the documentation, which may assist various renderers, summarizers, coverage tools, and other utilities.
Note that there is no requirement that semantic blocks be rendered in a
particular way (or at all). Specifically, it is not necessary to
preserve the capitalization of the keyword. For example, the
=SYNOPSIS block of the preceding example might be rendered like so:

    3. Synopsis

        use Perldoc::Parser;

        my Perldoc::Parser $parser .= new();

        my $tree = $parser.parse($fh);

Formatting codes provide a way to add inline mark-up to a piece of text within the contents of (most types of) block. Formatting codes are themselves a type of block, and most of them may nest sequences of any other type of block (most often, other formatting codes). In particular, you can nest comment blocks in the middle of a formatting code:

    B<I shall say this loudly
    =begin comment
    and repeatedly
    =end comment
    and with emphasis.>

All Pod formatting codes consist of a single capital letter followed
immediately by a set of angle brackets. The brackets contain the text or
data to which the formatting code applies. You can use a set of single
angles (<...>), a set of double angles («...»), or multiple
single-angles (<<<...>>>).
Within angle delimiters, you cannot use sequences of the same angle characters that are longer than the delimiters:

    =comment
        These are errors...

    C< $foo<<bar>> >
    The Perl 5 heredoc syntax was: C< <<END_MARKER >

You can use sequences of angles that are the same length as the delimiters, but they must be balanced. For example:

    C<  $foo<bar>   >
    C<< $foo<<bar>> >>

If you need an unbalanced angle, either use different delimiters:

    C«$foo < $bar»
    The Perl 5 heredoc syntax was: C« <<END_MARKER »

or delimiters with more consecutive angles than your text contains:

    C<<$foo < $bar>>
    The Perl 5 heredoc syntax was: C<<< <<END_MARKER >>>

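Choosing a safe delimiter width can be automated. The following Python sketch is a hypothetical helper (the name wrap_code is an assumption, not part of Pod): it always uses one more angle than the longest run in the text, which is more conservative than the rules strictly require, and it pads with spaces when the text would otherwise run into the delimiters, on the assumption that such padding is acceptable as in the examples above.

```python
import re


def wrap_code(text: str) -> str:
    """Wrap text in a C<> code whose delimiters out-number any angle run."""
    # Find every run of consecutive "<" or ">" characters in the text.
    runs = re.findall(r"<+|>+", text)
    longest = max((len(run) for run in runs), default=0)
    width = longest + 1
    # Keep the text's own angles from merging with the delimiters.
    if text.startswith("<") or text.endswith(">"):
        text = f" {text} "
    return "C" + "<" * width + text + ">" * width
```

Applied to the two texts used above, the helper reproduces the hand-written forms: double angles around the comparison, and triple angles with padding around the heredoc marker.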
A formatting code ends at the matching closing angle bracket(s), or at the end of the enclosing block or formatting code in which the opening angle bracket was specified, whichever comes first. Pod parsers are required to issue a warning whenever a formatting code is terminated by the end of an outer block rather than by its own delimiter (unless the user explicitly disables the warning).
Pod provides three formatting codes that flag their contents with increasing levels of significance:
The U<> formatting code specifies that the contained text is
unusual or distinctive; that it is of minor significance. Typically
such content would be rendered in an underlined style.
The I<> formatting code specifies that the contained text is
important; that it is of major significance. Such content would
typically be rendered in italics or in <em>...</em> tags.
The B<> formatting code specifies that the contained text is the
basis or focus of the surrounding text; that it is of fundamental
significance. Such content would typically be rendered in a bold style or
in <strong>...</strong> tags.
The D<>
formatting code indicates that the contained text is a
definition, introducing a term that the adjacent text
elucidates. For example:
There ensued a terrible moment of D<coyotus interruptus>: a brief suspension of the effects of gravity, accompanied by a sudden to-the-camera realisation of imminent downwards acceleration.
A definition may be given synonyms, which are specified after a vertical bar and separated by semicolons:
A D<Formatting code|formatting codes;formatters> provides a way to add inline mark-up to a piece of text.
A definition would typically be rendered in italics or <dfn>...</dfn>
tags and will often be used as a link target for subsequent instances of the
term (or any of its specified synonyms) within a hypertext.
Perldoc provides formatting codes for specifying inline examples of input, output, code, and metasyntax:
The T<> formatting code specifies that the contained text is
terminal output; that is: something that a program might print out.
Such content would typically be rendered in a fixed-width font or with
<samp>...</samp> tags. The contents of a T<> code are always
space-preserved (as if they had an implicit S<...> around them).
The T<> code is the inline equivalent of the =output block.
The K<>
formatting code specifies that the contained text is
keyboard input; that is: something that a user might type in. Such
content would typically be rendered in a fixed-width font (preferably a
different font from that used for the T<>
formatting code) or with
<kbd>...</kbd>
tags. The contents of a K<>
code are always
space-preserved. The K<>
code is the
inline equivalent of the =input
block.
The C<> formatting code specifies that the contained text is code;
that is, something that might appear in a program or specification. Such
content would typically be rendered in a fixed-width font (preferably
a different font from that used for the T<> or K<> formatting
codes) or with <code>...</code> tags. The contents of a C<> code
are space-preserved and verbatim. The C<> code is the inline
equivalent of the =code block.
To include other formatting codes in a C<>
code, you can lexically
reconfigure it:
=begin para
=config C<> :allow<E I>
Perl 6 makes extensive use of the C<E<laquo>> and C<E<raquo>>
characters, for example, in a hash look-up:
C<%hashI<E<laquo>>keyI<E<raquo>>>
=end para
To enable entities in every C<...>, put a =config C<> :allow<E>
at the top of the document.
The R<>
formatting code specifies that the contained text is a
replaceable item, a placeholder, or a metasyntactic variable. It is
used to indicate a component of a syntax or specification that should
eventually be replaced by an actual value. For example:
The basic C<ln> command is: C<ln> R<source_file> R<target_file>
or:
Then enter your details at the prompt:

=for input
    Name: R<your surname>
      ID: R<your employee number>
    Pass: R<your 36-letter password>
Typically replaceables would be rendered in fixed-width italics or with
<var>...</var>
tags. The font used should be the same as that used for
the C<>
code, unless the R<>
is inside a K<>
or T<>
code (or
the equivalent =input
or =output
blocks), in which case their
respective fonts should be used.
The V<>
formatting code treats its entire contents as being verbatim,
disregarding every apparent formatting code within it. For example:
The B<V< V<> >> formatting code disarms other codes such as V< I<>, C<>, B<>, and M<> >.
Note, however, that the V<> code only changes the way its
contents are parsed, not the way they are rendered. That is, the
contents are still wrapped and formatted like plain text, and the
effects of any formatting codes surrounding the V<> code
are still applied to its contents. For example, the previous example
is rendered:
The V<> formatting code disarms other codes such as I<>, C<>, B<>, and M<> .
You can prespecify formatting codes that remain active within
a V<>
code, using the :allow
option.
The Z<>
formatting code indicates that its contents constitute a
zero-width comment, which should not be rendered by any renderer.
For example:
The "exeunt" command Z<Think about renaming this command?> is used to quit all applications.
In Perl 5 POD, the Z<>
code was widely used to break up text that would
otherwise be considered mark-up:
In Perl 5 POD, the ZZ<><> code was widely used to break up text that would otherwise be considered mark-up.
That technique still works, but it's now easier to accomplish the same goal using a verbatim formatting code:
In Perl 5 POD, the V<Z<>> code was widely used to break up text that would otherwise be considered mark-up.
Moreover, the C<>
code automatically treats its contents as being
verbatim, which often eliminates the need for the V<>
as well:
In Perl 5 POD, the C<Z<>> code was widely used to break up text that would otherwise be considered mark-up.
The Z<>
formatting code is the inline equivalent of a
=comment
block.
The L<>
code is used to specify all kinds of links, filenames, citations,
and cross-references (both internal and external).
A link specification consists of a scheme specifier terminated by a colon, followed by an external address (in the scheme's preferred syntax), followed by an internal address (again, in the scheme's syntax). All three components are optional, though at least one must be present in any link specification.
Usually, in schemes where an internal address makes sense, it will be
separated from the preceding external address by a #, unless the
particular addressing scheme requires some other syntax. When new
addressing schemes are created specifically for Perldoc it is strongly
recommended that # be used to mark the start of internal addresses.
Standard schemes include:
http:
and https:
A standard web URL. For example:
This module needs the LAME library (available from L<http://www.mp3dev.org/mp3/>)
If the link does not start with //
it is treated as being relative to
the location of the current document:
See also: L<http:tutorial/faq.html> and L<http:../examples/index.html>
file:
A filename on the local system. For example:
Next, edit the global config file (that is, either L<file:/usr/local/lib/.configrc> or L<file:~/.configrc>).
Filenames that don't begin with a /
or a ~
are relative to the current
document's location:
Then, edit the local config file (that is, either L<file:.configrc> or L<file:CONFIG/.configrc>).
mailto:
An email address. Typically, activating this type of link invokes a mailer. For example:
Please forward bug reports to L<mailto:devnull@rt.cpan.org>
man:
A link to the system manpages. For example:
This module implements the standard Unix L<man:find(1)> facilities.
doc:
A link to some other documentation, typically a module or part of the core documentation. For example:
You may wish to use L<doc:Data::Dumper> to view the results. See also: L<doc:perldata>.
defn:
A link to the definition of the specified term within the current document. For example:
He was highly prone to D<lexiphania>: an unfortunate proclivity for employing sesquipedalian words (such as "proclivity", "sesquipedalian", and indeed "lexiphania").
and later, to link back to the definition:
To treat his chronic L<defn:lexiphania> the doctor prescribed an immediate glossectomy or, if that proved ineffective, a complete cephalectomy.
isbn:
and issn:
The International Standard Book Number or International Standard Serial Number for a publication. For example:
The Perl Journal was a registered serial publication (L<issn:1087-903X>)
To refer to a specific section within a webpage, manpage, or Perldoc
document, add the name of that section after the main link, separated by
a #. For example:
Also see: L<man:bash(1)#Compound Commands>, L<doc:perlsyn#For Loops>, and L<http://dev.perl.org/perl6/syn/S04.html#The_for_statement>
To refer to a section of the current document, omit the external address:
This mechanism is described under L<doc:#Special Features> below.
The scheme name may also be omitted in that case:
This mechanism is described under L<#Special Features> below.
Normally a link is presented as some rendered version of the link specification itself. However, you can specify an alternate presentation by prefixing the link with the desired text and a vertical bar. Whitespace is not significant on either side of the bar. For example:
This module needs the L<LAME library|http://www.mp3dev.org/mp3/>. You could also write the code L<in Latin | doc:Lingua::Romana::Perligata>
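The structure of a link specification (optional display text, optional scheme, external address, optional internal address) can be made concrete with a small Python sketch. It is an illustration only: the names Link and parse_link are hypothetical, it assumes that scheme names are lower-case, and it assumes that # always introduces the internal address, which the text above says individual schemes may override.

```python
import re
from typing import NamedTuple, Optional

class Link(NamedTuple):
    text: Optional[str]      # alternate presentation, if any
    scheme: Optional[str]    # e.g. "http", "doc", "man"
    external: str            # address of the target document
    internal: Optional[str]  # section within the target

# Assumption: schemes are lower-case, so "doc:Data::Dumper" has scheme "doc".
_SCHEME = re.compile(r"([a-z][a-z0-9+.-]*):")

def parse_link(contents):
    """Split the contents of an L<> code into its components."""
    text = None
    if "|" in contents:
        # Whitespace is not significant on either side of the bar.
        text, contents = (part.strip() for part in contents.split("|", 1))
    else:
        contents = contents.strip()
    scheme = None
    match = _SCHEME.match(contents)
    if match:
        scheme = match.group(1)
        contents = contents[match.end():]
    external, hash_mark, internal = contents.partition("#")
    return Link(text, scheme, external, internal if hash_mark else None)

print(parse_link("in Latin | doc:Lingua::Romana::Perligata"))
print(parse_link("#Special Features"))
```

A real parser would also have to check that at least one of the three components is present.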
A second kind of link—the P<>
or placement link—works in the
opposite direction. Instead of directing focus out to another document,
it allows you to draw the contents of another document into your own.
In other words, the P<>
formatting code takes a URI and (where possible)
places the contents of the corresponding document inline in place of the
code itself.
P<>
codes are handy for breaking out standard elements of
your documentation set into reusable components that can then be
incorporated directly into multiple documents. For example:
=COPYRIGHT P<file:/shared/docs/std_copyright.pod>
=DISCLAIMER P<http://www.MegaGigaTeraPetaCorp.com/std/disclaimer.txt>
might produce:
Copyright
This document is copyright (c) MegaGigaTeraPetaCorp, 2006. All rights reserved.
Disclaimer
ABSOLUTELY NO WARRANTY IS IMPLIED. NOT EVEN OF ANY KIND. WE HAVE SOLD YOU THIS SOFTWARE WITH NO HINT OF A SUGGESTION THAT IT IS EITHER USEFUL OR USABLE. AS FOR GUARANTEES OF CORRECTNESS...DON'T MAKE US LAUGH! AT SOME TIME IN THE FUTURE WE MIGHT DEIGN TO SELL YOU UPGRADES THAT PURPORT TO ADDRESS SOME OF THE APPLICATION'S MANY DEFICIENCIES, BUT NO PROMISES THERE EITHER. WE HAVE MORE LAWYERS ON STAFF THAN YOU HAVE TOTAL EMPLOYEES, SO DON'T EVEN *THINK* ABOUT SUING US. HAVE A NICE DAY.
If a renderer cannot find or access the external data source for a placement link, it must issue a warning and render the URI directly in some form, possibly as an outwards link. For example:
Copyright
See: std_copyright.pod
Disclaimer
See: http://www.MegaGigaTeraPetaCorp.com/std/disclaimer.txt
Any text enclosed in an S<>
code is formatted normally, except that
every whitespace character in it—including any newline—is preserved.
These characters are also treated as being non-breaking (except for the
newlines, of course). For example:
The emergency signal is: S<
dot dot dot dash dash dash dot dot dot>.
would be formatted like so:
The emergency signal is:
dot dot dot dash dash dash dot dot dot.
rather than:
The emergency signal is: dot dot dot dash dash dash dot dot dot.
To include named Unicode or XHTML entities, use the E<>
code.
If the contents of the E<>
are a number, that number is
treated as the decimal Unicode value for the desired codepoint.
For example:
Perl 6 makes considerable use of E<171> and E<187>.
You can also use explicit binary, octal, decimal, or hexadecimal numbers (using the Perl 6 notations for explicitly based numbers):
Perl 6 makes considerable use of E<0b10101011> and E<0b10111011>.
Perl 6 makes considerable use of E<0o253> and E<0o273>.
Perl 6 makes considerable use of E<0d171> and E<0d187>.
Perl 6 makes considerable use of E<0xAB> and E<0xBB>.
If the contents are not a number, they are interpreted as a Unicode character name (which is always upper-case), or else as an XHTML entity. For example:
Perl 6 makes considerable use of E<LEFT DOUBLE ANGLE BRACKET> and E<RIGHT DOUBLE ANGLE BRACKET>.
or, equivalently:
Perl 6 makes considerable use of E<laquo> and E<raquo>.
Multiple consecutive entities can be specified in a single E<>
code,
separated by semicolons:
Perl 6 makes considerable use of E<laquo;hellip;raquo>.
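The resolution order described above (plain or explicitly based number, then upper-case Unicode character name, then XHTML entity) can be sketched in Python. This is a rough illustration rather than a reference implementation; the function names are hypothetical, and Python's unicodedata and html.entities tables stand in for whatever tables a real Perldoc parser would use.

```python
import unicodedata
from html.entities import name2codepoint

# Perl 6 prefixes for explicitly based numbers.
_BASES = {"0b": 2, "0o": 8, "0d": 10, "0x": 16}

def resolve_entity(spec):
    """Resolve one entity specification to a single character."""
    spec = spec.strip()
    if spec.isascii() and spec.isdigit():
        return chr(int(spec))                       # plain decimal codepoint
    if spec[:2] in _BASES:
        return chr(int(spec[2:], _BASES[spec[:2]]))
    if spec.isupper():                              # Unicode names are upper-case
        try:
            return unicodedata.lookup(spec)
        except KeyError:
            pass                                    # fall through to XHTML names
    if spec in name2codepoint:
        return chr(name2codepoint[spec])
    raise ValueError(f"unknown entity: {spec!r}")

def expand_entities(contents):
    """Expand the contents of an E<> code, which may hold several
    semicolon-separated entities."""
    return "".join(resolve_entity(part) for part in contents.split(";"))

print(expand_entities("laquo;hellip;raquo"))  # «…»
```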
Anything enclosed in an X<>
code is an index entry. The contents
of the code are both formatted into the document and used as the
(case-insensitive) index entry:
An X<array> is an ordered list of scalars indexed by number, starting with 0. A X<hash> is an unordered collection of scalar values indexed by their associated string key.
You can specify an index entry in which the indexed text and the index entry are different, by separating the two with a vertical bar:
An X<array|arrays> is an ordered list of scalars indexed by number, starting with 0. A X<hash|hashes> is an unordered collection of scalar values indexed by their associated string key.
In the two-part form, the index entry comes after the bar and is case-sensitive.
You can specify hierarchical index entries by separating indexing levels with commas:
An X<array|arrays, definition of> is an ordered list of scalars indexed by number, starting with 0. A X<hash|hashes, definition of> is an unordered collection of scalar values indexed by their associated string key.
You can specify two or more entries for a single indexed text, by separating the entries with semicolons:
A X<hash|hashes, definition of; associative arrays> is an unordered collection of scalar values indexed by their associated string key.
The indexed text can be empty, creating a "zero-width" index entry:
X<|puns, deliberate>This is called the "Orcish Manoeuvre" because you "OR" the "cache".
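To summarise the X<> syntax, here is a small Python sketch that splits the contents of an index code into the indexed text and a list of hierarchical entries. The function name is hypothetical, and the sketch deliberately leaves out the case-sensitivity rules described above.

```python
def parse_index_entry(contents):
    """Return (indexed_text, entries) for the contents of an X<> code.

    Each entry is a tuple of index levels, outermost level first."""
    text, bar, entries = contents.partition("|")
    if not bar:
        # One-part form: the indexed text doubles as the index entry.
        return text, [(text.strip(),)]
    parsed = [
        tuple(level.strip() for level in entry.split(","))
        for entry in entries.split(";")
        if entry.strip()
    ]
    return text, parsed

print(parse_index_entry("hash|hashes, definition of; associative arrays"))
# ('hash', [('hashes', 'definition of'), ('associative arrays',)])
```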
Anything enclosed in an N<>
code is an inline note.
For example:
Use a C<for> loop instead.N<The Perl 6 C<for> loop is far more powerful than its Perl 5 predecessor.> Preferably with an explicit iterator variable.
Renderers may render such annotations in a variety of ways: as footnotes, as endnotes, as sidebars, as pop-ups, as tooltips, as expandable tags, etc. They are never, however, rendered as unmarked inline text. So the previous example might be rendered as:
Use a for loop instead.† Preferably with an explicit iterator variable.
and later:
Footnotes
† The Perl 6 for loop is far more powerful than its Perl 5 predecessor.
Perldoc modules can define their own formatting codes,
using the M<>
code. An M<>
code must start with a
colon-terminated scheme specifier. The rest of the enclosed text is
treated as the (verbatim) contents of the formatting code. For example:
=use Perldoc::TT

=head1 Overview of the M<TT: $CLASSNAME > class
(version M<TT: $VERSION>)

M<TT: get_description($CLASSNAME) >
The M<>
formatting code is the inline equivalent of a
named block.
Internally an M<>
code is converted to an object derived from the
Perldoc::FormattingCode::Named
class. The name of the scheme becomes
the final component of the object's classname. For instance, the M<>
code in the previous example would be converted to a
Perldoc::FormattingCode::Named::TT
object, whose .typename
method retrieves the string "TT"
and whose .contents
method retrieves a list of the formatting code's (verbatim,
unformatted) contents.
If the formatting code is unrecognized, the contents of the code (i.e. everything after the first colon) would normally be rendered as ordinary text.
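The mapping from an M<> code to its class name can be illustrated with a short Python sketch. The function name and the dictionary layout are hypothetical; only the class-name prefix and the split at the first colon come from the description above.

```python
def parse_module_code(contents):
    """Split the contents of an M<> code at its first colon."""
    scheme, colon, body = contents.partition(":")
    if not colon or not scheme:
        raise ValueError("an M<> code must start with a colon-terminated scheme")
    return {
        "class": f"Perldoc::FormattingCode::Named::{scheme}",
        "typename": scheme,
        "contents": body,  # kept verbatim, including any leading space
    }

print(parse_module_code("TT: $VERSION")["class"])
# Perldoc::FormattingCode::Named::TT
```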
By default, Perldoc assumes that documents are Unicode, encoded in one of the three common schemes (UTF-8, UTF-16, or UTF-32). The particular scheme a document uses is autodiscovered by examination of the first few bytes of the file (where possible). If the autodiscovery fails, UTF-8 is assumed, and parsers may treat any non-UTF-8 bytes later in the document as fatal errors.
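One plausible way to examine "the first few bytes" is to look for a Unicode byte-order mark. The Python sketch below assumes exactly that, which is an assumption on our part since no particular detection method is mandated here; the function name is hypothetical. The UTF-32 marks are tested before the UTF-16 ones because the little-endian UTF-32 mark begins with the little-endian UTF-16 mark.

```python
import codecs

# Order matters: BOM_UTF32_LE starts with the bytes of BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def discover_encoding(head):
    """Guess a document's encoding from its first few bytes."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return "utf-8"  # the default when autodiscovery fails

print(discover_encoding(b"=begin pod"))  # utf-8
```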
At any point in a document, you can explicitly set or change the encoding
of its content using the =encoding
directive:
=encoding ShiftJIS
=encoding Macintosh
=encoding KOI8-R
The specified encoding is used from the start of the next line in
the document. If a second =encoding
directive is encountered, the
current encoding changes again after that line. Note, however, that
the second encoding directive must itself be encoded using the first
encoding scheme.
This requirement also applies to an =encoding
directive at the very
beginning of the file. That is, it must itself be encoded in
the default UTF-8, -16, or -32. However, as a special case, the
autodiscovery mechanism will (as far as possible) also attempt to
recognize "self-encoded" =encoding
directives that begin at the first
byte of the file. For example, at the start of a ShiftJIS-encoded file
you can specify =encoding ShiftJIS
in the ShiftJIS encoding.
An =encoding
directive affects any ambient code between the Perldoc
as well. That is, Perl 6 uses =encoding
directives to determine the
encoding of its source code as well as that of any documentation.
Note that =encoding is a fundamental Perldoc directive, like =begin or
=for; it is not an instance of an abbreviated block. Hence there is no
paragraph or delimited form of the =encoding directive (just as there is
no paragraph or delimited form of =begin).
The =config
directive allows you to prespecify standard configuration
information that is applied to every block of a particular type.
For example, to specify particular formatting for different levels of heading, you could preconfigure all the heading directives with appropriate formatting schemes:
=config head1  :formatted<B U>  :numbered
=config head2  :like<head1> :formatted<I>
=config head3  :formatted<U>
=config head4  :like<head3> :formatted<I>
The general syntax for configuration directives is:
=config BLOCK_TYPE  CONFIG OPTIONS
=                   OPTIONAL EXTRA CONFIG OPTIONS
Like =encoding, a =config is a directive, not a block. Hence,
there is no paragraph or delimited form of the =config directive.
Each =config
specification is lexically scoped to the surrounding
block in which it is specified.
Note that, if a particular block later explicitly specifies a configuration option with the same key, that option overrides the pre-configured option. For example, given the heading configurations in the previous example, to specify a non-basic second-level heading:
=for head2 :formatted<I U>
Details
The :like option causes the current formatting options for the
named block type to be (lexically) replaced by the complete
formatting information of the block type specified as the :like's
value. That other block type must already have been preconfigured. Any
additional formatting specifications are subsequently added to that
config. For example:
=comment In the current scope make =head2 an "important" variant of =head1

=config head2 :like<head1> :formatted<I>
Incidentally, this also means you can arrange for an explicit :formatted
option to augment an existing =config
, rather than replacing
it. Like so:
=comment Mark this =head3 (but only this one) as being important (in addition to the normal formatting)...

=head3 :like<head3> :formatted<I>
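The interaction between lexical scoping, :like, and per-block overrides can be modelled with a short Python sketch. It is one reading of the rules above, not a definitive implementation: the class and method names are hypothetical, options are modelled as keyword arguments, and list-valued options (such as :formatted) given together with :like are assumed to be appended to the copied configuration, while options given explicitly on a block simply override the preconfigured ones.

```python
import copy

class ConfigScope:
    """Configuration visible within one block (hypothetical model)."""

    def __init__(self, parent=None):
        self.parent = parent   # enclosing block's scope, if any
        self.local = {}        # block type -> options set in this scope

    def _find(self, block_type):
        scope = self
        while scope is not None:
            if block_type in scope.local:
                return scope.local[block_type]
            scope = scope.parent
        return None

    def lookup(self, block_type):
        found = self._find(block_type)
        return copy.deepcopy(found) if found is not None else {}

    def config(self, block_type, like=None, **options):
        """Model a =config directive in this scope."""
        if like is None:
            merged = self.lookup(block_type)
            merged.update(options)
        else:
            if self._find(like) is None:
                raise KeyError(f"{like} has not been preconfigured")
            merged = self.lookup(like)
            for key, value in options.items():
                if isinstance(value, list) and isinstance(merged.get(key), list):
                    merged[key] = merged[key] + value   # add to copied config
                else:
                    merged[key] = value
        self.local[block_type] = merged

    def effective(self, block_type, **explicit):
        """Options for one block; explicit options override the config."""
        options = self.lookup(block_type)
        options.update(explicit)
        return options

doc = ConfigScope()
doc.config("head1", formatted=["B", "U"], numbered=True)
doc.config("head2", like="head1", formatted=["I"])
print(doc.effective("head2"))
# {'formatted': ['B', 'U', 'I'], 'numbered': True}
```

Because each scope only records its own settings and defers to its parent otherwise, a =config inside a nested block stops applying as soon as that block ends.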
You can also lexically preconfigure a formatting code, by naming it with a pair of angles as a suffix. For example:
=comment Always allow E<> codes in any (implicit or explicit) V<> code...

=config V<> :allow<E>

=comment All inline code to be marked as important...

=config C<> :formatted<I>
Note that, even though the formatting code is named using single-angles, the preconfiguration applies regardless of the actual delimiters used on subsequent instances of the code.
Perldoc provides a mechanism by which you can extend the syntax,
semantics, or content of your documentation: the =use
directive.
Specifying a =use
causes a Perldoc processor to load the
corresponding Perldoc module at that point, or to throw an exception if
it cannot.
Such modules can specify additional content that should be included in the document. Alternatively, they can register classes that handle new types of block directives or formatting codes.
Note that a module loaded via a =use
statement can affect the
content or the interpretation of subsequent blocks, but not the
initial parsing of those blocks. Any new block types must still
conform to the general syntax described in this document. Typically, a
module will change the way that renderers parse the contents of
specific blocks.
A =use
directive may be specified with either a module name or a URI:
=use MODULE_NAME  OPTIONAL CONFIG DATA
=                 OPTIONAL EXTRA CONFIG DATA
=use URI
If a URI is given, the specified file is treated as a source of Pod
to be included in the document. Any Pod blocks are parsed out of the
contents of the =use
'd file, and added to the main file's Pod
representation at that point.
If a module name is specified, with a language prefix of pod:
, then
the corresponding .pod
file is searched for in the $PERL6DOC
"documentation path". If none is found, the corresponding .pm
file is
then searched for in the library path ($PERL6LIB
). If either file is
found, the Pod is parsed out of it and the resulting block objects
inserted into the main file's representation.
If a module name is specified with any prefix except pod:
, or without
a prefix at all, then the corresponding .pm
file (or another
language's equivalent code module) is searched for in the appropriate
module library path. If found, the code module is require'd into the Pod
parser (usually to add a class implementing a particular Pod extension).
If no such code module is found, a suitable .pod
file is searched for
instead, the contents parsed as Pod, and the resulting block objects
inserted into the main file's representation.
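The search order for the three kinds of =use argument can be summarised in a Python sketch. It is illustrative only: the function name, the set of recognised URI schemes, and the use of .pm for every non-Pod language are assumptions, and version or authority suffixes on module names are not handled.

```python
# Assumption: these prefixes mark a URI rather than a language prefix.
URI_SCHEMES = {"file", "http", "https"}

def use_search_plan(spec):
    """Return the ordered list of places a =use argument is looked up."""
    prefix, colon, rest = spec.partition(":")
    if colon and prefix in URI_SCHEMES:
        return [("uri", spec)]
    if colon and not rest.startswith(":"):
        language, name = prefix, rest   # e.g. "pod:Some::Module"
    else:
        language, name = None, spec     # e.g. "Some::Module"
    path = name.replace("::", "/")
    pod = ("documentation path", path + ".pod")
    code = ("library path", path + ".pm")
    # A pod: prefix tries the .pod file first; anything else tries code first.
    return [pod, code] if language == "pod" else [code, pod]

print(use_search_plan("pod:Some::Other::Module"))
# [('documentation path', 'Some/Other/Module.pod'), ('library path', 'Some/Other/Module.pm')]
```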
You can use fully and partially specified module names (as with Perl 6 modules):
=use Perldoc::Plugin::XHTML-1.2.1-(*)
Any options that are specified after the module name:
=use Perldoc::Plugin::Image :Jpeg prefix=>'http://dev.perl.org'
are passed to the internal require
that loads the corresponding module.
Collectively these alternatives allow you to create standard documentation inserts or stylesheets, to include Pod extracted from other code files, or to specify new types of documentation blocks and formatting codes:
To create a standard Pod insertion or stylesheet, create a .pod
file and install it in your documentation path. Load it with either:
=use Pod::Insertion::Name
or:
=use pod:Pod::Insertion::Name
or:
=use file:/full/path/spec/Pod/Insertion/Name.pod
or even:
=use http://www.website.com/Pod/Insertion/Name.pod
To insert the Pod from a .pm
file (for example, to have your class
documentation include documentation from a base class):
=use pod:Some::Other::Module
To implement a new Pod block type or formatting code, create a .pm
file
and load it with either:
=use New::Perldoc::Subclass
or (more explicitly):
=use perl6:New::Perldoc::Subclass
To create a module that inserts Pod and also require
's a parser
extension, install a .pod
file that contains a nested =use
that
imports the necessary plug-in code. Then load the Pod file as above.
A typical example would be a Perldoc extension that also needs to specify some preconfiguration:
=use Hybrid::Content::Plus::Extension
Then, in the file some_perl_doc_dir/Hybrid/Content/Plus/Extension.pod:
=begin code :allow<R>
=comment This file sets some config and also enables the Graph block

=config Graph :formatted< B >

=use perl6:Perldoc::Plugin::Graph-(*)-cpan:MEGAGIGA
=end code
Note that =use is a fundamental Perldoc directive, like =begin or
=encoding, so there is no paragraph or delimited form of =use.
Directive    Specifies
=begin       Start of an explicitly terminated block
=config      Lexical modifications to a block or formatting code
=encoding    Encoding scheme for subsequent text
=end         Explicit termination of a =begin block
=for         Start of an implicitly (blank-line) terminated block
=use         Transclusion of content; loading of a Perldoc module
Block typename    Specifies
=code             Verbatim pre-formatted sample source code
=comment          Content to be ignored by all renderers
=headN            Nth-level heading
=input            Pre-formatted sample input
=item             First-level list item
=itemN            Nth-level list item
=nested           Nest block contents within the current context
=output           Pre-formatted sample output
=para             Ordinary paragraph
=table            Simple rectangular table
=DATA             Perl 6 data section
=END              No ambient blocks after this point
=RESERVED         Semantic blocks (=SYNOPSIS, =BUGS, etc.)
=Typename         User-defined block
Formatting code     Specifies
B<...>              Basis/focus of sentence (typically rendered bold)
C<...>              Code (typically rendered fixed-width)
D<...|...;...>      Definition (D<R<defined term>|R<synonym>;R<synonym>;...>)
E<...>              Entity name or numeric codepoint
I<...>              Important (typically rendered in italics)
K<...>              Keyboard input (typically rendered fixed-width)
L<...|...>          Link (L<R<display text>|R<destination URI>>)
M<...:...>          Module-defined code (M<R<scheme>:R<contents>>)
N<...>              Note (not rendered inline)
P<...>              Placement link
R<...>              Replaceable component or metasyntax
S<...>              Space characters to be preserved
T<...>              Terminal output (typically rendered fixed-width)
U<...>              Unusual (typically rendered with underlining)
V<...>              Verbatim (internal formatting codes ignored)
X<...|..,..;...>    Index entry (X<R<display text>|R<entry>,R<subentry>;...>)
Z<...>              Zero-width comment (contents never rendered)
1 A valid identifier is a sequence of alphanumerics and/or underscores, beginning with an alphabetic or underscore.
2 Because an = in the first column is always the start of a Pod directive.