PONSCR-SYNTAX(7) Ponscripter manual PONSCR-SYNTAX(7)
NAME
ponscr-syntax - description of Ponscripter syntax
DESCRIPTION
This page documents the syntax of Ponscripter scripts. See
ponscripter(7) for an overview of other documentation.
Note that this cannot be considered NScripter documentation. NScripter
itself is largely unspecified. Ponscripter's implementation is
ultimately based on observation and documentation rather than on
reverse-engineering as such: it inevitably adopts different parsing
strategies, and is more liberal in what it accepts. Not all differences
are described below. With this disclaimer out of the way: the
documentation.
Fundamentals
Scripts are line-based.
There are two parsing modes: command mode and text mode. Parsing of
each line begins in command mode, and switches to text mode for the
rest of the line if a text command is encountered.
The two parsing modes have little in common. The following sections
discuss command mode first; text mode is then treated separately.
A third mode, "unmarked text", exists for legacy reasons. This mode is
similar to text mode, and is entered if an invalid character is
encountered at the start of a command: that is, a number, a character
outside the ASCII range, or anything in the set [[@\/%?$(!#,]. Such
lines are valid if they contain only !-commands; otherwise a warning is
issued, and the behaviour is undefined.
(This mode derives from NScripter, where traditionally there was no
intersection between printable characters and command characters; the `
text marker was introduced in ONScripter as a means of supporting
English text, and replaced in Ponscripter with ^ to free up ` for other
uses. Since unmarked text serves no useful purpose, and complicates
parsing, it is deprecated and will be removed without notice at some
point in the future.)
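The rule for entering unmarked text mode can be sketched in a few lines of Python. This is an illustration of the description above, not Ponscripter's actual parser; the function name is invented, and skipping leading whitespace before testing the first character is an assumption.

```python
# Characters that, per the description above, cannot start a command.
INVALID_COMMAND_START = set("[@\\/%?$(!#,")


def starts_unmarked_text(line: str) -> bool:
    """Return True if the line's first character would trigger unmarked text."""
    stripped = line.lstrip()  # ASSUMPTION: leading whitespace is skipped
    if not stripped:
        return False
    ch = stripped[0]
    # A digit, a character outside the ASCII range, or one of the listed symbols.
    return ch.isdigit() or ord(ch) > 127 or ch in INVALID_COMMAND_START
```

For instance, a line beginning with !sd or with a digit is classified as unmarked text, while a line beginning with a bareword or with the ^ text marker is not.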
Context
NScripter is a context-sensitive language. Each parameter to a command
may be parsed differently based on the type of that parameter. The
major types are string and integer, with labels and barewords being
special cases of string parameters.
String expressions do not merely have a different type from integer
expressions, as in other languages: they have a distinct syntax. Some
string expressions can be parsed as integer expressions, but then leave
code unparsed that will cause a syntax error when it is reached. It is
impossible, in the general case, to parse a line of code unless it is
known in advance what context each parameter is using.
For example, given the following definitions:
numalias foo, 100
stralias foo, "bar"
the constant foo would have the value 100 in integer context, but "bar"
in string context.
See the next section for more on constants, and "Expression syntax"
below for details of the syntax accepted in each context.
Lexical categories
The following broad lexical categories are used in command mode
(parenthesised names are used in syntax descriptions below):
Comments
are introduced with a semicolon, and last to the end of the line.
Barewords (bareword)
have the same syntax as identifiers in most programming languages:
the first character must be in the set [A-Za-z_], and the remainder
must be in the set [A-Za-z0-9_].
A bareword at the start of a line, or immediately following a
colon, is assumed to be a command. Otherwise its interpretation
is context-sensitive:
o If an alias exists of the desired type (a numalias in number
context, or a stralias in string context) then the bareword
acts as a constant, and the value of the alias is substituted.
o In string context where no stralias exists, the bareword itself
is treated as a string; it will be transformed to lower case
and substituted directly.
o In number context where no numalias exists, a warning is issued
and 0 is substituted.
o Some commands, such as rmenu, ld, and systemcall, look for
barewords directly for certain parameters; in these cases
aliases are not resolved.
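The first three rules above can be modelled as a small lookup. The following Python sketch is illustrative only: the function and dictionary names are invented, aliases are looked up with exact spelling (an assumption, since case handling of alias names is not described here), and commands that read barewords directly are not modelled.

```python
import warnings


def resolve_bareword(word, context, numaliases, straliases):
    """Resolve a bareword parameter in "int" or "str" context."""
    if context == "int":
        if word in numaliases:
            return numaliases[word]
        # No numalias: warn and substitute 0.
        warnings.warn(f"undefined numalias: {word}")
        return 0
    if context == "str":
        if word in straliases:
            return straliases[word]
        # No stralias: the bareword itself, lower-cased, is the string.
        return word.lower()
    raise ValueError(f"unknown context: {context}")


# The definitions from the "Context" section: numalias foo, 100 / stralias foo, "bar"
numaliases = {"foo": 100}
straliases = {"foo": "bar"}
```

With these tables, foo resolves to 100 in integer context and to "bar" in string context, while an undefined bareword such as Baz yields "baz" as a string and 0 (with a warning) as a number.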
String literals (str_lit)
are formed in two ways. They may be enclosed in regular double
quotes, or in pairs of the text delimiter (^ in native scripts, `
in legacy scripts).
The two forms have slightly different semantics. Strings enclosed
in text delimiters support ~-tags (described under "text mode"
below) to apply text formatting, while tildes are literal
characters in double-quoted strings.
Note: this differs from ONScripter (and some pre-release versions
of Ponscripter), where double-quoted strings had semantics similar
to unmarked text: in particular, whitespace was ignored.
In these interpreters, whitespace could be made significant in
double-quoted text by following the opening quote with a text
delimiter. This no longer has any effect, but is still supported
for backwards-compatibility: the text delimiter is ignored, and the
construct is equivalent to a double-quoted string.
Numeric literals (num_lit)
are straightforward.
Unlike NScripter, which accepts only decimal integers, Ponscripter
also understands the C-style 0xNN notation for hexadecimal numbers.
Label literals (label)
have the general format *bareword. They are used to mark and
provide targets for jump commands (goto, csel, etc) and for the
construction of subroutines with commands such as defsub,
textgosub, etc.
(In NScripter, label literals are a distinct type that can only be
used where a command is expecting a label. ONScripter also accepts
them wherever a string is expected: *foo means roughly the same
thing as "foo".)
Colour literals (colour)
have the general format #RRGGBB, where RR, GG, and BB are each two
hex digits. These represent colours in the standard way.
(In NScripter, colour literals are a distinct type that can only be
used where a command is expecting a colour. ONScripter also accepts
them wherever a string is expected: #RRGGBB means exactly the same
thing as "#RRGGBB".)
Variables (int_var, str_var)
take the form of a sigil followed by either a number, a bareword
(which must have been defined with numalias), or an integer
variable (with sigil) for indirect access.
The sigils are % for integer variables, ? for integer arrays, and
$ for string variables.
Hence %200 (an integer variable), $%foo (the string variable
indexed by the current value of %foo), and ?bar[9][4]
(dereferencing the multidimensional array ?bar).
Variable syntax is expressed formally in the expression sections
below.
Expression syntax
Integer expressions (int_expr)
are similar to those in other languages. The syntax is infix. There
are two operator precedence levels: *, /, and mod are processed
before + and -. Parentheses and unary minus operate as normal.
More formally:
int_expr ::= int_term | int_term binary_op int_expr
int_term ::= int_paren | "-" int_paren
int_paren ::= "(" int_expr ")" | int_elt
int_elt ::= num_lit | int_var | bareword
int_var ::= "%" int_elt | "?" int_elt subscript+
subscript ::= "[" int_expr "]"
binary_op ::= "*" | "/" | "mod" | "+" | "-"
num_lit ::= [0-9]+ | 0x[0-9A-Fa-f]+
bareword ::= [A-Za-z_][A-Za-z_0-9]*
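As a rough illustration of how such expressions evaluate, here is a small recursive-descent evaluator in Python. It is a sketch under stated assumptions, not Ponscripter's implementation: array variables are omitted, operators of equal precedence are applied left to right, division and mod truncate toward zero as in C, and undefined variables or aliases silently evaluate to 0. All names are invented for the example.

```python
import re

# Hex is tried before decimal so that "0x10" is read as a single token.
TOKEN = re.compile(r"\s*(0x[0-9A-Fa-f]+|[0-9]+|[A-Za-z_][A-Za-z_0-9]*|[-+*/()%])")


def tokenize(src):
    src, pos, tokens = src.rstrip(), 0, []
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def trunc_div(a, b):
    # ASSUMPTION: division truncates toward zero, as in C.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


class IntExpr:
    def __init__(self, tokens, int_vars, numaliases):
        self.tokens, self.pos = tokens, 0
        self.int_vars, self.numaliases = int_vars, numaliases

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):  # lower precedence: + and -
        value = self.product()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.product()
            else:
                value -= self.product()
        return value

    def product(self):  # higher precedence: *, / and mod
        value = self.term()
        while self.peek() in ("*", "/", "mod"):
            op, rhs = self.take(), self.term()
            if op == "*":
                value *= rhs
            elif op == "/":
                value = trunc_div(value, rhs)
            else:
                value -= rhs * trunc_div(value, rhs)
        return value

    def term(self):  # int_term: optional unary minus
        if self.peek() == "-":
            self.take()
            return -self.paren()
        return self.paren()

    def paren(self):  # int_paren
        if self.peek() == "(":
            self.take()
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        return self.elt()

    def elt(self):  # int_elt, without array support
        token = self.take()
        if token is None:
            raise SyntaxError("unexpected end of expression")
        if token == "%":  # the index of a variable is itself an int_elt
            return self.int_vars.get(self.elt(), 0)
        if token[0].isdigit():
            return int(token, 16) if token.startswith("0x") else int(token)
        if token[0].isalpha() or token[0] == "_":
            return self.numaliases.get(token, 0)  # undefined alias: 0
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(src, int_vars=None, numaliases=None):
    parser = IntExpr(tokenize(src), int_vars or {}, numaliases or {})
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unparsed input: {parser.peek()!r}")
    return value
```

For example, evaluate("1 + 2 * 3") gives 7, and evaluate("%foo + %1", {100: 5, 1: 2}, {"foo": 100}) gives 7, because %foo is looked up through the alias foo.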
String expressions (str_expr)
are simpler. Their grammar is as follows:
str_expr ::= str_elt | str_elt "+" str_expr
str_elt ::= file_cond | str_lit | str_var | label |
colour | bareword
file_cond ::= "(" str_term ")" str_term str_term
str_var ::= "$" int_elt
str_lit ::= "[^"]*?" | ^[^^]*?^
label ::= "*" [A-Za-z_0-9]+
colour ::= "#" [0-9A-Fa-f]{6}
The only part of the above that should not be obvious, given the
descriptions under "Lexical categories" above, is the file_cond
term. This is only useful when the filelog command is in effect.
The parenthesised string is interpreted as the name of an image
file. If the player has viewed this file, the first of the
subsequent terms is used; otherwise, the second is used.
Conditional expressions (conditional)
are effectively a special syntax associated with the if / notif
commands.
They are somewhat lacking compared to conditionals in most
languages: in particular, multiple terms may be combined only with
an "and" operator, with no "or" available.
Either strings or integers may be compared. The ordering of strings
is deliberately left undefined; it may change without warning in
the future. However, for any given Ponscripter version, the
ordering will be the same across all platforms and will not be
affected by users' locale settings.
The operators are C-style: == and != for equality and inequality;
<, <=, >, and >= for ordering; and & to combine terms with a
logical "and".
(Several operators accept variant forms: && for &, = for ==, and <>
for !=. These variants have no semantic difference from the
canonical forms.)
Functions cannot be called from conditional expressions (you must
assign the result of a function to a variable, and compare that
manually), with one exception: there is hardcoded support for a
function fchk, which takes a string, interprets it as the filename
of a picture, and returns true iff that picture has been displayed.
(This is analogous to the file_cond term in string expressions.)
The grammar is:
conditional ::= cond_term | cond_term "&" conditional
cond_term ::= comparison | "fchk" str_expr
comparison ::= expression comp_op expression
expression ::= int_expr | str_expr
comp_op ::= "==" | "!=" | ">" | ">=" | "<" | "<="
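The operator handling described above can be sketched in Python as follows. This is a simplified model, not the interpreter's code: terms are assumed to be already parsed into (left, operator, right) tuples, fchk is not modelled, and all names are invented. Only integers are compared in the usage shown, since string ordering is deliberately undefined.

```python
import operator
import re

# Variant spellings and the canonical forms they stand for.
CANONICAL = {"=": "==", "<>": "!=", "&&": "&"}

COMPARE = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

# Longer operators come first so that "<=" is not read as "<" followed by "=".
OP_RE = re.compile(r"==|!=|<>|<=|>=|<|>|=")


def canonical(op):
    """Map a variant operator to its canonical form."""
    return CANONICAL.get(op, op)


def eval_conditional(terms):
    """Evaluate (lhs, op, rhs) terms; all terms are joined with a logical and."""
    return all(COMPARE[canonical(op)](lhs, rhs) for lhs, op, rhs in terms)
```

Since there is no "or" operator, a conditional is true only when every term holds: eval_conditional([(1, "=", 1), (2, "<>", 3)]) is true, while eval_conditional([(1, "==", 1), (5, "<", 3)]) is false.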
Command syntax
The above lexemes and expressions are combined in a fairly similar way
to BASIC. Commands are interpreted sequentially, one to a line;
multiple commands may be placed on a single line, where required, by
separating them with colons.
There are several forms of command:
o Procedure calls consist of a bareword, normally followed by a
parameter list: this is a comma-separated list of expressions
(parentheses are not used).
o Labels consist of a label literal, which serves as a name for that
point in the script.
There is also a form of anonymous label, represented by a single ~
character, which is used by the jumpf and jumpb commands.
o Text commands consist of a text delimiter, which switches the
interpreter into text mode for the remainder of the line; see next
section.
Text mode
As described above, text commands begin with a text marker (^ in native
scripts, ` in legacy scripts). The remainder of the line is then parsed
in text mode.
Most characters in text mode represent themselves and are printed
verbatim; this includes the newline at the end of each line, unless it
is explicitly suppressed with /. It also includes characters with
special meanings in command mode, such as colons and semicolons.
However, there are also a fair number of control characters with
special meanings. Since text syntax was not so much designed as
gradually accumulated, there is very little consistency in how these
control characters are chosen, when exactly in the parsing process they
are interpreted, and how they are printed literally. Read on for
details.
Text control
Single characters with special meanings. These characters may all be
printed literally by prefixing them with a single hash character, e.g.
#@ or #_.
@
Waits for click, then continues printing text as though nothing had
happened. (Unlike in many ONScripter builds, the behaviour of @ is
not altered by the definition of a textgosub routine.)
\
Waits for a click, then clears the text window and begins a new
page.
_
If a character has the clickstr nature, prefixing it with an
underscore suppresses that behaviour; otherwise it does nothing
whatsoever. clickstr is evil, so you should never need to use
this. Place your pauses explicitly.
/
At the end of a line, ends a text command without beginning a new
line of display text. This control only has any effect immediately
before a newline character. Anywhere else in a line, even if only
whitespace follows, it prints a literal slash.
Speed control
Multi-character control codes controlling text speed.
Whitespace after these codes is ignored; you can cause it to be treated
literally by adding a trailing separator character, e.g. !sd|.
If one of these sequences would appear in literal text, it can be
escaped by prefixing it with a single hash character, e.g. #!sd.
Due to existing conventions for script layout, these codes are also
valid as standalone commands without a preceding text marker; in this
case they must be the only thing on their line apart from whitespace.
!sNUM
Sets text speed; this is equivalent to the command
textspeed NUM
but has a more convenient syntax in cases where the speed must
change within a single line.
Lower values are faster; 0 means there should be no deliberate
delay between characters, though (as they are still printed one at
a time) it may not quite lead to instantaneous display.
!sd
Resets text speed to the current player-selected default.
!wNUM
Inserts a pause of NUM milliseconds. It cannot be truncated by
clicking, but can be skipped with any of the normal skip commands.
!dNUM
As !w, but the pause can also be truncated by clicking.
Colour tags
#RRGGBB, where RR, GG, and BB are each two hex digits, modifies the
current text foreground colour in the obvious way. A literal hash
character can be inserted with ##.
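For readers unfamiliar with the notation, the following Python sketch shows how a #RRGGBB tag maps to red, green and blue components. The function name is invented, and the sketch only decodes a single tag; it does not model how the interpreter scans text for tags or handles ##.

```python
import re

# "#" followed by exactly three pairs of hex digits.
COLOUR_TAG = re.compile(r"#([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})")


def parse_colour(tag):
    """Return (red, green, blue), each 0-255, for a #RRGGBB tag."""
    match = COLOUR_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a colour tag: {tag!r}")
    return tuple(int(pair, 16) for pair in match.groups())
```

Here parse_colour("#FF8000") yields (255, 128, 0), i.e. full red, half green and no blue.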
Formatting tags
All formatting other than text colour is performed with formatting tag
blocks. These are delimited with tildes; a literal tilde can be
inserted with ~~ (not #~... that would be consistent.)
Any number of tags can be combined within a single block, optionally
separated with whitespace.
Font selection tags
The tags in this section, with the exception of c, assume that
Ponscripter's eight font slots are assigned according to the
following convention:
0 - text regular
1 - text italic
2 - text bold
3 - text bold italic
4 - display regular
5 - display italic
6 - display bold
7 - display bold italic
If fonts are assigned in any other way, tags such as b and i will
not behave as documented; you should use c in this case. Font slots
are assigned using the h_mapfont command, which is documented in
ponscr-ext(7).
cN
Selects the font in slot N
d
Selects the default style (equivalent to c0)
r
Disables italics (default)
i
Toggles italics
t
Disables bold weight (default)
b
Toggles bold weight
f
Selects text face (default)
s
Toggles display face
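Under the slot convention listed above, the slot number works out as a three-bit value: 1 for italic, 2 for bold, 4 for display face. The Python sketch below models the tags on that basis. It is an interpretation of the table, not Ponscripter's code; the class and function names are invented, the bold toggle is written b (following the mention of "b and i" above), and treating cN as setting all three flags from the bits of N is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FontStyle:
    italic: bool = False
    bold: bool = False
    display: bool = False

    def slot(self) -> int:
        # Bit 0 = italic, bit 1 = bold, bit 2 = display face.
        return int(self.italic) | int(self.bold) << 1 | int(self.display) << 2


def apply_tag(style, tag):
    """Update style in place for a single font selection tag."""
    if tag == "d":
        style.italic = style.bold = style.display = False
    elif tag == "r":
        style.italic = False
    elif tag == "i":
        style.italic = not style.italic
    elif tag == "t":
        style.bold = False
    elif tag == "b":
        style.bold = not style.bold
    elif tag == "f":
        style.display = False
    elif tag == "s":
        style.display = not style.display
    elif tag.startswith("c") and tag[1:].isdigit():
        n = int(tag[1:])  # ASSUMPTION: cN sets every flag from the bits of N
        style.italic, style.bold, style.display = bool(n & 1), bool(n & 2), bool(n & 4)
    else:
        raise ValueError(f"unknown font tag: {tag!r}")
    return style
```

Starting from the default style, applying b and then i selects slot 3 (text bold italic); a further s moves to slot 7 (display bold italic).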
Text size
In this section, the base size refers to the font size defined for
the active window; the current size refers to that selected with
previous size control tags.
=N
Sets font size to exactly N pixels.
%N
Sets font size to N% of the base size.
+N
Increases current font size by N pixels.
-N
Decreases current font size by N pixels.
Text position
xN
Sets the horizontal text position to a position N pixels right
of the left margin.
yN
Sets the vertical text position to a position N pixels below
the top margin.
x+N, y+N
Adjusts the current horizontal or vertical text position by N
pixels right or down.
x-N, y-N
Adjusts the current horizontal or vertical text position by N
pixels left or up.
Indentation
n
Sets the indent to the current horizontal position. New text
lines will start from this offset until the end of the current
page.
u
Resets the indent to the left margin. This will only affect
subsequent line breaks; to end an indented section within a
page, position this at the end of the last line of the indented
section.
In addition to these tags, the indent is set automatically when the
first character of a page is an indent character.
The set of indent characters can be configured with the h_indentstr
command (described in ponscr-ext(7)). By default it includes
opening quotes and em dashes.
Formatting examples
As an example of the usage of these tags, Narcissu 2's omake mode
displays page headings at the top of each screen with code like
^!s0~i %120 x-20 y-40~Heading~i =0~!sd
br2 120
Here the !s0 and !sd are the usual NScripter commands. The first
tag block selects italic text, 120% of the regular font size, and
shifts the output position up and to the left. The second tag block
cancels the italic effect and resets the font size to normal.
An example of indentation:
^**%.Item 1
^Not indented
^**%.~n~Item 2
^Indented~u~
^Not indented
Ligatures and shortcuts
To assist in typing Unicode scripts with ASCII keyboards, Ponscripter
has the ability to replace sequences of characters with Unicode
symbols. This facility is also used to implement the hash-escaping of
single-character control codes, and can be used to add ligatures
automatically. It is only enabled in native scripts; none of this is
possible in legacy mode.
A shortcut is a mapping of a sequence of characters to a Unicode
codepoint.
A shortcut sequence can be inserted literally by separating the
characters with either a Unicode ZWNJ or a | character, e.g. `|` to
insert two separate open single quotes. A literal | can be inserted
with ||.
By default, the following character sequences are defined, in addition
to the hash escapes described above:
``
open double quotes
''
close double quotes
`
open single quote
'
apostrophe / close single quote
Additional sequences can be defined by use of the h_ligate command: see
ponscr-ext(7).
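The default shortcuts and the separator rule can be modelled with a simple left-to-right scan. This Python sketch is illustrative only: the names are invented, longest-match-first is an assumption about how overlapping sequences are resolved, separators are assumed to be dropped from the output, and the hash escapes are left out.

```python
# Default sequences from the list above, mapped to their Unicode codepoints.
SHORTCUTS = {
    "``": "\u201c",  # open double quotes
    "''": "\u201d",  # close double quotes
    "`": "\u2018",   # open single quote
    "'": "\u2019",   # apostrophe / close single quote
}
ZWNJ = "\u200c"


def apply_shortcuts(text, shortcuts=SHORTCUTS):
    keys = sorted(shortcuts, key=len, reverse=True)  # try longer sequences first
    out, i = [], 0
    while i < len(text):
        if text.startswith("||", i):
            out.append("|")  # doubled separator: a literal |
            i += 2
        elif text[i] in ("|", ZWNJ):
            i += 1  # separator: dropped, but it splits up a sequence
        else:
            for key in keys:
                if text.startswith(key, i):
                    out.append(shortcuts[key])
                    i += len(key)
                    break
            else:
                out.append(text[i])  # ordinary character
                i += 1
    return "".join(out)
```

With this model, two backticks become one open double quote, whereas two backticks separated by | become two open single quotes.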
Variable interpolation
Unlike in vanilla NScripter, merely including the name of a variable in
text does not cause it to be interpolated; this is because frankly it
seems to be more common to want something like $500 to be literal text
representing a sum of money.
Instead, variables will be interpolated if enclosed in braces: {$foo},
{?100[%index]}, and so forth. This is not to be confused with
NScripter's rather less useful brace syntax (variable assignments),
which is not supported.
The variable's sigil must immediately follow the opening brace, and
only variables can be interpolated, not arbitrary expressions. To
include a literal sequence of a left brace followed by a sigil
character, use a separator character: {|%.
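A minimal Python model of brace interpolation follows. It is a sketch, not the real implementation: only simple $ and % variables indexed by a number or a numalias are handled (arrays and indirect indices are omitted), unset variables are assumed to yield an empty string or 0, and all names are invented.

```python
import re

# "{" then a sigil immediately, then a number or an alias name, then "}".
INTERP = re.compile(r"\{([$%])([A-Za-z_][A-Za-z_0-9]*|[0-9]+)\}")


def interpolate(text, int_vars, str_vars, numaliases):
    def replace(match):
        sigil, name = match.groups()
        index = int(name) if name.isdigit() else numaliases.get(name, 0)
        if sigil == "$":
            return str_vars.get(index, "")
        return str(int_vars.get(index, 0))

    # re.sub makes a single pass and does not rescan the text it inserts.
    return INTERP.sub(replace, text)
```

Given numalias gold, 5 with %5 holding 30 and $1 holding "Ann", the text "You have {%gold} coins, {$1}." becomes "You have 30 coins, Ann.", while "{|%0}" is left alone because the sigil does not immediately follow the brace.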
Certain control codes are recognised after variable interpolation,
since they are parsed at a later stage of processing: these are text
controls, speed controls, colour tags, and ligatures/shortcuts. In
particular, and in contrast to NScripter, things like ^!w{%var} will be
interpreted as a command to wait for however long is specified in the
given variable. This should be considered undefined behaviour, and
will probably change in future; rather than rely on it, you should use
the wait command (and so forth) for variable timings, and in the
unlikely event that you actually intend to print the literal string !w
followed by the value of %var, you should write #!w{%var} to avoid
ambiguity.
Other special sequences are not recognised after interpolation.
Variable interpolations are not expanded recursively. Likewise,
formatting codes are not processed during interpolation; however, if
the string literal in which they first appeared was delimited with ^
rather than ", they will have been processed when the string was read,
and will therefore work as intended.
That is to say,
mov $var, "~b~"
^foo{$var}bar\
prints the tag literally, as
foo~b~bar
while
mov $var, ^~b~^
^foo{$var}bar\
prints
foobar
with "bar" in bold.
BUGS
This whole syntax may be considered a bug: it is inconvenient,
irregular, and needlessly difficult to parse. Don't blame me: I didn't
design it, I'm just documenting it. If you want a similar tool with
sane syntax, try something like Ren'Py.
SEE ALSO
ponscripter(7)
Ponscripter 20111009 2014-03-28 PONSCR-SYNTAX(7)