PONSCR-SYNTAX(7)              Ponscripter manual              PONSCR-SYNTAX(7)

NAME
       ponscr-syntax - description of Ponscripter syntax

DESCRIPTION
       This page documents the syntax of Ponscripter scripts. See
       ponscripter(7) for an overview of other documentation.

       Note that this cannot be considered NScripter documentation. NScripter
       itself is largely unspecified. Ponscripter's implementation is
       ultimately based on observation and documentation rather than on
       reverse-engineering as such: it inevitably adopts different parsing
       strategies, and is more liberal in what it accepts. Not all differences
       are described below. With this disclaimer out of the way: the
       documentation.

   Fundamentals
       Scripts are line-based.

       There are two parsing modes: command mode and text mode. Parsing of
       each line begins in command mode, and switches to text mode for the
       rest of the line if a text command is encountered.

       The two parsing modes have little in common. The following sections
       discuss command mode first; text mode is then treated separately.

       A third mode, "unmarked text", exists for legacy reasons. This mode is
       similar to text mode, and is entered if an invalid character is
       encountered at the start of a command: that is, a number, a character
       outside the ASCII range, or anything in the set [[@\/%?$(!#,]. Such
       lines are valid if they contain only !-commands; otherwise a warning is
       issued, and the behaviour is undefined.

       (This mode derives from NScripter, where traditionally there was no
       intersection between printable characters and command characters; the `
       text marker was introduced in ONScripter as a means of supporting
       English text, and replaced in Ponscripter with ^ to free up ` for other
       uses. Since unmarked text serves no useful purpose, and complicates
       parsing, it is deprecated and will be removed without notice at some
       point in the future.)
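
       The trigger rule for unmarked text can be sketched in Python as
       follows. This is only an illustration of the character classes listed
       above, not Ponscripter's actual implementation; the function name is
       hypothetical, and skipping leading whitespace is an assumption.

```python
# Characters that cannot start a command, per the set listed above.
UNMARKED_TRIGGERS = set("[@\\/%?$(!#,")


def starts_unmarked_text(line: str) -> bool:
    """Return True if the line's first character would trigger unmarked text."""
    stripped = line.lstrip()  # assumption: leading whitespace is skipped
    if not stripped:
        return False
    first = stripped[0]
    # A number, a character outside the ASCII range, or a listed symbol.
    return first.isdigit() or ord(first) > 127 or first in UNMARKED_TRIGGERS
```

       Under these assumptions, a line such as "!sd" or "100 yen" is
       classified as unmarked text, while "mov %0, 1" and "^Hello" are not.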

   Context
       NScripter is a context-sensitive language. Each parameter to a command
       may be parsed differently based on the type of that parameter. The
       major types are string and integer, with labels and barewords being
       special cases of string parameters.

       String expressions do not merely have a different type from integer
       expressions, as in other languages: they have a distinct syntax. Some
       string expressions can be parsed as integer expressions, but then leave
       code unparsed that will cause a syntax error when it is reached. It is
       impossible, in the general case, to parse a line of code unless it is
       known in advance what context each parameter is using.

       For example, given the following definitions:

           numalias foo, 100
           stralias foo, "bar"

       the constant foo would have the value 100 in integer context, but "bar"
       in string context.

       See the next section for more on constants, and "Expression syntax"
       below for details of the syntax accepted in each context.

   Lexical categories
       The following broad lexical categories are used in command mode
       (parenthesised names are used in syntax descriptions below):

       Comments
           are introduced with a semicolon, and last to the end of the line.

       Barewords (bareword)
           have the same syntax as identifiers in most programming languages:
           the first character must be in the set [A-Za-z_], and the remainder
           must be in the set [A-Za-z0-9_].

           A bareword at the start of a line, or immediately following a
           colon, is assumed to be a command. Otherwise their interpretation
           is context-sensitive:

           o   If an alias exists of the desired type (a numalias in number
               context, or a stralias in string context) then the bareword
               acts as a constant, and the value of the alias is substituted.

           o   In string context where no stralias exists, the bareword itself
               is treated as a string; it will be transformed to lower case
               and substituted directly.

           o   In number context where no numalias exists, a warning is issued
               and 0 is substituted.

           o   Some commands, such as rmenu, ld, and systemcall, look for
               barewords directly for certain parameters; in these cases
               aliases are not resolved.
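
           The first three rules above can be sketched in Python as follows.
           The function and its arguments are hypothetical, and whether alias
           names are matched case-sensitively is an assumption of this
           sketch; commands that read barewords directly are not modelled.

```python
import warnings


def resolve_bareword(word, context, numaliases, straliases):
    """Resolve a non-command bareword in "number" or "string" context."""
    if context == "number":
        if word in numaliases:
            return numaliases[word]      # numalias: substitute its value
        warnings.warn(f"no numalias named {word!r}; substituting 0")
        return 0
    if context == "string":
        if word in straliases:
            return straliases[word]      # stralias: substitute its value
        return word.lower()              # otherwise the word itself, lower-cased
    raise ValueError(f"unknown context: {context!r}")
```

           With numalias foo, 100 and stralias foo, "bar" both defined, the
           sketch yields 100 for foo in number context and "bar" in string
           context.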

       String literals (str_lit)
           are formed in two ways. They may be enclosed in regular double
           quotes, or in pairs of the text delimiter (^ in native scripts, `
           in legacy scripts).

           The two forms have slightly different semantics. Strings enclosed
           in text delimiters support ~-tags (described under "text mode"
           below) to apply text formatting, while tildes are literal
           characters in double-quoted strings.

           Note: this differs from ONScripter (and some pre-release versions
           of Ponscripter), where double-quoted strings had semantics similar
           to unmarked text: in particular, whitespace was ignored.

           In these interpreters, whitespace could be made significant in
           double-quoted text by following the opening quote with a text
           delimiter. This no longer has any effect, but is still supported
           for backwards-compatibility: the text delimiter is ignored, and the
           construct is equivalent to a double-quoted string.

       Numeric literals (num_lit)
           are straightforward.

           Unlike NScripter, which accepts only decimal integers, Ponscripter
           also understands the C-style 0xNN notation for hexadecimal numbers.
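
           A minimal Python sketch of the two notations follows. The function
           name is hypothetical, and rejecting an upper-case 0X prefix is an
           assumption based on the notation as written above.

```python
import re

# Decimal digits, or a lower-case 0x prefix followed by hex digits.
NUM_LIT = re.compile(r"0x[0-9A-Fa-f]+|[0-9]+")


def parse_num_lit(text: str) -> int:
    """Convert a numeric literal to an integer, rejecting anything else."""
    if NUM_LIT.fullmatch(text) is None:
        raise ValueError(f"not a numeric literal: {text!r}")
    return int(text, 16) if text.startswith("0x") else int(text)
```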

       Label literals (label)
           have the general format *bareword. They are used to mark and
           provide targets for jump commands (goto, csel, etc.) and for the
           construction of subroutines with commands such as defsub and
           textgosub.

           (In NScripter, label literals are a distinct type that can only be
           used where a command is expecting a label. ONScripter also accepts
           them wherever a string is expected: *foo means roughly the same
           thing as "foo".)

       Colour literals (colour)
           have the general format #RRGGBB, where RR, GG, and BB are each two
           hex digits. These represent colours in the standard way.

           (In NScripter, colour literals are a distinct type that can only be
           used where a command is expecting a colour. ONScripter also accepts
           them wherever a string is expected: #RRGGBB means exactly the same
           thing as "#RRGGBB".)

       Variables (int_var, str_var)
           take the form of a sigil followed by either a number, a bareword
           (which must have been defined with numalias), or an integer
           variable (with sigil) for indirect access.

           The sigils are % for integer variables, ? for integer arrays, and
           $ for string variables.

           Hence %200 (an integer variable), $%foo (the string variable
           indexed by the current value of %foo), and ?bar[9][4]
           (dereferencing the multidimensional array ?bar).

           Variable syntax is expressed formally in the expression sections
           below.

   Expression syntax
       Integer expressions (int_expr)
           are similar to those in other languages. The syntax is infix. There
           are two operator precedence levels: *, /, and mod are processed
           before + and -. Parentheses and unary minus operate as normal.

           More formally:

               int_expr    ::=   int_term | int_term binary_op int_expr
               int_term    ::=   int_paren | "-" int_paren
               int_paren   ::=   "(" int_expr ")" | int_elt
               int_elt     ::=   num_lit | int_var | bareword
               int_var     ::=   "%" int_elt | "?" int_elt subscript+
               subscript   ::=   "[" int_expr "]"
               binary_op   ::=   "*" | "/" | "mod" | "+" | "-"
               num_lit     ::=   [0-9]+ | 0x[0-9A-Fa-f]+
               bareword    ::=   [A-Za-z_][A-Za-z_0-9]*
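
           The two precedence levels can be illustrated with a small
           recursive-descent evaluator in Python. This is a sketch under
           stated assumptions, not Ponscripter's parser: all names are
           hypothetical, operators are taken to be left-associative, division
           is assumed to truncate toward zero, unset variables and undefined
           aliases evaluate to 0, and array access with ? is omitted.

```python
import re

# One token: hex or decimal literal, the "mod" keyword, a bareword, or a symbol.
TOKEN = re.compile(
    r"\s*(0x[0-9A-Fa-f]+|[0-9]+|mod\b|[A-Za-z_][A-Za-z_0-9]*|[-+*/()%])")


def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected input: {src[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def trunc_div(a, b):
    # Assumption: division truncates toward zero, as in C.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


class IntExprParser:
    def __init__(self, src, variables, numaliases):
        self.tokens, self.pos = tokenize(src), 0
        self.variables = variables    # variable number -> value
        self.numaliases = numaliases  # bareword -> number

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def additive(self):               # lower precedence: + and -
        value = self.multiplicative()
        while self.peek() in ("+", "-"):
            op, rhs = self.take(), self.multiplicative()
            value = value + rhs if op == "+" else value - rhs
        return value

    def multiplicative(self):         # higher precedence: *, / and mod
        value = self.term()
        while self.peek() in ("*", "/", "mod"):
            op, rhs = self.take(), self.term()
            if op == "*":
                value *= rhs
            elif op == "/":
                value = trunc_div(value, rhs)
            else:
                value -= rhs * trunc_div(value, rhs)
        return value

    def term(self):                   # int_term: optional unary minus
        if self.peek() == "-":
            self.take()
            return -self.paren()
        return self.paren()

    def paren(self):                  # int_paren: parenthesised expr or element
        if self.peek() == "(":
            self.take()
            value = self.additive()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        return self.element()

    def element(self):                # int_elt: literal, %variable, or bareword
        tok = self.take()
        if tok is None or tok in ("+", "*", "/", ")", "mod"):
            raise SyntaxError(f"unexpected token: {tok!r}")
        if tok == "%":                # %N, %alias, or %%N for indirect access
            return self.variables.get(self.element(), 0)
        if tok[0].isdigit():
            return int(tok, 16) if tok.startswith("0x") else int(tok)
        return self.numaliases.get(tok, 0)


def eval_int_expr(src, variables=None, numaliases=None):
    parser = IntExprParser(src, variables or {}, numaliases or {})
    value = parser.additive()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input: {parser.peek()!r}")
    return value
```

           For instance, eval_int_expr("1 + 2 * 3") yields 7 and
           eval_int_expr("(1 + 2) * 3") yields 9 in this sketch.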

       String expressions (str_expr)
           are simpler. Their grammar is as follows:

               str_expr    ::=   str_elt | str_elt "+" str_expr
               str_elt     ::=   file_cond | str_lit | str_var | label |
                                 colour | bareword
               file_cond   ::=   "(" str_elt ")" str_elt str_elt
               str_var     ::=   "$" int_elt
               str_lit     ::=   "[^"]*?" | ^[^^]*?^
               label       ::=   "*" [A-Za-z_0-9]+
               colour      ::=   "#" [0-9A-Fa-f]{6}

           The only part of the above that should not be obvious, given the
           descriptions under "Lexical categories" above, is the file_cond
           term. This is only useful when the filelog command is in effect.
           The parenthesised string is interpreted as the name of an image
           file. If the player has viewed this file, the first of the
           subsequent terms is used; otherwise, the second is used.

       Conditional expressions (conditional)
           are effectively a special syntax associated with the if / notif
           commands.

           They are somewhat lacking compared to conditionals in most
           languages: in particular, multiple terms may be combined only with
           an "and" operator, with no "or" available.

           Either strings or integers may be compared. The ordering of strings
           is deliberately left undefined; it may change without warning in
           the future. However, for any given Ponscripter version, the
           ordering will be the same across all platforms and will not be
           affected by users' locale settings.

           The operators are C-style: == and != for equality and inequality;
           <, <=, >, and >= for ordering; and & to combine terms with a
           logical "and".

           (Several operators accept variant forms: && for &, = for ==, and <>
           for !=. These variants have no semantic difference from the
           canonical forms.)

           Functions cannot be called from conditional expressions (you must
           assign the result of a function to a variable, and compare that
           manually), with one exception: there is hardcoded support for a
           function fchk, which takes a string, interprets it as the filename
           of a picture, and returns true iff that picture has been displayed.
           (This is analogous to the file_cond term in string expressions.)

           The grammar is:

              conditional   ::=   cond_term | cond_term "&" conditional
              cond_term     ::=   comparison | "fchk" str_expr
              comparison    ::=   expression comp_op expression
              expression    ::=   int_expr | str_expr
              comp_op       ::=   "==" | "!=" | ">" | ">=" | "<" | "<="
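
           The mapping from variant operators to their canonical forms can
           be sketched in Python as follows. The function name is
           hypothetical; the sketch only extracts and normalises operators,
           skipping over string literals, and does not evaluate anything.

```python
import re

# String literals are matched first so that operators inside them are skipped.
TOKEN = re.compile(r'"[^"]*"|\^[^^]*\^|==|!=|<>|<=|>=|&&|[<>=&]')

# Variant spellings and the canonical operators they stand for.
CANONICAL = {"=": "==", "<>": "!=", "&&": "&"}


def canonical_operators(conditional: str) -> list:
    """List the operators of a conditional, in canonical spelling."""
    return [CANONICAL.get(tok, tok)
            for tok in TOKEN.findall(conditional)
            if tok[0] not in '"^']
```

           Under this sketch, the conditional %0 = 1 && $1 <> "a&b" contains
           the canonical operators ==, & and !=, in that order.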

   Command syntax
       The above lexemes and expressions are combined in a fairly similar way
       to BASIC. Commands are interpreted sequentially, one to a line;
       multiple commands may be placed on a single line, where required, by
       separating them with colons.

       There are several forms of command:

       o   Procedure calls consist of a bareword, normally followed by a
           parameter list: this is a comma-separated list of expressions
           (parentheses are not used).

       o   Labels consist of a label literal, which serves as a name for that
           point in the script.

           There is also a form of anonymous label, represented by a single ~
           character, which is used by the jumpf and jumpb commands.

       o   Text commands consist of a text delimiter, which switches the
           interpreter into text mode for the remainder of the line; see next
           section.

   Text mode
       As described above, text commands begin with a text marker (^ in native
       scripts, ` in legacy scripts). The remainder of the line is then parsed
       in text mode.

       Most characters in text mode represent themselves and are printed
       verbatim; this includes the newline at the end of each line, unless it
       is explicitly suppressed with /. It also includes characters with
       special meanings in command mode, such as colons and semicolons.

       However, there are also a fair number of control characters with
       special meanings. Since text syntax was not so much designed as
       gradually accumulated, there is very little consistency in how these
       control characters are chosen, when exactly in the parsing process they
       are interpreted, and how they are printed literally. Read on for
       details.

   Text control
       The following single characters have special meanings. Each may be
       printed literally by prefixing it with a single hash character, e.g.
       #@ or #_.

       @
           Waits for a click, then continues printing text as though nothing had
           happened. (Unlike in many ONScripter builds, the behaviour of @ is
           not altered by the definition of a textgosub routine.)

       \
           Waits for a click, then clears the text window and begins a new
           page.

       _
           If a character has the clickstr nature, prefixing it with an
           underscore suppresses that behaviour; otherwise it does nothing
           whatsoever. clickstr is evil, so you should never need to use
           this. Place your pauses explicitly.

       /
           At the end of a line, ends a text command without beginning a new
           line of display text. This control only has any effect immediately
           before a newline character. Anywhere else in a line, even if only
           whitespace follows, it prints a literal slash.

   Speed control
       Multi-character control codes controlling text speed.

       Whitespace after these codes is ignored; you can cause it to be treated
       literally by adding a trailing separator character, e.g. !sd|.

       If one of these sequences should appear as literal text, it can be
       escaped by prefixing it with a single hash character, e.g. #!sd.

       Due to existing conventions for script layout, these codes are also
       valid as standalone commands without a preceding text marker; in this
       case they must be the only thing on their line apart from whitespace.

       !sNUM
           Sets text speed; this is equivalent to the command

               textspeed NUM

           but has a more convenient syntax in cases where the speed must
           change within a single line.

           Lower speeds are faster; 0 means there should be no deliberate
           delay between characters, though (as they are still printed one at
           a time) it may not quite lead to instantaneous display.

       !sd
           Resets text speed to the current player-selected default.

       !wNUM
           Inserts a pause of NUM milliseconds. It cannot be truncated by
           clicking, but can be skipped with any of the normal skip commands.

       !dNUM
           As !w, but the pause can also be truncated by clicking.
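
       A Python sketch of how these codes might be recognised in a line of
       text follows. The function name and the labels "speed", "reset",
       "wait" and "delay" are hypothetical; the sketch handles the hash
       escape but ignores interactions with other escapes such as ##, and
       does not model the whitespace or separator rules.

```python
import re

# An optional hash escape, then !sd, or !s / !w / !d followed by a number.
SPEED_CODE = re.compile(
    r"(?P<esc>#)?!(?:(?P<reset>sd)|(?P<kind>[swd])(?P<num>[0-9]+))")

KINDS = {"s": "speed", "w": "wait", "d": "delay"}


def find_speed_codes(text: str) -> list:
    """Return (label, value) pairs for each unescaped speed control code."""
    codes = []
    for m in SPEED_CODE.finditer(text):
        if m.group("esc"):
            continue                      # #!sd etc. is literal text
        if m.group("reset"):
            codes.append(("reset", None))
        else:
            codes.append((KINDS[m.group("kind")], int(m.group("num"))))
    return codes
```

       For the text "!s0Fast!w500 slow!d250#!sd!sd" the sketch reports a
       speed of 0, a 500 ms wait, a 250 ms delay, and a final reset; the
       escaped #!sd is skipped.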

   Colour tags
       #RRGGBB, where RR, GG, and BB are each two hex digits, modifies the
       current text foreground colour in the obvious way. A literal hash
       character can be inserted with ##.

   Formatting tags
       All formatting other than text colour is performed with formatting tag
       blocks. These are delimited with tildes; a literal tilde can be
       inserted with ~~ (not #~, which would have been consistent).

       Any number of tags can be combined within a single block, optionally
       separated with whitespace.

       Font selection tags
           The tags in this section, with the exception of c, assume that
           Ponscripter's eight font slots are assigned according to the
           following convention:

               0 - text regular
               1 - text italic
               2 - text bold
               3 - text bold italic
               4 - display regular
               5 - display italic
               6 - display bold
               7 - display bold italic

           If fonts are assigned in any other way, tags such as b and i will
           not behave as documented; you should use c in this case. Font slots
           are assigned using the h_mapfont command, which is documented in
           ponscr-ext(7).

           cN
               Selects the font in slot N

           d
               Selects the default style (equivalent to c0)

           r
               Disables italics (default)

           i
               Toggles italics

           t
               Disables bold weight (default)

            b
               Toggles bold weight

           f
               Selects text face (default)

           s
               Toggles display face

       Text size
           In this section, the base size refers to the font size defined for
           the active window; the current size refers to that selected with
           previous size control tags.

           =N
               Sets font size to exactly N pixels.

           %N
               Sets font size to N% of the base size.

           *N
               Increases current font size by N pixels.

           -N
               Decreases current font size by N pixels.

       Text position

           xN
               Sets the horizontal text position to a position N pixels right
               of the left margin.

           yN
               Sets the vertical text position to a position N pixels below
               the top margin.

           x+N, y+N
               Adjusts the current horizontal or vertical text position by N
               pixels right or down.

           x-N, y-N
               Adjusts the current horizontal or vertical text position by N
               pixels left or up.

       Indentation

           n
               Sets the indent to the current horizontal position. New text
               lines will start from this offset until the end of the current
               page.

           u
               Resets the indent to the left margin. This will only affect
               subsequent line breaks; to end an indented section within a
               page, position this at the end of the last line of the indented
               section.

           In addition to these tags, the indent is set automatically when the
           first character of a page is an indent character.

           The set of indent characters can be configured with the h_indentstr
           command (described in ponscr-ext(7)). By default it includes
           opening quotes and em dashes.

       Formatting examples
           As an example of the usage of these tags, Narcissu 2's omake mode
           displays page headings at the top of each screen with code like

               ^!s0~i %120 x-20 y-40~Heading~i =0~!sd
               br2 120

           Here the !s0 and !sd are the usual NScripter commands. The first
           tag block selects italic text, 120% of the regular font size, and
           shifts the output position up and to the left. The second tag block
           cancels the italic effect and resets the font size to normal.

           An example of indentation:

               ^**%.Item 1
               ^Not indented
               ^**%.~n~Item 2
               ^Indented~u~
               ^Not indented
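
           To illustrate how several tags share one block, the following
           Python sketch splits the contents of a tag block (the text between
           the tildes) into individual tags. The function name is
           hypothetical, and the matching order used to separate adjacent
           tags is an assumption rather than Ponscripter's documented
           behaviour.

```python
import re

# Font slot, position, size, then the single-letter tags described above.
TAG = re.compile(r"c[0-9]+|[xy][+-]?[0-9]+|[=%*-][0-9]+|[dribtfsnu]")


def split_tag_block(block: str) -> list:
    """Split the inside of a ~...~ block into its individual tags."""
    tags, pos = [], 0
    while pos < len(block):
        if block[pos].isspace():          # whitespace between tags is optional
            pos += 1
            continue
        m = TAG.match(block, pos)
        if m is None:
            raise ValueError(f"unrecognised tag at {block[pos:]!r}")
        tags.append(m.group())
        pos = m.end()
    return tags
```

           Applied to the first tag block of the heading example, "i %120
           x-20 y-40", the sketch returns the four tags i, %120, x-20 and
           y-40.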

   Ligatures and shortcuts
       To assist in typing Unicode scripts with ASCII keyboards, Ponscripter
       has the ability to replace sequences of characters with Unicode
       symbols. This facility is also used to implement the hash-escaping of
       single-character control codes, and can be used to add ligatures
       automatically. It is only enabled in native scripts; none of this is
       possible in legacy mode.

       A shortcut is a mapping of a sequence of characters to a Unicode
       codepoint.

       A shortcut sequence can be inserted literally by separating the
       characters with either a Unicode ZWNJ or a | character, e.g.  `|` to
       insert two separate open single quotes. A literal | can be inserted
       with ||.

       By default, the following character sequences are defined, in addition
       to the hash escapes described above:

       ``
           open double quotes

       ''
           close double quotes

       `
           open single quote

       '
           apostrophe / close single quote

       Additional sequences can be defined by use of the h_ligate command: see
       ponscr-ext(7).
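
       The substitution and separator rules can be sketched in Python as
       follows. Several details here are assumptions, not documented
       behaviour: the specific Unicode codepoints chosen for the default
       quotes, trying longer sequences before shorter ones, and dropping the
       separator character from the output. All names are hypothetical.

```python
# Default sequences from the list above; the codepoints are assumed.
DEFAULT_SHORTCUTS = {
    "``": "\u201c",   # open double quotes
    "''": "\u201d",   # close double quotes
    "`": "\u2018",    # open single quote
    "'": "\u2019",    # apostrophe / close single quote
}

SEPARATORS = "|\u200c"  # "|" and Unicode ZWNJ


def apply_shortcuts(text, shortcuts=DEFAULT_SHORTCUTS):
    """Replace shortcut sequences, honouring "|" / ZWNJ separators and "||"."""
    keys = sorted(shortcuts, key=len, reverse=True)  # longest sequence first
    out, pos = [], 0
    while pos < len(text):
        if text.startswith("||", pos):
            out.append("|")               # "||" is a literal "|"
            pos += 2
        elif text[pos] in SEPARATORS:
            pos += 1                      # separator: breaks a sequence
        else:
            key = next((k for k in keys if text.startswith(k, pos)), None)
            out.append(shortcuts[key] if key else text[pos])
            pos += len(key) if key else 1
    return "".join(out)
```

       In this sketch, two backquotes become one open double quote, while
       the same two backquotes separated by | become two open single quotes.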

   Variable interpolation
       Unlike in vanilla NScripter, merely including the name of a variable in
       text does not cause it to be interpolated; this is because frankly it
       seems to be more common to want something like $500 to be literal text
       representing a sum of money.

       Instead, variables will be interpolated if enclosed in braces: {$foo},
       {?100[%index]}, and so forth. This is not to be confused with
       NScripter's rather less useful brace syntax (variable assignments),
       which is not supported.

       The variable's sigil must immediately follow the opening brace, and
       only variables can be interpolated, not arbitrary expressions. To
       include a literal sequence of a left brace followed by a sigil
       character, use a separator character: {|%.
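
       The brace rule can be sketched in Python as a single substitution
       pass. The function name and the lookup callback are hypothetical;
       the sketch does not parse the variable reference itself, and leaves
       the separator in {|% for a later stage to deal with.

```python
import re

# An opening brace immediately followed by a sigil, up to the closing brace.
INTERPOLATION = re.compile(r"\{([%$?][^{}]*)\}")


def interpolate(text, lookup):
    """Replace each {variable} with str(lookup(reference)) in one pass."""
    return INTERPOLATION.sub(lambda m: str(lookup(m.group(1))), text)
```

       With a lookup that maps $name to "Alice", the text "Hi {$name}, you
       owe $500" becomes "Hi Alice, you owe $500": the bare $500 is left
       alone because it is not enclosed in braces.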

       Certain control codes are recognised after variable interpolation,
       since they are parsed at a later stage of processing: these are text
       controls, speed controls, colour tags, and ligatures/shortcuts. In
       particular, and in contrast to NScripter, things like ^!w{%var} will be
       interpreted as a command to wait for however long is specified in the
       given variable. This should be considered undefined behaviour, and
       will probably change in future; rather than rely on it, you should use
       the wait command (and so forth) for variable timings, and in the
       unlikely event that you actually intend to print the literal string !w
       followed by the value of %var, you should write #!w{%var} to avoid
       ambiguity.

       Other special sequences are not recognised after interpolation.
       Variable interpolations are not expanded recursively. Likewise,
       formatting codes are not processed during interpolation; however, if
       the string literal in which they first appeared was delimited with ^
       rather than ", they will have been processed when the string was read,
       and will therefore work as intended.

       That is to say,

           mov $var, "~b~"
           ^foo{$var}bar\

       prints the literal text

           foo~b~bar

       while

           mov $var, ^~b~^
           ^foo{$var}bar\

       prints

           foobar

       with the formatting tag applied.

BUGS
       This whole syntax may be considered a bug: it is inconvenient,
       irregular, and needlessly difficult to parse. Don't blame me: I didn't
       design it, I'm just documenting it. If you want a similar tool with
       sane syntax, try something like Ren'Py.

SEE ALSO
       ponscripter(7)

Ponscripter 20111009              2014-03-28                  PONSCR-SYNTAX(7)