cfg(3)                       Configuration Parsing                      cfg(3)

NAME

OSSP cfg - Configuration Parsing

VERSION

OSSP cfg 0.9.11 (10-Aug-2006)

SYNOPSIS

API Header:
    cfg.h

API Types:
    cfg_t, cfg_rc_t, cfg_node_type_t, cfg_node_t, cfg_node_attr_t,
    cfg_fmt_t, cfg_data_t, cfg_data_ctrl_t, cfg_data_cb_t, cfg_data_attr_t

API Functions:
    cfg_create, cfg_destroy, cfg_error, cfg_version, cfg_import,
    cfg_export, cfg_node_create, cfg_node_destroy, cfg_node_clone,
    cfg_node_set, cfg_node_get, cfg_node_root, cfg_node_select,
    cfg_node_find, cfg_node_apply, cfg_node_cmp, cfg_node_link,
    cfg_node_unlink, cfg_data_set, cfg_data_get, cfg_data_ctrl

DESCRIPTION

OSSP cfg is an ISO-C library for parsing arbitrary C/C++-style configuration files. A configuration is a sequence of directives. Each directive consists of zero or more tokens. Each token can be either a string or, recursively, a complete sequence. The configuration syntax is therefore recursive, which allows configurations with arbitrarily nested sections.

Additionally, the configuration syntax provides complex single/double/balanced quoting of tokens, hexadecimal/octal/decimal character encodings, character escaping, C/C++ and Shell-style comments, etc. The library API allows importing a configuration text into an Abstract Syntax Tree (AST), traversing the AST and optionally exporting the AST again as a configuration text.

CONFIGURATION SYNTAX

The configuration syntax is described by the following context-free (Chomsky-2) grammar:

    sequence   ::= empty
                 | directive
                 | directive SEP sequence
    directive  ::= token
                 | token directive
    token      ::= OPEN sequence CLOSE
                 | string
    string     ::= DQ_STRING   # double quoted string
                 | SQ_STRING   # single quoted string
                 | FQ_STRING   # flexible quoted string
                 | PT_STRING   # plain text string

The terminal symbols used above are themselves defined by the following set of grammar productions (regular sub-grammars for character sequences, given as Perl-style regular expressions "/regex/"):

    SEP        ::= /;/
    OPEN       ::= /{/
    CLOSE      ::= /}/
    DQ_STRING  ::= /"/ DQ_CHARS /"/
    DQ_CHARS   ::= empty
                 | DQ_CHAR DQ_CHARS
    DQ_CHAR    ::= /\\"/                   # escaped quote
                 | /\\x\{[0-9a-fA-F]+\}/   # hex-char group
                 | /\\x[0-9a-fA-F]{2}/     # hex-char
                 | /\\[0-7]{1,3}/          # octal character
                 | /\\[nrtbfae]/           # special character
                 | /\\\n[ \t]*/            # line continuation
                 | /\\\\/                  # escaped escape
                 | /./                     # any other char
    SQ_STRING  ::= /'/ SQ_CHARS /'/
    SQ_CHARS   ::= empty
                 | SQ_CHAR SQ_CHARS
    SQ_CHAR    ::= /\\'/                   # escaped quote
                 | /\\\n[ \t]*/            # line continuation
                 | /\\\\/                  # escaped escape
                 | /./                     # any other char
    FQ_STRING  ::= /q/ FQ_OPEN FQ_CHARS FQ_CLOSE
    FQ_CHARS   ::= empty
                 | FQ_CHAR FQ_CHARS
    FQ_CHAR    ::= /\\/ FQ_OPEN            # escaped open
                 | /\\/ FQ_CLOSE           # escaped close
                 | /\\\n[ \t]*/            # line continuation
                 | /./                     # any other char
    FQ_OPEN    ::= /[!"#$%&'()*+,-./:;<=>?@\[\\\]^_`{|}~]/
    FQ_CLOSE   ::= << FQ_OPEN or corresponding closing char
                      ('}])>') if FQ_OPEN is a char of '{[(<' >>
    PT_STRING  ::= PT_CHAR PT_CHARS
    PT_CHARS   ::= empty
                 | PT_CHAR PT_CHARS
    PT_CHAR    ::= /[^ \t\n;{}"']/         # none of specials

Additionally, white-space (WS) and comment (CO) tokens are allowed at any position in the productions of the previous grammar part.

    WS         ::= /[ \t\n]+/
    CO         ::= CO_C      # style of C
                 | CO_CXX    # style of C++
                 | CO_SH     # style of /bin/sh
    CO_C       ::= /\/\*([^*]|\*(?!\/))*\*\//
    CO_CXX     ::= /\/\/[^\n]*/
    CO_SH      ::= /#[^\n]*/

Finally, any configuration line can have a trailing backslash character (\) just before the newline character for simple line continuation. The backslash, the newline and (optionally) the leading whitespace on the following line are silently absorbed, which as a side effect continues the first line with the contents of the second line.

CONFIGURATION EXAMPLE

A more intuitive description of the configuration syntax is perhaps given by the following example which shows all features at once:

    /* single word */
    foo;

    /* multi word */
    foo bar quux;

    /* nested structure */
    foo { bar; baz } quux;

    /* quoted strings */
    'foo bar'
    "foo\x0a\t\n\
     bar"

APPLICATION PROGRAMMING INTERFACE (API)

...
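To make the recursive structure of the context-free grammar in CONFIGURATION SYNTAX more concrete, the following is a minimal recognizer sketch in C. It is not part of OSSP cfg and does not use the library API; all function names (such as cfg_sketch_accepts) are hypothetical. It assumes a reduced syntax: only PT_STRING tokens, OPEN/CLOSE nesting, SEP and WS are handled, while quoted strings, comments and line continuation are left out. It follows the productions literally, so a SEP must be preceded by at least one token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* PT_CHAR ::= /[^ \t\n;{}"']/ */
static int is_pt_char(char c)
{
    return c != '\0' && strchr(" \t\n;{}\"'", c) == NULL;
}

/* A token starts with OPEN or with the first PT_CHAR of a PT_STRING. */
static int starts_token(char c)
{
    return c == '{' || is_pt_char(c);
}

/* WS ::= /[ \t\n]+/ (comments are not handled in this sketch) */
static const char *skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    return p;
}

static const char *parse_sequence(const char *p);

/* token ::= OPEN sequence CLOSE | PT_STRING
 * Returns the position after the token, or NULL on a syntax error. */
static const char *parse_token(const char *p)
{
    if (*p == '{') {
        p = parse_sequence(p + 1);      /* result is already WS-skipped */
        if (p == NULL)
            return NULL;
        return (*p == '}') ? p + 1 : NULL;
    }
    if (!is_pt_char(*p))
        return NULL;
    while (is_pt_char(*p))
        p++;
    return p;
}

/* sequence  ::= empty | directive | directive SEP sequence
 * directive ::= token | token directive
 * Returns the WS-skipped position after the sequence, or NULL on error. */
static const char *parse_sequence(const char *p)
{
    p = skip_ws(p);
    if (!starts_token(*p))
        return p;                       /* sequence ::= empty */
    do {                                /* directive: one or more tokens */
        p = parse_token(p);
        if (p == NULL)
            return NULL;
        p = skip_ws(p);
    } while (starts_token(*p));
    if (*p == ';')
        return parse_sequence(p + 1);   /* directive SEP sequence */
    return p;
}

/* Hypothetical entry point: 1 if the whole text is a valid sequence. */
static int cfg_sketch_accepts(const char *text)
{
    const char *p;

    assert(text != NULL);
    p = parse_sequence(text);
    return p != NULL && *p == '\0';
}
```

With this sketch, cfg_sketch_accepts("foo { bar; baz } quux;") returns 1, while an unbalanced input such as "foo { bar" returns 0. The recursion in parse_token and parse_sequence mirrors the mutual recursion of the token and sequence productions.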

NODE SELECTION SPECIFICATION

The cfg_node_select function takes a node selection specification string select for locating the intended nodes. This specification is defined as:

    select            ::= empty
                        | select-step select
    select-step       ::= select-direction
                          select-pattern
                          select-filter
    select-direction  ::= "./"        # current node
                        | "../"       # parent node
                        | "..../"     # ancestor nodes
                        | "-/"        # previous sibling node
                        | "--/"       # preceding sibling nodes
                        | "+/"        # next sibling node
                        | "++/"       # following sibling nodes
                        | "/"         # child nodes
                        | "//"        # descendant nodes
    select-pattern    ::= /</ regex />/
                        | token
    select-filter     ::= empty
                        | /\[/ filter-range /\]/
    filter-range      ::= num           # short for: num,num
                        | num /,/       # short for: num,-1
                        | /,/ num       # short for: 1,num
                        | num /,/ num
    num               ::= /^[+-]?[0-9]+/
    regex             ::= << Regular Expression (PCRE-based) >>
    token             ::= << Plain-Text Token String >>
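The short forms of filter-range can be illustrated with a small C sketch that expands each of the four alternatives into an explicit pair of numbers. This is not code from OSSP cfg: the type filter_range_t and the functions parse_num and parse_filter_range are hypothetical names introduced here, and the input is assumed to be the text between the square brackets. The sketch only normalizes the notation; how the library interprets the resulting numbers (for example a negative value) is not covered.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical result type: the expanded "first,last" pair. */
typedef struct {
    long first;
    long last;
} filter_range_t;

/* num ::= /^[+-]?[0-9]+/
 * Parses one number at *pp and advances *pp past it.
 * Returns 0 on success, -1 if no number starts at *pp. */
static int parse_num(const char **pp, long *out)
{
    const char *p = *pp;
    char *end;

    if (*p == '+' || *p == '-')
        p++;
    if (!isdigit((unsigned char)*p))
        return -1;
    *out = strtol(*pp, &end, 10);
    *pp = end;
    return 0;
}

/* filter-range ::= num | num "," | "," num | num "," num
 * Returns 0 and fills *out on success, -1 on a syntax error. */
static int parse_filter_range(const char *s, filter_range_t *out)
{
    long first = 1;     /* default for ",num" (short for 1,num)   */
    long last = -1;     /* default for "num," (short for num,-1)  */
    int have_first = 0;
    int have_last = 0;

    assert(s != NULL && out != NULL);
    if (*s != ',') {
        if (parse_num(&s, &first) != 0)
            return -1;
        have_first = 1;
        if (*s == '\0') {               /* "num" is short for num,num */
            out->first = first;
            out->last = first;
            return 0;
        }
        if (*s != ',')
            return -1;
    }
    s++;                                /* skip the "," */
    if (*s != '\0') {
        if (parse_num(&s, &last) != 0 || *s != '\0')
            return -1;
        have_last = 1;
    }
    if (!have_first && !have_last)
        return -1;                      /* a lone "," is not a range */
    out->first = first;
    out->last = last;
    return 0;
}
```

For example, the input "3," yields the pair 3,-1 and the input ",5" yields 1,5, matching the comments in the filter-range production.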

IMPLEMENTATION ISSUES

Goal: non-hardcoded syntax tokens, only hard-coded syntax structure
Goal: time-efficient parsing
Goal: space-efficient storage
Goal: representation of configuration as AST
Goal: manipulation (annotation, etc) of AST via API
Goal: dynamic syntax verification

HISTORY

OSSP cfg was implemented in many small steps over a very long time. The first ideas date back to 1995, when Ralf S. Engelschall attended his first compiler construction lessons at university. He first completed it in summer 2002 for use in the OSSP project.

AUTHOR

Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com

10-Aug-2006                      OSSP cfg 0.9.11                        cfg(3)
