Munger(1)              DragonFly General Commands Manual             Munger(1)

NAME

munger - Text Processing Lisp

SYNOPSIS

munger [<script> [<args> ...]]

DESCRIPTION

Munger is a simple, statically-scoped, naive lisp interpreter, specialized for writing processors for 8-bit text. Munger makes it easy to write editors, shells, filters, low-demand clients and servers, and utility scripts.

This document begins with an overview of the language, followed by a complete reference for every intrinsic function and macro, and every library function and macro, included in the Munger package. A list of these precedes the reference section, immediately after the following overview.

A variety of example programs are included in the source distribution. They are installed in (libdir), and are described below.

Included Example Programs

cat.munger
    is a version of the cat utility, the simplest possible filter.

grep.munger
    is an egrep-like filter.

options.munger
    is a module which simplifies the processing of command-line arguments.

fmt.munger
    is a version of the fmt utility.

cal.munger
    prints a calendar of the current month to stdout. It demonstrates the date arithmetic functions.

filter.munger
    is a simple filter which expands documents with embedded munger code in them.

transform.munger
    performs a set of regular expression based substitutions destructively over a set of files.

view.munger
    is a file viewer resembling vi when invoked as view. It demonstrates character-I/O and cursor addressing.

mush.munger
    is a job control shell. It demonstrates how to control process groups.

xml2alist.munger
    is a minimal XML parser which will convert a standalone XML 1.0 document into an alist which may be queried for structure and content with the xmlquery.munger module. The program is a filter, reading XML from stdin and printing to stdout a lisp representation of the XML.

xml2buffer.munger
    is another version of the above XML parser which, instead of printing the lisp representation of the converted XML document to stdout, inserts it into the current buffer, from where it may be read by the lisp reader with the "eval_buffer" intrinsic. This module can also read the XML document from a string.

xml2sqlite.munger
    is another version of the XML parser, which serializes the XML document into an SQLite database file.

xmlsqlite.munger
    is a module providing helper functions to access a database containing an XML document serialized by xml2sqlite.munger.

rss.munger
    prints to stdout an RSS feed which has been converted into an SQLite database file by xml2sqlite.munger. It demonstrates accessing a database created by xml2sqlite.munger.

xmlquery.munger
    is a module which may be used to extract data from a document processed with the XML parsers included in the Munger distribution, which store their results in lists.

echo.munger
    is a TCP echo server.

httpd.munger
    is a not fully conforming HTTP/1.1 server.

IMPLEMENTATION NOTES

* The Munger implementation is designed to be easy to understand and extend. It is a naive implementation, and will perform best when interpreting programs written in an imperative programming style. Any general-purpose lisp/Scheme implementation out there will outperform Munger.

* Munger's strings are 8-bit clean, but some of the intrinsic functions in the interpreter cannot handle embedded NULs. This should not be a problem, as it is unlikely you will want to apply these functions to binary strings. A full list of 8-bit safe intrinsics is specified in the section below entitled STRINGS.

* Munger's string-manipulation functions are "functional" in that they do not modify their arguments, but return new strings.

* There is only one namespace.

* There is a "dynamic_let" to impose dynamic scope on a single global at a time.

* The function position of applications is fully-evaluated and must evaluate to an intrinsic function or a closure, as in Scheme. The order of evaluation of the terms of an application is fixed, from left to right, and may be relied upon in computations.

* Munger does not recognize tail-calls; however, you may explicitly request tail-calls with the "tailcall" intrinsic. This mechanism allows even anonymous functions to be tail-recursive, and can be used to turn non-tail positions into tail positions. Using function calls, whether they be tail-calls or not, to perform iteration in Munger will always be orders of magnitude slower than using one of the looping intrinsics.

* As in Perl, all values are considered to be boolean "true" values except for the empty list, the empty string, and zero. There is no T nor NIL object, nor #t or #f.

* All functions return a value. There are no undefined or unspecified return values used in this interpreter.

* Improper lists cannot be formed in Munger. The final "cdr" of all lists is the empty list.

* The empty list is a constant which evaluates to itself, and it is a list, not an atom. Every empty list is identical to every other empty list, which is to say they are "eq" to each other.

* "eq" in Munger behaves similarly to "eql" in Common Lisp. A true "eq", which only returns a true value when comparing an object to itself, has very limited use, and therefore I have not bothered to include one. There is an "equal" as well, which does what you think it does. It is defined in (join "/" (libdir) "library.munger").

* "eval" does not contain the lisp reader. Therefore applying "eval" to a string will not parse lisp code in the string, but simply cause the original string to be returned (strings are constants). There is a separate "eval_string" intrinsic.

* There are no destructive list operations. Munger forces the programmer to use simple constructive list manipulations. Munger's "list" and "append" intrinsics are "functional" in that they make copies of their arguments before forming them into the final returned lists.

* Munger's looping constructs are modelled on those of C. The most efficient way to iterate is to use the "for", "iterate", or "loop" intrinsics.

* The set of intrinsics is purposefully limited to a minimal set useful for text processing. There are many intrinsics with names similar to those of other lisp dialects, which do not perform similar tasks. Be forewarned.

* Symbol syntax is similar to symbol syntax in C, including case-sensitivity. "let*" therefore, is "letn" in Munger. If you are used to hyphenating symbol names, you will have to get used to using the underscore in place of the hyphen to be happy here.

* Simple "eval-twice" macros and gensyms are supported. A macro definition differs from a function definition in that the initial "lambda" symbol is replaced with the "macro" symbol.

* Besides lists, three other aggregate types are provided by the language: one-dimensional dynamically-resizable arrays called "stacks", associative arrays called "tables", and statically-sized one-dimensional arrays called "records".
* There are no interactive debugging facilities in the interpreter, nor any editing capabilities at the interpreter prompt, beyond those provided by the terminal driver (CTRL-H, CTRL-W, CTRL-U). To debug scripts the `(print "I made it here!")(exit 1)' technique is used by the author.

STARTUP

The interpreter attempts to read three files at startup, in order. The first, the system library (library.munger), is read from the Munger data directory (the "libdir" intrinsic will return the fully-qualified path of the data directory). Then, if the user has a custom lisp library (.munger) in his or her home directory, it is read, and lastly, any file specified by the first command-line argument is read. The succeeding command-line arguments, if present, are considered to be arguments to the script referenced by the first command-line argument, and are not accessed by the interpreter. All the command-line arguments may be accessed from lisp programs with the "current", "rewind", "next", and "prev" intrinsics, described in the reference section at the end of this document.

COMMENTS

If the parser encounters a semicolon (;) or an octothorpe (#) outside of a string token, the rest of the line from that character to the next newline or carriage return is considered to be a comment and discarded. Recognizing the octothorpe as well as the traditional semicolon comment character allows one to put a "shebang" line at the top of one's scripts, in order to have the system feed the script to the interpreter if the script itself is invoked as a command from the shell.

    #!/usr/local/bin/munger

SYMBOLS

The interpreter is case-sensitive when it comes to recognizing all tokens. Symbol names must consist only of sequences of alphanumeric characters and the underscore (_), and must not start with numerical characters.

NUMBERS

The interpreter supports only one numerical type, the fixnum. Fixnums are fixed-size integers which can be manipulated efficiently by the interpreter.
The binary size of the fixnum is the word size of the machine the interpreter is running on. Arithmetic operations which overflow this size are not detected. The "maxidx" intrinsic will return the value of the largest fixnum the interpreter can represent. The lowest supported fixnum will be one more than the value returned by "maxidx", negated. This can be demonstrated by adding 1 to the value returned by "maxidx" to cause the two's-complement representation to "wrap-around" to the negative side. There is an "unsigned" intrinsic which can be used to display the result of unsigned arithmetic operations which wrap-around to the negative side.

    > (setq max (maxidx))
    1073741823
    > (+ max 1)
    -1073741824

When an integer value is read by the lisp reader it is represented internally as a fixnum. If the value is too large or too small to be represented in that form, the value is silently truncated to fit.

STRINGS

Arbitrary strings of character data can be placed between " marks (ASCII 34). Such tokens are constants and evaluate to themselves. " marks may be embedded into strings if they are escaped with a backslash:

    > "asd\""
    "asd""

A backslash occurring at the end of a string will be interpreted as escaping the closing " character. To make it possible for the lisp reader to read a string ending with a backslash, it is necessary, therefore, to have a means of escaping backslashes. Backslashes are therefore escaped with themselves:

    > "asd\\"
    "asd\"

Backslashes are interpreted as escapes only when they occur before another backslash or a " character. Backslashes not followed by a " character or another backslash are inserted into the string. The following two strings are therefore identical:

    > "\a"
    "\a"
    > "\\a"
    "\a"

It is only necessary to employ these escapes in strings to be parsed by the lisp reader, that is to say, in the literal strings in your source code.
These two characters may be inserted into strings programmatically, without escaping them:

    > (char 34)
    """
    > (char 92)
    "\"

The interpreter has a small set of string manipulation intrinsics, some you would expect in any language, such as the "substring" and "strcmp" functions, and others like the "split", "join", "chomp" and "chop" intrinsics, which are inspired by similar functions in Perl. Regular expressions can be used to find matches on substrings or to transform strings. There is no character data type. Single-character strings are used instead. All string operations are documented in the LANGUAGE REFERENCE section of this document.

Munger's strings are 8-bit clean, but not all of the intrinsics which use strings will function correctly when confronted with strings which have embedded NULs in them. The following intrinsics will, however:

    getline, getchar, getchars, print, cgi_read, cgi_print, code, substring, stringify, concat, join, chop, chomp, insert, retrieve, slice, strcmp, child_read, child_write, getline_ub

Due to a change in the regular expression library used by Munger, as of Munger 4.172, the following regular expression related intrinsics no longer work with strings with embedded NULs in them:

    regcomp, match, matches, substitute, replace

The "split" intrinsic, notably, will work correctly when its first argument is the empty string and its second argument contains NULs. The programmer should avoid presenting binary strings to any other intrinsics than those guaranteed to work with them.

STACKS, TABLES, AND RECORDS

Munger supports an associative array type called a table. Associative arrays are collections of pairs of lisp atoms and arbitrary lisp objects. The atom is called the "key" and is used to retrieve the other object, called the "value", from the table. Tables are implemented internally as hash tables, hence the name.
For more information on using tables, see the entries for the intrinsics listed under the heading entitled Tables, in the LANGUAGE REFERENCE occurring later in this document.

Munger also supports a dynamically-resizable unidimensional array type called a stack. Stacks may be treated as push-down stacks or indexed as arrays. Internally, stacks are implemented as unidimensional arrays, but they are called stacks to emphasize their one-dimensional nature. Any lisp object may be stored in a stack. The capacity of stacks may be increased dynamically, simply by "pushing" items onto them, but stacks never decrease in size. Multi-dimensional arrays may be simulated using stacks of stacks. For more information on using stacks, see the entries for the intrinsics listed under the heading entitled Stacks, in the LANGUAGE REFERENCE occurring later in this document.

In addition to stacks and tables, an aggregate type called a record is provided. Records are fixed-size unidimensional arrays. They are a more time- and space-efficient means of representing fixed-size structures than lists, tables, or stacks.

Tables, stacks, and records are opaque constant atoms which evaluate to themselves.

FILES

File I/O may be performed in one of two ways. The content of files may be read into text buffers, manipulated, then written out again, if random access to the files' content is desired, or the standard descriptors may be redirected onto files with the "redirect" intrinsic, if line-oriented, serial (filter-like) access is sufficient. Three convenience macros, "with_input_file", "with_output_file", and "with_error_file", simplify the process:

    > (with_input_file "README"
    >> (for (a 1 4) (print a (char 9) (getline))))
    1       Munger
    2       ======
    3
    4       Munger is a simple, statically-scoped, interpreted lisp that has

All redirections are undone upon return to toplevel. Redirections made with "redirect" may be explicitly undone in programs with the "resume" intrinsic.
Redirections made with the "with_input_file", "with_output_file", and "with_error_file" macros are undone automatically when those macros return. Redirections made with these macros are dynamically-scoped, which is to say their visibility is unlimited, but their extent is limited to the duration of the macro. Redirections exhibit stack-like behavior, allowing nested redirections to "shadow" enclosing redirections:

    > (with_input_file "README"
    >> (print (getline))
    Munger
    >> (with_input_file "lisp.h"
    >>> (print (getline)))
    /*
    >> (print (getline)))
    ======

The "temporary" intrinsic will redirect stdout onto a temporary file opened for writing, via mkstemp(2). The function returns the name of the file. A convenience macro is provided, "with_temporary_output_file":

    > (setq filename (with_temporary_output_file (print "foobar")))
    "/tmp/munger2nvIa8D98M"
    > (with_input_file filename
    >> (print (getline)))
    "foobar"
    > (unlink filename)
    1

Two macros simplify the writing of filters: "foreach_line" and "foreach_line_callback". Both accept a monadic function to be applied successively to each line of input; the second also accepts an additional function to be called when the input source changes. Both macros determine whether to read data from the standard input or from files specified on the command line based on the existence of command line arguments after the first two, which will always be the name of the interpreter and the name of the currently-executing script, respectively. Here is what "cat" looks like in Munger:

    (next)
    (foreach_line print)
    (exit 0)

The example programs further demonstrate the use of these macros. They are fully documented later in this document.

COMMUNICATING WITH CHILD PROCESSES

The interpreter's standard descriptors can be redirected onto processes with the "pipe" intrinsic. Successive invocations of "pipe" inherit any already-made redirections, allowing pipelines to be created.
The "with_input_process" and "with_output_process" macros simplify the process:

    > (with_input_process "jot 100"
    >> (with_input_process "fmt"
    >>> (while (setq l (getline))
    >>>> (print l))))
    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
    26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
    48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
    70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
    92 93 94 95 96 97 98 99 100

Upon return to toplevel, all redirections are undone. Redirections are explicitly undone with the "resume" intrinsic. The first argument is passed to the shell (always /bin/sh), and so may be any expression that program understands. Because of this it is actually more efficient to let the shell create the pipeline for us:

    > (with_input_process "jot 100 | fmt" ...

The curl(1) utility may be used to redirect standard input onto a remote file via ftp or http:

    > (with_input_process "curl -s 'ftp://www.freebsd.org/pub/FreeBSD/README.TXT'"
    >> (for (n 1 10) (print n (char 9) (getline))))
    1       Welcome to the FreeBSD archive!
    2       -------------------------------
    3
    4       Here you will find the official releases of FreeBSD, along with
    5       the ports and packages collection and other FreeBSD-related
    6       material. For those who have World Wide Web access, we encourage
    7       you to visit the FreeBSD home page at:
    8
    9       http://www.FreeBSD.org/
    10

The munger code below is the equivalent of the shell command:

    # cmd < infile 2> errfile | cmd2 > outfile

    (redirect 0 "infile")
    (redirect 2 "errfile")
    (pipe 0 "cmd")
    (redirect 1 "outfile")
    (resume 2)
    (exec "cmd2")

The "getstring" library function will fork a local program and accumulate all the returned data from it into a single string.
Here we run curl locally to retrieve a remote file as a single string:

    > (getstring "curl -s 'http://www.mammothcheese.ca/robots.txt'")
    "User-Agent: *
    Disallow: /cgi-bin/"

If we use curl(1) as intermediary, it can handle HTTP redirects, remove the HTTP header and merge chunked responses, leaving us with just the requested resource's data. To see the HTTP header for ourselves, we can connect a socket directly to the server with the "child_open" intrinsic. The presence of a second port number argument to "child_open" tells the interpreter this is a request to connect to a network server.

    > (child_open "www.mammothcheese.ca" 80)
    1
    > (child_write "GET /robots.txt HTTP/1.0" (char 13) (char 10) (char 13) (char 10))
    1
    > (while (stringp (setq line (child_read)))
    >> (print line))
    HTTP/1.0 200 OK
    Content-Type: text/plain; charset=utf-8
    Content-Length: 24
    Last-Modified: Tue, 12 Jun 2007 16:10:51 GMT
    Server: Drood/1.14 (FreeBSD/6.1/i386)
    Date: Sun, 02 Sep 2007 01:27:20 GMT

    User-agent: *
    Disallow:

"child_open" creates a full-duplex communication stream, while "pipe", "redirect", and the "with_input_*" and "with_output_*" macros create only unidirectional streams off of the standard descriptors. Only one full-duplex connection can be active at any time, but it exists independently of whatever source the standard descriptors are connected to. With a port number argument of 0, "child_open" attempts to connect to another process listening on a UNIX domain socket. Without a port number argument "child_open" forks a local program:

    > (child_open "munger")
    1
    > (child_write "(setq foobar 43)")
    1
    > (chomp (child_read))
    "43"
    > (child_close)
    1

TEXT BUFFERS

Munger provides line-oriented buffers for storing large amounts of text. A buffer must be opened before it can be used, with the "open" intrinsic, and may be closed with the "close" intrinsic. A whole number is returned by "open" upon success, called the buffer number of the opened buffer.
Only one buffer, out of those currently open, can be active at any time, and is called the current buffer. Each call to "open" creates a new buffer and makes the new buffer the current buffer. The buffer number of an open buffer may be passed as the argument to the "switch" intrinsic to make that buffer the current buffer.

A number of intrinsics are provided which act upon the current buffer, to insert, retrieve, and delete lines, to report the number of words and lines in the buffer, and to write and read buffer lines to and from files and programs. Lines may be retrieved whole, or in slices with tabs expanded. A range of buffer lines may be processed through an external program with the "filter" intrinsic, while the "find" intrinsic may be used to search through the current buffer to find non-overlapping matches on regular expressions. A range of lines may be copied from one buffer to another with the "transfer" intrinsic. There are also intrinsics to set and find bookmarks in buffers.

The full list of buffer intrinsics can be found later in this document at the top of the LANGUAGE REFERENCE section, under the subheading Buffer Operations. Each is fully described in its own particular entry.

    ; Loading a buffer from a file:

    > (open)
    0
    > (read 0 "README")
    38 ; Number of lines read.
    > (for (a 1 5) (print (retrieve a)))
    Munger
    ======

    Munger is a simple, statically-scoped, interpreted lisp that has
    line-editor-like access to multiple text buffers, for use on the FreeBSD

    ; Loading a buffer from a process:

    > (empty)
    1
    > (input 0 "ls")
    20
    > (for (a 1 (lastline)) (print (retrieve a)))
    LICENSE
    Makefile
    README
    cat.munger
    client.munger
    err.munger
    cgi.munger
    fmt.munger
    grep.munger
    intrinsics.c
    library.munger
    lisp.c
    lisp.h
    options.munger
    munger.man
    transform.munger

    ; Filtering buffer content through a process:

    > (filter 1 (lastline) "fmt")
    3 ; Number of lines received back from filter.
    > (for (a 1 (lastline)) (print (retrieve a)))
    LICENSE Makefile README cat.munger client.munger err.munger
    cgi.munger fmt.munger grep.munger intrinsics.c library.munger
    lisp.c lisp.h options.munger munger.man transform.munger

    ; Loading a buffer from a remote file using curl(1):

    > (empty)
    1
    > (input 0 "curl -s 'http://www.mammothcheese.ca/index.html'")
    296 ; Number of lines read.

    ; Finding the location of a match on a regular expression:

    > (setq rx (regcomp "<body[^>]*>"))
    <REGEX#1>
    > (find 1 1 0 rx 0)
    (27 3 6)
    > (slice 27 3 6 1 0)
    "<body>"

    ; Filtering a buffer through an HTTP server:

    > (empty)
    1
    > (insert 1 (concat "GET /Slashdot/slashdot HTTP/1.0" (char 13) (char 10)) 0)
    1
    > (insert 2 (concat (char 13) (char 10)) 0)
    1
    > (filter_server 1 2 "rss.slashdot.org" 80)
    272

    ; Get rid of the HTTP header and condense a possibly chunked response body:

    > (remove_http_stuff)
    0

    ; Filter the XML document left in the buffer through the xml2alist example
    ; program:

    > (filter 1 (lastline) (join "/" (libdir) "xml2alist.munger"))
    264

    ; The buffer now contains a lisp representation of the XML we can use
    ; the xmlquery.munger example module with:

    > (load (join "/" (libdir) "xmlquery.munger"))
    <CLOSURE#24>

    ; Evaluate the buffer content as lisp:

    > (eval_buffer)
    [ Converted document scrolls by dramatically! ]

    ; Make a query. Let's see the cdata content of the title elements of the
    ; RSS feed:

    > (dynamic_let (document (get_elements "item" "document" 1 "rdf:RDF" 1))
    >> (while document
    >>> (println (get_cdata "item" 1 "title" 1))
    >>> (setq document (cdr document))))
    Unrefined "Musician" Gains a Global Audience
    Open Source Laser Business Opens In New York
    OpenOffice.org 2.1 Released With New Templates
    Texas Lawmaker Wants To Let the Blind Hunt
    Designer Glasses With Microdisplay Unveiled
    Arctic Ice May Melt By 2040
    DIY Service Pack For Windows 2000/XP/2003
    Sea Snail Toxin Offers Promise For Pain
    A Press Junket To Redmond
    The Dutch Kill Analog TV Nationwide
    Google Web Toolkit Now 100% Open Source
    Novell and Microsoft Claim Customer Support
    Wikipedia Founder to Give Away Web Hosting
    How Craigslist is Keeping up Internet Ideals
    Norman & Spolsky - Simplicity is Out

REGULAR EXPRESSIONS

Munger provides five intrinsic functions for working with extended regular expressions. All regular expressions must be compiled with the "regcomp" intrinsic before they may be used by the other regular-expression wielding intrinsics. The "match" intrinsic returns a list of two character indices describing a match. The "matches" intrinsic returns a list of the text matched by the regular expression and up to 20 parenthesized subexpressions. The "substitute" intrinsic is modelled after the substitute command from the vi/ex editor, and may be used to perform regular-expression-based substitution operations upon strings. The "replace" library function allows snippets of lisp to be used for the replacement expression, allowing replacement text to be dynamically generated for each match in a string. Lastly, the "find" intrinsic may be used to find the location of matches against a particular regular expression in the text in the current buffer.

    > (set 'rx (regcomp "munger"))
    <REGEXP#1>
    > (set 's "/usr/local/share/munger/library.munger")
    > (match rx s)
    (17 23)

    ; Text before match:

    > (substring s 0 17)
    "/usr/local/share/"

    ; Text of match:

    > (substring s 17 (- 23 17))
    "munger"

    ; Text after match:

    > (substring s 23 0)
    "/library.munger"

    > (set 'rx (regcomp "^/?([^/]+/)*([^/]+)"))
    <REGEXP#2>

    ; Full match and matched subexpressions:

    > (matches rx s)
    ("/usr/local/share/munger/library.munger" "munger/" "library.munger" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")

    ; Ex-like substitution:

    > (substitute rx "\U\2" "/usr/local/bin/munger" 1)
    "MUNGER"

    ; Dynamically-generated replacement strings for each match:

    (let ((r0 (regcomp "%([0-9A-Fa-f][0-9A-Fa-f])"))
          (r1 (regcomp "\+")))
       (defun decode (str)
          (replace r0 (char (hex2dec m1))
             (substitute r1 " " str 0))))

    > (decode "foobar+tooley%7Efunbag%24%25%26")
    "foobar tooley~funbag$%&"

FUNCTIONS

The elements of function or macro applications are evaluated from left to right, with the function position being evaluated in the same environment as the succeeding positions. The function position must be occupied by an expression that evaluates to an intrinsic function object or a closure. Lambda- and macro-expressions evaluate to closure objects and are closed over the local bindings visible at that time:

    > (let ((a 10))
    >> (defun booger () a))
    <CLOSURE#32>
    > (booger)
    10

With static scoping, the local bindings visible to a function are limited by the block structure in place at the time the function is created, but variable extent is unlimited, so the function "booger"'s reference to the enclosing block's variable "a" is still valid after the enclosing block has returned. The binding is said to be "closed" within "booger" and is no longer visible anywhere else.

In the example above, the invocation of "defun" creates a toplevel binding, and not a new locally-visible binding, as an invocation of "define" would do in this position in Scheme.
Locally-visible bindings can only be created by function application ("let" is syntactic sugar for the application of an anonymous function), or by the "extend" intrinsic. If the variable "booger" were a local variable in an enclosing scope, its binding would have been modified, creating a locally-visible function, but in this case, since no local binding exists, a new toplevel binding is created.

When a lambda-expression is evaluated it is closed over the environment in which it is embedded to become a closure. This means free variables visible in containing functions remain accessible. Closures can be used to create shared lexical environments for functions which need to communicate with each other, or they can be used to create static bindings which persist between invocations of a function, much like local variables in C declared with the "static" storage class. Because closures perform encapsulation they can be used to simulate objects in the message-passing style of object programming. Closures can also be used to capture state for use in the continuation-passing style of programming. A full discussion of these topics is beyond the scope of this manual page.

Functions may be defined to accept variable-length argument lists. To specify that a function accepts a varying number of arguments, a final parameter to reference the optional arguments is enclosed in parentheses in the parameter list of the function definition. For example,

    (lambda ((a)) (print a))

defines a function which accepts zero or more arguments. Upon invocation, all arguments passed to the function will be collected into a list and bound to "a" inside the function body. If no arguments are passed to the function, "a" will be bound to the empty list.
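
As a minimal sketch of two points made above (it is not taken from the Munger distribution): a closure holding a binding which persists between invocations, and a function with a variable-length argument list. The names "next_id" and "count_args" are hypothetical, and the sketch assumes that "defun" accepts the same parenthesized final parameter that "lambda" does.

```lisp
; Hypothetical example; next_id and count_args are not part of Munger.

; "count" is closed over by next_id and persists between calls,
; much like a C static local.  The defun creates a toplevel binding.
(let ((count 0))
   (defun next_id ()
      (setq count (+ count 1))))

; All arguments are collected into the list bound to "args".
; Assumes defun accepts the same parameter syntax as lambda.
(defun count_args ((args))
   (extend 'n 0)
   (while args
      (setq n (+ n 1))
      (setq args (cdr args)))
   n)
```

Going by the rules described above, successive calls to (next_id) should return 1, then 2, and so on; (count_args) should return 0, because "args" is bound to the empty list, which counts as false in the "while" test; and (count_args "a" "b" "c") should return 3.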
The function below accepts two or more arguments, with any arguments after the second argument collected into a list bound to "c" in the function body:

    (lambda (a b (c)) (print a b c))

The "labels" intrinsic facilitates the creation of local function bindings, where each binding is visible inside each function definition. Recursive and mutually-recursive temporary functions therefore may be created with "labels".

    > (labels ((even (lambda (n) (or (eq n 0) (tailcall odd (- n 1)))))
    >>> (odd (lambda (n) (and (not (eq n 0)) (tailcall even (- n 1))))))
    >> (print (even 11))
    >> (newline)
    >> (print (even 12))
    >> (newline))
    0
    1
    1 ; this is the return value of the last (newline).

There are "let", "letn", and "letf" macros to facilitate the creation of new lexical bindings. "letn" corresponds to let* in other lisp dialects, while "letf" corresponds to a named let in Scheme or flet in lisp. It may not be necessary to use these macros to create new lexical bindings, depending upon the needs of a particular situation. If the interpreter is at toplevel, then one of these macros must be used, or a closure applied, to create a lexical environment. Inside a lexical environment, however, new bindings may be dynamically created with the "extend" intrinsic. These bindings have unlimited extent just as all lexical bindings do, and so may be safely closed-over by closures.

    > (lambda (a)
    >> (extend 'b (* a a))
    >> (lambda () b))

If the programmer wishes to limit the extent of the new binding, the invocation of "extend" and the expressions which use the new binding may be wrapped in a "dynamic_extent" intrinsic:

    > (lambda (a)
    >> (extend 'f (lambda () b))
    >> (dynamic_extent
    >>> (extend 'b (* a a))
    >>> (print (f))) ; Works here inside "dynamic_extent"
                     ; even though f was closed before "extend" was invoked.
                     ; b suddenly "pops-up" into the lexical environment.
    >> (f)) ; ERROR: b is no longer extant here.
Tail recursion is not recognized by the interpreter, but must be requested explicitly by the "tailcall" intrinsic. This mechanism allows even anonymous functions to perform tail recursion. See the entry for "tailcall" later in this document for more details.

    > (let ((n 10)
    >>> (a 1))
    >> (if (< n 2)
    >>> a
    >>> (tailcall 0 (- n 1) (* a n))))
    3628800

MACROS

Macros allow us to define syntactic transformations. With macros we can create new special operators for the interpreter in lisp. A macro is simply a function which receives its arguments unevaluated, and instead has its return expression evaluated. In other words, the purpose of a macro is to compose a new expression out of its arguments, which the interpreter then evaluates. Despite this simple definition, the use of macros can become confusingly complex. With macros, we write lisp which writes the lisp to be ultimately evaluated. A simple example is the "quit" macro defined in library.munger:

    (set 'quit (macro () '(exit 0)))

We can see how a macro will expand with the "test" intrinsic:

    > (test (quit))
    (exit 0)

Macros typically consist of a template expression into which we plug sub-expressions to produce the expression which is finally returned by the macro. The "qquote" (quasiquote) macro aids in the defining of other macros, by allowing us to use the template paradigm directly. This greatly simplifies complex macro definitions. The "qquote" macro is similar to the "quote" intrinsic in that it prevents evaluation of its argument, but the quoting can be selectively turned off for sub-expressions by prefacing them with a comma (",") character. Here is the definition of the "with_input_file" macro from library.munger:

    (set 'with_input_file
       (macro (file (code))
          (qquote
             (when (> (redirect 0 ,file) 0)
                (protect ,(cons 'progn code)
                   (resume 0))))))

The sub-expressions prefaced by commas are evaluated and the results of the evaluations are inserted into the template expression.
We can see how the macro will expand with the "test" intrinsic: > (test (with_input_file "library.munger" (getline))) (when (> (redirect 0 "library.munger") 0) (protect (progn (getline)) (resume 0))) There is a "defmac" macro in library.munger which allows macros to be created with the following syntax: (defmac with_input_file (file (code)) (qquote (when (> (redirect 0 ,file) 0) (protect ,(cons 'progn code) (resume 0))))) PROGRAMMING STYLE Programs written in an imperative style will always out-perform programs written in a functional style because Munger is a naive interpreter. A Scheme programmer might be tempted to write a factorial function in Munger like this: (defun fact (n) (labels ((f (lambda (n a) (if (< n 2) a (f (- n 1) (* n a)))))) (f n 1))) This function will work as expected in Munger, but there are some improvements we can make to it increase its efficiency. First of all, the recursive invocation of "f" will not be recognized as a tail-call by the Munger interpreter, so we will use use the "tailcall" intrinsic to make this call. Secondly, "tailcall" does not need to "see" a binding in order to make a recursive call. It can restart the currently-running function internally. This means we can use a "let" to make the binding to "f": (defun fact (n) (let ((f (lambda (n a) (if (< n 2) a (tailcall 0 (- n 1) (* n a)))))) (f n 1))) We can simplify this further. We don't actually need to use "let" to create new local bindings, unless the interpreter is at toplevel, where no local lexical environment exists. The "extend" intrinsic will extend the current local environment with a new binding more efficiently, because "let" is a macro internal to the interpreter, which must be expanded and re-evaluated, resulting in another function application. If we wish to limit the extent of the bindings introduced by "extend" we can wrap expressions with the "dynamic_extent" intrinsic. 
Replacing the "let" with "extend" gives us: (defun fact (n) (extend 'f (lambda (n a) (if (< n 2) a (tailcall 0 (- n 1) (* n a))))) (f n 1)) We don't use it here, but the binding to "f" is actually visible inside the function body. Although the function is closed over the environment before the environment is extended, the binding will nonetheless be visible when the closure is applied, because closures do not simply close over visible bindings, but rather the environments which contain them. At the time the closure is applied, the environment will have been dynamically extended to make "f" visible. Using function calls to iterate will always result in slow programs in Munger. The only reason we are introducing the helper function is to create a new binding to "a" to act as an accumulator, but we can use "extend" to do that directly, and replace the closure with a loop: (defun fact (n) (extend 'a 1) (while (> n 1) (setq a (* n a)) (dec n)) a) This function is very different than the factorial we started with and will run much more efficiently in the Munger interpreter. But there is one last change we can make to further improve efficiency. The most efficient way to iterate in Munger is to use the "iterate" intrinsic. It can be used when the number of iterations can be calculated before entering the body of the loop: (defun fact (n) (extend 'a 1) (iterate n (setq a (* a n)) (dec n)) a) This is the fastest form of this function.
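The step from an explicit tail call to a plain loop can be mirrored in Python. The following is a minimal sketch, not Munger code. The TailCall class and the run driver are hypothetical names, and the sketch assumes only what the text above states: that (tailcall 0 ...) restarts the currently-running function with new arguments instead of making a nested call.

```python
class TailCall:
    """Hypothetical marker meaning: restart the current function."""

    def __init__(self, *args):
        self.args = args


def run(function, *args):
    # Driver loop: re-enter the function for as long as it asks to be
    # restarted, so the call stack never grows.
    result = function(*args)
    while isinstance(result, TailCall):
        result = function(*result.args)
    return result


def fact_step(n, a):
    # Mirrors (if (< n 2) a (tailcall 0 (- n 1) (* n a))).
    if n < 2:
        return a
    return TailCall(n - 1, n * a)


def fact_loop(n):
    # Mirrors the (while (> n 1) ...) version with an accumulator.
    a = 1
    while n > 1:
        a = n * a
        n -= 1
    return a


print(run(fact_step, 10, 1), fact_loop(10))
```

Both forms compute the same value. The loop avoids one function application per step, which is the cost the text above attributes to iterating with function calls.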

RETURN VALUES

Munger returns 1 if the interpreter stopped due to an error, 0 otherwise.

AUTHORS

James Bailie <jimmy@mammothcheese.ca> http://www.mammothcheese.ca

LANGUAGE REFERENCE

Do not assume you know what an intrinsic does because it has a similar name to an intrinsic of another Lisp dialect, or you may have some nasty surprises working in Munger. The following functions are either built-in to the interpreter, or are library functions or macros defined in library.munger. To find the entry for a particular item, search this document for the name of the item followed by a colon.

List Operations: cons, car, cdr, list, length, caar, cdar, cadr, cddr, cdddr, cddddr, caddr, cadddr, caddddr, append, alist_lookup, alist_remove, alist_replace, reverse, sort, sortlist, sortcar, mapcar, foreach, remove, nthcdr, nth, member, map

Predicates and Conditionals: eq, atomp, if, when, unless, and, or, not, nullp, boundp, pairp, equal, <, <=, >, >=, stringp, fixnump, symbolp, regexpp, tablep, stackp, intrinsicp, closurep, macrop, recordp

Assignment: set, setq, inc, dec, defun, defmac

Evaluation and Control Flow: progn, throw, while, until, do, catch, continue, main, eval, quote, load, gensym, version, test, let, letn, labels, cond, apply, exit, quit, interact, fatal, nofatal, extract, qquote, protect, letf, tailcall, prog1, printer, noprinter, die, dynamic_let, extend, gc, for, iterate, loop, dynamic_extent, gc_freq, case, eval_string, blind_eval_string, eval_buffer

Fixnum Arithmetic: +, -, *, /, %, abs, random, negate, unsigned

Type Conversions: stringify, digitize, intern, char, code, hex2dec, dec2hex

Buffer Operations: open, close, insert, delete, retrieve, lastline, filter, write, read, empty, slice, find, input, output, words, maxidx, buffer, buffers, switch, transfer, setmark, getmark, with_buffer, filter_server, remove_http_stuff

Regular Expressions: regcomp, match, matches, substitute, replace

String Operations: split, join, expand, substring, concat, chop, chomp, upcase, downcase, length, reverse, strcmp, split_rx, tokenize, rootname, suffix, explode, base64_encode, base64_decode, form_encode, form_decode

Filesystem Operations: chdir, libdir, directory, unlink, rmdir, pwd, exists, stat, rename, seek, mkdir, complete, realpath, access, truncate, redirect, resume, chown, chmod, basename, readlock, writelock, unlock, dirname, symlink

Command-Line Arguments: current, next, prev, rewind

Line-Oriented I/O: print, println, warn, with_input_file, with_output_file, with_output_file_appending, with_input_process, with_output_process, with_error_file, with_error_file_appending, with_temporary_output_file, foreach_line, foreach_line_callback, pipe, newline, redirect, resume, stderr2stdout, getline, rescan_path, stdout2stderr, flush_stdout, getline_ub, reset_history, save_history, load_history

Network daemon-related: listen, listen_unix, stop_listening, accept, daemonize, syslog, getpeername, receive_descriptors, send_descriptors, busymap, nobusymap, busy, notbusy, get_scgi_header, busyp

System Access: system, getenv, setenv, block, unblock, suspend, lines, cols, date, time, beep, checkpass, crypt, setuid, getuid, setgid, geteuid, seteuid, getgid, hostname, gecos, timediff, timethen, date2days, days2date, date2time, localtime, utctime, week, weekday, month, getpid, getppid, setpgid, getpgrp, tcsetpgrp, tcgetpgrp, kill, killpg, fork, glob, wait, zombies, nozombies, zombiesp, exec, shexec, forkpipe, command_lookup, unsetenv, getstring, datethen, chroot, isatty, sle

Tables: table, hash, unhash, lookup, keys, values

Stacks: stack, push, pop, index, store, used, topidx, assign, flatten, shift, clear, unshift

Records: record, getfield, setfield

Communication with a child process: child_open, child_write, child_read, child_close, child_running, child_ready, child_wait, child_eof

SQLite Interface: sqlite_open, sqlite_close, sqlite_exec, sqlite_prepare, sqlite_step, sqlite_finalize, sqlite_reset, sqlite_row, sqlite_bind, sqlp, sqlitep

Character-Oriented I/O: display, clearscreen, pause, clearline, getchar, goto, scrolldn, scrollup, hide, show, pushback, insertln, getchars, fg_black, fg_red, fg_green, fg_yellow, fg_blue, fg_magenta, fg_cyan,
fg_white, bg_black, bg_red, bg_green, bg_yellow, bg_blue, bg_magenta, bg_cyan, bg_white, boldface, normal

A description of each follows:

cons: (cons expr1 expr2)

Intrinsic "cons" adds an element to the beginning of a list. Expr2 must evaluate to a list.

    > (cons 'a '(b c))
    (a b c)

car: (car expr1)

Intrinsic "car" returns the first element of a list. An error is generated if expr1 does not evaluate to a list.

    > (car '(a b c))
    a

cdr: (cdr expr1)

Intrinsic "cdr" returns the sublist of a list, beginning from the second element of the list. An error is generated if expr1 does not evaluate to a list.

    > (cdr '(a b c))
    (b c)

boundp: (boundp expr1)

The "boundp" intrinsic accepts one argument which must evaluate to a symbol. The function returns 1 if the symbol is currently bound, 0 otherwise.

caar, cadr, cdar, caddr, cadddr, caddddr, cddr, cdddr, cddddr: (form expr)

These library functions take the same argument as the car and cdr intrinsics and are built out of nested groupings of those intrinsics.

    > (caar '((a) b c)) ; is equivalent to: (car (car '((a) b c)))
    a
    > (cadr '(a b c)) ; is equivalent to: (car (cdr '(a b c)))
    b
    > (cdar '((a) b c)) ; is equivalent to: (cdr (car '((a) b c)))
    ()
    > (cddr '(a b c)) ; is equivalent to: (cdr (cdr '(a b c)))
    (c)

etc.

eq: (eq expr1 expr2)

Intrinsic "eq" returns 1 if expr1 and expr2 evaluate to the same atom token, to atoms representing equivalent numbers, or to the exact same list, otherwise it returns 0.

    > (eq 'a 'a)
    1
    > (eq '(a b c) '(a b c))
    0
    > (set 'l '(a b c))
    (a b c)
    > (eq l l)
    1
    > (eq 0001 1)
    1

equal: (equal expr1 expr2)

Library function "equal" returns 1 in all the situations where "eq" returns 1, and additionally will return 1 if both of its arguments evaluate to lists having the same structure and content. While "eq" will fail if its arguments evaluate to different lists with the same structure and content, "equal" will return 1 for these arguments.

    > (equal '(a b c) '(a b c))
    1
    > (set 'l '(a b c))
    (a b c)
    > (equal l l)
    1
    > (equal '(00 01 02) '(0 1 2))
    1

atomp: (atomp expr)

Intrinsic "atomp" returns 1 if its argument evaluates to an atom, and returns 0 otherwise.

    > (atomp 'a)
    1
    > (atomp '(a b c))
    0

set: (set expr1 expr2)

Intrinsic "set" accepts two arguments. The result of evaluating the second argument is bound to the result of evaluating the first argument. The first argument must evaluate to a symbol. If a local variable with the syntax of the symbol exists, its binding will be modified, otherwise "set" will create or modify a toplevel binding. Locals can only be created by function application, or with the "extend" intrinsic.

    > (set 's '(a b c))
    (a b c)
    > s
    (a b c)
    > (set (car '(a b c)) ((lambda (x) (* x x)) 4))
    16
    > a
    16

setq: (setq symbol expr)

The "setq" intrinsic works similarly to the "set" intrinsic, except the first argument to the function is not evaluated and must be a symbol.

    (setq a b)

is equivalent to:

    (set 'a b)

eval_buffer: (eval_buffer)

The "eval_buffer" intrinsic evaluates Munger lisp in the current buffer. The function accepts no arguments. The buffer is evaluated in the current lexical context. Recursive invocations of "eval_buffer" are not permitted, which is to say the code in the current buffer (or any other code it invokes) may not itself invoke "eval_buffer" while "eval_buffer" is running. If the code in the current buffer messes with itself by altering the content of the current buffer, disaster may result; however, the code in the current buffer may open new buffers as it likes without fear. The "eval_buffer" function will continue to parse the code in the buffer which was current when it was invoked. The function returns the result of evaluating the last expression in the buffer upon success, or 0 if a recursive invocation is attempted or if the current buffer is empty. Any errors encountered during evaluation of the code in the buffer will stop evaluation.

eval_string: (eval_string expr)
blind_eval_string: (blind_eval_string expr)

Both "eval_string" and "blind_eval_string" accept one argument which must evaluate to a string, and attempt to execute the string as lisp. Any error encountered will stop evaluation of the string, but the interpreter will attempt to carry on interpreting the rest of your program. Your program may be in a "messed-up" state from the badly-behaved code parsed from the string, which may cause the interpreter to encounter another error which stops evaluation when it attempts to continue interpreting your program. Both functions return the result of evaluating the last successfully-evaluated expression parsed from the string. If no expressions are successfully evaluated, then the original string will be returned.

The difference between the two functions is that with "eval_string" the code parsed from the string is evaluated in the current lexical context, while with "blind_eval_string" only the global environment is visible to the code in the string. The current lexical environment is invisible to the string.

Care must be taken when specifying recursive invocations of "eval_string" literally in program text. Consider this example:

    (eval_string "(setq foobar (eval_string \"(join \\\":\\\" \\\"b\\\")\"))")

The string-within-a-string, which is the argument to the recursive invocation, must have its delimiting quotes escaped with backslashes in order for them to be embedded in the toplevel string, and not end it. Then the strings-within-a-string-within-a-string, which are the arguments to the recursive invocation's argument string's invocation of "join", must be double-escaped with three backslashes. The backslashes closest to the quotes escape the quotes in the toplevel string, while the double backslashes before the escaped quotes embed backslashes into the toplevel string which will be interpreted as escaping the quotes during the recursive invocation of the lisp reader.

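The three levels of quoting in the example above can be traced mechanically. The following is a minimal Python sketch, not Munger code. The helper read_string_literal is hypothetical, and it assumes only the rule the passage describes: inside a string, a backslash makes the next character literal. Each pass stands for one invocation of the lisp reader.

```python
def read_string_literal(text, start):
    """Return (contents, end_index) of the quoted string at text[start].

    Assumption: a backslash makes the following character literal, so an
    escaped quote does not end the string and an escaped backslash
    yields a single backslash.
    """
    assert text[start] == '"'
    out = []
    i = start + 1
    while text[i] != '"':
        if text[i] == "\\":
            i += 1  # drop the backslash, keep the character after it
        out.append(text[i])
        i += 1
    return "".join(out), i + 1


# The program text as typed in the example (a raw Python string, so
# Python itself leaves the backslashes alone).
source = r'''(eval_string "(setq foobar (eval_string \"(join \\\":\\\" \\\"b\\\")\"))")'''

level1, _ = read_string_literal(source, source.index('"'))
level2, _ = read_string_literal(level1, level1.index('"'))
level3, _ = read_string_literal(level2, level2.index('"'))

print(level1)
print(level2)
print(level3)
```

The first pass yields the argument of the outer "eval_string", the second yields the argument of the inner one, and the third yields the first argument passed to "join".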
Invoking "eval_string" is similar to invoking "load" except the code is extracted from a string instead of from a file. If an expression is not complete within the string, it will be discarded.

eval: (eval expr)

Intrinsic "eval" returns the result of evaluating its lone argument twice, which is to say, the argument is evaluated as usual, and the result of this evaluation is evaluated again. Note "eval" does not contain the lisp reader. Calling "eval" with a string argument will only cause the original string to be returned, since strings are constants in Munger.

    > (set 'a (quote (set 'b 'booger)))
    (set (quote b) (quote booger))
    > b
    evaluate: b has no value.
    > (eval a)
    booger
    > b
    booger

quote: (quote expr) or 'expr

Intrinsic "quote" returns its argument unevaluated. It is used so frequently it has an abbreviated form of a single apostrophe.

    > (quote (a b c))
    (a b c)
    > '(a b c)
    (a b c)

protect: (protect expr expr ...)

The "protect" intrinsic is analogous to unwind-protect in other lisp dialects. The function accepts one or more arguments and evaluates them in sequence, but returns the value of evaluating the first argument. The arguments subsequent to the first are evaluated EVEN IF THE EVALUATION OF THE FIRST IS INTERRUPTED by the interpreter encountering an error.

    > (catch
    >> (protect (throw 0)
    >>> (print 'booger)
    >>> (newline)))
    booger
    0
    >

qquote: (qquote expr)

The "qquote" (quasiquote) macro accepts a list and returns it unchanged except where sub-expressions have been "escaped" with comma characters. Those escaped sub-expressions are evaluated and the result of the evaluation inserted into the template expression. "qquote" aids in the composition of macros. Some lisps define a read macro "`" as a short form for "qquote", but Munger does not support this. The commas escaping sub-expressions are separate tokens in Munger, unlike in other lisps. For example ,token is actually parsed by Munger as two tokens, "," and "token". This means ',token parses to "(quote ,) token" and not "(quote ,token)".

if: (if expr1 expr2 [expr3...])

Intrinsic "if" is a conditional. It accepts two or more arguments, the first being the test condition. The test condition is evaluated, and if the result is a true value (anything except 0 (fixnum), the empty string, or the empty list), the second argument is evaluated and the result of that evaluation is returned. Otherwise, if the test condition evaluated to a false value, and further expressions are present after the second, those expressions are evaluated in order and the result of the last expression evaluated is returned. If only two expressions are present in the original form, and the test condition evaluates to a false value, the result of evaluating the test expression is returned.

    > (if (> 3 4) 'yes 'no)
    no
    > (if (> 3 4) 'yes)
    0

and: (and expr1 [expr2 ...])

Intrinsic "and" accepts one or more arguments, and evaluates them from left to right until an argument evaluates to a "false" value (0, the empty string, or the empty list), or the end of the arguments is reached. The value of the last evaluation is returned.

    > (and 1 'a 0)
    0
    > (and 1 'a "string")
    1

or: (or expr1 [expr2 ...])

Intrinsic "or" accepts one or more arguments, and evaluates them from left to right until an argument evaluates to a true value (anything other than 0, the empty string, or the empty list), or the end of the arguments is reached. The value of the last evaluation is returned.

    > (or 1 0)
    1
    > (or 0 0)
    0

list: (list expr1 [expr2 ...])

Intrinsic "list" accepts one or more arguments, evaluates them in order, and returns a newly-constructed list containing the result of each evaluation, also in order.

    > (list 'a 'b 'c) ; is equivalent to (cons 'a (cons 'b (cons 'c ())))
    (a b c)
    > (list '(a b c) 4 "hello")
    ((a b c) 4 "hello")

progn: (progn expr...)

Intrinsic "progn" accepts one or more arguments, and evaluates them in order, returning the result of evaluating the last argument. If no arguments are passed to "progn" an error is generated which will stop evaluation.

    > (progn
    >> (set 'f (lambda (n) (+ n 1)))
    >> (set 'x 1)
    >> (f x))
    2

prog1: (prog1 ...)

Library macro "prog1" accepts zero or more arguments, and evaluates them in order, returning the result of evaluating the first argument. If no arguments are passed to the macro, it returns the empty list.

    > (prog1
    >> (+ 2 2)
    >> (+ 3 3))
    4

not: (not expr)

Intrinsic "not" returns 1 if its argument is the empty list, the empty string, or zero, otherwise it returns 0.

    > (not 0)
    1
    > (not "hello")
    0

nullp: (nullp expr)

Library function "nullp" returns 1 if its argument is the empty list, otherwise it returns 0.

    > (nullp ())
    1
    > (nullp 'a)
    0

pairp: (pairp expr)

Library function "pairp" returns 1 if its argument is a non-empty list, otherwise it returns 0.

    > (pairp '(a b c))
    1
    > (pairp ())
    0
    > (pairp 'a)
    0

warn: (warn expr [expr...])

The "warn" intrinsic evaluates its arguments, then writes the value of each evaluation, followed by one final newline character, to the standard error stream. The "warn" intrinsic always returns 1.

setenv: (setenv expr1 expr2)

The "setenv" intrinsic sets the value of a named environment variable. The function accepts two arguments, which must both evaluate to strings. The first argument is the environment variable to set, and the second is the new value to bind to the variable. Any errors encountered will stop evaluation. Upon success, "setenv" returns 1.

unsetenv: (unsetenv expr)

The "unsetenv" intrinsic accepts one argument which must evaluate to a string and removes any environment variable named by the string from the environment. The function always returns 1.

getenv: (getenv expr)

The "getenv" intrinsic looks up the value of an environment variable. It accepts one argument which must evaluate to a string. If the environment variable specified exists, a string is returned representing the variable's value. If the variable does not exist, 0 is returned.

    > (getenv "HOME")
    "/usr/home/jbailie"
    > (getenv "foobar")
    0

directory: (directory expr)

The "directory" intrinsic returns a list of strings representing the filenames of the entries in a specified directory, or a string representing the error encountered by the opendir() system call. The function does not return the ".." and "." directory entries in its result set. "directory" accepts one argument, which must evaluate to a string, specifying the directory to list.

    > (directory "/usr/local")
    ("man" "bin" "share" "include" "lib" "etc" "info" "libexec" "sbin" "libdata")
    > (directory "/foobar")
    "No such file or directory"

chomp: (chomp expr)

The "chomp" intrinsic removes all contiguous terminating carriage return and newline characters from a string. The function accepts one argument which must evaluate to a string, and ALWAYS returns a new string, regardless of whether any terminators were removed from its argument or not.

    > (chomp (getline))
    hello[return]
    "hello"
    > (getline)
    hello[return]
    "hello
    "
    >

chop: (chop expr)

The "chop" intrinsic accepts one argument which must evaluate to a string, and returns a new string with the same characters as the original string but with the last character removed. If the argument string is empty "chop" does nothing.

    > (chop "hello")
    "hell"
    > (chop "")
    ""

beep: (beep)

The "beep" intrinsic accepts no arguments and will cause the device connected to standard output to beep, if it is capable of doing so. 1 is returned on success. Any error encountered stops evaluation.

suspend: (suspend)

The "suspend" intrinsic accepts no arguments and causes the interpreter to send a SIGSTOP to itself, suspending the interpreter process. See your shell's documentation on how to resume stopped jobs.

stderr2stdout: (stderr2stdout)
stdout2stderr: (stdout2stderr)

The "stderr2stdout" intrinsic connects the stderr stream to the stdout stream. The "stdout2stderr" intrinsic connects the stdout stream to the stderr stream. Both functions accept no arguments, and always return 1. The redirected standard stream may be reconnected to the stream it was previously connected to with the "resume" intrinsic. After successful invocation, the stream whose name occurs first in the name of the intrinsic is connected to the same descriptor as the stream whose name occurs second in the name of the intrinsic. So, if no previous redirections have been made, invoking "stderr2stdout" will cause stderr and stdout to both write to descriptor 1, while invoking "stdout2stderr" will cause both stdout and stderr to write to descriptor 2.

pipe: (pipe expr1 expr2)

NOTE: There are separate shortcut intrinsics to send buffer lines to, receive buffer lines from, or filter buffer lines through child processes. See the entries for the "input", "output", and "filter" intrinsics for more details.

The "pipe" intrinsic forks a specified process, piping one of the interpreter's standard descriptors to the standard input or standard output of the new process. The function accepts two arguments, the first of which must evaluate to 0, 1, or 2, and specifies the interpreter descriptor to be redirected. 0 is the interpreter's standard input, 1 is the interpreter's standard output, while 2 is the interpreter's standard error. If the user wishes to read from the new process, 0 should be passed as the first argument, but if the user wishes to write to the new process then 1 or 2 should be passed as the first argument. The second argument must evaluate to a string specifying a command line to run. The command is passed to the shell (/bin/sh) for execution, and may contain any expression that program understands. All errors will stop evaluation. Upon success "pipe" returns 1.

Note that if the specified command cannot be found, or there is some other error in the command line, the call to "pipe" will succeed, but the new shell process will immediately exit. If interpreter descriptor 0 were redirected, then the next call to "getline" will return 0, indicating EOF, but if descriptors 1 or 2 were redirected, the next "print" to the pipe will silently fail.

Redirection is undone by the "resume" intrinsic. It is not necessary to undo a redirection to perform another redirection. Successive invocations of "pipe" on the same descriptor may be made to create a pipeline. If interpretation returns to the toplevel, all redirections are undone automatically. See the entries for the "with_input_process" and "with_output_process" macros for a simple means of performing temporary redirections.

The following example is equivalent to entering "jot 100 1 | sort -n | fmt" at the shell:

    > (progn
    >> (pipe 1 "fmt")
    >> (pipe 1 "sort -n")
    >> (for (a 100 1)
    >>> (print a)
    >>> (newline)))
    1
    > 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
    26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
    48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
    70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
    92 93 94 95 96 97 98 99 100

Note the last line of output from the pipeline is a short line. This means that after all the data was written to the pipeline, the "fmt" utility blocked, reading on its stdin, waiting for enough subsequent data to fill another entire line. It is only when the "progn" returned to toplevel and the interpreter did an implicit "resume" on descriptor 1, that the pipeline processes read EOF on the pipe, and "fmt" in turn spit out the last line. The output of the pipeline and the interpreter process are mixed up because they both have their standard outputs connected to the terminal. The 1 after the lisp is the return value of the lisp expression. This is followed by the lisp prompt >, which is then followed by the output from the pipeline.

It is necessary to wrap the whole example in a "progn" to prevent the interpreter from returning to toplevel in-between subexpressions, and undoing the redirections. The child process spawned by the second invocation of "pipe" inherited the redirection of stdout by the first invocation of "pipe", which results in the creation of the pipeline. Since the command argument to "pipe" is passed to the shell, we could have used (pipe 1 "sort -n | fmt") instead and let the shell create the pipeline for us.

Note that we must create the subprocesses of the pipeline in reverse order if we are going to write to the pipeline, but in order if we are going to read from the pipeline.

    > (progn
    >> (pipe 0 "jot 100 1")
    >> (pipe 0 "fmt")
    >> (while (set 'l (getline))
    >>> (print l)))
    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
    26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
    48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
    70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
    92 93 94 95 96 97 98 99 100
    0
    >

redirect: (redirect expr1 expr2 [expr3 [expr4]])

The "redirect" intrinsic redirects one of the three standard file descriptors to a file. The function accepts two, three or four arguments, the first of which must evaluate to one of 0, 1, or 2, and specifies the descriptor to be redirected. The three possible values correspond to standard input, standard output, and standard error, respectively. The second argument must evaluate to a string specifying the filename to be opened. The third and fourth arguments are boolean flags which indicate whether the file specified by the second argument should be opened in append mode, and whether an attempt should be made to obtain a lock on the file, respectively. Both optional arguments, when present, must evaluate to fixnums.

If the third argument is not present or evaluates to 0, and the first argument evaluates to 1 or 2, then the file will be created if it does not exist, or overwritten if it does. If the third argument is present and evaluates to a non-zero fixnum, and the first argument evaluates to 1 or 2, the file will be created if it does not exist, or it will be appended to, if it already exists. The third argument is ignored if the first argument evaluates to 0.

If the fourth argument evaluates to a non-zero value, and the first argument evaluates to 0, an attempt will be made to acquire a shared lock on the file. If the fourth argument evaluates to a non-zero value, and the first argument evaluates to 1 or 2, then an attempt will be made to acquire an exclusive lock on the file.

If the function is successful, the specified file descriptor will be redirected onto the specified file, and 1 will be returned. If an attempt is made to redirect descriptor 0 to a non-existent file, -1 is returned. If an attempt is made to redirect any descriptor to a file to which the user lacks the necessary permissions, -2 is returned. If a lock cannot be acquired, -3 is returned. If the function encounters any other error from the open() system call, it will return a string describing the error. Errors from other sources will stop the interpreter.

Redirection is undone by the "resume" intrinsic. It is not necessary to undo a redirection to perform another redirection. Successive invocations of "redirect" may be made, and each redirection may be undone by calling "resume" to reconnect the specified descriptor to the stream it was connected to previously. In other words, to undo all redirections, it is necessary to call "resume" on a descriptor the same number of times one has called "redirect" on the same descriptor.

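The nesting rule for redirections behaves like a stack per descriptor. The following is a minimal Python sketch of that bookkeeping only, not Munger code and not the interpreter's implementation. The Descriptor class and its method names are hypothetical, and no files are actually opened. The return values of the modelled "resume" follow its entry in this reference: 1 if a redirection was undone, 0 if the descriptor was not redirected.

```python
class Descriptor:
    """Hypothetical model of one standard descriptor and its redirections."""

    def __init__(self, original):
        self.stack = [original]  # bottom entry is the startup target

    @property
    def target(self):
        return self.stack[-1]

    def redirect(self, filename):
        # Models a successful (redirect n filename): push the new target.
        self.stack.append(filename)
        return 1

    def resume(self):
        # Models (resume n): undo one redirection, or report 0 if the
        # descriptor is not currently redirected.
        if len(self.stack) == 1:
            return 0
        self.stack.pop()
        return 1

    def reset(self):
        # Models the return to toplevel, which undoes every redirection.
        del self.stack[1:]


stdout = Descriptor("terminal")
stdout.redirect("a.log")
stdout.redirect("b.log")

steps = []
steps.append(stdout.target)    # innermost redirection
steps.append(stdout.resume())  # undo it
steps.append(stdout.target)    # the earlier redirection is active again
steps.append(stdout.resume())
steps.append(stdout.target)    # back at the startup target
steps.append(stdout.resume())  # nothing left to undo

print(steps)
```

Two redirections require two calls to the modelled "resume" before the descriptor is back on its original target, and a third call reports that nothing was redirected.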
If the thread of execution returns to the toplevel read-eval-print loop while one or more streams are redirected, all redirections will be undone by the interpreter, with all three standard streams being reconnected to the descriptors to which they were connected when the interpreter was started.

temporary: (temporary)

The "temporary" intrinsic creates a temporary file in /tmp opened for writing, and redirects stdout onto it. The function accepts no arguments and returns the filename of the temporary file. The programmer should invoke (resume 1) when finished writing to the file, to reconnect stdout to the stream it was previously connected to, and "unlink" the file when completely finished with it.

with_temporary_output_file: (with_temporary_output_file expr...)

The "with_temporary_output_file" macro invokes "temporary" to redirect stdout onto a temporary file, evaluates a sequence of expressions, then invokes "resume" to undo the redirection to the temporary file. The macro accepts one or more arguments to be evaluated when the redirection is in place, and returns the name of the temporary file, so that further code in your program may access the temporary file. One should "unlink" the filename when one is completely finished with the file.

resume: (resume expr)

The "resume" intrinsic causes a redirected descriptor to be reconnected to the device it was connected to before the last call to "redirect" or "pipe". The function accepts one argument which must evaluate to one of 0, 1, or 2, corresponding to the descriptor to be affected. If the specified descriptor was not redirected, "resume" returns 0, otherwise it returns 1.

with_input_file: (with_input_file expr1 expr2 ...)

The "with_input_file" macro temporarily redirects standard input onto a file. The macro accepts two or more arguments, the first of which must evaluate to a string specifying the file to read from, while the rest are expressions to be evaluated while standard input is redirected. The redirection is undone after all the arguments have been evaluated. No arguments are evaluated until after the macro expands, in the calling scope. The arguments following the first argument will not be evaluated unless redirection has been successful. The macro returns the value of evaluating the last argument upon success, otherwise it returns the return value of the failed invocation of the "redirect" intrinsic. The "resume" intrinsic is called by the macro to reconnect standard input to the device it was connected to before the macro was invoked, after all the body expressions have been evaluated.

    > (with_input_file "library.munger"
    >> (print (getline)))
    ; This file contains the lisp library code for the Munger interpreter.
    1

Note that calls to "with_input_file" may be nested.

    > (with_input_file "library.munger"
    >> (print (getline))
    >> (with_input_file "INSTALL"
    >>> (print (getline)))
    >> (print (getline)))
    ; This file contains the lisp library code for the Munger interpreter.
    Munger Installation
    ; Copyright (c) 2001, 2002, 2003, James Bailie <jimmy@mammothcheese.ca>.
    1
    >

with_output_file: (with_output_file expr1 expr2 ...)

The "with_output_file" macro behaves similarly to the "with_input_file" macro except that "with_output_file" redirects standard output instead of standard input. The specified file will be overwritten, if it already exists.

    > (with_output_file "tmp" (print "hello") (newline))
    1
    > (with_input_file "tmp" (print (getline)))
    hello
    1

with_output_file_appending: (with_output_file_appending expr1 expr2 ...)

The "with_output_file_appending" macro behaves similarly to the "with_output_file" macro, except that the specified file will be opened for appending.

with_error_file: (with_error_file expr1 expr2 ...)

The "with_error_file" macro behaves similarly to the "with_output_file" macro except that "with_error_file" redirects standard error instead of standard output. The specified file will be overwritten, if it already exists.

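The redirect, evaluate, resume pattern shared by the macros above can be sketched in Python. This is a rough analogue, not Munger code: the function with_input_file below is a hypothetical Python helper that takes the body as a callable, and only the missing-file case of "redirect" (the -1 result) is modelled. The try/finally plays the role of the "protect" intrinsic in the macro's expansion, so the restore step runs even if the body fails.

```python
import os
import sys
import tempfile


def with_input_file(path, body):
    """Run body() with sys.stdin reading from path, then restore sys.stdin."""
    try:
        handle = open(path)
    except FileNotFoundError:
        return -1  # models the -1 result for a non-existent file
    saved = sys.stdin
    sys.stdin = handle  # models (redirect 0 path)
    try:
        return body()  # models (progn code...)
    finally:
        sys.stdin = saved  # models (resume 0); runs even on error
        handle.close()


# Demonstration with a throwaway file.
fd, demo_path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("first line\nsecond line\n")

stdin_before = sys.stdin
first = with_input_file(demo_path, lambda: sys.stdin.readline())
missing = with_input_file(demo_path + ".absent", lambda: sys.stdin.readline())
restored = sys.stdin is stdin_before
os.unlink(demo_path)

print(repr(first), missing, restored)
```

The body is skipped entirely when the file cannot be opened, and the failure value is returned instead, which matches the behaviour described for the macro.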
with_error_file_appending: (with_error_file_appending expr1 expr2 ...) The "with_error_file_appending" macro behaves similarly to the "with_error_file" macro, except that the specified file will be opened for appending. with_input_process: with_output_process: These two macros behave similarly to the "with_input_file" and "with_output_file" macros, the only difference being the first argument specifies a program for the appropriate descriptor to be piped to or from, instead of a file. The first argument is passed to /bin/sh for processing and may therefore contain any commands that program understands. Read the entries for with_input_file and with_output_file for more information. > (with_input_process "ls" (while (set 'l (getline)) (print l))) INSTALL LICENSE Makefile README cat.munger cgi.munger cgi.munger echo.cgi err.munger grep.munger intrinsics.c library.munger lisp.c lisp.h munger.man transform.munger foreach_line: (foreach_line expr) The "foreach_line" library macro is used to create filter-like programs. The function accepts one argument, which must evaluate to a monadic function. The monadic function is called repeatedly with each line of input. The input lines are taken from the files specified on the command line, or if no files are specified, from the standard input. Note the function passed to the macro can be an intrinsic which accepts variable length argument lists, but is happy with just one argument, such as the "print" intrinsic. The argument pointer must be positioned so that the next call to (next) will return the first of the command line arguments the user wishes processed, if any. A script would normally process any option arguments, advancing the argument pointer as it did so, and then call foreach_line. The invocation of foreach_line returns when it has processed all the specified files, or when it encounters EOF on standard input. ; Skip over the interpreter name, so that the argument pointer is now ; pointing to the script name. 
The next invocation of (next) will ; either return the next argument, or 0. (next) (foreach_line (lambda (line) [do something with line] )) (exit 0) foreach_line_callback: (foreach_line_callback expr1 expr2 (expr3)) The "foreach_line_callback" macro functions similarly to the "foreach_line" macro, but is intended for use by filters which need to reset state variables when the input source changes, and/or which wish to output the results of processing the just-completed input source. The macro accepts one or two more arguments than "foreach_line". The second argument must evaluate to a function, which itself accepts no arguments, which will be called at the end of processing each input source. The third argument is optional, and if present, is treated as a logical flag which specifies whether the callback should be invoked before or after the argument pointer is advanced. In Munger, zero (fixnum), the empty list, and the empty string are all logical "false" values; all other objects are considered "true". A "true" third argument causes the callback to be invoked after the argument pointer has been advanced. Omitting the third argument, or setting it to one of the three "false" values, causes the callback to be invoked before the argument pointer is advanced. Within the callback function, invoking (current) will return the command-line argument the argument pointer is currently pointing at. This means if you wish to access the name of the file which has just been processed in the callback, no third argument, or a "false" third argument should be passed to the macro, but if you wish to access the name of the new input source about to be processed, a "true" third value should be passed as the third argument. If the currently-running script has not been presented with any command-line arguments, then "foreach_line_callback" will read data from the standard input, and then invoke the callback function once after all data has been processed. 
The callback function should be crafted with this possibility in mind. The "grep.munger" example program in (libdir) demonstrates using the third argument. The following script does not supply a third argument, and simulates wc -l:

(next)
(set 'script (current))
(set 'count 0)
(set 'total 0)

(set 'callback
   (lambda ()
      (let ((name (if (eq script (current)) "stdin" (current))))
         (print count " " name)
         (newline)
         (set 'total (+ total count))
         (set 'count 0))))

(foreach_line_callback (lambda (x) (inc count)) callback)

(print total " total")
(newline)
(exit 0)

unlink: (unlink expr)

The "unlink" intrinsic removes a file from a directory. It accepts one argument which must evaluate to a string specifying the filename to remove. The function returns 1 upon success, or a string specifying the error encountered upon failure.

realpath: (realpath expr)

The "realpath" intrinsic resolves all symbolic links, extra / characters, and references to . and .. in a specified pathname. If the filename begins with a tilde (~), the function will attempt to expand the tilde as csh or bash does to reference a home directory. The function accepts one argument, which must evaluate to a string, and returns a string. If a directory component of the specified path does not exist, the empty string is returned.

> (realpath ".")
"/usr/home/jbailie/src/munger-4.81"
> (realpath "~/foobar")
"/usr/home/jbailie/foobar"
> (realpath "/foobar/foobar")
""

; Tilde-expansion does not occur if the tilde-expression does not reference
; a valid, readable, home directory. Here the tilde is considered to be
; the first character of a relative filename, because there is no
; ~nobody home directory on my system.

> (realpath "~nobody/foobar")
"/usr/home/jbailie/src/munger-4.81/~nobody/foobar"

access: (access expr1 expr2)

The "access" intrinsic determines whether or not the real user id the interpreter is running as has access to a specified file.
The function accepts two arguments, the first of which must evaluate to a string and specifies the filename to check access to, while the second argument must evaluate to a number and be one of 0, 1, or 2. A second argument of 0 instructs the function to check for read access to the specified file. A second argument of 1 instructs the function to check for write access to the file, and a second argument of 2 instructs the function to check for executable access to the file. The function will return 1 to indicate access is allowed, or 0 otherwise.

mkdir: (mkdir expr)

The "mkdir" intrinsic accepts one argument, which must evaluate to a string, and attempts to create a directory with the same name as the string. Upon success, "mkdir" returns 1; otherwise it returns a string describing the error encountered. If successful, the newly-created directory will have permissions 755 (possibly modified by the current umask).

rmdir: (rmdir expr)

The "rmdir" intrinsic removes a directory from the filesystem. It accepts one argument which must evaluate to a string specifying the directory name, and returns 1 upon success, or a string specifying the error encountered upon failure.

words: (words)

The "words" intrinsic accepts no arguments and returns the number of words in the buffer.

time: (time)

The "time" intrinsic accepts no arguments and returns a string representing an integer representing the number of seconds since 00:00 Jan 1, 1970 (the UNIX epoch). It was not possible to return the integer value directly when Munger's fixnums were 1 bit smaller than the word size of the machine the interpreter was running on, as time values would overflow the 31 bits of a fixnum on 32-bit machines. This limitation has been lifted and fixnums are now the full 32 bits on a 32-bit machine, but the "time" intrinsic remains unchanged for backward compatibility. Note that one may use the "digitize" intrinsic to convert the time value to a fixnum.
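The overflow described in the "time" entry can be checked with a little arithmetic. The following Python sketch is not Munger code; it assumes fixnums are signed two's-complement integers (the text does not say so explicitly), and both the helper wrap_signed and the sample time value are chosen purely for illustration.

```python
def wrap_signed(value, bits):
    """Wrap value into a signed two's-complement integer of the given width.

    Assumption: fixnums behave like two's-complement integers.
    """
    value &= (1 << bits) - 1           # keep only the low `bits` bits
    if value >= 1 << (bits - 1):       # top bit set means negative
        value -= 1 << bits
    return value

sample_time = 1124553616               # illustrative UNIX time, in seconds

max_31_bit = (1 << 30) - 1             # largest signed 31-bit value
max_32_bit = (1 << 31) - 1             # largest signed 32-bit value

print(sample_time > max_31_bit)        # True: does not fit in 31 bits
print(sample_time <= max_32_bit)       # True: fits in 32 bits
print(wrap_signed(sample_time, 31))    # -1022930032, the wrapped value
print(wrap_signed(sample_time, 32))    # 1124553616, unchanged
```

Under that assumption, a time value of this size wraps to a negative number in 31 bits but is representable in 32, which is consistent with the string return value having been a workaround for the narrower fixnums.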
timediff: (timediff expr1 expr2) The "timediff" intrinsic accepts two arguments which both must evaluate to strings specifying a time value expressed in seconds from the UNIX epoch (00:00 Jan 1, 1970), such as returned by the "time" intrinsic, and returns the difference in seconds between the two times, as a number. If the time difference cannot be expressed within the word-size of the machine, minus 1-bit, then fixnum wrap-around will occur and the returned value will not be accurate. The "unsigned" intrinsic will return a string representing the correct time value. timethen: (timethen expr) The "timethen" intrinsic accepts one argument which must evaluate to a fixnum and returns a string representing the current time in seconds from the UNIX epoch (00:00 Jan 1, 1970), offset by that many seconds. This function can be used to determine the UNIX time value for a time in the future or the past. For example, the following code returns the UNIX time value for a time 1 hour in the past: > (timethen -3600) "1124553616" date2days: (date2days expr1 expr2 expr3) The "date2days" intrinsic accepts three arguments, each of which must evaluate to a fixnum, specifying a year, month, and a day of the month, and returns a fixnum representing the number of days this date is removed from March 1st, 1 BCE. This value can be useful in performing calendar arithmetic. The first number indicates the year and must be greater than or equal to 0, (1 BCE). The second number indicates the month and must be in the range 1-12. The third number indicates the day of the month and must be in the range 1-31, or 1-30, depending on the month specified. days2date: (days2date expr) The "days2date" intrinsic accepts a day number fixnum returned by "date2days" and converts it into a list of three fixnums representing that date. The first number indicates the year and will be greater than or equal to 0, (1 BCE). The second number indicates the month and will be in the range 1-12. 
The third number indicates the day of the month and will be in the range 1-31.

week: (week expr)

The "week" intrinsic accepts one argument which must evaluate to a fixnum, representing a day number as returned by "date2days", and returns a two-element list of fixnums representing the year and the week number in that year, respectively, in which the specified day occurs.

weekday: (weekday expr)

The "weekday" intrinsic accepts one argument which must evaluate to a day number, such as returned by "date2days", and returns a two-element list describing the day of the week of the specified day. The first returned element is a fixnum in the range 0-6, each value of which corresponds to a weekday in the range Sunday-Saturday, respectively. The second element is a string representing the name of the weekday, in English.

month: (month expr)

The "month" intrinsic accepts a fixnum argument specifying the numerical value of a month (1-12), as used by the "date2days", "days2date", and "localtime" intrinsics, and returns a string representing the name of the month in English.

localtime: (localtime expr)

The "localtime" intrinsic accepts a string representing a UNIX time value, such as returned by the "time" intrinsic, and returns a six-element list of fixnums representing the date of the specified time in the local timezone. The first returned element is the year. The second is the month. The third is the day of the month. The fourth is the hour. The fifth is the minutes. The sixth is the seconds.

utctime: (utctime expr)

The "utctime" intrinsic accepts a string representing a UNIX time value, such as returned by the "time" intrinsic, and returns a six-element list of fixnums representing the date of the specified time in terms of coordinated universal time. The first returned element is the year. The second is the month. The third is the day of the month. The fourth is the hour. The fifth is the minutes. The sixth is the seconds.

date2time: (date2time expr1 expr2 expr3...)
The "date2time" intrinsic accepts fixnums representing a date on or after midnight January 1st, 1970, and returns a fixnum representing the number of seconds from the UNIX epoch (midnight, January 1st, 1970) to that date. See the entry for the "time" intrinsic for more information on UNIX time values. The function accepts three, four, five, or six arguments. The first three arguments must be present, and represent the year (1970-), the month (1-12), and the day of the month (1-31), respectively. The fourth, fifth, and sixth optional arguments represent the hour, minute, and seconds values of the date, respectively. Omitted optional arguments default to 0. The presence of an optional argument implies the presence of all preceding optional arguments. This means if you include a value for minutes, you must include a value for hours as well, and if you include a value for seconds, you must include values for hours and minutes as well.

random: (random expr)

The "random" intrinsic accepts one argument which must evaluate to a positive number, and returns a random integer where 0 < returned-value < evaluated-argument. The interpreter calls srandomdev() at startup to initialize the random number generator, and random() to generate random numbers. Random numbers are then scaled to be within the requested range by computing: evaluated-argument * random-number / RAND_MAX.

date: (date [expr [expr ]])

The "date" intrinsic accepts two optional arguments and returns a string representing the current date and time. The two optional arguments, if present, must evaluate to fixnums. Invoking "date" with no arguments is the same as invoking it with a first argument of 0. The second argument is only significant when the first argument is non-zero. When invoked with no arguments, or one argument of 0, the function returns a representation of the time and date expressed in terms of the local timezone.
When invoked with a non-zero first argument, the function returns a representation of the time and date expressed as coordinated universal time. If you are running FreeBSD 4.x, then the timezone in this case will be represented by the three-letter abbreviation GMT (Greenwich Mean Time), but on FreeBSD 5.x and later versions the timezone will be represented by the three-letter abbreviation UTC (Coordinated Universal Time). The "date" intrinsic was designed to return a properly formatted date string per HTTP/1.1 if invoked with a non-zero first argument, but this standard specifies that GMT must be used for the timezone indicator and not UTC. Therefore the second optional argument may be present, and if non-zero, it changes the UTC abbreviation to GMT.

> (date)
"Fri, 04 Oct 2002 12:06:11 EDT"
> (date 1)
"Fri, 04 Oct 2002 16:06:13 UTC"
> (date 1 1)
"Fri, 04 Oct 2002 16:06:16 GMT"

datethen: (datethen expr1 [expr2 [expr3 ]])

The "datethen" intrinsic may be used to produce human-readable date strings from UNIX time values. The function accepts one to three arguments. The first must evaluate to a string specifying a UNIX time value, such as those generated by the "time" and "timethen" intrinsics. The optional second and third arguments must each be either fixnum 0 or fixnum 1. The function returns a string describing the time argument in human-readable form, formatted identically to the strings produced by the "date" intrinsic. If the second argument is present and is 1, the returned time string will be expressed in UTC; otherwise the time string will be expressed relative to the local time zone. If the third argument is present and evaluates to 1, then the timezone abbreviation will be GMT instead of UTC. Invoking (datethen (time)) is equivalent to invoking (date), while invoking (datethen (time) 1) is equivalent to invoking (date 1).

print: (print expr...)
println: (println expr...)
Intrinsic "print" accepts one or more arguments, evaluates them, and prints each evaluated argument, in order, to the standard output stream. If any of the arguments evaluate to strings, the contents of the strings are printed without the surrounding quotes. If any of the arguments evaluate to lists, then any strings in those lists will be printed with their surrounding quotes. The "print" intrinsic always returns 1. Intrinsic "println" functions similarly to "print" but it outputs a single newline character after printing its arguments.

> (print "hello there")
hello there 1
> (set 'f '(a b c))
(a b c) 1
> (print f)
(a b c)
> (print 'hello)
hello1
> (progn
>> (print 'hello)
>> (newline))
hello
1

The "newline" intrinsic outputs a newline character (ASCII 10) to the standard output.

load: (load expr)

Intrinsic "load" reads lisp from the file specified by its lone argument, which must evaluate to a string. The function returns the value resulting from evaluating the last expression in the file. Any errors encountered opening the file, or evaluating its content, will stop evaluation.

getline_ub: (getline_ub [expr])

The "getline_ub" (getline unbuffered) intrinsic is intended for use when the user wants to read lines of text from stdin, but subsequently "fork" or "forkpipe" the interpreter and "exec" or "shexec" a new program to continue reading lines from stdin. The "getline" intrinsic does its own input buffering, and if used in such a situation, would result in buffered data in the parent being lost to the child. Note that if the child does not call "exec" or "shexec" this is not true, as the child will have its own copy of the buffered data.

The "getline_ub" intrinsic reads a line from standard input and returns it as a string, including the terminating newline. If it does not encounter a newline after having read 2048 characters, the 2048 characters will be returned without a terminating newline.
If EOF is encountered before a newline is read, the returned string will contain all the remaining data in the input stream without a terminating newline. If no characters remain in the input stream the function returns 0. If the read() system call fails the function also returns 0.

The function accepts one optional argument, which, if present, must evaluate to a fixnum. This argument is a timeout value, useful when stdin is connected to a socket. "getline_ub" reads data a character at a time from stdin until it reads a newline. If any invocation of the read() system call blocks for a longer number of seconds than that specified by the timeout value, it will be interrupted and "getline_ub" will return any characters successfully read before the timeout, not terminated by a newline character. If no characters were read before the timeout, then the empty string will be returned. This is the only circumstance in which the empty string will be returned. If invoked without a timeout value, "getline_ub" will block until at least one character can be read from stdin or EOF is encountered, and return either a non-empty string, or 0 on EOF. Therefore, if stdin has been connected to a socket via "accept", a return value of "" from (getline_ub 5) indicates a timeout, while a return value of 0 indicates EOF. The timeout argument may be omitted or it may be set to 0, to allow read() to block indefinitely.

When reading from a terminal device in canonical mode, carriage returns will be converted into newlines by the terminal driver, and be returned as newlines to "getline_ub". When reading from other sources, or from a terminal in non-canonical mode, the carriage returns will be passed through untranslated. Note that if reading from a terminal device in canonical mode with a timeout, the empty string will always be returned when a timeout occurs, as the terminal driver will not return any characters to the interpreter until a carriage return or newline is input.
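The distinction between the three kinds of return value from "getline_ub" (a line or partial line, the empty string on timeout, and 0 on EOF) can be illustrated with a small Python sketch. This is not the interpreter's implementation: it is a model of the behaviour described above, built on os.read() and select.select(), and it accepts fractional timeouts for convenience where the intrinsic takes a fixnum number of seconds. How the intrinsic treats a read error that follows a partial read is not stated in the text; the sketch simply returns 0 in that case.

```python
import os
import select

MAX_LINE = 2048   # line-length cap stated in the text

def getline_ub(fd, timeout=0):
    """Model of the described behaviour, reading one byte at a time.

    Returns a string (possibly without a trailing newline), "" on a
    timeout with nothing read, or 0 on EOF or read error.
    """
    chars = []
    while len(chars) < MAX_LINE:
        if timeout:
            ready, _, _ = select.select([fd], [], [], timeout)
            if not ready:                  # timed out waiting for data
                return "".join(chars)      # "" if nothing was read
        try:
            byte = os.read(fd, 1)          # unbuffered single-byte read
        except OSError:
            return 0                       # assumption: any error yields 0
        if not byte:                       # EOF
            return "".join(chars) if chars else 0
        ch = byte.decode("latin-1")        # 8-bit text, one char per byte
        chars.append(ch)
        if ch == "\n":
            break
    return "".join(chars)

# Demonstration on a pipe standing in for stdin or a socket.
r, w = os.pipe()
os.write(w, b"ab\ncd")
results = [getline_ub(r), getline_ub(r, 0.05), getline_ub(r, 0.05)]
os.close(w)
results.append(getline_ub(r))
os.close(r)
print(results)
```

A caller can therefore tell a timeout from EOF by testing whether the result is the empty string or 0, as the entry describes for (getline_ub 5).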
getline: (getline [expr1 [expr2]]) Intrinsic "getline" reads a line from standard input and returns it as a string, including the terminating newline. If it does not encounter a newline before reading 2048 characters, it will return the 2048 characters without a terminating newline. If the end of stdin is reached while searching for the next newline, all the remaining characters in the stream, if any, will be returned as a string without a terminating newline. If no characters remain in the stream, "getline" returns fixnum 0. Any subsequent invocations of "getline" on the same input source will continue to return fixnum 0. The function also returns 0 when it encounters any error condition. "getline" does its own input buffering to make line-by-line reading of data more efficient, using a 100K buffer. The function should not be used when the user wishes to read some lines, then fork and exec another program to continue reading from the inherited stdin, as the data already read into the parent's buffer will not be available to the child process after an "exec", so data will be lost. In this situation, the unbuffered version, "getline_ub" must be used instead, or the unbuffered "getchars" intrinsic. IF THE STANDARD INPUT IS NOT A TERMINAL DEVICE, BOTH ARGUMENTS ARE IGNORED AND THE BEHAVIOR IN THE REST OF THIS DESCRIPTION DOES NOT OCCUR. If standard input is a terminal device, then the terminal is put into "raw" mode, with the interpreter simulating the standard UNIX terminal line discipline (ctrl-h => backspace; ctrl-w => werase; ctrl-u => kill). If ctrl-d (EOF) is encountered as the first character of input, "getline" returns the fixnum zero, otherwise EOF is ignored, and getline continues to accumulate characters until either a carriage-return or a newline is input. Carriage returns are converted into newlines, so the strings returned by "getline" when reading from a terminal device are always newline-terminated. 
The function drops the cursor to the last line of the terminal device and echoes input there. The function will also automatically perform horizontal scrolling if the input line grows to the width of the terminal device, scrolling in increments of the terminal width - 1 character. The last column of the last line of the screen is not used because some terminals will automatically scroll up a line if a character is printed there.

The function accepts two optional arguments, the first of which, if present, must evaluate to a string, and is printed as a prompt to the user. If the second optional argument is also present, it must evaluate to a fixnum and specifies the location of tabstops in the onscreen echoed string. For example, a value of 4 will cause tabstops to appear to the user to be set every four characters. If the second argument is not present, tabstops default to every 8 characters. Pass the empty string as the first argument if you wish to submit a second argument to the function, but do not wish it to print a prompt.

If the second argument is 0, then the input of a tab character will trigger filename completion. If the second argument is -1, then tab will trigger command and/or filename completion, depending upon the state of the input line when the tab is received. If the second argument is -2, then tab will trigger filename completion, but the filename completion mechanism will not work recursively. If the second argument is -3, then tab will trigger command and/or filename completion, but the filename completion mechanism will not work recursively. Note that in these cases, tabs cannot be entered by the user.

For the cases where the filename completion mechanism does not work recursively (a second argument of -2 or -3), it will attempt to complete one level of the path provided to it. If completion is invoked again at this point, it will complete another level, and so on.
For the cases where the filename completion works recursively, it will complete as much of the path given to it as possible.

When command and filename completion have been requested, and the text entered so far consists of only non-whitespace characters, then command completion will be attempted; otherwise filename completion will be attempted upon the last contiguous segment of non-whitespace characters. There are two exceptions to this rule. If the first character input is either '/' or '.' then filename completion will be attempted in the initial position instead of command completion. Completion will work with commands or filenames which contain whitespace only if the portion before the cursor does not contain whitespace. Quoting mechanisms, such as those provided by shell programs, are not available. The "getline" function will append a single space character to the input string after a completion has succeeded and the completion was unambiguous.

When completions are requested, each string created with "getline" is placed onto a 500-line history list and may be recalled, edited and re-entered. Pressing Control-P or Control-N while inputting text via "getline" will cause the current text to be replaced with the previous or the next item on the history list, respectively. While in the history list one may quickly move back to the input line by invoking Control-X. The history list may be read from or written to a file with the "load_history" and "save_history" intrinsics.

When completions are requested, pressing Control-R or Control-S causes the interpreter to enter history search mode. Square brackets appear above the input line. Text typed by the user appears inside the brackets and causes the interpreter to search forward (C-s) or backward (C-r) in the history list for a line containing the bracketed text. Pressing C-s or C-r again causes the interpreter to search forward or backward for another matching line.
Both Control-R and Control-S wrap around the ends of the history list. Control-H and Backspace erase the last typed character inside the brackets. The history can be cleared with the "reset_history" intrinsic.

When completions are requested, these command-line editing commands are recognized by getline, in addition to C-h, C-w, and C-u:

M-f - Moves the cursor to the beginning of the next word in the line.
M-b - Moves the cursor to the beginning of the previous word in the line.
C-k - Deletes the text from the cursor location to the end of the line.
C-d - Deletes the character the cursor is on.
M-d - Deletes the word or word-portion the cursor is before.
C-a - Moves the cursor to the beginning of the line.
C-e - Moves the cursor to the end of the line.
C-y - Pastes the last deletion into the line before the cursor.

Additionally, when completions are requested, C-u will not delete the entire line, but only the portion before the cursor.

reset_history: (reset_history)

The "reset_history" intrinsic clears the items on the history list the interpreter maintains for the "getline" intrinsic. The function accepts no arguments and always returns 1.

load_history: (load_history expr)
save_history: (save_history expr)

The "load_history" and "save_history" intrinsics read and write the history list maintained by the "getline" intrinsic from and to files. Both functions accept a single argument that must evaluate to a string specifying the filename. In both functions, if the open() system call encounters an error, a string will be returned. In the "save_history" intrinsic, if the read() system call encounters an error a string will be returned. Otherwise, both functions return the number of lines read or written.

rescan_path: (rescan_path)

The "rescan_path" intrinsic causes the interpreter to re-build its internal list of executable files from the directories defined by the PATH environment variable. This list is used by "getline" and by "command_lookup".
If the programmer wishes those two intrinsics to continue to find all executables, it is necessary to invoke "rescan_path" if the PATH environment variable has changed, or if new executables have been added to the directories specified by that environment variable, since the last invocation of either "getline" or "command_lookup". The function always returns 1. stringp: (stringp expr) Intrinsic "stringp" returns 1 if its lone argument evaluates to a string, otherwise it returns 0. > (stringp "0") 1 > (stringp 0) 0 fixnump: (fixnump expr) Intrinsic "fixnump" returns 1 if its lone argument evaluates to a number, otherwise, it returns 0. > (fixnump 0) 1 > (fixnump "0") 0 symbolp: (symbolp expr) Intrinsic "symbolp" returns 1 if its lone argument evaluates to a symbol, otherwise, it returns 0. regexpp: (regexpp expr) Intrinsic "regexpp" returns 1 if its lone argument evaluates to a compiled regular expression object, otherwise, it returns 0. tablep: (tablep expr) Intrinsic "tablep" returns 1 if its lone argument evaluates to a table, otherwise, it returns 0. stackp: (stackp expr) Intrinsic "stackp" returns 1 if its lone argument evaluates to a stack, otherwise, it returns 0. intrinsicp: (intrinsicp expr) Intrinsic "intrinsicp" returns 1 if its lone argument evaluates to an intrinsic function, otherwise, it returns 0. closurep: (closurep expr) Intrinsic "closurep" returns 1 if its lone argument evaluates to a closure, otherwise, it returns 0. macrop: (macrop expr) Intrinsic "macrop" returns 1 if its lone argument evaluates to a macro closure, otherwise, it returns 0. recordp: (recordp expr) Intrinsic "recordp" returns 1 if its lone argument evaluates to a record, otherwise it returns 0. sqlitep: (sqlitep expr) Intrinsic "sqlitep" returns 1 if its lone argument evaluates to a sqlite database object, otherwise, it returns 0. 
regcomp: (regcomp expr1 [expr2 [expr3]])

The "regcomp" intrinsic accepts one, two, or three arguments, the first of which must evaluate to a string, and which is interpreted as a regular expression to be compiled into a compiled regular expression object. The function returns the compiled regular expression object if the compilation is successful, or a string containing an error message, otherwise. Regular expression objects are constants which evaluate to themselves.

The "regcomp" intrinsic also accepts two optional arguments after the first. If the optional second argument is present, it must evaluate to a fixnum, and is interpreted as a boolean. If non-zero, it causes "regcomp" to produce a compiled regular expression which will match text case-insensitively. Without the second argument, "regcomp" defaults to case-sensitive matching. If the optional third argument is present, it must evaluate to a fixnum, and is interpreted as a boolean. If non-zero, it prevents "regcomp" from recognizing regular expression operators in the first argument. The resulting compiled expression will match the literal string value of the first argument.

The "regcomp" intrinsic interprets the two-character escape-sequences \t, and \b, to represent the tab and space characters respectively, allowing one to specify regular expressions containing whitespace, without using whitespace. This can sometimes be convenient. The carriage return and newline characters can also be specified with escape sequences: \r, and \n, respectively. The zero width assertions \< and \> match the null string at the beginning and end of a word, respectively. All the regular-expression operators may be escaped with a backslash to include the operator as a literal character in the expression. Unrecognized escape sequences will be replaced with the escaped character. The backslash will be consumed.
All escape sequences are ignored and left unchanged in the expression if a non-zero third argument has been passed to "regcomp". substitute: (substitute expr1 expr2 expr3 [expr4]) Intrinsic "substitute" performs search and replace operations on strings using regular expressions to specify the matches. The function uses a template string to specify the replacement text, in a manner similar to that of the substitute commands of ed and sed, where the template may refer to the text of the match, and the text matched by parenthesized subexpressions. There is also a "replace" library function available, described elsewhere in this document, which allows the replacement expression to be a snippet of lisp, which is evaluated upon each match to dynamically generate the replacement text. The "substitute" function takes three or four arguments, and returns a new string, incorporating the appropriate replacements. The first argument must evaluate to a compiled regular expression. See the "regcomp" intrinsic for more information on compiling regular expressions. The second and third arguments must evaluate to strings, and are interpreted as the replacement string, and the string to search for matches in, respectively. The fourth optional argument, if present, must evaluate to a number, which indicates the number of matches which should be replaced. A value of 0 indicates the substitution should replace all the matches of the pattern in the text of the third argument. The absence of the fourth argument is the same as having a fourth argument of 1, which is to say, only the first match will be replaced. The text matched by the first ten subexpressions in the regular expression may be inserted into the replacement string by including an escape sequence of the form \[0-9] in the replacement text where one wishes the matched text to appear. The first subexpression is referred to by \1 and the tenth subexpression is referred to by \0. 
The text matched by the entire regular expression may be inserted into the replacement text by inserting \& into the replacement string at the desired location. In the replacement string unrecognized escape sequences are replaced with the unescaped character. To escape the backslash, to cause a literal backslash to appear in the substitution, a total of four backslashes are necessary in the replacement string. One level of escaping will be removed by the lisp string parser, leaving two backslashes for "substitute" to see, the first one escaping the second. Note that Munger's string parser will not remove unrecognized escape sequences, so that this double level of escaping is only necessary when the backslash itself is being escaped.

> (substitute (regcomp "string") "booger" "string string string string" 2)
"booger booger string string"
> (substitute (regcomp "[a-z]+") "{\&}" "one two three" 0)
"{one} {two} {three}"
> (substitute (regcomp "([a-zA-Z]+) ([a-zA-Z]+)") "\2 \1"
>> "is This sentence a." 0)
"This is a sentence."

The "substitute" intrinsic interprets the two-character escape sequences \t and \b as the tab and space characters, respectively, allowing one to specify whitespace in the replacement string without using whitespace. This can sometimes be convenient.

Five other escape sequences may be embedded in the replacement string to control the case of portions of the returned string. These escape sequences work for strings of ASCII alphanumeric characters only:

\U  Turns on conversion to uppercase.
\u  Turns on conversion to uppercase for the next character only.
\L  Turns on conversion to lowercase.
\l  Turns on conversion to lowercase for the next character only.
\e  Turns off \U or \L.
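The way the five case-control escapes interact can be modelled with a short Python sketch. This is an illustration only, not Munger code and not Munger's implementation. The name apply_case_escapes is hypothetical, the input is assumed to be the result text with \& and subexpression references already expanded, and letting \u and \l take precedence over an active \U or \L is an assumption.

```python
def apply_case_escapes(text):
    # Model of the \U, \u, \L, \l and \e escapes described above.
    # Assumption: Python's upper()/lower() stand in for the ASCII-only
    # conversion the manual describes.
    out = []
    mode = None       # persistent mode set by \U or \L, cleared by \e
    one_shot = None   # set by \u or \l, applies to the next character only
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in "ULule":
            code = text[i + 1]
            if code in "UL":
                mode = code        # \U and \L override each other
            elif code == "e":
                mode = None        # \e turns off \U or \L
            else:
                one_shot = code    # \u or \l
            i += 2
            continue
        if one_shot == "u":
            ch = ch.upper()
        elif one_shot == "l":
            ch = ch.lower()
        elif mode == "U":
            ch = ch.upper()
        elif mode == "L":
            ch = ch.lower()
        one_shot = None
        out.append(ch)
        i += 1
    return "".join(out)


print(apply_case_escapes(r"\Ufoo\ebar"))   # FOObar
```

In this model a conversion that is never switched off with \e simply stays active for the remainder of the text.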
> (substitute (regcomp "foo") "\U\&\e" "foobar")
"FOObar"

The effects of \U and \L extend beyond the replacement string if they are not terminated:

> (substitute (regcomp "foo") "\U\&" "foobar")
"FOOBAR"

\U and \L override each other:

> (substitute (regcomp "foo") "\U\&\L" "foobar FUNBAG")
"FOObar funbag"

match: (match expr1 expr2)

The "match" intrinsic matches a regular expression against a string, and returns a two-element list if the regular expression finds a match within the string, or the empty list otherwise. The function accepts two arguments, the first of which must evaluate to a compiled regular expression object, while the second must evaluate to the string to match the regular expression against.

The two-element list returned upon success consists of the character indices of the starting and ending locations of the matched text. The second index is the start of the text following the match. These two indices may be used in conjunction with the "substring" intrinsic to extract the text before, after, or containing the match:

> (set 'rx (regcomp "foobar"))
<REGEXP#3>
> (set 's "---foobar---")
"---foobar---"
> (match rx s)
(3 9)
> (substring s 0 3) ; text before the match
"---"
> (substring s 3 (- 9 3)) ; text of the match
"foobar"
> (substring s 9 0) ; text after the match
"---"

Note that if the first returned index were 0, the first invocation of "substring" above would have returned the whole string, because passing a third length argument of 0 to "substring" means "to the end of the string." This should not be a problem, since a first returned index of 0 indicates there is no text before the match. The programmer can check for this situation before invoking "substring".
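The index arithmetic above, including the check for a first index of 0, can be sketched in Python. This is a model of the documented behaviour, not Munger code. The helper names match, substring, and split_around are hypothetical stand-ins, and Python's re module is assumed to be an adequate substitute for Munger's regular expressions in this simple case.

```python
import re


def match(rx, s):
    # Return [start, end] for the first match, or [] if there is none.
    # "end" is the index of the first character after the match.
    m = rx.search(s)
    return [m.start(), m.end()] if m else []


def substring(s, start, length):
    # A length of 0 means "to the end of the string".
    return s[start:] if length == 0 else s[start:start + length]


def split_around(rx, s):
    # Return (before, matched, after), or None when nothing matches.
    idx = match(rx, s)
    if not idx:
        return None
    start, end = idx
    # Guard: a length of 0 would otherwise return the whole string.
    before = "" if start == 0 else substring(s, 0, start)
    # Same guard for an empty match, where end - start is 0.
    matched = "" if end == start else substring(s, start, end - start)
    after = substring(s, end, 0)
    return before, matched, after


print(split_around(re.compile("foobar"), "---foobar---"))
```

The two guards are the "check for this situation" that the note above leaves to the programmer.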
The "matches" intrinsic, described directly below, can be used to extract the text of a match and any matching parenthesized subexpressions, while "match" itself is intended for situations where it is only necessary to know whether a match occurred, where the text not matched needs to be accessed, or where the location of the match is required.

matches: (matches expr1 expr2)

The "matches" intrinsic accepts the same arguments as the "match" intrinsic, but returns a list which, if non-empty, describes the text matched. If no match is found, the list will be empty. Otherwise, a list of twenty strings will be returned. The first string of the list will be the text matched by the entire regular expression, while the subsequent elements of the list will be the text matched by the first nineteen parenthesized subexpressions in the regular expression. If nineteen subexpressions are not present in the regular expression, empty strings will be returned for the missing subexpressions. If a subexpression fails to match, an empty string will be returned in the corresponding position of the list.

> (matches (regcomp "[0-9]+") "I have 22 figurines.")
("22" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")
> (matches (regcomp "([0-9])+") "12345")
("12345" "5" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")
> (matches (regcomp "([0-9]+)") "12345")
("12345" "12345" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")

If one is using a regular expression with a lot of parenthesized subexpressions, it may be more convenient to assign the results of invoking "matches" to a stack. The final 1 in the example below is the return value of the "print" intrinsic.
> (set 'rx (regcomp "(.)(.)(.)(.)(.*)"))
<REGEX1>
> (when (used (set 's (assign (stack) (matches rx "fooobar"))))
>> (for (a 0 19)
>>> (print (index s a) ":")))
fooobar:f:o:o:o:bar:::::::::::::::1

split: (split expr1 expr2 [expr3])

The "split" intrinsic is used to break up a string into substrings at delimiter characters. It accepts either two or three arguments. Both of the first two arguments must evaluate to strings. The first is interpreted as a set of delimiter characters which specify where the second argument should be split. Where a character from the delimiter string occurs in the second argument, it is consumed, breaking up the second argument into pieces. The pieces are returned as a list of strings. The optional third argument must evaluate to a number, and specifies a limit on the number of pieces the second argument is to be split into. If the first argument is the empty string, then the second argument is split up character-by-character. The original string is unchanged.

The "split" intrinsic will return empty strings for empty fields in the original string. Delimiter characters occurring adjacent to each other, or at the very start or very end of the second argument, will be recognized as empty fields. If none of the delimiter characters can be found in the string, then the returned list will contain only the original string.

> (split " " "Lisp is an applicative language.")
("Lisp" "is" "an" "applicative" "language.")
> (split " " "Lisp is an applicative language." 3)
("Lisp" "is" "an applicative language.")
> (split "" "Lisp is an applicative language" 3)
("L" "i" "sp is an applicative language")
> (split ":" ":a:b:c:")
("" "a" "b" "c" "")

split_rx: (split_rx expr1 expr2 [expr3])

The "split_rx" library function is used to break up a string into substrings using a regular expression to specify delimiter substrings. It works analogously to the "split" intrinsic, except the first argument must evaluate to a compiled regular expression object.
As with "split", the second argument must evaluate to a string, and the third argument, if present, limits the number of pieces the second argument will be broken into. The supplied regular expression is matched against the second argument, then the text of each match is consumed to split the second argument into pieces, which are gathered into a list and returned. The function returns a list containing only the original second argument if no match on the regular expression can be found. Empty fields cannot be detected by "split_rx".

> (setq rx (regcomp "[\b\t]+"))
<REGEXP#34>
> (split_rx rx "foobar tooley marzipan loopy ")
("foobar" "tooley" "marzipan" "loopy")

rootname: (rootname expr)

The "rootname" library function accepts one argument which must evaluate to a string, and returns that string without its last filename suffix. If the argument does not have a suffix, the original string is returned.

> (rootname "foobar.tar.gz")
"foobar.tar"
> (rootname (rootname "foobar.tar.gz"))
"foobar"

suffix: (suffix expr)

The "suffix" library function accepts one argument which must evaluate to a string, and returns the last filename suffix contained in the string. If the string does not end with a filename suffix, the original string is returned.

> (suffix "foobar.tar.gz")
".gz"
> (suffix (rootname "foobar.tar.gz"))
".tar"

tokenize: (tokenize expr)

The "tokenize" library function accepts one argument which must evaluate to a string, and returns a list consisting of all the tokens embedded in the string composed of contiguous non-whitespace characters. The function returns a list containing only the original argument string if it contains no whitespace.

> (tokenize " foobar tooley hot-cha-cha!")
("foobar" "tooley" "hot-cha-cha!")

replace: (replace expr1 expr2 expr3 [expr4])

The "replace" library macro behaves similarly to the "substitute" intrinsic, except that the replacement expression will be executed once for every match in the string.
This allows the replacement string to be dynamically created for each match. The function accepts three or four arguments. The first argument must evaluate to a compiled regular expression object. The second argument is the expression to create the replacement string, and is evaluated in a lexical context where the symbols m0...m19 are bound to the text of the full match and the text matched by the first nineteen parenthesized subexpressions, respectively. Note that the string this argument produces when evaluated will undergo some escape processing, because it is ultimately used as the replacement pattern for a call to the "substitute" intrinsic. See the entry describing that function, elsewhere in this document, for details on the escape sequences recognized. Note also that one can simply pass a string as the second argument, but in that case it is much more efficient to use "substitute" directly. The third argument must evaluate to the string the regular expression is to be matched against.

The optional fourth argument, if present, is a repeat count, and can be used to limit the number of replacements made in the string. Unlike the "substitute" intrinsic, "replace" interprets a repeat count of 0 as a request to inhibit any substitutions from being performed. If the repeat count is not present, all matches in the string will undergo replacement. If no match is found, the function returns the original string, otherwise it returns a new string containing the specified substitutions.

The following example uses "replace" to decode x-www-form-urlencoded characters in a specified string.

> (setq str "foobar+tooley%7Efunbag%24%25%26")
"foobar+tooley%7Efunbag%24%25%26"
> (setq r0 (regcomp "%([0-9A-Fa-f][0-9A-Fa-f])"))
<REGEXP#10>
> (setq r1 (regcomp "\+"))
<REGEXP#11>
; Decode encoded spaces.
> (setq str (substitute r1 " " str 0))
"foobar tooley%7Efunbag%24%25%26"
; Decode hex encodings.
> (setq str
>> (replace r0
>>> (char (hex2dec m1))
>>> str))
"foobar tooley~funbag$%&"

concat: (concat expr ...)

The "concat" macro is used to concatenate strings. The macro accepts one or more arguments which must all evaluate to strings, or to arbitrarily nested lists containing only strings, and returns a single string consisting of all the argument strings concatenated.

> (concat "foo" "bar")
"foobar"
> (concat '("foo" "bar"))
"foobar"

explode: (explode expr)

The "explode" macro accepts one argument which must evaluate to a string, and returns a list of one-character strings corresponding to each character in the original string.

> (explode "foobar")
("f" "o" "o" "b" "a" "r")

join: (join expr1 expr2 expr3...)

The "join" intrinsic is used to concatenate strings with delimiters. It accepts three or more arguments, which must all evaluate to strings, or to arbitrarily nested lists containing only strings. The first argument is sandwiched between the strings of the remaining arguments to make a new string, which is returned.

> (join " " "This" "will" "be" "a" "sentence.")
"This will be a sentence."
> (join "" "this" "will" "be" "run" "together")
"thiswillberuntogether"
> (join ":" '("a" ("b" ("c") "d") "e"))
"a:b:c:d:e"

length: (length expr)

The "length" intrinsic accepts one argument which must evaluate to a string, a record, a stack, a table, or a list, and returns the number of characters in the string, the number of key/value pairs in the table, or the number of elements in the list.

> (length "fiver")
5
> (length '(a b c))
3

sort: (sort [expr...])

The "sort" intrinsic accepts any number of arguments and sorts them in ascending order if they all evaluate to either strings or numbers. Invoking "sort" without arguments returns the empty list. When sorting strings, case is ignored.
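The ordering rules just described can be modelled with a few lines of Python. This is a sketch of the documented behaviour, not Munger code. The name munger_sort is hypothetical, raising an error for mixed or unsupported argument types is an assumption (the manual does not say what "sort" does in that case), and the relative order of strings that differ only in case is likewise unspecified.

```python
def munger_sort(*args):
    # No arguments: the empty list.
    if not args:
        return []
    # All strings: ascending order, ignoring case.
    if all(isinstance(a, str) for a in args):
        return sorted(args, key=str.lower)
    # All numbers: ascending numeric order.
    if all(isinstance(a, (int, float)) and not isinstance(a, bool)
           for a in args):
        return sorted(args)
    # Assumption: anything else is treated as an error in this model.
    raise TypeError("arguments must be all strings or all numbers")


print(munger_sort("D", "c", "B", "a"))   # ['a', 'B', 'c', 'D']
```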
> (sort "D" "c" "B" "a")
("a" "B" "c" "D")
> (sort 3 2 1)
(1 2 3)

sortlist: (sortlist expr)

The "sortlist" intrinsic accepts one argument which must evaluate to a list whose elements must be either all numbers or all strings. The function returns a new list whose members are the members of the original list sorted into ascending order. When sorting strings, case is ignored.

> (sortlist '("c" "b" "a"))
("a" "b" "c")
> (sortlist '(0 -23 1))
(-23 0 1)

sortcar: (sortcar expr)

The "sortcar" intrinsic accepts one argument which must evaluate to a list whose elements must all be lists themselves, and whose first elements must be either all strings or all numbers. The function returns a new list containing the elements of the original list, sorted using the first elements of the sublists as the sort key. An example will make the meaning of this convoluted description clear:

> (sortcar '(("Z" f f) ("B" f f) ("C" f f)))
(("B" f f)("C" f f)("Z" f f))

"sortcar" is useful for sorting aggregate types. The following example sorts a stack-of-stacks, using the second element of each substack as the sort key. Perl users may recognize this as an example of an idiom that community has come to call "The Schwartzian Transform," but of course, this sort of thing is pure lisp.

> (set 's (stack 3))
<STACK1>
> (store s 0 (assign (stack) '(f 2 f)))
<STACK2>
> (store s 1 (assign (stack) '(f 1 f)))
<STACK3>
> (store s 2 (assign (stack) '(f 0 f)))
<STACK4>
> (flatten s)
(<STACK2> <STACK3> <STACK4>)
> (assign s (mapcar (lambda (x) (cadr x)) (sortcar (mapcar (lambda (x) (list (index x 1) x)) (flatten s)))))
<STACK1>
> (flatten s)
(<STACK4> <STACK3> <STACK2>)

until: (until expr1 [expr2 ...])
while: (while expr1 [expr2 ...])

The "while" intrinsic is a looping construct. It accepts one or more arguments, the first of which is the test condition. If it evaluates to a true value (anything other than 0 (fixnum), the empty string, or the empty list), the rest of the arguments are evaluated in order.
This process is repeated until the first argument no longer evaluates to a true value, at which point the loop stops without evaluating the other arguments. Therefore a while-loop will execute zero or more times. The "while" intrinsic returns the value of the failed test expression.

The "until" intrinsic behaves similarly to the "while" intrinsic, except the logic of the test is inverted, which is to say the body of an "until" loop is executed as long as the test condition evaluates to a false value, or, phrased another way, until the test condition evaluates to a true value.

> (setq n 10)
10
> (while n
>>> (print n)
>>> (newline)
>>> (setq n (- n 1)))
10
9
8
7
6
5
4
3
2
1
1
> (setq n 0)
0
> (setq l ())
()
> (until (eq n 10)
>> (setq l (cons (inc n) l)))
1
> l
(10 9 8 7 6 5 4 3 2 1)

The final 1 here is the return value of the "while" expression.

do: (do expr1 [expr2 ...])

The "do" intrinsic introduces a looping construct. It accepts one or more arguments. All the arguments are evaluated in order. If the last expression evaluates to a true value (anything other than 0, the empty string, or the empty list), then all the argument expressions are evaluated again. This process repeats until the last expression evaluates to a "false" value, when the loop exits without further evaluation. A do-loop will therefore evaluate its argument expressions one or more times. The "do" intrinsic returns the value of the failed test expression.

> (set 'n 0)
0
> (do
>> (print n)
>> (newline)
>> (set 'n (+ n 1))
>> (< n 10))
0
1
2
3
4
5
6
7
8
9
1
>

The final 1 here is the return value of the "do" expression.

fatal: (fatal)
nofatal: (nofatal)

The "fatal" and "nofatal" intrinsics accept no arguments, and determine whether errors encountered during evaluation will cause the interpreter to exit. When the interpreter starts it is in the "nofatal" state. If an error is encountered in the "nofatal" state, the stack will unwind back to the toplevel interpreter prompt.
After "fatal" has been invoked, all errors which stop evaluation will additionally cause the interpreter to exit. Invoking "die" when in the "fatal" state will also cause the interpreter to exit upon return to toplevel.

die: (die [expr...])

Intrinsic "die" forces the interpreter to stop evaluating any lisp it may be evaluating and return to the toplevel. The optional arguments are passed to the "warn" intrinsic. Invoking "die" when "fatal" has been invoked will cause the interpreter to exit upon return to toplevel.

throw: (throw expr)
catch: (catch [expr1]...)

The "catch" intrinsic together with the "throw" intrinsic provide a means of performing non-local exits. The arguments to "catch" are evaluated in an implicit "progn" and the result of the evaluation of the last expression is returned. If, however, a "throw" expression is encountered while evaluating any of the arguments to "catch", then the thread of execution returns to the "catch" expression. The "catch" form immediately returns, with the result of evaluating the argument to the "throw" expression becoming the return value of the "catch" expression.

A "throw" is caught by its enclosing "catch" expression, if there is one. If there is no enclosing "catch" expression, the interpreter returns to the toplevel, just as if the "die" intrinsic had been invoked. If the "fatal" intrinsic has been previously invoked, an uncaught "throw" will cause the interpreter to exit.

> (catch
>> (print (catch 0 (throw 'hello) 2 3 4))
>> (newline)
>> (throw 'goodbye)
>> (print 'not_reached))
hello
goodbye
>

stringify: (stringify expr...)

The "stringify" intrinsic accepts one or more arguments which must all evaluate to some sort of atom, and converts each atom's print syntax into a string. Then all the strings are concatenated into one string, which is returned.
> (stringify 12 "," 43)
"12,43"
> (stringify 'hello " " 'there)
"hello there"

digitize: (digitize expr)

Intrinsic "digitize" accepts one argument which must evaluate to an atom, and converts it to a number, if possible. Attempting to apply "digitize" to a non-number will cause it to return 0.

> (digitize '42)
42
> (digitize "-3")
-3

mapcar: (mapcar expr1 expr2)

The "mapcar" library function accepts two arguments. The first must evaluate to a monadic closure, a monadic intrinsic function, or an intrinsic which may accept one or more arguments, such as the "print" intrinsic. The second argument must evaluate to a list. The function applies its first argument to each element of the list, returning a new list containing the results of each evaluation.

> (set 'l '(a b c))
(a b c)
> (set 'f (lambda (n) (list n n n)))
<CLOSURE#23>
> (mapcar f l)
((a a a) (b b b) (c c c))

map: (map expr1 expr2 ...)

The "map" library function is a generalized version of "mapcar". The function accepts two or more arguments, the first of which must evaluate to a closure or an intrinsic function, while the remaining arguments must all evaluate to lists of the same length. The function passed as the first argument must accept as many arguments as there are list arguments. A new list is returned consisting of the result of successively applying the function to a grouping of elements from each list. The elements of the lists are successively grouped according to position.

> (map (lambda (x y z) (+ x y z)) '(1 2 3) '(4 5 6) '(7 8 9))
(12 15 18)

remove: (remove expr1 expr2)

Library function "remove" accepts two arguments, an expression of any type and a list, respectively, and returns a new list which is a copy of the argument list with all occurrences of the first argument removed. The function uses library function "equal" to test for equivalency.
> (remove 'a '(a a a b))
(b)
> (remove '(a b) '((a b) c (a b) (a b) d e f g))
(c d e f g)

nthcdr: (nthcdr expr1 expr2)

The "nthcdr" intrinsic returns the "nth" cdr of a list. The function accepts two arguments. The first must evaluate to a list. The second must evaluate to a positive fixnum. The function returns the list which would be created if "cdr" were applied to the list the number of times specified by the second argument. If the list has insufficient elements to complete the request, the empty list is returned.

> (nthcdr '(a b c) 1)
(b c)
> (nthcdr '(a b c) 2)
(c)
> (nthcdr '(a b c) 3)
()

nth: (nth expr1 expr2)

The "nth" intrinsic returns the "nth" element of a list. Elements are numbered starting from zero. The function accepts two arguments. The first must evaluate to a list, while the second must evaluate to a non-negative fixnum specifying the index position of the desired element. The function returns the specified element, if it can. If the list has insufficient elements to satisfy the request, then the function returns the empty list.

> (setq l '(a b c))
(a b c)
> (for (n (length l) 0)
>> (println (nth l n)))
()
c
b
a
0 ; This is the return value of "for"

member: (member expr1 expr2)

Library function "member" accepts two arguments, the first of which can evaluate to any lisp object, while the second must evaluate to a list. If the first argument is "equal" to any element of the list, 1 is returned, else 0 is returned.

> (member 'a '(a b c))
1

reverse: (reverse expr1)

Library function "reverse" accepts one argument, which must evaluate to a list or a string. If the argument is a list, the function creates a new list containing the same elements as the original list, in reverse order. If the argument is a string, the function creates a new string consisting of the same characters as the original, but in reverse order.

> (reverse '(a b c))
(c b a)
> (reverse "hello")
"olleh"

append: (append expr1 ...)
The "append" intrinsic accepts one or more arguments which all must evaluate to lists, and returns a new list consisting of the elements of the argument lists.

loop: (loop expr1 ...)

The "loop" intrinsic implements an infinite loop. The function accepts one or more arguments, and evaluates them in order, repeatedly. The return value of each body expression is discarded. The only way to exit a "loop" loop is to wrap it in a "catch" and invoke "throw" in the body:

> (catch
>> (loop
>>> (if (setq line (getline))
>>>> (print line)
>>>> (throw line))))

iterate: (iterate expr1 ...)

The "iterate" intrinsic executes expressions a specified number of times. Unlike the "for" intrinsic it does not make an index variable visible to the expressions in its body. The function accepts one or more arguments, the first of which must evaluate to a number and specifies the number of times the loop will execute. A negative repeat count is replaced by its absolute value. The subsequent expressions are optional. If present, each subsequent expression will be executed in order on each iteration of the loop. The result of evaluating the last body expression in the loop, on the last iteration, will be returned, or if no further arguments were provided beyond the repeat count, the repeat count will be returned. The "iterate" intrinsic is the fastest means of iteration available in the Munger interpreter.

> (iterate 10 (print "A"))
AAAAAAAAAA1

The 1 is the return value of the "print" intrinsic, which is in turn the return value of "iterate".

for: (for (symbol expr1 expr2 [expr3]) expr ...)
for: (for (([expr...]) (expr [...]) ([expr...])) expr ...)

The "for" intrinsic provides a looping facility similar to the C "for" keyword. There are two forms of this intrinsic.

The first form of the "for" intrinsic executes loops with an index variable set to each of a range of specified fixnum values upon each iteration.
It is the most efficient means of iterating a fixed number of times, if the loop body needs a visible index variable. Two or more arguments must be provided. The first is expected to be a three or four element list consisting of a symbol to be used for the loop index variable, an expression evaluating to the initial number value for the index variable, an expression evaluating to the number value the index variable should have on the last iteration of the loop, and an optional increment value. If the start value is less than or equal to the stop value, the index variable will be incremented on each iteration, otherwise it will be decremented on each iteration. Each of the remaining arguments, from the second argument on, will be evaluated during each iteration of the loop. The "for" function always returns the value of the last body expression evaluated on the last iteration.

The optional fourth element of the first argument list, if present, must evaluate to a fixnum specifying the increment or decrement value. The absolute value of this element will be added to or subtracted from the index, depending upon whether the start value is less than or equal to, or greater than, the stop value, respectively. If the fourth element is not present, the increment value defaults to 1. If incrementing or decrementing the index results in values that never become the stop value, then the index value on the last iteration of the loop will be the last index value which was within the range specified by the start and stop values, inclusive.

A "for" loop introduces a new environment for the loop's index variable. If a closure is formed in the body of a "for" loop, it will close over the new environment. Any invocations of "extend" or "dynamic_extent" directly inside of a "for" loop body will affect the environment of the loop and not the environment in which the loop is embedded.
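The sequence of index values visited by the first form of "for" can be sketched in Python. This models only the rules stated above and is not Munger code. The name for_indices is hypothetical, and rejecting an increment of 0 is an assumption, since the manual does not say how that case is handled.

```python
def for_indices(start, stop, step=1):
    # Only the absolute value of the increment is used; the direction
    # is decided by comparing the start and stop values.
    step = abs(step)
    if step == 0:
        # Assumption: a zero increment is treated as an error here.
        raise ValueError("increment must be non-zero")
    if start <= stop:
        return list(range(start, stop + 1, step))    # counting up
    return list(range(start, stop - 1, -step))       # counting down


print(for_indices(0, 10, 2))   # [0, 2, 4, 6, 8, 10]
print(for_indices(5, 1))       # [5, 4, 3, 2, 1]
print(for_indices(0, 9, 2))    # [0, 2, 4, 6, 8]
```

The last call shows the case where the index never equals the stop value: the final iteration uses the last value still inside the inclusive range.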
> (for (n 0 10 2)
>> (println n))
0
2
4
6
8
10
1 ; return value of "println"
> (set 'a 5)
5
> (for (a a 1)
>> (for (a a (+ a a))
>>> (print a " "))
>> (newline))
5 6 7 8 9 10
4 5 6 7 8
3 4 5 6
2 3 4
1 2
1 ; return value of inner "newline"
>

It is possible to invoke "continue" inside a "for" loop such that no return value is ever generated by any of the contained expressions. In that case, "for" returns 0.

The second form of the "for" intrinsic provides a more generic looping facility. With this form, the first argument must also be a three-element list, and the subsequent arguments are the loop's body expressions, but each element of the first argument list must itself be a list. The first sublist is a list of expressions to be evaluated once, in order, before any other part of the loop is evaluated, and may be empty. The second sublist consists of one or more expressions, the first of which is the test condition, and will be evaluated before every entry into the loop body, while the remaining elements are the final expressions which are evaluated as the last act of the "for" intrinsic. The elements after the test expression may be omitted. The third sublist contains the update expressions to be evaluated after evaluation of the loop body expressions, and may be empty.

Execution of the loop proceeds as follows. The initialization expressions are evaluated, then the actions described in the next paragraph are repeated until the test expression evaluates to a false value:

REPEAT The test condition is evaluated. If the evaluation of the test condition results in a false value, the remaining elements of the second sublist, after the test condition, if present, are evaluated in order, and the result of the last evaluation becomes the return value of the "for" loop. If there are no expressions after the test expression, then the result of the failed test expression becomes the return value of the loop. The loop is finished.
Otherwise, if the evaluation of the test condition results in a true value, each of the body expressions is evaluated in order, and then each of the elements of the update sublist is evaluated, in order. Goto REPEAT.

One might write a factorial function with this version of "for" like this:

(defun fact (n)
(for (((extend 'a 1)) ((> n 1) a) ((dec n)))
(setq a (* n a))))

downcase: (downcase expr1 expr2)
upcase: (upcase expr1 expr2)

The "upcase" library function accepts two arguments, the first of which must evaluate to a string, and the second of which is treated as a logical true or false value after evaluation. If the second argument evaluates to a true value, then "upcase" will return a new string consisting of all the characters of the first argument string, but with any lowercase alphabetic characters replaced with their corresponding uppercase characters. If the second argument evaluates to a false value, then only the first lowercase alphabetic character encountered in the first argument will be so converted. In either case, if no lowercase alphabetic characters are present in the first argument, the returned string will be an exact copy of the first argument.

The "downcase" library function behaves similarly to "upcase", except it converts uppercase alphabetic characters to lowercase alphabetic characters.

alist_lookup: (alist_lookup expr1 expr2)

The "alist_lookup" library function accepts two arguments. The second must evaluate to an alist, while the result of evaluating the first argument is treated as a key of the alist. The function returns the cdr of the first sublist of the alist whose car is equal to the key.

> (set 'y '((a b) (b c)))
((a b) (b c))
> (alist_lookup 'b y)
(c)
>

alist_remove: (alist_remove expr1 expr2)

The "alist_remove" library function accepts two arguments. The second must evaluate to an alist, while the result of evaluating the first argument is treated as a key of the alist.
The function returns a copy of the alist with the first sublist whose car is the key removed.

> (set 'y '((a b) (b c)))
((a b) (b c))
> (alist_remove 'b y)
((a b))
> y
((a b) (b c))
>

alist_replace: (alist_replace expr1 expr2 expr3)

The "alist_replace" library function accepts three arguments. The second must evaluate to an alist, while the result of evaluating the first is treated as a key of the alist. The third argument must evaluate to a list. The function returns a copy of the alist with the first sublist whose car is the key removed, if such a sublist exists, and containing a new sublist consisting of the key and the elements of the third list. The new sublist is added to the front of the alist copy.

> (set 'y '((a b) (b c)))
((a b) (b c))
> (alist_replace 'b y '(d e f))
((b d e f) (a b))
> (set 'y ())
()
> (alist_replace 'b y '(d e f))
((b d e f))
>

when: (when expr1 expr2...)

The "when" intrinsic is a conditional. It accepts one or more arguments, the first being the test condition. If the test condition evaluates to a true value (anything but 0 (fixnum), the empty string, or the empty list), then the succeeding arguments are evaluated in order, and the result of evaluating the last argument is returned. Otherwise, if the test condition evaluated to a false value, that value is returned.

> (when (eq 1 1)
>> (print 'hello)
>> (newline))
hello
1
>

unless: (unless expr1 expr2...)

The "unless" intrinsic is a conditional which operates logically opposite to the "when" intrinsic. The function accepts one or more arguments, the first of which is the test condition. If the first argument evaluates to a false value (0 (fixnum), the empty list, or the empty string), the succeeding arguments are evaluated in order, and the result of evaluating the last argument is returned. Otherwise, if the first argument evaluated to a true value, that value is returned.
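The truth test and return-value rules shared by "when" and "unless" can be modelled in Python. This is a sketch of the documented rules, not Munger code. The names is_true, when, and unless are hypothetical, body expressions are represented as zero-argument callables so that they are evaluated only when needed, and returning the test value when no body expressions are supplied is an assumption.

```python
def is_true(value):
    # False values: the fixnum 0, the empty string, and the empty list.
    if isinstance(value, int) and not isinstance(value, bool) and value == 0:
        return False
    if isinstance(value, (str, list, tuple)) and len(value) == 0:
        return False
    return True


def when(test, *body):
    # A false test value is returned as-is; the body is not evaluated.
    if not is_true(test):
        return test
    result = test  # assumption: returned if there are no body expressions
    for thunk in body:
        result = thunk()
    return result


def unless(test, *body):
    # The inverse of when(): a true test value is returned as-is.
    if is_true(test):
        return test
    result = test  # assumption: returned if there are no body expressions
    for thunk in body:
        result = thunk()
    return result


print(when(1, lambda: "first", lambda: "last"))   # last
print(unless(1, lambda: "not evaluated"))         # 1
```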
> (unless (eq 1 1) >> (print 'hello) >> (newline)) 1 > apply: (apply expr1 expr2) The "apply" macro accepts two arguments, the first of which must evaluate to a function and the second of which must evaluate to a list. The function is called with all the elements of the second argument as arguments. All "apply" does is cons its first argument onto its second and evaluate the result. > (apply 'set '('q 2)) 2 inc: (inc symbol [expr]) dec: (dec symbol [expr]) The "inc" and "dec" intrinsics increment and decrement a numerical value bound to a symbol, and rebind the symbol to the resultant value. Both functions accept either one or two arguments. The first argument is not evaluated, and must be a symbol currently bound to a number. The second optional argument, if present, must evaluate to a number. The value bound to the symbol is incremented or decremented by the value specified by the second argument, or by 1 if no second argument was supplied, and this new value is then bound to the symbol. The new value is returned. > (set 'a 1) 1 > (inc a) 2 > (dec a) 1 > (inc a 10) 11 test: (test expr1) The "test" intrinsic is used to test macro expansion. Its lone argument is NOT evaluated, and must be a list consisting of a macro application, just as one would enter it at the interpreter prompt. The function returns the result of expanding the macro. > (test (let ((f "foobar")) f)) ((lambda (f) f) "foobar") > (test (setq a 2)) (set (quote a) 2) continue: (continue) The "continue" intrinsic, when invoked from inside a "while", "do", "for", "iterate", or "loop" loop will cause the loop to skip any subsequent expressions in the loop and jump to the top of the loop to continue with the next iteration. Note that invoking "continue" inside of a "do" loop will prevent the evaluation of the test condition. An infinite loop may result in certain situations. 
If "continue" is invoked outside of the body of a loop, then the stack will unwind until the interpreter returns to the toplevel prompt. > (set 'n 10) 10 > (while n >> (print n) >> (newline) >> (set 'n (- n 1)) >> (continue) >> (set 'n 0)) 10 9 8 7 6 5 4 3 2 1 0 > The last 0 is the return value of the "while" intrinsic. block: (block) unblock: (unblock) The "block" and "unblock" intrinsics accept no arguments, and determine whether SIGINT, SIGQUIT, and SIGHUP will kill the interpreter, and whether SIGTSTP, SIGTTIN, and SIGTTOU are ignored by the interpreter. The interpreter starts out with an implicit call to "unblock". When in the unblocked state, the interpreter will be killed upon receipt of SIGINT, SIGQUIT, or SIGHUP, and it will be stopped upon receipt of SIGTSTP, SIGTTIN, or SIGTTOU. It will additionally generate a core dump upon receipt of SIGQUIT. Invoking "block" renders all of these signals impotent. Both of these functions always return 1. SIGTERM is caught by the interpreter. The "sigtermp" intrinsic can be used to detect the occurrence of SIGTERM. sigtermp: (sigtermp) The "sigtermp" intrinsic accepts no arguments and returns either 0 or 1 indicating whether the interpreter has received a SIGTERM since the last invocation of "sigtermp". exists: (exists expr) The "exists" intrinsic accepts one argument which must evaluate to a string, and attempts to stat() a file system entity with the name specified by the string. If the call to stat() succeeds, "exists" returns a value in the range of 1 to 8. Otherwise, it returns 0 if the entity does not exist, or -1 if the interpreter lacks permission to search one of the directories in the specified path. Note the interpreter does not need permission to read the file itself in order to stat() it. 
Possible successful return values are: 1 == regular file 2 == directory 3 == character device 4 == block device 5 == fifo 6 == symbolic link 7 == socket 8 == unknown type stat: (stat expr) The "stat" intrinsic accepts one argument, which must evaluate to a string. The argument is interpreted as a filename to be passed to the stat() system call. The function returns a list. If the specified entity does not exist, or the interpreter does not have search permission for one of the directories in the specified path, the returned list will be empty. Any other error condition will cause the function to return a string describing the error. Upon successful return of stat(), a five-element list will be returned, containing, in order, a string containing the user name associated with the uid of the file, or if the uid does not map to a user on the system, the uid itself expressed as an integer, a string containing the group name associated with the gid of the file, or if the gid does not map to a group on the system, the gid itself expressed as an integer, the time of the last access of the file, the time of the last modification of the file, and the size of the file. The time values are expressed in seconds elapsed since the UNIX Epoch (00:00:00 January 1, 1970 UTC). > (stat (current)) ("root" "wheel" 1097210850 1097210836 417214) rename: (rename expr1 expr2) The "rename" intrinsic is used to rename entries in the filesystem. Its semantics are the same as those of the system call of the same name, for which it is a wrapper. The function accepts two arguments, both of which must evaluate to strings. The first specifies the current name of the filesystem entity, and the second is the new name it will have if the function succeeds. 1 is returned upon success, or in the case of system call failure, a string describing the error is returned. All other errors will stop evaluation. 
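The return convention described for "rename" (1 on success, a descriptive string when the system call fails, and a hard stop for anything else) can be sketched in Python. This is only an illustrative model, not Munger code. The function name rename and the use of TypeError to stand in for "stops evaluation" are assumptions.

```python
import os
import tempfile


def rename(old, new):
    """Model of the documented convention: 1 on success, an error string on syscall failure."""
    if not isinstance(old, str) or not isinstance(new, str):
        # Stand-in for "all other errors will stop evaluation".
        raise TypeError("both arguments must be strings")
    try:
        os.rename(old, new)
    except OSError as err:
        return err.strerror or str(err)
    return 1


# Usage: rename a scratch file, then try to rename it again under its old name.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "old.txt")
dst = os.path.join(workdir, "new.txt")
with open(src, "w") as handle:
    handle.write("scratch\n")

print(rename(src, dst))   # success: 1
print(rename(src, dst))   # failure: a message from the operating system
```

Because failure is reported as a string rather than as a false value, a caller in this model has to compare the result against 1 or test its type to detect an error.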
current: (current) The interpreter maintains a pointer to an element of the argument list it was started with. The "current" intrinsic accepts no arguments, and returns a string representing the argument the argument pointer currently points to. At startup "current" will return the name the interpreter was invoked by, which is always the first argument passed to programs running under UNIX. See the "prev", "next", and "rewind" intrinsics for accessing the other command line arguments. prev: (prev) The "prev" intrinsic steps the argument pointer back to the previous command line argument, if one exists. If no previous argument exists, "prev" returns zero, otherwise the previous argument is made the current argument and is returned as a string. next: (next) The "next" intrinsic steps the argument pointer forward to the next command line argument, if one exists. If no further arguments exist, "next" returns zero, otherwise the next argument is made the current argument, and is returned as a string. rewind: (rewind) The "rewind" intrinsic accepts no arguments, and sets the interpreter's command-line argument pointer to point to the first argument, and returns it as a string. The first argument is always the name by which the interpreter program was invoked. interact: (interact) The "interact" intrinsic causes the interpreter to stop running the current program and work interactively with the user. It is intended for use in programs which need to make the interpreter toplevel temporarily available to the user. The function accepts no arguments, and causes the interpreter to enter a new read-eval-print loop. To exit the loop and allow the program which invoked "interact" to continue, Control-D may be entered by itself on a line, or the symbol "_" may be entered at toplevel. The function always returns 1. Recursive invocations will cause a string to be returned with an error message, but evaluation at the prompt of the original invocation will continue normally. 
Note that the "fatal" intrinsic has no effect when lisp errors are generated inside "interact". This means the interpreter will not exit when a lisp error is generated by the user working at the "interact" prompt. The current local environment is hidden from code run via "interact". If the programmer wishes to hide the global environment of the program which has invoked "interact" from the user, then he or she should wrap the program in a giant "let" closure, defining initial bindings for all globals so that "defun", "defmac", "set", and "setq" will modify these locally-visible bindings and not create global bindings. An example of this technique can be seen in the mush.munger example program. pwd: (pwd) The "pwd" intrinsic accepts no arguments, and returns the current working directory of the interpreter as a string. let: (let [list] expr2 ...) The "let" intrinsic introduces new local lexical bindings and evaluates a list of expressions with those new bindings in place. The bindings are removed when the "let" expression returns. The first argument is not evaluated and must be a parameter list of the form: ((symbol1 expr1) (symbol2 expr2)...). The result of evaluating each exprN in the current scope gets bound to its paired symbolN in the new scope. The remaining arguments, if any, are evaluated in the new scope. The result of the last evaluation performed is returned. > (let ((a 1) (b 2)) >> (print "a: " a " b: " b) >> (newline)) a: 1 b: 2 1 ; This is the return value of (newline) > (set 'a 2) 2 > (let ((a (* a a))) >> (while a >>>> (print "a: " a) >>>> (newline) >>>> (set 'a (- a 1)))) a: 4 a: 3 a: 2 a: 1 0 > a 2 letn: (letn [list] expr2 ...) Intrinsic "letn" introduces new local lexical bindings and evaluates a list of expressions with those new bindings in place. The bindings are removed when the "letn" expression returns. The first argument is not evaluated and must be a parameter list of the form: ((symbol1 expr1) (symbol2 expr2)...). 
The result of evaluating each exprN in the current scope gets bound to its paired symbolN in the new scope. The remaining arguments are evaluated in the new scope. The value of the last evaluation performed is returned. Intrinsic "letn" is different from intrinsic "let" in that each binding made to symbolN is visible in exprN+1, exprN+2, etc. This form is called "let*" in Scheme and Common Lisp. > (letn ((a 1) >>> (b (+ a 1))) >> (print "b: " b) >> (newline)) b: 2 1 > letf: (letf symbol [list] expr ...) Macro "letf" is equivalent to a named let in Scheme. It provides a means of creating and applying a temporary function capable of recursing on its name. The macro accepts two or more arguments, the first of which must be a symbol, while the second must be a list of the form: ((symbol1 expr1) (symbol2 expr2) ...). The result of evaluating each exprN in the current scope gets bound to its paired symbolN in a new local lexical environment. The symbol passed as the first argument is visible in the new environment to enable recursion by name. The remaining arguments are then evaluated in the new environment, with the value of the last evaluation performed being returned. > (setq a 10) 10 > (letf fact ((n a)) >> (if (< n 2) >>> n >>> (* n (fact (- n 1))))) 3628800 > tailcall: (tailcall expr1 ...) The "tailcall" intrinsic accepts one or more arguments, the first of which must evaluate to a closure or 0, while the remaining arguments, if any, are the arguments to the closure being tail-called, and must be of the correct number and type for the particular closure being invoked. A first argument of 0 indicates the tail-call is a directly recursive call of the currently-executing function. This allows anonymous functions to call themselves tail-recursively. Using tail-calls to implement iteration is inefficient in Munger, and should be avoided. See the entries for the "for" and "iterate" intrinsics for how to iterate efficiently. 
Tail-calls are used to prevent the control stack from growing unnecessarily during recursive function calls. Use "tailcall" whenever you are invoking a function from a tail position in another function. Some tail-recursive functions do all their computation during the "descent" stage of recursion, and only return the final computed value during the "ascent" stage when they "unwind". This means the recursive call is in tail position because nothing happens after it has returned. Each invocation simply returns to its caller. Invoking the recursive calls with "tailcall" will cause each recursive invocation to replace the current context on the call stack with its own context, so that when the final recursive invocation returns, it returns to the caller of the first invocation. No "unwinding" occurs. Each invocation of the function therefore has a constant continuation, and the whole computation is effectively turned into a loop, inhibiting potentially explosive growth of the control stack. Other tail-recursive functions pass closures to each recursive invocation of themselves to capture state as they descend, with each new closure closing over the binding for the previous one. When these functions bottom-out, the current closure is invoked as the continuation of the function. It, in turn, performs part of a computation, then invokes the continuation from the previous invocation, in tail position, to continue the computation, and so on, until the primordial continuation given to the toplevel invocation receives the final value of the computation. Since all invocations of the recursive function, and all invocations of the dynamically-constructed continuations are in tail position, there is no need to save their contexts on the call stack. Every one can be invoked with "tailcall", so that the primordial continuation simply returns to the original caller of the first invocation of the recursive function. 
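The continuation-passing arrangement just described can be imitated in Python, which has no "tailcall" of its own. The sketch below is an assumption-laden model rather than Munger's mechanism: a function returns a description of the call it wants made instead of making it, and a driver loop (a trampoline) performs that call from a single stack frame. The names TailCall, trampoline, fact_cps, and identity are hypothetical.

```python
class TailCall:
    """A pending call: the function to invoke next and its arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def trampoline(func, *args):
    """Run func, performing every TailCall it hands back from this one frame."""
    result = func(*args)
    while isinstance(result, TailCall):
        result = result.func(*result.args)
    return result


def fact_cps(n, k):
    """Factorial in continuation-passing style; every call is in tail position."""
    if n < 2:
        # Bottomed out: hand 1 to the most recent continuation.
        return TailCall(k, 1)
    # The new continuation closes over n and the previous continuation k.
    return TailCall(fact_cps, n - 1, lambda value: TailCall(k, n * value))


def identity(value):
    """The primordial continuation: it receives the final value."""
    return value


print(trampoline(fact_cps, 10, identity))  # 3628800

# A depth of 5000 is beyond Python's default recursion limit of 1000,
# yet it completes because no call ever nests inside another.
big = trampoline(fact_cps, 5000, identity)
```

Each step of fact_cps allocates a fresh closure, so the stack stays flat while memory use still grows with n.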
Note that in this case, the savings in call stack growth are lost to the growth of the heap caused by the creation of a new continuation at each invocation. Note that invoking "tailcall" from a non-tail position results in that position becoming a tail position. Any pending computation in the function is abandoned. labels: (labels [list] expr2 ...) The "labels" intrinsic temporarily binds a set of functions to a set of symbols, such that each binding is visible in each function, allowing recursive and mutually-recursive local functions to be defined in a new environment. The function accepts one or more arguments, the first of which must be a list of lists where each sublist consists of a symbol paired with a lambda or macro expression. The remaining arguments are evaluated in the new environment, with the value of the last evaluation being returned. An example will make this clear: > (labels ((even (lambda (n) (or (eq n 0) (tailcall odd (- n 1))))) >>> (odd (lambda (n) (and (not (eq n 0)) (tailcall even (- n 1)))))) >> (print (even 11)) >> (newline)) 0 1 ; This is the return value of the "newline". Note that it may not be necessary to use "labels" to extend the current lexical environment, depending on whether or not the programmer wishes to limit the visibility of the functions created to the body of the "labels" expression, and whether or not there are any closures formed in the body which need to see the new bindings after the "labels" has returned. If both of those behaviors are simultaneously needed, or if the interpreter is at toplevel, "labels" must be used. Otherwise, the current lexical environment can be extended with the "extend" intrinsic. 
One can achieve the same effect as the above example inside a function body with: > (defun f () >> (extend 'even (lambda (n) (or (eq n 0) (tailcall odd (- n 1))))) >> (extend 'odd (lambda (n) (and (not (eq n 0)) (tailcall even (- n 1))))) >> (print (even 11)) >> (newline)) > (f) 0 1 The bindings created by "extend" have unlimited extent, so if closures were formed in the body of "f" they would close over the new bindings, even if the closures were closed before the invocations of "extend". See the entries for the "extend" and "dynamic_extent" intrinsics for more details on extending the current lexical environment. extract: (extract expr) The "extract" intrinsic extracts the lambda-expression from a function closure. It accepts one argument which must evaluate to a closure, and returns the closure's lambda-expression. > mapcar <CLOSURE#17> > (extract mapcar) (lambda (f l) (if (pairp l) (cons (f (car l)) (mapcar f (cdr l))) ())) cond: (cond [list] ...) Intrinsic "cond" is a multiple choice conditional. The function accepts one or more lists as arguments, which must consist of at least two elements, and attempts to evaluate the first element of each argument list, in order, until an evaluation returns a true value (anything but zero, the empty string, or the empty list), when it attempts to evaluate the remaining elements of the corresponding sublist. The value of the last evaluation performed in the sublist is returned. If none of the first elements of any of the argument lists evaluate to a true value, then the false value returned by the first element of the last argument list will be returned. By placing a first element of 1 in the final sublist, a catch-all else clause may be created. > (set 'a 10) 10 > (cond ((eq a 12) 'no) >> ((eq a 10) 'yes)) yes > case: (case expr expr2 ...) The "case" macro is a multiple choice conditional. It accepts two or more arguments. The first argument may be any expression. The succeeding elements must be lists. 
The first argument is evaluated, and then the resulting value is compared against the result of evaluating the first elements of each of the succeeding lists, using "eq". If "eq" returns 1, then the succeeding elements of the list with the "eq" first element, are evaluated, and the value of evaluating the last element is the return value of the invocation of "case". No other argument lists are processed. If the first element of any of the argument lists is the question mark, ?, then the elements of that list succeeding the question mark will always be executed, if none of the preceding list arguments are executed. This can be used to create an "else" clause, which will be executed if none of the others match. > (setq n 3) 3 > (case n >> (1 'one) >> (2 'two) >> (3 'three) >> (4 'four) >> (5 'five)) three > (case n >> (6 'six) >> (7 'seven) >> (8 'eight) >> (9 'nine) >> (10 'ten) >> (? "no match")) "no match" foreach: (foreach expr1 expr2) The "foreach" library function applies a monadic function to every item in a list. Unlike "mapcar", this function does not return a list of the return values. The results of each function application are discarded. "foreach" always returns the empty list. > (set 'print_each >> (lambda (x) (print x) (newline))) <CLOSURE#34> > (foreach print_each '(a b c d)) a b c d () > +: (+ expr...) Intrinsic "+" accepts any number of arguments, evaluates them, and if all have evaluated to numbers, adds the numbers together and returns the total. > (+ 1 1 1 1) 4 > (+ 1 "hello") +: argument 2 did not evaluate to a number. -: (- expr1 expr2) Intrinsic "-" accepts two arguments, both of which must evaluate to numbers, and subtracts the second value from the first and returns the result. > (- 1 2) -1 > (- (+ 1 1) 2) 0 *: (* expr...) Intrinsic "*" accepts one or more arguments which must all evaluate to numbers, and multiplies all the values together and returns the result. 
> (* 1 12 2) 24 > (* (+ 23 3) 4) 104 /: (/ expr1 expr2) Intrinsic "/" accepts two arguments both of which must evaluate to numbers, and divides the second value into the first and returns the integer part of the quotient. > (/ 4 3) 1 %: (% expr1 expr2) Intrinsic "%" accepts two arguments, both of which must evaluate to numbers, and divides the second value into the first, and returns the remainder. > (% 4 3) 1 >: (> expr1 expr2) Intrinsic ">" accepts two arguments, which both must evaluate to numbers, and returns 1 if the first value is larger than the second, or 0 otherwise. > (> 4 3) 1 > (> -4 3) 0 >=: (>= expr1 expr2) Intrinsic ">=" accepts two arguments which both must evaluate to numbers, and returns 1 if the first value is greater than or equal to the second value, or 0 otherwise. > (>= 4 3) 1 <: (< expr1 expr2) Intrinsic "<" accepts two arguments, both of which must evaluate to numbers, and returns 1 if the first value is less than the second, or 0 otherwise. > (< 4 3) 0 <=: (<= expr1 expr2) Intrinsic "<=" accepts two arguments, both of which must evaluate to numbers, and returns 1 if the first value is less than or equal to the second, or 0 otherwise. > (<= 3 4) 1 negate: (negate expr) The "negate" intrinsic accepts one argument which must evaluate to a fixnum, and returns that fixnum negated. > (negate 3) -3 > (negate -3) 3 abs: (abs expr) Intrinsic "abs" evaluates its lone argument, and if it evaluates to a number, returns the absolute value of the number. Passing a non-number to "abs" will stop evaluation with an error. intern: (intern expr1) The "intern" intrinsic accepts one argument, which must evaluate to a string, and converts it into a symbol token consisting of the same sequence of characters, without the enclosing quotation marks. Passing a non-string to "intern" will generate an error which stops evaluation. 
> (set 'hello 4) 4 > (eval (intern "hello")) 4 char: (char expr1) The "char" intrinsic accepts one argument, which must evaluate to a number between 1 and 255, and creates a one-character string consisting of the character corresponding to the character code specified by that number. code: (code expr1) The "code" intrinsic accepts one argument which must evaluate to a string and returns a number representing the character code of the first character of the string. open: (open) The "open" intrinsic creates a new text buffer and makes the new buffer the active buffer. The buffer number of the new text buffer is returned on success. Errors stop evaluation. Buffer numbers are whole numbers, and can be used by the "switch" intrinsic to change the active buffer. close: (close) The "close" intrinsic closes the active text buffer, and makes the most-recently opened buffer that has not been closed the active buffer. 1 is returned on success. Errors stop evaluation. insert: (insert expr1 expr2 expr3) The "insert" intrinsic inserts a line into the active buffer. It accepts three arguments. The first argument must evaluate to a positive number and represents an index position in the buffer. The second argument, which must evaluate to a string, is the data to be inserted. The third argument, which must evaluate to a number, specifies whether the data should be inserted before the specified index position, inserted after the specified index position, or should overwrite the contents of the specified index position. A value of 0 indicates the index position should be overwritten. A positive value indicates the data should be inserted after the specified index. A negative value indicates the data should be inserted before the specified index. If the specified index position does not exist it will be created, and all the index positions preceding it in the buffer will also be created and initialized to be empty. 1 is returned on success, 0 on failure. 
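The three placement modes of "insert" can be modelled with an ordinary list standing in for the active buffer. This is a minimal sketch in Python, not Munger code. The explicit buffer parameter is hypothetical, and so is the assumption that a missing index position is first created as an empty line and the data is then placed relative to it.

```python
def insert(buffer, index, data, where):
    """Model of "insert": index is 1-based; where is 0, positive, or negative."""
    if index < 1:
        return 0  # the index must be a positive number
    # Create the target position, and any missing ones before it, as empty lines.
    while len(buffer) < index:
        buffer.append("")
    if where == 0:
        buffer[index - 1] = data        # overwrite the indexed line
    elif where > 0:
        buffer.insert(index, data)      # place the data after the indexed line
    else:
        buffer.insert(index - 1, data)  # place the data before the indexed line
    return 1


lines = []
insert(lines, 1, "first", 0)    # the buffer is empty, so line 1 is created
insert(lines, 1, "zero", -1)    # goes in front of "first"
insert(lines, 2, "last", 1)     # goes after "first", which is now line 2
insert(lines, 5, "five", 0)     # line 4 is created empty along the way
print(lines)
```

Overwriting with a third argument of 0 is the only mode in this model that leaves the number of existing lines unchanged.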
> (open) 1 > (insert 1 "This is the first line." 0) 1 delete: (delete expr1) The "delete" intrinsic removes a line from the active buffer. The function accepts one argument which must evaluate to a positive integer specifying the index of the line in the buffer to be deleted. 1 is returned on success, 0 on failure. retrieve: (retrieve expr1) The "retrieve" intrinsic returns the contents of a specified index position in the active buffer, as a string. The function accepts one argument which must evaluate to a positive integer representing the index of the desired line. An error is generated if the specified index does not exist. lastline: (lastline) The "lastline" intrinsic accepts no arguments, and returns the index value of the last line in the active buffer. Any error encountered will stop the interpreter. filter: (filter expr1 expr2 expr3) The "filter" intrinsic sends a range of lines from the active buffer to an external filter program, and replaces the lines in the buffer with the output from the filter program. The function accepts three arguments. The first two arguments must evaluate to numbers and inclusively specify the range of lines to be sent to the filter program. The first index does not have to be less than the second index. The lines, however, are always processed in ascending order, regardless of how the range was specified. An error is generated if any of the lines in the specified range do not exist. The third argument must evaluate to a string, representing the command line used to launch the filter. It is passed to /bin/sh for interpretation, and so may contain any expression that program understands. Upon success, "filter" returns the number of lines read from the stdout of the child process. filter_server: (filter_server expr1 expr2 expr3 expr4) The "filter_server" library function sends a range of lines in the active buffer to a TCP or UNIX domain server and replaces the lines in the buffer with the server's response. 
This function can be useful in retrieving data from HTTP servers. It accepts four arguments. The first two arguments must evaluate to fixnums specifying the range of buffer lines, inclusive, to be sent to the server. If the first argument is greater than the second, then the lines will be sent to the server in reverse order. The third and fourth arguments are passed to the "child_open" intrinsic. See the entry in this manual for that function for details on what these two arguments can be. The function returns the number of lines read back from the server. > (open) 0 > (insert 1 "GET / HTTP/1.0" 0) 1 > (insert 2 "" 0) 1 > (for (n 1 2) (insert n (concat (retrieve n) (char 13) (char 10)) 0)) 1 > (filter_server 1 2 "www.mammothcheese.ca" 80) 306 The data returned will not be processed in any way, so the HTTP response header will be present, as well as any chunk headers in the response body. The "remove_http_stuff" library function can be used to remove these items. remove_http_stuff: (remove_http_stuff) The "remove_http_stuff" library function accepts no arguments, and if invoked after an invocation of "filter_server", will remove the HTTP header and merge the response body if it has been "chunked" as per HTTP/1.1, from the data in the current buffer. After invocation, the current buffer will contain only the data of the requested resource. write: (write expr1 expr2 expr3 expr4 [expr5]) The "write" intrinsic writes content from the active buffer to a file. The function accepts four or five arguments. The first two arguments must evaluate to positive integers, and specify a range of lines, inclusively, to be written out. The third argument must evaluate to a string representing the filename to be written to. The fourth argument must evaluate to a number, and indicates whether the interpreter should attempt to get an exclusive lock on the file before writing to it. A non-zero value means to lock the file, while a value of 0 means to not lock the file. 
The optional fifth argument, if present, must evaluate to a number, and specifies whether the lines from the buffer should be appended to an already-existing file, or if a new file should be created, overwriting any existing file of the same name. A non-zero value means to append, while a value of 0 results in a new file being created. If the fifth argument is not present, an implicit fifth argument of 0 is assumed. Empty files can be created by passing 0 for both the first and second arguments. "write" creates files with both read and write permissions enabled for the owner, and read permission enabled for everyone else (mode 644). The first index argument does not have to be less than the second index argument, but the lines in the region so specified will be written to the file in ascending order, regardless of how the region was specified. Whitespace in the filename can either be present literally, or represented with the \t and \b escapes to represent the tab and the space character, respectively. This means if you wish to use a literal \b or \t in the filename, you must escape the backslash itself and use \\\\b or \\\\t instead. Remember to embed a single backslash into a string we must escape it with another backslash, so to embed two backslashes, we need to escape each of them with another backslash, totaling four backslashes. If this confuses you, try rereading the section on strings at the top of this document. The number of lines written is returned on success. If an error is encountered opening the file, a string describing the error is returned. All other errors stop evaluation. read: (read expr1 expr2) The "read" intrinsic inserts all the lines from a file into the active buffer, inserting the lines after a specified line. The function accepts two arguments. The first argument must evaluate to a number representing the index value of the line after which to insert the newly-read lines. 
If the buffer is empty, the first argument must be 0, or an error is generated. Similarly, to insert lines at the beginning of the buffer, the first argument should be 0. The function returns the number of lines read on success, -1 if the file to be read doesn't exist, and -2 if permission to access the file is denied. Any other failure of the read() system call will return a string describing the error. Other errors will stop evaluation. Whitespace in the filename can either be present literally, or represented with the \t and \b escapes to represent the tab and the space character, respectively. This means if you wish to use a literal \b or \t in the filename, you must escape the backslash itself and use \\\\b or \\\\t instead. Remember to embed a single backslash into a string we must escape it with another backslash, so to embed two backslashes, we need to escape each of them with another backslash, totaling four backslashes. If this confuses you, try rereading the section on strings at the top of this document. empty: (empty) The "empty" intrinsic accepts no arguments, and removes all data from the active buffer. All data in the active buffer are permanently lost. slice: (slice expr1 expr2 expr3 expr4 expr5) The "slice" intrinsic returns part of a line in the active buffer or a description of part of a line in the buffer. The function accepts five arguments, which must all evaluate to numbers. The first argument specifies the index value of the line to be sliced in the buffer. The second argument specifies the character index where the slice starts. The third argument specifies the length of the slice in characters. If the length argument is zero, this is interpreted as meaning, "to the end of the line." The fourth argument specifies where tabstops occur. 
Tabs are expanded before the slice is taken, so argument 2 refers to the "screen" x-coordinate, which may be different from the actual character located at that index position in the buffer, due to tab expansion. For example, a fourth argument of 3 indicates that tabstops are considered to occur every three columns. The fifth argument specifies what the function returns. A value of 0 indicates the caller wishes to receive the specified slice as a string. A value of 1 indicates the caller wishes to receive a two-element list describing the specified slice. The first element of this list is the length of the slice, in characters, which may be less than the specified length, if the specified length extended past the end of the line. This is the actual length of the slice in the buffer before tab expansion. The second element is the number of extra characters which would be added to the line during tab expansion from the beginning of the line to the end of the specified slice. find: (find expr1 expr2 expr3 expr4 expr5) The "find" intrinsic searches a specified range of lines in the active buffer for a match on a specified regular expression. The function accepts five arguments. The first three arguments, and the fifth argument, must all evaluate to numbers, while the fourth argument must evaluate to a compiled regular expression object. The first argument specifies the direction of the search: a positive value causes the search to proceed forward in the buffer, while a negative value causes the search to proceed backwards in the buffer. The second argument is interpreted as the line number at which to start the search. The third argument is interpreted as the character within the line at which to start the search. Remember that lines in the buffer are indexed from 1, while characters in lines are indexed from 0. The fifth argument specifies whether the search should "wraparound" if it fails. A non-zero value enables wraparound, while 0 disables wraparound. 
For example, if a search were in the forward direction and failed, a fifth argument of 1 would cause the search to begin again at the beginning of the buffer, looking for matches before the specified starting position. The fourth argument is the compiled regular expression object used for the search. A match starting exactly at the specified starting place of the search will be ignored, and the position of the next non-overlapping match, if any, will be sought. Newlines are temporarily removed from the end of buffer lines before a match is attempted. This means ^$ will match empty lines. The function returns a list of three numbers. If a match is found, the first element will be the index of the line containing the match, while the second element will be the character index in the line of the start of the text that matched, and the third element will be the length, in characters, of the match. If no match occurred, all three values will be zero. The three returned values can be used to pluck out the text of a match from the buffer: > (open) 0 > (insert 1 "this is the first line in the buffer." 0) 1 > (set 'f (find 1 1 0 (regcomp "buffer") 0)) (1 30 6) > (substring (retrieve (car f)) (cadr f) (car (cddr f))) "buffer" The following function will count the number of blank lines in the buffer, assuming the buffer has been opened and loaded with text. Newlines are removed from the end of lines before the regular expression is applied, so the regular expression will match blank lines. (set 'blank (lambda () (let ((idx 1) (regexp (regcomp "^[\b\t]*$")) (count 0)) (while (set 'idx (car (find 1 idx 0 regexp 0))) (inc count)) count))) buffer: (buffer) The "buffer" intrinsic accepts no arguments, and returns the buffer number of the active text buffer, or -1 if no buffers have been opened. buffers: (buffers) The "buffers" intrinsic accepts no arguments, and returns a list of whole numbers representing the buffer numbers of all currently open buffers. 
If no buffers have been opened, the function returns the empty list. switch: (switch expr) The "switch" intrinsic makes a specified buffer the active buffer. The function accepts one argument which must evaluate to the buffer number of an open buffer, and makes that buffer the active buffer. Errors stop evaluation. setmark: (setmark expr1 expr2) The "setmark" intrinsic is used to mark a line in the buffer for later reference. The function accepts two arguments, the first of which must evaluate to an atom of any type, which names the bookmark. The second argument must evaluate to a valid line number of a line in the active buffer. Upon success 1 is returned. The mark will be adjusted accordingly after buffer insertions and deletions in order to track the marked line. If the line is subsequently deleted with "delete" then any bookmarks pointing to that line will be set to -1. Invoking "getmark" on a subsequent occasion will return the marked line's current line number. Bookmarks are local to the active buffer, and each buffer may have an unlimited number of bookmarked lines. The active set of bookmarks is switched when the "switch" intrinsic changes the active buffer. Only the bookmarks for the active buffer can be altered or examined. getmark: (getmark expr1) The "getmark" intrinsic is used to retrieve the line number of a marked line in the active buffer. The function accepts one argument which must evaluate to an atom of any type, which names the desired bookmark. The current line number of the marked line will be returned if the line has not been deleted from the buffer. If the desired mark has not been set, "getmark" returns 0. If the marked line has been deleted, "getmark" returns -1. transfer: (transfer b1 f1 t1 b2 t2) The "transfer" intrinsic copies a contiguous range of lines from one buffer into another already-opened buffer. The function accepts five arguments which must all evaluate to numbers. The first argument is the buffer number of the source buffer. 
The second and third arguments specify the starting and ending lines, inclusive, of the range to be copied from the source buffer. If the second argument is greater than the third, then the lines will be copied in reversed order into the destination buffer. The fourth argument is the buffer number of the destination buffer, while the fifth argument is the index in the destination buffer, after which the copied lines will be inserted. To copy lines to the front of the destination buffer, a fifth value of 0 is used. Whatever buffer was active when "transfer" was invoked will be active when "transfer" returns. The function returns 1 on success. Any error will stop evaluation. with_buffer: (with_buffer expr1 expr2 ...) The "with_buffer" macro temporarily changes the active buffer. The macro accepts two or more arguments, the first of which must evaluate to the buffer number of an open buffer. This buffer is made the active buffer, and then the arguments subsequent to the first are evaluated. When the last argument has been evaluated, the buffer which was active before the macro was invoked is made the active buffer once again. The result of evaluating the last argument is returned. version: (version) The "version" intrinsic accepts no arguments and returns a list of two numbers: the first being the major version number of the lisp interpreter, and the second being the minor version number. gensym: (gensym) The "gensym" intrinsic creates and returns a unique anonymous symbol, called a gensym. Gensyms cannot be named in your code, because the lisp reader does not recognize the print syntax for a gensym. Gensyms are useful when it is necessary for a macro to create a working variable in its returned expression, which must not conflict with other variables used in the program invoking the macro. The customary way to manipulate gensyms is to bind them to other symbols, and evaluate these symbols in macro templates. 
The definition of the "with_buffer" macro from library.munger is presented below as an example. (let ((buff (gensym))) (set 'with_buffer (macro (x (y)) (qquote (let ((,buff (buffer))) (switch ,x) (protect ,(cons 'progn y) (switch ,buff))))))) A gensym is bound to the symbol "buff" in a lexical closure surrounding the macro definition. This symbol's value, the gensym itself, is inserted into the macro template at appropriate places to act as a temporary to store the buffer current at the time the macro is invoked. Because the gensym is anonymous, we can be sure we are not shadowing a binding used by the program in which the macro is invoked, which the code passed to the macro as its second argument might modify. Furthermore, every invocation of the macro uses the same gensym, because it is created in the outer "let" lexical closure enclosing the macro definition, but because we insert the gensym into another "let" in the macro expansion, we can be sure that nested invocations of "with_buffer" will only see their own lexical binding. libdir: (libdir) The "libdir" intrinsic accepts no arguments and returns a string specifying the location of the interpreter's library files as a fully-qualified directory name, without the trailing virgule. > (libdir) "/usr/local/share/munger" > strcmp: (strcmp expr1 expr2) The "strcmp" intrinsic lexicographically compares two strings. The function accepts two arguments which must evaluate to strings, and returns an integer greater than, equal to, or less than zero, indicating that the first string is either "greater than" (would sort after), "equal to" (identical), or "less than" (would sort before) the second string. > (strcmp "a" "b") -1 > (strcmp "b" "a") 1 > (strcmp "a" "a") 0 > (strcmp "A" "a") -32 substring: (substring expr1 expr2 expr3) The "substring" intrinsic is provided for extracting substrings from strings, using character indices. The function accepts three arguments. 
The first must evaluate to a string, while the second and third must evaluate to whole numbers. The second argument specifies the character index (indices start at 0) where the desired substring begins. The third argument is the number of characters to include in the substring. The "substring" intrinsic returns the specified substring as a new string. The third argument may be zero, which is interpreted to mean "to the end of the string." > (substring "foobar" 0 0) "foobar" > (substring "foobar" 3 0) "bar" > (substring "foobar" 0 3) "foo" > (substring "foobar" 0 10) "foobar" expand: (expand expr1 expr2) The "expand" intrinsic performs tab expansion on arbitrary strings. The function accepts two arguments, the first of which must evaluate to a positive integer, while the second must evaluate to a string. The first value is interpreted as the location of tabstops (i.e., every expr1 characters). A new string is returned, which has the same content as the original second argument, but with any tab characters expanded into an appropriate number of spaces. The number of spaces resulting from each expansion depends upon the position of the specific tab character in its enclosing line. An expansion will always contain at least one space character, but may contain up to expr1 space characters. As well, the expansion of tabs occurring earlier in expr2 will influence the expansion of tabs occurring later in expr2. lines: (lines) The "lines" intrinsic returns the number of lines on the terminal device on which the interpreter is running. If the interpreter is not running on a terminal device, the returned value cannot be predicted. cols: (cols) The "cols" intrinsic returns the number of columns on the terminal device on which the interpreter is running. If the interpreter is not running on a terminal device, then the returned value cannot be predicted. 
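The tab-expansion rule described for "expand" above can be modelled outside Munger. The following Python sketch is not Munger's implementation; it assumes that tabstops fall at multiples of the first argument counted from column 0, and that the string is a single line. The function name and the sample strings are illustrative only.

```python
def expand(tabstop: int, text: str) -> str:
    """Model of the documented rule: each tab becomes 1..tabstop spaces."""
    if tabstop < 1:
        raise ValueError("tabstop must be a positive integer")
    pieces = []
    column = 0  # column the next output character will occupy
    for ch in text:
        if ch == "\t":
            # Pad to the next tabstop; always at least one space.
            pad = tabstop - (column % tabstop)
            pieces.append(" " * pad)
            column += pad
        else:
            pieces.append(ch)
            column += 1
    return "".join(pieces)


# The second tab expands to fewer spaces because of where the first one
# left the column counter.
print(repr(expand(4, "a\tbc\td")))
# A tab that already sits on a tabstop expands to a full tabstop of spaces.
print(repr(expand(4, "abcd\tx")))
```

Tracking the output column rather than the input index is what makes earlier expansions influence later ones, as the entry for "expand" notes.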
exit: (exit expr) The "exit" intrinsic accepts one argument, which must evaluate to a number, and causes the interpreter to exit with that number as the exit value returned to the system. complete: (complete expr) The "complete" intrinsic performs filename completion. The function accepts one argument which must evaluate to a string and is the partial file or directory name to be completed. The function returns a list of one or more strings. The first element of the list is always the result of applying the completion algorithm to the argument, and will be the same as the initial argument if the completion algorithm could not add more characters to it. If more than one element is present in the list, more characters may have been added to the initial argument, but it still did not unambiguously name a file. The subsequent list elements will be preformatted lines of text containing all the possible completions for the first element organized into a table the width of the terminal device, and may be printed as is. > (complete "/usr/share/pe") ("/usr/share/perl/man/" "./ ../ man3/ whatis cat3/ ") A leading ~ in the argument to "complete" will trigger home directory abbreviation expansion, similarly to csh. Home directory abbreviations are expanded, but not completed. For example, if the argument to "complete" is either "~" or begins with "~/", these characters will be replaced with the path to the current user's home directory, but arguments starting with text matching this pattern: ~[^/]+ will have those characters replaced with the path to the specified user's home directory, only if such a user exists. The function will not attempt to complete an incomplete home directory abbreviation. input: (input expr1 expr2) The "input" intrinsic reads data from a process into the active buffer. The function accepts two arguments. 
The first argument must evaluate to a whole number and specifies the line number to insert the new data after, while the second argument must evaluate to a string specifying the command-line to pass to the shell (/bin/sh) to launch the source process, and so may be any expression that program understands. The number of lines read is returned on success. If the child shell cannot find the specified program or the process exits prematurely for any reason, "input" returns 0. Any other errors encountered will stop evaluation. Whitespace in the filename can either be present literally, or represented with the \t and \b escapes the "substitute" and "match" intrinsics recognize to represent the tab and the space character, respectively. This means if you wish to use a literal \b or \t in the filename, you must escape the backslash itself and use \\b or \\t instead. output: (output expr1 expr2 expr3) The "output" intrinsic writes content from the active buffer to a process. The function accepts three arguments. The first two arguments must evaluate to positive integers and specify the range of lines, inclusively, to be written to the process. The first index does not have to be less than the second index, but the range of lines so specified will be written to the child process in ascending order, regardless of how the range was specified. The third argument must evaluate to a string, and specifies the command-line to be passed to the shell to launch the process, and so may be any expression that program understands. The number of lines written to the process is returned upon success. If the child shell cannot find the specified program or the process exits prematurely for some other reason, "output" will return 0. All other errors will stop evaluation. Whitespace in the filename can either be present literally, or represented with the \t and \b escapes the "substitute" and "match" intrinsics recognize to represent the tab and the space character, respectively. 
This means if you wish to use a literal \b or \t in the filename, you must escape the backslash itself and use \\b or \\t instead. system: (system expr1) The "system" intrinsic is an interface to the system() library function. The function accepts one argument which must evaluate to a string, and passes it to the shell for execution. The function returns the exit status of the shell; 127 if execution of the shell failed; -1 if fork() or waitpid() fails. maxidx: (maxidx) The "maxidx" intrinsic accepts no arguments, and returns the highest possible index in the buffer that the interpreter will recognize. chdir: (chdir expr1) The "chdir" intrinsic changes the current directory. The function accepts one argument which must evaluate to a string, specifying the new current working directory. The function returns 1 on success or a string describing the error condition on failure. table: (table) The "table" intrinsic returns a new associative array table, which may be used to store lisp objects indexed by atomic keys. Table objects are constants which evaluate to themselves. See the "hash", "unhash", "keys", and "values" intrinsics for the details on using tables. hash: (hash expr1 expr2 expr3) The "hash" intrinsic is used to store data into a table. The function accepts three arguments, the first of which must evaluate to the table to be modified. The second argument must evaluate to any atom, while the third argument can evaluate to an object of any type. The second argument becomes the "key" associated with the third argument "value." The "hash" intrinsic always returns the result of evaluating the third argument. Note that "lookup" returns the empty list if a key has no association, so there is no way to differentiate between a key associated with the empty list and a key with no association. 
> (set 't (table)) <TABLE#0> > (hash t 'zero "zero") "zero" If you are going to insert a large number of objects (> 1000000) into a table at once, you might consider using "gc_freq" to turn off garbage collection before the insertions, and then turn it back on after. > (setq t (table)) <TABLE#1> > (for (n 0 999999) (hash t n n)) 999999 keys: (keys expr1) The "keys" intrinsic accepts one argument, which must evaluate to a table, and returns a list of all the objects used as hash keys in the table, in no particular order. If the specified table is empty, the empty list is returned. > (set 't (table)) <TABLE#0> > (hash t "0" "zero") "zero" > (hash t "1" "one") "one" > (keys t) ("1" "0") values: (values expr1) The "values" intrinsic accepts one argument, which must evaluate to a table, and returns a list of all of the values stored in the table object in no particular order. If the specified table is empty, the empty list is returned. > (set 't (table)) <TABLE#0> > (hash t "0" "zero") "zero" > (hash t "1" "one") "one" > (hash t "2" "two") "two" > (values t) ("two" "zero" "one") unhash: (unhash expr1 expr2) The "unhash" intrinsic removes a key/value pair from a table. It accepts two arguments, the first of which must evaluate to the table to be modified, while the second must evaluate to the key of the key/value pair to be removed. The "unhash" intrinsic always returns the result of evaluating the second argument. > (set 't (table)) <TABLE#0> > (hash t "0" "zero") "zero" > (unhash t "0") 0 > (keys t) () lookup: (lookup expr1 expr2) The "lookup" intrinsic is used to retrieve an object associated with an atom in a table. The function accepts two arguments. The first argument must evaluate to the table to be searched, while the second argument must evaluate to an atom. If another lisp object is associated with the second argument in the specified table, the associated object is returned, otherwise the empty list is returned. 
Note that there is no way to tell the difference between a key with no association and a key associated with the empty list. > (set 't (table)) <TABLE#0> > (hash t "1" "one") "one" > (hash t "0" "zero") "zero" > (lookup t "1") "one" > (lookup t "0") "zero" > (lookup t "2") () sqlite_open: (sqlite_open expr) The "sqlite_open" intrinsic opens a SQLite database file. The function accepts one argument which must evaluate to a string specifying the filename for the database. It will be created if it does not exist. The opened database object is returned upon success. Upon failure, a string describing the error encountered is returned. Database objects are constants which evaluate to themselves. sqlite_close: (sqlite_close expr) The "sqlite_close" intrinsic closes an open SQLite database file. The function accepts one argument which must evaluate to the opened database object to close. If the database is currently open the function closes it and returns 1, otherwise, it returns 0. sqlite_exec: (sqlite_exec expr1 expr2) The "sqlite_exec" intrinsic executes SQL commands on an opened SQLite database file. The function accepts two arguments, the first of which must evaluate to the opened database object to query, and the second of which must evaluate to a string specifying the SQL command to execute. If the command is successfully executed, a list is returned. If the command would not normally return any data, or the command returns the empty set, an empty list is returned; otherwise, a list of lists is returned. The first sublist will contain the column keys, while each subsequent sublist will contain one row of returned table data. If a null entry in a row is encountered, an empty string will be returned for that field in its associated list. If an error is encountered during execution of the SQL query, a string is returned describing the error. Data returned by the SQLite interface is expressed as strings. 
Numbers are returned as the string representation of the appropriate value, and may be converted back to a number with the "digitize" intrinsic. "sqlite_exec" provides the simplest interface to the SQLite library, but another row-by-row interface is provided, which may be more convenient when working with rows containing large chunks of data, and which is also more efficient when the user wishes to invoke a SQL statement multiple times on the same database. The row-oriented interface is provided by the "sqlite_prepare", "sqlite_bind", "sqlite_step", "sqlite_row", "sqlite_reset", and "sqlite_finalize" intrinsics, detailed below. See http://www.sqlite.org for further details. sqlite_prepare: (sqlite_prepare db sql) The "sqlite_prepare" intrinsic is used to compile a SQL statement for multiple uses with the alternative SQLite interface. The function accepts two arguments, the first of which must evaluate to an opened SQLite database object, while the second must evaluate to a string containing the SQL to be compiled against that database. The function returns a compiled SQL object upon success, or a string describing an error condition, upon failure. The compiled SQL object is an opaque constant atom and may be passed as an argument to "sqlite_bind", "sqlite_step", "sqlite_row", "sqlite_reset", or "sqlite_finalize". The SQL statement passed to this function may contain parameters of the form ?, ?NNN, or :AAA, where NNN is a number and AAA is an alphanumeric identifier. By using "sqlite_bind" values may be inserted in place of these parameters. This is documented in the entry for "sqlite_bind". > (setq db (sqlite_open "document.db")) <db#1> > (setq sql (sqlite_prepare db "SELECT path FROM document WHERE parent = 0")) <sql#1> sqlite_bind: (sqlite_bind sql index text) Statements given to "sqlite_prepare" may contain parameter references in place of SQL literals, of the forms ?, ?NNN, or :AAA where NNN is a number, and AAA is an alphanumeric identifier. 
The "sqlite_bind" intrinsic is used to set or change the values bound to these parameters. Unfortunately, only SQL literal values may be parameterized, that is, strings and numbers, and not column or table names. The function accepts three arguments. The first argument must evaluate to a compiled SQL statement returned by "sqlite_prepare". The second argument must evaluate to the index position of the parameter to be substituted. Parameter indices start at 1. The third argument must evaluate to a string containing the replacement text. Upon success, the function returns 1. Otherwise a string will be returned describing an error condition. Parameters are specified by their ordinal position in the SQL query. Note that parameters with the same name all share the same index value, that value being the index of the first occurrence of the parameter name in the SQL statement. Using named parameters, the same value may be substituted into a SQL statement in different locations. Note that "sqlite_bind" must be invoked on a SQL statement after a call to "sqlite_prepare" or "sqlite_reset", and before any call to "sqlite_step". > (setq db (sqlite_open "example.db")) <DB#1> > (setq sql (sqlite_prepare db "SELECT name FROM employees WHERE job = ?")) <SQL#1> > (sqlite_bind sql 1 "supervisor") 1 > (for (((setq more (sqlite_step sql))) (more) ((setq more (sqlite_step sql)))) >> (print (sqlite_row sql)) >> (newline)) ("Bob") > (sqlite_reset sql) 1 > (sqlite_bind sql 1 "technician") 1 > (for (((setq more (sqlite_step sql))) (more) ((setq more (sqlite_step sql)))) >> (print (sqlite_row sql)) >> (newline)) ("Sally") ("Jeffrey") ("Boodles the cat") ("George") sqlite_step: (sqlite_step sql) The "sqlite_step" intrinsic is used to apply a compiled SQL object to its database to generate the returned data for a single row of the result set. 
The function accepts one argument which must evaluate to a compiled SQL object generated by "sqlite_prepare", and returns 1 if a row of data has been generated, or 0 if the data in the result set has been exhausted, or a string describing an error condition, upon failure. If the function returns 1, then "sqlite_row" may be invoked on the compiled SQL object to retrieve the generated data for the current row. Further invocations of "sqlite_step" will generate data for successive rows of the result set, until the result set has been exhausted. sqlite_row: (sqlite_row sql) The "sqlite_row" intrinsic may be called after a successful invocation of "sqlite_step" to retrieve a row of data from the result set of a SQL query. The function accepts one argument, which must evaluate to a compiled SQL object returned by "sqlite_prepare" which has had "sqlite_step" invoked on it, and returns a list of strings upon success. Each string represents the data for a single column in the current row of the result set. Upon failure, the function returns a string describing an error condition. > (setq db (sqlite_open "document.db")) <db#1> > (setq sql (sqlite_prepare db "SELECT * FROM table1")) <sql#1> > (for (((setq more (sqlite_step sql))) (more) ((setq more (sqlite_step sql)))) >> (print (sqlite_row sql)) >> (newline)) ("first column first row" "second_column first row") ("first column second row" "second_column second row") sqlite_reset: (sqlite_reset sql) The "sqlite_reset" intrinsic is used to reset a compiled SQL object after "sqlite_step" has returned 0 when invoked on it. The function accepts one argument, which must evaluate to a compiled SQL object. The function returns 1 upon success, or a string describing an error condition, upon failure. After a successful invocation of "sqlite_reset", "sqlite_step" and "sqlite_row" may be invoked on the compiled SQL object to regenerate the previous result set. 
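The parameter forms accepted by "sqlite_prepare" and "sqlite_bind" belong to SQLite itself, so the note that one named parameter may appear in several places can be tried outside Munger. The sketch below uses Python's standard sqlite3 module, not the Munger intrinsics. Python hides the prepare, step, and reset calls behind execute(), and it returns typed values rather than strings. The table, column names, and rows are invented for illustration.

```python
import sqlite3

# In-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, job TEXT, backup_job TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",  # positional ? parameters
    [
        ("Bob", "supervisor", "technician"),
        ("Sally", "technician", "clerk"),
        ("George", "clerk", "supervisor"),
    ],
)

# The named parameter :job occurs twice but is bound only once.
QUERY = (
    "SELECT name FROM employees "
    "WHERE job = :job OR backup_job = :job ORDER BY name"
)


def names_for(job):
    """Run the same statement with a new binding, like reset followed by bind."""
    rows = conn.execute(QUERY, {"job": job}).fetchall()
    return [row[0] for row in rows]


supervisors = names_for("supervisor")
technicians = names_for("technician")
print(supervisors)
print(technicians)
conn.close()
```

Only values are parameterized here; the table and column names are fixed in the SQL text, matching the restriction described for "sqlite_bind".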
sqlite_finalize: (sqlite_finalize sql) The "sqlite_finalize" intrinsic frees the resources associated with a compiled SQL object. The function accepts one argument, which must evaluate to a compiled SQL object generated by "sqlite_prepare" and returns 1 upon success, or a string describing an error condition, upon failure. "sqlite_finalize" MUST BE CALLED ON EVERY COMPILED SQL OBJECT ASSOCIATED WITH A PARTICULAR DATABASE, BEFORE THAT DATABASE MAY BE CLOSED BY "sqlite_close". Failure to do this may cause incomplete updates to be rolled back and transactions to be canceled. The garbage collector will call "sqlite_finalize" on any compiled SQL objects it deallocates, but implicit deallocation should not be relied upon, as there is no guarantee the database object will not be garbage collected before the SQL statements, which may result in corruption of the database. sqlp: (sqlp expr) The "sqlp" intrinsic accepts one argument, which may evaluate to any type of object, and returns 1 if that object is a compiled SQL object generated by "sqlite_prepare". Otherwise, it returns 0. stack: (stack [expr]) The "stack" intrinsic creates a new stack object. The function accepts one optional argument, which, if present, must evaluate to a positive integer specifying a number of elements to preallocate on the stack. Each element will be set to the empty list. Omitting the optional argument is the same as invoking (stack 0). The newly-created stack object is returned. > (set 's (stack)) <STACK1> > (used s) 0 > (set 's (stack 10)) <STACK2> > (used s) 10 push: (push expr1 expr2) The "push" intrinsic pushes an object onto the top of a stack. The function accepts two arguments, the first of which must evaluate to the stack to be affected, while the second argument may be any lisp object. The function returns the result of evaluating the second argument. 
> (set 's (stack)) <STACK1> > (push s 'foo) foo > (index s 0) foo pop: (pop expr) The "pop" intrinsic removes an object from the top of a stack. The function accepts one argument, which must evaluate to the stack to be affected. The removed object is returned. When the stack is empty, the empty list is returned. Note the only way to tell the difference between an empty stack and one which has the empty list stored in its top element, is to invoke the "used" intrinsic. > (set 's (stack)) <STACK1> > (push s 'foo) foo > (push s 'bar) bar > (pop s) bar > (pop s) foo > (pop s) () > (used s) 0 unshift: (unshift expr1 expr2) The "unshift" intrinsic adds a new element onto the bottom of a stack. The function accepts two arguments, the first of which must evaluate to a stack, while the second may evaluate to any lisp object. The result of evaluating the second argument is prepended to the stack, and also returned by the function. > (set 's (assign (stack) '(1 2 3 4))) <STACK1> > (flatten s) (1 2 3 4) > (unshift s 0) 0 > (flatten s) (0 1 2 3 4) shift: (shift expr) The "shift" library function removes an element from the bottom of a stack. The function accepts one argument which must evaluate to a stack. If the stack is empty, the empty list is returned, otherwise the removed element is returned. > (set 's (assign (stack) '(0 1 2 3 4))) <STACK1> > (flatten s) (0 1 2 3 4) > (shift s) 0 > (shift s) 1 > (flatten s) (2 3 4) index: (index expr1 expr2) The "index" intrinsic fetches the object stored at a specified index in a stack. The function accepts two arguments, the first of which must evaluate to the stack to be accessed, while the second must evaluate to a whole number specifying the desired element on the stack. The bottom location on a stack has an index value of zero, and the index value of the object on the top of the stack is one less than the number of elements on the stack. 
Specifying an index value of less than zero, or more than the index of the last element on the stack, generates an error which stops evaluation. > (set 's (assign (stack) '(a b c))) <STACK1> > (index s 0) a > (index s 2) c > (index s 3) <INDEX>: index 3 out of range. > topidx: (topidx expr) The "topidx" intrinsic accepts one argument which must evaluate to a stack, and returns the index value of the top element on the stack, or -1 if the stack is empty. > (topidx (stack 1)) 0 used: (used expr) The "used" intrinsic obtains the number of elements currently on a specified stack. The function accepts one argument which must evaluate to the stack to be queried, and returns a whole number describing the number of elements currently on that stack. > (used (stack 10)) 10 store: (store expr1 expr2 expr3) The "store" intrinsic stores an object into a specified element of a stack. The function accepts three arguments, the first of which must evaluate to the stack to be affected, while the second argument must evaluate to a whole number specifying the element of the stack to overwrite, and the third argument may evaluate to any lisp object. The result of evaluating the third argument is stored in the specified index of the specified stack, and the result of evaluating the third argument is returned. Specifying an index value of less than zero, or more than the index of the last element on the stack, generates an error which stops evaluation. > (set 's (stack 3)) <STACK1> > (store s 0 'foo) foo > (store s 1 'bar) bar > (store s 2 'wumpus) wumpus > (flatten s) (foo bar wumpus) > (store s 3 'error) <STORE>: index 3 out of range. > clear: (clear expr1 expr2) The "clear" intrinsic allows the user to remove and discard multiple elements from the top of a specified stack. The function accepts two arguments, the first of which must evaluate to a stack object, while the second must evaluate to a whole number. 
The second argument specifies the number of elements to remove from the top of the stack. "clear" discards the removed elements and always returns 1. > (setq s (assign (stack) '(1 2 3 4 5))) <STACK#1> > (clear s 4) 1 > (flatten s) (1) > (assign s '(1 2 3 4 5)) <STACK#1> > (clear s (used s)) 1 > (flatten s) () assign: (assign expr1 expr2) The "assign" library function allows the user to store all the elements of a list into a stack at once, starting at index zero. The previous contents of each affected element are overwritten. If the number of elements on the stack is too small to hold all the objects in the list, the function pushes new elements onto the stack until it can hold the entire list, before performing the assignments. If the number of elements on the stack is greater than the number of items in the list, those elements on the stack starting at the index value equal to the length of the list, and continuing to the top of the stack, inclusive, remain unaffected. The function returns the stack object affected. > (set 's (assign (stack) '(a b c d e))) <STACK1> > (used s) 5 flatten: (flatten expr) The "flatten" library function returns the elements of a stack in ascending order, as a list. The function accepts one argument which must evaluate to the stack to be queried. > (flatten (assign (stack) '(a b c d e))) (a b c d e) child_open: (child_open expr [expr]) The "child_open" intrinsic opens a full-duplex connection to another process which may be communicated with by the "child_read" and "child_write" intrinsics. Only one child process may be running at any one time. The function accepts one or two arguments. With the one-argument form, the lone argument must evaluate to a string specifying a command line to pass to the shell (/bin/sh) to run. The function will return 1 if the child can be created, otherwise it will return a string describing an error condition. 
If the child process cannot find the specified program, or is not able to run it, it will exit and print an error to stderr. The user may check for a successful launch of the specified program by invoking "child_running" after invoking "child_open".

The two-argument form of the function is used to communicate with another process, local or remote, over a TCP socket, or a local process over a UNIX domain socket. To open a local or remote TCP connection, the first argument must evaluate to a string specifying the local or remote hostname or IP address. For UNIX domain connections the first argument must evaluate to a string representing the pathname in the filesystem to a UNIX domain socket. For TCP connections, the second argument must evaluate to either a string specifying a service defined in /etc/services, such as "http", or a fixnum specifying a port number to attempt to connect to. For UNIX domain connections, the second argument must be 0. If the interpreter successfully opens a connection to the specified entity, then it returns 1, otherwise it will return a string describing an error condition.

child_running: (child_running)

The "child_running" intrinsic accepts no arguments, and returns 1 if a child process is running; otherwise, the function returns 0.

child_ready: (child_ready)

The "child_ready" intrinsic accepts no arguments, and returns 1 if data is waiting to be read from a child process; otherwise the function returns 0. If a child process has not been started with the "child_open" intrinsic, 0 is returned as well.

child_wait: (child_wait)

The "child_wait" intrinsic accepts no arguments and blocks until data from a child process is ready for "child_read" to consume, at which point it returns 1. If no child process is running, it immediately returns 0.

child_close: (child_close)

The "child_close" intrinsic terminates a child process launched by "child_open". The function accepts no arguments, and always returns 1.
If there is no child process running, the function does nothing.

child_eof: (child_eof)

The "child_eof" intrinsic closes the writable half of the connection to a child process opened with "child_open". The function accepts no arguments and returns 1 upon success. The child process will read EOF on its next read from its standard input. This can be useful when working with programs which buffer data. Any subsequent attempt to write to the child with "child_write" will generate an error which will stop evaluation. Note that a connection which has had its writable half closed with "child_eof" still needs to be fully closed by invoking "child_close" on it, when one is finished with the child.

For an example of the use of this intrinsic, consider a situation where one is sending lines of text to the fmt utility. This program will buffer input text until it has enough to print a full line of output, unless it encounters a blank line or EOF, when any buffered text will be output to form a short line. In the situation where one has sent data to the utility, and it has formatted all the data except for the tail end which is not long enough to form a complete line, the program will block forever until it reads more data or EOF from stdin. By invoking "child_eof" we can cause fmt to read EOF on stdin and print the last of its buffered output. If we invoked "child_close" we would close both halves of the full-duplex connection, and so be unable to read that last short line of data back from the child.

child_write: (child_write expr1 ...)

The "child_write" intrinsic writes a list of strings to a child process launched by "child_open". The function accepts one or more arguments, which must all evaluate to strings, and writes them to the child process. The "child_write" function returns 1 on success. Any errors encountered will stop evaluation.

> (child_open "/usr/local/bin/munger")
1
> (for (a 1 10)
>> (child_write a (char 10)))
1 ; this is the return value of the "child_write"
>> (print (child_read)))
1 2 3 4 5 6 7 8 9 10

child_read: (child_read)

The "child_read" intrinsic reads up to 1024 characters of data emitted by a child process, and returns it as a string. The function accepts no arguments. If "child_read" is invoked when no child process is running, an error will be generated which will stop evaluation. If no data can be read after 30 seconds, "child_read" will return the empty string. Zero is returned if EOF is read from the child process.

> (child_open "munger")
1
> (child_write "(set 'foo 'bar)" (char 10))
1
> (child_read)
"bar "
1

clearscreen: (clearscreen)

The "clearscreen" intrinsic accepts no arguments, and if stdout is connected to a terminal device, clears the screen. 1 is returned on success. Any error encountered will stop evaluation.

clearline: (clearline expr1 expr2)

The "clearline" intrinsic accepts two arguments, both of which must evaluate to whole numbers. If stdout is connected to a cursor-addressable terminal device, it will clear the line specified by the first argument, starting at the column specified by the second argument, to the end of the line. The coordinates for screen lines and columns start at 0. The position 0,0 is the top leftmost position on the screen. The maximum values for the terminal device can be ascertained with the "lines" and "cols" intrinsics. 1 is returned on success. Any error encountered will stop the interpreter.

goto: (goto expr1 expr2)

The "goto" intrinsic accepts two arguments, both of which must evaluate to whole numbers. If stdout is connected to a cursor-addressable terminal device, the function places the cursor at the specified screen coordinates. The coordinates for screen lines and columns start at 0. The position 0,0 is the top leftmost position on the screen.
The maximum values for the terminal device can be ascertained with the "lines" and "cols" intrinsics. 1 is returned on success. Any error encountered will stop the interpreter.

getchar: (getchar)

The "getchar" intrinsic accepts no arguments. It does a blocking read on stdin until a character can be read, when it returns a whole number in the range of 0-255, representing the character code of the input character. If stdin is a terminal device, it will be taken out of canonical mode, so that unbuffered, uninterpreted data may be read. If EOF is encountered, "getchar" returns -1. If a SIGWINCH is received by the interpreter while waiting for data, the function returns -2 immediately. Any other error encountered will cause the function to return a string describing the error.

The "getchar" intrinsic is intended for use in creating interfaces which use character-I/O. If you just want to read a character from stdin, and stdin is redirected onto a file, use (getchars 1) instead.

pushback: (pushback expr1)

The "pushback" intrinsic accepts one argument which must evaluate to a whole number in the range 0-255. It causes the next invocation of "getchar" to return the number pushed back. After returning this value, subsequent invocations of "getchar" will read from the terminal again. 1 is returned on success. Any error encountered stops evaluation. Note that "pushback" does not work with "getchars", only "getchar".

display: (display expr1 expr2 expr3)

The "display" intrinsic aids the implementation of interactive buffer inspection tools. The function accepts three arguments, all of which must evaluate to whole numbers.
If stdout is connected to a cursor-addressable terminal device, the function will print buffer lines, one per screen line, starting with the buffer line whose index value corresponds to the first argument, and continuing with subsequent buffer lines, until it has printed one fewer buffer lines than there are screen lines, or until it runs out of buffer lines, if there are not enough. If it runs out of lines, the function will print single tilde characters on each of the remaining screen lines, except for the last. If the first argument is zero, "display" prints a full screen of tildes, and returns.

The second argument specifies a buffer column to start printing with. If non-zero, only the portions of the lines from the specified column onward will be printed. Note the second argument does not specify a screen column, but a buffer column, and that tab expansion is performed before the slice is taken, according to the value of the third argument, discussed below. Lines will always be printed starting at screen column 0. Lines longer than the terminal width are truncated to the terminal width.

The third argument specifies the tabstop periodicity. A value of 4, for example, indicates that tabs occur every 4 characters in a line. Any tabs found in the specified lines will be expanded according to this value before truncation and printing. For the details of tab expansion, see the description of the "expand" intrinsic, elsewhere in this document.

1 is returned on success. Any errors encountered stop evaluation.

boldface: (boldface)

The "boldface" intrinsic turns on boldface mode of the terminal connected to stdout. The function accepts no arguments, and always returns 1.

normal: (normal)

The "normal" intrinsic turns off boldface mode and resets the colors of the terminal connected to stdout to their default values. The function accepts no arguments and always returns 1.
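As a rough sketch of how "boldface" and "normal" bracket a piece of output, the fragment below emphasizes a label and then returns to the default rendition before printing the rest of the line. The message text is invented for the example, and the fragment assumes stdout is a terminal that supports boldface.

```lisp
; Print a label in boldface, then the rest of the line normally.
(boldface)
(print "Warning:")
(normal)                       ; turn boldface off again
(print " disk is almost full") ; hypothetical message text
(newline)
```

Calling "normal" as soon as the emphasized text has been written keeps later output from inheriting the attribute.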
fg_black: fg_red: fg_green: fg_yellow: fg_blue: fg_magenta: fg_cyan: fg_white:
bg_black: bg_red: bg_green: bg_yellow: bg_blue: bg_magenta: bg_cyan: bg_white:

These sixteen functions set the foreground or background color of the terminal connected to stdout to the specified color. These functions accept no arguments and always return 1.

hide: (hide)

The "hide" intrinsic accepts no arguments, and if stdout is connected to a terminal device capable of having its cursor made invisible, the function hides the cursor.

show: (show)

The "show" intrinsic accepts no arguments, and if stdout is connected to a terminal device capable of having its cursor made invisible, the function shows the cursor.

pause: (pause expr1)

The "pause" intrinsic accepts one argument which must evaluate to a whole number no larger than 999999, specifying a number of microseconds for the interpreter to sleep. The function returns when data is waiting on stdin or when the time value has expired. See the "sleep" intrinsic if you need to sleep for longer periods. "sleep" is also much less processor intensive.

scrollup: (scrollup)

If the device connected to stdin is cursor-addressable, the "scrollup" intrinsic scrolls the screen lines upward by one line. The line at the top of the screen is lost, while the line at the bottom of the screen becomes blank.

scrolldn: (scrolldn)

If the device connected to stdin is cursor-addressable, the "scrolldn" intrinsic scrolls the screen lines downward by one line. The last line is lost, while the first line becomes blank.

insertln: (insertln)

If the device connected to stdin is cursor-addressable, the "insertln" intrinsic scrolls the lines on the screen from the line the cursor is on, to the last line on the screen, inclusive, downward by one line. The line the cursor is on is cleared, while the last line on the screen is lost.
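The color, cursor-visibility, and "pause" intrinsics described above can be combined as in the following sketch. The colors, the message, and the half-second delay are arbitrary choices made for illustration, and the fragment assumes stdout is connected to a terminal that supports color and an invisible cursor.

```lisp
; Show a short colored notice with the cursor hidden.
(hide)
(fg_yellow)
(bg_blue)
(print "working...")   ; hypothetical message text
(pause 500000)         ; up to half a second, or until input arrives on stdin
(normal)               ; turn off boldface and restore the default colors
(show)
```

"normal" is used to undo both color changes at once, since the text above states that it resets the terminal's colors to their default values.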
printer: (printer)

The "printer" intrinsic turns on the lisp printer, if it has been turned off with an invocation of the "noprinter" intrinsic. The result of evaluating an expression at toplevel is discarded unless the printer is turned on. The printer is turned on by default.

noprinter: (noprinter)

The "noprinter" intrinsic turns off the lisp printer, if it is turned on. The result of evaluating an expression at toplevel is discarded unless the printer is turned on. The printer is turned on by default.

shexec: (shexec expr)

The "shexec" intrinsic overlays a new process on top of the interpreter process, similarly to how the "exec" command works in the shell. The function accepts one argument which must evaluate to a string, and attempts to use the execv() system call to run the shell (/bin/sh -c) with the command line specified in the argument string. Upon success, the Munger interpreter process is abandoned, and the shell process starts running with the same process id. It "replaces" the interpreter. This is useful when a script has finished its work and wishes to run another program to do some further processing. If the new process image cannot be exec-ed, then the function returns a string describing the error.

exec: (exec expr ...)

The "exec" intrinsic behaves similarly to the "shexec" intrinsic. It overlays a new process image on top of the interpreter process. The function accepts one or more arguments, which must all evaluate to strings, and interprets them as a command followed by its command-line arguments. It DOES NOT pass its arguments to the shell for interpretation. Upon success, the Munger interpreter process is abandoned, and the new process starts running with the same process id. It "replaces" the interpreter. This is useful when a script has finished its work and wishes to run another program to do some further processing. If the new process image cannot be exec-ed, then the function returns a string describing the error.
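To illustrate the difference between the two intrinsics, the sketch below shows one call of each. The command lines and the /bin/ls path are assumptions chosen for the example. Because either call replaces the interpreter on success, a real script would use only one of them, as its final action.

```lisp
; Alternative 1: hand a command line to /bin/sh, which interprets the pipe.
(shexec "ls -l /tmp | wc -l")

; Alternative 2: run a program directly. Each string is passed as one
; argument, and no shell is involved, so nothing is expanded or re-split.
; This line is reached only if the "shexec" above failed.
(exec "/bin/ls" "-l" "/tmp")
```

"shexec" is the convenient choice when shell features such as pipes or redirection are wanted. "exec" avoids shell interpretation entirely, which matters when arguments contain characters the shell would treat specially.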
truncate: (truncate expr)

The "truncate" intrinsic alters the length of a file connected to the standard output of the interpreter. The function accepts one argument which must evaluate to an integer specifying the new length of the file. If the specified length is greater than the length of the file, the file will be extended and the extended portion filled with zeros. The function returns 1 on success, or if an error is encountered, a string describing the error.

> (with_output_file_appending "HOW_TO_EXTEND_IT"
>> (when (writelock)
>>> (truncate 100)))
1
> (with_input_file "HOW_TO_EXTEND_IT"
>> (setq line (getchars 1000)))
"ADDING INTRINSICS TO THE INTERPRETER
------------------------------------

Under the hood, Munger is"
> (length line)
100

dynamic_let: (dynamic_let (symbol expr) expr1 [expr2...])

Intrinsic "dynamic_let" allows the creation of bindings with dynamic scope, which is to say, such a binding is globally visible for the time the "dynamic_let" is executing. A dynamically-scoped variable in Munger is a global which will cease to exist when the "dynamic_let" exits, or if a global of the same name existed previously, revert to its former value. The bindings created by "dynamic_let" cannot be captured by closures. A lexical binding of the same name as a dynamic binding will shadow the dynamic binding. This means a binding in an exterior "let" expression can shadow the binding in an interior "dynamic_let".

The intrinsic accepts two or more arguments, the first of which is not evaluated and must be a two-element list consisting of a symbol followed by any expression. The expression will be evaluated in the current scope and the result bound to the symbol for the duration of the "dynamic_let". The remaining arguments are then evaluated, in order, with the new binding in effect. When the expression terminates the dynamic binding is removed. The value of the last expression evaluated is returned by "dynamic_let".
The purpose of "dynamic_let" is to allow one to make temporary changes to global variables without having to save the previous value and restore it afterwards. By definition, a global is a variable one wishes to be globally visible. One would only knowingly mask it with a local binding if one wished to have some local code find a different value bound to the same symbol. A "let" is sufficient for this purpose, but if one wishes other functions which the local code may invoke to see the changed value as well, then "dynamic_let" must be used.

> (defun f () a)
<CLOSURE#57>
> (dynamic_let (a 10)
>> (f))
10
> (f)
evaluate: symbol a not bound.
f: error evaluating body expression 1.
> (defun a (x) 'foobar)
<CLOSURE#58>
> (defun g (x) (a x))
<CLOSURE#59>
> (dynamic_let (a (lambda (x) x))
>> (g 23))
23
> (g 23)
foobar
> (dynamic_let (a 10)
>> (defun h () a))
<CLOSURE#60>
> (h)
evaluate: symbol a not bound.
h: error evaluating body expression 1.
; Exterior "let" is shadowing interior "dynamic_let":
> (let ((a 12))
>> (dynamic_let (a 10)
>>> (print a)
>>> (newline)))
12
1 ; return value of "newline"

basename: (basename expr)

The "basename" intrinsic returns just the filename portion of a specified path. The function accepts one argument which must evaluate to a string and returns a string. If the argument string does not have a filename component, "basename" returns the empty string. NOTE: the path specified by the argument string does not have to exist in the filesystem. This is a string manipulation function only.

dirname: (dirname expr)

The "dirname" intrinsic returns just the directory portion of a specified path. The function accepts one argument which must evaluate to a string and returns a string. If the argument does not have a directory component, "." is returned. NOTE: the path specified by the argument string does not have to exist in the filesystem. This is a string manipulation function only.
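A brief sketch of how "basename" and "dirname" split a path follows. The path is hypothetical and need not exist, since both functions only manipulate strings. The comments describe what the text above leads one to expect rather than captured interpreter output.

```lisp
; Hypothetical path; nothing is looked up in the filesystem.
(basename "/no/such/dir/report.txt")  ; the filename portion, "report.txt"
(dirname "/no/such/dir/report.txt")   ; the directory portion of the same path
(dirname "report.txt")                ; no directory component, so "."
```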
chmod: (chmod expr1 expr2)

The "chmod" intrinsic may be used to change the permissions associated with a specified file. The function accepts two arguments, both of which must evaluate to strings. The first argument must be a new mode specification of the form accepted by chmod(1), in either symbolic or octal form. The second argument must evaluate to the filename of the file to be affected. "chmod" returns 1 on success; otherwise it returns a string describing an error condition.

chown: (chown expr1 expr2 expr3)

The "chown" intrinsic may be used to change the owner (if the euid of the interpreter is the superuser's), and/or the group associated with a specified file. The function accepts three arguments, which all must evaluate to strings. The first argument must be the name of the new owner, or the empty string if the owner is not to be changed. The second argument must be the name of the new group, or the empty string if the group is not to be changed. The third argument must be the filename of the file to be affected. "chown" returns 1 upon success; otherwise, it returns a string describing an error condition. Only the superuser may change the ownership of files.

crypt: (crypt expr)

The "crypt" intrinsic is a frontend to the crypt(3) library function. It accepts one argument which must evaluate to a string and encrypts it using the default scheme used to encrypt passwords in the user database. The encrypted string is returned. Any error encountered by crypt(3) will stop evaluation.

checkpass: (checkpass expr1 expr2)

The "checkpass" intrinsic verifies that a user and password pair are correct for the system on which it is running. The function accepts two arguments which must both evaluate to strings, the first of which specifies the user name, and the second of which specifies the password of that user.
The function will always return 0 unless the euid of the interpreter is 0 (the superuser), when it will return 1 if the user name and password are correct, or 0 if they are not.

setuid: (setuid expr)

The "setuid" intrinsic is used to change the uid and euid of the interpreter process. It accepts one argument which must evaluate to a string specifying the user to change to. The function will return a string describing an error condition, unless the euid of the interpreter is 0 (the superuser) and the requested user exists, when it will change to the specified user and return 1. It is not possible to switch back to the superuser after invoking "setuid" to become a non-privileged user.

seteuid: (seteuid expr)

The "seteuid" intrinsic is used to change the euid of the interpreter process if it is running set-user-id. It accepts one argument which must evaluate to a string specifying the user to change to, and returns 1 upon success or a string describing an error condition upon failure. The euid of a set-user-id interpreter may be switched between the real uid and the saved set-user-id. If the interpreter is not running setuid, then the uid, the euid, and the saved set-user-id will all be the same and therefore it will not be possible to change the euid.

getuid: (getuid)

The "getuid" intrinsic accepts no arguments and returns a two element list consisting of the name of the real user the interpreter is running as, followed by the numerical uid of that user. If an unforeseen error occurs which prevents the interpreter from determining its own uid, the function returns the empty list.

setgid: (setgid expr)

The "setgid" intrinsic is used to change the gid of the interpreter process. It accepts one argument which must evaluate to a string specifying the group to change to. The function returns 1 upon success, or a string describing an error condition upon failure.

setegid: (setegid expr)

The "setegid" intrinsic is used to change the effective gid of the interpreter process.
It accepts one argument which must evaluate to a string specifying the group to change to. The function returns 1 upon success, or a string describing an error condition upon failure. The egid of a set-group-id interpreter may be switched between the real gid and the saved set-group-id. If the interpreter is not running set-group-id, then both the saved set-group-id and the real gid will be the same.

geteuid: (geteuid)

The "geteuid" intrinsic accepts no arguments and returns a two element list consisting of the name of the effective user the interpreter is running as, followed by the numerical uid of that user. If an unforeseen error occurs which prevents the interpreter from determining its own euid, the function returns the empty list.

getgid: (getgid)

The "getgid" intrinsic accepts no arguments, and returns a two element list consisting of the name of the primary group of the user the interpreter is running as, followed by the numerical gid of that group. If an unforeseen error occurs which prevents the interpreter from determining its gid, then the function returns the empty list.

seek: (seek expr1 expr2 expr3)

The "seek" intrinsic is used to move the file pointer of a file connected to one of the standard descriptors. The function accepts three arguments, the first of which must evaluate to 0, 1, or 2, and specifies whether the seek operation should affect the file pointer of stdin, stdout, or stderr, respectively. The second argument must evaluate to an integer specifying the number of characters by which the file pointer should be adjusted, and the third argument must evaluate to one of the following three strings: "SEEK_SET", "SEEK_CUR", or "SEEK_END", and specifies the position which the adjustment is relative to. "SEEK_SET" indicates the seek operation should seek from the beginning of the file. "SEEK_CUR" indicates the seek operation should seek from the current position of the file pointer. "SEEK_END" indicates the seek operation should seek from the end of the file.

Upon success the function returns the number of characters from the beginning of the file corresponding to the new location of the file pointer. Further reads or writes to the stream will happen relative to the new position of the file pointer. Any errors encountered will stop evaluation. Seeking past the end of a file connected to stdout will cause the file to be automatically extended, with the new portion filled with zeroes.

NOTE: calling "getline" on stdin will cause subsequent invocations of "seek" to return unexpected values, as "getline" does its own buffering. If the user wishes to intersperse the reading of data from stdin and calls to "seek" on stdin, the "getchars" intrinsic must be used to perform the read operations.

getchars: (getchars expr [expr])

The "getchars" intrinsic is used to read a specific number of characters from the stream connected to stdin. It accepts one or two arguments which both must evaluate to fixnums. The first must evaluate to a whole number specifying the number of characters to read, while the second, if present, must evaluate to a positive value specifying a number of seconds after which a timeout will occur. Upon success, the characters read are returned as a string. Any errors encountered will stop evaluation.

If "getchars" encounters EOF while reading, or a timeout occurs, fewer than the desired number of characters may be returned. If EOF was encountered, any successive invocation of "getchars" will return fixnum 0. Note that invoking "getchars" with a first argument of 0 always causes it to immediately return the empty string without even attempting to read from stdin.

Mixing calls to "getline" and "getchars" will produce unexpected results, as "getline" performs its own input buffering. You MAY mix calls to "getchars" with calls to "getline_ub", however.

The presence of a timeout value causes "getchars" to operate as follows.
The function calls the read() system call until it has either read the desired number of characters, or it encounters EOF on the input stream. This is useful when stdin is connected to a socket. If any invocation of read() takes longer than the number of seconds specified by the timeout value, it will be interrupted, and "getchars" will return all the characters read so far. If no characters have been read, then "getchars" will return the empty string. This is the only other circumstance in which it will return the empty string.

When invoked without a timeout value, but with a positive first argument, the function will block indefinitely until it can read at least one character or EOF, and either return a non-empty string, or 0 on EOF. This means that after stdin has been connected to a socket with "accept", a return value of "" from (getchars 100 5) means a timeout occurred before any data was read from the socket, while a return value of 0 indicates EOF.

When reading from a terminal device in canonical mode with a timeout value specified, the empty string may be returned even though the user has typed some characters, because the terminal driver will not return any character data to the interpreter in canonical mode until a carriage return or a newline is input.

readlock: (readlock)

The "readlock" intrinsic is used to obtain a shared lock on a file connected to stdin. The function accepts no arguments and returns 1 upon success, or 0 if the file is already locked by another process. If the lock cannot be obtained for any other reason, the function returns a string describing the error.

writelock: (writelock)

The "writelock" intrinsic is used to obtain an exclusive lock on a file connected to stdout. The function accepts no arguments and returns 1 upon success, or 0 if the file is already locked by another process. If the lock cannot be obtained for any other reason, the function returns a string describing the error.
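The following sketch shows one way "writelock" might guard an append to a shared file, following the (when (writelock) ...) idiom used in the "truncate" example in this document. The filename "app.log" and the text written are hypothetical. The sketch does not distinguish an error string returned by "writelock" from success, and it does not address output buffering.

```lisp
; Append one line to a log file while holding an exclusive lock.
; "app.log" is a hypothetical filename.
(with_output_file_appending "app.log"
   (when (writelock)        ; 0 means another process holds the lock
      (print "entry")
      (newline)
      (unlock 1)))          ; 1 selects the lock on stdout
```

The "unlock" intrinsic used on the last line is described elsewhere in this document.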
unlock: (unlock expr)

The "unlock" intrinsic is used to release a lock obtained by the "readlock" or "writelock" intrinsics. The function accepts one argument which must evaluate to either 0 or 1, specifying whether a lock on stdin or stdout is to be released, respectively.

hostname: (hostname)

The "hostname" intrinsic accepts no arguments and returns the name of the host it is running on, or a string describing an error condition it encountered while attempting to retrieve the hostname.

symlink: (symlink expr1 expr2)

The "symlink" intrinsic creates a symbolic link to a pre-existing filesystem entity. The function accepts two arguments, both of which must evaluate to strings. The first argument specifies the pre-existing filesystem entity, while the second argument names the symbolic link. "symlink" returns 1 upon success, or a string describing an error condition, if the symlink() system call failed.

gecos: (gecos expr)

The "gecos" intrinsic queries the user database for the value of the gecos field for a specified account. The function accepts one argument which must be a string specifying the user name associated with the account. If such a user exists, the value of the gecos field is returned as a string, otherwise the empty string is returned. The gecos field is used to store personal information about the user. Traditionally, it held 4 comma-separated fields containing the user's full name, office location, work phone number, and home phone number, but the system does not care what goes into the gecos field. Most administrators nowadays simply place the user's full name there.

record: (record expr)

The "record" intrinsic creates a fixed-size unidimensional array. Records are a more space-efficient means of representing fixed-size aggregate types. The one argument must evaluate to a positive integer specifying the size of the array. The items of the record are all preset to the empty list upon creation.
getfield: (getfield expr1 expr2)

The "getfield" intrinsic is used to retrieve an item from a record. It accepts two arguments, the first of which must evaluate to a record, while the second must evaluate to a positive integer specifying the index of the desired item in the array. The object at that index is returned.

setfield: (setfield expr1 expr2 expr3)

The "setfield" intrinsic is used to insert an object into a record. The function accepts three arguments. The first argument must evaluate to the record to be affected. The second argument must evaluate to a positive integer specifying the index location to overwrite. The third argument can evaluate to any lisp object and will be inserted into the first argument at the location specified by the second argument. The evaluated third argument is returned.

extend: (extend expr1 expr2)

The "extend" intrinsic adds a new binding to the currently-active local environment. The function accepts two arguments, the first of which must evaluate to a symbol, while the second of which may evaluate to any lisp object. The symbol is added to the local environment and bound to the result of evaluating the second argument, replacing any pre-existing local binding to that symbol. Note that if the pre-existing symbol occurs free in the current lexical environment, then it is shadowed, not replaced. The result of evaluating the second argument is returned. The intent behind the inclusion of this intrinsic in the interpreter is to allow extensions to the current lexical environment without the use of "let" and friends, for increased efficiency.

> (defun fact (n)
>> (extend 'a 1)
>> (while (> n 1)
>>> (setq a (* n a))
>>> (dec n))
>> a)
<CLOSURE#23>
> (fact 10)
3628800

Any lambda-expressions closing over the current lexical environment will "see" the new binding, because closures do not simply close over the lexical bindings visible at the time of their creation, but rather the lexical environments visible at the time of their creation.
The difference between these two notions is that environments may be dynamically extended with "extend" to contain a superset of the bindings visible at creation. If the closure is applied before an invocation of "extend" it will not see the binding created, since it will be created in the future, but if that VERY SAME CLOSURE is applied after an invocation of "extend" it will see the new binding created. The new binding suddenly appears in the current lexical environment. This means a closure bound to a new local via "extend" will "see" its own binding when it is applied, and therefore may call itself by name.

The extent of bindings created by "extend" is unlimited, but may be limited by wrapping the expressions in which they occur with an invocation of "dynamic_extent", described elsewhere in this document.

gc: (gc)

The "gc" intrinsic sets the garbage collector to run on the next evaluation. The garbage collector normally runs once every 65536 evaluations. Judicious invocations of this function may allow the user to decrease the memory consumption of his or her program.

dynamic_extent: (dynamic_extent ...)

The "dynamic_extent" intrinsic limits the extent of additions to the current lexical environment made with the "extend" intrinsic. The intrinsic accepts zero or more arguments. If no arguments are supplied, "dynamic_extent" does nothing and returns 1. If arguments are supplied, they are evaluated in order, and the result of evaluating the last expression is returned. If no lexical environment exists at the time of invocation, an error will be generated which will stop evaluation. When "dynamic_extent" finishes, any additions made to the current lexical environment by the expressions in its body are removed.
Combinations of "dynamic_extent" and "extend" can replace occurrences of "let", "letn", or "labels" inside functions where the bodies of those expressions do not contain closures, or which contain closures which do not need to close over the new bindings, which will not persist beyond the extent of "dynamic_extent". In the example below, the new bindings introduced for "b" and "a" do not have their extent limited, and any closures formed in the body of the "let" would continue to "see" these bindings when the "let" returned, even if the closures were closed before the invocations of "extend". The binding to "c", however, has its extent limited to the extent of the invocation of "dynamic_extent". If no new bindings are introduced via "extend" inside an instance of "dynamic_extent", then it effectively does nothing.

   > (let ((a 10))
   >> (extend 'b (* a a))
   >> (dynamic_extent
   >>> (extend 'c (* b b))
   >>> (print c)
   >>> (newline))
   >> (print (boundp 'c))
   >> (newline))
   10000 ; first "print"
   0 ; second "print"

gc_freq: (gc_freq expr)

The "gc_freq" intrinsic allows the programmer to change the rate at which garbage collection occurs. The default is once every 1048576 new objects or internal atoms allocated. Internal atoms are not to be confused with lisp atoms. Each unique syntax has one internal atom representing all occurrences of that syntax. So, (+ a a) is a list of three objects, but each of the 'a' objects points to the same internal atom. You don't need to know this. This function accepts one argument, which must evaluate to a whole number fixnum specifying a new value for the GC frequency. Increasing the value will cause the interpreter to run faster up to a point, but consume more memory, while decreasing the value will cause the interpreter to run more slowly but consume less memory. The function returns the old frequency value. Setting GC frequency to zero will disable garbage collection.
The programmer can manually invoke garbage collection with the "gc" intrinsic somewhere else in his or her program. There is a point, when garbage collection is turned off or set to happen very infrequently, where the size of the object and atom pools will grow to be so large that GC itself will become the performance bottleneck of your program. This is because these pools are never returned to the system. Keep this in mind.

getpid: (getpid)
getppid: (getppid)
getpgrp: (getpgrp)
tcgetpgrp: (tcgetpgrp)

The "getpid", "getppid", "getpgrp", and "tcgetpgrp" intrinsics accept no arguments and each returns a fixnum representing the process id of the interpreter, or the process id of the parent process of the interpreter, or the process group id of the process group the interpreter belongs to, or the process group which is currently the foreground process associated with the terminal device the interpreter is running on, respectively. The "tcgetpgrp" function will return 0 if the interpreter is not associated with a terminal device. For any other error it will return a string describing the error condition.

setpgid: (setpgid expr1 expr2)

The "setpgid" intrinsic accepts two arguments which both must evaluate to fixnums. The first argument must be a process id of a running process, while the second must be the process group id of a running process group or it must be the same as the first argument. The function puts the process specified by the first argument into the process group specified by the second argument. Upon success, the function returns 1, otherwise it returns a string describing an error condition. The processes which may be affected by this intrinsic are described in the manual page for the "getpgrp" system call. If the first argument is zero, then the pid of the interpreter process is used as the first argument. If the second argument is zero, then the first argument will be used for the second argument, as well.
New process groups are created by setting both arguments to the same value. If the affected process is not already a process group leader, it will become the process group leader of a new process group, and the process group id will be the same as its process id.

tcsetpgrp: (tcsetpgrp expr)

The "tcsetpgrp" intrinsic accepts one argument which must evaluate to a fixnum specifying a process group id and makes that process group the foreground process associated with the terminal device the interpreter is running on. If the interpreter has no associated controlling terminal, then the function returns 0. Upon success, it returns 1. Otherwise it returns a string describing an error condition. The processes which may be affected by this intrinsic are described in the manual page for the "tcsetpgrp" system call. Note that although it is not mentioned in the manual page, one cannot call "tcsetpgrp" when one is a background process, unless one has blocked SIGTTOU, which may be accomplished by calling the "block" intrinsic.

kill: (kill expr1 expr2)

The "kill" intrinsic accepts two arguments, the first of which must evaluate to a fixnum representing the process id of a running process, while the second argument must evaluate to a fixnum representing a signal number. The function sends the signal specified by the second argument to the process specified by the first argument. If successful, the function returns 1, otherwise it returns a string describing an error condition. The processes which may be affected by this intrinsic are described in the "kill" system call (man 2 kill). A table mapping signal numbers to signal names may be found in the "signal" manual page.

killpg: (killpg expr1 expr2)

The "killpg" intrinsic accepts two arguments, the first of which must evaluate to a fixnum representing the process group id of a running process group, while the second argument must evaluate to a fixnum representing a signal number.
The function sends the signal specified by the second argument to every process in the process group specified by the first argument. If successful, the function returns 1, otherwise it returns a string describing an error condition. The processes which may be affected by this intrinsic are described in the manual page for the "killpg" system call (man 2 killpg). A table mapping signal numbers to signal names may be found in the "signal" manual page.

fork: (fork)

The "fork" intrinsic is a wrapper for the "fork" system call. The function accepts no arguments and generates a new interpreter process which is an exact copy of the current process. The function returns 0 to the child interpreter, and the process id of the child interpreter to the parent interpreter. Upon error, -1 is returned. The interpreter will reap any child processes which exit while it is running, unless the "zombies" intrinsic has been invoked, in which case, the programmer must reap them manually using the "wait" intrinsic. Read the entries for "child_open", "pipe", "with_input_process", and "with_output_process", and determine if they will do what you need a new process to do, before using "fork" or "forkpipe", because these other intrinsics and macros are more convenient to use.

forkpipe: (forkpipe expr)

The "forkpipe" intrinsic behaves similarly to the "fork" intrinsic, but creates a pipe between the two interpreters. The function accepts one argument, which must evaluate to one of 0, 1, or 2, and specifies which of the parent interpreter's descriptors is attached to the pipe. These values correspond to the values of the file descriptors for the standard streams: 0 is the standard input, 1 is the standard output, and 2 is the standard error. The value specified implies which of the child interpreter's descriptors is attached to the pipe. If parent descriptor 1 or 2 is specified, then the child interpreter will have its standard input connected to the pipe.
If parent descriptor 0 is specified, then the child will have its standard output connected to the pipe. The function returns 0 in the child interpreter, and the process id of the child process in the parent interpreter. If the fork() system call fails, -1 is returned. Any error encountered while creating the pipe or duping descriptors will stop evaluation. The interpreter will reap any child processes which exit while it is running, unless the "zombies" intrinsic has been invoked, in which case, the programmer must reap them manually using the "wait" intrinsic. Read the entries for "child_open", "pipe", "with_input_process", and "with_output_process", and determine if they will do what you need a new process to do, before using "fork" or "forkpipe", because these other intrinsics and macros are more convenient to use.

wait: (wait expr1 [expr2])

The "wait" intrinsic is used to reap a zombie process. When a child process terminates, its process table entry is preserved so that the interpreter may determine how it exited, and what its exit status was. When the interpreter starts up it is in the "nozombies" state, which means it will automatically reap any zombies created by terminated child processes, discarding their exit statuses. If the programmer needs his or her program to wait for a child process to complete before proceeding, he or she may invoke the child with the "system" intrinsic, but if the programmer wants the parent and child to proceed asynchronously, but still needs to know how the child process exited, or its exit status, then the programmer must invoke the "zombies" intrinsic before manually launching the child process with "fork" and "exec". To retrieve the child's termination information, the programmer invokes "wait" at some time subsequent to forking the child, to reap the child's process table entry. One should do this even if one subsequently decides one is not interested in the termination information.
If the process has not yet terminated, "wait" will block until it does.

The function accepts one or two arguments, the first of which must evaluate to a fixnum specifying a process id, 0, -1, or a process group id negated. This argument is passed as the first argument to the waitpid() system call. If the argument is a process id, "wait" will reap that process and return its termination information. If the argument is -1, then "wait" will reap any zombie child process waiting to be reaped, and return its termination information. If the argument is 0, then "wait" will reap any zombie child which belongs to the interpreter's process group. If the argument is a process group id negated, then "wait" will reap any zombie child whose process group id is equal to the absolute value of the argument.

If the second argument is present, it can evaluate to any value. It is a "don't block" boolean flag. If it is not present or if it evaluates to a boolean "false" value, AND there are no stopped or zombie processes which can satisfy the "wait" request, BUT there is at least one running process which can satisfy the "wait" request in the future, THEN "wait" will block until it can reap a process. Otherwise it will return immediately.

The function returns a two or three element list. If "wait" cannot reap a process it returns a list containing the fixnum 0 or -1, and the symbol ECHILD. If no second argument was supplied to "wait" or if the second argument evaluated to a boolean "false" value, then the first element of the returned list will be -1. This means there is no running or zombie process which can satisfy the "wait" now or in the future. If a "true" second argument was supplied to "wait", then the first element of the returned list may be either -1 or 0. A value of 0 indicates there are processes which can satisfy the "wait" but which are still running.
Otherwise, "wait" returns a list containing a fixnum representing the process id of the reaped process, followed by a symbol describing how the process exited, which will be one of EXITED, KILLED, or STOPPED. If the second element is EXITED, then the process terminated by calling the "exit" or "_exit" system calls, and a third element will be present which will be a fixnum representing the child's exit status, which is the argument it gave to "exit" or "_exit". If the second element is KILLED, then the process was terminated by a signal, and a third element will be present which will be a fixnum representing the signal number of the signal which terminated the process. If the second element is STOPPED, then the process was stopped by a signal and may be started again, and a third element will be present which will be a fixnum representing the signal number which stopped the process. Processes may be stopped by the job control related signals: SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU. The manual page for "signal" maps signal numbers to constant names.

If one wishes to simply ensure that all current zombies are reaped, one may invoke "wait" with an argument of -1, until it returns (-1 ECHILD).

   > (until (eq -1 (car (wait -1))))

zombies: (zombies)

The "zombies" intrinsic accepts no arguments, and when invoked causes the interpreter to stop reaping zombie child processes. They may be manually reaped with the "wait" intrinsic. Note that after invocation of "zombies", each invocation of "fork", "child_open", "with_input_process", "with_output_process", and "pipe" will generate a new process which must be manually reaped with the "wait" intrinsic. Note that "input", "output", and "filter" will all reap their own zombies, regardless of the setting of the zombies state. This is because it is easy for the interpreter to reap the processes forked by these three, since they are running only while the associated intrinsic function is running. The function always returns 1.
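The return values described for "wait" follow the status word that the waitpid() system call reports. The Python sketch below is an illustration of that mapping, not Munger code and not the interpreter's implementation. The helper names decode_status and wait are hypothetical, and the use of WUNTRACED to make stopped children reportable is an assumption.

```python
import os
import signal


def decode_status(pid, status):
    """Map a waitpid() status word to the list shape described for "wait"."""
    if os.WIFEXITED(status):
        return [pid, "EXITED", os.WEXITSTATUS(status)]
    if os.WIFSIGNALED(status):
        return [pid, "KILLED", os.WTERMSIG(status)]
    if os.WIFSTOPPED(status):
        return [pid, "STOPPED", os.WSTOPSIG(status)]
    raise ValueError("unrecognized wait status: %r" % (status,))


def wait(pid, dont_block=False):
    """Reap one child; pid takes the same values waitpid() accepts."""
    # Assumption: WUNTRACED is what lets stopped children be reported.
    options = os.WUNTRACED | (os.WNOHANG if dont_block else 0)
    try:
        reaped, status = os.waitpid(pid, options)
    except ChildProcessError:
        # No running or zombie child can satisfy the request.
        return [-1, "ECHILD"]
    if reaped == 0:
        # Only possible with WNOHANG: candidates exist but are still running.
        return [0, "ECHILD"]
    return decode_status(reaped, status)


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(3)                    # child: exit with status 3
    print(wait(child))                 # [<pid>, 'EXITED', 3]

    child = os.fork()
    if child == 0:
        signal.pause()                 # child: sleep until signalled
        os._exit(0)
    print(wait(child, True))           # [0, 'ECHILD'] while the child runs
    os.kill(child, signal.SIGKILL)
    print(wait(child))                 # [<pid>, 'KILLED', <SIGKILL number>]
```

The non-blocking call returns the 0 form only while a matching child is still running. Once every candidate has been reaped, both the blocking and non-blocking forms report -1.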
nozombies: (nozombies)

The "nozombies" intrinsic accepts no arguments, and when invoked causes the interpreter to reap zombie child processes. The interpreter starts up in the "nozombies" state. This function always returns 1.

zombiesp: (zombiesp)

The "zombiesp" intrinsic accepts no arguments and returns the value of the zombies state. It returns 1 if the interpreter is in the "zombies" state, and 0 otherwise.

glob: (glob expr)

The "glob" intrinsic accepts one argument, which must evaluate to a string, and calls the library function "glob" upon it, which interprets the string as a shell glob pattern and searches the filesystem for matches. Any matches found will be returned as a list of strings. If no matches are found, the empty list is returned. Any errors encountered will stop evaluation. It is not an error for a pattern to have no matches.

command_lookup: (command_lookup expr)

The "command_lookup" intrinsic accepts one argument which must evaluate to a string, and attempts to find a file executable by the user the interpreter is running as, in the directories specified by the PATH environment variable. If successful, the fully-qualified filename is returned. If not successful, the empty string is returned. If the user has changed the value of the PATH environment variable, or has added new executables to the directories specified by the environment variable, then the "rescan_path" intrinsic must be invoked to get the interpreter to update its internal list of executables.

   > (command_lookup "munger")
   "/usr/local/bin/munger"
   > (command_lookup "foobar")
   ""

getstring: (getstring expr)

The "getstring" library function returns the output of an external process as a string. The function accepts one argument which must evaluate to a command line to be passed to the shell (/bin/sh) and gathers up the data which appears on the program's standard output into a string, which it then returns. No processing is performed on the program output.
It is returned unaltered. If the interpreter cannot "forkpipe", the function returns -1. If the "forkpipe" is successful but the "shexec" is not, then the function will return the empty string. This means specifying a non-existent program to run, or a program to which you do not have read and execute permission, will cause the function to return the empty string.

dec2hex: (dec2hex expr)

The "dec2hex" intrinsic converts a fixnum representing a whole number into a string representing the number in hexadecimal notation. If the programmer passes a negative number to the function, an error will be generated.

   > (dec2hex 65535)
   "FFFF"

hex2dec: (hex2dec expr)

The "hex2dec" intrinsic converts a string representing a whole number in hexadecimal notation to a fixnum. The letter characters used in hexadecimal notation may be in either lower or upper case forms.

   > (hex2dec "FfFf")
   65535

listen: (listen expr [expr])

The "listen" intrinsic is used to start the kernel accepting incoming TCP connections for the interpreter process. Both IPv4 and IPv6 connections will be accepted. The function accepts one or two arguments. The first argument must evaluate to either a fixnum specifying the port number to accept incoming connections on, or a string naming a service, as listed in /etc/services. If the port number is 0, then the kernel will choose a port from the ephemeral ports. The second optional argument, if present, must evaluate to a string specifying the IP address of the interface to use. It must be expressed in the presentation format for either IPv4 or IPv6. If the second argument is not present, the function will accept connections on all interfaces of both protocol families. If successful, the function returns the port number the listening socket is listening on. Otherwise, it returns a string describing an error condition. The listen() system call is called with a backlog argument of 4096. This determines the number of connections the kernel will queue, awaiting service.
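The port-0 behavior and the backlog value mentioned above can be observed with the ordinary sockets API. The Python sketch below is only an illustration of those two points, not Munger code. The helper name listen_on is hypothetical, and unlike the intrinsic it binds a single IPv4 address (loopback by default) rather than all interfaces of both protocol families.

```python
import socket


def listen_on(port=0, address="127.0.0.1", backlog=4096):
    """Open a listening TCP socket; return it with the port actually bound."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((address, port))        # port 0: the kernel picks an ephemeral port
    sock.listen(backlog)              # the kernel may clamp the backlog value
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    sock, port = listen_on()
    print("listening on port", port)
    sock.close()
```

Reading the bound port back with getsockname() is what makes a port-0 request useful: the caller learns which ephemeral port the kernel chose.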
Only one listening socket may be active at any time. To accept connections on more than one interface simultaneously, the programmer must "fork" the interpreter, and have each instance call "listen". The listening socket can be closed with the "stop_listening" intrinsic. After a successful invocation the kernel will start accepting and queuing incoming connections on the specified interface. To service a connection, the programmer invokes the "accept" intrinsic, documented below. listen_unix: (listen_unix expr) The "listen_unix" intrinsic is used to start the kernel accepting incoming connections over a UNIX domain socket. The function accepts one argument, which must evaluate to a string specifying the desired pathname for the listening socket in the filesystem. If the entity so named exists, an attempt will be made to unlink it first. If the Munger interpreter lacks the necessary permissions to do so, the bind() system call will fail and "listen_unix" will return a string containing an error message. All other errors will also cause an error string to be returned. The function returns 1 on success. The second paragraph of the entry in this manual for the "listen" intrinsic applies to "listen_unix" as well. stop_listening: (stop_listening) The "stop_listening" intrinsic closes a listening socket opened by the "listen" or "listen_unix" intrinsics. The function accepts no arguments and returns 1 if a listening socket was active, or 0 otherwise. After calling "stop_listening" no more incoming connections may be accepted with "accept" until the programmer calls "listen" again. accept: (accept) The "accept" intrinsic accepts an incoming tcp connection. It can only be invoked after "listen" or "listen_unix" has been invoked, and is invoked repeatedly to accept successive incoming connections. The function accepts no arguments and returns 1 if successful, -1 if the system call was interrupted by a SIGTERM, or a string describing an error condition otherwise. 
The function blocks until an incoming connection has been completed and is ready for communication with the client who initiated it. When "accept" returns, the stdin and stdout of the interpreter will have been redirected onto the incoming connection. Any of the intrinsics which read and write from those descriptors ("print", "println", "newline", "getchar", "getchars", and "getline") may be used to communicate with the client. One must keep in mind "accept" works like "pipe" or "redirect" in that the new streams connected to the affected descriptors "shadow" the previously connected streams, but the previously-connected streams are still open in the interpreter. This means one should invoke both (resume 0) and (resume 1) when one is finished communicating with a client, before calling "accept" again, unless one intends to come back to the previously-accepted connection in the future.

To service more than one client at a time, the programmer may "fork" the interpreter multiple times and have each child call "accept" for itself in a loop to accept incoming connections, or the programmer may choose to have one process call "accept" and then fork a new process as needed to service each client.

In the example of an echo server below, the parent process "accepts" each incoming connection, then forks off a child to service it. The child calls "stop_listening" to close its reference to the listening socket. This is because the parent could exit before the child, and if the child still had a valid reference to the listening socket, then the kernel would still keep queuing incoming connections which would never be accepted, because the parent process does the accepting. Therefore, we want incoming connections rejected when the parent exits. The parent calls "resume" on both stdin and stdout to close its references to the newly-accepted client.
If it did not do this, the parent's connection to the client could remain open until the next time "resume" was invoked in the parent, unless the client explicitly closed its end of the connection. Note that this program must be run as "root" to bind to port 7, and will never exit unless "listen" or "accept" encounters an error. It must be killed with a signal.

   (fatal)
   (daemonize "echo1.munger")

   (defun service_client ()
      (while (setq line (getline))
         (print line)
         (flush_stdout)))

   (when (fixnump (setq err (listen 7)))
      (setuid "nobody")
      (while (fixnump (setq err (accept)))
         (if (not (fork))
            (progn
               (stop_listening)
               (service_client)
               (exit 0))
            (resume 0)
            (resume 1))))

   (syslog 'CRITICAL err)
   (exit 1)

Another example of an echo server is below. In this server, the parent calls "listen" then forks off 9 children. All ten processes then call "accept" in an infinite loop to service clients. All ten accept connections from the same listening socket, so all ten need to keep their references to it open. Since each process calls "accept" separately, none of them will have references to each other's accepted connections to worry about, but there is another detail related to "accept" we must worry about in this example. In order to ensure we do not keep previously accepted connections open, we must explicitly close each TCP connection when we are finished with it, before calling "accept" again, hence the calls to "resume" in the process_clients function below. Note that this program must be run as root to bind to port 7, and will never exit unless "listen" or "accept" encounters an error. It must be killed with a signal.
   (fatal)
   (daemonize "echo2.munger")

   (defun service_client ()
      (while (setq line (getline))
         (print line)
         (flush_stdout)))

   (defun process_clients ()
      (while (fixnump (setq err (accept)))
         (service_client)
         (resume 0)
         (resume 1)))

   (when (fixnump (setq err (listen 7)))
      (setuid "nobody")
      (catch
         (iterate 9
            (unless (fork)
               (throw 0)))))

   (process_clients)
   (syslog 'CRITICAL err)
   (exit 1)

get_scgi_header: (get_scgi_header)

The "get_scgi_header" intrinsic parses an SCGI header netstring from standard input and returns a list of strings, guaranteed to be a multiple of 2 in length, or fixnum 0. A 0 return value indicates the function encountered an error and gave up. A returned list of strings indicates the SCGI header was read successfully, and each pair of strings in the returned list will be a name of an SCGI environment variable and its value. For an example of usage see the scgi.munger example SCGI server. Standard input is positioned at the beginning of the SCGI body when the function returns. This function should not be used in conjunction with "getline", as that function does its own buffering, and "get_scgi_header" will not see already-read data that "getline" has accumulated in its buffer. Instead, to read data in a server program, use the unbuffered input functions, "getchars" or "getline_ub".

send_descriptors: (send_descriptors)

The "send_descriptors" intrinsic can only be invoked after "child_open" has been successfully invoked to open a client connection to a server process over a UNIX domain socket. The function accepts no arguments and returns 1 on success, or a string describing an error condition, otherwise. This function, in conjunction with "receive_descriptors", is used to cause one Munger interpreter to pass its standard input and output to another Munger interpreter. The server interpreter calls "receive_descriptors" while the client calls "send_descriptors".
After both intrinsics have returned success to their respective interpreters, the server's standard input will be connected to the same source the client's standard input is connected to, and the server's standard output will be connected to the same source the client's standard output is connected to. The connection over the UNIX domain socket is unaffected. The client may close it with "child_close" afterward, if the client has no more communication to accomplish with the server.

receive_descriptors: (receive_descriptors)

The "receive_descriptors" intrinsic can only be invoked after "listen_unix" and "accept" have been successfully invoked to accept a client connection over a UNIX domain socket. The function accepts no arguments and returns 1 on success, or a string describing an error condition, otherwise. In conjunction with "send_descriptors", this function is used to cause one Munger interpreter to pass its standard input and output to another Munger interpreter. The client invokes "send_descriptors" and the server invokes "receive_descriptors". After both intrinsics have returned success in their respective interpreters, the server's standard input will be connected to the same source the client's standard input is connected to, and the server's standard output will be connected to the same source the client's standard output is connected to. The "resume" intrinsic can be called by the server interpreter to return either or both descriptors to the sources they were formerly connected to, which is the connection to the client over the UNIX domain socket. Invoking "resume" again on both 0 and 1 will close the connection, and cause the server interpreter's standard input and output to be connected to the sources they were connected to before the call to "accept".

busymap: (busymap expr)

The "busymap" intrinsic is used to create a byte array of shared memory between parent and child server processes.
The function accepts one argument which must evaluate to a fixnum specifying the length of the array in bytes. Only one busymap can exist at any time. If a busymap already exists, then the function returns -1. Upon successfully creating a new busymap, the function returns 1. Otherwise, a string describing an error condition is returned. There is no locking mechanism provided to arbitrate access to the busymap. The intended usage is for slave processes of multi-process servers to write to the busymap with the "busy" and "notbusy" intrinsics, and the master process manager to read it with "busyp". See the httpd.munger example web server for usage.

Master server processes request that their children exit by sending them a SIGTERM (signal number 15) with the "kill" intrinsic. If this signal is received by the interpreter, the next invocation of "accept" will cause the interpreter to exit. If the interpreter is blocked in "accept" at the time of the arrival of the signal, the interpreter will exit immediately. This allows the slave server process to continue processing any established client connection to completion before exiting.

nobusymap: (nobusymap)

The "nobusymap" intrinsic frees a busymap which has been created with the "busymap" intrinsic. The function accepts no arguments and returns -1 if no busymap exists, 1 if the busymap was successfully unmapped, or a string describing an error condition. Each process which has access to the busymap, which is to say all the children of the caller of "busymap" who have not invoked "exec", must call "nobusymap", in order to completely remove the shared mapping.

busy: (busy expr)

The "busy" intrinsic sets a byte in a busymap to 1. The function accepts one argument which must evaluate to a fixnum specifying the index of a byte (indices start at 0) in the active busymap to affect. If no busymap exists, the function returns -1. If successful, the function returns 1.
If the index is out of range, an error will be generated which will stop evaluation. notbusy: (notbusy expr) The "notbusy" intrinsic sets a byte in a busymap to 0. The function accepts one argument which must evaluate to a fixnum specifying the index of a byte (indices start at 0) in the active busymap to affect. If no busymap exists, the function returns -1. If successful, the function returns 1. If the index is out of range, an error will be generated which will stop evaluation. busyp: (busyp expr) The "busyp" intrinsic returns the value of a byte in the active busymap. The function accepts one argument which must evaluate to a fixnum specifying the index of a byte (indices start at 0) in the active busymap. If no busymap exists, the function returns -1. If successful, the function returns either 0 or 1, 0 indicating the "not busy" state, and 1 indicating the "busy" state. If the index is out of range, an error will be generated which will stop evaluation. chroot: (chroot expr) The "chroot" intrinsic accepts one argument, which must evaluate to a string, and calls the chroot(2) system call with that string as argument. This system call is used to change the root directory for the interpreter, which must be running as root for the call to succeed. After a successful invocation, the initial slash (/) in all pathnames will refer to the specified directory. See the chroot(2) manual page for more details. Upon success the function returns 1, otherwise it returns a string describing an error condition. Once a process has been chroot-ed, it, and its children, may no longer access any part of the file system above the new root directory. This means a program needing shared libraries existing outside of the visible hierarchy will not start in the chroot environment. For those programs a miniature filesystem with appropriate libraries must be set up before invoking "chroot". 
Processes are allowed to keep their open descriptors in the new environment, allowing access to previously opened items outside of the new hierarchy, so programs typically chroot after they have done all of their startup tasks, such as opening log files. For example, "daemonize" should be called before "chroot" to open the connection to /dev/log, which will not be visible afterward.

daemonize: (daemonize expr)

The "daemonize" intrinsic turns the interpreter into a daemon process. The function accepts one argument which must evaluate to the name for the program to use for itself in the system logfile. The function returns 1 upon success, or if it encounters a problem evaluating its arguments, it generates an error which will stop evaluation. If the function encounters an error condition after it has closed the standard descriptors, it will cause the interpreter to exit with an exit status of 1, and write an error message to the system log.

To become a daemon, "daemonize" undoes all redirections of the standard descriptors, and closes and reopens stdin, stdout, and stderr on /dev/null, and it closes any open full-duplex connection opened with "child_open". It then calls the "openlog" library function to establish a connection with the syslog daemon, so the daemon process may emit messages to the system log. The function then calls the "block" intrinsic to block a number of signals. Consult the entry in this document for the "block" intrinsic for details. It then forks and allows the parent process to exit, so that the child cannot be a process group leader. This is required so that it may next call setsid() to detach itself from its controlling terminal. This prevents users at the terminal from sending job-control signals to the daemon process.

syslog: (syslog expr expr)

The "syslog" intrinsic is used to send a message to the system logfile. It is a wrapper around the syslog() library function. It can only be used after "daemonize" has been successfully invoked.
The function accepts two arguments. The first must evaluate to one of the following symbols, indicating the priority of the event being logged: ALERT, CRITICAL, ERROR, WARNING, NOTICE, INFO, DEBUG. The symbols must be written in all upper-case letters. The /etc/syslog.conf that ships with FreeBSD at the time this entry in the manual page is being edited prevents any message with a priority lower than NOTICE from being logged.

The second argument must evaluate to a string containing the log message. Unlike the syslog() library function, the intrinsic performs no % processing (a la printf) on the message string; any % occurring in the message string is escaped with another %. The "daemonize" intrinsic will have called openlog() to make the name of the daemon and its process id appear automatically in the logfile before the message text, so the message need contain nothing except a description of the event being logged.

The function returns 1 upon success. Otherwise it uses the syslog() library call to send an error message to the system log, and then causes the interpreter to exit. There is no point in returning to toplevel in response to an error, because after "daemonize" has been invoked the interpreter is no longer capable of performing terminal I/O.

flush_stdout: (flush_stdout)

The "flush_stdout" intrinsic accepts no arguments and calls fflush() on the standard output stream, so that any buffered data which has not yet been written to stdout is written to the stream. It returns to the caller whatever fflush() returns: 0 upon success, or -1 upon encountering an error. This is useful when stdout is attached to a TCP connection, where buffered output might otherwise not reach the client promptly. Invoking "flush_stdout" after each call-and-response interaction with a client is a good idea, to ensure the client gets its response immediately.
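As a minimal sketch of the call-and-response pattern described for "flush_stdout", the loop below echoes each line read from stdin back to stdout and flushes after every reply. It assumes the "getline" and "print" intrinsics documented elsewhere in this manual, that "getline" returns a false value at end of input, and that a semicolon starts a comment. It is an illustration only, not a complete server.

```lisp
; Sketch only: echo every request line back to the client.
; Assumes (getline) returns a false value at end of input.
(while (setq request (getline))
   (print request)
   ; Push the reply out now rather than waiting for the buffer to fill.
   (flush_stdout))
```

Without the call to "flush_stdout", a short reply could remain in the output buffer until more data had accumulated.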
getpeername: (getpeername)

The "getpeername" intrinsic is a wrapper around the system call of the same name. The function accepts no arguments. If invoked after "listen" and "accept", it returns a string representing the IP address of the host at the other end of the TCP connection currently attached to stdin and stdout. The function returns 0 upon encountering any error.

base64_encode: (base64_encode expr)

The "base64_encode" intrinsic accepts one argument, which must evaluate to a string, and returns a new string containing its argument encoded in the base64 encoding scheme used by MIME messages.

It would be impractical to read a large binary file into memory as a single string and feed it to this intrinsic. One may instead use the "getchars" intrinsic to read the file in chunks and encode each chunk separately. As long as the length of every chunk except the last is a multiple of three bytes (the last chunk may be any length), the output strings from the successive invocations may be concatenated to form the base64 encoding of the entire file. For example,

> (with_input_file "binary.file"
>> (with_output_file "binary.file.base64"
>>> (println "begin-base64 644 binary.file")
>>> (while (setq line (getchars 57))
>>>> (println (base64_encode line)))
>>> (println "====")))

will produce the same output as the b64encode system utility invoked as:

b64encode -o binary.file.base64 binary.file binary.file

Base64 encoding represents 3 bytes of data as 4 printable characters, so a line of 57 bytes expands to 76 characters after encoding (57 / 3 * 4 = 76), which is less than the once-customary 80 character limit on line lengths in an email message.
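The chunking rule for "base64_encode" can be checked on a small input. In this sketch the strings are arbitrary examples, and a semicolon is assumed to start a comment. Because the first chunk, "abc", is exactly three bytes long, both expressions should evaluate to the same string, YWJjZGVm.

```lisp
; Encode in two chunks, the first a multiple of three bytes, and join.
(concat (base64_encode "abc") (base64_encode "def"))

; Encode the same data in a single call.
(base64_encode "abcdef")
```

Had the first chunk been "ab" instead, its encoding (YWI=) would end in a padding character, and the concatenated result would no longer be the encoding of the whole input.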
A more efficient method for encoding a file reads a larger amount of data at a time and cuts it into lines with "substring":

> (with_input_file "binary.file"
>> (with_output_file "binary.file.base64"
>>> (println "begin-base64 644 binary.file")
>>> (setq line "")
>>> (while (setq segment (getchars 100000))
>>>> (setq line (concat line segment))
>>>> (while (> (length line) 57)
>>>>> (println (base64_encode (substring line 0 57)))
>>>>> (setq line (substring line 57 0))))
>>> (when line (println (base64_encode line)))
>>> (println "====")))

base64_decode: (base64_decode expr)

The "base64_decode" intrinsic accepts one argument, which must evaluate to a string containing base64-encoded data, and returns a new string consisting of the decoded data. If the function encounters a character in the input string which is not part of the base64 vocabulary, or if the length of the argument string is not a multiple of four characters, it returns 0.

isatty: (isatty expr)

The "isatty" intrinsic is a wrapper for the isatty(3) library call. The function accepts one argument, which must evaluate to the fixnum 0, 1, or 2, specifying stdin, stdout, or stderr, respectively. The function returns fixnum 0 if the specified descriptor is not connected to a terminal device, and a non-zero fixnum if it is.

sleep: (sleep expr)

The "sleep" intrinsic is a wrapper for the sleep(3) library call. The function accepts one argument, which must evaluate to a fixnum specifying the number of seconds for which the interpreter is to sleep. The function returns when the specified number of seconds has elapsed or when the interpreter has received a signal. It returns 0 if the specified number of seconds has elapsed. If it is interrupted by a signal, it returns the remaining, unslept number of seconds. Whether the interpreter has received a SIGTERM can be discovered by invoking "sigtermp".
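The return value of "sleep" distinguishes an interrupted sleep from a completed one. The following sketch relies on the behavior stated in the "sleep" entry, and additionally assumes that "sigtermp" returns a true value once a SIGTERM has been received and that a semicolon starts a comment. The variable name and the messages are arbitrary.

```lisp
; Sleep for up to 30 seconds and keep the number of unslept seconds.
(setq remaining (sleep 30))

; A non-zero result means a signal cut the sleep short.
(when (> remaining 0)
   (println "sleep was interrupted early"))

; Assumption: (sigtermp) is true once a SIGTERM has arrived.
(when (sigtermp)
   (println "SIGTERM received"))
```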
unsigned: (unsigned expr)

The "unsigned" intrinsic accepts one argument, which must evaluate to a fixnum, and returns a string representing the value of the fixnum expressed as an unsigned value. This is not the same as the absolute value of the fixnum. It is similar to casting an int to an unsigned int in C, but yields a string representation instead of a number. The intrinsic is designed for those situations where one would like to do unsigned arithmetic with numbers large enough to cause the two's-complement representation of the result to wrap around to the negative side, but not large enough to overflow the fixnum itself. That is to say, the result must not be greater than

(unsigned (+ 1 (* (maxidx) 2)))

form_encode: (form_encode expr)
form_decode: (form_decode expr)

These two intrinsics encode and decode strings to and from the application/x-www-form-urlencoded format used by web clients to encode the data in forms. Each intrinsic accepts one string argument and returns a string.

Mon, Apr 21 2014
