Nocturnal Lisp

The idea behind nocturnal Lisp is to have a highly portable Common Lisp implementation written in mostly Lisp with an as small set of primitives as possible.

1 Core Lisp

The special forms for Core Lisp are QUOTE, IF, LAMBDA, setq, CATCH, and THROW and nothing more. This is suffice to implement all the other special forms, which are specified in Common Lisp. To keep things simple our Core Lisp is a Lisp-1. (Which is a bit unfortunate as Core Lisp programs are not Common Lisp programs.

An interpreter in Common Lisp is given below. Note that this interpreter is only almost meta-circular. Being meta-circular is entertaining, but of no particular value.

Sample Implementation


(defun eval (form env)
  (cond ((symbolp form)             (evsym form env))
        ((atom form)                form)
        ((eq 'QUOTE (car form))     (cadr form))
        ((eq 'FUNCTION (car form))  (evfun (cadr form) env))
        ((eq 'IF (car form))        (if (eval (cadr form) env)
                                        (eval (caddr form) env)
                                        (eval (cadddr form) env)))
        ((eq 'SETQ (car form))      (setsym (cadr form)
                                            (eval (caddr form) env)
                                            env))
        ((eq 'CATCH (car form))     (catch (eval (cadr form) env)
                                      (evbody (cddr form) env)))
        ((eq 'THROW (car form))     (throw (eval (cadr form) env)
                                      (eval (caddr form) env)))
        ((eq 'CAR (car form))       (car (eval (cadr form) env))
        : ... all the SUBRs
        :
        (t                          (apply (evfun (car form) env)
                                           (evlis form env))))))

(defun evlis (forms env)
  (cond ((null forms)   nil)
        (t              (cons (eval (car forms) env)
                                    (evlis (cdr forms) env)))))

(defun evbody (forms env)
  (cond ((null forms)       nil)
        ((null (cdr forms)) (eval (car forms) env))
        (t                  (eval (car forms) env)
                            (evbody (cdr forms) env))))

(defun pairlis (p a e)
  (cond ((and (null p) (null a))  e)
        ((null p)                 (error "Too many arguments"))
        ((symbolp p)              (cons (cons p a) e))
        ((null a)                 (error "Too few arguments"))
        (t                        (cons (cons (car p) (car a))
                                        (pairlis (cdr p) (cdr a) e)))))

(defun evsym (sym env)
  (let ((q (assoc sym env)))
    (cond ((not (null q))       (cdr q))
          (t                    (symeval sym)))))

(defun evfun (f e)
  (cond ((symbolp f)
         (getd f))
        ((and (consp f) (eq 'LAMBDA (car f)))
         (cons 'closure (cons e (cdr f))))
        (t
         (error "Bad function"))))

(defun setsym (sym val env)
  (let ((q (assoc sym env)))
    (cond (q (rplacd q val))
          (t (set sym val)))
    val))

(For brevity the necessary error checking has been skipped).

The special forms follow their usual semantics. One thing worth mentioning is that LAMBDA takes none of the lambda keywords like &optional or &rest. But lambda lists could be dotted, with the trailing symbol taking what ever arguments are left. That is: A lambda list (x . r) is what CL has as (x &rest r).

Although strictly speaking an implementation detail, it is crucial that tail calls are optimized. Tail positions are the last form in a LAMBDA body and both branches of an IF. This is of practical importance as we implement branches by tail calling to a continuation.

1.1 Primitives

We specify that a form invoking a primitve like

(PLUS 1 2)

is recognized by the interpreter directly. To make funcall work with e.g.

(SETQ F PLUS)
(FUNCALL F 1 2)

We place stubs into the symbol function cells, like

(PUTD' PLUS (LAMBDA (X Y) (PLUS X Y)))

Note, that none of our primitives take a variable number of arguments.

1.2 Data Types

To keep things simple our Core Lisp has the following data types only:

We punt on numbers and do not specify exactly what kind of numbers they are.

TODO

When we allow for our numbers to be floating point numbers, we also need something to explode and implode them, as this is not trivial and more or less impossible to do without integer-decode-float and integer-encode-float. (Well when we know the internal representation, it ought to be safe to use log and expt to read the exponent and shift the mantissa). It's not as bad as it sounds, as we need to implement floating point printing on our own anyway, as most non-Lisp runtimes cannot do that.

Functions. The most sane course is that closures and primitives are function objects. For the latter we could do without, but not for the former. We might want to leave that as an abstract data type though, with the only promise that functionp definitly returns nil on objects not fit to be invoked as a function.

Also: For a real spec, we would need to specify our number data type. Which somewhat defeats our goal, as we want to have a small Core Lisp, which is easy to implement in different languages. And JS for instance has no distinction between integers and double floats. So to tell flonums from fixnums would entail needing to box on of those with JS. (Bignums are no concern here, we would implement these in Lisp).

1.2.1 Conses


cons x y Function

    

Creates a new cons cell with x in its CAR and y in its CDR.


car x Function
cdr x Function

    

Retrieves the CAR or CDR of the cons cell x. When x is nil the result is nil.

An error is signaled, when x is neither nil or a cons cell.


rplaca x y Function
rplacd x y Function

    

Replaces the CAR or CDR of the cell cell x with y. The return value is the cons cell x.

An error is signaled, when x is not a cons cell.

1.2.2 Numbers


plus x y Function
difference x y Function
times x y Function
quotient x y Function
remainder x y Function

1.2.3 Symbols

Symbols are atoms distinct from any other atoms. A symbol has a name, which is a string of characters, a value, which may be unbound, and a property list. The property list is an alternating list of key value pairs.

Symbols may also have a value in the current lexical environment.

We may have actual pname properties. We may want to have some means to mark a symbol as constant. However, later in the process of booting, the primitive symbols will go away anyway. unintern? Literal marker? (Maclisp has +internal-string-marker).


symeval sym Function

    

Returns the global or dynamically bound value of the symbol sym. This is equivalent to evaluating sym in the null lexical environment.

An error is signal, if either sym is not a symbol, or the symbol is unbound and has no value.


set sym val Function

    

An error is signaled, if either sym is not a symbol, or the symbol's value may be not be changed.


boundp sym Function

    

Returns non-nil if the symbol sym is bound to a value in the current global dynamic environment and nil otherwise.

An error is signaled, when sym is not a symbol.


makunbound sym Function

    

Makes the current, if any, global dynamic binding of the sym having no value. A later symeval on this symbol will signal an error and boundp will return nil unless a new binding has been established or a new value has been set by set.


plist sym Function
setplist sym val Function

    

Returns and with setplist sets the property list of the symbol sym.

It is an error, if sym is not a symbol.


maknam name Function

    

Creates a new uninterned, unbound symbol with the given name. The name argument must be a list of character codes as unsigned integers.

Examples

(maknam '(65 66 67))    ABC
(eq (maknam '(65 66 67)) 'ABC)    NIL

Exceptional Situations

Should signal an error, when name is not a proper list of character codes.


pname sym Function

    

Returns the print name of the symbol sym.

Examples

(pname "EXAMPLE")    (69 88 65 77 80 76 69)


intern sym Function

    

Interns the symbol sym. If a symbol with the same exists already the symbol is not entered into the oblist. In any case the interned symbol with the given name is returned.

Examples

(setq x (maknam '(70 79 79)))    FOO
(eq x 'foo)    NIL
(setq y (intern x))    FOO
(eq x y)    NIL
(eq y 'FOO)    NIL

Exceptional Situations

An error is signaled, if sym is not a symbol.


putd symbol definition Function

    

Installs the lambda expression definition as the function of the symbol symbol.


getd symbol Function


fboundp symbol Function


fmakunbound symbol Function

1.2.4 Predicates

We are really minimalistic here and for comparision define eq and lessp only. These are enough to define eql and =. ALso there is no null, no not, no zerop. When efficiency is a concern, we count on the compiler to recognize e.g. (eq x nil) if it would make sense to have a special case for that.


atom x Function
symbolp x Function
numberp x Function


eq x y Function

    

Returns non-nil if the object x and y refer to the very same object, and nil otherwise.

eq applied to two cons cells return non-nil, if they were the result of the very same cons operation and changing the CAR or CDR of x would change the CAR or CDR of y.

Applying eq to two numbers of the same mathematical value may return either nil or non-nil.

Applying eq to two symbols could also be defined on changing the value or name of two symbol which are eq is reflected on the other.

Can we find a good definition for eq on function objects? This is tricky, as we want ((lambda (x) (eq x #'eq)) #'eq)T, but want to allow for ((lambda (f) (eq (f) (f))) (lambda () (lambda ()))) to be either T or NIL.


lessp x y Function

1.2.5 Hunks

Hunks are very much like vectors of lisp objects. They are, once created, of fixed length. These are included as a building block to be able to boot different data types from them and are distinct from conses, symbols, and numbers.

With the default Core Lisp syntax they print as s-expressions enclosed by square brackets, like [].


makhunk n Function

    

Creates a new hunk with room for n elements. The elements are initialized with nil.

Examples

(makhunk 5)    [NIL NIL NIL NIL NIL]
(makhunk 0)    []
(atom (makhunk 2))    T


hunkp object Function


href hunk index Function


hset hunk index value Function


hlen hunk Function

1.2.6 I/O

We specify most baisc input and output, as we want to write read and print in Lisp itself. These primitives operate on so called channels for doing I/O. We use the term channel to differenciate them from what ever will later be implemented as streams in Lisp itself.

It is not specified what kind of object these channels are. These are just some objects to enable the underlying implementation to tell channels apart. After a channel is closed a new call to OPEN may return an identical object again.

Channels could be fit for input or fit for output or both. Channels could be character channels or binary channels. When they are character channels each character read or written is an Unicode code point. When channels are binary each "character" read or written is semantically a byte, which is an eight bit byte as this is what these days is universal.

A note about end-of-file: An end-of-file is processed as another kind of item, which could be read from a channel. It is not a permanent condition per se. Some channels may deliver more input even after an end-of-file has been read. Therefore after reading end-of-file another tyi may hang and an end-of-file could be untyied. Also: On a pending end-of-file the listen primitive will return non-nil. (This is different from the behavior specified with Common Lisp).


tyo char chan Function

    

Output the character with the code point char given as an integer to the channel chan. Output may be buffered and it is not guaranteed that it will actually be send to the underlying destination until flush is invoked.

When the output channel is a character output channel the code point is specified to be an Unicode code point. When the channel is a binary output channel the given "character" is taken as an 8-bit byte to write.


flush chan Function

    

Flushes all buffered output on the channel chan.


tyi chan Function

    

Get the next character or byte from the input channel chan and returns it as an unsigned integer. If an end-of-file is present, nil is returned.


listen chan Function

    

Returns nil, if a tyi operation on the channel chan would hang and non-nil otherwise. Note that our listen returns non-nil on a channel with a pending end-of-file unlike Common Lisp's listen function.


untyi char chan Function


open
pathname
&optional direction type errorp
Function
chan

    

Opens the resource or file named by the operating system dependent name pathname for input and/or output and returns a channel designator. The pathname must be a symbol, whose print name is the pathname passed to the operating system.

direction may be one of the following values with the default being :input:

:input

The file or resource named by pathname is opened for input operations.

:output

The file or resource named by pathname is opened for output operations. If a file named by pathname does not exists an attempt is made to create it.

:append

The file is opened for writing and data written to the channel is appended to it.

The nature of the element read from or written to the channel is given by the type option, which defaults to :text:

:text

Elements read from and written to the channel are characters. The tyi and tyo primitives work on Unicode code points.

:binary

Elements read from or written to the channel are octets, 8-bit bytes represented by non-negative integers in the range 0255 inclusive.

Examples

(open 'foo.txt :output :text)    42
(tyo #/A 42)    65
(close 42)    nil

Exceptional Situations

Errors might be raised when the named resource does not exist, may not be opened or can't be created.


close chan Function

    

Closes the channel named by chan. Further I/O on that channel is an error.

Note: The channel designator chan might be reused, when a new channel is create by e.g. open.

1.3 Runtime


error
message
&rest arguments
Function

    

An error is signal with the given message and arguments in arguments. This function can be redefined and the runtime calls whatever its definition is.


eval form Function

    

Evaluates the form form and returns its value.

The default driver uses this function to evaluate forms read. It could be redefined.


toplevel Function


reclaim Function


room Function


save pathname Function


stdin-chan Variable
stdout-chan Variable
stderr-chan Variable
terminal-chan Variable

    

On a POSIX system these correspond to the 0, 1, 2 file descriptors respectively. terminal-chan corresponds /dev/tty and is a bidirectional text channel.

2 Booting Common Lisp

In this chapter we show how to boot Common Lisp semantics from our primitive set of special forms. We are not concerned about what could be regarded as just a library, but about the special forms of Common Lisp and features like being a Lisp-2, having multiple values, and special (dynamic) variables.

2.1 tagbody

The trickiest part is to implement tagbody. Core Lisp has no program feature (prog macro or special form). But Core Lisp has tail calls. A tagbody which has gos to one if its tag only in one of its top-level forms or recursively in either branches of an if, which itself is toplevel could be mechanically translated into what would be labels in Common Lisp.

For each form not being a tag in the body of the tagbody a lexical variable is bound to a closure with the form as its body prepended to an invokation of the next local function. When a tag is in front, the tag is identified by that lexical.

(tagbody
    (print 'hi)
L1  (if (= n 0) (go L2))
    (print n)
    (setq n (difference n 1))
    (go L1)
L2)

Turns into

((lambda (G1 G2 G3 G4)
   (setq G1 (lambda ()
              (print 'hi)
              (G2)))
   (setq G2 (lambda ()
              (if (= n 0) (G4) (G3))))
   (setq G3 (lambda ()
              (print n)
              (setq n (difference n 1))
              (G2)))
   (setq G4 (lambda () nil)))
  nil nil nil nil)

Note that both LISP I and Standard Lisp specify a prog special form with the constraint that a go may only be in the toplevel of the prog. And in this regard prog could be seen as a very thin layer of syntactic sugar.

However, we also want to cope with gos not at the top-level of the tagbody. And also with non-local gos. We turn our attention to local non-toplevel gos first. These could be forms like:

(foo (if x (go done) 3) 4)

2.2 block/return-from

This is the most easy of all. We just turn

(block tag
   ...
   (return-from tag 42)
   ...)

into

(let ((res nil))
  (tagbody
     ...
     (setq res 42)
     (go done)
  done)
  res)

2.3 Multiple Values