Nocturnal Lisp
The idea behind nocturnal Lisp is to have a highly portable Common Lisp
implementation written in mostly Lisp with an as small set of primitives
as possible.
1 Core Lisp
The special forms for Core Lisp are QUOTE
,
IF
, LAMBDA
, setq
,
CATCH
, and THROW
and nothing more. This is
suffice to implement all the other special forms, which are specified in
Common Lisp. To keep things simple our Core Lisp is a Lisp-1.
(Which is a bit unfortunate as Core Lisp programs are not Common
Lisp programs.
An interpreter in Common Lisp is given below. Note that this interpreter
is only almost meta-circular. Being meta-circular is entertaining, but
of no particular value.
Sample Implementation
|
(defun eval (form env)
(cond ((symbolp form) (evsym form env))
((atom form) form)
((eq 'QUOTE (car form)) (cadr form))
((eq 'FUNCTION (car form)) (evfun (cadr form) env))
((eq 'IF (car form)) (if (eval (cadr form) env)
(eval (caddr form) env)
(eval (cadddr form) env)))
((eq 'SETQ (car form)) (setsym (cadr form)
(eval (caddr form) env)
env))
((eq 'CATCH (car form)) (catch (eval (cadr form) env)
(evbody (cddr form) env)))
((eq 'THROW (car form)) (throw (eval (cadr form) env)
(eval (caddr form) env)))
((eq 'CAR (car form)) (car (eval (cadr form) env))
: ... all the SUBRs
:
(t (apply (evfun (car form) env)
(evlis form env))))))
(defun evlis (forms env)
(cond ((null forms) nil)
(t (cons (eval (car forms) env)
(evlis (cdr forms) env)))))
(defun evbody (forms env)
(cond ((null forms) nil)
((null (cdr forms)) (eval (car forms) env))
(t (eval (car forms) env)
(evbody (cdr forms) env))))
(defun pairlis (p a e)
(cond ((and (null p) (null a)) e)
((null p) (error "Too many arguments"))
((symbolp p) (cons (cons p a) e))
((null a) (error "Too few arguments"))
(t (cons (cons (car p) (car a))
(pairlis (cdr p) (cdr a) e)))))
(defun evsym (sym env)
(let ((q (assoc sym env)))
(cond ((not (null q)) (cdr q))
(t (symeval sym)))))
(defun evfun (f e)
(cond ((symbolp f)
(getd f))
((and (consp f) (eq 'LAMBDA (car f)))
(cons 'closure (cons e (cdr f))))
(t
(error "Bad function"))))
(defun setsym (sym val env)
(let ((q (assoc sym env)))
(cond (q (rplacd q val))
(t (set sym val)))
val))
|
(For brevity the necessary error checking has been skipped).
The special forms follow their usual semantics. One thing worth
mentioning is that LAMBDA
takes none of the lambda keywords
like &optional
or &rest
. But lambda
lists could be dotted, with the trailing symbol taking what ever
arguments are left. That is: A lambda list
(x . r)
is what CL has as
(x &rest r)
.
Although strictly speaking an implementation detail, it is crucial that
tail calls are optimized. Tail positions are the last form in a
LAMBDA
body and both branches of an IF
. This
is of practical importance as we implement branches by tail calling to a
continuation.
1.1 Primitives
We specify that a form invoking a primitve like
(PLUS 1 2)
is recognized by the interpreter directly. To make funcall
work with e.g.
(SETQ F PLUS)
(FUNCALL F 1 2)
We place stubs into the symbol function cells, like
(PUTD' PLUS (LAMBDA (X Y) (PLUS X Y)))
Note, that none of our primitives take a variable number of arguments.
1.2 Data Types
To keep things simple our Core Lisp has the following data types
only:
- Symbols
- Numbers
- Cons cells
- Functions
We punt on numbers and do not specify exactly what kind of numbers they
are.
TODO
When we allow for our numbers to be floating point numbers, we also
need something to explode and implode them, as this is not trivial and
more or less impossible to do without
integer-decode-float
and
integer-encode-float
. (Well when we know the internal
representation, it ought to be safe to use log
and
expt
to read the exponent and shift the mantissa). It's
not as bad as it sounds, as we need to implement floating point
printing on our own anyway, as most non-Lisp runtimes cannot do that.
Functions. The most sane course is that closures and primitives are
function objects. For the latter we could do without, but not for the
former. We might want to leave that as an abstract data type though,
with the only promise that functionp
definitly returns
nil
on objects not fit to be invoked as a function.
Also: For a real spec, we would need to specify our number data type.
Which somewhat defeats our goal, as we want to have a small Core Lisp,
which is easy to implement in different languages. And JS for instance
has no distinction between integers and double floats. So to tell
flonums from fixnums would entail needing to box on of those with JS.
(Bignums are no concern here, we would implement these in Lisp).
1.2.1 Conses
|
Creates a new cons cell with x in its CAR and
y in its CDR.
|
|
Retrieves the CAR or CDR of the cons cell x. When
x is nil the result is nil .
An error is signaled, when x is neither
nil or a cons cell.
|
|
Replaces the CAR or CDR of the cell cell x with
y. The return value is the cons cell x.
An error is signaled, when x is not a cons cell.
|
1.2.2 Numbers
1.2.3 Symbols
Symbols are atoms distinct from any other atoms. A symbol has a name,
which is a string of characters, a value, which may be unbound, and a
property list. The property list is an alternating list of key value
pairs.
Symbols may also have a value in the current lexical environment.
We may have actual pname
properties. We may want to have
some means to mark a symbol as constant. However, later in the process
of booting, the primitive symbols will go away anyway.
unintern
? Literal marker? (Maclisp has
+internal-string-marker
).
|
Returns the global or dynamically bound value of the symbol
sym. This is equivalent to evaluating sym in
the null lexical environment.
An error is signal, if either sym is not a symbol, or
the symbol is unbound and has no value.
|
|
An error is signaled, if either sym is not a symbol,
or the symbol's value may be not be changed.
|
|
Returns non-nil if the symbol sym is
bound to a value in the current global dynamic environment and
nil otherwise.
An error is signaled, when sym is not a symbol.
|
|
Makes the current, if any, global dynamic binding of the
sym having no value. A later symeval on
this symbol will signal an error and boundp will
return nil unless a new binding has been
established or a new value has been set by set .
|
setplist
|
sym val
|
Function
|
|
Returns and with setplist sets the property list
of the symbol sym.
It is an error, if sym is not a symbol.
|
|
Creates a new uninterned, unbound symbol with the given name.
The name argument must be a list of character codes
as unsigned integers.
Examples
(maknam '(65 66 67)) ⇒ ABC
(eq (maknam '(65 66 67)) 'ABC) ⇒ NIL
Exceptional Situations
Should signal an error, when name is not a proper
list of character codes.
|
|
Returns the print name of the symbol sym.
Examples
(pname "EXAMPLE") ⇒ (69 88 65 77 80 76 69)
|
|
Interns the symbol sym. If a symbol with the same
exists already the symbol is not entered into the oblist. In
any case the interned symbol with the given name is returned.
Examples
(setq x (maknam '(70 79 79))) ⇒ FOO
(eq x 'foo) ⇒ NIL
(setq y (intern x)) ⇒ FOO
(eq x y) ⇒ NIL
(eq y 'FOO) ⇒ NIL
Exceptional Situations
An error is signaled, if sym is not a symbol.
|
putd
|
symbol definition
|
Function
|
|
Installs the lambda expression definition as the
function of the symbol symbol.
|
fmakunbound
|
symbol
|
Function
|
1.2.4 Predicates
We are really minimalistic here and for comparision define
eq
and lessp
only. These are enough to define
eql
and =
. ALso there is no null
,
no not
, no zerop
. When efficiency is a
concern, we count on the compiler to recognize e.g.
(eq
x nil)
if it would make
sense to have a special case for that.
|
Returns non-nil if the object x and
y refer to the very same object, and nil
otherwise.
eq applied to two cons cells return
non-nil , if they were the result of the very same
cons operation and changing the CAR or CDR of
x would change the CAR or CDR of y.
Applying eq to two numbers of the same
mathematical value may return either nil or
non-nil .
Applying eq to two symbols could also be defined
on changing the value or name of two symbol which are
eq is reflected on the other.
Can we find a good definition for eq on
function objects? This is tricky, as we want ((lambda
(x) (eq x #'eq)) #'eq) ⇒ T , but
want to allow for ((lambda (f) (eq (f) (f))) (lambda
() (lambda ()))) to be either T or
NIL .
|
1.2.5 Hunks
Hunks are very much like vectors of lisp objects. They are, once
created, of fixed length. These are included as a building block to be
able to boot different data types from them and are distinct from
conses, symbols, and numbers.
With the default Core Lisp syntax they print as s-expressions enclosed
by square brackets, like [
…]
.
|
Creates a new hunk with room for n elements. The
elements are initialized with nil .
Examples
(makhunk 5) ⇒ [NIL NIL NIL NIL NIL]
(makhunk 0) ⇒ []
(atom (makhunk 2)) ⇒ T
|
hset
|
hunk index value
|
Function
|
1.2.6 I/O
We specify most baisc input and output, as we want to write
read
and print
in Lisp itself. These
primitives operate on so called channels for doing I/O. We use
the term channel to differenciate them from what ever will
later be implemented as streams in Lisp itself.
It is not specified what kind of object these channels are.
These are just some objects to enable the underlying implementation to
tell channels apart. After a channel is closed a new call to
OPEN
may return an identical object again.
Channels could be fit for input or fit for output or both. Channels
could be character channels or binary channels. When they are character
channels each character read or written is an Unicode code point. When
channels are binary each "character" read or written is semantically a
byte, which is an eight bit byte as this is what these days is universal.
A note about end-of-file: An end-of-file is processed as
another kind of item, which could be read from a channel. It is not a
permanent condition per se. Some channels may deliver more input even
after an end-of-file has been read. Therefore after reading
end-of-file another tyi
may hang and an
end-of-file could be untyi
ed. Also: On a pending
end-of-file the listen
primitive will return
non-nil
. (This is different from the behavior specified
with Common Lisp).
|
Output the character with the code point char given
as an integer to the channel chan. Output may be
buffered and it is not guaranteed that it will actually be
send to the underlying destination until flush is
invoked.
When the output channel is a character output channel the code
point is specified to be an Unicode code point. When the
channel is a binary output channel the given "character" is
taken as an 8-bit byte to write.
|
|
Flushes all buffered output on the channel chan.
|
|
Get the next character or byte from the input channel
chan and returns it as an unsigned integer. If an
end-of-file is present, nil is returned.
|
|
Returns nil , if a tyi operation on
the channel chan would hang and non-nil
otherwise. Note that our listen returns
non-nil on a channel with a pending
end-of-file unlike Common Lisp's listen
function.
|
open
|
pathname
|
&optional
|
direction type errorp
|
|
|
Function
|
→
|
chan
|
|
Opens the resource or file named by the operating system
dependent name pathname for input and/or output and
returns a channel designator. The pathname must be a
symbol, whose print name is the pathname passed to the
operating system.
direction may be one of the following values with the
default being :input :
-
:input
-
The file or resource named by pathname is opened
for input operations.
-
:output
-
The file or resource named by pathname is opened
for output operations. If a file named by
pathname does not exists an attempt is made to
create it.
-
:append
-
The file is opened for writing and data written to the
channel is appended to it.
The nature of the element read from or written to the channel
is given by the type option, which defaults to
:text :
-
:text
-
Elements read from and written to the channel are
characters. The tyi and tyo
primitives work on Unicode code points.
-
:binary
-
Elements read from or written to the channel are octets,
8-bit bytes represented by non-negative integers in the
range 0 …255 inclusive.
Examples
(open 'foo.txt :output :text) ⇒ 42
(tyo #/A 42) ⇒ 65
(close 42) ⇒ nil
Exceptional Situations
Errors might be raised when the named resource does not exist,
may not be opened or can't be created.
|
|
Closes the channel named by chan. Further I/O on that
channel is an error.
Note: The channel designator chan might be reused,
when a new channel is create by e.g. open .
|
1.3 Runtime
|
An error is signal with the given message and arguments in
arguments. This function can be redefined and the
runtime calls whatever its definition is.
|
|
Evaluates the form form and returns its value.
The default driver uses this function to evaluate forms read.
It could be redefined.
|
|
On a POSIX system these correspond to the 0, 1, 2 file
descriptors respectively. terminal-chan
corresponds /dev/tty and is a bidirectional text
channel.
|
2 Booting Common Lisp
In this chapter we show how to boot Common Lisp semantics from our
primitive set of special forms. We are not concerned about what could be
regarded as just a library, but about the special forms of Common Lisp
and features like being a Lisp-2, having multiple values, and special
(dynamic) variables.
2.1 tagbody
The trickiest part is to implement tagbody
. Core Lisp
has no program feature (prog
macro or special form). But
Core Lisp has tail calls. A tagbody
which has
go
s to one if its tag only in one of its top-level forms or
recursively in either branches of an if
, which itself is
toplevel could be mechanically translated into what would be
labels
in Common Lisp.
For each form not being a tag in the body of the tagbody
a
lexical variable is bound to a closure with the form as its body
prepended to an invokation of the next local function. When a tag is in
front, the tag is identified by that lexical.
(tagbody
(print 'hi)
L1 (if (= n 0) (go L2))
(print n)
(setq n (difference n 1))
(go L1)
L2)
Turns into
((lambda (G1 G2 G3 G4)
(setq G1 (lambda ()
(print 'hi)
(G2)))
(setq G2 (lambda ()
(if (= n 0) (G4) (G3))))
(setq G3 (lambda ()
(print n)
(setq n (difference n 1))
(G2)))
(setq G4 (lambda () nil)))
nil nil nil nil)
Note that both LISP I and Standard Lisp specify a prog
special form with the constraint that a go
may only be in
the toplevel of the prog
. And in this regard
prog
could be seen as a very thin layer of syntactic sugar.
However, we also want to cope with go
s not at the top-level
of the tagbody
. And also with non-local go
s.
We turn our attention to local non-toplevel go
s first.
These could be forms like:
(foo (if x (go done) 3) 4)
2.2 block
/return-from
This is the most easy of all. We just turn
(block tag
...
(return-from tag 42)
...)
into
(let ((res nil))
(tagbody
...
(setq res 42)
(go done)
done)
res)
2.3 Multiple Values