The Tiny BASIC Interpretive Language IL—and Onions

Tiny BASIC implementation onion diagram

In the fall of 1975, the People’s Computer Company (PCC) poured more energy into their “Build Your Own BASIC” project, fired up by the announcement of Altair 4K BASIC by MicroSoft [as it was then] for $150. They wanted to create an implementation that was free for anyone to use. With microcomputers and microprocessors proliferating, they envisioned a virtual machine that could be ported from platform to platform:

When you write a program in TINY BASIC there is an abstract machine which is necessary to execute it. If you had a compiler it would make in the machine language of your computer a program which emulates that abstract machine for your program. An interpreter implements the abstract machine for the entire language and rather than translating the program once to machine code it translates it dynamically as needed. Interpreters are programs and as such have theirs as abstract machines. One can find a better instruction set than that of any general-purpose computer for writing a particular interpreter. Then one can write an interpreter to interpret the instructions of the interpreter which is interpreting the TINY BASIC program. And if your machine is micro-programmed (like PACE), the machine which is interpreting the interpreter interpreting the interpreter interpreting BASIC is in fact interpreted.

This multilayered, onion-like approach gains two things: the interpreter for the interpreter is smaller and simpler to write then an interpreter for all of TINY BASIC, so the resultant system is fairly portable. Secondly, since the main part of the TINY BASIC is programmed in a highly memory efficient, tailored instruction set, the interpreted TINY BASIC will be smaller than direct coding would allow. The cost is in execution speed, but there is not such a thing as a free lunch.

“The machine … interpreting the interpreter interpreting the interpreter interpreting BASIC is in fact interpreted.” These guys were having fun! Geeky fun, but fun.

While writing compilers was just being generalized (lex and yacc were published circa 1975), PCC recognized that hobbyists wouldn’t have access to these new tools. Writing a parser is a very specific domain. Accordingly, they created a simple language, IL (Interpretive Language), to write their Tiny BASIC in and encode the parsing in.

Let’s look at a representative example:

 S1: TST     S3,'GO'       ;GOTO OR GOSUB?
     TST     S2,'TO'       ;YES...TO, OR...SUB
     CALL    EXPR          ;GET LABEL
     DONE                  ;ERROR IF CR NOT NEXT
     XFER                  ;SET UP AND JUMP

A common pattern in IL is to test for a keyword or token, then act on that information. Each test is an assertion as to what is next in the line buffer. If the assertion fails, control jumps to a subsequent label (usually looking for a new keyword or token). Here the system advances its buffer cursor over any spaces and tests for “GO” and if it fails to find it then jumps to line “S3”. If it finds it, execution continues with the next IL command. In this case, the system next tests for “TO”, skipping to line “S2” if it fails (a test for “SUB”, to see if this is instead a GOSUB command). If it passes, control continues; in this case, calling an IL subroutine that starts at label “EXPR”, which parses an expression. In Tiny BASIC, “GOTO X*10+100” (a computed GO TO) is as legal as “GOTO 100” and is the alternative to the ON-GOTO of larger BASIC implementations. The subroutine “EXPR” pushes the result of the expression onto the arithmetic stack (in this case, the line number). “DONE” verifies no other text follows the expression and gives an error if it does. “XFER” pops the number of the stack and transfers (GOTOs) the corresponding line number, if it exists.

The IL listing was concise: just 124 lines long! Just 124 lines to implement a workable BASIC that you can write programs and even games in. It’s very elegant.

IL is simple enough that I thought I could program it in a high-level language in an evening. (It took two, plus another to debug.) I wrote a two-pass assembler from the specs, implementing each command. Then I assembled the program, using the listing that Tom Pittman had shared and corrected a few errors in. As far I can tell, I’m the first person to actually write an IL assembler in the 42 years the language has been around. (Tom had his own version of the IL: he changed the mnemonics to two characters each, to be easier to parse on a microcomputer, and modified some commands, and added others.)

I encountered three different types of problems.

The first class of problems were mental mistakes in the listing and the spec:

  • “CALL VAR” needed to be “TSTV S17”, as the listing has no VAR routine but does have a built-in command to test a variable.
  • The use of “MPY” in the listing for the “MUL” command (all five math functions in IL are the first three letters of their corresponding word: ADDition, SUBtraction, MULtiplication, DIVision, NEGation).
  • The language specification neglected to define “LIT”, which pushes the following byte onto the arithmetic stack.

The second problem was that the language hardcodes jumps: for instance, “DONE” jumps to the third instruction (label “CO”). For what is essentially a finite state machine, I found this to be pretty insufferable. In my implementation, I changed commands to take their destination label. (In defense of PCC, they had to worry about carefully managing 4KB of RAM. I have 16GB: 4 million times more.) I later discovered that Tom Pittman changed “DONE” to “BE” (“Branch if Not Endline”) with a label as his solution to this issue.

The third problem with the program was subtler, and escaped my notice at first. Typically, “RUN” resets all variables (in this case, the 26 variables A to Z) to zero. There’s nowhere for that to happen in the listing. You could do this programmatically in IL:

1
0
STORE
2
0
STORE
…
26
0
STORE

However, 78 added instructions to a listing of 124 instructions (not counting comments) seemed excessive.

Sadly, while IL uses a stack machine, it doesn’t provide any way to actually manipulate the stack: no DUP, no SWAP, no conditional jump based on the top of the stack. If it had, the following would have been possible:

LIT 26    ; START AT THE 26TH VARIABLE
VINIT: DUP      ; DUPLICATE THE TOP OF THE STACK
LIT 0      ; PUSH 0
STORE   ; STORE 0 INTO THE VARIABLE
LIT 1      ; PUSH 1
SUB       ; SUBTRACT 1 FROM THE VARIABLE INDEX
DUP       ; DUPLICATE
JMPC VINIT  ; JUMP IF NOT 0

So I implemented a VINIT keyword to do this instead.

Tom Pittman notes:

The TINY BASIC interpreter was designed by Dennis Allison as a Recursive Descent parser. Some of the elegant simplicity of this design was lost in the addition of syntactical sugar to the language but the basic form remains. The IL is especially suited to Recursive Descent parsing of TINY BASIC because of the general recursive nature of its procedures and the simplicity of the TINY BASIC tokens. The IL language is effectively optimized for the interpretation of TINY. Experience has shown that the difficulty of adding new features to the language is all out of proportion with the nature of the features. Usually it is necessary to add additional machine language subroutines to support the new features. Often the difficulty outweighs the advantages.

This is from The Tiny BASIC Experimenter’s Kit that he wrote 40 years ago. It is well worth reading in detail, as his is the only Tiny BASIC implementation to use an IL. (The first Tiny BASIC, TBX, used the IL as pseudocode for the outline of the machine language program.)

Here are additions I made to the IL to extend my own Tiny BASIC:

  1. I added a single array A(), a la Palo Alto Tiny BASIC’s @(), which can take a subscript from 0 to 255. (But note that subscripts of 230 to 255 map to the values of variables A to Z, since I use the “IND” operator to reference variables.)
  2. I added RND(X), which returns a pseudorandom number from 1 to X.
  3. I added ASC(“A”), which can take as an argument “A” to “Z” and returns the ASCII value from 65 to 90. (This limitation is because it uses the TSTV function as a test for a letter of the alphabet.) I added this because I extended INPUT A to return the ASCII value of the first character if a number is not entered. This is so that rather than “1 FOR YES OR 2 FOR NO” you can write “(Y)ES OR (N)O” and IF A=ASC(“Y”) THEN.

If you have an iPhone, you can download the free version of LowRes Coder and use my Tiny BASIC on your phone.

Whatever its merits as a language, IL helped inspire a proliferation of BASICs to compete with Microsoft. There was Tiny BASIC Extended       , Denver Tiny BASIC, 6800 Tiny BASIC, MINOL, Palo Alto Tiny BASIC, Integer BASIC, and Micro Basic. Palo Alto Tiny BASIC became the basis of TRS-80 Level I BASIC, which launched the TRS-80. Of course, Microsoft BASIC ultimately triumphed, replacing Integer BASIC on the Apple, Level I BASIC on the TRS-80, and being the only BASIC for the Commodore PET. And even today you can buy a Microsoft BASIC for your own use, as part of Visual Studio. Yet IL remains an important landmark in the history of personal computing.

Modified Tiny BASIC IL Listing

;THE IL CONTROL SECTION
START: INIT                  ;INITIALIZE
NLINE                 ;WRITE CRLF
CO:    GETLINE               ;WRITE PROMPT AND GET LINE
TSTL    XEC      ;TEST FOR LINE NUMBER
INSERT                ;INSERT IT (MAY BE DELETE)
JMP     CO
;
;STATEMENT EXECUTOR
XEC:   XINIT                 ;INITIALIZE
STMT:  TST     S1,'LET'      ;IS STATEMENT A LET
TST     S1A,'A('
CALL    EXPR
TST     SYN,')'
JMP     S1B
S1A:   TSTV    SYN           ;YES, PLACE VAR ADDRESS ON AESTK
S1B:   TST     SYN,'='       ;(THIS LINE ORIGINALLY OMITTED)
CALL    EXPR          ;PLACE EXPR VALUE ON AESTK
DONE    SYN           ;REPORT ERROR IF NOT NEXT
STORE                 ;STORE RESULT
NXT     STMT          ;AND SEQUENCE TO NEXT
S1:    TST     S3,'GO'       ;GOTO OR GOSUB?
TST     S2,'TO'       ;YES...TO, OR...SUB
CALL    EXPR          ;GET LABEL
DONE    SYN           ;ERROR IF CR NOT NEXT
XFER    STMT          ;SET UP AND JUMP
S2:    TST     SYN,'SUB'     ;ERROR IF NO MATCH
CALL    EXPR          ;GET DESTINATION
DONE    SYN           ;ERROR IF CR NOT NEXT
SAV                   ;SAVE RETURN LINE
XFER    STMT          ;AND JUMP
S3:    TST     S8,'PRINT'    ;PRINT
S4:    TST     S7,'"'        ;TEST FOR QUOTE
PRS                   ;PRINT STRING
S5:    TST     S6,','        ;IS THERE MORE?
SPC                   ;SPACE TO NEXT ZONE
JMP     S4            ;YES JUMP BACK
S6:    DONE    SYN           ;ERROR IF CR NOT NEXT
NLINE
NXT     STMT
S7:    CALL    EXPR
PRN                   ;PRINT IT
JMP     S5            ;IS THERE MORE?
S8:    TST     S9,'IF'       ;IF STATEMENT
CALL    EXPR          ;GET EXPRESSION
CALL    RELOP         ;DETERMINE OPR AND PUT ON STK
CALL    EXPR          ;GET EXPRESSION
TST     SYN,'THEN'    ;TEST FOR THEN
CMPR    STMT          ;COMPARE -- PERFORMS NXT IF FALSE
JMP     STMT
S9:    TST     S12,'INPUT'   ;INPUT STATEMENT
S10:   TSTV    SYN           ;YES, PLACE VAR ADDRESS ON AESTK
INNUM                 ;MOVE NUMBER FROM TTY TO AESTK
STORE                 ;STORE IT
TST     S11,','       ;IS THERE MORE?
JMP     S10           ;YES
S11:   DONE    SYN           ;MUST BE CR
NXT     STMT          ;SEQUENCE TO NEXT
S12:   TST     S13,'RETURN'  ;RETURN STATEMENT
DONE    SYN           ;MUST BE CR
RSTR                  ;RESTORE LINE NUMBER OF CALL
NXT     STMT      ;SEQUENCE TO NEXT STATEMENT
S13:   TST     S14,'END'
FIN     CO
S14:   TST     S14A,'LIST'   ;LIST COMMAND
DONE    SYN
LST
NXT     STMT
S14A:   TST    S15,'LLIST'
DONE    SYN
LLST
NXT     STMT
S15:   TST     S16,'RUN'     ;RUN COMMAND
DONE    SYN
VINIT                 ;(NEW COMMAND TO CLEAR VARIABLES)
LIT     1             ;(NEED TO TRANSFER TO FIRST LINE OF PROGRAM)
XFER    STMT          ;(XFER MODIFIED TO FIND NEXT HIGHER LINE # RATHER THAN PRODUCE AN ERROR)
S16:   TST     SYN,'CLEAR'   ;CLEAR COMMAND
DONE    SYN
JMP     START
SYN:   ERR     CO            ;SYNTAX ERROR ***
EXPR:  TST     E0,'-'
CALL    TERM          ;TEST FOR UNARY -.
NEG                   ;GET VALUE
JMP     E1            ;NEGATE IT
E0:    TST     E1A,'+'       ;LOOK FOR MORE
E1A:   CALL    TERM          ;TEST FOR UNARY +
E1:    TST     E2,'+'        ;LEADING TERM
CALL    TERM
ADD
JMP     E1
E2:    TST     E3,'-'        ;ANY MORE?
CALL    TERM          ;DIFFERENCE TERM
SUB
JMP     E1
E3:    RTN                   ;ANY MORE?
TERM:  CALL    FACT
T0:    TST     T1,'*'
CALL    FACT          ;PRODUCT FACTOR.
MUL
JMP     T0
T1:    TST     E3,'/'
CALL    FACT          ;QUOTIENT FACTOR.
DIV
JMP     T0
FACT:  TST     F01,'RND('    ;(ADDED RND(X) FUNCTION WHERE X>1)
CALL    EXPR          ;***
TST     SYN,')'       ;***
RND                   ;(ADDED RND COMMAND TO PUSH # ON AESTK)
RTN                   ;***
F01:   TST     F0A,'ASC(''   ;(ADDED ASC("A") FUNCTION)
TSTV    SYN           ;***
TST     SYN,'')'      ;***
LIT     166           ;(A = 231 IN MEMORY SCHEME)
SUB                   ;(REDUCE BACK TO 65 TO 90)
RTN                   ;***
F0A:   TST     F0B,'A('      ;(ADDED 1 BUILT-IN ARRAY)
CALL    EXPR          ;***
TST     SYN,')'       ;***
IND                   ;(CHANGED IND TO SUPPORT 0 TO 255)
RTN                   ;***
F0B:   TSTV    F0            ;***
IND                   ;YES, GET THE VALUE.
RTN
F0:    TSTN    F1            ;NUMBER, GET ITS VALUE.
RTN
F1:    TST     F2,'('        ;PARENTHESIZED EXPR.
CALL    EXPR
TST     F2,')'
RTN
F2:    ERR     CO            ;ERROR.
RELOP: TST     R0,'='        ; 0, =; 1, <; 2, <=; 3, <>; 4, >; 5, >=
LIT     0             ;=
RTN
R0:    TST     R4,'<'
TST     R1,'='
LIT     2             ;<=
RTN
R1:    TST     R3,'>'
LIT     3             ;<>
RTN
R3:    LIT     1             ;<
RTN
R4:    TST     SYN,'>'
TST     R5,'='
LIT     5             ;>=
RTN
R5:    TST     R6,'<'
LIT     3             ; ><
RTN                   ;(THIS LINE ORIGINALLY OMITTED)
R6:    LIT     4             ; >
RTN
EOF:                         ;(ENDS INPUT)

Leave a Reply

Your email address will not be published. Required fields are marked *