Coding a new BASIC interpreter in 2025 to replace a slow one
nanochess.org
81 points by nanochess 3 days ago
For those interested in BASIC, here's "A curated list of awesome BASIC dialects, IDEs, and tutorials":
https://github.com/JohnBlood/awesome-basic?tab=readme-ov-fil...
It's not as popular as Python, obviously, but that list has over fifty implementations of BASIC.
I applied for a job about 12 years ago where the company was still using BASIC for some of their software. If I remember correctly it was numbered BASIC, not the more modern stuff. I think the software was doing some type of accounting—stuff that worked and they didn't want to change.
There is a chapter in the Blue Book about how the GW-BASIC byte code is structured, and from what I understand it used pointers to lines, not just offsets? But I did not look too carefully (guess the answer is in the source code: https://gitlab.com/tkchia/GW-BASIC).
That book is full of interesting facts and fun low-level tricks for (GW-)BASIC programming. Available for download here: https://github.com/robhagemans/hoard-of-gwbasic
Before reading that I never considered how primitive early BASICs were. There is a lot of linear-searching for things (variables, line-numbers) that has to be considered when optimizing.
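To make the cost of that linear searching concrete, here is a small Python sketch. It is not taken from GW-BASIC or any other real interpreter: the record layout, the `find_line_linear` helper, and the 1000-line dummy program are all assumptions made for illustration. It counts how many line records a GOTO has to walk past when the only way to find a target is to scan from the start of the program, and compares that with a precomputed lookup table.

```python
def find_line_linear(program, target):
    """Scan (line_number, text) records from the start.

    Returns (index of the target line, number of records visited).
    Hypothetical helper; real interpreters walk raw memory instead of a list.
    """
    for visited, (number, _text) in enumerate(program, start=1):
        if number == target:
            return visited - 1, visited
        if number > target:
            break  # lines are kept in ascending order, so the target is missing
    raise LookupError(f"undefined line {target}")


# Dummy program: lines 10, 20, ..., 10000 (1000 lines in total).
program = [(n, "REM") for n in range(10, 10010, 10)]

# Alternative: build a table once, then every jump is a single lookup.
index_of = {number: i for i, (number, _text) in enumerate(program)}

first = find_line_linear(program, 10)
last = find_line_linear(program, 10000)
print(first, last, index_of[10000])
```

A jump to the last line visits every record in this model, which is why a GOTO inside a tight loop near the end of a long program could be noticeably slower than the same loop near the start.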
Oscar Toledo (nanochess) is one of my personal heroes. And one of his books, Boot Sector Games, is one of my favorite books. Seeing him post something like he just did gives me a kind of joy that is becoming pretty rare on the internet these days, for me at least.
> I discovered the pointer to the next line wasn't a good idea, because it needed to move every pointer after a line insertion.
Huh? Don't you need to only change the "next-line-pointer" for the line that's right before the inserted line?
> but the NEXT changed the line, but on the next statement it would lost track and get back to the line following the NEXT. The loops also require their own stack, but including the counter variable address, a pointer to the TO expression, and a pointer to the STEP expression (5 words in total).
Mmm. IIRC, usually the compiled NEXT statement would store the pointer to the corresponding FOR statement, so you don't need an additional stack for loop depth during the execution. But you still need it (or some other sort of chaining) during the program input so whatever.
> Typing the program was difficult, as the keyboard bounced a lot. This happens when you read too fast the keyboard, so fast you can see that effectively the key contact isn't perfect.
Yeah... I've read that keyboard microcontrollers have to deal with contact bounce even today.
> Mmm. IIRC, usually the compiled NEXT statement would store the pointer to the corresponding FOR statement, so you don't need an additional stack for loop depth during the execution
I think you do. Apart from common sense, nothing forbids one from writing stuff like
100 for i = 1 to 10
110 if i = 4 gosub 100
120 print i
130 next
140 return
I think many basics also allowed changing that gosub 100 to goto 200 and adding
200 for j = 1 to 4
210 print i
220 print j
230 next
Yes, things would likely end badly, but the basic interpreter would not be smart enough to reject such programs. Its editor didn’t even guarantee that a for statement had a corresponding next or vice versa; all it guaranteed was that the program consisted of a list of lines that each in isolation are valid basic code.
This is some beautifully horrible spaghetti, well done. It's like I'm eleven years old and barely understand what's going on in the computer again.
I just fired up VICE and my virtual c64 happily ran both of those, if throwing an "out of memory" error after about five runs through the first one counts as "happily".
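For anyone who wants to see why the pairing of NEXT with FOR ends up being a runtime question, here is a minimal Python sketch of a FOR stack. It is a toy, not the article's interpreter and not Commodore BASIC: the tuple-based statement format, the `run` function, and the `IF_GOTO` opcode are invented for this example, and a NEXT with an empty stack is simply not handled. The program is the variant that jumps to line 200, with an END added at 140.

```python
def run(program, max_steps=1000):
    """Run a pre-parsed toy program; return (printed values, variables left on the FOR stack)."""
    lines = sorted(program)                      # line numbers in ascending order
    index = {n: k for k, n in enumerate(lines)}  # line number -> position
    variables, for_stack, output = {}, [], []
    pc = 0
    for _ in range(max_steps):                   # step limit guards against endless loops
        if pc >= len(lines):
            break
        stmt = program[lines[pc]]
        op = stmt[0]
        pc += 1
        if op == "FOR":
            _, var, start, limit = stmt
            variables[var] = start
            for_stack.append((var, limit, pc))   # pc is the first line of the loop body
        elif op == "NEXT":
            var, limit, body = for_stack[-1]     # pairs with whichever FOR ran last
            variables[var] += 1
            if variables[var] <= limit:
                pc = body
            else:
                for_stack.pop()
        elif op == "PRINT":
            output.append(variables[stmt[1]])
        elif op == "IF_GOTO":
            _, var, value, target = stmt
            if variables[var] == value:
                pc = index[target]
        elif op == "END":
            break
    return output, [frame[0] for frame in for_stack]


program = {
    100: ("FOR", "i", 1, 10),
    110: ("IF_GOTO", "i", 4, 200),
    120: ("PRINT", "i"),
    130: ("NEXT",),
    140: ("END",),
    200: ("FOR", "j", 1, 4),
    210: ("PRINT", "i"),
    220: ("PRINT", "j"),
    230: ("NEXT",),
}

output, leftover = run(program)
print(output)    # values printed by the toy program
print(leftover)  # loop variables whose frames were never popped
```

In this model the NEXT on line 230 closes the `j` loop only because `j` happens to be on top of the stack when it runs, and the abandoned `i` frame stays on the stack when the program falls off the end. A NEXT that carried a fixed pointer to "its" FOR could not express that.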
> I discovered the pointer to the next line wasn't a good idea, because it needed to move every pointer after a line insertion.
I get the impression that they were storing everything sequentially in memory, rather than having a linked list of instructions. Why? I can only speculate. Perhaps it is to make memory management simpler (don't have to keep track of which addresses are in use), or to avoid memory fragmentation in a system with limited memory (any modification of code would introduce unusable holes). If that's the case, what you want is an offset rather than an absolute address.
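A rough Python sketch of that difference, under assumptions of my own: lines are stored back to back, each record has a 4-byte header (link field plus line number) and a 1-byte terminator, and the base address `0x1000` is arbitrary. None of this is the article's actual layout. It compares which link fields change when a line is inserted, once with absolute next-line addresses and once with relative offsets.

```python
HEADER = 4  # assumed: 2 bytes for the link field + 2 bytes for the line number


def record_size(text):
    """Bytes used by one stored line: header, text, and an end-of-line terminator."""
    return HEADER + len(text) + 1


def absolute_links(lines, base=0x1000):
    """Absolute address of the following record, for each line stored back to back."""
    links, addr = {}, base
    for number, text in lines:
        addr += record_size(text)
        links[number] = addr
    return links


def relative_links(lines):
    """Offset from the start of each record to the start of the following one."""
    return {number: record_size(text) for number, text in lines}


def insert_line(lines, number, text):
    """Return a new program with the line added (or replaced), kept in ascending order."""
    return sorted([l for l in lines if l[0] != number] + [(number, text)])


program = [(10, 'PRINT "A"'), (20, 'PRINT "B"'), (30, "GOTO 10")]
edited = insert_line(program, 15, 'PRINT "X"')

old_abs, new_abs = absolute_links(program), absolute_links(edited)
old_rel, new_rel = relative_links(program), relative_links(edited)

changed_abs = [n for n, _ in program if old_abs[n] != new_abs[n]]
changed_rel = [n for n, _ in program if old_rel[n] != new_rel[n]]
print(changed_abs, changed_rel)
```

In this model every record after the insertion point moves, so every absolute link stored in those records has to be rewritten, while the relative offsets of the existing lines stay valid. In a true linked list with nodes that never move, only the predecessor's pointer would change, which is the case the question above has in mind.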
> I get the impression that they were storing everything sequentially in memory, rather than having a linked list of instructions. Why? I can only speculate.
I expect that’s because that is how ‘every’ home computer BASIC did it. Yes, that makes it slow to insert or remove a line close to the start of a long program, but it allows those offsets to be 8 bits, gaining a precious byte over a 16-bit absolute address.
Now, why they initially chose to waste those bytes? I wouldn’t know, but I guess that, because (FTA) “The CP1610 processor cannot address directly the internal memory in byte terms, instead everything is handled by full word”, they didn’t think of using a single byte.
> Yeah... I've read that keyboard microcontrollers have to deal with contact bounce even today.
That will always be the case in hardware. The switch to on will be messy, for example like this:
https://makeabilitylab.github.io/physcomp/arduino/assets/ima...
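One common software fix is to accept a change only after the raw input has held the new level for several consecutive samples. Here is a minimal Python sketch of that idea; the sample values, the `stable_count` threshold of 3, and the function names are assumptions for illustration, not anything from the article or a particular keyboard controller.

```python
def debounce(samples, stable_count=3):
    """Return the debounced state after each raw sample (0 = open, 1 = closed).

    The state only flips once the raw input has differed from it for
    `stable_count` consecutive samples.
    """
    state = samples[0] if samples else 0
    run = 0                      # consecutive samples that disagree with `state`
    out = []
    for s in samples:
        if s == state:
            run = 0              # bounce ended, forget the pending change
        else:
            run += 1
            if run >= stable_count:
                state, run = s, 0
        out.append(state)
    return out


def rising_edges(signal):
    """Count 0 -> 1 transitions, i.e. how many key presses a naive reader would see."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a, b) == (0, 1))


# One physical key press with a bouncy contact (made-up samples).
raw = [0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
clean = debounce(raw)
print(rising_edges(raw), rising_edges(clean))
```

Sampling the keyboard more slowly has a similar effect, which fits the article's remark that the bounce showed up when the keyboard was read too fast.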
I love basic
Can you OOPize it?
The following BASIC implementations support OOP:
- Visual Basic .NET
- PureBasic
- Xojo
- FreeBASIC
- Gambas
- PowerBASIC
You can peruse various implementations, IDEs, and tutorials here: https://github.com/JohnBlood/awesome-basic
I always wondered if any engineers suggested changes to make some BASICs faster and the companies didn't want it competing with "real" software