The update that describes why 4KB of RAM was necessary was interesting to me. In...

seanmcdirmid · on May 3, 2014

Around the same time, I wrote a JVM for the PalmPilot [1]. I think it had to fit into 128K of RAM with an additional 256K of memory for data (like the class libraries). I was able to fit the entire JDK 1.0.2 runtime into it by transforming the bytecode.

But 4K...wow!

[1] http://lampwww.epfl.ch/~mcdirmid/papers/ghost.pdf

radio4fan · on May 3, 2014

Yep, 4k is pretty wow indeed.

My first computer was a Sinclair ZX81, which only had 1K of RAM, but there was 8K of ROM, and the BASIC interpreter was pretty, um, basic.

analog31 · on May 3, 2014

The hardware was interesting too. It bit-banged the video display circuit, leaving only the re-trace interval for general purpose computation, including running BASIC programs. Yet the result was an extremely simple and inexpensive design.

Every early microcomputer had to come up with a solution to the video display problem. As I understand it, the Apple II had a bus architecture that interleaved the clock timing of the video circuit and the CPU, sharing a single bus. The Commodore computers had custom video graphics chips. And so forth.

sitkack · on May 3, 2014

That is amazing. I used a Dallas Semiconductor 8051 with a JVM on it. Can't believe you fit a JVM into 4k of ROM.

k1w1 · on May 3, 2014

Since the product itself is now discontinued (though it did sell for 10 years), I have posted the source code for posterity:

https://github.com/k1w1/javelin-stamp/blob/master/asm/javeli...

Reading back over the assembly code gives me fond memories - but I am glad I don't spend my nights debugging my code with an oscilloscope and flashing LEDs anymore. It is a bit more productive coding in Ruby!

The JVM had some limitations to fit into 4k: no floating point, 16-bit integers and no dynamic linking. There was a PC program that took the Java class files, statically linked them and translated the bytecode to a more compressed form for download to the chip.
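To make the static-linking step concrete, here is a toy sketch in Python. It is not the actual Javelin linker or its file format: the `static_link` function, the `("invoke", "Class.method")` tuple encoding, and the method table are all hypothetical, invented for illustration. It only shows the general idea of replacing symbolic references with small fixed indices ahead of time, so the chip never has to resolve names at run time.

```python
def static_link(classes):
    """Resolve symbolic method references to fixed indices (toy model).

    `classes` maps class name -> {method name -> list of (op, arg) tuples}.
    This layout is an assumption for illustration, not a real class-file format.
    """
    # Pass 1: give every method a fixed slot in one flat table.
    table = {}
    for cname in sorted(classes):
        for mname in sorted(classes[cname]):
            table[f"{cname}.{mname}"] = len(table)

    # Pass 2: rewrite each symbolic "invoke" into a table index.
    linked = [None] * len(table)
    for cname, methods in classes.items():
        for mname, code in methods.items():
            out = []
            for op, arg in code:
                if op == "invoke":
                    if arg not in table:
                        raise KeyError(f"unresolved reference: {arg}")
                    arg = table[arg]
                out.append((op, arg))
            linked[table[f"{cname}.{mname}"]] = out
    return table, linked


program = {
    "Main": {"run": [("push", 5), ("invoke", "Led.on")]},
    "Led": {"on": [("return", 0)]},
}
table, linked = static_link(program)
```

After linking, the call in `Main.run` carries a small integer instead of a name, which is both more compact and cheaper to dispatch on a tiny interpreter.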

sitkack · on May 3, 2014

This is great, reading through it now.

What do you think about Dalvik's register-based VM?

Have you looked at http://www.jopdesign.com/?

Are you doing anything hardware-related these days?

k1w1 · on May 4, 2014

At the time I wrote this code there was a lot of talk about Java-specific processors. Ultimately that was a mistake because Moore's law meant the x86 architecture (and JIT technology) got faster more quickly than anyone could get chips to market.

These days I am about as far from embedded systems as you can get. I write Ruby on Rails code for aha.io. There is a lot of satisfaction in seeing the perfect waveform on your oscilloscope when the assembly code finally works - but these days I think there is much more satisfaction in being able to crank out a complex algorithm with some elegant Ruby and have it being used by customers before going home for dinner.

sitkack · on May 3, 2014

I don't yet understand how you did memory allocation. Could you give me a pointer?

k1w1 · on May 4, 2014

Well, it has been 16 years since I wrote most of this, so my memory is a bit fuzzy...

The stack and heap are both stored in an external 32kB SRAM. The stack is accessed simply by pushing and popping using the JVMPush and JVMPop routines around line 4553. The CPU is 8-bit, but the JVM is 16-bit, so everything takes two operations to write both bytes. You can see the stack frame format at line 48.
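A rough Python model of the two-byte push/pop idea, for anyone who doesn't want to read the assembly. This is only a sketch: the `ByteStack` class, the low-byte-first order, and the upward-growing stack are assumptions made here, not details taken from the Javelin source.

```python
class ByteStack:
    """16-bit operand stack stored in byte-addressed memory (toy model)."""

    def __init__(self, size=32 * 1024):
        self.ram = bytearray(size)  # stands in for the external SRAM
        self.sp = 0                 # index of the next free byte

    def push(self, value):
        value &= 0xFFFF             # keep only 16 bits
        if self.sp + 2 > len(self.ram):
            raise OverflowError("stack overflow")
        # Two separate byte writes, as an 8-bit CPU would need.
        self.ram[self.sp] = value & 0xFF   # low byte first (assumed order)
        self.ram[self.sp + 1] = value >> 8
        self.sp += 2

    def pop(self):
        if self.sp < 2:
            raise IndexError("stack underflow")
        self.sp -= 2
        # Two separate byte reads, recombined into one 16-bit value.
        return self.ram[self.sp] | (self.ram[self.sp + 1] << 8)


stack = ByteStack(size=8)
stack.push(0x1234)
stack.push(-1)          # wraps to 0xFFFF in 16 bits
top = stack.pop()
below = stack.pop()
```

Every 16-bit stack operation costs two memory accesses here, which is the overhead described above.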

Java objects and arrays are allocated on another stack that acts as the heap. Objects are allocate-only: they are never freed. This isn't as big a deal as you might think, since in embedded applications the code tends to just repeat the same operations over and over, so you write your code to reuse the same instances instead of allocating new objects (which is slow anyway).

_do_new on line 2044 allocates a new object. Arrays are allocated at line 1991 in _l_j_newarray.
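The allocate-only scheme can be sketched in a few lines of Python. This is an illustration of the general bump-allocation technique, not a transcription of `_do_new`: the `BumpHeap` class, its fixed size, and byte-granular addresses are assumptions.

```python
class BumpHeap:
    """Allocate-only heap: a single pointer that only ever moves up."""

    def __init__(self, size):
        self.size = size
        self.top = 0  # address of the next free byte

    def alloc(self, nbytes):
        if nbytes < 0:
            raise ValueError("negative allocation size")
        if self.top + nbytes > self.size:
            # Nothing is ever freed, so running out is permanent.
            raise MemoryError("heap exhausted")
        addr = self.top
        self.top += nbytes
        return addr


heap = BumpHeap(size=16)

# Embedded style: allocate the buffer once, then reuse it on every pass
# instead of allocating inside the loop.
buffer_addr = heap.alloc(4)
for _ in range(1000):
    pass  # work on the same buffer_addr each iteration

second = heap.alloc(8)
```

Because there is no free list and no collector, allocation is a bounds check plus an add, and a program that reuses its instances never exhausts the heap.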

The nice thing about the JVM (at least version 1 which this implements) is that there are not many variations on operations - and if you statically link you can reduce some of the variations to common cases too.

sitkack · on May 4, 2014

I had a hunch it might be like this. The embedded Java I did on the Dallas part was using fixed sized arrays.

I'd actually love an annotation for Python that ran it under the null-collector. Lots of times in short-run or steady-state programs one doesn't generate any garbage, and the constant GC or ref-count overhead could be done away with.
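The closest thing in standard CPython that I know of is switching off the cyclic collector with the `gc` module. A minimal sketch of such an annotation follows; the decorator name `without_cyclic_gc` is made up here. Note that this is only a partial null-collector: reference counting still runs and cannot be disabled this way.

```python
import functools
import gc


def without_cyclic_gc(func):
    """Run `func` with CPython's cyclic garbage collector turned off.

    Reference counting is unaffected; only the cycle detector is paused.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        was_enabled = gc.isenabled()
        gc.disable()
        try:
            return func(*args, **kwargs)
        finally:
            # Restore the previous state rather than enabling blindly.
            if was_enabled:
                gc.enable()
    return wrapper


@without_cyclic_gc
def steady_state_sum(n):
    # Stand-in for a loop that creates no reference cycles.
    return sum(range(n))


@without_cyclic_gc
def collector_is_running():
    return gc.isenabled()
```

Any cycles created while the collector is off simply stay in memory until it is re-enabled and runs again, so this only suits code that really is garbage-free or short-lived.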

Did you look at the Bob language by David Betz? It was the spiritual seed for Java. http://www.xlisp.org/

https://www.google.com/search?tbm=pts&hl=en&q=dr+dobbs+betz

I really like http://www.amazon.com/Threaded-Interpretive-Languages-Design...

and

http://users.ece.cmu.edu/~koopman/stack_computers/