
The final Cube – Snow white coffin

This is it. Hooray! The final Cube as I always wanted it to be.
It just took me about 2 years of planning, blood, sweat & tears, huffing and puffing. Many tries to find the right parts, plenty of materials evaluated, always trying to keep the budget low.

The sleeping beauty – yeah, it’s a bit snow white coffin’ish

Meet the ancestors

You might have followed the route I took for quite some time now:
It began with the ‘Tower of Power‘, basically a component carrier with a power-supply.
After some years it was replaced by the first Cube. Well, yes, while it had a somewhat cubic’ish case, it still was just a dull standard industrial case. Not really how I imagined my computer should look.

Form follows function

If you’ve read some posts here on GeekDot, you might already have got the impression that I’m a sucker for design. Well, not that much of a surprise, given I studied design some decades ago 🙄
I’m also heavily influenced by the works and philosophy of Dieter Rams (mainly for BRAUN) and Hartmut Esslinger (of frog design), of whom you might not have heard, but you know their designs for sure…
So I fell in love with the Parsytec x’plorer and other iconic computer designs like these ‘cubistic’ examples:

Yes, I am a strong believer that a computer, while basically being a rather boring calculation tool, should look good and timeless, and should give you an idea that its innards are actually doing something.
We could probably go on forever, defining what a well-designed computer should look like. But like the Romans used to say: “non potest argui per gustum” (You can’t argue about taste)…
Let’s say, I’m probably not totally off, given that most designs I like are also on display at the Museum Of Modern Art 😉

So when mentioning the Final Cube’s design, it’s the case we’re talking about. If you’re really picky, then yes, the Final Cube is actually two cubes:

The carrier-cage on the top which I tried to keep simplistic and invisible to give the PCBs as much stage as possible. The user should be able to see the many, shining CPUs. So 10x10mm aluminum square tubes are connected by 3D-printed frame-corners to provide maximum view onto the technology.
For protection, but also as a design statement and tribute to Rams’/Gugelot’s famous ‘snow white’s coffin‘, everything is surrounded by a 30x30x30cm translucent acrylic cube.

The white base actually isn’t cubical at all being much wider than tall. Nevertheless, its design should be even more simplistic and cautious to serve three purposes:

  • Give the computing-parts above it a proper podium
  • House the LED array which provides the fitting aura
  • …and finally house all the tech the user should not care about

With quite a big fan between the base and the top, both work like a chimney (following the convection), sucking the air in from the bottom and blowing it out through 169 holes in the top plate of the cube.
Here’s an idea of how it looks while “working”:

When one thing comes to another

The parts the Final Cube is built from weren’t all created this year. Actually only the case and the cage-frame are from 2019 – all other parts were designed by me some years before.

At the core of everything are TRAMs – these are Transputer computing modules defined by Inmos back in 1990. The specific TRAMs used are my own AM-B404, each containing a 25MHz T800 and 2MB of fast SRAM.

Finest home-made TRAMs

16 of these TRAMs are placed onto an Inmos B012 (or compatible) carrier board. And up to 10 of these carriers can be put into the Cube’s carrier-frame, creating the cluster.

Under the hood

Below the carrier-frame, in the base, you can spot a 32×16 LED panel. This one is actually from 2012 when I designed the T2i2c, an i2c-bus to Transputer TRAM.

Yes, that’s an Arduino Micro on top of a TRAM

So it was a natural move to make the T2i2c into a system-controller. It not only controls the LEDs displaying the current load of all Transputers, but also uses a photo-diode to set the display brightness and measures the internal temperature and overall power consumption.

Here’s an overview of the base internals:

I know, the venting holes are not pretty – but they do their job and prevent you from accidentally touching the power-supply.

The red arrow points to the T2i2c being connected to the LED panel to the left as well as to a hall-sensor (blue arrow) measuring the power consumption, a temperature sensor (orange) and a photo-diode (green).
You cannot overlook the 22cm fan in the back sucking air from the bottom along the power-supply and pushing it up to the Transputers above to keep them cool.

And their power consumption is not to be trivialized. On average a single Transputer TRAM requires about 1 ampere… so the math is easy. This means the quest for a powerful power-supply was on.
After some months I found what used to be the power-supply meant for a 3Com Corebuilder 7000: The mighty 3C37010A. A whopping 90A@5V should be OK for starters… here’s the fitting procedure:
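
To make the "easy math" explicit, here is a small Python sketch of the 5V current budget. It only uses the figures quoted in this post (about 1A per TRAM, 16 TRAMs per carrier, up to 10 carriers, a 90A supply); the function names are made up for illustration, and fan, LED panel and controller are deliberately ignored, so treat it as a rough estimate rather than a measurement.

```python
# Rough 5 V current budget, using only the numbers quoted in the post.
AMPS_PER_TRAM = 1.0      # "about 1 ampere" per Transputer TRAM (average)
TRAMS_PER_CARRIER = 16   # TRAMs on one B012-style carrier board
MAX_CARRIERS = 10        # carriers that fit into the carrier-frame
SUPPLY_AMPS = 90.0       # 3C37010A rating at 5 V


def carrier_current(carriers: int) -> float:
    """Estimated 5 V current drawn by the given number of full carriers."""
    return carriers * TRAMS_PER_CARRIER * AMPS_PER_TRAM


def max_carriers_for(supply_amps: float) -> int:
    """How many fully populated carriers fit into a given current budget."""
    return int(supply_amps // (TRAMS_PER_CARRIER * AMPS_PER_TRAM))


if __name__ == "__main__":
    for n in (1, 5, MAX_CARRIERS):
        amps = carrier_current(n)
        print(f"{n:2d} carriers: {amps:5.1f} A, {amps * 5:6.1f} W at 5 V")
    print("Full carriers within 90 A:", max_carriers_for(SUPPLY_AMPS))
```

Under these assumptions a completely filled frame would ask for more than the 90A on offer, which matches the "OK for starters" verdict.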

Mooooore powerrrrrrr, Igor! You touch, you die!

The back of the medal…

The backside did not change compared to the previous Cube – well, besides the supply-cabling, which now goes down into the base instead of to the side of the cage.


In consequence you’ll spot the power-connector there. No switch though  – still thinking about that… as well as a nicer cable-management for the link-cable which is normally connected to the host.

Next up would be a host matching the look. Mhhhh….

Olivetti LSX 5010

Years ago I got that broken Olivetti CP486 board (the predecessor of the LSX 5010 and 5020 family) – one of the only two i486/i860 combo mainboards ever made (the other one was the 4860 by Hauppauge). Because it was broken, missing important parts and I felt like I was the only one on the planet having such a system, I dumped it.

There’s life out there!

Now I learned there are at least two LSX 5010 owners left on this planet and one of them contacted me, primarily asking for an i860… well, long story short:
His LSX 5010 was broken, too, but complete! We agreed on a deal: I try to fix it (plus an i860) and he’ll give me a system out of his collection.

Some days later I had everything I’d call a good-to-go system:

The grey box with an LCD display is the “console”, giving you POST information, a speaker and some buttons. Next to it is the huge power-supply, and in the slots you can spot the EVC-1 graphics card and my trusty ISA/PCI POST card…

Let there be light

When booting the system, only the console showed something: a “CMOS Periodic Int Error”. After a warm-boot it was replaced by a “Base 128k Ram Error”.
Additionally it behaved somewhat flaky, booting into different states every now and then:

These three errors were solvable:

  • Flaky behavior: Replacing all caps – this always helps. Believe me. The system booted into a reproducible state after this.
  • CMOS error: The dreaded DALLAS CMOS clock-chip… we all know the drill. Its battery is empty and EISA systems heavily rely on a working CMOS storage. So it got an external battery surgery.
  • RAM error: That was a bit tricky. The LSX’es need parity RAM. One SIMM per bank. Max. mem is 16MB – I only have 16MB+ SIMMs. So I had to get small parity PS/2 SIMMs. 2x4MB did it.

Booting the system now, the console greeted me with

  • „Console Passed“
  • „P.O.D. Running“ (That’s the Power On Diagnostic)

…and then „Non-Maskable Int. Error“. Dammit! This can have many reasons, most of the time it’s RAM (parity). But in my case, it’s been different…

That was a fun one (actually two):
The trace to the i486 processor NMI-pin (B15) was scratched and needed repairs. But it still kept throwing that error. Why-oh-why?!?! After a whole day of digging I had a severe facepalm-moment:

The owner had replaced the CPU with an 80486SX because he was under the assumption the LSX 5010 was an SX system. But it wasn’t. It’s an 80486DX @ 25MHz system (while the 5020 is 33MHz).
And while everybody is claiming the SX is just a DX minus FPU… it is not a 100% drop-in replacement!
While the DX’es have their NMI-pin at B15, the SXes have it located at A15 (where DXes have IGNNE) and B15 is not connected. Doh! (Check out the pinout here)
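
The pin mix-up is easy to show as a tiny lookup table. This Python sketch models only the two pins mentioned above, exactly as described in this post; the names `PINOUT`, `pin_of` and `mismatches` are invented for the illustration and it is not a complete pinout of either chip.

```python
# Only the two pins discussed in the text are modelled here.
PINOUT = {
    "486DX": {"A15": "IGNNE", "B15": "NMI"},
    "486SX": {"A15": "NMI", "B15": "NC"},  # NC = not connected
}


def pin_of(cpu: str, signal: str):
    """Return the pin carrying `signal` on `cpu`, or None if not modelled."""
    for pin, name in PINOUT[cpu].items():
        if name == signal:
            return pin
    return None


def mismatches(board_cpu: str, fitted_cpu: str):
    """List (pin, expected, actual) where the fitted CPU differs from the board's."""
    expected, actual = PINOUT[board_cpu], PINOUT[fitted_cpu]
    return [(pin, expected[pin], actual[pin])
            for pin in sorted(expected) if expected[pin] != actual[pin]]


if __name__ == "__main__":
    # The board routes NMI to B15, but an SX listens for it on A15.
    for pin, want, got in mismatches("486DX", "486SX"):
        print(f"{pin}: board expects {want}, 486SX has {got}")
```

So the board's NMI trace ends at a pin the SX does not use, while the SX's own NMI input sits where the board expects IGNNE.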

So after replacing the SX with one of my 486DX, we finally got a full boot! Tadaa:

Those stripes came from the EVC-1, which definitely also had its problems. So checking its board with my microscope I came across this:

Uhhh…. a cracked diode (D14) connected to address-line A0 to the video RAM. That explains the lines quite well.
When the new DA5 (BAR43S) diodes arrived I replaced the broken one, fired up the LSX 5010:
Looking good, booting into the EISA CMOS setup and while editing the config I could watch the picture disintegrating by every keystroke. More and more garbage was displayed, columns disappearing until it was all black.
The EVC-1 literally died in action in front of my eyes 🙁

I’m not sure what happened here. The fixed address-line can’t be responsible for this. All ICs still get their clean 5 volts. I suspect that one or more of the many old PALs (some of them even bipolar) died…

Ride on…

Anyhow, plugging in my ET4000 workhorse I was able to resume the setup. EISA systems always need a setup tool to tell them all the features of their Mainboard as well as the cards being installed. Luckily the owner had the basic tools at hand… you’re screwed without them.
So this is the one for the LSX booting:

After that’s been done I ran the diagnostic tool – in German just for the fun of it (You can spot all those “OKs”, right?)

But wait a second! Isn’t there something missing?
You’re right… here you go:

The mighty i860 RISC processor… and it is detected just fine: “Pass” 🙂

But does it work, too?

Yes it does! Hooray… mission accomplished!

Downloads

As usual, here are the dumps of the BIOS, Config EPROM and CMOS.
Additionally, you’ll find the floppy images of the config- and diagnostic tools in this archive.

Conclusion

The LSX 5010 is, very much like the Hauppauge 4860, a fragile system to work with.
As if the unusual configuration of EISA systems weren’t enough, the use of many, many, many proprietary ICs (i.e. GALs and PALs) makes them prone to aging and hard to fix.
They were bespoke designs, limited in their compatibility – Olivetti lists about 30 cards (VGA, SCSI, most of them multi-RS232) officially working – and most importantly need specific parts like the console and software.
Without the proper EISA config tool you always get at least error messages. Without drivers for the i860 you will not be able to use that and it’s just a heating-element inside your computers case.

Those (server) systems were meant to run as-is. Pretty much like a SUN, HP or SGI server of the same period. You can pick from like 2-3 devices to add and that’s pretty much it. They were not designed as an average PeeCee running Doom.

The Number Cruncher – FPU for the Apple II

This is a post I’ve wanted to write since 2006… so 15 years later I’ve put all the Transputer, MIPS, i860 and-what-not stuff aside and made some room for my other (late) love: The Apple IIgs

You might have stumbled upon my post/project connecting a Transputer to the Apple II called the T2A2… well, what I did not mention was that the inspiration for this came from another card built in 1988, called the FPE, made by Innovative Systems. That’s a small card featuring just a buffer, an old XILINX FPGA (actually the first of its kind) and the Motorola 68881 floating point co-processor.

Co-pro-ces-sor! I was hooked!  😯

After more research I learned that the FPE was actually a diva, as this newsgroup post says:

“The FPE is suffering from a major problem, namely the coproc is crashing internally and has to be reset in software. This happens in a non deterministic way, and software written for that engineering junk must be adapted to that. “

But a bit later there was something better available: The Number Cruncher (NC for short). Fine German engineering  😉 And the newsgroup post was quite nice to it:
“The Number Cruncher is compatible with the FPE but is actually what the FPE was supposed to be – a math coproc that works. It performs very well.”

This is the “marketing blurb” from back then

  • totally compatible with the FPE from Innovative Systems
  • much less sensible to heat, voltage problems etc.
  • supports the FPE SANE patch for speeding up any program that does floating point calculations
  • you can compile ORCA/Pascal and C programs to use it directly with the special floatlib for the FPE provided by The Byte Works
  • works in any slot (slot 3 & 4 without need for setting it to “Your Card”)
  • works with TransWarp GS and Zip GS accelerators and RamFAST SCSI card
  • comes with lots of mathematical software and an enhanced SANE patch by Albert Chin-A-Young

I directly contacted the creators of the NC, Dirk and Andreas, but neither of them had a tiny bit of the NC left. No schematic, no HDL, no nothing 🙁

During looooong and wild eBay raids I managed to get my hands on an FPE as well as on a NumberCruncher. Woohoo! They both work with the same driver and yes, the NC is much more robust than the FPE.
But it seems that only a handful of them survived, if any at all. So I thought it might be interesting for me and others to make a re-release…

Let’s re-animate the Number Cruncher!

So, again, it’s up to me to save the world… reverse-engineering time again!
This job consists of two parts – dissecting the hardware and, to get a better understanding of it, the software/driver, too.

Hardware

Well, you could simply rebuild the card, copy the bitstream from the serial PROM for the FPGA and you’re done.
But it’s not so easy, as the FPGA which has been used, the XILINX XC2064, has long been EOL’ed, and if you buy old stock, they’re more expensive than a comparably recent CPLD – if they’re not Chinese fakes and/or broken… not to mention that a simple copy-paste job is not really manly 😉

The XC2064 is a 5V FPGA (the first of its kind actually), has 600-1000 logic gates and 58 user IO pins, of which 32 are used. It gets its bitstream fed by an external 12K serial PROM. I’ve pulled the bitstream into a file available here, but because XILINX never documented that format, it’s pretty much useless.

The rest is all standard stuff. A ‘245 transceiver, an oscillator (~12MHz)  and some chicken feed.

Revving it up

To prove that I got everything right and to have a better starting point for the analysis, I designed the “NumberCruncher Reloaded v0.1“… more or less a 1:1 clone, but bringing out all FPGA signals to pin-headers for convenient access with my logic analyzer.
Also I used a PLCC version of the 68881, because I have more of those.

Then out came the logic analyzer, and after weeks and weeks of looking at/listening to the FPGA’s conversation with the FPU and the Apple bus while having the 68881 manual on my lap, I got an idea of what’s going on.

Software

Later I found the original floppy which came with the NumberCruncher containing  just the init-files for GS/OS patching SANE to use the FPU instead of the 65c816  (download the ShrinkIt archive here) – well, at least something!

Next came my ol’ buddy Mr. disassembler… a lot. About a year on-and-off… and here’s my initial, slightly commented disassembly of the SANE patch.

Again, some years down the road I found the floppies which were delivered with the FPE and those are quite helpful with code examples, assembler macros and libs etc. – that would have been quite helpful during the disassembly  😕 Anyhow, it’s good to see that most of my interpretations were correct.
Also, the FPE came with quite a nice manual – which actually was more valuable than the card 😉
It very well describes the basic functionality, the FPU registers used and the programming side of things – even the most important parts from the 68881 manual are cited. Very good job on that Innovative Systems!

Having all this at my hands I was prepared to start and after some huffing and puffing the NumberCruncher Reloaded v.1.0 was born.

NumberCruncher Reloaded

The NumberCruncher Reloaded is a peripheral card for the Apple II series that features a math co-processor, often also called a Floating Point Unit (FPU), which specializes in, well, floating point calculations. Doing so, it is much, much faster than any 6502 or 65c816 CPU ever will be.

That said, the NumberCruncher Reloaded will not automatically speed-up your programs as CPU accelerators like the Transwarp GS, ZIP CHIP or AppleSqueezer would do. Programs will have to be either specifically written to use the NumberCruncher Reloaded or use a floating-point library like the SANE interface which then needs to be patched to itself use the NumberCruncher Reloaded for calculations instead of the main CPU.


In the beginning…

The “Reloaded” in its name hints towards the fact that this is a reboot of an already existing card. To make writing/reading easier, NumberCruncher Reloaded might be shortened to ‘NC-R’ further down this page…
In 1988, there was the Floating Point Engine (FPE) created by Innovative Systems (‘iS’ for short).
Read more about those in my separate post over here.

While it was a great idea, it wasn’t the most stable design – but it laid the foundation especially and most importantly for the software we’re still using today.
Due to the FPE’s issues there was quite some displeasure among its users, and in 1990 a German company called Alternative Systems announced the Number Cruncher, an ‘overhaul’ of the original design – here’s their newsgroup announcement:

“The FPE is suffering from a major problem, namely the coproc is crashing internally and has to be reset in software. This happens in a non-deterministic way, and software written for that engineering junk must be adapted to that.
The Number Cruncher is compatible with the FPE but is actually what the FPE was supposed to be – a math coproc that works. It performs very well.”

Over the years the FPE as well as the NC faded into unobtanium. Because they were cool, and I love processors of all kinds, it was time to revive the Number Cruncher.

Revival!

If you have read the above-mentioned post about the FPE, you’ll have learned that the predecessors were built around the first, very obsolete and proprietary FPGA, a 555 timer and the 68881 FPU. All these parts would have a Facebook status of ‘#complicated’ today and needed to be replaced.

Logic: The Xilinx XC2064 FPGA was replaced by something more recent: my universal 5V-tolerant weapon of choice, the Altera EPM3064 (aka MAX3000). That little fellow has enough logic gates, and with the 100-pin version sufficient I/O pins are available even when using ISP. The timer for blinking the busy-LED went into this, too.

FPU: There are still some 68881 around, but the 68882 is much easier to find, and as of today both are cheaper in PLCC packaging than as ceramic PGAs. Future NC-R owners might already own one or the other, so we’ll go with… both. Yes – to offer maximum flexibility, you can use either Pin-Grid-Array or PLCC packages.
Physical differences aside, the original FPE/NC did not work with the 68882 – the NumberCruncher Reloaded does.

To sum it up the NumberCruncher Reloaded was improved in many aspects to make it much more usable in the 21st century:

  • it also supports the enhanced and the easier to find MC68882
  • FPUs can be used either in pin-grid-array or PLCC package thanks to the two sockets provided. Again, the latter being much more common these days
  • Further increased stability by using low-power SMD parts and a 4-layer PCB with dedicated supply layers
  • Speed optimized FPU protocol handling
  • 2 more LEDs, which I consider very important.
  • Updatable firmware (ALTERA ByteBlaster required)

Provided Software

In contrast to my T2A2 Transputer Link-Adapter, there is already some software available for the NC-R.

The Tools Disk

This is a good start but was mainly intended for that warm fuzzy feeling of unboxing a real product 😉

Download: 2MG image or ShrinkIt image

  • Up to Sept.23rd 2021 there was an error in the 2MG image – One INIT was corrupted
  • On Nov. 3rd 2021 the image was updated with the new SANE patch INIT
  • May 27th 2022 – added some low-level benchmarks by Dirk Fröhling

Still, based on the original Innovative Systems disk (updated to the latest releases),  it provides everything you need to start:

  • All Apple IIGS-related software is located in the FPE.IIGS folder.
  • All Apple II/II+/IIe-related software is located in the FPE.6502 folder.
  • The Appleworks 2.x modification software is located in the APPLEWORKS.FPE8 folder.
  • The EXAMPLES folder has some code snippets to show you how to talk to the NC(-R)
  • The BENCHMARKS folder contains the executables of the different benchmarks by Dirk Fröhling used in his blog entry.
    The archive on his page contains the sources, too.

In the FPE.IIGS folder you will find:

  • The folder called NCR.INIT containing the new SANE patch init which is highly recommended to be used with your Apple IIgs.
  • In the FPETOOLS.INIT folder you’ll find an archive containing the FPE tool set named FPE.INIT.Sx . These were developed by/for iS and are now deprecated and left on the disk for completeness.
  • The EXAMPLE folder contains an assembly language file which demonstrates the use of NC-R register-to-register operations to significantly improve floating point operations speed.
  • The BENCHMARK folder contains an ORCA/PASCAL version of the SAVAGE benchmark and an APW C version of the Byte Magazine floating point co-processor benchmark. (The program in example is an assembly language adaptation of the co-processor benchmark).
  • The APW.ORCA.FPE16, MERLIN.FPE and LISA816 folders contain macros and equates files for use in assembly language programming.

👉 There’s an updated INIT (V1.15) available here, written by Vincent Hemeury.
This fixes an issue when running the NC-R with other tools using the SANE API (e.g. Twilight II 2.0 screensaver module “Ripple”).

The NCR.INIT requires some extra explanation because it is a very elegant solution to put the NC-R to use:

This init redirects every Standard Apple Numerics Environment™ (SANE) floating-point call to the NumberCruncher Reloaded – so as long as an application uses SANE library calls, it will be accelerated. All you need to do is copy the INIT from the NCR.INIT/Init folder into your SYSTEM folder.

…and you’re done!

The INIT will automatically detect your NC-R in any slot (if installed) and will nicely say so during boot-up, like this:

Apple II

Compared to the IIgs, the support of an FPU is generally limited.
Michael J. Mahon wrote a very good explanation of how to still use the FPE/NumberCruncher Reloaded with an 8bit Apple II(e) and why it’s not as efficient as when used in a 16bit IIgs.
Still, he wrote a patch for FPBASIC as well as some example code how to make use of the 68882.

As mentioned before in the FPE.6502 folder you will find all tools for the Apple II.
That also has a SANE patch which can be found in TOOLSET:FPE8.TOOLSET.  This toolset uses the following calls:

jsr $2100 ; to call the fp6502 routines 
jsr $2104 ; to call the ELEMS6502 routines

It loads into locations beginning at $(00)2100 and has a length of less than $1000 bytes.

I have included the APW.ORCA.FPE8 and MERLIN8.FPE macro folders.

The APPLEWORKS.FPE8 folder contains the Appleworks 2.x modification. To modify your copy of Appleworks, just run FPE.SYSTEM from the root folder and answer the questions. The modified code will automatically access the FPE whenever a floating point operation is required.

Mandelbrot!

In the FPEfractal folder you will find Zoombaya (and other fractal programs, all by Glen Brendon).
This is my currently favorite tool for benchmarking and testing.
It’s written in Applesoft BASIC(!) and uses Glen’s cool so-called ProCMD module which sets up an interface between Applesoft programs. The downside is that it uses some 65816/65802-specific commands. So running it on a 6502 CPU will lead to a crash.

“Real” Applications

Yes, there actually were some applications making use of an FPU accelerator, e.g. function plotters, math tools like a poor man’s Mathematica etc. – all written for GS/OS. Sorry Apple II users… hopefully other A2 enthusiasts start writing programs using the NC-R, too.
I prepared a dedicated post about them to make this one as short as possible:
So if you’re looking for some ‘food’ for your NC-R, head over to the NumberCruncher Reloaded Software post.

FAQ

The ever-growing FAQ and programming details were put into a post of their own. So you’re kindly invited to follow this link if you have more specific questions…

Benchmark

Still, we’re all hackers of some sort, so you might ask yourself “What’s the fastest option?” or “Can I make it even faster?”.
Well, while the 68882 is up to 30% faster than the 68881 due to its capability to execute commands in parallel, this mostly doesn’t matter in the case of the NumberCruncher Reloaded.

As mentioned in the feature-list, thanks to protocol optimizations the NumberCruncher Reloaded is already a tiny bit (~4%) faster than the original at the same clock-speed, but more important is that it “saturates” later.
This means while with the FPE/NC you don’t see any improvements beyond 12MHz the NumberCruncher Reloaded still benefits from a faster clock up to 16MHz.

From there it does not make much sense as the Apple II bus, clocked at 1MHz, is the limiting factor here.

Using the ‘Zoombaya Mandelbrot’ program included on the Tools floppy disk on a TransWarp GS @ 10MHz gives these calculation times:

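| MHz | Original NC | NC-R 68882 | NC-R 68881 | Faster than NC |
|-----|-------------|------------|------------|----------------|
| 11  | 3:59        | 3:55       | 3:55       | ~1.5%          |
| 12  | 3:55        | 3:45       | 3:45       | ~4%            |
| 16  | 3:55        | 3:12       | 3:12       | ~20%           |
| 20  | N/A         | 3:12       | N/A        | N/A            |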
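
If you want to recompute the last column yourself, the following Python sketch turns the stopwatch readings from the table into percentages. The helper names are made up here, and the post does not say which of the two common definitions ("time saved" or "throughput gained") was used for the rounded figures, so both are shown.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' stopwatch reading into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def time_saved_percent(original: str, ncr: str) -> float:
    """Share of the original run time that the NC-R saves."""
    o, n = to_seconds(original), to_seconds(ncr)
    return (o - n) / o * 100


def throughput_gain_percent(original: str, ncr: str) -> float:
    """How much more work per second the NC-R gets done."""
    o, n = to_seconds(original), to_seconds(ncr)
    return (o / n - 1) * 100


# (original NC time, NC-R time) per FPU clock in MHz, taken from the table
RUNS = {11: ("3:59", "3:55"), 12: ("3:55", "3:45"), 16: ("3:55", "3:12")}

if __name__ == "__main__":
    for mhz, (orig, ncr) in RUNS.items():
        print(f"{mhz} MHz: {time_saved_percent(orig, ncr):.1f}% less time, "
              f"{throughput_gain_percent(orig, ncr):.1f}% more throughput")
```

At 16MHz the two definitions bracket the "~20%" quoted in the table.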

The speed of your Apple computer very much plays into the equation, too. It handles all the writing/reading and of course the endian conversion.
Here’s an NC-R clocked at the default 16MHz installed in an Apple IIgs running the same Mandelbrot at different clocking:

| ‘normal’ 1MHz | ‘fast’ 2.8MHz | TWGS @ 10MHz | TWGS @ 12MHz |
|---------------|---------------|--------------|--------------|
| 8:35          | 4:39          | 3:12         | 3:10         |

The TransWarp GS seems to saturate pretty early. Most likely because it’s running the code completely out of its cache.
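
To see that saturation in numbers, this Python sketch compares the speed-up of each run against the 1MHz run with the increase in CPU clock. It uses only the times from the table above; the names are invented for the example and the whole thing is a back-of-the-envelope view, not a proper benchmark analysis.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' stopwatch reading into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(baseline: str, run: str) -> float:
    """How many times faster `run` finished compared to `baseline`."""
    return to_seconds(baseline) / to_seconds(run)


# (label, CPU clock in MHz, Mandelbrot time) as listed in the table
RUNS = [
    ("normal", 1.0, "8:35"),
    ("fast", 2.8, "4:39"),
    ("TWGS 10", 10.0, "3:12"),
    ("TWGS 12", 12.0, "3:10"),
]

if __name__ == "__main__":
    baseline = RUNS[0][2]
    for label, mhz, time in RUNS:
        print(f"{label:8s} clock x{mhz:4.1f} -> run x{speedup(baseline, time):.2f}")
```

The clock goes up twelvefold while the run time improves by less than a factor of three, and the step from 10MHz to 12MHz barely registers.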

❗👉 Dirk Fröhling, one of the original creators of the original Number Cruncher got a NC-R and did some very detailed benchmarks.
He also provides the sources to his benchmarks as well as the believed to be lost sources for Albert Chin-A-Young’s original SANE INIT!
Read more about those in Dirk’s blog article over here.

InSANEly savage!

Vincent Hemeury (the author of the new INIT) found an article in the famous BYTE magazine (from July 1987) where they compared the performances of a Mac SE with or without 68020/68881 accelerators and a PC AT or a Compaq Deskpro 386 with an 80287 numerical coprocessor.

The version of his InSANE in the archive sometimes uses direct access to the 68882 of the NC-R. In this case, a IIGS with an AppleSqueezer accelerator and the NC-R is 5.6 times faster than the Compaq Deskpro and 30% faster than the fastest machine (for the Savage test) of this benchmark, the 68020 HyperCharger.

As an outstanding example of the speed improvement a coprocessor brings for some operations, the same machine is still faster than a 700 MHz IIGS (KEGS with a Mac Pro) for financial operations. And one limiting factor is the 8-bit bus at 1 MHz of the IIGS – imagine having this clocked faster…

Also, Vincent wrote a little benchmark (included in the INITs archive) called “inSANE” which tests the financial functions.
The speed gain is spectacular :

  • IIGS AppleSqueezer / NC-R disabled : 70 sec to complete the first test
  • IIGS AppleSqueezer / NC-R Init v1.03+ : A bit more than a Second(!) for the same test

A vanilla IIGS roughly needs more than 4.5 minutes to complete this test.

Purchase

Of course you want one now 😉 The first batch of 35 cards is sold out. That said…
The 2nd run of cards is available now. Output is slow as I have to hand-solder the 100pin TQFP CPLD (due to chip-shortage when I had them produced). On the plus-side, the PCB is ENIG gold plated now.
I’m building & testing them on-demand – just mail me if you’re interested in one.


Ahhh, look at those beauties!

The NumberCruncher Reloaded comes in a nice, eco-friendly box with a high quality, glossy spiral-bound manual and an 800k ProDos floppy.

You can choose from 3 different configurations:

Config 1 “Plug’n’Play”:
NC-R with just the PLCC socket, a 68882 FPU is already installed. Plug the card into your Apple and you’re ready to go:  €88.82

Config 2A “I have a PLCC FPU”:
NC-R with just the PLCC socket – you provide and install a PLCC 68881 or 68882 yourself: €80.82

Config 2B “I have a PGA FPU”:
NC-R with PLCC plus a Pin-Grid-Array (PGA) socket – you provide and install a PLCC or PGA 68881 or 68882 yourself €84.82
(There’s no PGA-only option because the PLCC slot is used for the  final function-test)

Shipping

It’s 2021 and the world is  suffering from the COVID19 pandemic. Shipping services are still providing very limited services.
I ship from Germany as trackable package only. I also chose the box dimensions so that it snugly fits into DHL’s shipping categories. In that case, I will wrap the original box in strong wrapping paper. It weighs 250 grams (that’s 0.55lbs or 8.8oz or 0.039 stone – no idea what’s that converted to Ningi 😉 )
Alternatively I can ship the box in an extra box for better protection against “Crazy delivery men”. That will increase the size and thus the shipping price.

Shipping into the European Union is relatively affordable with 9€ – if you want a “box around the box” it’ll be 14€
(Just in case, for Germany it’ll be 6€ as “Paket”)

UK and Switzerland will be 15€ already. Box-in-box will be 20€ then.

The US and Australia can choose between 15€ (or 24€ for box-in-box) if you’re patient(*), or a hefty 50€ for priority shipping.

*) I had shipping times from 10 to a hefty 60 days. Totally random, no idea what influenced the one or the other.

That said, the priority shipping will be fine for 5kg (11lbs, i.e. good for ~15 NC-Rs plus padding), so if you can organize a group-buy, I’m more than happy to support you.
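
For anyone planning a group-buy, here is a quick Python sketch of the arithmetic. It uses only the numbers from this page (250g per boxed card, a 5kg limit and 50€ for priority shipping); the function name and the returned fields are made up for illustration, and how much padding a parcel really needs is left open.

```python
CARD_GRAMS = 250             # weight of one boxed NC-R, as stated above
PRIORITY_LIMIT_GRAMS = 5000  # priority shipping is fine up to 5 kg
PRIORITY_EUR = 50.0          # flat price of the priority parcel
GRAMS_PER_POUND = 453.59237


def group_buy(cards: int) -> dict:
    """Weight, remaining padding allowance and shipping share per card."""
    weight = cards * CARD_GRAMS
    if cards < 1 or weight > PRIORITY_LIMIT_GRAMS:
        raise ValueError("card count does not fit into one priority parcel")
    return {
        "weight_kg": weight / 1000,
        "padding_allowance_kg": (PRIORITY_LIMIT_GRAMS - weight) / 1000,
        "shipping_per_card_eur": PRIORITY_EUR / cards,
    }


if __name__ == "__main__":
    print(f"one card: {CARD_GRAMS / GRAMS_PER_POUND:.2f} lbs")
    for n in (1, 5, 15):
        info = group_buy(n)
        print(f"{n:2d} cards: {info['weight_kg']:.2f} kg, "
              f"{info['shipping_per_card_eur']:.2f} EUR shipping each")
```

With 15 cards in one parcel the shipping share per card drops to a few euros, which is the whole point of organizing a group-buy.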

Ordering

Because of bad experiences in the past, this is pre-paid only.
Again: This is a hobby project made for fun & the community, I’m not a business.
I accept PayPal (for friends) and bank-transfer.
To order, send a mail to ncr@geekdot.com stating

  • your Address
  • the NC-R version(s) you like
  • the preferred shipping method
    • wrapped
    • box-in-box standard
    • box-in-box priority

I will then send you all the details for paying & shipping. As soon as I have received the money, I will ship and send you the tracking information.
If you want to arrange a group-buy I can reserve a certain amount of NC-Rs for a max of 5 days. After that they will be sold on-demand.

NumberCruncher Reloaded Details

Because the main page of the NumberCruncher Reloaded grew bigger and bigger, I’ve split the FAQ and programming stuff in this separate post.

FAQ

Q: Which Apple computers are compatible with the NC-R?
A:
I’ve tested the NC-R in my IIgs and IIe. Those work for sure.
The original FPE was communicated as being compatible with the II and II+, too. I don’t have those machines and while the compatibility is highly possible, it has yet to be proven.

Q: I’m experiencing crashes and instant lock-ups starting programs which are supposed to use the NC-R
A: Most likely your software is expecting the FPE/NC-R in another slot.
For speed’s sake, most current programs natively supporting an FPU card expect the card in a certain slot. Especially the SANE INIT.
So please check if your NC-R is installed in the correct slot and try other programs if they are crashing, too.
I recommend the Mandelbrot program provided on the NC-R Tools disk. This program scans all slots for a FPE/NC-R by itself.
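
For the curious, "scanning all slots" boils down to walking over the I/O window each slot owns. On the Apple II every slot n (1 to 7) gets 16 device-select addresses starting at $C080 + n * $10. The Python sketch below only computes those windows; how the Mandelbrot program (or the INIT) actually recognizes the card is not described here, so `probe` is a purely hypothetical stand-in for that check.

```python
def devsel_range(slot: int) -> tuple:
    """Return (first, last) device-select address of an Apple II slot."""
    if not 1 <= slot <= 7:
        raise ValueError("peripheral slots are numbered 1 to 7")
    base = 0xC080 + slot * 0x10
    return base, base + 0x0F


def scan_slots(probe) -> list:
    """Return the slots for which the hypothetical probe(base) reports a card."""
    return [slot for slot in range(1, 8) if probe(devsel_range(slot)[0])]


if __name__ == "__main__":
    for slot in range(1, 8):
        first, last = devsel_range(slot)
        print(f"slot {slot}: ${first:04X}-${last:04X}")
    # Pretend a card answers in slot 4 (illustration only)
    print(scan_slots(lambda base: base == 0xC0C0))
```

A program that hard-codes one of these windows will only find the card in that one slot, which is exactly the failure mode described above.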

Q: What are these LEDs for?

  • The green BUSY LED blinks at every access to the FPU.
  • The yellow INFO LED doesn’t have a proper job yet. Currently it’s connected to DEVSEL, so you can see it blink very briefly, when your Apple II scans its bus.
  • The red ERROR LED will be lit when the FPU encounters a so-called ‘protocol violation’, i.e. there’s some problem in the communication between the Apple and the 68881/2.
    See page 30 in the manual for more details.

Q: And what is that 5×2 pin-row at the card’s back-edge for?
A: That’s the connector to update the firmware if that should ever be necessary. For now there’s just the one version which is installed.

Q: Can I make the NC-R go faster? What about overclocking?
A: Not really. See the ‘Benchmark‘ section further down.

Q: On the pictures of the NC-R I can identify a 40MHz Motorola 68882. Do all NC-Rs have such a fast FPU?
A: No. I use whatever 68882 is available on the market. Strangely enough, sometimes a 40MHz version is cheaper than say a 16MHz.
So whichever version is installed in your NC-R, it’ll be fast enough and always clocked at 16MHz anyhow.

Q: How can I write programs using the NC-R?
A: This is an extensive question which can’t be answered satisfyingly in an FAQ. Refer to chapter 3 of the manual to learn how the NC-R works internally and how to program it in assembler, C or even Basic.
But I’m also thinking about a dedicated post about just that matter.

Q: I wrote a small program to test the NC-R and it’s not really faster than using SANE on my IIgs
A: A certain amount of calculations needs to be done before the NC-R shows its performance. A single addition probably takes longer than it would need on e.g. a stock Transwarp GS because of all the communication overhead.
This dramatically changes if you have lots of floating-point calculations in one stream, optimally using the 68881/2 internal registers.

Q: Are you writing software for the NC-R
A: Well, maybe in the future but currently I’m busy with other projects.
But so much can be revealed: Brutal Deluxe has its NC-R already 😲

Q: How can I ask you a question / get help / praise / complain / rant about the NC-R?
A: I reactivated my little forum on this page and gave it its own Apple II board.
This way everybody can participate in your question or complaint… speaking of complaints: Do not complain that you have to register and that it took so long for the approval! This is a one-man show, there are time-zones and despite other rumors, I do have a job 😀

Q: Why did you choose green for the PCB’s colo(u)r?
A: It’s a remake, so it should at least somewhat resemble the original look. Also, given that there are translucent lid/case options available today, I personally don’t like the idea of a motley bunch of green/blue/red/white cards standing out of my Apple II… yeah, I’m old-school 👨‍🦳

NumberCruncher Reloaded Software

“Real” applications for your NumberCruncher Reloaded

I prepared an archive, containing all applications (I was able to find) supporting an Apple II FPU accelerator card in some manner, i.e. natively or via SANE.
These were commercial programs back in the days and were not provided with the card itself – to learn more about the basic tools which came with the card, visit the NumberCruncher Reloaded main post.

You can download all presented applications as ShrinkIt archive or ZIP file

Obviously, they’re mainly math packages… and sad but true, as of now they’re all Apple IIgs programs:

GSnumerics (by Spring Branch Software)


Symbolix (by Henrik Gudat of Bright Software)


jazGraph (by Jason Perez)

MathGraphics (by Dirk Fröhling)

saneglue (by Söhnke Behrens)

From the README: “lsaneglue is a library that contains code to let you call SANE funtions directly from ORCA/C”.
This lib provides convenient functions like findfpcp() and most calls to floating-point operations.

I hope that, thanks to the availability of the NumberCruncher Reloaded, this software collection will get some new additions from enthusiasts of the vibrant Apple II retro scene.

The T2C=

It had to be done… and now, 12 years later 😱, it is done:
Finally the T2C64 was poured into a proper PCB and some bells’n’whistles have been added – so it’s now on par with the T2A2 for the Apple II series. Say hello to the T2C=!

TLDR;

Here’s a quick intro for those being too lazy to follow the link to the T2C64.

This card enables your Commodore 8-bit computer to communicate with a Transputer Module piggybacked onto the card.
A Transputer is a 32bit RISC(ish) CPU from the 80’s that has the unique ability to connect to other Transputers by a very simple 2-wire protocol making it possible to create large, powerful computing networks – at least by 1980+ measures 😉

What can I do with it?
How to talk to the Transputer?
Can I do something useful with that?
How fast can I move data back and forth?
Ok, how much?

After many years the Commodore 8-bit bug had bitten me again and it was high time to put some love into my C64 Transputer interface.
But while at it, I thought it would be handy to use this card not just with my C64 but also on all other Commodore machines featuring an expansion port.

This led to the (to my knowledge) first 8-bit Commodore ‘flipper card’, i.e. it has a port connector on each end. One for the C64 & C128 and one for the C264 family, namely the C16, C116 and Plus/4. Yes, it works with all of them. Pull it from your C64, flip it 180° and plug it into e.g. your Plus/4. Cool, huh?
So here’s a quick feature list:

  • Edge connectors to connect to the Commodore
    • C64
    • C128
    • C16/C116
    • Plus/4
  • Each edge connector offers 2 I/O address ranges to be set by a jumper (0xDE00/0xDF00 and 0xFD90/0xFDF0)
  • Offers two TRAM (TRAnsputer Module) slots to connect either 2 Size-1 or one Size-2 TRAM
  • External Transputer-Link connector to connect the T2C= to larger external Transputer networks (pinout is the same as on the T2A2) – not populated on the pictures here.
  • The data-bus is fully buffered to prevent interference with other cards when used in e.g. an expander
  • 3.3V CPLD used to reduce power-consumption as much as possible.

Here’s the card in full glory… without a TRAM plugged in:

….and with one size-1 TRAM which itself provides a 32bit T800 Transputer and 128K RAM. The 2nd slot is still free.

TRAMs came in a wild range of variations, be it the CPUs used on them and/or the amount of RAM. But there have also been peripherals like SCSI controllers or graphics cards – check my little TRAM page if you’d like to get an idea.
Yes, TRAMs are quite vintage and thus hard (or expensive) to get… but don’t despair… I’ve designed my own and also have some old ones in stock – probably enough to serve the handful of interested nerds 😉

What can I do with these?

I knew that was coming 😜
Well it mostly depends on you. The T2C= is an accelerator running its own code in its own RAM and can exchange data with your Commodore 8-bit machine – everything is possible.

All code examples and sources are available in this archive.
Commodore files are in a .D64 disk image.

Personally I always have the initial reflex to run a Mandelbrot fractal on everything that’s even slightly capable of doing so. Most of the time, that’s where my euphoria ends and my project-ADHD kicks in… but that doesn’t stop you from having cool ideas.

Technically this setup isn’t much different from slapping a Raspberry Pi onto your Commie and letting that do stuff… but there’s something I’d like to call the “5 connoisseur’s C’s” which might not be everyone’s cup of tea but very tempting to others:

  • Contemporary: Transputers are from the same era as your Commodore machine while being much more powerful – we’re talking about ~15MIPS here.
  • Completely different: Transputers are natively programmed in OCCAM, a very interesting language that is quite different from the ones you might be used to.
    That said, no worries, there are C, Pascal and Fortran compilers, too. Here’s a page offering a little “SDK” I created – it’s a VirtualBox image coming with everything you need to start coding.
  • Connected: Transputers are made to be networked into a parallel network… making your well-programmed application run even faster, as benchmarks show.
  • Challenging: “Well programmed” means wrapping your genius brain around multi-threaded, parallel paradigms or using the fast 2 or 4K on-chip RAM in the most clever way.
  • Communicate: And finally, find a clever way to communicate with the host (i.e. your Commodore) and vice versa.

Ideas for using it could be raytracing, doing complex calculations, heavily compressing/manipulating data, using it as a simple storage (“Stupid-REU”) or writing a Helios server for it and using your C= as a terminal [Helios is a UNIXish OS running on 1 to infinitely many Transputers].

A final word of warning: While the T2C= uses very little power, a Transputer (and the RAM on a TRAM) does use quite some juice.
Depending on the TRAM this can be as little as 500mA up to 1A – which means your power-supply should be a stronger one.

Communication

Let’s start with the most simple and actually useful code:
Detect a connected T2C=/Transputer and check if it’s working correctly. This code was already shown on my T2C64 posts but now it’s enhanced for newly added machines and runs in BASIC V2 as well as V3.5 or V7.

After telling the base-address it does the following:

  • Init/Reset the Transputer to a sane state
  • Read & display the statuses of the Link interface
  • Write some data into the Transputer’s RAM and read it back
  • Finally, send a small program to the Transputer which makes it possible to find out its model (16bit T2xx, 32bit T4xx/T8xx or just a C004 programmable link switch)

So here’s the new TDETECT code:

100 SY=peek(65534):print chr$(147);"This seems to be a";
110 if peek(1177)=63 then poke1177,62:sy=peek(65534):poke1177,63
120 if sy=72 then print " c64": goto 160
130 if sy=23 then print " c128": goto 160
140 if sy=179 then print " plus 4 or c16":goto 160
150 print"n unknown model":print
160 print "select T2C= base address"
170 print "1: c64/c128 $de00 (56832, default)"
180 print "2: c64/c128 $df00 (57088)"
190 print "3: c264 $fd90 (64912, default)"
200 print "4: c264 $fdf0 (65008)"
210 print "5: enter your own"
220 input m
230 if m=1 then ba=56832: goto 290
240 if m=2 then ba=57088: goto 290
250 if m=3 then ba=64912: goto 290
260 if m=4 then ba=65008: goto 290
270 if m>5 goto 160
280 input "base address:";ba
290 print"initializing transputer"
300 do=ba+1:rem data out
310 is=ba+2:rem in status
320 os=ba+3:rem out status
330 re=ba+8:rem reset/error
340 an=ba+12:rem analyze
350 rem ------------------
360 poke re,1
370 poke an,0
380 poke re,0
390 rem clear i/o enable
400 poke is,0
410 poke os,0
420 print"reading statuses"
430 print"i status: ";(peek(is)and1)
440 print"o status: ";(peek(os)and1)
450 print"error: ";(peek(re)and1)
460 print"sending poke command"
470 pokedo,0
480 print"o status: ";(peek(os)and1)
490 :
500 print"sending test-data to t. (12345678)"
510 poke do,0:poke do,0:poke do,0:poke do,128
520 poke do,12:poke do,34:poke do,56:poke do,78
530 print"i status: ";(peek(is)and1)
540 :
550 print"reading back from t."
560 poke do,1:rem peeking
570 poke do,0:poke do,0:poke do,0:poke do,128
580 print peek(ba);peek(ba);peek(ba);peek(ba)
590 :
600 dimr(4)
610 print"sending program to transputer..."
620 forx=1to24
630 readt:poke do,t
640 wait os,1
650 nextx
660 print:print"reading result:"
670 c=0
680 n=ti+50
690 ifc=10 goto 760
700 if ti>n then ee=ee+1:if ee=10 goto 760
710 if(peek(is)and1)=0 goto 700
720 r(c)=peek(ba)
730 c=c+1
740 goto 680
750 rem ------------------------
760 if c=1 then print"c004 found"
770 if c=2 then print"16 bit transputer found"
780 if c=4 then print"32 bit transputer found"
790 data 23,177,209,36,242,33,252,36,242,33,248
800 data 240,96,92,42,42,42,74,255,33,47,255,2,0
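For readers who think in C rather than BASIC, the register layout used by the listing above can be sketched like this. The offsets are taken from lines 300-340 of TDETECT and the three result texts from lines 760-780; the helper names `t2c_reg` and `t2c_classify` as well as the "nothing detected" fallback are made up for illustration and are not part of the archive.

```c
#include <stdint.h>

/* Register offsets relative to the T2C= base address,
   as used in lines 300-340 of the BASIC listing. */
enum {
    T2C_DATA_IN    = 0,   /* read: byte coming from the Transputer   */
    T2C_DATA_OUT   = 1,   /* write: byte going to the Transputer     */
    T2C_IN_STATUS  = 2,   /* bit 0 set: a byte is waiting to be read */
    T2C_OUT_STATUS = 3,   /* bit 0 set: the next byte can be sent    */
    T2C_RESET      = 8,   /* write: reset line, read: error flag     */
    T2C_ANALYSE    = 12   /* write: analyse line                     */
};

/* Hypothetical helper: absolute address of one register. */
static uint16_t t2c_reg(uint16_t base, uint8_t offset)
{
    return (uint16_t)(base + offset);
}

/* Hypothetical helper: the detection program answers with 1, 2 or 4
   bytes; map that count to the messages printed by TDETECT.
   The fallback text is an assumption, the BASIC version prints nothing. */
static const char *t2c_classify(int reply_bytes)
{
    switch (reply_bytes) {
    case 1:  return "c004 found";
    case 2:  return "16 bit transputer found";
    case 4:  return "32 bit transputer found";
    default: return "nothing detected";
    }
}
```

With the C64 default base, `t2c_reg(0xDE00, T2C_RESET)` yields 0xDE08, which is the same address the BASIC code calls `re`.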

Do something useful?

So now that we have detected a connected Transputer on our Commodore, it should do something useful like… adding numbers.
While this is way beneath its dignity, it’s a good example of uploading code to the Transputer and how to send and read data.

For this, I’d like to redirect you to the 2nd code example I’ve posted for the T2C64…

And finally in this former post I coded a Mandelbrot fractal (Video inside! 😉) for the C64 using cc65 and the TGI graphics library which calculates and displays the initial fractal within a minute or so.

Now having the whole family of C264 machines added, I thought it would be nice to have a demo for them, too.
So, just because it can do graphics out of the box, I wrote the Mandelbrot “frontend” in BASIC 3.5. It worked but it was brutally slow… it takes about 10 minutes to get this screen 🐌

That is – of course – because BASIC is darn slow in doing the IO and plotting. Looking at one of the above examples, getting a pixel from the Transputer means reading a byte, setting the “pen” to the next coordinate, deciding whether to plot or not, and repeating – in code (provided in the D64 disk image) this looks like this:

390 for y=0 to 199
400 :for x=0 to 319
420 ::px=peek(ba)
430 ::if px=32 then draw 0,x,y:else draw 1,x,y
440 :next x
450 next y

Because it’s so slow, I didn’t even need to check the input-status of the link-interface as the Transputer delivers the data much quicker than BASIC can say “next”…

This of course will be the ultimate show-stopper. What’s the sense of such a fast number cruncher, if you can’t get the data out of it fast enough?

Speed?

Mhh, so how long does it take to (just) read data from the T2C=?
Let’s start with BASIC to have a baseline. This is “BAS-SPEEDTEST”, a very simple benchmark.
It loads a tiny program into the Transputer which makes it spit out an endless stream counting from 1 to 10. Then we read 4KB of that and time it.

NB: As seen in the examples above, there’s an automatic handshake: the C012 link-interface chip on the T2C= sets a flag (In-Status) each time there’s a byte ready to be fetched. But BASIC is so slow that there’s always new data available on the next read.

100 ba=56832:rem Adjust your base accordingly
110 dd=ba+1:rem data out
120 is=ba+2:rem in status
130 os=ba+3:rem out status
140 re=ba+8:rem reset/error
150 an=ba+12:rem analyze
160 rem ------------------
170 poke an,0
180 poke re,0
190 poke re,1
200 for d=1 to 500:next d
210 poke re,0
220 print"sending program to transputer..."
230 forx=1to33
240 readt:pokedd,t
250 waitos,1
260 nextx
270 print"reading incoming data..."
280 zeit=ti
290 for l=1 to 4096
300 in=peek(ba)
310 next
320 print"time for 4k:";(ti-zeit)/60
370 rem --- for the transputer
380 data 32,181,36,242,33,248,36,242,33,252,37,247
390 data 34,249,70,33,251,36,242,74,251,96,7,1,2,3
400 data 4,5,6,7,8,9,10

That showed it clearly… Basic is an IO-sloth.
Even without waiting for the input-status flag, it took the C64 16.5 seconds to read 4KB – and nearly twice as long if we check the input-status.

Time to read 4KB (in seconds):

Machine   without WAIT IS   with WAIT IS
C64       16.5              29.4
C128      24.28*            40.5
Plus/4    19.8              33

*) It’s strange that BASIC V7 is even slower, but talking to “the 128 Master” (Johan Grip) solved the mystery:
BASIC 7 also does a “long” fetch through extra vectors and code in RAM. That adds quite a bit of overhead.
You could say that BASIC 7 has a “bad peek performance” 🙂
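The “with WAIT IS” variant boils down to polling bit 0 of the input-status register before every read. Here is a minimal C sketch of that loop, assuming the register offsets from the BASIC listing (data-in at base+0, in-status at base+2). The function name `t2c_read_polled` and the `max_polls` timeout are hypothetical additions; the BASIC benchmark has no timeout.

```c
#include <stddef.h>
#include <stdint.h>

#define T2C_DATA_IN   0   /* offset of the data-in register        */
#define T2C_IN_STATUS 2   /* offset of the input-status register   */

/* Read `len` bytes from the link into `dst`, checking the
   input-status flag before every byte.
   `regs` points at the T2C= base address, e.g.
   (volatile uint8_t *)0xDE00 on a C64 (assumption: memory-mapped access).
   `max_polls` limits how often the flag is checked per byte.
   Returns the number of bytes actually read. */
static size_t t2c_read_polled(volatile uint8_t *regs, uint8_t *dst,
                              size_t len, unsigned max_polls)
{
    size_t n;

    for (n = 0; n < len; ++n) {
        unsigned polls = 0;

        /* bit 0 clear: no byte available yet */
        while ((regs[T2C_IN_STATUS] & 1) == 0) {
            if (++polls >= max_polls) {
                return n;   /* give up, nothing arrived in time */
            }
        }
        dst[n] = regs[T2C_DATA_IN];
    }
    return n;
}
```

Dropping the inner `while` loop gives the “without WAIT IS” variant, which only works as long as the host is slower than the Transputer.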

More speed, please!

Ok, let’s use something more mature… like the cc65 creating nice code for all our beloved Commodore machines.

#pragma static-locals(1);

#include <stdlib.h>
#include <time.h>
#include <conio.h>
#include <peekpoke.h>
#include "trproc.h" // that's in the provided archive

#define TOBEREAD 4096 // how many bytes should be read

/* Transputer benchmark code (same bytes as in BAS-SPEEDTEST, minus the length byte) */
static char tcode[33] = {
    0x20, 0xB5, 0x24, 0xF2, 0x21, 0xF8, 0x24, 0xF2, 0x21, 0xFC, 0x25, 0xF7,
    0x22, 0xF9, 0x46, 0x21, 0xFB, 0x24, 0xF2, 0x4A, 0xFB, 0x60, 0x07, 0x01,
    0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A
};

int main (void)
{
    clock_t t;
    unsigned long tenths, sec, bps;
    unsigned sec10;

    int i;
    char onechar;

    clrscr ();

#if defined(__C64__) || defined(__C128__)
    cprintf ("Expecting T2C= at 0xde00\r\n");
#elif defined(__PLUS4__) || defined(__C16__)
    cprintf ("Expecting T2C= at 0xfd90\r\n");
#endif

    /* Init Transputer - fixed to plus/4 default for now */
    init_t();

    /* upload Transputer code */
    cprintf ("Sending code to Transputer\r\n");
    puttr(tcode, (sizeof(tcode)));
    cprintf ("start reading %d bytes...", TOBEREAD);
    t = clock ();

    /* reading X KB byte by byte */
    for(i=0; i < TOBEREAD; i++) {
        gettrchar(onechar);
    }
    cprintf ("done\r\n");
    t = clock () - t;

    /* Calculate stats in tenths of a second, using the tick rate
       cc65 defines for the target; guard against a division by
       zero on very short runs */
    tenths = ((unsigned long) t * 10) / CLOCKS_PER_SEC;
    sec = tenths / 10;
    sec10 = (unsigned) (tenths % 10);
    bps = tenths ? (TOBEREAD * 10UL) / tenths : 0;

    /* Output stats */
    cprintf ("\r\nDuration: %lu.%us (%lu byte/s)\r\n", sec, sec10, bps);

    /* Done */
    return EXIT_SUCCESS;
}

But this time it took just 4.4/4.6/3.6 seconds on a 64/128/+4! 🏍💨 That’s roughly four to five times faster than BASIC.

Are we there yet?

That looks promising and there’s still a C-compiler which we can optimize… read: replacing it with assembly code super-power.

For that I wrote a little macro-library for KickAssembler. Any other assembler will do, too, of course.
Besides the Transputer initialization and detection stuff there are macros for reading and writing a single byte, up to a “page” (256 bytes) and the full 64K using two zero-page addresses… this is how we read the 4KB in the benchmark:

.label base = $de00 // define according to setting
.label inreg = base
.label outreg = inreg + 1
.label instat = inreg + 2
.label outstat = inreg + 3
.label reset = inreg + 8
.label analyse = inreg + $c
.label errflag = reset

		* = $C000 // or wherever you like
start:		
		inittr()  // macro init_transputer
		
		lda #$2E  // print a "." at 1/1 for debugging ;)
		sta $0400
		
		puttr_1page(bench, 34)  // upload the benchmark code

		lda #$00  // Set destination pointer to base ($8100)
		sta $FB   // in zero page
		lda #$81
		sta $FC
		
		gettr_big($FB, $1000) // get 4KB data and write it to $8100
	
		rts

		// code for benchmarking & testing (33 bytes)
		// The first byte is always 'sizeof(code)'
  bench:	.byte $21, $20, $B5, $24, $F2, $21, $F8, $24, $F2, $21, $FC, $25, $F7
		.byte $22, $F9, $46, $21, $FB, $24, $F2, $4A, $FB, $60, $07, $01
		.byte $02, $03, $04, $05, $06, $07, $08, $09, $0A

“Wrapped” as a SYS call into a BASIC program to stop the time – including the Transputer code upload and checking the input-status – this code takes 0.5 seconds to read 4KB and even with writing the read data to a defined memory area! 🚀
That’s 33 times faster than BASIC and still 8 times faster than cc65.
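To put the three C64 results side by side, here is a small C sketch that turns the measured times into throughput and speed-up factors. The timings (16.5 s for BASIC, 4.4 s for cc65, 0.5 s for assembly) are the ones quoted in this post; the function names are made up for illustration.

```c
#include <stdio.h>

#define BENCH_BYTES 4096.0   /* every benchmark above reads 4KB */

/* Throughput in bytes per second for a given run time. */
static double bytes_per_second(double seconds)
{
    return BENCH_BYTES / seconds;
}

/* How many times faster the `fast` run is compared to the `slow` one. */
static double speedup(double slow_seconds, double fast_seconds)
{
    return slow_seconds / fast_seconds;
}

/* Print the C64 numbers quoted in the text. */
static void print_c64_summary(void)
{
    const double basic = 16.5, cc65 = 4.4, assembly = 0.5;

    printf("BASIC:    %.0f byte/s\n", bytes_per_second(basic));
    printf("cc65:     %.0f byte/s\n", bytes_per_second(cc65));
    printf("assembly: %.0f byte/s\n", bytes_per_second(assembly));
    printf("assembly vs BASIC: %.1fx\n", speedup(basic, assembly));
    printf("assembly vs cc65:  %.1fx\n", speedup(cc65, assembly));
}
```

Calling `print_c64_summary()` shows the assembly version moving 8192 bytes per second. Keep in mind that the 0.5 s figure also includes the code upload and storing the data, so the pure read speed is a bit higher still.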

Getting one?

Still with me and you’re really interested in getting one of these?
Please go through this checklist first:

  • You’re aware that there’s no software for it yet
  • You’re aware that you have to code for the Transputer and are up to learning new things
  • You also need to write code on the Commodore side – assembly needed for maximum speed
  • Besides the T2C= you will need a TRAM. So you need to own/purchase one, too.

If you can answer all of them with “Yes”, “fine with me” and/or “sure!” I can provide you with a T2C= for €40 plus shipping.

I also have TRAMs available in different configurations:

  • My own design, the AM-B404  (size-1, 2MB SRAM): 45€
  • Various manufacturers: size-1, 1MB DRAM: Ask me.
  • “bargain offer”: original INMOS IMS-B404 (size-2, 2MB): 25€
    (given their size, they will clog your T2C= completely)

available CPUs are:

  • T425-20: 7€
  • T800-20: 12€
  • T800-25: 20€

For example, if you like to have a T2C=, my AM-B404 TRAM and a 20MHz T800 that would be 40 + 45 + 12 = 97€ plus shipping

Shipping with tracking is
European Union 9€  (Just in case, for Germany it’ll be 6€ as “Paket”)
UK/Switzerland 13€
USA 15€

⇒ drop me a mail to [address shown as an image]
(Sorry, you have to type that into your mail-client – nobody likes SPAM, and neither do I)

The STG[A]TW

This is the first project I ever did for one of my favorite computers, the ATARI Mega-ST. As told in one of my blog posts, the ATARI ST was my 2nd greatest love ❤ (after the C64) and, having been part of a very cool company back in the days, I only have fond and happy memories of it.

After all the years of fiddling with nearly every machine on the market, it’s like coming home by just looking at its system font or hearing its specific bell-sound (even the ever-annoying key-click sound it makes by default).
And now it’s time to do something cool with it… adding what I missed back then: Color and -of course- Transputers 😉

TLDR;

Ok, so you’re in a hurry or suffer from severe ADHD?

This is a graphics card for the ATARI Mega ST internal bus including a Transputer interface.

Got it. More details please…
What about software? (links to a different post)
Why, for god’s sake!?
There’s a relocator, too?
Ok, how much?

NB: This card is now superseded by the ATW800/2

Say hello to the STG[A]TW!

What’s that about the strange naming?! Well, this card is a hybrid of a classic STGA ISA graphics-card adapter and a Transputer interface for the Mega-ST bus.
Mega-ST, high-res graphics and Transputers? Mhh, does this ring a bell? Yes, component-wise this is exactly the configuration of an ATARI ATW800, the famous and rare ATARI Transputer Workstation (for which I designed a Farmcard, just in case you own an ATW).
So adding the two, it’s an STGA-ATW or STG[A]TW for short… and it looks like this:

Looking at the top you’ll spot the 90° angled ISA Slot at the right edge, giving (selected) ET4000 graphic cards a home.
To the left there are two Transputer TRAM slots making it possible to use two size-1 or a single size-2 TRAM.
Obviously, an ISA card and the TRAMs would collide, so you have to choose… or you’re a lucky owner of a low-profile ET4000. Then you could use your VGA card plus one TRAM like this:

But even if your ET4000 card is covering the whole STG[A]TW don’t despair! Looking at the backside you can spot the external Transputer link connector (on the right edge):

Using this you can connect to e.g. an external Transputer(-farm) of any size… for example something like my 64 CPU Final Cube 🔥

Looking further around the backside you can spot a preparation for a CR2032 coin-shape battery holder. That is meant to replace the two AA batteries used in the original case-lid because depending on the TRAMs used, it might be necessary to remove the battery compartment (yes, you’d need to cut it out 😰) .

Talking about power… at the bottom you can see the external power connector, which has to be connected – you need to supply at least 5V and ground, optionally 12V if your ET4000 needs that.
That said, I highly recommend making sure your Mega ST’s PSU is powerful enough – best would be to replace it with e.g. a Mean Well RD-50A.

Why?!

I knew you’d ask. Well in case you haven’t noticed yet, I’m a total Transputer nut. It’s a fabulous, genius CPU and design. The more you dig into it, the more you’ll love it.

Back then I adored the ATW800 and always wanted to own one. But it was insanely expensive and – to be honest – wasn’t a real member of the ST/TT-family anyhow.
This is because the Mega-ST1 inside the ATW was mainly used as a bootup machine for the Transputer and after that was up and running, everything the ST did was file- and user-I/O (Mouse, Keyboard, RS232).

In my humble opinion, the STG[A]TW is (somewhat) the way ATARI should have done it back then. Instead of creating an ‘island solution’ they should have used the existing install-base and offered an expansion to it. Plug in the missing parts (graphics & Transputer) and keep the TOS/GEM eco-system in charge.
Users could keep running their applications and use the extra ‘ooomph’ to speed them up. Think of all the accelerators Apple Macintosh users had available to speed up PhotoShop filters or have it do the heavy number crunching of science applications etc.
Even though all data has to travel over the bus to the Transputer and back, this is still faster than the 8MHz 68000.

Given that in 1990 about 350 ATW800s were produced and sold at 5000-7000 GBP, which equaled about 13700 DM or 8000$ (that’s about 11400 GBP, 13700 EUR or the same in US$ today),
I bet the sales of an “ATW for the poor” would have been much higher.

So, again, why? Well to have Mandelbrot fractals calculated fast and  in colo(u)r, of course!
Fast means ~60sec, even using slow GEM routines. Using the same algorithm and iteration depth, the ST’s 8MHz 68000 took nearly 3 hours to calculate the same fractal.

Here’s a quick peek at what ‘fast’ looks like:

Evolution – a quick excursion

If you’re into hardware development you might wonder why there’s a very vintage GAL and a semi-vintage CPLD used in this design.
Here’s my explanation and shameful justification 😉

From the very simple and basic design of the STGA I took the usual nerdy feature-creep road to hell 🙄
My initial design naturally folded the GAL’s logic into one big CPLD. And having all address-lines available on it, that design also included (on top of the ISA and Transputer interface) a 68882 FPU, an IDE interface and a ROM decoder… everything worked fine BUT all ‘modern’ ET4000 cards didn’t.
I stared at logic-analyzer traces for weeks and weeks and compared them to those of the original STGA: they were absolutely identical. But whatever I did, I wasn’t able to get ET4k cards with a Rev. TC6100AF chip working.
In the end I decided to keep the STGA part as-is, including the external AND-ing of /LDS & /UDS and the inverting of /DTACK, and put the Transputer handling into a smaller (and cheaper) CPLD.
Thus the FPU, IDE and ROM decoding were off the table and to be honest, there are other solutions which do that job better anyhow.

From left to right: STGA, the Über-STGA and the final STG[A]TW

So there you have it: Colorful high-res GEM combined with the mighty Transputer power… but I understand, that those low-profile ET4k cards are getting rarer and rarer and not everybody has an external Transputer farm to connect to.
So I made another card or better a so-called CPU relocator…

The TRAM-Relocator

Most (Mega) ST users out there already have one or more expansions to their system, mostly plugging into or onto the CPU creating a ‘stack’ of PCBs.
Because the STGA (as well as the STG[A]TW) overlaps the Mega ST’s CPU socket, you might want to relocate the CPU a bit away from the Mega-Bus socket. Simple relocators just move it a bit towards the front of the case. But that still results in having a stack of multiple extensions. For example here’s a Storm ST (Alt-RAM) on top of a Cloudy (4x ROM) plugged into a Lightning ST (IDE & USB):

This can get tricky in some crowded Mega ST cases…

I really liked the ‘Bus I/O port design’ of the Exxos’ STF Remake Project having multiple sockets next to each other.
And if you have your original TOS ROMs removed (and replaced by e.g. a Cloudy) there’s actually some space to roll out 4 of them, having the Relocator sit flush on the Mega-ST mainboard (make sure the backside of the Relocator is completely insulated!):

4 Sockets and a cool TRAM socket 😁

As clearly written on the PCB, SOCK1 goes into the (to-be-retrofitted) CPU socket and, using ‘hollow pins’, it can take a CPU itself.

SOCK2-4 are available to extensions of your choice – all 3 of them are protected against power-surges by a fuse and a diode.
This design decision was made due to my own painful experience of losing everything which had been plugged into the CPU socket… and the Blitter 😥

In the lower right corner are pins for an additional external power connector, also protected. That might be necessary depending what you’re plugging into those sockets.

Finally, the left edge is a Transputer TRAM socket which can be connected to the STG[A]TW by a 10pin flat-cable providing link signals and a 5MHz clock signal.
That way, you can use the STG[A]TW with an internal Transputer even if your ET4000 card is as big as a baking tray.
It is highly recommended to use external power when doing so. The poor 68000 power-pins won’t be enough for it.

If needed, the whole TRAM part can be snapped off from the Relocator to, uhm, relocate the TRAM elsewhere in- or outside the case or use it stand-alone. For that purpose, the TRAM part itself features an optional connector for power as well as a place to solder the required 5MHz oscillator and 2 mounting holes.

With everything in place, your “ATW800 for the poor” could look like this:

What you see here is the STG[A]TW plugged in, giving home to a low-profile ET4000 and a Size-1 TRAM.
The Relocator was plugged into the CPU socket, and its 1st slot took the Cloudy-Storm with the 68000 sitting on top of it.
Slot 2 of the Relocator is taken by a Lightning-ST… and last but not least, a second TRAM was put onto the Relocator (you can spot the grey flat-cable connecting it to the STG[A]TW).

Want one?

All this sounds so cool that you want to own a STG[A]TW?
Well, first check out this list:

  • There’s next to no GEM user software for it yet
    👉 but we’re working on it and there’s a pretty good system support in place already – and Helios is running already! 🥳
    An extra post on that is available here.
  • Do you have an ET4000 card of which you know it’s working with the NOVA drivers?
    👉 I am not able to support you in getting your specific card working – there are just too many models and permutations of possible TOS/GEM/Driver installations. See this atari-forum.com thread to get an idea…
  • Do you own a TRAM?
    👉 I might provide you with one at extra cost, mail me.
  • Do you have time to wait?
    I’m manually building these boards and it’s a lot of work (0.5 pitch SMD, lots of through-hole pins to cut and file down etc.)

If that’s 4  times “Yes” I can build & sell you one of the 6 which I have left for 100€ (plus shipping)… yes, that’s hefty but the quite large PCB is 4 layers (for stable power-distribution), just the ISA slot connector is 10€ already, Mega Bus 5€, GAL, CPLD etc.etc…. plus, as said,  it takes quite some time to build & test them.
Drop me a mail on the bottom of that page if interested…

SOLD OUT… sorry 😥

As for the CPU-relocator, I’m selling un-populated PCBs for 8€ (Or get the gerbers here and have yours made at your favorite PCB manufacturer).
I’m not building them because the CPU ‘socket’ (SOCK1) is made of 64 single pins which you have to pry/get out of precision pin-headers.

That’s tedious work you most likely want to do once… but not many times.

All that said – If you weren’t able to get a STG[A]TW, don’t despair.
I consider this as my stepping stone and learning platform for something cooler to come 😎.
Because I don’t like vapor-ware and hot-air-talking, I’ll tell you more when it’s a) done and b) working.

STG[A]TW programming and software

Ok, you read/heard about the STG[A]TW and want to know more about how to use it and -most importantly- what it’s good for?

First and foremost, a Transputer is a computer-system of its own connected to a host. In this case an ATARI Mega ST.
But given an available host-adapter that could also be e.g. a Unix machine, a classic PC, an Apple II or even a Commodore C64, C128 or Plus/4.
That host communicates with the Transputer over a link-interface using specific memory addresses or, if available, a library. That way the host can send executable binaries to the Transputer, send or receive data to/from it and control  it (boot, debug, etc.).

Because each host system is different, these addresses are different, too. But the transfer protocol and Transputer executables are always the same. So looking at this BASIC code example for the C64 gives you an idea, how it works – the steps are the same for every host-communication no matter which host-system used.
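Since the byte stream sent over the link is the same on every host, it can be built independently of the hardware. Below is a minimal C sketch of the three message types visible in the C64 BASIC example: a poke (control byte 0, address, data), a peek (control byte 1, address) and a boot program (length byte followed by the code). It assumes a 32bit Transputer with 4-byte little-endian words; all function names are hypothetical and only the framing is shown, not the actual register access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit word least significant byte first,
   the order in which the BASIC example sends 0,0,0,128. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Poke: control byte 0, then address word, then data word (9 bytes). */
static size_t frame_poke(uint8_t out[9], uint32_t addr, uint32_t value)
{
    out[0] = 0;
    put_le32(out + 1, addr);
    put_le32(out + 5, value);
    return 9;
}

/* Peek: control byte 1, then address word (5 bytes).
   The Transputer answers with the word stored at that address. */
static size_t frame_peek(uint8_t out[5], uint32_t addr)
{
    out[0] = 1;
    put_le32(out + 1, addr);
    return 5;
}

/* Boot: one length byte followed by the code itself.
   Lengths 0 and 1 are taken by poke and peek, so valid code
   is 2..255 bytes long. Returns 0 if it does not fit. */
static size_t frame_boot(uint8_t *out, size_t cap,
                         const uint8_t *code, size_t len)
{
    if (len < 2 || len > 255 || cap < len + 1) {
        return 0;
    }
    out[0] = (uint8_t)len;
    memcpy(out + 1, code, len);
    return len + 1;
}
```

For example, `frame_poke(buf, 0x80000000u, 0x4E38220Cu)` produces exactly the nine bytes the C64 test sends: 0, then 0,0,0,128, then 12,34,56,78. A host driver would then push these bytes through its own data-out register, whatever address that happens to live at.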

As usual, here’s a table of contents for those being in a rush..

Quick intro about standards & history

Yes, there have been very different ATARI ST and Transputer interfaces in the past. “Two and a half” systems were most prominent – let’s have a look at them before we go into details of the STG[A]TW.

The Atari Transputer Workstation aka ATW800

I think I’ve already written a lot about the ATW800 in several posts on this page and even designed an expansion card for it, although I don’t own an ATW myself.
To make a long story short: This is basically a design, where the ATARI Mega-ST is used as a boot device and after that just handles file- and user-I/O. The Transputer is attached to the ST via DMA and runs the Helios OS and has direct access to the graphics controller called ‘Blossom’. Totally different concept.

KUMA K-MAX

The KUMA K-MAX was a box connected to the ATARI ROM-module port and thus acted as a pure ‘number cruncher on a leash’.
There are two reviews still available: the English review on atarimagazines.com and the German ST-Computer article, which even shows some photographs, from which I ‘borrowed’ this:

Transfertech

Outside “the scene” this is a relatively unknown German company which actually made a lot of Transputer-centric hardware.
For the ATARI series they had 3 host interfaces:

  • A ROM port interface (all ST models)
  • A Mega ST bus interface (ROM port design botched onto the bus)
  • A VME-card (Mega-STE, TT)

Like the KUMA K-MAX, this design also attached the Transputer(s) as a number cruncher.
As I own all of them, I might write a dedicated post about them some day.

This is how we do it

As all of the above did their own thing, there was and is no standard for interfacing Transputers to the ATARI ST series. So I defined one together with fellow ATARI ST Transputer enthusiast André Saischowa, who did some intense ATARI Transputing fiddling back in the day.

In the case of the ATARI ST, the link-interface (e.g. the STG[A]TW) ‘lives’ at the base address 0xFFFAC0 and uses 18 bytes from there up to 0xFFFAD2. So the complete address-range looks like this (odd addresses, so we can address the lower byte of a 68000 word):

#define base 0xfffac0
#define inreg ((base)+1) /* C012 */
#define outreg ((base)+3)
#define instat ((base)+5)
#define outstat ((base)+7)
#define reset ((base)+17) /* writing*/
#define analyse ((base)+19)
#define errflag reset /* reading*/

But you don’t have to bother with those as we provide two more convenient ways to talk to a Transputer.
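For the curious, though, here is a minimal sketch of what polled byte transfers over those registers could look like in C. It assumes the usual C012 behaviour, where bit 0 of the input status register means "a byte is waiting" and bit 0 of the output status register means "ready to take a byte". The names link_get_byte and link_put_byte, the enum constants and the spin-count timeout are made up for this illustration and are not taken from the archive. The base pointer is passed in as a parameter, so the logic can also be tried against a plain array on any machine.

```c
/* Register offsets from the link-adapter base, matching the list above. */
enum {
    LINK_INREG   = 1,  /* C012 input data register                  */
    LINK_OUTREG  = 3,  /* C012 output data register                 */
    LINK_INSTAT  = 5,  /* bit 0 set: a byte is waiting to be read   */
    LINK_OUTSTAT = 7   /* bit 0 set: the link can accept a byte     */
};

/* Hypothetical helper: poll the input status up to `spins` times and
   read one byte. Returns 0 on success, -1 if nothing arrived. */
int link_get_byte(volatile unsigned char *base, unsigned char *value, long spins)
{
    while (spins-- > 0) {
        if (base[LINK_INSTAT] & 1) {
            *value = base[LINK_INREG];
            return 0;
        }
    }
    return -1;
}

/* Hypothetical helper: poll the output status up to `spins` times and
   write one byte. Returns 0 on success, -1 if the link stayed busy. */
int link_put_byte(volatile unsigned char *base, unsigned char value, long spins)
{
    while (spins-- > 0) {
        if (base[LINK_OUTSTAT] & 1) {
            base[LINK_OUTREG] = value;
            return 0;
        }
    }
    return -1;
}
```

On the ATARI itself, base would be (volatile unsigned char *)0xFFFAC0L, and the access most likely has to happen in supervisor mode.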

☝ Some words of warning to the programmers:

  1. While the 68000 in your ATARI is big-endian, Transputers are little-endian. So data being sent back and forth might need conversion.
  2. Floating-point variables used by the Transputer are IEEE 754-1985, thus 32 Bit (single precision) or 64 Bit (double precision).
    Some compilers like Turbo/Pure-C on the ATARI ST use 80-bit doubles.
    Those need to be converted by e.g. the xdcnv call from the PCFLTLIB library.
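To illustrate the first warning, here is a small sketch of byte-order handling in C. The helper names (swap32, put_le32, get_le32, float_to_le) are invented for this example and are not part of trproc.h. The shift-based helpers produce little-endian byte order no matter which endianness the host uses. The float helper assumes the host's float is a 32-bit IEEE 754 single, and an older compiler without <stdint.h> would need unsigned long instead of uint32_t.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value (big-endian <-> little-endian). */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFUL) << 24) | ((v & 0x0000FF00UL) << 8) |
           ((v & 0x00FF0000UL) >> 8)  | ((v & 0xFF000000UL) >> 24);
}

/* Store a 32-bit value as four bytes, least significant byte first. */
void put_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Rebuild a 32-bit value from four little-endian bytes. */
uint32_t get_le32(const unsigned char in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Store a single-precision float as little-endian bytes.
   Assumes sizeof(float) == 4 and IEEE 754 encoding on the host. */
void float_to_le(float f, unsigned char out[4])
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bit pattern, no numeric cast */
    put_le32(bits, out);
}
```

A buffer filled this way could then be sent as raw bytes, since it is already in the order the Transputer expects.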

The static way

The raw way uses an include file called “trproc.h”. Like everything else, it’s included in the program archive, in the “DEVELOP” folder.

This include-file gives you these calls to receive (get) data from or send (put) data to your Transputer:

get/puttrchar(char) read/send one byte
get/puttrshort(short) read/send a short (2 bytes)
get/puttrint(int) read/send an integer (32 bits)
get/puttrlong(long) read/send a long (32 bits)
get/puttrfloat(float) read/send a float (32 bits)
get/puttrdouble(double) read/send a double (64 bits)
get/puttrraw(char *array, int length) read/send an array of length

The calls marked blue are doing the endian-conversion for you.

Additionally there’s a call to check for an available Transputer: checkTransputer(int checkType) 

If checkType is ‘0’, this function will return ‘1’ if it was able to find a Transputer or ‘0’ when not.
Setting checkType to ‘1’, the return value will give you the “family” of the found Transputer:

0 – No Transputer found
1 – Found a C004 link-switch
2 – A 16bit T2xx Transputer was found
4 – A 32bit T4xx/T8xx Transputer was found
-1 – Found something unknown
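As a small illustration, the return codes listed above could be turned into readable text like this. The function name transputer_family_name is made up for this sketch. Only the numeric codes come from the list; every other value is treated as "unknown" here.

```c
/* Map the "family" code from checkTransputer(1) to a description.
   Codes 0, 1, 2 and 4 are the ones listed above; anything else
   (including -1) is reported as unknown. */
const char *transputer_family_name(int code)
{
    switch (code) {
    case 0:  return "no Transputer found";
    case 1:  return "C004 link-switch";
    case 2:  return "16-bit T2xx Transputer";
    case 4:  return "32-bit T4xx/T8xx Transputer";
    default: return "unknown device";
    }
}
```

With trproc.h included, something like puts(transputer_family_name(checkTransputer(1))); would then print what was found.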

The elegant way – TBIOS

The much more elegant way is provided by André who extended the ‘ALIABIOS’ from a project published in the German computer magazine c’t back in 1989.
It’s a GEMDOS driver called “TBIOS.PRG” and can be put into your AUTO folder or called manually when needed. This driver has all the bells’n’whistles like a proper XBRA-ID etc.

DOS#  Call name                                      Result in D0 (0 = OK)
100   SetLinkAdr(Adr:W)                              -1 = not ready
101   ByteToLink(Value:W)                            -1 = timeout
102   ByteFromLink()                                 -1 = timeout
103   LongWordToLink(Value:L)                        -1 = timeout
104   LongWordFromLink((Value):L)                    -1 = timeout
105   SliceToLink((Buf):L,Len:L)                     RealLen
106   SliceFromLink((Buf):L,Len:L)                   RealLen
111   TestError()                                    1 = Transputer error
112   SetReset()                                     0
113   SetAnalyse()                                   0
114   BootRoot((FileName):L)                         <0 = error
115   NewFunkOk()                                    "ELK1" = functions available
116   BlockToLink((Buf):L,Len:L)                     sent bytes
117   BlockFromLink((Buf):L,Len:L)                   sent bytes
118   BlockFromLink((Buf):L,XLen:L,YLen:L,Offset:L)  without timeout
119   GetCommand(Buf)                                -1 = no command found (as SliceFromLink but with a shorter timeout)

👉 Need short coding examples here
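One possible short example, as a sketch only: sending a buffer byte by byte through call 101 (ByteToLink) and stopping at the first timeout. The function numbers and the "0 = OK, -1 = timeout" convention are taken from the TBIOS call list. Everything else is an assumption made for this illustration: the tbios_trap_fn type stands in for whatever GEMDOS trap binding your compiler offers, and fake_trap is a dummy so the logic can be tried without an ATARI.

```c
/* GEMDOS function numbers from the TBIOS call list (subset). */
enum {
    TBIOS_BYTE_TO_LINK   = 101,
    TBIOS_BYTE_FROM_LINK = 102,
    TBIOS_TEST_ERROR     = 111
};

/* Hypothetical trap binding: takes a function number and one argument
   and returns the value TBIOS leaves in D0. */
typedef long (*tbios_trap_fn)(int fn, long arg);

/* Send `len` bytes one at a time via ByteToLink.
   Returns how many bytes went out before the first timeout (-1 in D0). */
long tbios_send_bytes(tbios_trap_fn trap, const unsigned char *buf, long len)
{
    long i;
    for (i = 0; i < len; i++) {
        if (trap(TBIOS_BYTE_TO_LINK, (long)buf[i]) == -1)
            break;
    }
    return i;
}

/* Dummy stand-in for the real trap: accepts three bytes in total,
   then reports a timeout for every further call. */
static int fake_accepted = 0;

long fake_trap(int fn, long arg)
{
    (void)arg;
    if (fn == TBIOS_BYTE_TO_LINK && fake_accepted < 3) {
        fake_accepted++;
        return 0;
    }
    return -1;
}
```

On the real machine the dummy would be replaced by a thin wrapper around the compiler's GEMDOS call. With many ATARI C compilers something along the lines of gemdos(101, value) should work, but check that against your compiler's TOS bindings. Note that ByteToLink expects a word-sized parameter according to the list.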

Programs and demos

As ATARI never planned something like this card, there’s no ready-to-use software… it’s up to you to create miracles 😊
But compared to my 8bit Transputer adapters, there’s quite some stuff to start with:

💾 Visit the Atari Transputer Software repo at GitHub (most recent) or get this ZIP archive containing everything discussed below.

Basic Testing

Yes, literally, we’re testing if your Transputer is working correctly using a BASIC program called T_TEST.GFA – so right, it’s GfA Basic in this case. But in essence it’s nearly the same as the one used for my C64 or Apple II interfaces.
This little program checks if it can find a link-interface and a Transputer and, if so, which kind (16 or 32 bit). If that went OK, it does a little comms-speed test by reading 4KB from the Transputer and timing that.
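The arithmetic behind such a speed test is simple. As a sketch in C (not the GfA Basic code of T_TEST.GFA), assuming the elapsed time is counted in ticks of some timer, for example the ST's 200 Hz system counter; the function name link_bytes_per_sec is made up for this example:

```c
/* Transfer rate in bytes per second from a byte count and the number of
   timer ticks that elapsed. Returns 0 if no tick elapsed, i.e. the
   transfer was too fast to measure with this timer. */
unsigned long link_bytes_per_sec(unsigned long bytes,
                                 unsigned long ticks,
                                 unsigned long ticks_per_sec)
{
    if (ticks == 0)
        return 0;
    return (bytes * ticks_per_sec) / ticks;
}
```

For example, 4096 bytes read in 20 ticks of a 200 Hz timer (0.1 seconds) work out to 40960 bytes per second.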

Mandelbrot fractal

You knew that this had to be the first thing to be written 😜
There are two Transputer binaries…

TMANDEL.PRG – the evil, dirty, down-to-the-metal, direct-to-screen-writing version.
This is good for getting an idea of how fast data is being pushed to the Atari ST without much handling overhead.
As this writes to the screen directly, it only runs in “ST-High” resolution (i.e. 640x400x1).
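To show why direct screen writing is tied to one resolution, here is a minimal sketch of setting a single pixel in a 640x400 monochrome frame buffer, where each byte holds 8 pixels and the leftmost pixel sits in the most significant bit. This is not the actual TMANDEL code; the function name plot_mono and the constants are made up for the illustration.

```c
#define ST_HIGH_WIDTH   640
#define ST_HIGH_HEIGHT  400
#define ST_HIGH_STRIDE  (ST_HIGH_WIDTH / 8)   /* 80 bytes per scanline */

/* Set (on != 0) or clear (on == 0) one pixel in a 1-bit-per-pixel
   frame buffer of 640x400. Returns -1 for out-of-range coordinates. */
int plot_mono(unsigned char *screen, int x, int y, int on)
{
    unsigned char *p;
    unsigned char mask;

    if (x < 0 || x >= ST_HIGH_WIDTH || y < 0 || y >= ST_HIGH_HEIGHT)
        return -1;

    p = screen + (long)y * ST_HIGH_STRIDE + (x >> 3);  /* byte holding the pixel */
    mask = (unsigned char)(0x80 >> (x & 7));           /* leftmost pixel = bit 7 */

    if (on)
        *p |= mask;
    else
        *p &= (unsigned char)~mask;
    return 0;
}
```

The whole screen is 80 x 400 = 32000 bytes. Any other resolution has a different layout, so code like this would have to be rewritten for it.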

GEMMAN.PRG – The well-behaved GEM version.
It opens a window max’ed to the current resolution and starts plotting the fractal in 16 colors. This takes longer than TMANDEL, as it does quite a bit of GEM juggling before plotting a pixel…

Getting serious

So, this is the part for doing serious things with your Transputer(s) and specifically André Saischowa’s domain.
He not only ported all the needed INMOS tools like iserver to run all the available development tools from back in the day (OCCAM, C, etc.), but also ported the Helios server, i.e. the software which runs on the host (your ATARI) and communicates with the Helios kernel(s) running on your insane Transputer farm!
This is a good 75% of what the ATW800 offered – the missing 25% are the graphics, which ran on the Blossom chip and were only accessible by the Transputer.

That said you’ll currently find 2 folders in the archive:

  • C-Code – contains the Mandelbrot demos
  • Andres – the serious stuff containing
    • AUTO – the TBIOS driver and stuff needed during ATARI bootup
    • BIN – the INMOS tools like iserver as well as the always-needed ispy utility
    • D72UNI – contains the transputer hosted compiler environment based on d7205a (OCCAM) and d7214c (C-Compiler). Visit transputer.net for plenty of documentation on those. See the README in that folder.
    • HELIOS11 – well, that’s the Helios v1.1 distribution. It’s way smaller than the v1.3 and good for an initial try. You can later switch to v1.3.1 following these steps.

There you have it (for now) – the ATARI ST is therefore currently the third-best-supported host platform, after the PC running DOS or Windoze NT(!) and SPARCStations running Solaris 2.

Tto68k

The Tto68k project started with a classic “phone call doodling” situation… but instead of drawing strange patterns I was fiddling alternately with one of my Transputer TRAMs and a spare 68000 CPU I had lying on my desk.
At one point it dawned on me that the 68000’s classic 64-pin DIL package fits perfectly in-between a TRAM’s socket pins 😲.

Obviously this discovery immediately had to go into a project, which I called Tto68k. Actually it is a spin-off from the STG[A]TW project which I recently did for the Atari Mega-ST.
So it is fully compatible with everything developed for that card (minus the VGA stuff, obviously).

Three in a row…
Ahhh… a perfect fit!
15 MIPS topping the 68k’s 1 MIPS 💪

Where space allows, the PCB offers certain features:

  • 2 LEDs showing the Transputer status (running/error)
  • An external Link, compatible with the STGATW and my CPU-relocator. Thus you can connect to another TRAM on that one.
  • Dedicated 5V/GND pins to feed-in external power (if needed)
  • Version 1.1 will have two “multi-purpose” pins (see below)

So while the features are pretty basic compared to the STGATW, it has one advantage: The 68000 socket is system-agnostic. And I don’t mean just the different ATARI ST models (520, 1040, Mega) but other systems, too, e.g. the AMIGA, the entry Macintosh line etc. As some of them have more advanced bus management than the ATARI, I saved two of the CPLD’s pins as “multi-purpose” pins.
For example in the case of an AMIGA these could be used for the configuration chain (/CFGINn, /CFGOUTn).
While in the ATARI STs those will be used for TOS ROM decoding… or whatever comes to my/your mind.

All that said, this post is just an announcement for now.
As mentioned, I’m working on a Version 1.1 which will be much more usable, especially for systems other than the ATARI ST.