The Cube

Meet The Cube – this is the Transputer Power-House successor to the Tower of Power, which was a bit of a hacked frame-case and based on somewhat non-standard TRAM carriers with a max. capacity of just 24 size-1 TRAMs…

The Cube hardware

This time I went for something slightly bigger  😎 …A clear bow towards the Parsytec GigaCube within a GigaCluster.
The Cube uses genuine INMOS B012 double-hight Euro-card carriers, giving home to 16 size-1 TRAMs – Parsytec would call this a cluster and so will I.
Currently The Cube uses 4 clusters, making a perfect cube of 4x4x4 Transputers… 64 in total. Wooo-hooo, this seems to be the biggest Transputer network running on this planet (to my knowledge)
If not, there still room left for more  😯

Just to give you a quick preview, this is what ispy responds when ran against the Cube:

Using 150 ispy 3.23 | mtest 3.22  # Part rate Link# [ Link0 Link1 Link2 Link3 ] RAM,cycle  0 T800d-24 276k 0 [ HOST ... ... 1:1 ] 4K,1 1024K,3; [expand title="Display all 64 lines"]  1 T800d-25 1.7M 1 [ ... 0:3 2:1 3:0 ] 4K,1 2048K,3;  2 T800d-24 1.8M 1 [ ... 1:2 4:1 5:0 ] 4K,1 2048K,3;  3 T800d-25 1.8M 0 [ 1:3 6:2 5:1 7:0 ] 4K,1 2048K,3;  4 T800d-24 1.8M 1 [ ... 2:2 6:1 8:0 ] 4K,1 2048K,3;  5 T800d-25 1.8M 0 [ 2:3 3:2 8:1 9:0 ] 4K,1 2048K,3;  6 T800d-24 1.8M 2 [ ... 4:2 3:1 10:0 ] 4K,1 2048K,3;  7 T800d-24 1.8M 0 [ 3:3 10:2 9:1 11:0 ] 4K,1 2048K,3;  8 T800d-25 1.8M 0 [ 4:3 5:2 10:1 12:0 ] 4K,1 2048K,3;  9 T800d-25 1.8M 0 [ 5:3 7:2 12:1 13:0 ] 4K,1 2048K,3;  10 T800d-24 1.8M 0 [ 6:3 8:2 7:1 14:0 ] 4K,1 2048K,3;  11 T800d-24 1.8M 0 [ 7:3 14:2 13:1 15:0 ] 4K,1 2048K,3;  12 T800d-25 1.8M 0 [ 8:3 9:2 14:1 16:0 ] 4K,1 2048K,3;  13 T800d-25 1.8M 0 [ 9:3 11:2 16:1 17:0 ] 4K,1 2048K,3;  14 T800d-24 1.8M 0 [ 10:3 12:2 11:1 18:0 ] 4K,1 2048K,3;  15 T800d-25 1.8M 0 [ 11:3 ... 17:1 19:0 ] 4K,1 2048K,3;  16 T800d-24 1.8M 0 [ 12:3 13:2 18:1 20:0 ] 4K,1 2048K,3;  17 T800d-25 1.8M 0 [ 13:3 15:2 20:1 21:0 ] 4K,1 2048K,3;  18 T800d-25 1.8M 0 [ 14:3 16:2 ... 22:0 ] 4K,1 2048K,3;  19 T800d-25 1.8M 0 [ 15:3 22:2 21:1 23:0 ] 4K,1 2048K,3;  20 T800d-25 1.8M 0 [ 16:3 17:2 22:1 24:0 ] 4K,1 2048K,3;  21 T800d-25 1.8M 0 [ 17:3 19:2 24:1 25:0 ] 4K,1 2048K,3;  22 T800d-25 1.8M 0 [ 18:3 20:2 19:1 26:0 ] 4K,1 2048K,3;  23 T800d-25 1.8M 0 [ 19:3 26:2 25:1 27:0 ] 4K,1 2048K,3;  24 T800d-24 1.8M 0 [ 20:3 21:2 26:1 28:0 ] 4K,1 2048K,3;  25 T800d-25 1.8M 0 [ 21:3 23:2 28:1 29:0 ] 4K,1 2048K,3;  26 T800d-25 1.7M 0 [ 22:3 24:2 23:1 30:0 ] 4K,1 2048K,3;  27 T800d-24 1.8M 0 [ 23:3 30:2 29:1 31:0 ] 4K,1 2048K,3;  28 T800d-25 1.8M 0 [ 24:3 25:2 30:1 32:0 ] 4K,1 2048K,3;  29 T800d-25 1.8M 0 [ 25:3 27:2 32:1 33:0 ] 4K,1 2048K,3;  30 T800d-25 1.8M 0 [ 26:3 28:2 27:1 34:0 ] 4K,1 2048K,3;  31 T805d-20 1.7M 0 [ 27:3 ... 33:1 35:0 ] 4K,1 1024K,3;  32 T800d-24 1.8M 0 [ 28:3 29:2 34:1 36:0 ] 4K,1 2048K,3;  33 T800d-20 1.8M 0 [ 29:3 31:2 36:1 37:0 ] 4K,1 1024K,3;  34 T800d-24 1.8M 0 [ 30:3 32:2 ... 38:0 ] 4K,1 2048K,3;  35 T800c-20 1.8M 0 [ 31:3 38:2 37:1 39:0 ] 4K,1 1024K,3;  36 T805d-20 1.7M 0 [ 32:3 33:2 38:1 40:0 ] 4K,1 1024K,3;  37 T800c-20 1.6M 0 [ 33:3 35:2 40:1 41:0 ] 4K,1 1024K,3;  38 T800d-20 1.6M 0 [ 34:3 36:2 35:1 42:0 ] 4K,1 1024K,3;  39 T800d-20 1.7M 0 [ 35:3 42:2 41:1 43:0 ] 4K,1 1024K,3;  40 T800d-20 1.8M 0 [ 36:3 37:2 42:1 44:0 ] 4K,1 1024K,3;  41 T800d-20 1.7M 0 [ 37:3 39:2 44:1 45:0 ] 4K,1 1024K,3;  42 T800d-20 1.8M 0 [ 38:3 40:2 39:1 46:0 ] 4K,1 1024K,3;  43 T800d-20 1.8M 0 [ 39:3 46:2 45:1 47:0 ] 4K,1 1024K,3;  44 T800d-20 1.8M 0 [ 40:3 41:2 46:1 48:0 ] 4K,1 1024K,3;  45 T800d-20 1.8M 0 [ 41:3 43:2 48:1 49:0 ] 4K,1 1024K,3;  46 T800d-20 1.7M 0 [ 42:3 44:2 43:1 50:0 ] 4K,1 1024K,3;  47 T800d-20 1.8M 0 [ 43:3 ... 49:1 51:0 ] 4K,1 1024K,3;  48 T800d-20 1.8M 0 [ 44:3 45:2 50:1 52:0 ] 4K,1 1024K,3;  49 T800d-20 1.6M 0 [ 45:3 47:2 52:1 53:0 ] 4K,1 1024K,3;  50 T800d-20 1.8M 0 [ 46:3 48:2 ... 54:0 ] 4K,1 1024K,3;  51 T800d-20 1.8M 0 [ 47:3 54:2 53:1 55:0 ] 4K,1 1024K,3;  52 T800d-20 1.8M 0 [ 48:3 49:2 54:1 56:0 ] 4K,1 1024K,3;  53 T800d-20 1.8M 0 [ 49:3 51:2 56:1 57:0 ] 4K,1 1024K,3;  54 T800d-20 1.6M 0 [ 50:3 52:2 51:1 58:0 ] 4K,1 1024K,3;  55 T800d-20 1.8M 0 [ 51:3 58:2 57:1 59:0 ] 4K,1 1024K,3;  56 T800d-20 1.7M 0 [ 52:3 53:2 58:1 60:0 ] 4K,1 1024K,3;  57 T800d-20 1.8M 0 [ 53:3 55:2 60:1 61:0 ] 4K,1 1024K,3;  58 T800d-20 1.8M 0 [ 54:3 56:2 55:1 62:0 ] 4K,1 1024K,3;  59 T800d-20 1.8M 0 [ 55:3 ... 61:1 ... ] 4K,1 1024K,3;  60 T800d-20 1.7M 0 [ 56:3 57:2 62:1 63:0 ] 4K,1 1024K,3;  61 T800d-20 1.6M 0 [ 57:3 59:2 63:1 ... ] 4K,1 1024K,3;  62 T800d-20 1.8M 0 [ 58:3 60:2 ... 64:0 ] 4K,1 1024K,3;  63 T800d-20 1.8M 0 [ 60:3 61:2 64:1 ... ] 4K,1 1024K,3;  64 T800d-20 1.7M 0 [ 62:3 63:2 ... ... ] 4K,1 1024K,3;[/expand]

Here are some more figures:

  • 32 x T800@25Mhz/2MB  (my very own AM-B404 TRAMs)
  • 32 x T800@20MHz/1MB  (mainly TRAMs from MSC and ARADEX)
  • -> 96MB of total RAM
  • -> 70-130 MFLOPS (single precision)
  • ~800MIPS combined integer power
  • ~60Amps @5V needed (That’s 300W  😯 )

So we’re talking about 70-130 MFLOPS here – depending which documentation you trust and what language (OCCAM vs. Fortran) and/or OS you’re using. That was quite a powerhouse back in 1990 (Cray XM-P class!)… and dwarfed by a simple Pentium III some years later 😉
Just for to give you an comparison with recent hardware (Linpack MFlops):

Raspberry Pi Model B+ (700 MHz) ~40 DP Mflops
Raspberry Pi 2 Model B (1000 MHz – one core) ~134 DP Mflops
Raspberry Pi 3 Model B (1200 MHz – one core) ~176 DP Mflops

Short break for contemplation about getting old…

Ok, let’s go on… you want to see it. Here it is – the front, one card/cluster pulled, 3 still in. On the left the mighty ol’ 60A power supply:

Cube_Front_4x

Well, this is the evaluation version in a standard case, i.e. this is meant for testing and improving. I’m planning for a somehow cooler and more stylish case for the final version (read: Blinkenlights etc.).

And here’s the IMHO more interesting view… the backside. It shows the typical INMOS cabling.

CubeBack

As usual, I color coded some of the cables.
The green arrow points to the uplink to the host system to which The Cube is connected to. Red are the daisy-chained Analyse/Reset/Error (ARE) signals. The yellow so-called jumper-cables connect some of the IMSB004 links back into the boards network. And in the upper row (blue) four ‘edge-links’ of each board are connected to its neighbor.

This setup connects four 4×4 matrices (using my C004 dummies as discussed here)  into a big 4×16 matrix. Finally I will ‘wrap’ that matrix into a torus. Yeah, there might be more clever topologies, but for now I’m fine with this.

Building up power

For completeness, here’s a quick look at how things came together.

The 4 carriers/clusters with lots of size-1 TRAMs… upper right one is the C004-dummy test board (now also fully populated). Upper left is pure AM-B404 love <3

ITEM_futter

Fixing/replacing the broken power-supply (in the back), including the somewhat difficult search for a working cooling solution:

ITEM_repairing

The Cube software

Well there isn’t any specific software needed to run The Cube, but it definitely cries out loud for some heavily multi-threaded stuff.

So the first thing has definitely to be a Mandelbrot zoom. As usual, I used my very own version with a high-precision timer, available in my Transputer Toolkit.

Here’s the quick run in real-time – you can still figure out visually each Transputer delivering its result:

Other Transputer and x86 results of this benchmark can be seen in this post over here.

We need (even) more power, Igor!

So this is running fine – using internal RAM only. On the other hand, it seems that the current power supply has some issues with, well, the electric current.
When booting Helios onto all 64/65 Transputers which uses all of the external RAM, very soon some of them do crash or go into a constant reboot-loop.
By just reducing the network definition (i.e. not pulling any Transputers) to 48, Helios boots and runs rock-solid.
Because measuring the voltage during a 64-T boot shows a solid 5.08V on all TRAM-slots it most likely means the power supply either can’t deliver the needed amount of Amps (~60) or produces noise etc.  😥
So this is the next construction site I have to tackle.

To be continued…

3 thoughts on “The Cube”

  1. I’m really interested to have found this site!

    I got here from a puzzle game by Zachtronics called “TIS-100”, an esoteric assembly language programming game which tasks the player with programming the fictional TIS-100 microcomputer, the specs for which can be viewed here.

    I’d found the TIS-100’s esoteric and unusual architecture – a grid of multiple very small communicating parallel processors with an extremely minimal machine language – absolutely fascinating, and I only just realized that it’s barely fictionalized at all – it’s clearly directly inspired by Transputer systems (the node classes are even named T(number), just simplified and made obtuse and esoteric enough that programming it to do anything useful became a tricky puzzle.

    Considering how much fun I had programming the TIS-100, it’s very pleasantly surprising to know that a more practical relative actually exists, in the form of real hardware, and has a small community of enthusiasts!

    1. Thanks. It’s still not finished yet and supplying the system with sufficient power is still an issue. But I’m working on it.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.