Tag Archives: C64

The T2C=

It had to be done… and now, 12 years later 😱, it is done:
Finally the T2C64 was poured into a proper PCB and some bells’n’whistles had been added – so it’s now on par with the T2A2 for the Apple II series. Say hello to the T2C=!

TLDR;

Here’s a quick intro for those being to lazy to follow the link to the T2C64.

This card enables your Commodore 8-bit computer to communicate with a Transputer Module piggybacked onto the card.
A Transputer is a 32bit RISC(ish) CPU from the 80’s that has the unique ability to connect to other Transputers by a very simple 2-wire protocol making it possible to create large, powerful computing networks – at least by 1980+ measures 😉

What can I do with it?
How to talk to the Transputer?
Can I do something useful with that?
How fast can I move data back and forth?
Ok, how much?

After many years the Commodore 8-bit bug had bitten me again and it was due time to put some love into my C64 Transputer interface.
But while at it, I thought it would be handy to use this card not just with my C64 but also on all other Commodore machines featuring an expansion port.

This led to the (to my knowledge) first 8-bit Commodore ‘flipper card’, i.e. it has a port connector on each end. One for the C64 & C128 and one for the C264 family, namely the C16, C116 and Plus/4. Yes, it works with all of them. Pull it from your C64, flip it 180° and plug it into e.g. your Plus/4. Cool, huh?
So here’s a quick feature list:

  • Edge connectors to connect to the Commodore
    • C64
    • C128
    • C16/C116
    • Plus/4
  • Each edge connector offers 2 I/O address ranges to be set by a jumper (0xDE00/0xDF00 and 0xFD90/0xFDF0)
  • Offers two TRAM (TRAnsputer Module) slots to connect either 2 Size-1 or one Size-2 TRAM
  • External Transputer-Link connector to connect the T2C= to larger external Transputer networks (pinout is the same as on the T2A2) – not populated on the pictures here.
  • The data-bus is fully buffered to prevent interference with other cards when used in e.g. an expander
  • 3.3V CPLD used to reduce power-consumption as much as possible.

Here’s the card in full glory… without a TRAM plugged in:

….and with one size-1 TRAM which itself provides a 32bit T800 Transputer and 128K RAM. The 2nd slot is still free.

TRAMs came in a wild range of variations. Be it CPUs used on them and/or the amount of RAM. But there also been peripherals like SCSI controllers or graphic cards – check my little TRAM page if you like to get an idea.
Yes, TRAMs are quite vintage and thus hard (or expensive) to get… but don’t despair… I’ve designed my own and also have some old ones in stock – probably enough to serve the hand-full of interested nerds 😉

What can I do with these?

I knew that was coming 😜
Well it mostly depends on you. The T2C= is an accelerator running its own code in its own RAM and can exchange data with your Commodore 8-bit machine – everything is possible.

All code examples and sources are available in this archive.
Commodore files are in a .D64 disk image.

Personally I always have the initial reflex to run a Mandelbrot fractal on everything’s slightly capable to do so. Most of the time, that’s where my euphoria ends and my project-ADHD kicks in… but that doesn’t stop you from having cool ideas.

Technically this setup isn’t much different from slapping a Raspberry Pi to your Commie and let that do stuff… but there’s something I’d like to call the “5 connoisseurs C’s” which might not be everyone’s cup of tee but very tempting to others:

  • Contemporary: Transputers are from the same era like your Commodore machine while being much more powerful – we’re talking about ~15MIPS here.
  • Completely different: Transputers are natively programmed in OCCAM, a very interesting, different language than the one you might be used to.
    That said, no worries, there are C, Pascal and Fortran compilers, too. Here’s a page offering a little “SDK” I created – it’s a VirtualBox image coming with everything you need to start coding.
  • Connected: Transputers are made to be networked into a parallel network… making your well programmed application running even faster as benchmarks show.
  • Challenging: “Well programmed” means wrapping your genius brain around multi-threaded, parallel paradigms or use the fast 2 or 4K on-chip RAM the most clever way.
  • Communicate: And finally, find a clever way to communicate with the host (i.e. your Commodore) and vice versa.

Ideas for using it could be raytracing, do complex calculations, heavily compress/manipulate data, use it as a simple storage (“Stupid-REU”) or write a Helios server for it and use your C= as a terminal [Helios is a UNIXish OS running on 1 to infinitive Transputers]

A final word of warning: While the T2C= uses very little power, a Transputer (and the RAM on a TRAM) does use quite some juice.
Depending on the TRAM this can be as little as 500mA up to 1A – which means your power-supply should be a stronger one.

Communication

Let’s start with the most simple and actually useful code:
Detect a connected T2C=/Transputer and check if it’s working correctly. This code was already shown on my T2C64 posts but now it’s enhanced for newly added machines and runs in BASIC V2 as well as V3.5 or V7.

After telling the base-address it does the following:

  • Init/Reset the Transputer to a sane state
  • Read & display the statuses of the Link interface
  • Write some data into the Transputers RAM and read it back
  • Finally, send a small program to the Transputer which makes it possible to find out its model (16bit T2xx, 32bit T4xx/T8xx or just a C004 programmable link switch)

So here’s the new TDETECT code:

100 SY=peek(65534):print chr$(147);"This seems to be a";
110 if peek(1177)=63 then poke1177,62:sy=peek(65534):poke1177,63
120 if sy=72 then print " c64": goto 160
130 if sy=23 then print " c128": goto 160
140 if sy=179 then print " plus 4 or c16":goto 160
150 print"n unknown model";print
160 print "select T2C= base address"
170 print "1: c64/c128 $de00 (56832, default)"
180 print "2: c64/c128 $df00 (57088)"
190 print "3: c264 $fd90 (64912, default)"
200 print "4: c264 $fdf0 (65008)"
210 print "5: enter your own"
220 input m
230 if m=1 then ba=56832: goto 290
240 if m=2 then ba=57088: goto 290
250 if m=3 then ba=64912: goto 290
260 if m=4 then ba=65008: goto 290
270 if m>5 goto 160
280 input "base address:";ba
290 print"initializing transputer"
300 do=ba+1:rem data out
310 is=ba+2:rem in status
320 os=ba+3:rem out status
330 re=ba+8:rem reset/error
340 an=ba+12:rem analyze
350 rem ------------------
360 poke re,1
370 poke an,0
380 poke re,0
390 rem clear i/o enable
400 poke is,0
410 poke os,0
420 print"reading statuses"
430 print"i status: ";(peek(is)and1)
440 print"o status: ";(peek(os)and1)
450 print"error: ";(peek(re)and1)
460 print"sending poke command"
470 pokedo,0
480 print"o status: ";(peek(os)and1)
490 :
500 print"sending test-data to t. (12345678)"
510 poke do,0:poke do,0:poke do,0:poke do,128
520 poke do,12:poke do,34:poke do,56:poke do,78
530 print"i status: ";(peek(is)and1)
540 :
550 print"reading back from t."
560 poke do,1:rem peeking
570 poke do,0:poke do,0:poke do,0:poke do,128
580 print peek(ba);peek(ba);peek(ba);peek(ba)
590 :
600 dimr(4)
610 print"sending program to transputer..."
620 forx=1to24
630 readt:poke do,t
640 wait os,1
650 nextx
660 print:print"reading result:"
670 c=0
680 n=ti+50
690 ifc=10 goto 760
700 if ti>n then ee=ee+1:if ee=10 goto 760
710 if(peek(is)and1)=0 goto 700
720 r(c)=peek(ba)
730 c=c+1
740 goto 680
750 rem ------------------------
760 if c=1 then print"c004 found"
770 if c=2 then print"16 bit transputer found"
780 if c=4 then print"32 bit transputer found"
790 data 23,177,209,36,242,33,252,36,242,33,248
800 data 240,96,92,42,42,42,74,255,33,47,255,2,0

Do something useful?

So now that we have detected a connected Transputer on our Commodore, it should do something useful like… adding numbers.
While this is way beneath his dignity, it’s a good example of uploading code to the Transputer and how to send and read data.

For this, I’d like to redirect you to the 2nd code example I’ve posted for the T2C64…

And finally in this former post I coded a Mandelbrot fractal (Video inside! 😉) for the C64 using cc65 and the TGI graphics library which calculates and displays the initial fractal within a minute or so.

Now having the whole family of C264 machines added, I thought it would be nice to have a demo for them, too.
So, just because it can do graphics out of the box,  I wrote the Mandelbrot “frontend” in BASIC 3.5. It worked but it was brutally slow…it takes like 10 minutes or so to get this screen 🐌

That is – of course – because BASIC is darn slow in doing the IO and plotting. Looking at one of the above examples, reading a byte from the Transputer means read a byte, set “pen” to the next coordinate, decide if to plot or not, repeat – in code (provided in the D64 disk image) this looks like that:

390 for y=0 to 199
400 :for x=0 to 319
420 ::px=peek(ba)
430 ::if px=32 then draw 0,x,y:else draw 1,x,y
440 :next x
450 next y

Because it’s so slow, I even didn’t need to check the input-status of the link-interface as the Transputer delivers the data much quicker than BASIC can say “next”…

This of course will be the ultimate show-stopper. What’s the sense of such a fast number cruncher, if you can’t get the data out of it fast enough?

Speed?

Mhh, so how long does it take to (just) read data from the T2C=?
Let’s start with BASIC to have milestone. This is “BAS-SPEEDTEST”, a very simple benchmark.
It loads a tiny Program into the Transputer which makes him spitting out an endless loop of counting from 1 to 10. Then we read the amount of 4KB and stop the time on that.

NB: As seen on the examples above, there’s an automatic handshaking in the way that the C012 link-interface chip on the T2C= sets a flag (Out-Status) each time there’s a byte ready to be fetched. But BASIC is so slow, that there’s always new data available the next round reading.

100 ba=56832:rem Adjust your base accordingly
110 dd=ba+1:rem data out
120 is=ba+2:rem in status
130 os=ba+3:rem out status
140 re=ba+8:rem reset/error
150 an=ba+12:rem analyze
160 rem ------------------
170 poke an,0
180 poke re,0
190 poke re,1
200 for d=1 to 500:next d
210 poke re,0
220 print"sending program to transputer..."
230 forx=1to33
240 readt:pokedd,t
250 waitos,1
260 nextx
270 print"reading incoming data..."
280 zeit=ti
290 for l=1 to 4096
300 in=peek(ba)
310 next
320 print"time for 4k:";(ti-zeit)/60
370 rem --- for the transputer
380 data 32,181,36,242,33,248,36,242,33,252,37,247
390 data 34,249,70,33,251,36,242,74,251,96,7,1,2,3
400 data 4,5,6,7,8,9,10

That showed it clearly… Basic is an IO-sloth.
Even without waiting for the Input-State ready it took the C64 16.5 seconds to read 4KB – nearly twice as long if we check for the input-status.

Machine without WAIT IS with WAIT IS
C64 16.5 29.4
C128 24.28* 40.5
Plus/4 19.8 33

*) It’s strange, that Basic V7 is even slower – investigation is ongoing
Talking to “the 128 Master” (Johan Grip) the mystery was solved.
Basic 7 also does a “long” fetch through extra vectors and code in ram. That does add quite a bit of overhead.
You could say that BASIC 7 has a “bad peek performance” 🙂

More speed, please!

Ok, let’s use something more mature… like the cc65 creating nice code for all our beloved Commodore machines.

[expand title=”This is a longer one… so please expand” ]

#pragma static-locals(1);
 
#include <stdlib.h>
#include <time.h>
#include <conio.h>
#include <peekpoke.h>
#include "trproc.h" // that's in the provided archive
 
#define TOBEREAD 4096 // how many bytes should be read
 
static char tcode[33] = {
0x20, 0xB5, 0x24, 0xF2, 0x21, 0xF8, 0x24, 0xF2, 0x21, 0xFC, 0x25, 0xF7,
0x22, 0xF9, 0x46, 0x21, 0xFB, 0x24, 0xF2, 0x4A, 0xFB, 0x60, 0x07, 0x01,
0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A
};
 
int main (void)
{
clock_t t;
unsigned long sec, kbps;
unsigned sec10;
 
int i;
char onechar;
 
clrscr ();
 
#if defined(__C64__) || defined(__C128__)
cprintf ("Expecting T2C= at 0xde00\r\n"); 
#elif defined(__PLUS4__) || defined(__C16__)
cprintf ("Expecting T2C= at 0xfd90\r\n"); 
#endif
 
/* Init Transputer - fixed to plus/4 dfeault for now */ 
init_t(); 
 
/* upload Transputer code */
cprintf ("Sending code to Transputer\r\n");
puttr(tcode, (sizeof(tcode)));
cprintf ("start reading %d bytes...", TOBEREAD);
t = clock ();
 
/* reading X KB byte by byte*/
for(i=0; i < TOBEREAD; i++) {
gettrchar(onechar);
}
cprintf ("done\r\n");
t = clock () - t;
 
/* Calculate stats */
sec = (t * 10) / 50;
sec10 = sec % 10;
sec /= 10;
kbps = TOBEREAD / sec;
 
/* Output stats */
cprintf ("\r\nDuration: %lu.%us (%lu byte/s)\n\r", sec, sec10, kbps);
 
/* Done */
return EXIT_SUCCESS;
}

[/expand]

But this time it just took 4.4/4.6/3.6 seconds on a 64/128/+4! 🏍💨 Four times faster than BASIC.

Are we there yet?

That looks promising and there’s still a C-compiler which we can optimize… read: replacing it with assembly code super-power.

For that I wrote a little macro-library for KickAssembler. Any other assembler will do, too, of course.
Besides the Transputer initialization and detection stuff there are macros for reading and writing a single byte, up to a “page” (256bytes) and the full 64K using two zero-page adresses… this is how we read the 4KB in the benchmark:

.label base = $de00 // define according to setting
.label inreg = base
.label outreg = inreg + 1
.label instat = inreg + 2
.label outstat = inreg + 3
.label reset = inreg + 8
.label analyse = inreg + $c
.label errflag = reset
 
		* = $C000 // or wherever you like
start:		
		inittr()  // macro init_transputer
 
		lda #$2E  // print a "." at 1/1 for debugging ;)
		sta $0400
 
		puttr_1page(bench, 34)  // upload the benchmark code
 
		lda #$00  // Set destination pointer to base ($8100)
		sta $FB   // in zero page
		lda #$81
		sta $FC
 
		gettr_big($FB, $1000) // get 4KB data and write it to $8100
 
		rts
 
		// code for benchmarking & testing (33 bytes)
		// The first byte is always 'sizeof(code)'
  bench:	.byte $21, $20, $B5, $24, $F2, $21, $F8, $24, $F2, $21, $FC, $25, $F7
		.byte $22, $F9, $46, $21, $FB, $24, $F2, $4A, $FB, $60, $07, $01
		.byte $02, $03, $04, $05, $06, $07, $08, $09, $0A

“Wrapped” as a SYS call into a BASIC program to stop the time – including the Transputer code upload and checking the input-status – this code takes 0.5 seconds to read 4KB and even with writing the read data to a defined memory area! 🚀
That’s 33 times faster than BASIC and still 8 times faster than cc65.

Getting one?

Still with me and you’re really interested in getting one of these?
Please go through this checklist first:

  • You’re aware that there’s no software for it yet
  • You’re aware that you have to code for the Transputer and are up to learning new things
  • You also need to write code on the Commodore side – assembly needed for maximum speed
  • Besides the T2C= you will need a TRAM. So you need to own/purchase one, too.

If you can answer all of them with “Yes”, “fine with me” and/or “sure!” I can provide you with a T2C= for €40 plus shipping.

I also have TRAMs available in different configurations:

  • My own design, the AM-B404  (size-1, 2MB SRAM): 45€
  • Various manufacturers: size-1, 1MB DRAM: Ask me.
  • “bargain offer”: original INMOS IMS-B404 (size-2, 2MB): 25€
    (given their size, they will clog your T2C= completely)

available CPUs are:

  • T425-20: 7€
  • T800-20: 12€
  • T800-25: 20€

For example, if you like to have a T2C=, my AM-B404 TRAM and a 20MHz T800 that would be 40 + 45 + 12 = 97€ plus shipping

Shipping with tracking is
European Union 9€  (Just in case, for Germany it’ll be 6€ as “Paket”)
UK/Switzerland 13€
USA 15€

⇒ drop me a mail tonobody likes SPAM
(Sorry, you have to type that into your mail-client – nobody likes SPAM, so do I)

3rd sample and video

And now something which you knew it would come: Mandelbrot time! 😉

I don’t want to bore you with all the details before you had it seen in action… so here we go:

“Man, that was brilliant! And even you had a lot of geek-babble in there, I want to know more!”
Ok Timmy, let’s go into detail…

Like I said in the video, the Transputer is finally doing something for real, he’s actually doing the most of the work, crunching through a 320x200x8 Mandelbrot, 32 iterations in double-precision floating point. The code itself was written 1988 in OCCAM by Neil Franklin and is available on his page.
This shows the general beauty of Transputers: If the code is written flexible enough to fit into any topology, it’ll run on any platform!

That said, I ran into a certain limitation on the C64. The Transputer mandelbrot executable expects the initial data (resolution, coordinates, iterations) to be send in a specific order and format. While the order isn’t the problem, the format is: The coordinates have to be doubles (C-lingo i.e. 64bit IEEE 754 compliant float). The C-Complier I’ve used for this demo (CC65) doesn’t now a flying s*** about floats or even doubles.
So to get that demo done ASAP I tricked myself a bit and used the same technique we’ve seen in my 1st demo when POKEing something the Inmos-way:

To get the coordinates for left (-2.0), right (1.0), top (1.125) and bottom (-1.125) over to the Transputer they had to be converted into 64bit IEEE 754 format, ordered into little-endianess and finally put into an array like these:

static char left[] = {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xc0}; //-2.0
static char right[] = {0x00,0x00,0x00,0x00,0x00,0x00,0xf0,0x3f}; // 1.0
static char top[] = {0x00,0x00,0x00,0x00,0x00,0x00,0xf2,0x3f}; // 1.125
static char bottom[] = {0x00,0x00,0x00,0x00,0x00,0x00,0xf2,0xbf}; //-1.125

Yes, that’s a bit awkward, but it was OK to get a fast start. The ‘problem’ with this quick-hack is that the demo is pretty static, i.e. no zooming into the Mandelbrot. If you know of a quick way to create IEEE 754 compliant doubles from a long (which is the biggest floating point variable CC65 can handle, so StringToDouble() isn’t an option here) I’m happy to hear from you.
Of course I could have the Transputer do the typecasting but in this case, as part of a demo, I wanted to keep the original binary untouched.

In the video you saw (or didn’t because of the blurry picture) that the timer printout was about 70s… and as said, normally it takes about 60s to complete the fractal – and it did in the video, too! Watch the video-timer or check with a stopwatch. What I’ve forgot to take out from the timing was the actual upload of the code into the Transputer.
The Transputer binary is in this case a bigger array in the C-source, so it’s not being loaded from floppy but directly pushed to the Transputer after it was initialised. This takes some extra time which also went into the stopwatch timing… I’ll correct that in a later version.

“Later version” is a good catchword. If this wouldn’t be just a demo for now, there are obviously plenty of ways to optimise things:

  • First of all one should take off the burden of converting the colors from the C64 and let the Transputer do that.
  • My second idea would be to reduce the communication overhead (polling) by having the Transputer to render the whole screen into his own RAM and when done have it ‘pumped’ down to the C64
  • Yes, DMA would be cool but that’s not possible (yet)

Ok, that’s about it for now. The T2C64 is still in its prototype stage and I can image many more cool things to add… but first I will have a ‘proper’ circuit board being made.

Final words: Don’t get too excited about the acceleration of the C64… it’s not accelerated at all. It’s more like a co-processor attached to it. And even then, you’ll need something really calculation-intensive to justify the time you’ll loose due to communication between the C64 and the Transputer. A single square-root for example wouldn’t make sense at all. 100 sqrts in one go would certainly do.

Of course adding another linkOut/In to the T2C64 to get more Transputers involved into the calculation would be the final step. This is planned for the next version of the hardware but the bigger part of the work would be a complete rewrite of the Mandelbrot code to have it broken down to parts being run in parallel on each Transputer… which closes the loop to today where programmers are trying to wrap their brains around multithreaded programming. 22 years after the first Transputer was released 😉

2nd code example

This example actually puts the Transputer to some sort of use… well, it’s still way beneath him but it does its job as an example.
Two tasks: a) Read the Transputer executable from disk and b) exchange data with the Transputer.

Here we go:

10 PRINT"INIT TRANSPUTER"
20 POKE 56840,0
30 POKE 56844,0
40 POKE 56840,1:POKE 56840,0
50 FOR W=0 TO 1000:NEXT W
60 POKE 56834,0
70 POKE 56835,0
80 REM READ STATI
90 PRINT "I STATUS: ";(PEEK(56834)AND1)
100 PRINT"O STATUS: ";(PEEK(56835)AND1)
110 PRINT"ERROR: ";(PEEK(56840)AND1)
140 PRINT"O STATUS: ";(PEEK(56835)AND1)

Up to here everything should be clear. Init Transputer, read stati -just to be sure.
Now we’re reading the executable (“GRAB”) from floppy and poke the bytes to the Transputer as they come flying crawling in:

150 DIM R(4)
160 PRINT"SENDING PROGRAM TO TRANSPUTER..."
170 REM READ TRANSPUTER BIN FILE
180 OPEN 1,8,2,"0:GRAB,S,R"
190 FOR I=0 TO 1 STEP 0
200 IF ST=64 GOTO 240
210 GET#1,A$:A=ASC(A$+CHR$(0))
220 POKE 56833,A
230 NEXT I
240 CLOSE 1

Ok, the Transputer binary expects us to tell it the number of integers we’re about to send to it (AN), the we send these (X) and finally re-read the results… which should all be X+7.

250 INPUT"COUNT:";AN
255 POKE 56833,AN
260 FOR AA=1 TO AN
270 PRINT"ENTER ";AA:INPUT X
280 POKE56833,X
290 NEXT AA
295 PRINT"GETTING";PEEK(56832);" RESULTS"
300 FOR BB=1 TO AN
310 PRINTPEEK(56832);
320 NEXT BB

Some final words about Transputer executables. There are many kinds depending which development system was used. Some need special loaders or bootfiles before the actual binary will be send to the Transputer. A ‘raw’ upload is only possible with binaries mostly having the extension of “.btl” (BooTfromLink). AFAIK only the OCCAM SDK and the Inmos C-compiler ‘icc’ will create those executables.

For those geeks who actually (still) do know their OCCAM, here’s the source of GRAB:

PROC first(CHAN OF BYTE in, out)
  #USE "hostio.lib"

  BYTE data.in, data.out :
  INT count, temp :
  [1000]BYTE array :

-- {{{ start main SEQ

  SEQ
    in ? data.in
    count := (INT data.in)
    SEQ i=0 FOR count
      in ? array[i]                   

        -- {{{ inner loop
        
            SEQ i=0 FOR count
              SEQ
                temp := (INT array[i])
                temp := temp + 7
                array[i] := (BYTE temp)
        
        -- end of inner loop
        -- }}}
                
    out ! (BYTE count)
    SEQ i=0 FOR count
      out ! array[i]          
    in ? data.in       -- debug only
    CAUSEERROR()

-- }}}
        
:

1st code example

For the fun of it, here’s my first Commodore BASIC v2 program which does 3 things.

  1. Line 1-50: Initialize the Transputer and set/get its status.
  2. Line 69-110: Write (poke) some data into T’s memory and read (peek) it back.
  3. Line 128-240: Send a little program to the T and read its result.

10 PRINT"INIT TRANSPUTER"
11 POKE 56840,1:POKE 56844,0:POKE 56840,0
20 REM CLEAR I/O ENABLE
21 POKE 56834,0: POKE 56835,0
30 REM READ STATI
31 PRINT "I STATUS: ";(PEEK(56834)AND1)
35 PRINT"O STATUS: ";(PEEK(56835)AND1)
40 PRINT"ERROR: ";(PEEK(56840)AND1)

This is the Transputer POKE command (0) which tells the T that the next 4 bytes are an address followed by 4 bytes of data:

45 PRINT"SENDING POKE COMMAND"
46 POKE 56833,0
50 PRINT"O STATUS: ";(PEEK(56835)AND1)
58 :

Mind that were little endian and BASIC is using decimal numbers. So we’re POKEing 0,0,0,128 which is 0x00, 0x00, 0x00, 0x80 in hex which again is the 32bit address of 0x80000000.

59 PRINT"SENDING DATA TO T."
60 POKE56833,0:POKE56833,0:POKE56833,0:POKE56833,128
61 POKE56833,12:POKE56833,34:POKE56833,56:POKE56833,78
70 PRINT"I STATUS: ";(PEEK(56834)AND1)
79 :

Now reading back (PEEK) from 0x80000000 what we just wrote:

80 PRINT"READING FROM T."
90 POKE56833,1:REM PEEKING
100 POKE56833,0:POKE56833,0:POKE56833,0:POKE56833,128
110 PRINTPEEK(56832);PEEK(56832);PEEK(56832);PEEK(56832)

These 4 sequential PEEKs print 12 34 56 78. Yay! A very simple Ram-disk if you want.

Now something more serious. A very small programm is sent to the T and executed. Instead of peek and poke (1/0) the initial ‘command’ is the length of the program (23 in this case), so the T starts executing after he received the 23rd byte.
The program itself is trivial, after some initialisation it writes 0xAAAA0000 to link0out. The use of it is very simple. If the Transputer is working correctly the first 2 bytes will allways be “0xAA”, if it is just a 16bit Transputer (T2xx) there won’t be more than 0xAAAA. A 32bit Transputer returns the full 0xAAAA0000. Voilá, that’s a simple T-detection.
To handle timeouts end error conditions I had to do some ugly things in BASIC. Sorry, don’t blame me, BASIC v2 is just… very basic.

120:
128 DIM R(4)
129 PRINT"SENDING PROGRAM TO TRANSPUTER..."
130 FOR X=1 TO 24
140 READ T:POKE56833,T
150 WAIT 56835,1
160 NEXT X
170 PRINT:PRINT"READING RESULT:"
175 C=0
180 N=TI+50
181 IF C=10GOTO 220
189 IF TI>N THEN ER=ER+1:IF ER=10 GOTO 220
190 IF (PEEK(56834)AND1)=0 GOTO 189
195 R(C)=PEEK(56832)
200 C=C+1
210 GOTO 180
211 REM ------------------------
220 IF C=1 THEN PRINT"C004 FOUND"
230 IF C=2 THEN PRINT"16 BIT TRANSPUTER FOUND"
240 IF C=4 THEN PRINT"32 BIT TRANSPUTER FOUND"
1000 DATA 23,177,209,36,242,33,252,36,242,33,248
1001 DATA 240,96,92,42,42,42,74,255,33,47,255,2,0

Next up a bit more useful example, reading the Transputer executable directly from a binary file instead of having it in-line.

T2C64

So here we go, C64 first. Some moons ago I was at least midwife if not the mother of the 8BitBaby manufactured by the fairies and elves over at individual computers.

This little card features 4 slot-connectors to the most used home-computer systems as well as a CPLD… and makes a perfect basis for my first C64 “cartridge” ever 😉

The schematic was taken from “Das Transputerbuch” by U. Gerlach and being put into the CPLD saving lots of TTL chips. Actually all you need then is an Inmos C012 or C011 link-interface chip, a 5MHz oscillator and some resistors and caps.
I refrained from designing a complete Transputer system but went the easy route by interfacing to a TRAM. A TRAM is a complete “Transputer system on-a-stick”, like those used in my Tower of Power. All you need for a TRAM is an interface to your system. For the C64 you would need the thing I’ve called T2C64 (aka “Transputer to C64”)

The T2C64 prototype in its full glory:

T2c64

A short overview what’s on the card:

  • The left edge is the connector to the C64s expansion slot (ignore the other edges).
  • The square black chip is the Altera CPLD mainly containing the bus-interface and the Transputer Analyze/Reset/Error signal handling. Ah, and it controls the 2 LEDs.
  • The long black chip above the CPLD is an Inmos C012. This guy interfaces an 8bit parallel bus to the serial Inmos link ‘bus’.
  • The silver thing is a 5MHz oscillator to clock the C012.
  • And finally the longish module at the top edge is a TRAM with 128KB RAM and a mucho-macho T800 Transputer.

The CPLD maps the C012 register into the memory area of 0xde00, using just 6 addresses:

  • Data in       0xde00 (56832)
  • Data out     0xde01 (56833)
  • in-status     0xde02 (56834)
  • out-status   0xde03 (56835)
  • reset/error  0xde08  (56840)
  • analyse      0xde0c  (56844)

So programming it pretty easy, still, you have to read some Inmos manuals about the protocol etc… that said the next chapters will show you some code examples I’ve slapped together just to test the hardware. All sources and binaries (.D64 image) are available in a zip file over here, obviously most of that does not make sense without having a T2C64…

Read on in the next posts for demo-code and even a video. “uuh, videos!”