This might also be of interest: I have created a SoC/CPU on the iCEbreaker. It goes further than the hardware alone, as it comes with a monitor program, simulator, assembler and even the beginnings of a C-like compiler (all in Python), and so far it has been enormous fun implementing all that.
I am only a beginner of course (this being my first real FPGA project), but I did my best to document everything in a blog/wiki and am working on better documentation of the design.
Thanks for sharing this. I’m getting a board soon so it’s good to see how others are getting on with it. Any surprises or gotchas that you ran into when getting it running on the actual board?
I saw a blog post mentioning 2-cycle delays on the block RAM that threw off your CPU state machine. I remember hitting the same hurdle a long time ago when I was building a custom CPU and wondering what was wrong with the state machine.
My mental model of this is something like: in cycle 1, the block RAM address input is stored to a register in your own module; in cycle 2, the data is stored in the block RAM output register (“RDATA” according to the Lattice PDF); in cycle 3, RDATA or some data derived from it can be assigned to whatever register in your module. I could get away with skipping cycle 2 on an asynchronous RAM, but not on the synchronous RAM in the FPGA I had. I just treat it as an extra pipeline stage, with throughput unaffected. Happy to be corrected since I’m not an expert on this either.
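To make that mental model concrete, here is a small Python sketch of the three register stages. It is only a behavioural toy, not a model of the actual Lattice primitive: the class name `SyncReadPipeline`, the register names `addr_reg` and `result`, and the memory contents are all made up for illustration. The one thing it tries to capture is that every register updates from the old state on the same clock edge.

```python
class SyncReadPipeline:
    """Toy model of a registered address, a synchronous RAM read, and a capture register."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.addr_reg = 0  # hypothetical address register in the user's module
        self.rdata = 0     # block RAM output register ("RDATA")
        self.result = 0    # hypothetical user register that captures RDATA

    def clock(self, next_addr):
        """Simulate one rising clock edge; all registers sample the OLD state."""
        new_result = self.rdata                # stage 3: capture RDATA
        new_rdata = self.memory[self.addr_reg]  # stage 2: RAM reads at the registered address
        new_addr = next_addr                   # stage 1: register the requested address
        self.addr_reg, self.rdata, self.result = new_addr, new_rdata, new_result
        return self.result


# Assumed memory contents: location i holds 100 + i.
pipe = SyncReadPipeline([100 + i for i in range(8)])

# Request a new address on every cycle.
outputs = [pipe.clock(addr) for addr in [3, 4, 5, 6, 7]]
print(outputs)  # [0, 100, 103, 104, 105]
```

The value for address 3 only shows up in `result` on the third clock edge, and the first two outputs are leftovers from the reset state. After that, one new value arrives per cycle, which is the "extra pipeline stage, throughput unaffected" behaviour. A state machine that reads `result` one or two cycles early would see stale data.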
Apart from getting the info from all the Lattice datasheets together (and sorted correctly between my ears), it was pretty straightforward. Because I am more or less completely new to the whole FPGA thing, I actually had more issues getting to grips with Verilog than with the actual hardware, and I guess that was also what tripped me up when looking at the timing for those block RAMs.
Now that I know how to set up test benches, in hindsight I guess I could have saved myself some headaches by investing a bit more time upfront in creating them.
The only thing I do not have working right now is the DSP in adder mode (the inference for multiplication works fine, but whatever I do, I can’t get the SB_MAC16 primitive to add). This might be a toolchain thing or something I don’t understand yet. We’ll see, that’s the fun part.
So basically all my issues are software related; the board itself is great.