Designing a CPU in VHDL, Part 1: Rationale, Tools, Method (domipheus.com)
88 points by luu on June 19, 2015 | 43 comments



I'm also an (ex?) compiler engineer and I've played a little with FPGAs and computer architecture. In addition to the obligatory processor design, I recently created an open-source place and route tool for iCE40 FPGAs:

https://github.com/cseed/arachne-pnr

Together with Yosys (a Verilog synthesis tool):

http://www.clifford.at/yosys/

and the IceStorm bitstream creation tools:

http://www.clifford.at/icestorm/

it provides a full Verilog-to-bitstream open source toolchain for the iCE40 FPGAs. There is also a low-cost (~$21) USB development board:

http://www.latticesemi.com/icestick

Unfortunately, this toolchain doesn't support VHDL, so I can't try out the OP's TPU.


If someone would like to try VHDL or Verilog but doesn't have an FPGA board, they could try EDA Playground: http://www.edaplayground.com/x/Cs2 (a 32-bit ALU)


This is amazing! I've been looking for an online HDL environment for learning and trying out ideas on the fly with no IDE at hand. Thanks for the link!


^_^

I have plans to eventually build a VHDL implementation of the TR3200 CPU with it.


My best advice: synthesize early and often, and spend the time to poke around in the synthesis schematic viewer - Webpack still includes this I believe. It's a great way to compare what you wrote in code to the logic you intended to implement in your mind's eye (or better yet, your notebook).


And read the design guidelines from the FPGA vendor of the device you are targeting. Xilinx, Altera (Intel), Microsemi and Cypress all have different rules for mapping things like memories, write enables etc.

Xilinx is happy to not reset registers, Altera will generate a bigger design if reset is not stated in the code.
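To make the distinction concrete, here is a minimal Verilog sketch of the two register styles (module and signal names are made up for illustration):

```verilog
module reset_styles (
    input  wire clk,
    input  wire rst,
    input  wire d,
    output reg  q_noreset,
    output reg  q_reset
);
    // No reset: Xilinx tools will happily infer a plain flip-flop.
    always @(posedge clk)
        q_noreset <= d;

    // Synchronous reset: on some Altera/Intel devices this can cost
    // extra logic if the reset net doesn't map onto the dedicated
    // flip-flop hardware.
    always @(posedge clk)
        if (rst)
            q_reset <= 1'b0;
        else
            q_reset <= d;
endmodule
```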


Thanks for the advice! I've been synthesizing a bit, fixing a few issues that didn't show up when building for the simulator. I've yet to run it actually on hardware, though - that part will be fun. Webpack does still come with the schematic viewer, which is really neat.


We did similar projects as part of our ECE degree. In general we would start with boolean algebra and logic design, then proceed to computer organization and last to computer architecture (one semester each).

Your project would be at the computer organization level. IIRC adding a pipeline would move it to computer architecture level.

I understand that reading outside a class may be a bit boring, but these courses gave us a much better understanding of key concepts and issues; from basic boolean operations (i.e. how to do them right, how to optimize them), to basic concepts (e.g. 2's complement, base-2 arithmetic, FSMs, flip-flops), to more advanced concepts (e.g. complex logical components, datapath, control), to various issues (e.g. hazards, async/sync design), etc. (i.e. all the things you probably won't implement, such as caches, OoO execution, virtual memory).

Knowing —to a certain degree— all these made the project easier and much more fun since we could actually have crazy ideas and try to implement them.

I don't know the current state of literature, but when I took the courses (12 years ago) the Patterson and Hennessy books (Computer Organization and Design, Computer Architecture) were terrific.


As of 2007-2011, Patterson and Hennessy books were in use in courses I took as well. Great stuff, but the treatment on GPUs was a little out of date even a year or two after being published (GPGPU was just getting started and evolving rapidly).


Their Computer Organization and Design had a new edition in 2013 that may cover GPU stuff better; haven't read it myself though.

Still haven't seen a book that covers SIMD very well, though.


* Features the Intel Core i7, ARM Cortex-A8 and NVIDIA Fermi GPU as real-world examples throughout the book

Sounds promising. The 4th edition just had the GPU content stuffed into an appendix.


What is the difference between Computer Organization and Computer Architecture?


COD is intended for undergrad courses, CA is more graduate level.


For those who really want to get into HDL stuff but are put off by the weird syntax, I highly suggest taking a look at MyHDL[0]. It's a very nice Python lib with good docs and active development. I actually implemented a CPU in it. It can output to VHDL and Verilog and can even generate simple structures for you.

[0] http://www.myhdl.org/


Please do yourself a favor and use Verilog instead. I understand that VHDL forces you to write cleaner code, but it's also frustrating for no good reason.

Also, start looking into pipelining ASAP. Implement one micro-arch, benchmark it, then try to do better. It's a great way to learn.


Having programmed in both, I prefer VHDL--the two take different paths, analogous to functional vs imperative programming.

The type checking that makes VHDL so annoying is also the same type checking that's saved me. Coming from Haskell, VHDL was a much easier language to learn than Verilog.


For those not familiar with these languages, saying that VHDL is functional programming and Verilog is imperative programming is misleading. Both use essentially the same style of modeling digital hardware (which is really neither of those): encapsulated modules with ports, clocked processes, and combinatorial logic.

Where they differ is mainly in typing. VHDL requires you (unlike Haskell, actually) to (very verbosely) spell out the type of everything. Verilog's type system is more like C's: you declare basic types and it's fairly loosey-goosey about them. VHDL's syntax is based on Ada and Verilog's is more C-like (but uses begin-end instead of curly braces).
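A sketch of an 8-bit counter register in VHDL shows the verbosity being described (entity and signal names are made up for illustration):

```vhdl
-- VHDL: every type is spelled out, and arithmetic operand widths
-- must match exactly (via the ieee.numeric_std package).
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity counter is
  port (clk : in  std_logic;
        q   : out unsigned(7 downto 0));
end entity;

architecture rtl of counter is
  signal count : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      count <= count + 1;  -- both operands must be the same width
    end if;
  end process;
  q <= count;
end architecture;
```

The rough Verilog equivalent is just `reg [7:0] count; always @(posedge clk) count <= count + 1;` with no package imports and no strict width rules.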


I will never forgive the Verilog powers that be for allowing the 'reg' keyword to describe "things that are not registers". But joking aside, the differences between VHDL and Verilog are minor in this context. And one person's frustration is another's saving grace due to strictness etc.
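The complaint about `reg` can be shown in a couple of lines; this mux synthesizes to pure combinational logic despite the keyword (module and signal names are made up for illustration):

```verilog
// In Verilog, 'reg' only means "assigned inside an always block".
// This 2:1 mux infers no storage element at all.
module mux2 (
    input  wire a,
    input  wire b,
    input  wire sel,
    output reg  y        // declared 'reg', yet no flip-flop results
);
    always @(*)
        y = sel ? b : a; // purely combinational
endmodule
```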


Yikes, I find Verilog to be much too C like while VHDL is almost Python like. I enjoy the strictness. VHDL 2008 (should be supported by tools now) fixes a lot of VHDL's warts. Also, I love the two process design methodology that you can use with VHDL. It lets you single-step through your VHDL code in the simulator as if it were plain procedural code!

http://www.gaisler.com/doc/vhdl2proc.pdf
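A minimal sketch of the two-process style from that paper, assuming a hypothetical entity with just an 8-bit counter as state:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity example is
  port (clk : in std_logic);
end entity;

architecture two_proc of example is
  -- All state lives in one record type.
  type reg_type is record
    count : unsigned(7 downto 0);
  end record;
  signal r, rin : reg_type;
begin
  -- Combinational process: compute the entire next state using
  -- variables, so it single-steps like plain procedural code.
  comb : process (r)
    variable v : reg_type;
  begin
    v := r;
    v.count := r.count + 1;
    rin <= v;
  end process;

  -- Sequential process: just register the next state.
  regs : process (clk)
  begin
    if rising_edge(clk) then
      r <= rin;
    end if;
  end process;
end architecture;
```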


One can do the "two process design methodology" in Verilog too. It's not unique to VHDL.


Forgive my ignorance, for I'm just starting out with VHDL and digital hardware design in general.

Building a CPU is what I eventually intend to do and I am curious how would you benchmark a hardware design?

What metrics would someone use for judging the effectiveness of a CPU arch? My current understanding is gate count, area occupied by the design, clocks per instruction (CPI) and maximum frequency the design can be clocked at.


A common metric in embedded is DMIPS/MHz. This is considered a bit antiquated (it was first written for the VAX!), but the Dhrystone benchmark is free and simple to implement. The important part is that it's (supposedly) independent of clock speed, showing the efficiency of your CPU design, and so is normally run from cache to get zero wait states. CoreMark[1] is a newer replacement, and is becoming increasingly popular.
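As a worked example of how the metric is derived (the Dhrystone score below is made up for illustration; 1757 is the VAX 11/780 reference score):

```
DMIPS     = Dhrystones/second / 1757
DMIPS/MHz = DMIPS / clock frequency in MHz

Example: 100,000 Dhrystones/s at 50 MHz
  -> 100,000 / 1757 = 56.9 DMIPS
  -> 56.9 / 50      = 1.14 DMIPS/MHz
```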

Once you implement/simulate your design in silicon, power usage becomes a good comparison metric. As this depends on clock frequency, DMIPS/mW is another common comparison benchmark. Since a lot of embedded applications spend most of their time in very low-power states with the core stopped, sleep current and wake/sleep time are now becoming very important. This is more of a whole-chip benchmark, and is a very popular area for microcontroller manufacturers to fight over right now, as results can vary wildly depending on the application. The makers of CoreMark have tried to come out with a benchmark[2], but it doesn't cover peripherals yet and isn't quite as popular.

1. https://www.eembc.org/coremark/
2. http://www.eembc.org/ulpbench/


Verilog is C to VHDL's Pascal, maybe. Better but really not that much different. Check out the RISC-V folks' work in Chisel (a Scala-based DSL) for some great examples of what HDL code should be like.

It's not perfect either, but reading it is vastly more pleasant than plodding through a design in one of the Vs.


Cheers for the tips. I'll eventually move to Verilog, but as others point out, I think it's good to walk before I run. Regarding pipelining, I'm getting right on it - and the iterative benchmarking is something I plan to do. Thanks again :)


Also, feel free to contribute to the Verilog tutorial I started for software engineers, it's on WikiBooks at https://en.wikibooks.org/wiki/Programmable_Logic/Verilog_for...

VHDL is dead in the industry(*). While VHDL is slightly better for teaching, why not learn directly what everyone else uses? Also, because VHDL is such a pain to support for CAD tools, more tools support Verilog only, or support VHDL as a second-class citizen.

(*) my European friends hate me each time I say that, but it's true.


Yeah, I was taught in VHDL, but my prof acknowledged that it's mainly because it's better as an educational tool. He argued that it's easier to go from VHDL to Verilog after learning the strict practices, and that it leads to better programming practices in the industry.


+1 for Verilog. SystemVerilog is the way to go.


I like VHDL. Verilog and non-blocking vs blocking assignments can trip you up so easily. As for verbose code, you spend way more time debugging and thinking about how to structure a program than you do writing text on the screen.
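The blocking-vs-non-blocking trap being described is the classic shift-register bug; a minimal sketch (module and signal names are made up for illustration):

```verilog
// Blocking '=' in a clocked process turns an intended two-stage
// shift register into a single register's worth of delay.
module shift_bug (
    input  wire clk,
    input  wire d,
    output reg  q1,
    output reg  q2
);
    // Buggy: blocking assignments execute in order, so q2 sees the
    // *new* value of q1 in the same clock cycle.
    always @(posedge clk) begin
        q1 = d;
        q2 = q1;   // gets d immediately: only one cycle of delay
    end

    // Intended behaviour uses non-blocking '<=':
    //   q1 <= d;
    //   q2 <= q1;  // gets the previous q1: a real two-stage delay
endmodule
```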


I prefer Verilog, but what you say is true, or at least should be true, if you want to end up with a good, easily debuggable result.


I like VHDL, but SystemVerilog does have some very nice features.


I've been working on a similar project for a couple of years now (http://jamieiles.github.io/oldland-cpu/) and these are very rewarding projects - there's a great mix of hardware and software so there's always something interesting to work on.

One of the most valuable lessons that I've learnt from this is to treat the FPGA as a validation target, and the FPGA tools purely as a way to produce that image; they're entirely unfriendly to develop in. If you use Verilog then Verilator gives lightning-fast simulations that you can verify the hardware against.


>bonus points goes to the people who realise there is an odd thing about the form of the baz (branch if Ra is zero) instruction.

Is this because the second argument is another register (presumably containing an address to branch to) instead of a label?


Just before the loop there is this instruction, "load.l r7, $mul16_fin".

Perhaps this is so the assembler/linker/loader doesn't need to resolve what address the label ends up having.


yes! I've not implemented conditional jump to an immediate offset yet, so you need the branch target in a register.


Also note another cool project: http://moxielogic.org/blog/ (by an ex-co-worker).

"Moxie is a general purpose bi-endian load-store processor, with sixteen 32-bit general purpose registers and a comprehensive ISA consisting of two-operand variable width instructions. There are moxie implementations that run on both Altera and Xilinx FPGA architectures, a number of simulator ports (including QEMU), and a complete GNU toolchain for C/C++ development."


Here's a good book guiding through an implementation of a simple 8 bit processor (based on 8080 architecture) with VHDL: http://www.amazon.com/Design-Computers-Complex-Digital-Devic... The same author also wrote another book later where he shows a design of ARM like 16 bit CPU with pipelining.


And this is what you can use if you want to implement an instruction or two in Verilog without having to implement the rest of the CPU. Plus a flexible compiler toolchain: https://github.com/combinatorylogic/soc

(never mind it being multicycle, it was done this way for a reason, and dropping in a RISC design should be fairly trivial.)


> Verilog is the C99 of the HDL world, and you can get in quite a mess as a beginner if you don’t understand it well enough.

I found both to have their little quirks, but personally liked Verilog syntax a little better - modules concept made a lot more sense coming from a software background. Do you have an existing instruction set you're planning on supporting?


For those interested in other FPGA boards, Digilent makes a whole series of affordable ones.

https://www.digilentinc.com/Products/Catalog.cfm?NavPath=2,4...


I can recommend the boards from Terasic. Solid design and readable documentation. Altera ships Terasic boards for their courses. Good value for money.

http://www.terasic.com.tw/en/


As an alternative, this is the one I've been using: http://numato.com/mimas-v2-spartan-6-fpga-development-board-...

It's pretty cheap, but it has a whole bunch of nice things built onto it, and a simple flasher script that actually works on Linux!


On that note, Mojo is a good board to learn FPGAs on. It includes an ATmega32 microcontroller and a Xilinx Spartan FPGA. Its maker also has some good tutorials online:

https://embeddedmicro.com/tutorials/mojo/

Too bad its dev environment doesn't support OS X though.


It's a viable concept. In school we built a functional pipelined MIPS processor in Verilog: https://github.com/eggie5/SDSU-COMPE475-SPRING13





