For those wondering about why there's both a boot ROM and the boot2 in flash:
The flash chips used support both a basic SPI mode, and an advanced QSPI mode. There is a well-defined standard protocol for basic SPI mode, so virtually all chips will respond to the same read command for simple slow byte-by-byte reading. The only thing left to try is the four SPI modes (Does clock idle high or low? Do we transfer on the full pulse, or on the half pulse?) - hardware often even supports two of them, and there's only one set which actually makes sense.
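To make the "same read command everywhere" point concrete, here is a minimal sketch of what a basic SPI read looks like on the wire. It assumes the common `0x03` "Read Data" opcode followed by a 24-bit address; the names `SPI_MODES` and `build_read_command` are made up for illustration and are not from the boot ROM source.

```python
# The four SPI modes are the combinations of clock polarity (CPOL)
# and clock phase (CPHA): mode = (CPOL << 1) | CPHA.
SPI_MODES = {mode: {"cpol": mode >> 1, "cpha": mode & 1} for mode in range(4)}

# Assumption: the widely shared "Read Data" opcode for serial NOR flash.
READ_DATA = 0x03


def build_read_command(address: int) -> bytes:
    """Return the opcode followed by a 24-bit address, most significant byte first."""
    if not 0 <= address < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    return bytes([READ_DATA]) + address.to_bytes(3, "big")


if __name__ == "__main__":
    for mode, cfg in SPI_MODES.items():
        print(f"mode {mode}: CPOL={cfg['cpol']} CPHA={cfg['cpha']}")
    print(build_read_command(0x000100).hex())
```

After these four bytes are clocked out, the chip just streams data back one bit per clock for as long as chip select stays low, which is why this path is so portable and so slow.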
QSPI, on the other hand, is more of a wild-west. You need to run a bunch of chip-specific commands to enter QSPI mode, and there are quite a few possible variations for QSPI read commands, not to mention a lot of different timing requirements. Trying out all of them isn't really possible, hence the chip-specific boot2 segment.
Staying in SPI mode isn't really viable either because the application code is stored in the flash chip. To give an example, jumping to a random instruction would incur a 1280 ns read with a W25Q80BW flash chip operating in SPI mode (realistically x10 due to a lower safe clock frequency), whereas QSPI mode can reliably do that in as little as 125 ns. With the RP2040 running at 133MHz a 16-cycle delay for a random jump or a read from a data block is not too bad, but a 170 or even 1700-cycle delay is just way too much.
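The cycle counts above are just the read latencies multiplied by the core clock. A quick sketch of that arithmetic, using only the figures quoted in this comment (the helper name `ns_to_cycles` is mine):

```python
CPU_HZ = 133_000_000  # RP2040 core clock used in the comment above


def ns_to_cycles(latency_ns: float, cpu_hz: int = CPU_HZ) -> float:
    """Convert a read latency in nanoseconds to CPU cycles spent waiting."""
    return latency_ns * 1e-9 * cpu_hz


if __name__ == "__main__":
    cases = [
        ("QSPI random read", 125),
        ("SPI random read", 1280),
        ("SPI at a 10x lower clock", 12800),
    ]
    for label, ns in cases:
        print(f"{label}: {ns} ns is about {ns_to_cycles(ns):.1f} cycles")
```

That works out to roughly 16.6, 170 and 1700 cycles respectively, which is where the 16 / 170 / 1700 figures come from.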
TBH, a lot of the details are fairly standardized across vendors, and/or are discoverable through SFDP. Parsing the SFDP tables would take a nontrivial amount of program memory, though; I don't know as I'd want to embed that logic in ROM.
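For a sense of scale, the very first step of SFDP discovery is small. The sketch below only parses the 8-byte SFDP header as I understand it from JESD216 (signature, minor and major revision, zero-based parameter header count, one trailing byte). It is not a full parser, and `parse_sfdp_header` is a hypothetical helper, not anything from the Pico SDK. The real cost is in walking the parameter tables that follow.

```python
import struct

SFDP_SIGNATURE = b"SFDP"


def parse_sfdp_header(raw: bytes) -> dict:
    """Parse the first 8 bytes returned by an SFDP read at address 0."""
    if len(raw) < 8:
        raise ValueError("SFDP header is 8 bytes long")
    # Layout assumed here: 4-byte signature, minor rev, major rev,
    # number of parameter headers (zero-based), one trailing byte.
    signature, minor, major, nph, _last = struct.unpack("<4sBBBB", raw[:8])
    if signature != SFDP_SIGNATURE:
        raise ValueError("no SFDP signature, chip may not support SFDP")
    return {"revision": (major, minor), "parameter_headers": nph + 1}


if __name__ == "__main__":
    # Made-up sample bytes for illustration, not captured from a real chip.
    sample = b"SFDP" + bytes([0x06, 0x01, 0x01, 0xFF])
    print(parse_sfdp_header(sample))
```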
It's always a bit funny to me how much of the Synopsys IP ends up in different chips. As an example, a decade+ ago I implemented a from-scratch USB peripheral stack on an STM32 microcontroller, basically because the vendor SDK wasn't capable of doing what I needed or of being readily modified to do it. A couple of years ago I was debugging some firmware for a chip from a completely different vendor and noticed that the USB registers looked... familiar. I looked back at the original project and was somewhat surprised to discover that it was exactly the same registers in the same order, just mapped to a different spot in memory.
This complexity forced me to abandon it for learning and switch to STM32. I was able to write blinky with a few dozen assembly instructions for STM32. I spent about a month reading about SPI, QSPI, and flash chips and still could not understand how to proceed with RP2040 other than to copy and paste their "bootloader" as an opaque blob.
Maybe I'm weird, but for me RP2040 was a terrible chip for learning ARM. STM32 on the other hand just worked, and I gradually learned to blink an LED, to write a linker script, to write a UART driver, to use C, to use CMSIS and so on. In the end I was able to write commercial firmware with it.
I understand that if I just used their SDK with cmake, that wouldn't be a problem, but I'm not going to use their SDK. I hate cmake and I need to understand everything from the ground up.
I think that at this moment I can grok this bootloader and write my own version of it, because I know much more about it, but it wouldn't serve its purpose as a chip for learning.
IMO that's a flawed approach: throwing infinitely complex tools at a beginner. It's much easier to start simple, with just an assembler and maybe a linker. And a chip for beginners must not require those complex initialization procedures.
This chip is also incredibly complex with its two cores and PIO cores. It's an absolutely cool thing, but it's absolutely not for beginners, it's for experienced engineers. I'd prefer something simple, like STM32, with built-in flash, but with proper documentation and without any compromises: a flexible voltage source, an on-board programmer, plenty of hardware blocks, and no need for a cheap price (because who cares if the chip costs $1 or $10 for a hobby).
RP2040 documentation is superb, I must admit. That's what they did perfectly.
> Maybe I'm weird, but for me RP2040 was a terrible chip for learning ARM. STM32 on the other hand just worked
My experience was the opposite of yours. I found RP2040 refreshing compared to the complexity of dealing with the proprietary toolchains that other devices required before I could start working with their chips. Nearly every part of RP2040 was documented in great detail and usable exclusively with the tools I could find in the Arch Linux repo (when using Linux) and Homebrew (when using Mac). I could drop down to assembly or move up to C++, or even Rust or Python, depending on whether I wanted to tool around or just get things done.
Even more impressive was that I was able to use a debugger with another RP2040 Pico acting as an SWD debug probe (Google Picoprobe), which again worked the same across Mac and Linux with the software I already had (gdb) and saved me from buying yet another piece of JTAG hardware with questionable software support.
Oh, and every piece of software for the RP2040, including the UF2 boot ROM, the second-stage bootloader and the examples, is on GitHub, which allowed me to go as deep as I wanted and, more importantly, just get on with what I needed to do when I wanted things to just work.
I've worked with uCs on and off, but I have never worked with a uC that just worked with tools I already had. I now work exclusively with RP2040, even when I find other chips much more capable (ESP32 in this case). RP2040 allows me to futz around as long as I want, as deep or shallow as I have time for, plus when I stop futzing around, it allows me to just flash new firmware over USB and get on and do what I /need/ to do.
Oh.. and I just love the USB mass-storage mode – no more custom flasher tools, just `cp blah.uf2 /mnt/RP2040/` and off I go. I can smoke it, but I can't brick it! Plus, when I need quick iteration I can just use PicoProbe and do `code - flash - debug - code` almost as fast as I can hit keyboard buttons.
I don't want to invest in one processor only to find out later that I needed USB2.0 instead of USB1.1, and then needing to read 500 pages of datasheets to move to a different platform.
Reading datasheets was nice at one point, but now it feels more like filling out tax forms.
It depends on what you're looking for. If you're doing hobbyist stuff, using Arduino libraries (or even Linux single-board computers) will get you a processor agnostic solution. However, if you're dealing with production in volume, using 95% of the capabilities of a 45 cent chip is much better than using 50% of the capabilities of a 2 dollar chip, and there's nothing that'll get you there besides dealing with hardware specific features (and therefore datasheets).
Volumes have to be very large to make up for R&D costs. For consumer stuff, maybe yes. In industry, it is quite normal to make <1000 pieces of some electronic instrument. In that case, optimizing on cents doesn't make sense.
The problem is that processor agnostic platforms have to follow the lowest common denominator. An agnostic USB library which works on both USB2.0 and USB1.1 microcontrollers is going to be limited to USB1.1 features.
Why? If you use only 1.1 features on a 2.0 platform it would work just like on 1.1 but faster. If you're on a USB1.1 platform and want to use a 2.0 feature you could just use a 2.0 platform without any changes to the existing code
Isn't the toolchain for the ESP almost fully open source (ESP-IDF)? The only parts that aren't super open on the ESP32 are the radio-related firmware and blobs, but those just aren't there in the RP2040 anyway and aren't related to the toolchain itself.
If I recall correctly, a not-insignificant issue is (was?) that the ESP is based on the relatively obscure Xtensa architecture, which is poorly supported (if at all) by the regular open-source toolchain. This means you have to use forks provided by Espressif, rather than just using the standard ones provided by your OS.
It's still open-source so a lot better than having to use a proprietary compiler or IDE, but it's a lot more involved than just your regular bundle of C libraries you can use with your normal tooling.
Yup. The original ESP32, ESP32-S2, and ESP32-S3 use Xtensa, but the ESP32-C2, -C3, -C6, and -H2 use RISC-V.
Unfortunately they don't all have the same feature set, so you'll often still see the Xtensa variants in the wild as they are simply a better product overall.
Having worked quite a bit with both, I think the average beginner would find the rp2040's cmake based SDK more accessible than stm32cube, which I dislike immensely. The CircuitPython support is also really interesting for somebody who isn't a programmer but wants to experiment.
But if you're already an experienced programmer and want to roll your own stuff, I absolutely agree stm32 is a better way to go. This is a little example: https://github.com/jcalvinowens/ledboard
I will say, I think the generalized PIO engine the rp2040 has is incredible. I hope everybody starts doing that.
I’m a programmer but appreciate CircuitPython. I used the KB2040 to build two bespoke mini keyboards. It took just a few dozen lines of Python. Couldn’t be happier with it.
If you're a beginner who wants to learn how chips work, something like the ATmega328 or the ATmega32u4 would by far be the best choice. It's pretty much a textbook chip, with enough peripherals to be useful yet not so many it gets confusing, and a datasheet which is quite readable.
Once you get into ARM territory it inherently gets very complex very quickly. They are massive chips made of multiple IP blocks from different vendors which have been glued together. Full documentation easily gets into the thousands of pages. The RP2040 has excellent documentation, and even that is barely enough to be usable.
With the exception of a very small group of people at the design company, essentially nobody is hand-writing assembly. There's just no point: it's incredibly complicated, and it takes orders of magnitude longer than just using the provided SDK. This makes hand-writing it only an option for hobbyists: nobody wants to pay for their engineer to spend a lot of time doing it in a worse way. Turns out "I hate cmake" isn't a very good reason to waste tens of thousands of dollars in engineer-hours.
But even for the hobbyist the SDK is probably the better choice. The one provided for the RP2040 is quite well-made, and even if you hate cmake just copy/pasting its C code into your own cmake-less toy SDK probably makes more sense than reinventing the wheel yourself.
I don't totally agree with this perspective. Adafruit ships these in dev boards with a CircuitPython layer ready to go - you can have it up and doing something in 90 seconds if you're the Arduino type of hobbyist. You don't need to know a thing about the bootloader at all except maybe to hold down the bootstrap line with a pushbutton to reflash the system if it's bricked. The USB loader is incredibly slick and modern.
All this bootstrap sequencing is pretty typical for an ARM Cortex unit, and it's not as overburdened with options like, say, a TI Sitara. They're still unbricking with TFTP.
For $0.70 in onesies this is a pretty nice piece of silicon.
I don't think the person you're responding to is "the Arduino type of hobbyist", though.
I feel the same, I didn't like Arduino's abstractions and how they hid what was actually going on. It has its appeals for people who don't care about the inner workings and just want to use a microcontroller to "quickly do something for which a microcontroller would be handy right now", but it won't get you much further than that, in my opinion.
Missing the point. There's a huge amount of us who are between "complete and bare metal understanding of the soc" and "using python". STM32 nailed it, RP2040 is gaining a reputation for complexity.
I get the perspective. And I agree that RP2040 needs the equivalent of STMCube or even CubeMX. But they're not there yet. Are they banking on the community to provide that with the same amount of love that RPi got? I don't see that happening for multiple reasons.
I think their aggressive price point is at odds with their mission.
The mission used to be "unit of computing for education and makers at a super low pricepoint". This feels more like "create the lowest pricepoint possible".
It's peculiar. I think the 'educational' mission of RPi flew the coop a long time ago when they found they were selling piles of Linux SBCs at or below a cost that was realistic. Now they're a COTS part used in industry. (Ever buy a $25,000 heat staking machine to discover that they used a Raspberry Pi as the primary control unit? I have.)
Arduino is now going through the same experience, they have a "pro" line where they're trying to compete with existing Linux SBCs in the industrial space like Phytec, Variscite, or Kontron. But they can't match on cost yet.
What really sets apart RP2040 is the amount of SRAM you get for the price. Other "outsiders" like Espressif are also generous.
Mainstream MCU manufacturers really skimp on it, even if you don't need Cortex-M4/7 to run a simple GUI you have to buy a whiz-bang part with huge pinout, very rich peripherals you won't need and a matching price tag for those.
I believe a big reason for this is that the RP2040 is manufactured on a relatively modern process node. Mainstream MCU manufacturers use ancient nodes, which means using the same amount of SRAM is a lot more expensive area-wise.
It probably also helps that the RP2040 (and most Espressif chips too!) don't include any onboard flash. Adding a nontrivial amount of on-chip flash is quite expensive, so they just used that area budget for extra SRAM instead. If you want more than a few hundred K of flash you need to use an external chip anyways, so why bother with on-chip flash at all?
AFAIK Espressif parts that do include flash just co-package it ;) From what I know, ESP8266 is TSMC 40 nm and TSMC offers embedded flash down to 28 nm (their website must be outdated).
I'm just here to remind everyone that doesn't already know that the "full" raspberry pi models are also quite accessible[1] as bare metal platforms today. If you need an MCU with 8GB of ram and 1TB of flash, you can absolutely have it!
I'd like to try to find an example project where I could do something with a bare metal Pi Zero 2W.
Have a look at the Teensy 4.1, it comes with 7936K Flash, 1024K RAM and you can even add up to 2x8MB PSRAM iirc. Of course it's a bit more expensive but not much.
> RP2040 documentation is superb, I must admit. That's what they did perfectly.
In the end this is what really matters anyway; I'll take a fully documented complex chip over a black box whenever it matters.
Also, while I appreciate everything that the RPi foundation does in their documentation efforts, I think they don't do a good job of indicating the difference between a lack of documentation and a lack of /public/ documentation. I would like to know a LOT more about the PCIe interface between the Pi5 CPU and the custom peripheral controller, for instance, but there are zero docs on it outside of kernel sources, which are just a bunch of magic numbers and other indiscernible enchantments anyway.
This is still a huge step up compared to the broadcom chips in raspberry 1-4.
On the big brothers, you have this odd CPU + GPU mish-mash and everything (including interrupt and mem layout) can look different depending on which one is currently running the show.
> Power is applied to the chip, and the RUN pin is high. The chip will be held in reset for as long as RUN is not high.
Is that a typo, or do I not understand this? Should it be "as long as RUN is high"? Because I assume that the RUN pin is active low, so as long as it is high, the chip will be held in reset?
> Global asynchronous reset pin. Reset when driven low, run when driven high. If no external reset is required, this pin can be tied directly to IOVDD.
RST pins tend to be active-low, which is the same thing. Perhaps Raspberry Pi decided to call it the "run" pin instead of the "reset" one to avoid possibly confusing hobbyists?
That doesn't seem unusual to me, given that to get to this page you either have to be searching for the specific terms already (and know what they are) or come from the homepage -> RP2040 (Raspberry Pi Pico) projects -> Custom serial bootloader for the RP2040 -> Preliminary reading RP2040 boot sequence
I came to that page directly from the front page of HN. I think it's reasonable to assume a significant portion of their traffic today directly to this page didn't already know what RP2040 is. Missed opportunity to educate readers.
I mean, in as much as it is annoying when an article about, say, NTFS internals does not explain what NTFS stands for, or what a filesystem is. If you're the target audience, you'll know already.