They are making a critical mistake here by not letting users pre-order the next batch. It says "Sold Out" and you can't do anything else. They could make hundreds or thousands of sales over the next day or two due to their free launch publicity, but they're fucking blocking everyone from paying for it.
Most of the folks like me who would have bought something today while reading about it will just forget about it later. These guys are missing out on a huge opportunity.
> They are making a critical mistake here by not letting users pre-order the next batch.
It's possible that we can thank the FTC for this:
> The Federal Trade Commission’s (FTC’s) Mail or Telephone Order Rule covers all merchandise ordered by mail, phone, over the internet, or via the fax machine. It stipulates that, if a merchant does not promise a specific delivery time, the merchandise ordered must be delivered within 30 days of the merchant’s receipt of the order (or the date merchandise is charged to your credit card). If the company is unable to ship within the promised time, the company must give the buyer the choice of agreeing to the delay or canceling the order and receiving a prompt refund. However, if you are applying for credit to pay for your purchase and a company doesn't promise a shipping time, the company has 50 days to ship after receiving your order.
In other words, in the United States it is not legal to take pre-orders and then incur a delay without offering customers their money back. So 100,000 pre-orders could come in, and in the event of a delay -- since they are forced to offer a refund -- they could potentially lose all of their funding.
Imagine ordering parts for 100,000 boards and having 50% of your customers take their money and run, in case of a delay. That's a fairly unmanageable risk.
>Imagine ordering parts for 100,000 boards and having 50% of your customers take their money and run, in case of a delay. That's a fairly unmanageable risk.
Well assuming they're planning to produce another batch, they're already incurring some risk anyway. Allocating units out of that batch as pre-orders doesn't cost them anything, and probably improves overall sales.
Consider that the Parallella is not their intended product. The Parallella is a dev platform/proof of concept to get people to design the Epiphany into new products.
They're not likely making money on the Parallella - in fact a lot of the delay for the Kickstarter campaign was due to issues related to cost (e.g. the design is cut to the bone, and they managed to eventually get very good pricing from Xilinx for the Zynq etc.). It looked like they were in trouble for a while until they got a cash injection from Ericsson and Carmel Ventures early this year.
As such, while they'd certainly benefit from more exposure and from getting it into the hands of more people, they also have every reason to manage the process so that building boards doesn't get in the way of actually evolving their chip designs etc.
I just received the shipping confirmation for my 16-core kickstarter board.
Since they are relatively low volume, it seems to be pretty hard for them to get the necessary parts from suppliers reliably. I think they need a huge customer for just the chip (and hence plenty of working capital), or a large investment infusion to be able to deliver more boards in high volumes. They seem to be focused on finishing out what they have sold so far before committing to any new sales, which is probably a good thing since it's taken them so long to just fill the kickstarter orders.
I would expect to see them on HN again in the future if anything positive happens, it's how I found their kickstarter, after all.
Being one of the backers with 2x 16-core boards on the way, I can attest to the fact that I wouldn't buy from them again. This is a dishonest company at its heart; it's not the delay, it's the reason why it was delayed and the lies they told about it.
You should go check out the bitcoin mining hardware market if you want to see how well things go for a company that slips a ship date by even a week... charge backs, BBB complaints, mail and wire fraud investigations, lawsuits, bad press...
Edit:
Ok, here's my quick summary. Please correct me if I'm wrong:
This looks like a small PCB (Raspberry Pi-alike) that hosts the main attraction: a 16- or 64-core Epiphany coprocessor, as well as an ARM CPU to run the OS. Not sure how these compare in performance to other coprocessors (GPUs with OpenCL?). Power draw seems low (5 W). Would love to read more about the architecture, why Epiphany processors are special, etc.
It's a dual ARM core Zynq SoC + the Epiphany. The Zynq has an on-die FPGA. Part of the FPGA is used for "glue" for the Epiphany, but you can update the FPGA config as well, with some care.
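(Aside: here's a minimal sketch of what updating the FPGA config can look like from Linux on a Zynq of this era, assuming the stock Xilinx xdevcfg character driver and a bitstream already converted to raw .bin format -- nothing here is Parallella-specific, and I haven't verified it against their image:)

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Copy a raw bitstream into the Zynq programmable logic through the
     * Xilinx xdevcfg driver.  Reprogramming the PL also wipes the eLink
     * "glue" logic for the Epiphany -- hence the "with some care". */
    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s bitstream.bin\n", argv[0]);
            return 1;
        }
        int src = open(argv[1], O_RDONLY);
        int dst = open("/dev/xdevcfg", O_WRONLY);
        if (src < 0 || dst < 0) {
            perror("open");
            return 1;
        }
        char buf[4096];
        ssize_t n;
        while ((n = read(src, buf, sizeof buf)) > 0)
            if (write(dst, buf, n) != n) {
                perror("write");
                return 1;
            }
        close(src);
        close(dst);
        return 0;
    }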
The main CPU is substantially faster than the Pi, but it doesn't have HW accelerated graphics, so it's not a speed demon for desktop/workstation type use.
As for the Epiphany, assume that it'll be slower than most GPUs for tasks that GPUs are good for. That is, if you can make do with few instruction streams, the Epiphany is not well suited, as most GPUs will blow the current chips out of the water in terms of performance.
If, on the other hand, your problem is poorly suited for GPUs due to lots of independent instruction streams, it may be better suited.
One of the most interesting aspects of the Epiphany is that it can also be connected into a grid - each chip has four high speed links that can be connected to other Epiphany chips, or be used to interface with the main CPU or off-chip memory.
The cores can all access each other's memory without any special instructions, including that of the cores on other Epiphany chips that are hooked up via the external links - the only difference between in-core and out-of-core memory access is the speed.
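To make that concrete, here's a sketch (code meant to run on an Epiphany core) based on the addressing scheme in Adapteva's architecture reference: bits [31:20] of an address carry the mesh core ID (6-bit row, 6-bit column), bits [19:0] the offset into that core's local SRAM. I believe the SDK ships helpers along these lines; the point is that the remote access compiles to a plain store:

    #include <stdint.h>

    /* Build a pointer into another core's local SRAM by splicing that
     * core's mesh ID into the upper 12 address bits. */
    static inline volatile int *remote_int(unsigned row, unsigned col,
                                           int *local)
    {
        uint32_t coreid = (row << 6) | col;                      /* 12-bit ID */
        uint32_t offset = (uint32_t)(uintptr_t)local & 0xFFFFF;  /* 20 bits   */
        return (volatile int *)(uintptr_t)((coreid << 20) | offset);
    }

    int shared;  /* sits at the same local offset in every core */

    void poke_neighbour(unsigned row, unsigned col, int v)
    {
        /* No special instruction: the mesh routes this ordinary
         * store to core (row, col)'s copy of `shared`. */
        *remote_int(row, col, &shared) = v;
    }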
Excuse my ignorance, but I've read all the comments and I still don't quite understand what this is good for.
I get that GPUs are a bad fit for many problems, but how fast can this thing be if it has no local cache? Will this really be faster than a high-powered Xeon, or will it just consume less energy per operation? Is it a teaching tool? A tech demo? Something that makes Erlang/Haskell magically run much faster?
EDIT: And if you want to read/write the same memory areas from multiple cores, it's your own responsibility to either get the timing right, or use other means (e.g. you can trigger interrupts in another core) to signal when it is safe for another core to access data.
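To illustrate the "get the timing right" part, here's a hypothetical single-word mailbox between two cores (in a real program the consumer would reach `payload` and `ready` through a remote pointer as sketched above; the names are made up):

    /* There is no cache coherence between cores, so the convention is:
     * write the payload first, only then raise the flag.  `volatile`
     * keeps the compiler from caching or reordering these accesses. */
    volatile int payload;
    volatile int ready = 0;

    void producer(int v)
    {
        payload = v;    /* 1. publish the data            */
        ready   = 1;    /* 2. then signal that it's there */
    }

    int consumer(void)
    {
        while (!ready)  /* 3. spin until signalled        */
            ;
        ready = 0;      /* 4. re-arm for the next round   */
        return payload;
    }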
I'd imagine it works like a Transputer, and the only accessible memory is local to the chip, with communications including data coming down the serial links.
I was informed a few days ago that my 16-core Parallella has shipped. I had hoped, when I ordered, that it would come earlier in the year, before exams, but the fact that it shipped at all -- several Kickstarters which did not deliver have made me wary -- has me ecstatic to hold it in my hands.
I have a great amount of respect for the Parallella team: being able to kickstart a custom chip that promises very interesting applications, and to deliver it within several months of the estimated delivery date despite the setbacks they have had, is absolutely astonishing to me. While I can't comment on the quality of the final product yet, I would say that they know how to run an excellent campaign.
Mine shipped a few days ago as well. Sadly they seemed to have completely ignored my address change request and the board is shipping to my old address in another country. Grrr.
They did the same thing to me, too. I live in California and my board has arrived safely in DC, where I used to live. Unbelievable, given how far the timeline has slipped from what they originally set out.
> the 64-core Parallella is still setting the standard in terms of energy efficiency. In fact, it could be argued that it’s the most efficient computer in the world today
I'd be curious whether it beats GreenArrays' (http://www.greenarraychips.com/) numbers for picojoules per operation. I wonder if those numbers are published for the Parallella?
I work a lot on GreenArrays, and I highly, highly doubt this claim. An F18 core has 1152 bits of SRAM on it, much less than these cores, which I believe I read have 32 KB. Moreover, while the Parallella is clocked (like almost every computer out there), the F18 is asynchronous.
From reading the Parallella docs, it looks like that chip runs at 5 W on a "typical workload" while the GA144 runs at 0.25 W at an absolute theoretical maximum, for a 20x difference in power consumption.
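For what it's worth, the picojoules-per-operation figure asked about upthread is power divided by throughput, so on these numbers alone we only get a power ratio:

    E_\text{op} = \frac{P}{R}
    \qquad\Longrightarrow\qquad
    \frac{E_\text{op}^{\mathrm{GA144}}}{E_\text{op}^{\mathrm{Parallella}}}
      = \frac{0.25\,\mathrm{W}}{5\,\mathrm{W}}
        \cdot \frac{R_\mathrm{Parallella}}{R_\mathrm{GA144}}

Without per-chip throughput R (operations per second) on the same workload, the 20x doesn't translate into energy per operation.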
The parallella supports floating point though. I don't know how fast the GreenArrays run, but a factor of 20 doesn't seem impossible for a floating point intensive program. More on-chip memory can be beneficial too, if it helps to avoid accesses to off-chip memory. Of course, the unqualified claim from parallella is pretty useless…
That's why I'm asking. The GA144 has 144 cores - but they are 18 bits wide, and the instruction set is rather unorthodox; even multiplication usually has to be programmed. Code density is pretty high - it's a Forth chip, after all - but I suspect it is hard to achieve a good uniform load across the whole device... which might also be the case with the Parallella. So we have to consider each claim differently. In general, though, they seem to be quite different chips.
You can program floating point in software (I'm doing just that, with 6 cores performing Karatsuba-3 multiplication of 54-bit elements), but that is quite a bit slower than hardware DP multiplication (which Parallella boards lack too). Software FP multiplication will likewise be slower than hardware FP multiplication, which Parallella does have.
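For anyone who hasn't seen the trick: Karatsuba trades one of the four schoolbook partial products for a few additions. A self-contained sketch of the basic two-way step on 32-bit words (the parent's three-way, 54-bit variant is the same idea taken further):

    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    /* 32x32 -> 64-bit multiply using three partial products instead of
     * the schoolbook four: z1 = (x1+x0)(y1+y0) - z2 - z0.  The sums
     * can be 17 bits wide, hence the 64-bit intermediate. */
    static uint64_t karatsuba32(uint32_t x, uint32_t y)
    {
        uint32_t x1 = x >> 16, x0 = x & 0xFFFF;
        uint32_t y1 = y >> 16, y0 = y & 0xFFFF;

        uint64_t z2 = (uint64_t)x1 * y1;              /* high halves */
        uint64_t z0 = (uint64_t)x0 * y0;              /* low halves  */
        uint64_t z1 = (uint64_t)(x1 + x0) * (y1 + y0) /* cross terms */
                      - z2 - z0;

        return (z2 << 32) + (z1 << 16) + z0;
    }

    int main(void)
    {
        uint32_t a = 0xDEADBEEF, b = 0x12345678;
        printf("karatsuba: %016" PRIx64 "\n", karatsuba32(a, b));
        printf("direct:    %016" PRIx64 "\n", (uint64_t)a * b);
        return 0;
    }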
Sure, you can do floating point in software. My point is that dedicated hardware is very likely more power efficient at it. The same goes for memory: accesses to off-chip memory cost much more energy than accesses to on-chip memory. It would be very interesting to get energy and performance numbers for some real world application running on both chips.
Pardon the ignorance but are all 64 cores available to the OS -- as in, if I run htop, will I see 64 little bars at the top of the terminal? I would think not if I'm understanding this architecture correctly...
The Cell SPEs are not as general purpose. They're SIMD processors (single instruction, multiple data), and don't have transparent access to host memory or the other cores (for some of the Parallella demos, you can exit the main program and watch the Epiphany cores continue to DMA data straight to the frame buffer).
They're more similar to a GPU than to the Epiphany. Each SPE is more powerful in terms of Gflops, but the Epiphany cores offer more independent instruction streams. If your problem is well suited for a GPU (easy to vectorize), it will probably do better on a Cell than on the current Epiphany chips. If your problem has lots of independent branching, the Epiphany stands a better chance.
I'm looking forward to getting mine. I ordered it back in October, and they recently said the Zynq-7010 based boards would be shipped in mid-May. I wish they had been more up front about the delays; there were long periods with no communication from the company. I didn't order through Kickstarter but directly from Adapteva.
I am interested in parallel computation and computer architectures and thought it would be fun to experiment with. I have worked on some projects developing signal processing algorithms for large GPU arrays, and I want to see how the Parallella compares. I plan on implementing a real-time audio watermarking algorithm and some image/video processing algorithms.
I'd love to see someone design a unique synthesizer (audio) algorithm for these things - hope you'll keep HN updated with any progress you make in that direction.
I was hoping to buy a few of their 16-core model for a commercial application, and tried to get in touch with Adapteva, but got no reply. It seems like they really don't have the ability to supply the demand for their product.
What do you think about this + a Haskell video encoder/decoder? I'm a little ignorant about media processing, but I'd guess it would scale across the processors with the right encoder/decoder. Am I wrong?
I see this thing does OpenCL on a completely different architecture. Recent Tesseract OCR versions support OpenCL. Will I be able to run Tesseract on this thing? Would I even want to?
In principle, yes, but the Parallella board isn't exactly the fastest of all the embedded boards.
Also, OpenCL apparently isn't the preferred programming model for the Epiphany, so performance won't be as good as bare metal (but that is practically a tautology for any OpenCL device...).
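"In principle" really does mean in principle: a portable OpenCL host program only sees platforms, and an Epiphany runtime (the COPRTHR SDK provided one, if I remember right) would show up like any other. A minimal probe against any conformant OpenCL 1.x implementation:

    #include <stdio.h>
    #include <CL/cl.h>

    /* List whatever OpenCL platforms the machine exposes. */
    int main(void)
    {
        cl_platform_id plats[8];
        cl_uint nplat = 0;
        if (clGetPlatformIDs(8, plats, &nplat) != CL_SUCCESS || nplat == 0) {
            fprintf(stderr, "no OpenCL platforms found\n");
            return 1;
        }
        for (cl_uint i = 0; i < nplat; i++) {
            char name[256];
            clGetPlatformInfo(plats[i], CL_PLATFORM_NAME,
                              sizeof name, name, NULL);
            printf("platform %u: %s\n", i, name);
        }
        return 0;
    }

Whether Tesseract's kernels would actually run well there is another question entirely.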
It does work with just the heat sink, but if you want to keep it on for extended periods, the fan is probably a good idea.
I can recommend this kit for a case/fan: http://shop.abopen.com/ - that fan is enough to keep everything feeling cool to the touch.
The design is on GitHub too, if you want to do your own (and if nothing else, the instructions show you how to hook up a 5 V fan directly to the board).
It takes some assembly and very light soldering, but nothing more than that; you can get away with any crappy soldering iron and, ideally, a pair of wire cutters (but scissors will do).
Uhh... what am I supposed to do with the case that Adapteva has shipped me (awaiting arrival)? The quick start says I need a fan, but I do not see any place on the Adapteva case for such a fan. [0,1] Am I supposed to cut and craft the first-party case to work within the documented requirements? =[
Edit: I just noticed that the two cases I posted actually differ. Perhaps the design was not yet settled and they have added provisions for a fan. Does anyone know what the cases that are actually shipping look like?
Well, Erlang could run; after all, they supply an Ubuntu distro for it. However, you cannot run Erlang on the Epiphany itself; you'd need to code some BIFs to take advantage of the multitude of cores.
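In practice that glue would more likely be NIFs than BIFs these days. A hypothetical stub of the shape it would take (epiphany_stub and offload/1 are made-up names, and the actual hand-off to the Epiphany is elided):

    #include "erl_nif.h"

    /* Hypothetical NIF: where Erlang code would hand work to the
     * Epiphany.  This stub just doubles an integer on the ARM side;
     * real glue would marshal data to the coprocessor. */
    static ERL_NIF_TERM offload(ErlNifEnv *env, int argc,
                                const ERL_NIF_TERM argv[])
    {
        int x;
        if (argc != 1 || !enif_get_int(env, argv[0], &x))
            return enif_make_badarg(env);
        return enif_make_int(env, x * 2);
    }

    static ErlNifFunc funcs[] = {
        {"offload", 1, offload},
    };

    ERL_NIF_INIT(epiphany_stub, funcs, NULL, NULL, NULL, NULL)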