Author here. While the essay comes from the side of smartphones, it's not really limited to them. As I mention, even some laptops use setups that require complex infrastructure to support. libcamera itself is also used in the Raspberry Pi, and the interfaces in the Linux kernel are used by the Axiom camera, which is truly a photo camera.

https://www.apertus.org/axiom

The problem of camera diversity is not limited to open source either, because a similar infrastructure to handle all the different cases must be replicated by closed drivers as well. I don't know about Macs, but the Surface laptop is a Windows beast.




Can you reuse some of the algorithms provided by open source RAW image processing pipelines for SLRs?

Many SLRs are already well supported, though the open source stuff doesn't focus on low-latency conversion, which is needed for viewfinders, or on things like focus control.


Cutting edge cell phone camera performance is absolutely mad.

I did a side-by-side comparison of a Micro Four Thirds camera (4/3" sensor) and an iPhone SE (1/3" sensor) and the performance was... pretty much the same.

And I'm not talking about some ML interpolation wizardry or automatic face beautification; I was photographing barcodes and testing how many were readable in the resulting images - hardly something Apple would have specially optimised for.

The iPhone has a much smaller sensor, a much smaller lens, costs less, and manages to pack in a bunch of non-camera features. To be competitive in the modern cell phone market your camera has to be straight up magic.


It actually wouldn't surprise me if they optimized for bar/QR code readability. I wrote something years ago that used industrial cameras to read QR codes as well as very precise metrology features. I had to optimize the optical/lighting setup for the feature measurement, then wrote some finely tuned operations to identify the QR code, window down to the code only, clean up edges/features with expensive convolutions (mostly median filters), and finally read the code. None of this was visible to the operator, but if you saw the final image of the QR code it was essentially binary color space and looked a bit cartoon-like.
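For illustration, here's a rough sketch of that kind of pipeline in Python with OpenCV; the crop, kernel size, and thresholding choice are made-up placeholders, not the tuned operations described above.

```python
# Rough sketch of the pipeline described above: window down to the code,
# denoise with a median filter, binarize, then decode.  The ROI, kernel
# size and threshold method are illustrative placeholders only.
import cv2

def read_qr(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

    # "Window down" to the code region (assumed roughly centered here).
    h, w = img.shape
    roi = img[h // 4:3 * h // 4, w // 4:3 * w // 4]

    # Clean up edges/noise, then binarize so the result is essentially a
    # two-color, cartoon-like image.
    roi = cv2.medianBlur(roi, 5)
    _, roi = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Finally, read the code.
    data, _, _ = cv2.QRCodeDetector().detectAndDecode(roi)
    return data

print(read_qr("frame.png"))
```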


They might have some optimisation for photographing documents, it's true.

But when I say the performance is good, I'm not just declaring the images good because the portraits have simulated bokeh, or face-detecting autoexposure, or image stabilisation, or tasteful HDR, or a beauty mode that airbrushes out blemishes and makes photos of sunsets really pop.

Even in applications where none of those features come into play, the iPhone can still go toe-to-toe with cameras that have much larger sensors.


Did you do your tests in good lighting?

Did you compare raw sensor output, or post-processed?

Big sensors capture more light and have more bokeh. With enough light, the first doesn't matter, and bokeh is not a thing for QR codes.
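To put rough numbers on the light-gathering difference, using the nominal dimensions for the two sensor formats mentioned above (which are only approximate for any specific camera):

```python
# Approximate sensor areas for the two formats compared above.
mft_area = 17.3 * 13.0        # Micro Four Thirds, mm^2
third_inch_area = 4.8 * 3.6   # 1/3"-type sensor, mm^2

# ~13x more area, i.e. roughly 13x more light collected at the same
# exposure settings (about 3.7 stops).
print(mft_area / third_inch_area)
```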

If you didn't have enough light, then it's probably a question of how the denoising was done and which details were guessed by fancy algorithms. Geometric shapes are easy to guess, but when I look at pictures of landscapes taken on a phone, they typically devolve into a painting beyond about 1000×1000 px.


Totally, RAW processing is planned for after resolution changing works correctly. Do you have a recommendation about which implementation is easy to understand and work with?

If I have to rewrite stuff for low latency, I'd rather start it as an independent library so that other projects can reuse the code.


If I were looking for such a thing, I'd check out darktable and go upstream through its RAW-processing pipeline. Whatever they're using may not be the best, but I'd imagine that it is average or better...


I don’t understand why diversity is a problem for Purism. They choose the camera modules they want to buy. Where does the diversity come in?


At Purism, our goal is not just to build a phone or two, but to contribute to the ecosystem as well. That means the Linux kernel and the Linux camera infrastructure. Now, we have two choices: either contribute support for our hardware, or use some hardware that is already supported.

In reality, UVC is not suitable for a phone, so we can't leverage that. There are some camera drivers in the kernel already, but not necessarily for parts that we could buy or that meet our expectations. Even if there were, that still leaves us with the problem of connecting the cameras to applications in a standard way, so we can't really avoid working on libcamera.


> we can't really avoid working on libcamera

So why not fork it and make libcamera2 (or whatever) and concentrate solely on Purism’s needs?

My gut instinct would be that unless you ruthlessly narrow the scope of the project, progress will be too slow.


That's because we need a standard, and without libcamera, there is no standard.


There is a standard that all the normal desktop userspace apps are already using: v4l2, and in particular the single /dev/video# device use case, with all the high-level controls exposed on that device directly.

For the likes of the Librem 5 and the PinePhone, the high-level controls either don't exist at the HW level, or they exist there but are not exposed on the video device itself; instead they sit on the various v4l2 subdevices that form the video pipeline.

One way to support all the already existing apps would be to implement what they already expect (see above), that is, to let them control the video device by the usual means they already possess. Instead of extending all the apps to use libcamera and leaving the rest behind, we could simply proxy the video controls from the ioctls where all apps expect them to a userspace daemon, which would then configure the complex HW-specific media pipeline behind the scenes (basically all the media system subdevices for sensors, sensor interfaces, ISP, etc.).

In other words, implement in a userspace daemon what the USB microcontroller in a UVC webcam implements, while keeping the existing userspace interface expectations for simple camera usage.

This is kinda orthogonal to the libcamera effort, I guess. Just wanted to say that there already is a standard. :)
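For reference, a minimal sketch of what that existing standard looks like from an application's point of view; the device node, control, and value are just examples, and the constants are the ones defined in linux/videodev2.h:

```python
# Minimal sketch of the "single /dev/video# device" model described above:
# a plain V4L2 app setting a high-level control with one ioctl on the video
# device.  A proxy daemon like the one proposed here would catch this and
# reconfigure the sensor/ISP subdevices behind the scenes.
import fcntl
import os
import struct

VIDIOC_S_CTRL = 0xC008561C        # _IOWR('V', 28, struct v4l2_control)
V4L2_CID_BRIGHTNESS = 0x00980900  # V4L2_CID_BASE + 0

fd = os.open("/dev/video0", os.O_RDWR)
try:
    # struct v4l2_control { __u32 id; __s32 value; }
    ctrl = struct.pack("Ii", V4L2_CID_BRIGHTNESS, 128)
    fcntl.ioctl(fd, VIDIOC_S_CTRL, ctrl)
finally:
    os.close(fd)
```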


It's not orthogonal. In fact, it's a very good observation, and even libcamera itself recognizes it by providing v4l2 emulation.

It could be a viable way to cover the basic use case of video streaming where special controls are not needed. It's worth considering, although then it makes sense to leverage the work already in libcamera to implement that extra layer.


How long will it take for Librem 5 support to be similar to Android smartphones?


It's hard to tell because I don't know what Android phones do exactly. Does it vary by manufacturer? Do we include AI tricks and high speed video encoding?

I think it's going to be a long way to get there, but also the openness of the drivers will let us find our own strengths (I have high hopes for super-high FPS recording).


> (I have high hopes for super-high FPS recording)

That would be very cool. Google’s phone does 4K HDR stabilized video at 60 fps. Their slo-mo is 240 fps but I don’t know what resolution that would be.



