Author here. While the essay comes from the side of smartphones, it's not really limited to them. As I mention, even some laptops use setups that require complex infrastructure to support. libcamera itself is also used on the Raspberry Pi, and the interfaces in the Linux kernel are used by the Axiom camera, which is a true standalone photo camera.
The problem of camera diversity is not limited to open source either, because a similar infrastructure to handle all the different cases must be replicated by closed drivers as well. I don't know about Macs, but the Surface laptop is a Windows beast.
Can you reuse some of the algorithms provided by open source RAW image processing pipelines for SLRs?
Many SLRs are already well supported, though the open source stuff doesn't focus on low-latency conversion, which is needed for viewfinders, focus control, etc.
Cutting edge cell phone camera performance is absolutely mad.
I did a side-by-side comparison of a Micro Four Thirds camera (4/3" sensor) and an iPhone SE (1/3" sensor) and the performance was... pretty much the same.
And I'm not talking about some ML interpolation wizardry or automatic face beautification; I was photographing barcodes and testing how many were readable in the resulting images - hardly something Apple would have specially optimised for.
The iPhone has a much smaller sensor, a much smaller lens, costs less, and manages to pack in a bunch of non-camera features. To be competitive in the modern cell phone market your camera has to be straight up magic.
It actually wouldn't surprise me if they optimized for bar/qr code readability. I wrote something years ago that used industrial cameras to read QR codes as well as very precise metrology features. I had to optimize the optical/lighting setup for the feature measurement and then wrote some finely-tuned operations to identify the QR code, window down to the code only, clean up edges/features with expensive convolutions (mostly median filter) and then finally read the code. None of this was visible to the operator, but if you saw the final image of the QR code it was essentially binary color space and looked a bit cartoon-like.
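Roughly, in modern OpenCV terms (this is a simplified sketch of the same shape of pipeline, not the finely-tuned industrial code I described; the filter size and Otsu thresholding are my stand-ins here):

    # Simplified QR-reading pipeline sketch; values are illustrative.
    import cv2

    def read_qr(path):
        # QR decoding doesn't need color, so work in grayscale.
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

        # Median filter cleans up speckle without smearing the sharp
        # black/white transitions the code depends on.
        img = cv2.medianBlur(img, 5)

        # Otsu thresholding pushes the image into an essentially binary
        # color space -- the "cartoon-like" look mentioned above.
        _, binary = cv2.threshold(img, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)

        # OpenCV's detector does the windowing (locating the code) itself.
        data, _points, _ = cv2.QRCodeDetector().detectAndDecode(binary)
        return data or None

    print(read_qr("frame.png"))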
They might have some optimisation for photographing documents, it's true.
But when I say the performance is good, I'm not just declaring the images good because the portraits have simulated bokeh, or face-detecting autoexposure, or image stabilisation, or tasteful HDR, or a beauty mode that airbrushes out blemishes and makes photos of sunsets really pop.
Even in applications where none of those features come into play, the iPhone can still go toe-to-toe with cameras that have much larger sensors.
Did you compare raw sensor output, or post-processed?
Big sensors capture more light and give more bokeh. With enough light, the former doesn't matter, and bokeh is not a thing for QR codes.
If you didn't have enough light, then it's probably a question of how denoising was done, and which details were guessed by fancy algorithms. Geometric shapes are easy to guess, but when I look at pictures of landscapes taken on a phone, they typically devolve into a painting beyond about 1000×1000 px.
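For scale, here's the rough light-gathering arithmetic (sensor dimensions are the commonly quoted ones, so treat the exact ratio as approximate):

    # Back-of-the-envelope light gathering comparison.
    mft = 17.3 * 13.0        # Micro Four Thirds sensor area, mm^2
    third_inch = 4.8 * 3.6   # typical 1/3" sensor area, mm^2

    print(f"area ratio: {mft / third_inch:.1f}x")  # ~13x the light per exposure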
Totally. RAW processing is planned for after resolution switching works correctly. Do you have a recommendation for which implementation is easy to understand and work with?
If I have to rewrite stuff for low latency, I'd rather start it as an independent library so that other projects can reuse the code.
If I were looking for such a thing, I'd check out darktable and go upstream through its RAW-processing pipeline. Whatever they're using may not be the best, but I'd imagine that it is average or better...
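If you just want something easy to experiment with before digging into darktable's internals, LibRaw via the rawpy bindings is one low-effort option (my suggestion, not necessarily what darktable builds on):

    # Minimal RAW -> RGB conversion with LibRaw via rawpy: handy for
    # exploring a reference pipeline before writing a low-latency one.
    import rawpy
    import imageio

    with rawpy.imread("shot.dng") as raw:
        # Demosaic, white balance and gamma with LibRaw's defaults.
        rgb = raw.postprocess(use_camera_wb=True)

    imageio.imwrite("shot.png", rgb)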
At Purism, our goal is not just to build a phone or two, but to contribute to the ecosystem as well. That means the Linux kernel and the Linux camera infrastructure. Now, we have two choices: either contribute support for our hardware, or use some hardware that is already supported.
In reality, UVC is not suitable for a phone, so we can't leverage that. There are some camera drivers in the kernel already, but not necessarily for hardware we could buy, or hardware meeting our expectations.
Even if there were, that still leaves us with the problem of connecting the cameras to applications in a standard way, so we can't really avoid working on libcamera.
There is a standard that all the normal desktop userspace apps are already using: v4l2, and in particular the single /dev/video# device use case, with all the high-level controls exposed on that device directly.
For the likes of the Librem 5 and the PinePhone, those high-level controls either don't exist at the hardware level, or they exist but are not exposed on the video device itself; instead they sit on the various v4l2 subdevices that form the video pipeline.
One way to support all the existing apps would be to implement what they already expect (see above): make the video device controllable by the usual means they already possess. Instead of extending all the apps to use libcamera and leaving the rest behind, we could simply proxy the video controls from the ioctls where apps expect them to a userspace daemon, which would then configure the complex HW-specific media pipeline behind the scenes (basically all the media subsystem subdevices for sensors, sensor interfaces, the ISP, etc.).
In other words: implement in a userspace daemon what the USB microcontroller in a UVC webcam implements, while keeping the existing userspace interface expectations for simple camera usage.
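A toy sketch of the fan-out such a daemon would do (the device paths and control names are invented for illustration; a real daemon would discover the pipeline via the media controller API rather than hardcoding it):

    # Map one high-level control onto the right node in a
    # device-specific pipeline. All paths/names below are hypothetical.
    import subprocess

    CONTROL_MAP = {
        "exposure": ("/dev/v4l-subdev0", "exposure"),       # sensor subdev
        "gain":     ("/dev/v4l-subdev0", "analogue_gain"),  # sensor subdev
        "contrast": ("/dev/video0",      "contrast"),       # ISP video node
    }

    def set_control(name, value):
        dev, ctrl = CONTROL_MAP[name]
        subprocess.run(["v4l2-ctl", "-d", dev, f"--set-ctrl={ctrl}={value}"],
                       check=True)

    set_control("exposure", 1000)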
This is kinda orthogonal to the libcamera effort, I guess. Just wanted to say that there already is a standard. :)
It's not orthogonal. In fact, it's a very good observation, and even libcamera itself recognizes it by providing v4l2 emulation.
It could be a viable way to get the basic use case of video streaming where special controls are not needed. It's worth considering, although then it makes sense to leverage the work already done in libcamera to implement that extra layer.
It's hard to tell because I don't know what Android phones do exactly. Does it vary by manufacturer? Do we include AI tricks and high speed video encoding?
I think it's going to be a long way to get there, but also the openness of the drivers will let us find our own strengths (I have high hopes for super-high FPS recording).
> (I have high hopes for super-high FPS recording)
That would be very cool. Google’s phone does 4K HDR stabilized video at 60 fps. Their slo-mo is 240 fps but I don’t know what resolution that would be.
One of the things that I came across when I was looking for an RPi camera replacement was Samsung's "ISOCELL Plug and Play" - A turn-key solution with prepackaged lens, EIS, image pretuning, and VCX* objective camera evaluation. The image looked good enough compared to cutting edge at the time, and I imagined it would be perfect for small startups and companies who wanted a quick time to market.
Unfortunately I never got one and it now looks to be unavailable. I don't know if Samsung would have had interest in Open Source efforts to support the modules.
I assume most people have no visibility into the effort of the hundreds of engineers at the big smartphone companies who tune what happens in the image pipeline between the raw Bayer data off the camera and what your app receives.
There is a reason the quality of pictures in phones has got so good - lots of tuning, quite a bit of magic, as well as software and hardware co-optimization.
Free Software is so cool - I love how the solutions to tough problems can be shared so that everyone benefits from the work. It's a long journey, but I'm glad that there are people doing the hard work. This is the reason that I'm happy to support Purism!
I've been looking at cheap, low-end movie cameras recently like the Blackmagic Pockets, Sigma FP etc. I'm sure this is a naive question but I've been wondering why I can't buy a good sensor module, attach it to a Raspberry Pi or similar, put it in a nice case with an SSD and a lens mount etc and call it a movie camera. I guess that's kind of what Apertus are trying to do with the Axiom, but they still don't seem to have shipped anything real after quite a lot of years.
The low end of professional-ish cameras is great in lots of ways, but they all have annoying trade-offs. To be fair I'm sure any janky POS I built would too.
In general, modular products are always going to cost more and tend to lag in features/quality. It's easy to get stuck in a death spiral of worse product -> low volume -> very high price due to no economy of scale -> low volume.
> And once we get the functionality we need working, it will be available to others using libcamera: applications and other cameras will have them out of the box.
In the meantime is there a Librem phone without a camera? Because the camera is like the lowest priority thing for me on a smartphone. I could easily live without it.
Would it make sense to optionally also have a service that exposes these cameras through v4l2, so legacy applications can still make use of them, without patching them to make use of libcamera?
edit:
This would also resolve the dilemma of having complex camera drivers either in the kernel or in a library: the logic could live in userspace while the kernel provides only a small module to make it work (like v4l2-loopback, or not unlike FUSE for filesystems).
These cameras are already exposed via v4l2; they're just exposed as many v4l2 subdevices (one for each separate part of the HW camera and image processing pipeline) rather than as a single video device carrying all the necessary high-level controls.
So the interface is the same, it's just split over many device- and platform-specific subdevices, which normal apps can't really be expected to understand.
Sure, but then a libcamera-based service could expose a single virtualized v4l2 device with all the necessary high-level controls, and normal legacy apps could use that.
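As a sketch of the frame path, something like v4l2loopback fed by a small userspace process would do it. Here pyvirtualcam stands in for that service and a noise pattern stands in for frames that would really come from libcamera; the controls would still need the proxying discussed above:

    # Feed a v4l2loopback device so legacy apps see a plain /dev/video node.
    import numpy as np
    import pyvirtualcam

    with pyvirtualcam.Camera(width=640, height=480, fps=30) as cam:
        frame = np.zeros((480, 640, 3), dtype=np.uint8)
        while True:
            # Stand-in for a real frame delivered by libcamera.
            frame[:] = np.random.randint(0, 256, frame.shape, dtype=np.uint8)
            cam.send(frame)                 # shows up on the loopback device
            cam.sleep_until_next_frame()    # pace output to the requested fps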
Is complicated the right word here? It seems to me that the specs/datasheets might perhaps be somewhat difficult to follow, but otherwise it sounds like basic plumbing.
OpenCamera is just a frontend, it has nothing to do with hardware support. All the interesting stuff happens in lower layers, which are usually closed, undocumented and Android-specific.
https://www.apertus.org/axiom