libopenraw Rust with C API

2023.08.28 :: Hubert Figuière

As I previously talked about, I started porting libopenraw to Rust. It is now in a state where it has more features than the original.

When I started writing this post, the code wasn't yet 100% Rust. Since then I have removed the last bit of C++, which was where I had cut corners to make sure I had a functional C API.

The only C++ code left is the various utilities and the C++ test suite used for validation.

The goal

The goal of the Rust rewrite is to have the digital camera raw parsing library written in Rust instead of C++, while still being available with a C API.

The goal of the updated C API is to stay close to the original API. However, it's also a good time to make some breaking changes. Notably, most of the "objects" are now immutable.

Rewrite

I did the rewrite in a branch, aiming to provide the same level of functionality as the C++ code. One of the first steps was to rewrite the test suite to use the same data set. Doing so allowed me to verify the consistency in behaviour and to progressively implement all the features and formats.

The C++ code uses inheritance a bit to override some of the behaviours based on the file format. One of the things most camera raw files have in common is that they are mostly TIFF files. However, Rust doesn't do object-oriented inheritance. It has traits that allow implementing some level of polymorphism, but in this case the concept had to be rethought a little bit.
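To give an idea of what trait-based polymorphism can look like, here is a minimal sketch. All the names (TiffContainer, RawFormat, FormatA, FormatB) are hypothetical and are not taken from libopenraw; the sketch only assumes that formats share a TIFF-like container and that a default trait method can stand in for a base class implementation.

```rust
// Hypothetical names throughout; this is not libopenraw's actual code.

/// Shared TIFF-like container that several raw formats build on.
struct TiffContainer {
    ifd_count: usize,
}

/// Behaviour common to all formats. Default methods play the role
/// that a C++ base class implementation would.
trait RawFormat {
    /// Each format gives access to its underlying container (composition).
    fn container(&self) -> &TiffContainer;

    /// Format name, provided by every implementor.
    fn name(&self) -> &'static str;

    /// Default behaviour, shared unless a format overrides it.
    fn thumbnail_count(&self) -> usize {
        self.container().ifd_count
    }
}

struct FormatA {
    container: TiffContainer,
}

struct FormatB {
    container: TiffContainer,
}

impl RawFormat for FormatA {
    fn container(&self) -> &TiffContainer {
        &self.container
    }
    fn name(&self) -> &'static str {
        "FormatA"
    }
    // Uses the default thumbnail_count().
}

impl RawFormat for FormatB {
    fn container(&self) -> &TiffContainer {
        &self.container
    }
    fn name(&self) -> &'static str {
        "FormatB"
    }
    // Override: pretend the first directory is the raw data, not a thumbnail.
    fn thumbnail_count(&self) -> usize {
        self.container().ifd_count.saturating_sub(1)
    }
}

/// Works on any format through dynamic dispatch.
fn describe(file: &dyn RawFormat) -> String {
    format!("{}: {} thumbnail(s)", file.name(), file.thumbnail_count())
}
```

The container is held by composition instead of being a parent class, which is one way the "is mostly a TIFF file" idea can be expressed without inheritance.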

Benefits

This is where Rust shines.

First, there are a lot of things Rust, or the ecosystem around it, did for me. In C++ I had re-implemented a stream system to allow IO on buffers and to handle endianness. In Rust you can just read from a slice (a view into a contiguous block of data) with the Read trait, and there are built-in functions to read endian-specific values. I still remember trying to figure out which include to add depending on the UNIX flavour, between Linux, macOS and the BSDs.
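As an illustration, here is a small sketch that uses only the standard library: Read is implemented for byte slices, and the integer types have from_le_bytes / from_be_bytes. The Endian enum and the two functions are made up for this example; the only outside fact used is that a TIFF header starts with "II" (little endian) or "MM" (big endian) followed by the number 42.

```rust
use std::io::{self, Read};

/// Byte order of the data being parsed (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Endian {
    Big,
    Little,
}

/// Read a u16 from any `Read` source in the given byte order.
fn read_u16<R: Read>(reader: &mut R, endian: Endian) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.read_exact(&mut buf)?;
    Ok(match endian {
        Endian::Big => u16::from_be_bytes(buf),
        Endian::Little => u16::from_le_bytes(buf),
    })
}

/// Parse the first four bytes of a TIFF header: byte order mark, then magic.
fn parse_tiff_magic(data: &[u8]) -> io::Result<(Endian, u16)> {
    // `&[u8]` implements `Read`; reading advances the slice.
    let mut reader = data;
    let mut mark = [0u8; 2];
    reader.read_exact(&mut mark)?;
    let endian = match &mark {
        b"II" => Endian::Little,
        b"MM" => Endian::Big,
        _ => {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "not a TIFF byte order mark",
            ))
        }
    };
    let magic = read_u16(&mut reader, endian)?;
    Ok((endian, magic))
}
```

A short buffer is reported as an error by read_exact, so there is no need for a manual length check before each read.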

Second, the Debug trait. Just derive Debug (it's a simple macro) and you can "print" a type just like that, using format strings. In C++, well, it's a lot of work. And on enum types it will print a human-readable value, i.e. the one you write in code.
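For example, with two made-up types (Compression and ImageInfo are hypothetical and only serve the illustration), deriving Debug is all it takes:

```rust
/// Hypothetical enum; Debug prints the variant name as written in code.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Compression {
    None,
    LosslessJpeg,
}

/// Hypothetical struct; Debug prints the type name and every field.
#[derive(Debug)]
struct ImageInfo {
    width: u32,
    height: u32,
    compression: Compression,
}

/// Format any Debug value with the `{:?}` placeholder.
fn debug_string<T: std::fmt::Debug>(value: &T) -> String {
    format!("{:?}", value)
}
```

Calling debug_string on an ImageInfo with width 6000, height 4000 and Compression::LosslessJpeg gives `ImageInfo { width: 6000, height: 4000, compression: LosslessJpeg }`.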

Third, the safety. A lot of Rust's safety features prevent mistakes at compile time, and at runtime the bounds checking catches a lot of things too. So does being required to use Option<> or Result<> to handle cases where in C++ you'd get some null value.
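A minimal sketch of what this looks like in parsing code, assuming a hypothetical ParseError type and helper functions that are not part of libopenraw: an out-of-range offset becomes a None or an Err that the caller has to deal with, instead of a null pointer or an out-of-bounds read.

```rust
/// Hypothetical error type for this example.
#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated,
}

/// Return `len` bytes at `offset`, or `None` if the range is not in `data`.
fn bytes_at(data: &[u8], offset: usize, len: usize) -> Option<&[u8]> {
    // checked_add guards against overflow from a bogus offset in the file.
    let end = offset.checked_add(len)?;
    // get() does the bounds check and returns None instead of panicking.
    data.get(offset..end)
}

/// Read a little-endian u32 at `offset`, reporting truncation as an error.
fn read_u32_le(data: &[u8], offset: usize) -> Result<u32, ParseError> {
    let bytes = bytes_at(data, offset, 4).ok_or(ParseError::Truncated)?;
    // Indexing is safe here: bytes_at returned exactly 4 bytes.
    Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}
```

The `?` operator keeps the error handling short while still forcing every caller to decide what to do with the failure case.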

Along the way, a few bugs I identified while porting / rewriting the code led to fixes that made their way into the 0.3.x series in C++.

Drawbacks

libopenraw requires a Rust compiler (technically it already did, but I know some packagers that disabled Rust¹). This means that non-current architectures might not be supported. To me it's not a big issue: I target modern computers.

Testing

As mentioned, I rewrote the test suite with the same data set. This is an essential part of making sure the features work as before.

I also rewrote ordiag and the dumper, ordumper. The dumper allows analyzing the structure of the file, and I didn't have this one in C++ (instead I had a more primitive exifdump). The dumper is heavily inspired by exifprobe, which has served me well and which I forked. Really, I can see the amount of work I didn't need to do with Rust.

To add to this, I wrote libopenraw-viewer, a small Rust application to view the content of camera raw files. This makes it much easier to see the output. It has helped me find fundamental bugs in some of the parsing, which led to some fixes "upstream", namely in the C++ version (0.3.x branch). I should have done that a long time ago. It also allows me to test the Rust API.

C API adaptation

Last but not least, I had to provide a C API. This is what allows using the library from outside Rust.

Rust to C FFI has limitations. You can basically pass numbers, pointers, and possibly simple structures.
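As a sketch of the "simple structures" case: a struct marked repr(C) has a C-compatible layout and can cross the boundary by value. ExampleDimensions and example_pixel_count are hypothetical names, not part of the libopenraw API. To export the function by name from a shared library it would also need the no_mangle attribute, which is left out here.

```rust
/// Hypothetical plain-data struct with a C-compatible layout.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ExampleDimensions {
    pub width: u32,
    pub height: u32,
}

/// Hypothetical function using the C calling convention.
/// It takes the struct by value and returns a plain number.
pub extern "C" fn example_pixel_count(dims: ExampleDimensions) -> u64 {
    // Widen before multiplying so large images cannot overflow u32.
    u64::from(dims.width) * u64::from(dims.height)
}
```

Anything richer than this, such as a Vec, a String or a type with a destructor, has no C equivalent and is usually handed out behind a pointer instead.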

The libopenraw API returned several types of "ref", which are just pointers to opaque types. The idea was always to only have opaque types through the API, which internally were C++ instances.

One of the key changes is that the only object that can be explicitly created with the API is ORRawFileRef, because it's the entry point. Very few objects need to be released; most are held by the containing objects.
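The opaque pointer pattern can be sketched as follows. This is not the libopenraw API: ExampleRawFile and the example_rawfile_* functions are hypothetical, and a real export would also carry the no_mangle attribute. The idea is that the C side only ever sees a pointer, which is created by one function and released by another.

```rust
/// Hypothetical opaque type: C code only ever sees a pointer to it.
pub struct ExampleRawFile {
    width: u32,
}

/// Create the object on the heap and hand ownership to the caller.
pub extern "C" fn example_rawfile_new(width: u32) -> *mut ExampleRawFile {
    Box::into_raw(Box::new(ExampleRawFile { width }))
}

/// Accessor: borrows the object, the caller keeps ownership.
pub extern "C" fn example_rawfile_width(file: *const ExampleRawFile) -> u32 {
    // SAFETY: the caller must pass null or a pointer obtained from
    // example_rawfile_new that has not been released yet.
    match unsafe { file.as_ref() } {
        Some(f) => f.width,
        None => 0,
    }
}

/// Release the object. Passing null is a no-op.
pub extern "C" fn example_rawfile_release(file: *mut ExampleRawFile) {
    if !file.is_null() {
        // SAFETY: same contract as above; the pointer must not be used
        // again after this call.
        drop(unsafe { Box::from_raw(file) });
    }
}
```

Objects that are owned by a container would only need accessors like the second function, with no release function of their own.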

Some other constraints of the C API directed some choices in the Rust API.

Raw files are no longer Box<> but Rc<>, due to the need to retain them for the iterator.²
Metadata::String() contains a Vec<u8> instead of a String to allow for the NUL terminated strings in the C API. They may not be NUL terminated, but they are ASCII.
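Both points can be sketched together. Everything here is hypothetical (ExampleFile, ExampleIter, next_c_str), and the sketch assumes the bytes are stored with their trailing NUL, which may not match what libopenraw actually does. The iterator holds its own Rc to the file, and a byte vector can be handed to C as a string only when it is valid as one.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::rc::Rc;

/// Hypothetical stand-in for a parsed file holding string metadata as bytes.
struct ExampleFile {
    /// Raw bytes as found in the file; a value may or may not end with NUL.
    values: Vec<Vec<u8>>,
}

/// Iterator that shares ownership of the file, so it stays valid even if
/// the caller's own handle is dropped first.
struct ExampleIter {
    file: Rc<ExampleFile>,
    index: usize,
}

/// Create an iterator that retains the file.
fn iter(file: &Rc<ExampleFile>) -> ExampleIter {
    ExampleIter {
        file: Rc::clone(file),
        index: 0,
    }
}

impl ExampleIter {
    /// Pointer to the next value as a C string. Returns null when the
    /// iterator is exhausted or the value is not a valid NUL terminated string.
    fn next_c_str(&mut self) -> *const c_char {
        let value = match self.file.values.get(self.index) {
            Some(v) => v,
            None => return std::ptr::null(),
        };
        self.index += 1;
        // Checks for exactly one NUL, at the end, without copying.
        match CStr::from_bytes_with_nul(value) {
            Ok(s) => s.as_ptr(),
            Err(_) => std::ptr::null(),
        }
    }
}
```

The returned pointer borrows from the file's own storage, which is why the iterator has to keep the file alive for as long as the caller may use it.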

New features

Meanwhile I added a few new features:

extracting the white balance on most files (several Nikon ones are not yet handled).
color conversion from the camera space to sRGB on rendering.
a multi-stage processing pipeline.

This is still not enough to have a complete processing pipeline, but it's a start. It goes towards the only two issues left in the issue tracker. Not that I don't expect more, but it's a nice goalpost.

Integration

I submitted a PR for Miniaturo to use the Rust version of the library. It is not ready to be merged, but it actually allowed me to fix a few API bits. The new Rust API is relatively close to the old one, which was the Rust bindings to the C API.

¹ See issue 13

² This is likely to change when I make libopenraw multi-thread compatible.