GSoC 2024: Raw Photograph Decoding in Rust #1771
-
Week 1 Report
General Information
Nearly all RAW formats are essentially TIFF files. TIFF is a format that stores metadata alongside the raw pixel data as pairs of tags and values, but the exact tags used vary by manufacturer, and even the raw data itself is stored in a layout that depends on the camera model and manufacturer. A good analogy is to compare TIFF with JSON, where each manufacturer uses a different JSON schema. EXIF is an extension of TIFF that keeps the same format but adds additional tags to support more use cases. The full list of tags can be found at EXIF Tags.
The raw data does not represent the image we are used to. Instead of having 3 channels, it has only a single channel that carries information about all 3 colors. The information is arranged in a pattern known as the Bayer Color Filter Array (CFA); a minimal sketch of how such a pattern is indexed is given at the end of this section. The process of converting this Bayer CFA to an RGB image is known as debayering or demosaicing. To convert a RAW image to an image bitmap, a pipeline of steps needs to be performed: decoding the raw data, subtracting the black level, scaling the colors, demosaicing, converting to RGB, and gamma correction, as described in the following weekly reports.
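As an illustration of the pattern described above, here is a minimal sketch of how the color of a CFA site can be determined, assuming an RGGB layout; the function name and return convention are illustrative and not part of raw-rs:

```rust
/// Color of the Bayer CFA site at (row, col) for an RGGB pattern:
/// even rows alternate R, G; odd rows alternate G, B.
/// Returns 0 = red, 1 = green, 2 = blue.
fn cfa_color_rggb(row: usize, col: usize) -> usize {
    match (row % 2, col % 2) {
        (0, 0) => 0,          // red
        (0, 1) | (1, 0) => 1, // green
        (1, 1) => 2,          // blue
        _ => unreachable!(),
    }
}

fn main() {
    // The top-left 2x2 block of an RGGB sensor reads R G / G B.
    assert_eq!(cfa_color_rggb(0, 0), 0);
    assert_eq!(cfa_color_rggb(0, 1), 1);
    assert_eq!(cfa_color_rggb(1, 0), 1);
    assert_eq!(cfa_color_rggb(1, 1), 2);
}
```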
Community Bonding Period
Tasks completed this week
Tasks for next week
-
Week 2 Report
General Information
Sony's RAW images follow an EXIF-based format known as ARW (Alpha RAW). This format has gone through multiple versions, each of which stores the data in a different manner. At this point in time, Raw-rs has added support for 3 versions of ARW:
ARW 1
This format allows for 12 bits per sample and stores the Bayer CFA in an interleaved layout so that all pixels of the same color are grouped together. It also uses a form of differential encoding, where only the differences between consecutive pixels of the same color are stored to reduce space (a minimal sketch of this idea is given at the end of this report).
ARW 2.3.1
This format also allows for 12 bits per sample and stores the Bayer CFA in an interleaved layout, but the major difference is that it uses lossy compression to store the values. The file also provides data to generate a tone curve, which applies a mapping similar to (but not exactly) an exponential/logarithmic transformation: exponential for decoding and logarithmic for encoding. This transformation reduces the range of values to be stored in the file, allowing them to be stored in fewer bits.
ARW 2.3.5
This format allows for 12 or 14 bits per sample depending on the camera model. It stores the Bayer CFA values as-is, in a byte-aligned manner with no interleaving or compression.
Tasks completed this week
Tasks for next week
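To illustrate the differential encoding idea mentioned for ARW 1, here is a minimal sketch of decoding a run of differences back into absolute values. It is a generic illustration under simplified assumptions (plain i32 deltas, no bitstream handling) and does not reflect the actual ARW 1 layout or the raw-rs API:

```rust
/// Decode a differentially encoded sequence: the first value is stored
/// absolutely, and every following entry is the difference from the
/// previous sample of the same color.
fn delta_decode(first: i32, deltas: &[i32]) -> Vec<i32> {
    let mut out = Vec::with_capacity(deltas.len() + 1);
    let mut prev = first;
    out.push(prev);
    for &d in deltas {
        prev += d;
        out.push(prev);
    }
    out
}

fn main() {
    // Small differences need fewer bits than the absolute 12-bit samples.
    let decoded = delta_decode(2048, &[3, -1, 0, 5]);
    assert_eq!(decoded, vec![2048, 2051, 2050, 2050, 2055]);
}
```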
-
Week 3 Report
Tasks completed this week
Tasks for next week
-
Week 4 Report
General Information
Moving forward in the pipeline, the next steps after decoding the raw data within the TIFF file are as follows:
Raw to Image
This is a simple step in which the raw data is converted to the standard form of a 3-channel RGB image, although only one channel per pixel actually contains information. This step also handles cropping of the image; the crop dimensions are usually provided in the metadata.
Subtract Black
Every camera sensor has some amount of zero error: even when no light reaches the sensor, it can still output a value greater than 0. This error is removed by subtracting a value from every pixel to bring it back to 0. This value is known as the black level and is usually present in the metadata of the raw file, typically as a single value for all pixels. If more accuracy is required, a dark frame is captured with the lens cap on so that no light enters, and this frame is used to subtract a value for each individual pixel.
Scale Colors
The color intensities recorded by the sensor differ from what the eye perceives; you can think of it as the camera sensor recording colors in a different color space than standard RGB. This step scales the colors to match the standard RGB color space, using a transformation matrix that is different for every camera model. The values in the raw file use 12 or 14 bits per sample, but the default RGB image uses 8 bits to represent a single color, so this step also scales the values to the expected bits per sample. A minimal sketch of the black subtraction and scaling is given at the end of this report.
Tasks completed this week
Tasks for next week
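A minimal sketch of the subtract black and scaling ideas described above, assuming a single black level for the whole image; the function and parameter names are illustrative, not the raw-rs API:

```rust
/// Subtract the black level and stretch the remaining range to 16 bits.
/// `black_level` and `max_value` come from the file metadata in practice.
fn subtract_black_and_scale(samples: &mut [u16], black_level: u16, max_value: u16) {
    let range = (max_value - black_level) as u32;
    for s in samples.iter_mut() {
        let v = s.saturating_sub(black_level) as u32;
        // Scale from [0, max_value - black_level] to [0, 65535].
        *s = (v * 65535 / range) as u16;
    }
}

fn main() {
    // A 12-bit sensor with a black level of 512 saturates at 4095.
    let mut samples = vec![400u16, 512, 4095];
    subtract_black_and_scale(&mut samples, 512, 4095);
    // Values at or below the black level clamp to 0, the maximum maps to 65535.
    assert_eq!(samples, vec![0, 0, 65535]);
}
```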
-
Week 5 Report
Tasks completed this week
Tasks for next week
-
Week 6 Report
Journal
After taking a break of 2 weeks, I resumed my work on creating raw-rs. In the first half of the week I was able to complete the algorithm for linear demosaicing, which simply takes the average of the neighboring pixels that have the color to be computed (a minimal sketch of this averaging is given at the end of this report). I ran the program on sample images to check the output, but the images were much darker than expected. After spending some time with the codebase, I realized that I had not performed a color space conversion that happens in post-processing. This was also when I realized that even libraw does not output correct images by default; it produces the expected output only when passed a parameter that tells it to use the camera's white balance instead of the one derived from the color matrix. I then moved on to the part of the code that loads the camera data.
Tasks completed this week
Tasks for next week
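A minimal sketch of the neighbor averaging used by linear demosaicing, shown here only for the green value at a red or blue site of an interior pixel; the names and layout are illustrative, not the raw-rs implementation:

```rust
/// Estimate the missing green value at a red or blue CFA site by
/// averaging its four orthogonal neighbors, which are all green in a
/// Bayer pattern. `cfa` is a single-channel image in row-major order.
/// Interior pixels only; a real implementation also handles borders.
fn green_at_rb(cfa: &[u16], width: usize, row: usize, col: usize) -> u16 {
    let idx = |r: usize, c: usize| r * width + c;
    let sum = cfa[idx(row - 1, col)] as u32
        + cfa[idx(row + 1, col)] as u32
        + cfa[idx(row, col - 1)] as u32
        + cfa[idx(row, col + 1)] as u32;
    (sum / 4) as u16
}

fn main() {
    // 3x3 patch of a single-channel CFA image; the center is a red site.
    let width = 3;
    let cfa = vec![
        10, 20, 30, //
        40, 50, 60, //
        70, 80, 90u16,
    ];
    // Neighbors of the center are 20, 40, 60, 80 -> average 50.
    assert_eq!(green_at_rb(&cfa, width, 1, 1), 50);
}
```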
-
Week 7 Report
Journal
In the previous week, I created a macro to load the camera data from all the toml files, but those toml files did not exist yet. This week, I continued by creating a script to extract the camera data from DNG files and store it in the form of toml files, which will also make it easier to add new models later on. The only step left is to obtain the DNG files in the first place. To achieve this, I downloaded sample RAW images of various camera models so they can be converted into DNG files using Adobe DNG Converter. Currently only the RAW images of half of all the models have been downloaded; I will do the rest next week. I was not able to put in 30 hours this week due to the start of a new semester at my college, and I plan to compensate by putting in more hours next week.
Tasks completed this week
Tasks for next week
-
Week 8 Report
Journal
Previously, only half of all the models had been downloaded; this week I downloaded all the remaining ones. Some camera models don't even have a RAW sample available, so I skipped them. I then ran all the DNG files through the script to extract the color matrices. Some models have "MODEL-NAME" in their model tag; since that information is of no use, I skipped them as well. In total, color matrices for 40 Sony camera models have been extracted. The images used in the test suite come from these 40 models, so no extra effort was required there. Finally, I resolved some small errors in #1796 and made it ready for review. Since the start of my college semester I have not been able to put in 30 hours per week consistently, which is why I have decided to extend the GSoC deadline so that only 20 hours per week are required from me from here on.
Tasks completed this week
Tasks for next week
-
Week 9 Report
Journal
I made some changes to the existing PR #1796 based on review, so that the camera matrix is stored in decimal format instead of integers. With that, the long-standing PR was finally merged. I then continued working on the next phase, post-processing, by implementing the Convert to RGB step (a minimal sketch of this step is given at the end of this report). But the result was not what I was expecting: the final image is still darker than required. I have yet to find the exact reason for this, which is what I will be doing next week.
Tasks completed this week
Tasks for next week
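A minimal sketch of what the Convert to RGB step does per pixel: multiplying by a 3x3 color conversion matrix. The function is illustrative, not the raw-rs API, and the identity matrix is used only to keep the example self-checking; the real matrices differ per camera model:

```rust
/// Apply a 3x3 color-conversion matrix to a single RGB pixel.
/// In the real pipeline the matrix maps the camera's color space to sRGB.
fn convert_pixel(m: [[f32; 3]; 3], rgb: [f32; 3]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    for i in 0..3 {
        out[i] = m[i][0] * rgb[0] + m[i][1] * rgb[1] + m[i][2] * rgb[2];
    }
    out
}

fn main() {
    // With the identity matrix the pixel is unchanged; a real
    // camera-to-sRGB matrix mixes the channels.
    let identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];
    assert_eq!(convert_pixel(identity, [0.2, 0.5, 0.8]), [0.2, 0.5, 0.8]);
}
```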
-
Week 10 Report
Journal
In the past week, I was able to find the reason for the incorrect images: I had applied the gamma correction step after the scaling from 16 bits to 8 bits, whereas the opposite order should have been followed. Hence I added a new step in the pipeline named gamma correction. It is slightly more sophisticated than a simple exponentiation: it involves calculating the histogram of the image, which is used to generate the gamma curve table. This table is used to apply the transformation, and finally the values are converted from 16 bits to 8 bits with a simple bit shift (a minimal sketch of this is given at the end of this report). With all the above steps done, the entire pipeline is complete and raw-rs is ready to be used for generating final images for a very small portion of Sony cameras. Here is an output from raw-rs for the file blossoms.arw, which is used as a test case in the CI:
Now that the pipeline is complete, my next steps will be to implement some cases in previous steps that I missed, like using the white balance data from the file's metadata and fixing the orientation of the image. Eventually, support for almost all Sony cameras will be added and each step in the pipeline will become its own node in Graphite's node graph.
Tasks completed this week
Tasks for next week
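A minimal sketch of applying a gamma curve through a lookup table before the 16-to-8-bit shift. raw-rs derives its curve with the help of the image histogram, which this sketch replaces with a plain power curve; all names are illustrative:

```rust
/// Build a simple gamma lookup table for 16-bit values.
fn gamma_table(gamma: f32) -> Vec<u16> {
    (0..=u16::MAX)
        .map(|v| {
            let x = v as f32 / u16::MAX as f32;
            (x.powf(1.0 / gamma) * u16::MAX as f32) as u16
        })
        .collect()
}

/// Gamma-correct in 16 bits first, then drop to 8 bits with a bit shift.
fn apply_gamma_and_to_8bit(samples: &[u16], table: &[u16]) -> Vec<u8> {
    samples
        .iter()
        .map(|&v| (table[v as usize] >> 8) as u8)
        .collect()
}

fn main() {
    let table = gamma_table(2.2);
    let out = apply_gamma_and_to_8bit(&[0, 32768, 65535], &table);
    // Mid-grey is lifted well above 128 by the gamma curve.
    assert_eq!(out.len(), 3);
    assert!(out[1] > 128);
}
```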
-
Week 11 Report
Journal
In the past week I implemented code to extract the white balance parameters from the camera metadata and use them instead of deriving them from the camera matrix, within PR #1941. The PR has already been merged. The same blossoms.arw image now looks much better, with more accurate colors:
After this, I started implementing the image transforms that need to be applied based on the orientation of the camera, like flips and rotations, in #1954 (a minimal sketch of how the orientation tag maps to these transforms is given at the end of this report). Only half of this task is currently complete: the code for extracting the camera orientation from the metadata is done, and applying the transformation is left, which will be done next week.
Tasks completed this week
Tasks for next week
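A minimal sketch of how the EXIF orientation tag maps to flip and rotation operations. The enum and function here are illustrative and are not the raw-rs types (raw-rs has its own Transform enum):

```rust
/// Simplified mapping from the EXIF orientation tag to the transform
/// that has to be applied before display.
#[derive(Debug, PartialEq)]
enum Orientation {
    None,
    FlipHorizontal,
    Rotate180,
    FlipVertical,
    Transpose,
    Rotate90,
    Transverse,
    Rotate270,
}

fn orientation_from_exif(tag: u16) -> Orientation {
    match tag {
        2 => Orientation::FlipHorizontal,
        3 => Orientation::Rotate180,
        4 => Orientation::FlipVertical,
        5 => Orientation::Transpose,
        6 => Orientation::Rotate90,
        7 => Orientation::Transverse,
        8 => Orientation::Rotate270,
        // 1 means "already upright"; unknown values are left untouched.
        _ => Orientation::None,
    }
}

fn main() {
    // A photo taken in portrait mode commonly stores orientation 6,
    // meaning it must be rotated 90° clockwise for display.
    assert_eq!(orientation_from_exif(6), Orientation::Rotate90);
}
```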
-
Week 12 Report
Journal
In the past week I completed the step that transforms the image, and its corresponding PR #1954 has also been merged. After that, I had a discussion with Keavon and TrueDoctor regarding the final architecture of the steps in raw-rs, to maximize performance while still keeping it modular enough for different kinds of use cases. The structure of its equivalent Graphite node was also discussed. Currently every step requires a full loop through the image to do its operation; the new architecture will minimize the number of loops by grouping operations in closures, similar to how Rust iterators work under the hood (a minimal sketch of this idea is given at the end of this report). From here on, the focus will shift to maximizing performance and adding support for as many cameras as possible. The time taken to run a single test image was very high and would definitely not scale well as more are added, so I changed the code to run the tests in parallel in #1968. This, along with the performance benefits of the new architecture, should make it easy to scale.
Tasks completed this week
Tasks for next week
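A minimal sketch of the fused-loop idea discussed above: each step becomes a per-pixel closure and all closures are applied inside one pass over the image. The names and the two example steps are illustrative, not the raw-rs API:

```rust
/// Apply two per-pixel steps in a single pass over the image instead of
/// looping over the image once per step.
fn apply_two<F, G>(image: &mut [u16], mut subtract_black: F, mut scale: G)
where
    F: FnMut(u16) -> u16,
    G: FnMut(u16) -> u16,
{
    for pixel in image.iter_mut() {
        *pixel = scale(subtract_black(*pixel));
    }
}

fn main() {
    let mut image = vec![512u16, 1024, 2048];
    let black_level = 512u16;
    apply_two(
        &mut image,
        move |v| v.saturating_sub(black_level),
        |v| v.saturating_mul(16), // e.g. stretch a 12-bit range toward 16 bits
    );
    assert_eq!(image, vec![0, 8192, 24576]);
}
```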
-
Week 13 Report
Journal
After taking a leave of 3 weeks due to exams at my college, I have continued working from where I left off. This week I changed how each step in the pipeline functions. Previously each step was a function that looped over the entire image on its own; now each step returns a closure, so that multiple steps can be applied together in a single loop, as discussed in the Week 12 report. The final API looks like:
```rust
let subtract_black = raw_image.subtract_black_fn();
let scale_white_balance = raw_image.scale_white_balance_fn();
let scale_to_16bit = raw_image.scale_to_16bit_fn();
let raw_image = raw_image.apply((subtract_black, scale_white_balance, scale_to_16bit));
let convert_to_rgb = raw_image.convert_to_rgb_fn();
let mut record_histogram = raw_image.record_histogram_fn();
let image = raw_image.demosaic_and_apply((convert_to_rgb, &mut record_histogram));
let gamma_correction = image.gamma_correction_fn(&record_histogram.histogram);
if image.transform == Transform::Horizontal {
	image.apply(gamma_correction)
} else {
	image.transform_and_apply(gamma_correction)
}
```
The scale colors step has also been split into two separate steps: scale_white_balance and scale_to_16bit. This was done because scale_to_16bit is a compulsory step before demosaicing with little room for customization, whereas scale_white_balance can be heavily customized by the user. With that, PR #1972 was made ready for review.
Tasks completed this week
Tasks for next week
-
My name is Elbert Ronnie and I am excited to start contributing to Graphite. Throughout the 3 months of GSoC, I will be working on creating a RAW photograph decoder in Rust that will help convert raw photographs from different cameras into image bitmaps.
Synopsis
Most cameras capture photos in a RAW format before they are processed into JPEG or PNG. The most well-known open source library for loading RAW files is LibRaw, but it is written in C++. All RAW decoding libraries in the Rust ecosystem are GPL-licensed and therefore cannot be used in Graphite, which uses the Apache 2 license. This project aims to create a new Rust library that provides an alternative to LibRaw so that Graphite can directly import RAW files.
Benefits
Users of Graphite will be able to directly import RAW files into the editor without going through a conversion process in external tools. This will also bring a permissively licensed RAW parser to the Rust ecosystem, which could benefit many other image processing applications.
Deliverables
GSoC 2024 Final Report
Create new library Raw-rs including a basic TIFF decoder #1757
In this PR, all the general code required to parse and extract metadata from TIFF files was committed to the repository by creating a new library named `raw-rs`. Likewise, the code for reading the raw data of uncompressed ARW files was also included. A test runner for raw files was also created that checks whether the decoded raw data from this library matches the output from libraw exactly.
Raw-rs: make decoder for ARW1 and ARW2 formats #1775
After the decoder for the uncompressed ARW format was created, this PR continued the work from the previous PR by adding support for decoding the ARW 1 and ARW 2 formats. It also included some notable changes to the API of the TIFF decoder: a new derive macro was created that can be used to specify all the tags to be extracted at once.
Raw-rs: Add preprocessing and demosaicing steps #1796
This PR adds the subtract black step, the scale colors step (scale white balance + scale to 16-bit), and the demosaicing step to the raw image processing pipeline. The linear demosaicing algorithm was used to implement the demosaicing step. For the scale colors step, the white balance needs to be derived from a matrix that converts the camera's color space to sRGB; this matrix is constant for a particular camera model. This PR also adds the matrices of 40 Sony camera models in the form of toml files, and a new procedural macro was created that loads the data from the toml files and includes it as part of the binary.
Raw-rs: add post-processing steps #1923
This PR adds the convert to RGB step and the gamma correction step to the raw image processing pipeline. With this PR merged, the entire core of the raw image processing pipeline was complete, and the library could be used to convert actual raw images into image bitmaps, although only for a few camera models. A sample image that was used in the tests is given below:
Raw-rs: use camera white balance when available #1941
This PR adds code to read the camera's white balance data from the metadata of the TIFF file. Some files contain this information while others don't, so this PR changes the white balance selection strategy to use the white balance from the metadata if it is available and fall back to calculating it from the color space conversion matrix if it is not. The same image after using the white balance from the metadata is given below:
Raw-rs: Flip and rotate image based on camera orientation #1954
This PR adds a final optional step to the image processing pipeline: the transform step. It reads the orientation from the metadata and rotates and flips the image accordingly.
Raw-rs: Refactor to run multiple steps in a single loop #1972
This PR improves the performance of the processing steps by reducing the number of times the pipeline has to loop through the image. The library's external API was changed so that it combines multiple steps into a single loop wherever possible while still providing the same level of flexibility to the user. The final API is shown in the Week 13 report above.