cs180: proj1

Images of the Russian Empire — Colorizing the Prokudin-Gorskii Photo Collection

Overview / Process

Goal: Given 3 glass negatives stacked vertically, recreate the full colored image!

I divided the stack into its 3 image channels (BGR from top to bottom) and cropped each by an arbitrary amount to get rid of the borders. For the smaller images, I computed the displacement vectors with the sum of squared differences (SSD): I ran an exhaustive search over a small window and, for each target channel, chose the displacement vector that minimized the SSD value.

For the bigger images, I implemented the image pyramid technique. I created new levels for each target channel until their dimensions were less than 128 pixels. For some of the large images, this meant depths of 6-7. I originally used SSD here as well. It worked great for most of my images, but for some (like self_portrait.tif) the channels differed in brightness, which caused issues. I switched to normalized cross-correlation (NCC) to better address this, choosing the displacement vectors that maximized NCC (i.e., maximized similarity) between the source and target. I then went back and used NCC to process the smaller images, but the results were nearly identical to SSD’s.
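The single-scale part of this process can be sketched roughly as follows. This is a minimal illustration, not my exact code: the function names, the default search radius of 15, the mean subtraction inside NCC, and the use of np.roll (which wraps pixels around the edges instead of discarding them) are all assumptions made for the example.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate into (blue, green, red) thirds."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(a, b):
    """Normalized cross-correlation: higher means more similar.
    Subtracting the mean first is an assumption of this sketch."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def align_exhaustive(source, target, radius=15, metric="ncc"):
    """Try every (dy, dx) in a square window and return the shift of
    `target` that best matches `source`."""
    best_shift, best_score = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the edges, which is why cropping
            # the borders beforehand matters.
            shifted = np.roll(target, (dy, dx), axis=(0, 1))
            if metric == "ssd":
                score = -ssd(source, shifted)  # negate so larger is better
            else:
                score = ncc(source, shifted)
            if best_score is None or score > best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Because NCC divides by the norms of both channels, a channel that is uniformly brighter or darker scores the same as the original, which is one plausible reason it handled the brightness differences better than SSD.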

One debugging practice I found extremely useful was displaying my source, overlay, and aligned overlay images at each pyramid level. This served as a sanity check and helped me see at around what image size my displacement vector computation started going awry.
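A coarse-to-fine pyramid with a per-level debugging hook might look something like the sketch below. The names, the 2x2 block averaging used for downsampling, the search radii, and the optional log callback are illustrative assumptions. The only detail taken from my process is that recursion stops once a level is smaller than 128 pixels, which I interpret here as the shorter side.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(source, target, center, radius):
    """Exhaustive NCC search in a window centered on `center`."""
    best, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(source, np.roll(target, (dy, dx), axis=(0, 1)))
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def align_pyramid(source, target, min_size=128,
                  coarse_radius=8, refine_radius=2, log=None):
    """Estimate the shift at the coarsest level, then double and refine
    it at each finer level. `log`, if given, is called per level."""
    if min(source.shape) < min_size:
        shift = search(source, target, (0, 0), coarse_radius)
    else:
        coarse = align_pyramid(downsample(source), downsample(target),
                               min_size, coarse_radius, refine_radius, log)
        # One pixel at the coarser level is two pixels at this level.
        center = (2 * coarse[0], 2 * coarse[1])
        shift = search(source, target, center, refine_radius)
    if log is not None:
        log(source.shape, shift)  # e.g. print, or plot an overlay here
    return shift
```

Passing log=print gives the image size and displacement at every level, coarsest first, which makes it easy to spot the level where the estimate first goes wrong.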

I was curious to see what would happen if I didn’t crop before running SSD/NCC and found that it greatly affected my alignments. The black and white pixels of the borders definitely interfered with my calculations, which cemented my intuition for cropping in the pre-processing stage.
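Cropping a fixed fraction from every side is one simple way to keep the borders out of the metric. The function name and the 10% default below are hypothetical. The amount I actually cropped was arbitrary.

```python
import numpy as np

def crop_border(img, frac=0.1):
    """Trim `frac` of the height and width from every side."""
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]
```

The displacement can be estimated on the cropped channels and then applied to the uncropped ones, since trimming the same margin from every channel does not change their relative offset.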

Hiccups

When working with the smaller images, I immediately found substantial alignment differences depending on which channel I used as my source. The project overview suggested using the blue channel as the source and aligning green to blue, then red to blue. With my implementations of SSD and NCC, using the green channel as my source produced better results. Here’s an example with cathedral.jpg:

source: blue channel

source: red channel

With the bigger images, I repeatedly ran into formatting errors at the end when stacking my three channels. With some debugging, I realized that, because I had pivoted to using NCC, I needed to normalize my aligned image before stacking.
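One way to handle this kind of normalization is to rescale each aligned channel back into a displayable range before stacking. The sketch below is an assumption about what such a fix could look like, not my exact code. The function names are hypothetical, and per-channel min-max rescaling can shift the color balance slightly.

```python
import numpy as np

def to_unit_range(channel):
    """Linearly rescale a channel so its values span [0, 1]."""
    channel = channel.astype(np.float64)
    lo, hi = channel.min(), channel.max()
    if hi == lo:
        return np.zeros_like(channel)  # flat channel: nothing to rescale
    return (channel - lo) / (hi - lo)

def stack_rgb(red, green, blue):
    """Stack three aligned channels into an H x W x 3 uint8 RGB image."""
    rgb = np.dstack([to_unit_range(c) for c in (red, green, blue)])
    return (rgb * 255).round().astype(np.uint8)
```

The resulting array has a dtype and value range that image-saving and display functions generally accept.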

Here are some random images I produced along the way, many of which came down to algorithmic issues with determining the pyramid levels’ displacement vectors.