Project 1: Colorizing the Prokudin-Gorskii photo collection

Evan Huang

Approach:

I tried several alignment metrics and algorithm parameters. The metrics included L2 distance, normalized cross-correlation (NCC), structural similarity (SSIM), and phase cross-correlation. SSIM and phase cross-correlation were the most effective on single-scale images, but were much too computationally expensive for the larger .tif files. My solution was to use full SSIM only on the base layer of the image pyramid, i.e. the coarsest (and thus smallest) image in the stack, and to use less computationally demanding metrics on the larger layers. On those layers I first used NCC, but the results were often blurry, so I switched back to SSIM restricted to a 500x500 window at the center of the image to avoid arduous computation (I refer to this as SSIM_small). All images are cropped by a constant amount before any processing is done.
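As a rough illustration of the cheaper metrics described above, here is a minimal sketch of NCC and of the "score only the central window" idea behind SSIM_small. The function names (`ncc`, `center_crop`, `score_small`) are hypothetical and not taken from my project code, and the sketch plugs NCC into the windowed score as a stand-in; an SSIM implementation could be passed as `metric` instead.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized grayscale arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A constant image has zero variance; treat that as "no correlation".
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def center_crop(img, size=500):
    """Return the central size x size window (or the whole image if smaller)."""
    h, w = img.shape[:2]
    ch, cw = min(size, h), min(size, w)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def score_small(a, b, metric=ncc, size=500):
    """Evaluate `metric` on the central window only, to bound the cost."""
    return metric(center_crop(a, size), center_crop(b, size))
```

Because the window size is fixed, the cost of one score evaluation no longer grows with the resolution of the .tif file, which is the point of the restriction.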

My pyramid algorithm recursively downsamples the input image by a factor of 0.8 until the image is smaller than 300x300 pixels. On this smallest image it searches a [-30,30] window of displacements for the one that maximizes SSIM (disregarding a 30-pixel border). Starting from the optimal displacement found at this layer, each following layer checks a 2-pixel window to maximize either NCC or SSIM_small. I experimented with many different window sizes and numbers of layers and found that this combination gave a good balance of accuracy and speed. Processing all of the images took less than 10 minutes in total, even with edge detection.
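The coarse-to-fine search described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not my actual code: it uses NCC at every layer instead of SSIM and SSIM_small, a nearest-neighbour `rescale` instead of a library resize, `np.roll` for shifting, the same border at every layer, "smaller than 300x300" read as both dimensions below 300, and it assumes that the displacement from a coarser layer is divided by the downsampling factor before being refined. All function and parameter names are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation (stand-in for the SSIM-based scores)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def rescale(img, factor):
    """Nearest-neighbour resize; a real pipeline would likely use a library."""
    h, w = img.shape
    new_h = max(1, int(round(h * factor)))
    new_w = max(1, int(round(w * factor)))
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return img[np.ix_(rows, cols)]

def search(ref, mov, center, radius, border, score):
    """Try every shift within `radius` of `center`; keep the best score."""
    h, w = ref.shape
    inner = (slice(border, h - border), slice(border, w - border))
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(mov, (dy, dx), axis=(0, 1))
            s = score(ref[inner], shifted[inner])  # ignore the border
            if s > best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift

def pyramid_align(ref, mov, factor=0.8, min_size=300, coarse_radius=30,
                  fine_radius=2, border=30, score=ncc):
    """Return the (dy, dx) shift of `mov` that best aligns it to `ref`."""
    if max(ref.shape) < min_size:
        # Coarsest layer: exhaustive search over the large window.
        return search(ref, mov, (0, 0), coarse_radius, border, score)
    coarse = pyramid_align(rescale(ref, factor), rescale(mov, factor),
                           factor, min_size, coarse_radius, fine_radius,
                           border, score)
    # Assumption: rescale the coarser estimate to this layer's resolution.
    center = (int(round(coarse[0] / factor)), int(round(coarse[1] / factor)))
    return search(ref, mov, center, fine_radius, border, score)
```

With a factor of 0.8, a rounding error of at most one pixel at one layer grows to at most about 1.75 pixels at the next, which is why a small refinement window at each layer can be enough in this sketch.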

I also used edge detection to improve the similarity metrics, applying a Sobel filter to the relevant layers. This helped some of the images (most notably Emir, where it fixed the red channel) but made little difference in others.
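For reference, a Sobel edge map can be computed along these lines. This is a minimal NumPy-only sketch with a hypothetical function name (`sobel_magnitude`) and edge-replicating padding chosen for illustration; an existing library filter would work just as well.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels, same shape as `img`."""
    img = np.asarray(img, dtype=float)
    p = np.pad(img, 1, mode="edge")  # replicate the border pixels
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)
```

The edge maps of two channels are then compared in place of the raw intensities. A plausible reason this helps is that edges depend less on how bright an object appears in each color channel, although I did not verify this separately.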

Results for SSIM with/without edge detection

Image	SSIM_small	SSIM_small w/ edge detection
Cathedral	R: (12, 3) G: (5, 2)	R: (12, 3) G: (5, 2)
Church	R: (26, -7) G: (23, 4)	R: (28, -2) G: (24, 4)
Emir	R: (17, -325) G: (47, 18) The red channel is still skewed.	R: (88, 34) G: (50, 23) Using edge detection fixed the red channel's issue.
Harvesters	R: (102, 6) G: (57, 18)	R: (102, 15) G: (49, 18)
Icon	R: (61, 23) G: (40, 18)	R: (76, 24) G: (40, 18)
Lady	R: (86, 1) G: (47, 7)	R: (97, 9) G: (49, 1)
Melons	R: (158, 8) G: (81, 10)	R: (158, 14) G: (74, 6)
Monastery	R: (3, 2) G: (-3, 2)	R: (3, 2) G: (-3, 2)
Onion Church	R: (88, 37) G: (50, 26)	R: (95, 37) G: (50, 26)
Sculpture	R: (120, -27) G: (33, -10)	R: (121, -27) G: (34, -10)
Self Portrait	R: (149, 34) G: (76, 26)	R: (151, 32) G: (66, 19)
Three Generations	R: (97, 11) G: (52, 16)	R: (91, 6) G: (52, 16)
Tobolsk	R: (6, 3) G: (3, 3)	R: (6, 3) G: (3, 2)
Train	R: (74, 22) G: (42, 7)	R: (66, 34) G: (44, 3)

Results for NCC with/without edge detection

Image	NCC	NCC w/ edge detection
Cathedral	R: (12, 3) G: (5, 2)	R: (12, 3) G: (5, 2)
Church	R: (27, -17) G: (24, 1)	R: (28, -3) G: (24, 4)
Emir	R: (15, -325) G: (43, 3) The red channel is very skewed here. I suspect it is due to the repeated pattern on the shirt inflating SSIM, since I found this issue when I directly tested the base (coarsest) layer.	R: (94, 39) G: (50, 23) Using edge detection fixed the red channel's issue.
Harvesters	R: (102, 7) G: (58, 16)	R: (102, 14) G: (60, 17)
Icon	R: (63, 23) G: (40, 17)	R: (78, 23) G: (42, 17)
Lady	R: (88, 1) G: (47, 8)	R: (97, 10) G: (49, 9)
Melons	R: (158, 2) G: (81, 10)	R: (158, 12) G: (74, 0)
Monastery	R: (3, 2) G: (-3, 2)	R: (3, 2) G: (-3, 2)
Onion Church	R: (88, 37) G: (51, 26)	R: (96, 37) G: (51, 26)
Sculpture	R: (121, -26) G: (33, -10)	R: (121, -26) G: (33, -17)
Self Portrait	R: (151, 33) G: (76, 26)	R: (152, 32) G: (63, 15)
Three Generations	R: (91, 11) G: (54, 14)	R: (91, 8) G: (54, 12)
Tobolsk	R: (6, 3) G: (3, 3)	R: (6, 3) G: (3, 2)
Train	R: (76, 27) G: (44, 6)	R: (66, 35) G: (43, 8)