Evan Huang
I tried several approaches to aligning the channels, varying both the similarity metric and the algorithm parameters. The metrics I tested were L2 distance, normalized cross-correlation (NCC), structural similarity (SSIM), and phase cross-correlation. SSIM and phase cross-correlation were the most effective on single-scale images, but were far too computationally expensive for the larger .tif files. My solution was to use SSIM only on the coarsest (and thus smallest) layer of the image pyramid, and to use less computationally demanding metrics on the larger layers. On the larger layers I first used NCC, but the results were often blurry, so I switched back to SSIM restricted to a 500x500 window at the center of the image to keep the computation manageable (I refer to this as SSIM_small). All images are cropped by a constant amount before any processing is done.
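To make the cheaper metrics and the SSIM_small cropping idea concrete, here is a minimal NumPy sketch. It is illustrative rather than the exact code used for these results: the function names `l2_distance`, `ncc`, and `center_window` are hypothetical, and SSIM itself is left out (the cropped window would simply be handed to whichever SSIM implementation is in use).

```python
import numpy as np

def l2_distance(a, b):
    """Sum of squared differences; lower means more similar."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation; higher means more similar (max 1.0)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Flat images have zero variance; treat them as uncorrelated.
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def center_window(img, size=500):
    """Return the central size x size region (whole image if it is smaller)."""
    h, w = img.shape
    top = max((h - size) // 2, 0)
    left = max((w - size) // 2, 0)
    return img[top:top + size, left:left + size]
```

Because NCC subtracts the mean and divides by the norms, it is insensitive to overall brightness and contrast differences between channels, which L2 distance is not.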
My pyramid algorithm recursively downsamples the input image by a factor of 0.8 until the image is smaller than 300x300 pixels. For this smallest image, it searches a [-30,30] window of displacements for the one that maximizes SSIM (disregarding a 30-pixel border). Starting from the optimal displacement found at this layer, each following layer searches a 2-pixel window to maximize either NCC or SSIM_small. I experimented with many different window sizes and numbers of layers and found that this combination gave a good balance of accuracy and speed. Processing all of the images took under 10 minutes in total, even with edge detection.
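The coarse-to-fine search described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the exact code used here: `pyramid_align`, `best_shift`, and `rescale` are hypothetical names, the "2-pixel window" is read as ±2 pixels, downsampling is plain nearest-neighbor instead of a proper anti-aliased resize, and NCC stands in for SSIM at every layer (any function of two images can be passed as `score`).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation; higher means more similar."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def rescale(img, factor):
    """Nearest-neighbor resize (assumption: stand-in for a real resize)."""
    h, w = img.shape
    new_h = max(int(round(h * factor)), 1)
    new_w = max(int(round(w * factor)), 1)
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return img[np.ix_(rows, cols)]

def best_shift(ref, mov, center, radius, score, border):
    """Try every shift within `radius` of `center`; return the best (dy, dx)."""
    cy, cx = center
    # Crop enough border to hide the wrap-around introduced by np.roll.
    b = max(border, abs(cy) + radius, abs(cx) + radius)
    inner = (slice(b, ref.shape[0] - b), slice(b, ref.shape[1] - b))
    best, best_score = center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(mov, (dy, dx), axis=(0, 1))
            s = score(ref[inner], shifted[inner])
            if s > best_score:
                best, best_score = (dy, dx), s
    return best

def pyramid_align(ref, mov, score=ncc, factor=0.8, min_size=300,
                  coarse=30, fine=2, border=30):
    """Return the (dy, dx) roll of `mov` that best matches `ref`."""
    if max(ref.shape) < min_size:
        # Coarsest layer: exhaustive search over the wide window.
        return best_shift(ref, mov, (0, 0), coarse, score, border)
    dy, dx = pyramid_align(rescale(ref, factor), rescale(mov, factor),
                           score, factor, min_size, coarse, fine, border)
    # Scale the coarse estimate up to this layer, then refine locally.
    guess = (int(round(dy / factor)), int(round(dx / factor)))
    return best_shift(ref, mov, guess, fine, score, border)
```

A possible usage, with `red` and `blue` as hypothetical 2-D float arrays of the same shape: `dy, dx = pyramid_align(blue, red)` followed by `np.roll(red, (dy, dx), axis=(0, 1))`. The wide search only ever runs on the smallest image, so the cost of the finer layers is a fixed 25 score evaluations each.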
I also used edge detection to improve the similarity metrics, applying a Sobel filter to the relevant layers. This helped on some of the images, but made little difference on others.
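For reference, a Sobel gradient magnitude can be computed with a few array slices. The sketch below is a NumPy-only illustration with a hypothetical function name (`sobel_magnitude`), not necessarily the filter implementation used for these results; a library routine such as `skimage.filters.sobel` would serve the same purpose, though its output scaling may differ.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels (edge-padded borders)."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)
```

The metrics are then computed on the edge maps of the two channels rather than on the raw intensities, so the comparison depends on where structures are and not on how bright each channel happens to be.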
[Image comparisons: SSIM_small vs. SSIM_small w/ edge detection; NCC vs. NCC w/ edge detection]