DCSH - Matching Patches in RGBD Images

Yaron Eshet1, Simon Korman1, Eyal Ofek2, Shai Avidan1

DCSH: (a) Each image pixel represents a point in 3D space and a normal direction, according to the depth map. (b) Each world patch (Pi) is projected to some quadrilateral (pi) on the image plane by some homography (Hi). The fact that world patches are repetitive in the scene is used to guide the search for similar (projected) patches in the image plane.


[1] School of Engineering, Tel-Aviv University
[2] Microsoft Research, Redmond

Paper

"DCSH - Matching Patches in RGBD Images" [pdf]
Yaron Eshet, Simon Korman, Eyal Ofek, Shai Avidan
ICCV 2013, Sydney



Abstract

We extend patch-based methods to work on patches in 3D space. We start with Coherency Sensitive Hashing (CSH), an algorithm for matching patches between two RGB images, and extend it to work with RGBD images. This is done by warping all 3D patches to a common virtual plane in which CSH is performed. To avoid noise due to the warping of patches of various normals and depths, we estimate a group of dominant planes and compute CSH on each plane separately, before merging the matching patches. The result is DCSH - an algorithm that matches world (3D) patches in order to guide the search for image-plane matches. An independent contribution is an extension of CSH, which we term Social-CSH. It allows a major speedup of the k nearest neighbor (kNN) version of CSH, its runtime growing linearly, rather than quadratically, in k. Social-CSH is used as a subcomponent of DCSH when many NNs are required, as in the case of image denoising. We show the benefits of using depth information for image reconstruction and image denoising, demonstrated on several RGBD images.




Table of Contents:

Further details on fronto-parallel view simulation

Reconstruction Examples

Denoising Examples



Further details on fronto-parallel view simulation



For each (green) pixel a in the image, which is the projection of a world location ra with normal na, we simulate its local appearance as if the surface were captured from a fronto-parallel view at a distance of zref. We construct a homography Ha for sampling the appearance from this viewpoint. In the first stage, a simple homography ha is computed, to which an in-plane orientation normalization operation is added, giving the final homography Ha.

1] Calculating ha: Consider an origin-based world patch (blue, on left), which faces the z axis. We rotate it in 3D to face the associated normal na, translate it to the associated 3D location ra, and finally project it to a quadrilateral on the image plane. The combined transformation defines a homography ha.

2] Calculating Ha: The inverse homography ha−1 can be used to sample a normalized patch (green) around the pixel. We then rotate the normalized patch in-plane (by a rotation matrix Ra) so that it faces its dominant RGB texture orientation (white arrow). A new normalized patch can then be sampled using Ra−1·ha−1.
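The two steps above can be sketched in NumPy. This is a minimal illustration, not the paper's code: it assumes a pinhole camera with intrinsics K, parameterizes the patch by an orthonormal tangent basis of the plane at ra (a standard plane-induced homography construction), and all function names are ours.

```python
import numpy as np

def plane_basis(n):
    """Orthonormal tangent basis (b1, b2) of the plane with unit normal n."""
    n = n / np.linalg.norm(n)
    # Pick a helper axis that is not (nearly) parallel to n.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    b1 = np.cross(n, helper)
    b1 /= np.linalg.norm(b1)
    b2 = np.cross(n, b1)
    return b1, b2

def homography_ha(K, r_a, n_a, z_ref):
    """h_a: maps fronto-parallel patch coordinates (u, v, 1), defined on the
    tangent plane at world point r_a, to homogeneous image coordinates.
    A world point on the patch is X = r_a + u*b1*z_ref + v*b2*z_ref, so its
    projection is K @ X, which is linear in (u, v, 1) -- a homography.
    Scaling the basis by z_ref normalizes the apparent patch size to a
    common reference distance."""
    b1, b2 = plane_basis(n_a)
    return K @ np.column_stack([b1 * z_ref, b2 * z_ref, r_a])

def inplane_rotation(theta):
    """R_a: in-plane rotation (as a homography) aligning the patch with its
    dominant texture orientation theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])
```

The final normalized patch would then be sampled through Ra−1·ha−1, e.g. `np.linalg.inv(R_a) @ np.linalg.inv(h_a)` applied to patch coordinates (in practice, via an inverse-warping image resampler).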



Reconstruction Examples


In each of the 5 reconstruction examples below, the 'source image' (top left) is reconstructed in a simple Non-Local-Means (NLM) fashion (see Section 5.3 in paper) using only the 'target image' (top middle) and a (dense) Nearest-Neighbor-Field (NNF) between the source and the target images.

We compare three different kinds of NNFs within this reconstruction pipeline (see paper for details).

The reconstruction results are shown in 'reconstructed source image' (bottom row) together with the resulting 'reconstruction RMSE' (top right), where darker blue signifies lower reconstruction errors.

Note: Black areas in the reconstruction image are areas where depth information wasn't available.
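The reconstruction scheme described above can be sketched as follows. This is a simplified illustration of NLM-style reconstruction from a dense NNF, not the paper's exact Section 5.3 pipeline: each source patch is replaced by its matched target patch, and overlapping contributions are averaged per pixel; function names and the patch size are our choices.

```python
import numpy as np

def reconstruct_from_nnf(target, nnf, patch=8):
    """Reconstruct a source image from `target` using a dense NNF.
    nnf[y, x] = (ty, tx) gives the top-left corner of the target patch
    matched to the source patch whose top-left corner is (y, x).
    Overlapping patch contributions are averaged per pixel."""
    H, W = nnf.shape[:2]
    acc = np.zeros((H + patch - 1, W + patch - 1, target.shape[2]))
    cnt = np.zeros((H + patch - 1, W + patch - 1, 1))
    for y in range(H):
        for x in range(W):
            ty, tx = nnf[y, x]
            acc[y:y + patch, x:x + patch] += target[ty:ty + patch, tx:tx + patch]
            cnt[y:y + patch, x:x + patch] += 1
    return acc / cnt

def rmse_map(a, b):
    """Per-pixel RMSE over color channels (the 'reconstruction RMSE' maps
    shown below visualize such an error image)."""
    return np.sqrt(np.mean((a.astype(float) - b.astype(float)) ** 2, axis=2))
```

With a perfect NNF into an identical target, the reconstruction reproduces the source exactly and the RMSE map is zero; errors appear where the target lacks matching content.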


Building

[source image | target image | reconstruction RMSE]
[reconstructed source image]


pattern1

[source image | target image | reconstruction RMSE]
[reconstructed source image]


pattern2

[source image | target image | reconstruction RMSE]
[reconstructed source image]


dress

[source image | target image | reconstruction RMSE]
[reconstructed source image]


rocks

[source image | target image | reconstruction RMSE]
[reconstructed source image]





Denoising Examples (iPhone dataset)


In each of the denoising examples below, the 'noisy image' (top middle) was created by adding white Gaussian noise with standard deviation σ = 25 gray levels to each color channel of the 'clean image' (top left).

We compare four different denoising pipelines (see paper for the details).

The denoising results are shown in 'denoised image' (bottom row) together with the resulting 'denoising RMSE' (top right), where darker blue signifies lower denoising errors.
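The noisy inputs described above can be generated with a few lines of NumPy. This is a sketch under stated assumptions (8-bit images, noise drawn independently per channel and clipped back to the valid range); the function name and seed are illustrative, not from the paper's code.

```python
import numpy as np

def add_gaussian_noise(clean, sigma=25.0, seed=0):
    """Add i.i.d. white Gaussian noise (std `sigma`, in gray levels) to
    every color channel of an 8-bit image, then round and clip back to
    the valid [0, 255] range."""
    rng = np.random.default_rng(seed)
    noisy = clean.astype(float) + rng.normal(0.0, sigma, clean.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)
```

The 'denoising RMSE' maps shown below visualize the per-pixel error between each denoised result and the clean image, with darker blue for lower error.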


friends

[clean image | noisy image | denoising RMSE]
[denoised image]


Cup

[clean image | noisy image | denoising RMSE]
[denoised image]


paper

[clean image | noisy image | denoising RMSE]
[denoised image]


dress

[clean image | noisy image | denoising RMSE]
[denoised image]


mosaic

[clean image | noisy image | denoising RMSE]
[denoised image]


book

[clean image | noisy image | denoising RMSE]
[denoised image]


tree

[clean image | noisy image | denoising RMSE]
[denoised image]