DCSH: (a) Each image pixel represents a point in 3D
space and a normal direction according to the depth map. (b) Each
world patch (pi) is projected to some quadrilateral (pi) on the image plane, by some homography (Hi). The fact that world patches
are repetitive in the scene is used to guide the search of similar
(projected) patches in the image plane.
We extend patch-based methods to work on patches in 3D space. We start with Coherency Sensitive Hashing
(CSH), which is an algorithm for matching patches between two RGB images, and extend it to work with RGBD images.
This is done by warping all 3D patches to a common virtual plane in which CSH is performed. To avoid
noise due to warping of patches of various normals and depths, we estimate a group of dominant planes and compute
CSH on each plane separately, before merging the matching patches. The result is DCSH, an algorithm that
matches world (3D) patches in order to guide the search for image plane matches. An independent contribution is an extension
of CSH, which we term Social-CSH. It allows a major speedup of the k-nearest-neighbor (kNN) version of CSH:
its runtime grows linearly, rather than quadratically, in k. Social-CSH is used as a subcomponent of DCSH when
many NNs are required, as in the case of image denoising. We show the benefits of using depth information for image reconstruction
and image denoising, demonstrated on several RGBD images.
For each (green) pixel a in the image, which is the projection of a world location ra with normal na, we simulate its local appearance as if the surface were captured from a fronto-parallel view at a distance of zref. We construct a homography Ha for sampling the appearance from this viewpoint. In a first stage, a simple homography ha is computed, to which an in-plane orientation normalization operation is then added, giving the final homography Ha.
1] Calculating ha: Consider an origin-based world patch (blue, on left), which faces the z axis. We rotate it in 3D to face the associated normal na, translate it to the associated 3D location ra, and finally project it to a quadrilateral on the image plane. The combined transformation defines a homography ha.
2] Calculating Ha: The inverse homography ha−1 can be used to sample a normalized patch (green) around the pixel. We then (in-plane) rotate the normalized patch (by a rotation matrix Ra) such that it faces its dominant RGB texture orientation (white arrow). A new normalized patch can be sampled using Ra−1·ha−1.
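The two steps above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera at the origin with a known intrinsic matrix K, a patch coordinate system in which one unit equals one pixel when the surface is fronto-parallel at depth zref, and a dominant texture angle theta that has already been estimated by some other means. The function names (rotation_z_to, patch_homography, oriented_homography) are hypothetical.

```python
import numpy as np

def rotation_z_to(n):
    """Rotation matrix that takes the z axis to the unit direction of n (Rodrigues formula)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    z = np.array([0.0, 0.0, 1.0])
    c = float(z @ n)
    if c < -1.0 + 1e-12:
        # n is opposite to z: rotate by 180 degrees about the x axis.
        return np.diag([1.0, -1.0, -1.0])
    v = np.cross(z, n)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx / (1.0 + c)

def patch_homography(K, r_a, n_a, z_ref):
    """Step 1 (sketch): homography h_a from patch coordinates (u, v, 1) to image pixels.

    A patch point (u, v) is scaled, rotated to face n_a, translated to r_a,
    and projected with K. Assumption: the camera sits at the origin and
    K[0, 0] is the focal length in pixels.
    """
    R = rotation_z_to(n_a)
    s = z_ref / K[0, 0]  # world units per patch unit at depth z_ref
    M = np.column_stack((s * R[:, 0], s * R[:, 1], np.asarray(r_a, dtype=float)))
    H = K @ M
    return H / H[2, 2]

def oriented_homography(h_a, theta):
    """Step 2 (sketch): H_a = h_a @ R_a, so that H_a^-1 = R_a^-1 @ h_a^-1.

    theta is the in-plane angle of the dominant texture orientation;
    how it is estimated is left open here.
    """
    c, s = np.cos(theta), np.sin(theta)
    R_a = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    return h_a @ R_a
```

With these conventions, a surface that is already fronto-parallel at depth zref on the optical axis yields a pure translation to the principal point, which is a quick sanity check on the scaling. The sign convention for the normal (toward or away from the camera) is also an assumption and would need to match the depth-map normals in practice.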
In each of the 5 reconstruction examples below, the 'source image' (top left) is reconstructed in a simple Non-Local-Means (NLM) fashion (see Section 5.3 in paper) using only the 'target image' (top middle) and a (dense) Nearest-Neighbor-Field (NNF) between the source and the target images.
We compare 3 different kinds of NNFs, within this reconstruction pipeline:
'CSH-NLM' - an NNF between standard 8x8 square patches, implemented with the CSH algorithm (with k=1)
'DCSH-NLM' - an NNF between 3D-normalized 8x8 square patches, implemented with the DCSH algorithm (with k=1), without the texture orientation normalization (step 1(c) of the DCSH algorithm)
'DCSH-NLM-Oriented' - same as DCSH-NLM, but including the orientation normalizing step
The reconstruction results are shown in 'reconstructed source image' (bottom row) together with the resulting 'reconstruction RMSE' (top right), where darker blue signifies lower reconstruction errors.
Note: Black areas in the reconstruction image are areas where depth information wasn't available.
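As a rough illustration of this kind of pipeline, the sketch below rebuilds a source image from target patches by following a dense NNF and averaging overlapping patches, then computes a per-pixel RMSE map. It is a simplified stand-in, not the procedure of Section 5.3: it assumes axis-aligned square patches (as in the 'CSH-NLM' case), an NNF stored as top-left (row, col) coordinates in the target, and uniform averaging weights. The function names are hypothetical.

```python
import numpy as np

def reconstruct_from_nnf(target, nnf, patch=8):
    """Rebuild the source image by averaging the target patches the NNF points to.

    target: (Ht, Wt, C) float array.
    nnf:    (Hs - patch + 1, Ws - patch + 1, 2) integer array; nnf[i, j] is the
            top-left (row, col) of the target patch matched to the source patch
            whose top-left corner is (i, j).
    """
    rows, cols = nnf.shape[:2]
    hs, ws = rows + patch - 1, cols + patch - 1
    acc = np.zeros((hs, ws, target.shape[2]))
    cnt = np.zeros((hs, ws, 1))
    for i in range(rows):
        for j in range(cols):
            ti, tj = nnf[i, j]
            # Each matched patch votes for all source pixels it covers.
            acc[i:i + patch, j:j + patch] += target[ti:ti + patch, tj:tj + patch]
            cnt[i:i + patch, j:j + patch] += 1
    return acc / cnt

def rmse_map(source, recon):
    """Per-pixel RMSE over the color channels."""
    return np.sqrt(np.mean((source - recon) ** 2, axis=2))
```

For the DCSH variants, the patches would instead be sampled through the per-pixel homographies rather than as axis-aligned squares, which this sketch does not attempt to model.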