DCSH: (a) Each image pixel represents a point in 3D
space and a normal direction according to the depth map. (b) Each
world patch (P_{i}) is projected to some quadrilateral (p_{i}) on the image plane by some homography (H_{i}). The fact that world patches
are repetitive in the scene is used to guide the search of similar
(projected) patches in the image plane.
We extend patch-based methods to work on patches in 3D space. We start with Coherency Sensitive Hashing
(CSH), which is an algorithm for matching patches between two RGB images, and extend it to work with RGBD images.
This is done by warping all 3D patches to a common virtual plane in which CSH is performed. To avoid
noise due to warping of patches of various normals and depths, we estimate a group of dominant planes and compute
CSH on each plane separately, before merging the matching patches. The result is DCSH - an algorithm that
matches world (3D) patches in order to guide the search for image plane matches. An independent contribution is an extension
of CSH, which we term Social-CSH. It allows a major speedup of the k-nearest-neighbor (kNN) version of CSH:
its runtime grows linearly, rather than quadratically, in k. Social-CSH is used as a subcomponent of DCSH when
many NNs are required, as in the case of image denoising. We show the benefits of using depth information for image reconstruction
and image denoising, demonstrated on several RGBD images.
For each (green) pixel a in the image, which is the projection of a world location r_{a} with normal n_{a}, we simulate its local appearance as if the surface were captured from a fronto-parallel view at a distance of z_{ref}. We construct a homography H_{a} for sampling the appearance from this viewpoint. In a first stage, a simple homography h_{a} is computed; an in-plane orientation normalization is then added to it, giving the final homography H_{a}.
1] Calculating h_{a}: Consider an origin-based world patch (blue, on left), which faces the z axis. We rotate it in 3D to face the associated normal n_{a}, translate it to the associated 3D location r_{a}, and finally project it to a quadrilateral on the image plane. The combined transformation defines a homography h_{a}.
2] Calculating H_{a}: The inverse homography h_{a}^{−1} can be used to sample a normalized patch (green) around the pixel. We then (in-plane) rotate the normalized patch (by a rotation matrix R_{a}) such that it faces its dominant RGB texture orientation (white arrow). A new normalized patch can be sampled using R_{a}^{−1}·h_{a}^{−1}.
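The rotate, translate, project chain of step 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a pinhole camera with a hypothetical intrinsics matrix K, uses the shortest-arc rotation that takes the z axis to n_{a}, and leaves out both the z_{ref} scaling and the orientation normalization of step 2. The function names (rotation_z_to, patch_homography, project) are made up for this example.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_z_to(n):
    """Rotation matrix taking the z axis (0, 0, 1) to the direction of n."""
    length = math.sqrt(sum(c * c for c in n))
    n = [c / length for c in n]
    cos_t = n[2]  # dot product of the z axis with the unit normal
    if cos_t < -1.0 + 1e-12:
        # n is opposite to z: rotate by 180 degrees about the x axis.
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    v = cross([0.0, 0.0, 1.0], n)
    vx = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    vx2 = matmul(vx, vx)
    # Rodrigues form: R = I + [v]x + [v]x^2 / (1 + cos)
    return [[(1.0 if i == j else 0.0) + vx[i][j] + vx2[i][j] / (1.0 + cos_t)
             for j in range(3)] for i in range(3)]

def patch_homography(K, n_a, r_a):
    """h_a maps patch-plane coordinates (u, v, 1) to homogeneous pixels.

    A patch point (u, v, 0) becomes R @ (u, v, 0) + r_a in the world, so
    the plane-to-image map is K @ [R[:, 0], R[:, 1], r_a].
    """
    R = rotation_z_to(n_a)
    M = [[R[i][0], R[i][1], r_a[i]] for i in range(3)]
    return matmul(K, M)

def project(h, u, v):
    """Apply homography h to the patch-plane point (u, v)."""
    x, y, w = (h[i][0] * u + h[i][1] * v + h[i][2] for i in range(3))
    return x / w, y / w

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
h_a = patch_homography(K, n_a=[1.0, 2.0, 2.0], r_a=[0.2, -0.1, 2.0])
corners = [project(h_a, u, v) for u, v in
           [(-0.05, -0.05), (0.05, -0.05), (0.05, 0.05), (-0.05, 0.05)]]
```

The four projected corners form the quadrilateral on the image plane. Sampling a normalized patch then amounts to looking up the image at project(h_a, u, v) over a regular (u, v) grid, which is equivalent to warping the image by the inverse homography.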
In each of the 5 reconstruction examples below, the 'source image' (top left) is reconstructed in a simple Non-Local-Means (NLM) fashion (see Section 5.3 in paper) using only the 'target image' (top middle) and a (dense) Nearest-Neighbor-Field (NNF) between the source and the target images.
We compare 3 different kinds of NNFs, within this reconstruction pipeline:
'CSH-NLM' - an NNF between standard 8x8 square patches, implemented with the CSH algorithm (with k=1)
'DCSH-NLM' - an NNF between 3D-normalized 8x8 square patches, implemented with the DCSH algorithm (with k=1), without the texture orientation normalization (step 1(c) of the DCSH algorithm)
'DCSH-NLM-Oriented' - same as DCSH-NLM, but including the orientation normalizing step
The reconstruction results are shown in 'reconstructed source image' (bottom row) together with the resulting 'reconstruction RMSE' (top right), where darker blue signifies lower reconstruction errors.
Note: Black areas in the reconstruction image are areas where depth information wasn't available.
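The reconstruction step described above can be sketched as patch voting: each source patch is replaced by its matched target patch, and overlapping contributions are averaged per pixel. This is only an illustration under simplifying assumptions (grayscale images as nested lists, axis-aligned square patches, an NNF stored as a dictionary of top-left corners); the exact weighting used in Section 5.3 of the paper is not reproduced here, and the name reconstruct_from_nnf is made up for this example.

```python
def reconstruct_from_nnf(target, nnf, height, width, patch=8):
    """Rebuild a source-sized image from target patches.

    nnf maps the top-left corner (y, x) of each source patch to the
    top-left corner (ty, tx) of its matched target patch. Every pixel is
    the plain average of all target values voted onto it.
    """
    acc = [[0.0] * width for _ in range(height)]
    cnt = [[0] * width for _ in range(height)]
    for (y, x), (ty, tx) in nnf.items():
        for dy in range(patch):
            for dx in range(patch):
                acc[y + dy][x + dx] += target[ty + dy][tx + dx]
                cnt[y + dy][x + dx] += 1
    # Pixels that received no vote are left at 0.0 (black).
    return [[acc[y][x] / cnt[y][x] if cnt[y][x] else 0.0
             for x in range(width)] for y in range(height)]

# Toy example: a 4x4 image reconstructed from itself with 2x2 patches
# and an identity NNF, which must return the image unchanged.
image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
identity_nnf = {(y, x): (y, x) for y in range(3) for x in range(3)}
rebuilt = reconstruct_from_nnf(image, identity_nnf, 4, 4, patch=2)
```

With a real NNF, the quality of rebuilt depends entirely on how well the matched target patches resemble the source patches, which is what the three compared NNFs differ in.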
In each of the denoising examples below, the 'noisy image' (top middle) was created by adding white Gaussian noise with standard deviation σ = 25 gray levels to each color channel of the 'clean image' (top left).
We compare 4 different denoising pipelines (see paper for the details):
'2D' - (named 'CSH-PCA' in the paper)
'3D' - (named 'DCSH-PCA' in the paper)
'3D filtered' - (named 'DCSH-PCA-BI' in the paper)
'BM3D'
The denoising results are shown in 'denoised image' (bottom row) together with the resulting 'denoising RMSE' (top right), where darker blue signifies lower denoising errors.
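For readers who want to reproduce the noise setup, the following sketch adds independent Gaussian noise with σ = 25 to every color channel and measures the error against the clean image. It is an assumption-laden illustration: images are nested lists of [R, G, B] values on a 0 to 255 scale, values are not clipped after adding noise (the page does not say whether clipping was applied), and the score is a single global RMSE rather than the per-pixel error map shown in the figures. The function names are made up for this example.

```python
import math
import random

def add_gaussian_noise(image, sigma=25.0, seed=None):
    """Return a copy of image with i.i.d. N(0, sigma^2) noise per channel."""
    rng = random.Random(seed)
    return [[[c + rng.gauss(0.0, sigma) for c in pixel] for pixel in row]
            for row in image]

def rmse(a, b):
    """Root-mean-square error over all pixels and channels."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pix_a, pix_b in zip(row_a, row_b):
            for ca, cb in zip(pix_a, pix_b):
                total += (ca - cb) ** 2
                count += 1
    return math.sqrt(total / count)

# A flat gray 64x64 test image; the RMSE of the noisy copy against the
# clean image should come out close to sigma.
clean = [[[128.0, 128.0, 128.0] for _ in range(64)] for _ in range(64)]
noisy = add_gaussian_noise(clean, sigma=25.0, seed=0)
noise_level = rmse(noisy, clean)
```

A denoiser is doing useful work when the RMSE of its output against the clean image falls well below this input noise level.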