# Co-occurrence Filter - Supplementary Material

### Outline

#### Figures

We present the figures from our submission in full resolution and without cropping:

Figure 5: The effect of window size
Figure 6: The effect of soft quantization
Figure 7: Comparison of iterative vs. rolling
Figure 8: Applying CoF Iteratively
Figure 9: Comparison to other Methods
Figure 10: Applications

#### Mathematical Derivation

Equation 15: Co-occurrence: from Hard to Soft [also available as pdf]

To switch between images please use the colored buttons on the left.

### Figures:

#### Figure 5: The effect of window size:

The figure shows the effect of the filter's support size, where ws denotes half its size in both dimensions. The smaller the filter, the smaller the texture that it smooths. For ws = 3, only the smallest imperfections of the stones are smoothed. When ws = 5, the bricks on top of the background building start to smooth as well. Finally, ws = 15 softens the borders between stones.

#### Figure 6: The effect of soft quantization:

The figure shows the effect of soft quantization. Hard quantization means that each pixel is assigned to a single cluster. In soft quantization we assign each pixel to all clusters, and its contribution to each cluster is inversely proportional to its distance from that cluster's mean. Hard quantization introduces artifacts because nearby pixels with similar values can be mapped to different clusters; see, for example, Barbara's hands. Soft quantization eliminates this problem.
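The difference can be illustrated with a minimal sketch. It assumes scalar pixel values, three hypothetical cluster means, and weights of the form `1 / (distance + eps)` normalized to sum to one; the function names and the exact weighting are illustrative assumptions, not necessarily the ones used to produce the figure.

```python
import numpy as np

def hard_assign(pixels, means):
    """Index of the nearest cluster mean for each pixel value."""
    dist = np.abs(pixels[:, None] - means[None, :])
    return np.argmin(dist, axis=1)

def soft_assign(pixels, means, eps=1e-6):
    """Weights over all clusters, inversely proportional to distance (assumed form)."""
    dist = np.abs(pixels[:, None] - means[None, :])
    weights = 1.0 / (dist + eps)
    return weights / weights.sum(axis=1, keepdims=True)

means = np.array([0.0, 100.0, 200.0])   # hypothetical cluster means
pixels = np.array([49.0, 51.0])         # two similar values straddling a cluster boundary

hard = hard_assign(pixels, means)       # the two pixels land in different clusters
soft = soft_assign(pixels, means)       # the two weight vectors are nearly identical
```

With hard assignment the two nearly identical pixels fall into different clusters, which is the source of the artifacts described above; their soft weight vectors differ only slightly.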

#### Figure 7: Comparison of iterative vs. rolling:

The figure shows two different approaches for running the co-occurrence filter multiple times. In the iterative scheme, the co-occurrence matrix is kept constant, whereas in the rolling scheme we update it at each iteration. Clearly the iterative scheme is superior. The rolling scheme introduces artifacts because, as we smooth, pixels move closer to their clusters' means. As a result, we get more co-occurring values around an object's boundary. In the iterative scheme pixels are assigned to clusters upfront, and more iterations simply smooth co-occurring pixels even more. This effect is clearly visible around the tree tops.
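The two schemes differ only in where the statistics are computed. The sketch below makes that explicit; `build_stats` and `apply_filter` are hypothetical callbacks standing in for the clustering and co-occurrence computation and for one filtering pass, and are not names from the submission.

```python
def run_iterative(image, n_iters, build_stats, apply_filter):
    """Statistics are computed once from the input and reused in every pass."""
    stats = build_stats(image)
    out = image
    for _ in range(n_iters):
        out = apply_filter(out, stats)
    return out

def run_rolling(image, n_iters, build_stats, apply_filter):
    """Statistics are recomputed from the current result before every pass."""
    out = image
    for _ in range(n_iters):
        stats = build_stats(out)
        out = apply_filter(out, stats)
    return out
```

Any pair of functions with these signatures can be plugged in; the only structural difference is whether `build_stats` sits inside or outside the loop.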

#### Figure 8: Applying CoF Iteratively:

This figure shows the result of smoothing an image multiple times, in an iterative fashion. In the paper we show that as we increase the number of iterations, the difference between the resulting images decreases. The process converges after a few iterations.

#### Figure 9: Comparison to other Methods:

##### Comparison with Domain Transform and Guided Image Filter:
Both Domain Transform and Guided Image Filter preserve edges. Flicker between their results and the input, and observe that the hut is much smoother. In addition, the leaves' colors pop nicely: the red/yellow and green shades converge to mean red/yellow and green values, and the boundary between the hut and the leaves is preserved. Our method, on the other hand, picks up the texture of the leaves and smooths it out. Flicker between our result and the input to see the smoothing effect. Note that smoothing the texture of the leaves does not hurt the clear boundary between the hut and the leaves. Our filter preserves boundaries, not edges.
##### Comparison with L0 Smoothing and Rolling Guidance Filter:
L0 smoothing solves a global optimization problem. It respects strong edges and works very well for image simplification. In the following image, the olive texture includes strong edges within it, which remain in the filtered image. Rolling Guidance Filter (RGF) allows the user to define a scale; textures with a texton size smaller than this scale are smoothed out. We used the largest olive in the image to determine the scale. Indeed, RGF managed to nicely smooth the piles of olives. However, some of the green leaves that appear between the piles are smaller than the largest olive, so they too were smoothed out. This is most evident at the top left corner of the image, as well as on the price signs. Our filter managed to smooth the olives while keeping the boundaries between the piles sharp and preserving both the leaves and the structure of the price signs. Our filter respects textures at various scales.
##### Comparison with Semantic Filtering:
Semantic filtering uses the domain transform. Instead of applying it to the actual pixel values, an edge detector is used to down-weight the effect of non-edge pixels. This works extremely well when the edge detector is correct. Flicker between the semantic filter and the input and observe the trees. However, when the edge detector fails, for example within the sunflower field, the results are not as plausible. Our filter produces consistent results.
##### Comparison with Weighted Least Squares + Diffusion Maps:
Diffusion maps rely on the spectral decomposition of an affinity matrix. The affinity matrix is based on the intensity difference between pixel values in Lab space. Therefore, the diffusion distance is low if the random walk between the colors is highly probable. Diffusion maps are a way of approximating the diffusion distances. The distance is computed as a weighted sum of the eigenvectors of the diffusion maps, where the eigenvalues are used as weights. This means that by looking at the eigenvectors one can observe which pixels are likely to be smoothed together. Click on the dominant eigenvectors to see the top 3 eigenvectors. Observe that for all of them the black and white stripes of the zebra are mapped to different clusters. This is because there are few pixels with colors between black and white, which makes the random walk between them highly improbable. Our filter, on the other hand, manages to smooth the zebra into a single color. This is because our filter responds to textures, even if their inner gradients are large.

#### Figure 10: Applications:

The figure shows three images taken from the DAVIS dataset [15]. First we show the input, then we show the manually provided scribble and the mask that we get by smoothing the scribble. CoF refers to running the co-occurrence filter using the entire image statistics. Flicker between it and the input to observe that we truly smooth textures while preserving the edges between them. Next we show the results of filtering the image with statistics collected from only the foreground or only the background. The foreground image smooths the object of interest nicely while keeping the background sharp. We didn't get the complementary effect with the background co-occurrence matrix, because our mask isn't tight around the object. Hence we provide the foreground-background version, FBCoF; see equation 16 in the submission. Finally, we provide BWCoF, which keeps the object in full color while turning the background gray.

### Mathematical Derivation

#### Equation 15: Co-occurrence: from Hard to Soft

We derive the connection between the co-occurrence matrices obtained with hard and soft clustering. The former is faster to compute, but the latter is more accurate. We suggest an approximation that combines the speed of the hard clustering approach with the visual quality of soft clustering. Recall that the co-occurrence matrices under hard and soft assignments are given by:
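$$
C_{hard}(\tau_{a}, \tau_{b}) = \sum_{i,j} \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) [i \in \tau_{a}] [j \in \tau_{b}] \tag{1}
$$

$$
C_{soft}(\tau_{a}, \tau_{b}) = \sum_{i,j} \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) Pr(i \in \tau_{a}) Pr(j \in \tau_{b}) \tag{2}
$$

Here $[\cdot]$ is the indicator function and $d_{ij}$ is the distance between pixels $i$ and $j$ in the image plane.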

Since the spatial weight decays exponentially with $d_{ij}^2$, we compute equation (2) only for pairs $i,j$ that are at most $r$ pixels apart (we use $r=3 \cdot \sigma$). In practice, this means that for each pixel we evaluate on the order of $r^{2}$ pairs, and for each pair $k^{2}$ products of cluster assignment probabilities, where $k$ is the number of clusters. This amounts to $O(n \cdot r^{2} \cdot k^{2})$ for an image of $n$ pixels. In contrast, when evaluating (1) each pair contributes to exactly one entry of the matrix, so we have only $r^2$ nonzero terms per pixel, which makes the complexity merely $O(n \cdot r^{2})$.
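To give a feel for the gap, the following back-of-the-envelope count plugs numbers into the two bounds. The image size, $\sigma$ and number of clusters are illustrative assumptions, not the settings used in the submission, and constant factors are ignored.

```python
def term_counts(n, sigma, k):
    """Order-of-magnitude term counts for equations (1) and (2), constants ignored."""
    r = 3 * sigma                 # truncation radius, r = 3 * sigma
    hard = n * r ** 2             # one nonzero term per pixel pair
    soft = n * r ** 2 * k ** 2    # k^2 probability products per pixel pair
    return hard, soft

# Assumed example: a 640x480 image, sigma = 5, k = 32 clusters.
hard_terms, soft_terms = term_counts(n=640 * 480, sigma=5, k=32)
ratio = soft_terms // hard_terms  # equals k ** 2
```

Under these assumptions the soft computation evaluates $k^2 = 1024$ times as many terms as the hard one.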

Normally, $Pr(i \in \tau_{a})$ is modeled as $K(p_i, \tau_{a})$ where $K$ is a kernel function (3). We want a coarser model for $Pr(i \in \tau_{a})$ that will maintain the complexity of hard clustering. To do so, we assume that we have a hard clustering assignment $i \rightarrow \tau(i)$ and make the following approximation:

$$
Pr(i \in \tau_{a}) = K(p_i, \tau_{a}) \approx K(\tau(i), \tau_{a}) \tag{3}
$$

In words, the distance between pixel value $p_i$ and cluster $\tau_{a}$ is approximated by the distance between $\tau(i)$ and cluster $\tau_{a}$. Using this model we have:

$$
\begin{aligned}
C_{soft}(\tau_{a},\tau_{b})
&= \sum_{i,j} \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) \cdot Pr(i \in \tau_{a}) \cdot Pr(j \in \tau_{b}) \\
&\underset{i}{\approx} \sum_{i,j} \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) \cdot K(\tau(i),\tau_{a}) \cdot K(\tau(j),\tau_{b}) \\
&\underset{ii}{=} \sum_{i,j} \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) \cdot \sum_{\tau_{k_{1}}} [i \in \tau_{k_{1}}] \cdot K(\tau_{k_{1}},\tau_{a}) \cdot \sum_{\tau_{k_{2}}} [j \in \tau_{k_{2}}] \cdot K(\tau_{k_{2}},\tau_{b}) \\
&\underset{iii}{=} \sum_{\tau_{k_{1}},\tau_{k_{2}}} K(\tau_{k_{1}},\tau_{a}) \cdot K(\tau_{k_{2}},\tau_{b}) \cdot \sum_{i,j} \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) \cdot [i \in \tau_{k_{1}}] \cdot [j \in \tau_{k_{2}}] \\
&\underset{iv}{=} \sum_{\tau_{k_{1}},\tau_{k_{2}}} K(\tau_{k_{1}},\tau_{a}) \cdot K(\tau_{k_{2}},\tau_{b}) \cdot C_{hard}(\tau_{k_{1}},\tau_{k_{2}})
\end{aligned}
\tag{4}
$$

where the numbered steps (i to iv above) are:

1. Substitute the approximation in equation (3).
2. Use the fact that $[i \in \tau_{k_{1}}]$ equals $1$ only for $\tau_{k_{1}} = \tau(i)$ and $0$ otherwise.
3. Rearrange the summations.
4. Use equation (1) for hard quantization.
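In matrix form, equation (4) reads $C_{soft} \approx K^{T} C_{hard} K$, where $K$ is the $k \times k$ matrix of kernel values between clusters. The following sketch checks this identity numerically on a tiny random example. The Gaussian kernel between scalar cluster centers, its bandwidth, the random labels and the image size are all assumptions made for illustration, and the sketch sums over all pixel pairs instead of truncating at $r$.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, k, sigma = 6, 6, 4, 1.5
labels = rng.integers(0, k, size=h * w)   # hard assignment tau(i), assumed given
centers = np.linspace(0.0, 1.0, k)        # hypothetical scalar cluster centers

# Spatial weight exp(-d_ij^2 / (2 sigma^2)) for every pair of pixels.
ys, xs = np.divmod(np.arange(h * w), w)
d2 = (ys[:, None] - ys[None, :]) ** 2 + (xs[:, None] - xs[None, :]) ** 2
G = np.exp(-d2 / (2.0 * sigma ** 2))

# Equation (1): one-hot rows encode the indicators [i in tau_a].
H = np.eye(k)[labels]                     # shape (n, k)
C_hard = H.T @ G @ H

# K[k1, a] = K(tau_k1, tau_a); the Gaussian form and bandwidth are assumptions.
K = np.exp(-(centers[:, None] - centers[None, :]) ** 2 / (2.0 * 0.25 ** 2))

# Equation (2) under approximation (3): Pr(i in tau_a) is replaced by K(tau(i), tau_a).
P = K[labels]                             # shape (n, k)
C_soft_direct = P.T @ G @ P

# Equation (4): the same matrix obtained from C_hard with k-by-k products only.
C_soft_fast = K.T @ C_hard @ K
```

Both routes give the same matrix up to floating-point error, so once $C_{hard}$ is available the soft version costs only two small matrix products.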