Throughout our documentation, visualizations are performed on a test image of a black rectangle within a white rectangle.
Pixel Clustering and Segmentation
We started by converting the input image to grayscale. We then used 2-D convolution with the Scharr operator to approximate the image gradient: convolving a fixed 3-by-3 kernel with the image gives the gradient in X, and convolving its transpose gives the gradient in Y. From these two components, we calculated the magnitude and direction of the gradient at each pixel.
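The gradient step can be sketched roughly as follows. This is a minimal NumPy illustration, not our exact implementation: the helper names `filter3x3` and `scharr_gradient` are hypothetical, the image borders are handled by edge replication (an assumption), and the kernel is applied as a cross-correlation (no kernel flip), which only changes the sign of the gradient relative to a true convolution. The small test image mirrors the black rectangle within a white rectangle used throughout.

```python
import numpy as np

# Scharr kernel for the X derivative; its transpose gives the Y derivative.
SCHARR_X = np.array([[-3.0, 0.0, 3.0],
                     [-10.0, 0.0, 10.0],
                     [-3.0, 0.0, 3.0]])
SCHARR_Y = SCHARR_X.T


def filter3x3(image, kernel):
    """Apply a 3x3 kernel by cross-correlation.

    Assumption: borders are padded by replicating the edge pixels.
    """
    rows, cols = image.shape
    padded = np.pad(image.astype(float), 1, mode="edge")
    out = np.zeros((rows, cols))
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out


def scharr_gradient(gray):
    """Return (magnitude, direction) of the approximate image gradient."""
    gx = filter3x3(gray, SCHARR_X)
    gy = filter3x3(gray, SCHARR_Y)
    magnitude = np.hypot(gx, gy)      # sqrt(gx**2 + gy**2)
    direction = np.arctan2(gy, gx)    # radians, in [-pi, pi]
    return magnitude, direction


# Test image: a black rectangle within a white rectangle.
image = np.full((8, 8), 255.0)
image[2:6, 2:6] = 0.0
magnitude, direction = scharr_gradient(image)
```

In this sketch the magnitude is zero in the flat white and flat black regions and large only along the border of the black rectangle, which is where the clustering step looks for line segments.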
Once we had the gradient, we grouped pixels with similar gradients. To do this, we built a graph of the image in which each pixel shares an edge with each of its adjacent neighbors. Each edge is assigned a weight equal to the difference in gradient direction between its two pixels. We initially assigned each pixel to its own "cluster" and sorted the edges from least to greatest weight. Then, in order of increasing weight, we took the two clusters containing the endpoints of each edge and considered combining them (if both pixels were already in the same cluster, we skipped the edge and continued). We combined the two clusters if both of the following conditions held:
1. The range of gradient directions in the resulting cluster is less than or equal to the range of directions in whichever of the two original clusters has the smaller range, plus some constant divided by the number of pixels in the resulting cluster. Formally, for clusters $m$ and $n$, this condition holds if and only if $D(n \cup m) \le \min(D(n), D(m)) + K_D / |n \cup m|$, where $D(c)$ gives the range of gradient directions in cluster $c$.
2. The range of gradient magnitudes in the resulting cluster is less than or equal to the range of magnitudes in whichever of the two original clusters has the smaller range, plus some constant divided by the number of pixels in the resulting cluster. Formally, for clusters $m$ and $n$, this condition holds if and only if $M(n \cup m) \le \min(M(n), M(m)) + K_M / |n \cup m|$, where $M(c)$ gives the range of gradient magnitudes in cluster $c$.
As recommended by the paper, we used values of 100 and 1200 for $K_D$ and $K_M$, respectively, though the authors note that a wide range of values should work.
Repeating this for every edge will result in groups of pixels which follow straight line edges (or are in patches of very consistent color).
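The merging procedure described above can be sketched with a union-find (disjoint-set) structure that tracks, for each cluster, its size and the minimum and maximum gradient direction and magnitude. This is an illustrative sketch under several assumptions rather than our exact code: the function name `cluster_pixels` is hypothetical, "adjacent" is taken to mean 4-connected neighbors, and angle wrap-around is ignored when computing direction differences and ranges. The small constants in the usage lines are chosen only so the toy example splits into two clusters; they are not the values from the paper.

```python
import numpy as np


def cluster_pixels(magnitude, direction, k_d=100.0, k_m=1200.0):
    """Group pixels with similar gradients; returns a label image."""
    rows, cols = direction.shape
    n = rows * cols
    d = direction.ravel()
    m = magnitude.ravel()

    # Every pixel starts in its own cluster.
    parent = list(range(n))
    size = [1] * n
    d_min, d_max = d.copy(), d.copy()
    m_min, m_max = m.copy(), m.copy()

    def find(i):
        # Find the cluster representative, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumption: edges connect 4-connected neighbors (right and down).
    # Weight is the plain absolute direction difference (no wrap-around).
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((abs(d[i] - d[i + 1]), i, i + 1))
            if r + 1 < rows:
                edges.append((abs(d[i] - d[i + cols]), i, i + cols))
    edges.sort()

    for _, i, j in edges:
        a, b = find(i), find(j)
        if a == b:
            continue  # already in the same cluster
        merged = size[a] + size[b]
        lo_d, hi_d = min(d_min[a], d_min[b]), max(d_max[a], d_max[b])
        lo_m, hi_m = min(m_min[a], m_min[b]), max(m_max[a], m_max[b])
        # D(n U m) <= min(D(n), D(m)) + K_D / |n U m|
        ok_d = hi_d - lo_d <= min(d_max[a] - d_min[a],
                                  d_max[b] - d_min[b]) + k_d / merged
        # M(n U m) <= min(M(n), M(m)) + K_M / |n U m|
        ok_m = hi_m - lo_m <= min(m_max[a] - m_min[a],
                                  m_max[b] - m_min[b]) + k_m / merged
        if ok_d and ok_m:
            parent[b] = a
            size[a] = merged
            d_min[a], d_max[a] = lo_d, hi_d
            m_min[a], m_max[a] = lo_m, hi_m

    return np.array([find(i) for i in range(n)]).reshape(rows, cols)


# Toy input: two halves with different gradient directions.
direction = np.zeros((4, 6))
direction[:, 3:] = 1.0
magnitude = np.full((4, 6), 10.0)
labels = cluster_pixels(magnitude, direction, k_d=1.0, k_m=1.0)
```

Because the per-cluster statistics are stored only at the representative, each merge test costs constant time after the two `find` calls, so sorting the edges dominates the running time.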