Update 2019: Check out my new approach for background removal.

Recently I’ve been playing around with OpenCV and Python to try and automate the process of removing the background from an image of an object.

Sample 1 (050.jpg) Sample 2 (078.jpg)


From the images you can see that the background is close to plain white. You can also see that the second image shown is pretty blurred and not well illuminated.


OpenCV has several ways to remove the background (like the watershed algorithm or Canny edge detection), but none of them seems to work well (out of the box, at least) on the images I was using. (Which is surprising, by the way, since OpenCV is quite popular.)


Here I show you how to do segmentation for “simple” images like these. The algorithm that I have used is as follows:

  1. Run an edge detection algorithm on the image (like Sobel, Scharr or Prewitt)
  2. Reduce noise on the resulting edge image (using a simple trick I found from Octave forge/Matlab)
  3. Run contour detection over the edges, return the contours in hierarchical order and pick the contours in the first level of the hierarchy.
  4. Now either pick the largest contour (in this case) or ignore all contours whose area is less than 5% of the total image area.


Before going further into details, let me first mention the several limitations of this approach:

  1. Contour detection works poorly if the object’s edges touch the edges of the image.
  2. Contour detection works poorly if there are multiple objects with edges touching each other (for example, if one object partly covers another by sitting on top of it) or if the background texture is very coarse.
  3. If the background color is similar to the object color, the edges probably won’t be detected well.

So in other words, this approach works only if the background is relatively plain and contrasts in color with the object, and the object stands apart from other objects and from the image edges.


Now I’ll dive into more details and the code.

Step 0: Begin by preprocessing the image with a slight Gaussian blur, which reduces noise in the original image before edge detection.

import numpy as np
import cv2

img = cv2.imread('078.jpg')

blurred = cv2.GaussianBlur(img, (5, 5), 0) # Remove noise


Step 1: Next we do the edge detection. The most commonly used method for edge detection is the Sobel operator.

The Sobel operator calculates the difference between intensities (aka “gradients”) of pixels in a specific direction (x-axis or y-axis).

We have to compute the Sobel edges along both axes and then combine them into a gradient magnitude, which is the Euclidean distance (sqrt(x^2 + y^2)) of each corresponding pair of values.

Doing that with loops in Python would be slow. Fortunately numpy provides the np.hypot() function, which does it quickly.

def edgedetect (channel):
    # CV_16S keeps negative gradients that an 8-bit result would clip
    sobelX = cv2.Sobel(channel, cv2.CV_16S, 1, 0)
    sobelY = cv2.Sobel(channel, cv2.CV_16S, 0, 1)
    sobel = np.hypot(sobelX, sobelY)

    sobel[sobel > 255] = 255 # Magnitudes can exceed 255, but a color channel has to stay within 0-255
    return sobel
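
To see what the magnitude step does in isolation, here is a small NumPy-only sketch. The arrays `gx` and `gy` are made-up gradient values, not taken from the sample images; they only stand in for the two Sobel outputs.

```python
import numpy as np

# Hypothetical x/y gradients, stored as int16 like the cv2.CV_16S output above
gx = np.array([[3, -300], [0, 120]], dtype=np.int16)
gy = np.array([[4, 400], [-7, 50]], dtype=np.int16)

# Element-wise sqrt(gx**2 + gy**2), returned as floats
magnitude = np.hypot(gx, gy)

# The same thing computed by hand (cast first so the squares cannot overflow int16)
by_hand = np.sqrt(gx.astype(np.float64) ** 2 + gy.astype(np.float64) ** 2)

# Same effect as sobel[sobel > 255] = 255
clipped = np.minimum(magnitude, 255)

print(magnitude)
print(clipped)
```

Note that the sign of a gradient does not matter for the magnitude, which is why negative values are kept until this point.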


Since we are dealing with color images, the edge detection needs to be run on each color channel and the results then need to be combined.

The way I am doing that is by taking the max intensity from among the R, G and B edges. I’ve tried using the average of the R, G, B edges, however max seems to give better results.

edgeImg = np.max( np.array([ edgedetect(blurred[:,:, 0]), edgedetect(blurred[:,:, 1]), edgedetect(blurred[:,:, 2]) ]), axis=0 )


Sobel of 078.jpg
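
One plausible reason max does better than the average can be seen on a toy example. The values below are invented edge strengths for a single row of pixels; the edge at index 2 shows up in only one channel.

```python
import numpy as np

# Hypothetical edge strengths for one row of pixels in each color channel.
# The edge at index 2 is only visible in the first channel.
edges_0 = np.array([0.0, 10.0, 240.0, 5.0])
edges_1 = np.array([0.0, 12.0, 15.0, 5.0])
edges_2 = np.array([0.0, 8.0, 12.0, 5.0])

stacked = np.array([edges_0, edges_1, edges_2])
combined_max = np.max(stacked, axis=0)    # keeps the strongest response per pixel
combined_mean = np.mean(stacked, axis=0)  # dilutes an edge seen in one channel only

print(combined_max)
print(combined_mean)
```

With max the single-channel edge keeps its full strength, while the average cuts it to roughly a third. This is only an illustration of the idea, not a measurement on the real images.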

Step 2: However, the edge image has lots of noise (you can’t see it here, I think because resizing the image removed much of it, but if you run the code you’ll see it more clearly). I’ve found an easy trick from Octave forge/Matlab that reduces the noise considerably.

The trick to reduce noise is to zero out any intensity that is no greater than the mean of all the intensities of the edge image.

So add the following after combining the per-channel edges into edgeImg:

mean = np.mean(edgeImg)
# Zero any value that is at or below the mean. This reduces a lot of noise.
edgeImg[edgeImg <= mean] = 0


Sobel of 078.jpg with noise reduction
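
The mean threshold is easy to try on a tiny array. The numbers below are made up: faint values play the role of noise and one column plays the role of a real edge.

```python
import numpy as np

# Hypothetical combined edge image: faint noise plus one strong edge column
edge_img = np.array([[2.0, 3.0, 200.0, 1.0],
                     [4.0, 1.0, 220.0, 2.0],
                     [3.0, 2.0, 210.0, 5.0]])

mean = np.mean(edge_img)
denoised = edge_img.copy()      # copy so the toy input stays intact
denoised[denoised <= mean] = 0  # same rule as in the step above

print(mean)
print(denoised)
```

Because a few strong edge pixels pull the mean well above the noise floor, the faint values are wiped out while the edge column survives. On an image where most pixels are edges the mean would sit much higher, so treat this as a heuristic.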

Step 3: Next we do contour detection. This one is pretty straightforward. The thing to understand here is “hierarchical” contours.

What that means is, any contour (c1) enclosed inside another contour (c2) is treated as a “child” of c2. Contours can be nested more than one level deep, so the structure is like a tree. OpenCV returns the tree as a flat array though, with each tuple containing the index of the parent contour.
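
To make the flat tree layout concrete, here is a sketch that walks a hand-written hierarchy array. The array is invented for illustration, and the `depth` helper is hypothetical; only the row layout (next, previous, first child, parent) follows what cv2.findContours returns with cv2.RETR_TREE.

```python
import numpy as np

# Hand-written hierarchy with shape (1, N, 4).
# Each row is [next, previous, first_child, parent]; -1 means "none".
hierarchy = np.array([[
    [ 3, -1,  1, -1],  # contour 0: top level, has child 1
    [-1, -1,  2,  0],  # contour 1: inside contour 0
    [-1, -1, -1,  1],  # contour 2: inside contour 1
    [-1,  0, -1, -1],  # contour 3: top level
]])

def depth(index, rows):
    """Count how many parents sit above a contour (0 = top level)."""
    level = 0
    while rows[index][3] != -1:
        index = rows[index][3]
        level += 1
    return level

rows = hierarchy[0]
depths = [depth(i, rows) for i in range(len(rows))]
top_level = [i for i, row in enumerate(rows) if row[3] == -1]

print(depths, top_level)
```

The contours with no parent are the ones this article calls the first level.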

def findSignificantContours (img, edgeImg):
    # findContours returns (image, contours, hierarchy) in OpenCV 3 but
    # (contours, hierarchy) in OpenCV 2 and 4; the last two items are the same
    contours, hierarchy = cv2.findContours(edgeImg, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]

    # Find level 1 contours
    level1 = []
    for i, tupl in enumerate(hierarchy[0]):
        # Each array is in format (Next, Prev, First child, Parent)
        # Filter the ones without parent
        if tupl[3] == -1:
            tupl = np.insert(tupl, 0, [i])
            level1.append(tupl)

    ...

We are not done yet with findSignificantContours(). Next we remove any contour that doesn’t take up at least 5% of the image in area. This reduces the rest of the noise.


    # From among them, find the contours with large surface area.
    significant = []
    tooSmall = edgeImg.size * 5 / 100 # If contour isn't covering 5% of total area of image then it probably is too small
    for tupl in level1:
        contour = contours[tupl[0]];
        area = cv2.contourArea(contour)
        if area > tooSmall:
            significant.append([contour, area])

            # Draw the contour on the original image
            cv2.drawContours(img, [contour], 0, (0,255,0),2, cv2.LINE_AA, maxLevel=1)

    significant.sort(key=lambda x: x[1]) # Ascending by area, so the largest contour comes last
    #print ([x[1] for x in significant])
    return [x[0] for x in significant]

edgeImg_8u = np.asarray(edgeImg, np.uint8)

# Find contours
significant = findSignificantContours(img, edgeImg_8u)
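
The 5% area rule can also be checked without OpenCV. The sketch below uses the shoelace formula as a stand-in for cv2.contourArea; the `polygon_area` helper, the image size and both boxes are assumptions made up for this example.

```python
import numpy as np

def polygon_area(contour):
    """Shoelace formula on an (N, 1, 2) array of points, the contour layout OpenCV uses."""
    pts = contour.reshape(-1, 2).astype(np.float64)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

image_shape = (100, 200)  # hypothetical height and width
too_small = image_shape[0] * image_shape[1] * 5 / 100

big = np.array([[[10, 10]], [[110, 10]], [[110, 60]], [[10, 60]]])  # 100 x 50 box
speck = np.array([[[0, 0]], [[5, 0]], [[5, 5]], [[0, 5]]])          # 5 x 5 box

# Keep only the contours whose area exceeds the threshold
kept = [c for c in (big, speck) if polygon_area(c) > too_small]

print(too_small, polygon_area(big), polygon_area(speck), len(kept))
```

Only the large box passes the threshold, which is the behavior the filter relies on to drop leftover specks of noise.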

I’ve drawn the detected contour on the original image. Let’s see how it looks.


Contour of 050.jpg Contour of 078.jpg


That looks nice. The contours need smoothing though, which I’ll mention later. A point to note is that if the images are resized down from ~700x900 to ~250x300, the contours come out smoother and much of the noise disappears as well.


Finally one can remove the background by creating a mask to fill the contours.

# Mask
mask = np.zeros(edgeImg.shape, np.uint8) # Start from an all-black mask
cv2.fillPoly(mask, significant, 255)
# Invert mask
mask = np.logical_not(mask)

# Finally remove the background
img[mask] = 0
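
The masking step relies on NumPy boolean indexing, which can be seen on a tiny synthetic image. Everything below (the 4x4 image, the colors, the square "object") is invented for illustration.

```python
import numpy as np

# Hypothetical 4x4 three-channel image filled with a light background color
img = np.full((4, 4, 3), 230, dtype=np.uint8)
img[1:3, 1:3] = (40, 80, 120)  # a 2x2 "object" in the middle

# Mask in the same convention as above: 255 inside the object, 0 elsewhere
mask = np.zeros(img.shape[:2], dtype=np.uint8)
mask[1:3, 1:3] = 255

background = np.logical_not(mask)  # True wherever the mask is 0
result = img.copy()
result[background] = 0             # a 2-D boolean index zeroes all three channels

print(result[:, :, 0])
```

The background pixels turn black while the object keeps its original color. A 2-D boolean mask applied to a 3-D image selects whole pixels, so no per-channel loop is needed.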

Contour Smoothing

There are two ways to smoothen the final contour. One is to use cv2.approxPolyDP and the other is to use an algorithm called the “Savitzky-Golay filter”.

Unfortunately neither works well in all circumstances. The first one works well when the object has straight edges with sharp corners, whereas the Savitzky-Golay filter works well when the object has curved edges/boundaries. If an object has a mix of both straight and curved edges, then neither works perfectly.


With that said the code for approxPolyDP is as follows:

epsilon = 0.10 * cv2.arcLength(contour, True)
# or epsilon = 3, for slighter contour corrections
approx = cv2.approxPolyDP(contour, epsilon, True)
contour = approx

You can play a bit with the multiplicative factor which I set to 0.1. I am not sure whether 0.1 will work for all image resolutions.
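
To get a feel for what epsilon controls, here is a pure-NumPy sketch of the Douglas-Peucker idea that cv2.approxPolyDP is based on. It is a simplified version for open polylines only, not OpenCV's implementation; the helper names and the path coordinates are made up.

```python
import numpy as np

def point_line_distance(point, start, end):
    """Perpendicular distance from point to the line through start and end."""
    if np.array_equal(start, end):
        return float(np.linalg.norm(point - start))
    dx, dy = end - start
    # |cross product| divided by the segment length
    cross = dx * (point[1] - start[1]) - dy * (point[0] - start[0])
    return abs(cross) / float(np.hypot(dx, dy))

def simplify(points, epsilon):
    """Recursive Douglas-Peucker on an open polyline given as an (N, 2) array."""
    if len(points) < 3:
        return points
    distances = [point_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
    farthest = int(np.argmax(distances)) + 1
    if distances[farthest - 1] <= epsilon:
        # Every interior point is within epsilon of the chord, so drop them all
        return np.array([points[0], points[-1]])
    left = simplify(points[:farthest + 1], epsilon)
    right = simplify(points[farthest:], epsilon)
    return np.vstack([left[:-1], right])  # avoid duplicating the split point

# A jagged, roughly L-shaped path
path = np.array([[0, 0], [5, 1], [10, 0], [10, 5], [11, 10]], dtype=np.float64)

for eps in (0.3, 2.0, 10.0):
    print(eps, len(simplify(path, eps)))
```

A larger epsilon throws away more points, so it flattens small bumps but eventually cuts real corners too. Since epsilon is a distance in pixels, tying it to the contour length as above is one way to make it scale with the object.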


For Savitzky-Golay smoothing, one has to first install scipy (savgol_filter is part of scipy.signal). The code is as follows (output image also shown):

from scipy.signal import savgol_filter
...
# Use Savitzky-Golay filter to smoothen contour.
window_size = int(round(min(img.shape[0], img.shape[1]) * 0.05)) # Consider each window to be 5% of image dimensions
# or window_size = 3 for slighter contour corrections
x = savgol_filter(contour[:,0,0], window_size * 2 + 1, 3)
y = savgol_filter(contour[:,0,1], window_size * 2 + 1, 3)

approx = np.empty((x.size, 1, 2))
approx[:,0,0] = x
approx[:,0,1] = y
approx = approx.astype(np.int32) # OpenCV drawing functions such as fillPoly expect 32-bit integer points
contour = approx


Smoothened contour of 078.jpg
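
Here is a self-contained variant that can be run without an image. The jittery circle is synthetic, and mode='wrap' is a choice made for this sketch (it is not what the code above uses): it treats the point sequence as circular, which seems a reasonable fit for a closed contour.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical closed contour: a circle of radius ~50 with jittery, integer-rounded points
angles = np.linspace(0, 2 * np.pi, 200, endpoint=False)
rng = np.random.default_rng(0)
radius = 50 + rng.normal(0, 1.5, angles.size)
contour = np.empty((angles.size, 1, 2), dtype=np.int32)
contour[:, 0, 0] = np.round(100 + radius * np.cos(angles))
contour[:, 0, 1] = np.round(100 + radius * np.sin(angles))

window = 21    # odd, as in the code above, and shorter than the number of points
polyorder = 3  # must be smaller than the window length

# mode='wrap' pads each end with points from the other end of the sequence
x = savgol_filter(contour[:, 0, 0], window, polyorder, mode='wrap')
y = savgol_filter(contour[:, 0, 1], window, polyorder, mode='wrap')

smoothed = np.stack([x, y], axis=1).reshape(-1, 1, 2)
smoothed = np.round(smoothed).astype(np.int32)  # back to the (N, 1, 2) int32 contour layout

print(contour.shape, smoothed.shape)
```

Very short contours need a smaller window, since the window cannot usefully be longer than the contour itself.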

That’s all folks.


Update Nov 2016: I updated the article. I wrongly assumed the Scharr edge operator to be way better than Sobel because the intensities given by Scharr are stronger. But it turned out that this doesn’t affect contour detection, and in fact Sobel is slightly more accurate.

Also I found out that the noise reduction trick is better applied after combining the edge images for all three channels (RGB) rather than my previous approach of applying noise reduction on each channel.