Back to Basics: Background Subtraction Using a Stationary Camera

Detecting the background of a scene can be very hard, depending on the camera's point of view. There may be shadows, changing weather, sunlight-related problems, occlusions, so many things, right? Especially with an outdoor camera, you may not know what is coming. But instead of thinking through all of these possibilities, let's focus on the main problem of background detection. It is very simple: separate the background from the foreground.

Well, you may say it is not as easy as I wrote, but let's just not think about all the other parts of the problem. Let's only think about a stationary camera and a short interval of video. So all we need to do is calculate the background from some part of this video, then use that background in the rest of the video.

Let's define our approach:

1- Divide the video into two pieces.

2- The first piece is for calculating the background.

3- The second piece is for doing the actual foreground detection.

That’s it.
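The split itself can be sketched in a few lines. This is only an illustration in plain Python, not the C# code of this post; the function name `split_frames` and the frame counts are made up for the example.

```python
def split_frames(frames, training_count):
    """Split an ordered frame sequence into a training piece and a detection piece."""
    if not 0 < training_count < len(frames):
        raise ValueError("training_count must leave frames in both pieces")
    return frames[:training_count], frames[training_count:]


# Hypothetical video of 10 frames, identified here only by their numbers.
frame_numbers = list(range(1, 11))
training, detection = split_frames(frame_numbers, training_count=4)
print(training)   # frames used to calculate the background
print(detection)  # frames used for foreground detection
```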

So, what do we need for it? What is our intuition for finding the background? Not a big deal! Just take the average of all images in the first part of the video stream. This will be our background.
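As a rough sketch of that averaging step, here is a plain Python version that treats each grayscale image as a list of rows. The name `mean_background` and the tiny 2x2 frames are assumptions for illustration; the real example works on full images through OpenCV.

```python
def mean_background(frames):
    """Average equally sized grayscale frames pixel by pixel."""
    if not frames:
        raise ValueError("need at least one training frame")
    height, width = len(frames[0]), len(frames[0][0])
    count = len(frames)
    return [
        [sum(frame[y][x] for frame in frames) / count for x in range(width)]
        for y in range(height)
    ]


# Three made-up 2x2 frames: a bright object (200) passes the top-left pixel once.
training_frames = [
    [[10, 10], [10, 10]],
    [[200, 10], [10, 10]],
    [[10, 10], [10, 10]],
]
background = mean_background(training_frames)
print(background)
```

The more training frames there are, the less a single passing object pulls the average away from the true background value.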

Detecting the foreground is even easier: it is just the current image minus the background. That simple!
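A minimal sketch of the subtraction, again in plain Python on made-up pixel values. Taking the absolute value is an assumption on my side, so that objects darker than the background also show up in the result; `subtract_background` is a hypothetical name.

```python
def subtract_background(frame, background):
    """Return the per-pixel absolute difference between a frame and the background."""
    return [
        [abs(pixel - bg) for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


background = [[10, 10], [10, 10]]
current = [[10, 180], [0, 10]]  # one bright and one dark foreground pixel
difference = subtract_background(current, background)
print(difference)  # [[0, 170], [10, 0]]
```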

Now we have something called foreground, but it is noisy, and its pixels have not yet become clean binary chunks. Here, it looks like we need a constant. OK, create a constant threshold: pixels below it are set to zero, and the others to the most intense value.
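The thresholding step could look like the sketch below. The limit of 30 is an arbitrary value picked for the example, not the constant used in this post, and `apply_threshold` is a hypothetical name.

```python
def apply_threshold(difference, limit, max_value=255):
    """Set pixels above the limit to max_value and all others to zero."""
    return [
        [max_value if value > limit else 0 for value in row]
        for row in difference
    ]


difference = [[0, 170], [10, 0]]
mask = apply_threshold(difference, limit=30)  # 30 is an assumed constant
print(mask)  # [[0, 255], [0, 0]]
```

Because one limit is used for the whole image, small differences such as the 10 above are dropped as noise, but so is any real object that happens to be close to the background in intensity.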

So far, I don’t see anything that is hard to understand or hard to apply.

Just to make it clearer, let's visualize it:

[Image: mean filter approach]

To achieve our purpose, let's find an example to apply our approach to. This way, we can see the limits of the method. This is actually a little bit of what computer vision is, because the field is so vast that we cannot find very general algorithms that apply to any kind of video. Instead, what we do is play around with the input visual data, whatever it is, and get quick results on how successful we can be at our goal. Remember, in the computer vision field, there is no ground truth. This means that even though your approach solves your problem, it may fail badly on other input data. So keeping the scope small and sufficient for your needs is the way to go.

OK, I found a couple of background subtraction problems, and I am going to use this one to apply the technique we discussed.

To make image processing easier in C#, I used OpenCV. But this doesn’t mean we used its background subtraction methods; it is used only for image reading and matrix operations.

These images give some clue about what our background will look like.

[Image: First Image in Data]
[Image: Last Image in Data]

And these images are preprocessed ground truth data. As you can see from the downloaded data, the images from 1 to 469 are all fully gray. This means there is no ground truth for them, so use them in your training process.

[Image: First Image Ground Truth]
[Image: Last Image Ground Truth]

Here is how we read the data:

Now let's do some loops to train our background.

Training was super easy using OpenCV's arithmetic methods. But when you look at the code, you can really understand the intention, even if you did the math yourself instead of using OpenCV.
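Doing the math by hand, the training loop can be sketched as a running mean that folds in one frame at a time, so the frames never have to be kept in memory together. This is plain Python with made-up one-row frames, not the EmguCV code of this post; `update_running_mean` is a hypothetical name, and whether the original loop uses exactly this update rule is an assumption.

```python
def update_running_mean(background, frame, n):
    """Fold the n-th training frame (1-based) into the mean of the previous n - 1 frames."""
    return [
        [bg + (pixel - bg) / n for bg, pixel in zip(bg_row, frame_row)]
        for bg_row, frame_row in zip(background, frame)
    ]


# Made-up 1x2 frames: a bright object (200) covers the left pixel in frame 2 only.
frames = [
    [[10, 10]],
    [[200, 10]],
    [[10, 10]],
    [[10, 10]],
]
background = [[0.0, 0.0]]
history = []  # value of the left pixel after each training step
for n, frame in enumerate(frames, start=1):
    background = update_running_mean(background, frame, n)
    history.append(background[0][0])
print(history)
```

The left pixel jumps when the object appears and then drifts back toward the true background value as more frames are averaged in.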

What defines the background for us is the sequential average of every image that goes into the training phase. Here is the output of our background:

[Images: Background 1, Background 200, Background 469]

You can see how the vehicles slowly disappear from the image, frame by frame. Now it is time to compute the foreground using our background image. Here we use a global threshold and see how it actually limits our algorithm. One small lesson.

When we compare the foreground results with the ground truth, there will be a big difference, but that’s OK, because we didn’t contour the binary blobs as in the ground truth data.
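One simple way to put a number on that difference is sketched below. It assumes both the ground truth and the computed foreground are binary masks with values 0 and 255, which is a simplification; the helper names and the 2x2 masks are made up for the example.

```python
def mask_difference(ground_truth, foreground):
    """Per-pixel absolute difference between two equally sized masks."""
    return [
        [abs(gt - fg) for gt, fg in zip(gt_row, fg_row)]
        for gt_row, fg_row in zip(ground_truth, foreground)
    ]


def mismatch_ratio(difference):
    """Fraction of pixels where the two masks disagree."""
    total = sum(len(row) for row in difference)
    wrong = sum(1 for row in difference for value in row if value != 0)
    return wrong / total


ground_truth = [[0, 255], [255, 255]]
foreground = [[0, 255], [0, 255]]  # misses one foreground pixel
difference = mask_difference(ground_truth, foreground)
print(difference)                  # [[0, 0], [255, 0]]
print(mismatch_ratio(difference))  # 0.25
```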

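[Images for frame 470: Ground Truth, Computed Foreground, Difference Between Ground Truth and Foreground]

[Images for frame 1000: Ground Truth, Computed Foreground, Difference Between Ground Truth and Foreground]

[Images for frame 1700: Ground Truth, Computed Foreground, Difference Between Ground Truth and Foreground]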

Here is the full code of this example. By the way, don’t forget to add the EmguCV x64 and x86 folders under your project. I used EmguCV, the .NET wrapper for the OpenCV library.

Summary

I hope we learned something from this very simple and intuitive approach. Sometimes thinking about very complex things in the most basic way creates an opportunity for us to handle them, for better or worse. Then we evaluate the results and refine our approach with new ideas. This is a good way to hack on things and play with them.

References

1- Mean Filter Wikipedia

2- OpenCV Background Subtraction

3- Change Detect Dataset

4- EmguCV Installation and Project

 
