Understanding Soccer

Rating: 5
June 3rd, 2007

soccer_ball.gifIn 1998 (I can’t believe it was so long ago!) I was working at MATE (Media Access Technologies), an Israeli start-up at the time, specialising in face recognition and video searching. It was then that I came to know one of the coolest tricks in image recognition - the Hough Transform (pronounced “huf transform“).

In this article, I will demonstrate the algorithm with the following example:

Say you want to write a program that watches soccer. The input to your program is a sequence of frames from the game. The output should be the locations of all the players at all times. Given a frame of soccer, you first need to locate the players in the frame (i.e. you should know that a certain player occupies a certain set of pixels). Next, you need to know what location in the world is represented by those pixels. In order to do that, you need to figure out where the camera is looking at. In this article I will focus on this part of the problem (understanding the view point and direction). I will not deal with the (easier) part of determining what pixels are occupied by players.

Consider the following image of Ronaldinho:

nike-ronaldinho.jpg

I manually transformed it to this:

nike-ronaldinho_small.jpg

(the portion of the image covered by grass). I think it is clear how this step can be automatized.

Next I applied an automatic algorithm that got me this:

bw.gif

(the same image converted to black and white, for all the pixels that were bright enough). My algorithm simply turned all the pixels that are brighter than the color (80,80,20) (Hough Thresh 1) on a per channel basis to white, and all the other pixels to black. Note that my threshold color requires a high value of the red component, which is not present in the green grass, in order to turn the grass itself to black.

Lets return to our original problem - determining the point and direction of view (POV). We have big clues in the picture for determining the POV, namely the lines on the ground. Note that from the lines on the ground in the black and white picture, it is easy to see that Ronaldinho is standing in the center of the field. Can we somehow detect them?

Lets formulate the following abstact problem:

You are given a matrix M over Z2 (i.e. a matrix of bits). The matrix represents an image. Pixel Pij of the image is white if Mij is 0 and black if Mijis 1. The image contains straight lines. The lines are not perfect (i.e. they may overlap, they may not be continuous and there is a lot of noise). Your goal is to develop an efficient algorithm that receives the matrix and outputs the line equations.

Please take some time to think about this before continuing to read. If you spend enough time thinking about it, you will probably reach the conclusion that this is a really though problem.

A solution to this problem is the Hough Transform. The Hough Transform consists of noticing a duality of the Euclidean plane. We Consider a the straight line equation:

y = m * x + b

In this equation m and b are constants and x and y are variables. But what happens if we make x and y constants and m and b variables?

Well, the set of m’s and b’s that satisfy the equation represents all the lines that pass through our point (x, y)!

It is easy to see that for each point in the x-y plane we get a line in the b-m plane, namely, the line:

b = -x * m + y

Now each line in the x-y plane is matched with a point in the b-m plane! The duality depicted:

Duality

So the algorithm basically says:

  • Run through the original matrix.
  • For each pixel (x,y) that is white draw the line that matches it in the dual plane.
  • Then count the number of line intersections in the dual plane.
  • All points at which there is an intersection of more than T lines (these correspond to lines in the x-y plane on which we have more than T points in our original image) are considered valid lines.

To better illustrate the algorithm, I jotted down a few lines of python. The results follow (I must say that I was really surprised that it worked perfectly the first time around! All the threshold values I picked were somehow correct on the first go ;-) ).

Note that there is an implementation problem with using the b-m plane - Precision issues. For example, a near vertical line has huge values of m. In order to solve this, I used another duality (which is completely equivalent): the duality of the x-y plane and the a-c plane, where a is the angle of the line with the x axis (i.e. m = tan(a)), and c is b*cos(a).

The Hough map (i.e. the result of the Hough Transform) my script produced is:

hough_org.gif

Since it is kind of hard to distinguish the different color values, I generated a more human-friendly version (all the values are multiplied by 15):

hough.gif

And finally, the result (after converting back to the x-y plane):

res.gif

And the same result overlayed on the original image:

merged.gif

A perfect match! 

This is a big subject, and I only covered a small part of it. I will update this article in the near future with a more detailed analysis. If you have specific questions please post them as comments and I promise I will answer!

Meanwhile, let me just point out that the same method can be used to detect other shapes (e.g. circles). The generalization is trivial.

I will finish by showing the results of running the script on another image. Note that in this case, the threshold values are a little off. The results are nevertheless impressive:

a_small.jpg

The black and white image generated was:

a_bw.gif

The Hough map:

a_hough.GIF

The result:

a_res.GIF

And, finally, the overlayed result:

a_merged.GIF

To be updated…

8 Responses to “Understanding Soccer”

  1. Matan Says:

    Some comments:

    1) The suggested algorithm’s compleixty is O(n^2), since you will get in the dual plane n lines and you’ll have to check their intersection

    2) The naive algorithm for the original problem takes O(n^3) - you can find the line equation of each pair of points, and then go over all the other points and check if they fall on this line

    3) You can easily modify the naive algorithm to be O(n^2) - for example, create a hash that maps line equations to a counter. For every pair of points, create the line equation, search for it in the hash, and increment the counter.

    4) In general, the problem of finding intersection points of the lines is exactly the same. It’s naive algorithm would also take O(n^3) because for every pair of lines you’d have to check how many other lines intersect with the same point. You’d have to modify this naive algorithm a bit to find those points in O(n^2). This is a nice algorithm, since it makes a reduction from a problem involving points to a problem involving lines (there are many other examples for this phenomena in Computational Geometry and some of them are much more suprising) In general, since these reductions are very simple, you can always eventually just trivially map the algorithm that works in the dual plane into an equivalent algorithm for solving the primal (original) problem. (and give up the “duality” step of the algorithm)

  2. yaniv Says:

    Hi Matan,

    Thank you for your comments.

    I have a little problem with the way you analysed the complexity of the algorithm: there are actually several parameters that influence the complexity: w - the number of white pixels, n - the number of lines, m - the number of pixels in the original image, and k - the number of pixels in the hough transformed image. There is a certain correlation between the values of w and n (see below).

    Note that finding the intercection of the lines is much easier than what you propose, as when I draw the lines I do not color the buffer with the lines’ color, but rather increase the color value already in the transform image. Finding intersections of more than T lines is then as simple as running through the buffer and checking whether a given pixel has a value of more than T (this is the reason the hough images are not monochromatic).

    The first step - drawing the lines, takes O(w*k’) where k’ is the about sqrt(k) (as drawing a line in an image with k pixels takes about sqrt(k) pixel operations). Then finding the lines takes O(k).

    So the total complexity comes to O(w*k’)+O(k). An approximation to the relation between n and w is:
    w = n*sqrt(m)
    So finally we get that the total complexity is:
    O(n*sqrt(m)*sqrt(k) + k)
    Assuming that k is small compared to the other numbers, we get:
    O(n*sqrt(m))

    The naive algorithm you proposed, in these notations, takes O(w*w*w) (since going through all the pairs of points takes w*w and for each pair you have to iterate through all other points). This equals:
    O(n*n*n*m*sqrt(m))
    Which is much worse than the naive complexity (the ratio is like N to N*N*N).

    The improvement you suggested in comment 3 is indeed very similar to the hough transform, although you still need to iterate over all PAIRS of points (and the complexity stays higher).

  3. yoni Says:

    first of all congratulations on the new blog.

    this strange duality you described is far bigger than it seems, it goes beyond the R2 plane and into the Rn space..
    the best thing about it is that the professor responsible for most of the developments in this field called “Parallel coordinates”
    is teaching in TAU , his name is Alfred Inselberg and this is his page http://www.math.tau.ac.il/~aiisreal/
    there you can find for example of a sphere in 5d …

    enjoy.

  4. Matan Says:

    Okay, now I understand the algorithm (and my mistake) - I thought the input you are getting is a list of N points, each represented as a pair of coordinates (x,y), while in reality the input is an nXn grid of white and black points. The proposed algorithm (the hough duality) works for the non-discrete case, but then you’d be stuck with the equaly hard dual problem of getting a list of N lines and trying to find points where maximum number of line intersect. In the discrete case, the dual problem is actually much simpler to solve than the primal problem.

    Nice.

  5. Dog training Says:

    Very interesting… as always! Cheers from -Switzerland-.

  6. free Says:

    greatings

    exellent

  7. henry Says:

    i love the move and learn it every day and just is afternoon thanks for all you have i used it today on field and is score four goal i and also a football in Nigeria in local club but one day i play like most of the world best e.g messi deco henry e.t.c

  8. TesseractIndic @ foss.in 2008 « Debayan’s Weblog Says:

    […] my intention is to implement skew removal using Hough transforms. Hough transforms are really good in finding staright lines (among other shapes) in images. So all […]

Leave a Reply