Understanding Soccer
June 3rd, 2007
In 1998 (I can’t believe it was so long ago!) I was working at MATE (Media Access Technologies), an Israeli start-up at the time, specialising in face recognition and video searching. It was then that I came to know one of the coolest tricks in image recognition - the Hough Transform (pronounced “huf transform“).
In this article, I will demonstrate the algorithm with the following example:
Say you want to write a program that watches soccer. The input to your program is a sequence of frames from the game. The output should be the locations of all the players at all times. Given a frame of soccer, you first need to locate the players in the frame (i.e. you should know that a certain player occupies a certain set of pixels). Next, you need to know what location in the world is represented by those pixels. In order to do that, you need to figure out where the camera is looking at. In this article I will focus on this part of the problem (understanding the view point and direction). I will not deal with the (easier) part of determining what pixels are occupied by players.
Consider the following image of Ronaldinho:
I manually transformed it to this:
(the portion of the image covered by grass). I think it is clear how this step can be automatized.
Next I applied an automatic algorithm that got me this:
(the same image converted to black and white, for all the pixels that were bright enough). My algorithm simply turned all the pixels that are brighter than the color (80,80,20) (
) on a per channel basis to white, and all the other pixels to black. Note that my threshold color requires a high value of the red component, which is not present in the green grass, in order to turn the grass itself to black.
Lets return to our original problem – determining the point and direction of view (POV). We have big clues in the picture for determining the POV, namely the lines on the ground. Note that from the lines on the ground in the black and white picture, it is easy to see that Ronaldinho is standing in the center of the field. Can we somehow detect them?
Lets formulate the following abstact problem:
You are given a matrix M over Z2 (i.e. a matrix of bits). The matrix represents an image. Pixel Pij of the image is white if Mij is 0 and black if Mijis 1. The image contains straight lines. The lines are not perfect (i.e. they may overlap, they may not be continuous and there is a lot of noise). Your goal is to develop an efficient algorithm that receives the matrix and outputs the line equations.
Please take some time to think about this before continuing to read. If you spend enough time thinking about it, you will probably reach the conclusion that this is a really though problem.
A solution to this problem is the Hough Transform. The Hough Transform consists of noticing a duality of the Euclidean plane. We Consider a the straight line equation:
y = m * x + b
In this equation m and b are constants and x and y are variables. But what happens if we make x and y constants and m and b variables?
Well, the set of m’s and b’s that satisfy the equation represents all the lines that pass through our point (x, y)!
It is easy to see that for each point in the x-y plane we get a line in the b-m plane, namely, the line:
b = -x * m + y
Now each line in the x-y plane is matched with a point in the b-m plane! The duality depicted:
So the algorithm basically says:
- Run through the original matrix.
- For each pixel (x,y) that is white draw the line that matches it in the dual plane.
- Then count the number of line intersections in the dual plane.
- All points at which there is an intersection of more than T lines (these correspond to lines in the x-y plane on which we have more than T points in our original image) are considered valid lines.
To better illustrate the algorithm, I jotted down a few lines of python. The results follow (I must say that I was really surprised that it worked perfectly the first time around! All the threshold values I picked were somehow correct on the first go
).
Note that there is an implementation problem with using the b-m plane - Precision issues. For example, a near vertical line has huge values of m. In order to solve this, I used another duality (which is completely equivalent): the duality of the x-y plane and the a-c plane, where a is the angle of the line with the x axis (i.e. m = tan(a)), and c is b*cos(a).
The Hough map (i.e. the result of the Hough Transform) my script produced is:
Since it is kind of hard to distinguish the different color values, I generated a more human-friendly version (all the values are multiplied by 15):
And finally, the result (after converting back to the x-y plane):
And the same result overlayed on the original image:
A perfect match!
This is a big subject, and I only covered a small part of it. I will update this article in the near future with a more detailed analysis. If you have specific questions please post them as comments and I promise I will answer!
Meanwhile, let me just point out that the same method can be used to detect other shapes (e.g. circles). The generalization is trivial.
I will finish by showing the results of running the script on another image. Note that in this case, the threshold values are a little off. The results are nevertheless impressive:
The black and white image generated was:
The Hough map:
The result:
And, finally, the overlayed result:
To be updated…










June 5th, 2007 at 11:22 am
Some comments:
1) The suggested algorithm’s compleixty is O(n^2), since you will get in the dual plane n lines and you’ll have to check their intersection
2) The naive algorithm for the original problem takes O(n^3) – you can find the line equation of each pair of points, and then go over all the other points and check if they fall on this line
3) You can easily modify the naive algorithm to be O(n^2) – for example, create a hash that maps line equations to a counter. For every pair of points, create the line equation, search for it in the hash, and increment the counter.
4) In general, the problem of finding intersection points of the lines is exactly the same. It’s naive algorithm would also take O(n^3) because for every pair of lines you’d have to check how many other lines intersect with the same point. You’d have to modify this naive algorithm a bit to find those points in O(n^2). This is a nice algorithm, since it makes a reduction from a problem involving points to a problem involving lines (there are many other examples for this phenomena in Computational Geometry and some of them are much more suprising) In general, since these reductions are very simple, you can always eventually just trivially map the algorithm that works in the dual plane into an equivalent algorithm for solving the primal (original) problem. (and give up the “duality” step of the algorithm)
June 5th, 2007 at 3:00 pm
Hi Matan,
Thank you for your comments.
I have a little problem with the way you analysed the complexity of the algorithm: there are actually several parameters that influence the complexity: w – the number of white pixels, n – the number of lines, m – the number of pixels in the original image, and k – the number of pixels in the hough transformed image. There is a certain correlation between the values of w and n (see below).
Note that finding the intercection of the lines is much easier than what you propose, as when I draw the lines I do not color the buffer with the lines’ color, but rather increase the color value already in the transform image. Finding intersections of more than T lines is then as simple as running through the buffer and checking whether a given pixel has a value of more than T (this is the reason the hough images are not monochromatic).
The first step – drawing the lines, takes O(w*k’) where k’ is the about sqrt(k) (as drawing a line in an image with k pixels takes about sqrt(k) pixel operations). Then finding the lines takes O(k).
So the total complexity comes to O(w*k’)+O(k). An approximation to the relation between n and w is:
w = n*sqrt(m)
So finally we get that the total complexity is:
O(n*sqrt(m)*sqrt(k) + k)
Assuming that k is small compared to the other numbers, we get:
O(n*sqrt(m))
The naive algorithm you proposed, in these notations, takes O(w*w*w) (since going through all the pairs of points takes w*w and for each pair you have to iterate through all other points). This equals:
O(n*n*n*m*sqrt(m))
Which is much worse than the naive complexity (the ratio is like N to N*N*N).
The improvement you suggested in comment 3 is indeed very similar to the hough transform, although you still need to iterate over all PAIRS of points (and the complexity stays higher).
June 6th, 2007 at 12:08 am
first of all congratulations on the new blog.
this strange duality you described is far bigger than it seems, it goes beyond the R2 plane and into the Rn space..
the best thing about it is that the professor responsible for most of the developments in this field called “Parallel coordinates”
is teaching in TAU , his name is Alfred Inselberg and this is his page http://www.math.tau.ac.il/~aiisreal/
there you can find for example of a sphere in 5d …
enjoy.
June 9th, 2007 at 9:14 pm
Okay, now I understand the algorithm (and my mistake) – I thought the input you are getting is a list of N points, each represented as a pair of coordinates (x,y), while in reality the input is an nXn grid of white and black points. The proposed algorithm (the hough duality) works for the non-discrete case, but then you’d be stuck with the equaly hard dual problem of getting a list of N lines and trying to find points where maximum number of line intersect. In the discrete case, the dual problem is actually much simpler to solve than the primal problem.
Nice.
November 23rd, 2007 at 4:48 pm
Very interesting… as always! Cheers from -Switzerland-.
November 24th, 2007 at 7:11 pm
greatings
exellent
July 10th, 2008 at 1:48 am
i love the move and learn it every day and just is afternoon thanks for all you have i used it today on field and is score four goal i and also a football in Nigeria in local club but one day i play like most of the world best e.g messi deco henry e.t.c
October 15th, 2008 at 6:09 pm
[...] my intention is to implement skew removal using Hough transforms. Hough transforms are really good in finding staright lines (among other shapes) in images. So all [...]
April 29th, 2009 at 7:18 am
Could you send the python source code to my e-mail?
Your post is quite perfect…
Thank you!
September 11th, 2009 at 12:52 am
Hi! I was surfing and found your blog post… nice! I love your blog.
Cheers! Sandra. R.
March 1st, 2011 at 6:39 am
[...] To detect straight lines in an image, use the naive version of the hough transform. The hough transform takes you from the image space to the Hough space where a line in the image, corresponds to a point and a point in the image corresponds to a line.Naive approach: y = mx + c is the line in the image space. In the hough space, get the lines corresponding to the points (x0, y0) [you can mark these white or something like that]. The line in hough space is just the equation representing: c = m * x0 – y0 (basically all possible m, c that can satisfy x0 and y0). Once you get a line in hough space for every (x,y) on a line in the image space, you observe that through 1 particular x, y, there’s a ton of concurrent lines. This is essentially the m, c pair that correspond to our line in the image space. For more info: http://yaniv.leviathanonline.com/blog/math/understanding-soccer/ [...]