Archive for June, 2007

Out of the Norm

Tuesday, June 19th, 2007

vivaelnorm.jpgI am taking a course about Hilbert Spaces this semester. A very basic notion in a Hilbert Space is that of the norm. For a while now, I kept several questions regarding norms in the back of my mind, and as I finally got to think about them, I wanted to share my conclusions with you.

Introductory Concepts 

Lets start from the beginning (those of you familiar with Normed Spaces can skip this section). A linear space is a set of objects with two operations:

  1. Addition (marked by the symbol +) which acts on two elements of the space and returns an element of the space.
  2. Multiplication (marked by the symbol *) which acts on an element of the space and a real number and returns an element of the space.

The operations must follow these rules (x and y are elements of a linear space, a and b are real numbers):

  1. x + y = y + x                              (+ commutativity)
  2. x + (y + z) = (x + y) + z           (+ associativity)
  3. Existence of a zero element (marked by the symbol 0) such that for every element of the space: 0+x = x+0 = x
  4. For every element x of the space, existence of a negative element (marked by -x) such that: -x + x = 0
  5. (a*b)*x = a*(b*x)                    (* associativity)
  6. For every element x of the space, 1*x = x
  7. a*(x+y) = a*x + a*y                (distributive law 1)
  8. (a+b)*x = a*x + b*x                (distributive law 2)

A simple example of a linear space is the Euclidean space of n dimensions (for any n >= 1).

Note that we define the minus symbol, -, as follows:

x – y = x + (-1 * y)

Now, a norm is essentially a function that somehow tells us how far away an element of our space is from the neutral element (0).

We write ||x|| for the norm of x.

The norm can also tell us how far apart two elements of the space are. We say that the norm generates a distance. This is done as follows:

dist(x, y) = ||x – y||

Formally, a norm is a function from the linear space to R+ (the set of non-negative real numbers), such that (x and y are elements of the linear space, a is a real number):

  1. ||a*x|| = |a| * ||x||                             (homogeneity)
  2. ||x+y|| <= ||x|| + ||y||                      (triangle inequality)
  3. ||x|| >= 0 and ||x|| = 0 i.f.f. x = 0

Examples 

Now lets consider the 2-dimensional euclidean plane as our linear space (call it R2). An example of a norm in R2 is the euclidean norm (where u=(x,y) is a member of R2):

||u|| = sqrt(x^2 + y^2)

We shall call this norm, the l2 norm.

For those of you not familiar with the concept of the norm, verifying that this is indeed a norm may prove insightful.

The l2 norm is very useful as it measures actual distances in a plane.

Now lets consider another norm, called the l1 norm:

||u|| = |x| + |y|

Again, it can be easily verified that this is indeed a norm. This norm obviously generates a different distance than that of the plane. Does it have a “real-world” use? You are encouraged to take a minute of reflection upon this before reading on…

 Well, yes, it does have a use. Lets say that we are standing in Midtown-Manhattan.

manhattan_midtown.gif

What is the distance between us? The aerial distance is the regular euclidean distance (neglecting the roundness of the Earth) but the walking distance is the distance generated by the l1 norm! This is caused by the fact that we can only walk along the streets (i.e. parallel to the axes!). For this reason the distance generated by the l1 norm is sometimes referred to as the Manhattan-Distance.

Now, lets consider a third norm called l∞, defined as:

||u|| = max(|x|,|y|)

Again, this is obviously a norm (check it!).

Now lets start with the fun stuff…

First, can you figure out the meaning of the names of the norms (l1, l2, l∞)?

For all p >= 1, lets define the function fp:R2->R+ as:

fp(u) = (x^p + y^p)^(1/p)

Then fp is a norm on R2 and we call it lp (I feel like I am repeating myself, but again, this can be easily checked).

The names of l1 and l2 become obvious right away, and so does the name of l∞ (after a little thought :-) ).

If you still can’t figure it out - l∞(x) is the point-wise limit of lp(x) as p->∞ for all the elements x of R2. Actually, the functions lp converge to l∞ locally-uniformly (can you prove this?).

This brings up an interesting question: Do norms other than l1 and l2 have any “real-world” meaning? If you have any insights on this, please comment!

Unit Circles

The unit circle of a normed space is the subset of elements of the space such that their norm is equal to 1.

Can you figure out how l1, l2 and l∞ unit circles look like in R2?

As I found it difficult to visualize the unit circles for norms lp, p>2, I jotted down a few lines of C in order to create the following image:

norms.JPG

The red, green, light-blue and blue shapes denote the unit circles for the l1, l2, l3, and l∞ norms respectively. The other colors denote the unit circles for lp norms for various values of p.

In the picture I included values of p that are not whole numbers. This should not pose any problems.

I also included values of p that are less than 1 (the shapes inside the l1 circle). These are there for illustration purposes only, as they do not constitute norms!

Note: the glowing effect was generated by my program to overcome jagged edges caused by precision issues.

Something else to think about is the volume of the unit circles. The volume of the unit circle under the l1 norm is 2, under the l2 norm Π and under the l∞ norm, 4.

Can anyone supply some more values?

That’s it for now!

I might update this article, so check the updates section!

Ants Revamped

Thursday, June 14th, 2007

ant-drawing2.jpgPlease make sure you have read (and solved) the original Ants riddle before reading on!

 easyriddle.gif

Here are two sequel questions to the original Ants riddle:

  1. Which is the last ant to fall off the stick?
  2. What is the maximum number of ant collisions possible?

I want to thank Danny Valevsky for bringing these to my attention!

Divide and Conquer

Saturday, June 9th, 2007

divide_and_conquer.gif Yesterday, Zohar Gilboa, with whom I take a PDE course, asked me the following riddle. I must admit it is probably the riddle I like the least on my site thus far. It is very easy though, and as Zohar told me he really liked it himself, I decided to publish it anyways.

Consider the following image:

divide_and_conquer.gif

Your goals are:

  1. Divide shape 1 (bluish) to two equal parts.
  2. Divide shape 2 (greenish) to three equal parts.
  3. Divide shape 3 (yellowish) to four equal parts.
  4. Divide shape 4 (purplish) to five equal parts.

All the division parts mentioned should be continuous shapes.

The second page contains the solution! Please only visit it after giving the riddle some thought…

Secure Computing – Part I

Thursday, June 7th, 2007

Wheres Waldo?This article is the first among a series of articles describing the subject of Secure Computing. The material was mainly taken from a Cryptography course by Amos Fiat.

Say we have a function F(x1,x2,…,xn)->(y1,y2,…,yn) that takes n inputs and has n outputs. Me and n-1 of my friends (i.e. we are n people in total) have secret numbers. Lets number the people p1,…,pn and the secret numbers x1,…,xn (i.e. person pi knows secret number xi). We want to perform a process after which person pi will know yi, where yi is the i’th output of F computed on the xi’s, but he must not know anything else. Let me give you an example:

Say that there are 3 people, each having a monthly salary of xi dollars (for i=1,2,3). They want person number 1 to know the sum of the salaries, person 2 to know the product of the salaries and person 3 to know the maximum of the salaries. But they do not want to reveal their salaries to each other!

A simple solution would be to find a new and neutral person that all of the men trust. They would then reveal to him all of their salaries, one by one, and he will perform the calculation and tell them the results. The goal of this series of articles is to solve this problem without using such a trusted party.

Before we begin to think of a solution, lets notice some inherent problems:

  • The men can lie about their salaries. Since it is assumed that only person pi knows xi, he can lie about it (i.e. input another number, say xi’ instead of xi). There is obviously nothing that can be done about it. We will therefore assume all the men (while they may try to cheat at other parts of the protocol) will input their true numbers to it.
  • The function itself may leak back information about the secret numbers of the other people. For example, if there are only 2 people, and y1 is the sum of the xi’s and y2 is their product, then after the process each of the men will know the secret number of the other (as the function is reversible). This is simply not considered a problem, as our goal was to do without the trusted party, and if for some reason the function is reversible (or semi-reversible) then the same problem will arise with the trusted party protocol, which is what we are trying to emulate.

The solution will consist of several parts (which will be dealt with in the other parts of this topic):

  1. Performing a Zero Knowledge Proof (ZKP) of a generic NPC problem. 
  2. The “Oblivious Transfer” protocol.
  3. Transforming a general function f(x1,…,xn) to a circuit consisting only of XOR gates and AND gates.
  4. Solving our problem for the specific cases where our function is a XOR gate or an AND gate.

So now, after we have set our goals, lets dive in to the the first step:

Zero Knowledge Proof of an NPC Problem

At this point, I rather not formally define the exact meaning of a ZKP. I will instead try to explain is intuitively. Say I know the answer to a complicated puzzle and you don’t. The process of me convincing you that I know the answer, without you learning anything about it, is called a Zero Knowledge Proof.

Let me give you this neat example (taken from Applied Kid Cryptography). Say we are playing the game of Where’s Waldo? with this picture:

waldo.jpg

After minutes of contemplation, I suddenly find him! The following dialog then takes place between us:

Me: “Yes! I found him! Wow, he was really well hidden.”

You: “I don’t believe you! Its too hard! You must be lying!”

Me: “No, seriously. He’s right there, next to the…”

You: “Go on – Where is he?!”

Me: “Well, I do not want to ruin this for you… This is such a cool puzzle…”

You: “I knew it! You are lying – you haven’t found him.”

The situation becomes frustrating for me! How can I prove to you that I know where Waldo is without revealing his location?

I can give you a general fact about his location (for example, that he is near the center of the picture and not near the borders) and later when you will find him too, you will be able to verify that I was indeed right.

There are obvious drawbacks with this method:

  • While not revealing exactly where Waldo is, a lot of information is still leaked (you now know not to look for him in the sides of the image).
  • I can lie to you with a good probability – even if I haven’t found him my general statement may be right.
  • And finally, the biggest drawback is that you may still not be convinced (i.e. we have to wait for you to find him as well).

Suddenly it hits me! I know what to do! I take a big white piece of paper (much bigger than the original image, say twice as big in every direction). I then place the original image at a random location beneath it and cutout Waldo’s location from the paper. I then place another big piece of paper on top of everything (this time, without the cutout):

waldo_ilust.gif

In my illustration the covering papers have the same size as the original image, but they should  be much bigger in reality.

You then choose a number: 1 or 2. If you choose 1, I remove both pieces of paper, revealing to you the entire image. If you choose 2, I remove only the top piece of paper, revealing to you that I know where Waldo is, as such:

waldo_cover.gif (click thumbnail for big image)

What does this accomplish? Well, first of all, if indeed I know where Waldo is I can answer both of your questions. If, on the other hand, I was lying, and I put the original image under the papers, then I could not have cut out Waldo’s location. In order to avoid this, I could have replaced the underlying image all together (as you do not see it at all!) with another image, in which I know where Waldo is. In this case, should you ask me to show you where Waldo is, I will be able to do so, but I will get busted if you selected 1, and made me reveal to you the entire image.

What we got is this:

  • If I know where Waldo is, I can deliver a correct answer that you can verify 100% of the time.
  • If I am lying and I do not know where Waldo is, you will bust me (no matter what I’ll do) with a probability of at least 1/2.

Well, being busted with a probability of only 1/2 is not very bad, but notice that we can repeat the process any number of times (each time placing the white papers at different locations over the original image). My chances of being correct are independent in each of the times, and so I will get busted with a probability (1/2)^n (where n is the number of iterations). If n is chosen high enough, you can be quite certain that I am not lying (10 iterations suffice for the probability of missing a liar to be less the 1/1000).

Note that the process is not really “Zero Knowledge” as by looking at the cutout version of Waldo you get an idea of his immediate surroundings and of his own posture (indeed, my cutout image above helps a little in finding him).

In the next installment of the series I will further discuss the notion of ZKP, especially in relation to solutions of NPC problems.

To be continued…

Understanding Soccer

Sunday, June 3rd, 2007

soccer_ball.gifIn 1998 (I can’t believe it was so long ago!) I was working at MATE (Media Access Technologies), an Israeli start-up at the time, specialising in face recognition and video searching. It was then that I came to know one of the coolest tricks in image recognition - the Hough Transform (pronounced “huf transform“).

In this article, I will demonstrate the algorithm with the following example:

Say you want to write a program that watches soccer. The input to your program is a sequence of frames from the game. The output should be the locations of all the players at all times. Given a frame of soccer, you first need to locate the players in the frame (i.e. you should know that a certain player occupies a certain set of pixels). Next, you need to know what location in the world is represented by those pixels. In order to do that, you need to figure out where the camera is looking at. In this article I will focus on this part of the problem (understanding the view point and direction). I will not deal with the (easier) part of determining what pixels are occupied by players.

Consider the following image of Ronaldinho:

nike-ronaldinho.jpg

I manually transformed it to this:

nike-ronaldinho_small.jpg

(the portion of the image covered by grass). I think it is clear how this step can be automatized.

Next I applied an automatic algorithm that got me this:

bw.gif

(the same image converted to black and white, for all the pixels that were bright enough). My algorithm simply turned all the pixels that are brighter than the color (80,80,20) (Hough Thresh 1) on a per channel basis to white, and all the other pixels to black. Note that my threshold color requires a high value of the red component, which is not present in the green grass, in order to turn the grass itself to black.

Lets return to our original problem – determining the point and direction of view (POV). We have big clues in the picture for determining the POV, namely the lines on the ground. Note that from the lines on the ground in the black and white picture, it is easy to see that Ronaldinho is standing in the center of the field. Can we somehow detect them?

Lets formulate the following abstact problem:

You are given a matrix M over Z2 (i.e. a matrix of bits). The matrix represents an image. Pixel Pij of the image is white if Mij is 0 and black if Mijis 1. The image contains straight lines. The lines are not perfect (i.e. they may overlap, they may not be continuous and there is a lot of noise). Your goal is to develop an efficient algorithm that receives the matrix and outputs the line equations.

Please take some time to think about this before continuing to read. If you spend enough time thinking about it, you will probably reach the conclusion that this is a really though problem.

A solution to this problem is the Hough Transform. The Hough Transform consists of noticing a duality of the Euclidean plane. We Consider a the straight line equation:

y = m * x + b

In this equation m and b are constants and x and y are variables. But what happens if we make x and y constants and m and b variables?

Well, the set of m’s and b’s that satisfy the equation represents all the lines that pass through our point (x, y)!

It is easy to see that for each point in the x-y plane we get a line in the b-m plane, namely, the line:

b = -x * m + y

Now each line in the x-y plane is matched with a point in the b-m plane! The duality depicted:

Duality

So the algorithm basically says:

  • Run through the original matrix.
  • For each pixel (x,y) that is white draw the line that matches it in the dual plane.
  • Then count the number of line intersections in the dual plane.
  • All points at which there is an intersection of more than T lines (these correspond to lines in the x-y plane on which we have more than T points in our original image) are considered valid lines.

To better illustrate the algorithm, I jotted down a few lines of python. The results follow (I must say that I was really surprised that it worked perfectly the first time around! All the threshold values I picked were somehow correct on the first go ;-) ).

Note that there is an implementation problem with using the b-m plane - Precision issues. For example, a near vertical line has huge values of m. In order to solve this, I used another duality (which is completely equivalent): the duality of the x-y plane and the a-c plane, where a is the angle of the line with the x axis (i.e. m = tan(a)), and c is b*cos(a).

The Hough map (i.e. the result of the Hough Transform) my script produced is:

hough_org.gif

Since it is kind of hard to distinguish the different color values, I generated a more human-friendly version (all the values are multiplied by 15):

hough.gif

And finally, the result (after converting back to the x-y plane):

res.gif

And the same result overlayed on the original image:

merged.gif

A perfect match! 

This is a big subject, and I only covered a small part of it. I will update this article in the near future with a more detailed analysis. If you have specific questions please post them as comments and I promise I will answer!

Meanwhile, let me just point out that the same method can be used to detect other shapes (e.g. circles). The generalization is trivial.

I will finish by showing the results of running the script on another image. Note that in this case, the threshold values are a little off. The results are nevertheless impressive:

a_small.jpg

The black and white image generated was:

a_bw.gif

The Hough map:

a_hough.GIF

The result:

a_res.GIF

And, finally, the overlayed result:

a_merged.GIF

To be updated…

Two Envelopes

Saturday, June 2nd, 2007

EnvelopesYou write down 2 numbers on 2 pieces of paper (one number on each piece). You put each paper in a sealed envelope. I choose one of the envelopes randomly and open it. I then carry out a certain procedure at the end of which I know with a probability greater than 1/2 whether I received the larger or the smaller of the numbers. I can do this even if when you initially wrote down the numbers, you knew my decision procedure!

How can I do that? What is my trick?