The Bookie

Rating: 4
June 6th, 2012

These are two nice and easy gambling related riddles.

Double or Nothing

FC Barcelona  and Real Madrid C.F.  will be playing 7 games against each other this season. I give you $1,000 and ask you to double it in case Barça wins overall (i.e. if they win at least 4 of the 7 games). You are allowed to lose all of it otherwise.

You have a honest bookie which gives you 1-1 odds on each game, but he is only willing to take bets on single games, not on the entire season. How can you double my money?

Note that you are not allowed to take loans, so if for example you bet $30 on Barça in the first game and they win you have a maximum of $1,030 to bet on for the second game, and if they lose, you have a maximum of $970 to bet on the second game.

Beat the Bookie

This second riddle is even easier. There are n horses in a horse race. Your bookie gives you X1,…,Xn – the odds for each horse.

For example, if X2 is 3, for each dollar you place on horse 2 you will get 3 dollars if horse 2 wins.

Under what condition on the Xi‘s can you guarantee to make money?

Thanks to Srulix for these two riddles!

 

Monochrome Lizards

Rating: 2.5
March 29th, 2011

Uniform LizardsThere are 3 types of lizards: yellow, green and blue.

If you rub 2 lizards from different colors, they both change color to the third color. So, for example, if you rub a yellow lizard and a blue lizard, you get 2 green lizards.

Say that you have X yellow lizards, Y green lizards and Z blue lizards. For what values of X, Y and Z can you transform all the lizards to the same color?

E.g. for X=3, Y=3 and Z=1 it is possible (rub the 3 yellow ones with the 3 green ones and get a total of 7 blue ones).

Zeroing an Array in Constant Time

Rating: 4
January 20th, 2011

This one is a really cool (and practical) riddle!

Implement an “integer array” data structure, that supports the following 3 operations:

Init(int n, int k) – initialize the array to be of size n and with all cells set to the value k.

Get(int i) – return the value at cell i.

Set(int i, int k) – set the value of cell i to k.

The catch – all 3 operations should take constant time (not amortized, not probabilistic).

Note – you can assume that malloc or new are constant time, but that the returned memory is filled with (adversarial) random.

Thanks to Nadav Sherman for this riddle!

Differing Neighbors

Rating: 4
January 12th, 2011

Algorithm A

Given an array of N integers, sort the array, and find the 2 consecutive numbers in the sorted array with the maximum difference.
Example – on input [1,7,3,2] output 4 (the sorted array is [1,2,3,7], and the maximum difference is 7-3=4).

Algorithm A runs in O(NlogN) time.

Implement an algorithm identical in function to algorithm A, that runs in O(N) time.

Thanks to Yossi Richter for this riddle!

Prisoners with Bit Sequences

Rating: 3
May 28th, 2010

This is a hard riddle – it took me a couple of days (wall-time) to solve, but it was definitely worth it. Be warned – a certain mathematical maturity is needed…

There are three prisoners and a guard. The guard has an infinite sequence that is either all 1′s (e.g. 1111111111…) or it starts with a finite number of 1′s and eventually turns into 2′s (e.g. 1112222222…). The guard creates 3 bit sequences (i.e. composed entirely of 0′s and 1′s) such that the sum of the i’th bit in the 3 sequences is equal to the i’th element in his sequence (1 or 2).

He then gives each of the 3 prisoners one of the 3 bit sequences. Each prisoner sees only his own sequence and has to guess whether the guard’s sequence consists only of 1′s or whether it turns into 2′s eventually. Devise a method that will make sure the majority of the prisoners (i.e. at least 2) guess correctly.

Thanks Haran for giving me this cool riddle!

The Better Half

Rating: 4
May 19th, 2010

A cute and easy algorithmic riddle.

You have an array of N bit strings each of length M. You know that there is at least one element that appears more than N/8 times in the array. Using O(M+log(N)) memory and O(NM) time, find such an element.

An Easier Version

Well, its actually almost exactly the same, but solve the above riddle in case there is an element that appears more than N/2 times in the array. I managed to find 3 distinct solutions to this easier variation, but only one of which generalizes easily.

Thanks Nemo for giving me this riddle!

Reverse & Clean

Rating: 2.5
April 17th, 2010

The game of Reverse & Clean is a one player game, played on an N by M board, filled with Reversi pieces. Each Reversi piece has a black side and a white side. The game starts with all pieces showing their white side, except for the piece in the lower right corner, which shows its black side. See the figure for a 3 by 3 example.

reversi1.png

In each turn of the game, you remove one of the black pieces and flip all of its (remaining) neighbours (the pieces up, down, left and right of it). The figures below demonstrate 2 possible first moves (note that the very first move of the game is always to remove the bottom right piece, which is the only black piece).

reversi2.png reversi3.png

The goal of the game is to clean the board (i.e. remove all pieces). Note that if the board is 1 by N, you can always win, while if the board is 2 by 2, winning is impossible (make sure you see why).

For what numbers, N and M, is a win possible?

Extra Credit: What about 3-dimensions? D-dimensions?

Thanks to Liron Raz for giving me this one.

Rabbit Season

Rating: 4
March 1st, 2010

easyriddle.gifrabbitThere are 10 cells in a line. A transparent rabbit is in one of them. You have a shotgun, and obviously you want to shoot the rabbit.

If you hit the cell with the rabbit, you kill him (and win). Otherwise, if you shoot an empty cell, the rabbit hears the shot, gets scared of the noise and jumps one cell to the right or one cell to the left. In case the rabbit is in the right-most cell, it can only jump to the left (and similarly, if the rabbit is in the left-most cell, it jumps to the right).

Can you kill the rabbit? If so, what is the minimum number of shots needed to guarantee a kill?

Extra Credit

spoiler.gif

Spoiler Warning – read after solving the riddle above!

Instead of considering the cells in a row, the riddle can be generalized to a graph.

If the graph has cycles, no solution exists (make sure you see why!).

What happens if the graph is a general tree?

Fat Aunts

Rating: 2
January 14th, 2010

HousesTwo aunts are living each in her own (0-dimensional) house. There are two non-intersecting (1-dimensional) roads between the houses.

Last year, both aunts were doing a lot of exercise, and so they were slim (0-dimensional). They managed to walk together from House 1 to House 2, taking different roads, while each was holding one end of a rope of length less than L.

This year, they gained weight, and each became a sphere of radius L/2. One aunt is in House 1 and the other is in House 2. Can they exchange houses without bumping into each other (their centers must always remain on the roads)?

Really equal? Naturally!

Rating: 3.5
January 3rd, 2010

You have a set of 2N+1 natural numbers, with the following property: if you remove any one element, you can partition the remaining 2N elements into two sets A and B, each of size N, such that the sum of the N numbers in set A equals the sum of the N numbers in set B. Prove that all the numbers in the original set are equal.

Try to solve the riddle in the more general case, where the numbers are not necessarily natural, but arbitrary reals (some knowledge of algebra is helpful here).

Il Buono, il Brutto, il Cattivo

Rating: 3.5
July 5th, 2009

Good Bad UglyFor inspirational music, click here.

Il Buono (nice and easy)

Prove that in a subset of size n+1 of the set {1,2,…,2n} there are two numbers such that one divides the other.

Il Cattivo (beautiful and hard)

Prove that in a sequence of r*s+1 distinct numbers, there is either a monotonically increasing sub-sequence of length r+1 or a monotonically decreasing sub-sequence of length s+1.

Il Brutto (just easy)

Prove that in a subset of size n+1 of the set {1,2,…,2n} there are two relatively prime numbers (i.e. numbers whose gcd is 1).

Il Non Collegato (another easy one)

Prove that a degree 7 polynomial with integer coefficients, that receives the values +1 or -1 on 7 integer points is irreducible over the integers.

Unusually, I decided to omit the second page, so post your solution related comments below.

Monty Hall Revised

Rating: 2.5
December 19th, 2008

This riddle is my take on the Monty Hall problem. If you know the original version, this one should be very easy for you.

You are watching the TV series “Lets Make a Deal”.

lets_make_a_deal.gif

You are very excited as you know your friend Heidi is participating today. You watch attentively throughout the show, only to find out that it ends without Heidi ever appearing in it.

You call Heidi, several times during the following day, but you keep getting her answering machine. Maybe something happened to her?

You recall the Rules of the Show:

The host presents to each candidate three closed doors. Behind one of these doors hides a big prize, behind the other two there is an empty bucket. The candidate chooses one of the doors (this choice is random as he has no information whatsoever on the location of the prize). Then the host opens one of the other doors behind which there is no prize. Then the candidate is given the option to remain with his initial choice or to switch doors and choose the other closed door. Then the remaining doors are opened, and the candidate is awarded whatever is behind the door finally chose. The goal of the candidate is (obviously) to choose the door with the prize.

Pondering about the rules, you develop the following theory:

Maybe the host does not know in advance which of the doors contains the prize. In that case, if the host happens to choose to reveal the door with the prize (thus ruining the show) the candidate is killed (finally, some proper riddle scenario) and his scene is cut-out during the editing of the show.

Maybe that is what happened to Heidi?

Assuming you have a lot of episodes (as many as you need), how can your theory be ruled out or strengthened?

Hats in a Line

Rating: 4.5
November 16th, 2008

hats_small2.jpgThis riddle is a very cool extension of a well known (and easy) riddle, involving people with hats waiting in a line.

So, lets begin with the original:

The Original Riddle

100 men are standing in a line, such that each of them sees all those that are in-front of him (so the last man sees the 99 others, etc.). Each guy has a hat on his head colored either black or white. Each of the men guesses the color of his own hat, out-loud. The guy standing last (the one that sees all the other ones) guesses first, then the guy in-front of him, etc. The men’s task is to make sure no more than one of them is wrong.

Just to make things explicit – the men are allowed to agree upon an algorithm before actually being given the hats, but once the task starts they cannot communicate in any way other than the fact that each of them hears the guess of the guys standing behind him.

Also, to conform to the universal standards of riddles, note that should more than one man be mistaken, they will all be brutally killed.

If you do not know this riddle yet, take a minute to think about its solution before reading on.

The Solution of the Basic Riddle

I give here the solution of the basic riddle so that there are no misunderstandings regarding the much more interesting extension. It goes like this – The last man (the one that guesses first) sacrifices himself (with a probability of 0.5). He regards each black hat as a 1 and each white hat as a 0. He then guesses out-loud that he has a black hat if the sum of all the hats of the other people is odd, and guesses white otherwise. The other people can all guess correctly (make sure that you understand why).

Some Trivial Extensions

Obviously the numbers 100 and 2 (the number of possible hat colors) in the original riddle are completely arbitrary. If m,n are two natural numbers then the riddle with m men and n possible hat colors is a trivial generalization of the previous result.

Now, note that if we let n, the number of possible hat colors, be equal to infinity (well, aleph 0, the cardinality of the set of natural numbers) then again we have a trivial generalization (make sure you see why).

The Real Deal

Now the extension of the riddle we are interested in consists of letting m, the number of people, be equal to aleph 0. For simplicity, lets assume that n equals two (the hats are either black or white).

Is the riddle still solvable? I.e. can the men devise an algorithm such that there will be only one mistake?

As usual, post solution related comments to the second page.

The Difference Between Area and Volume – Part I

Rating: 5
May 16th, 2008

I haven’t written new posts for a while now, as I have been very busy lately.

I think this is a very interesting post and I hope it will make up for the lack of updates. I also want to take this opportunity to thank all of you for posting lots of interesting comments and for sending me many ideas and riddles, thank you!

In this post (and its sequel) I will describe Hilbert’s 3rd problem and show how it is solved. I based the posts mainly on a lecture by Prof. David Gilat.

Lets begin by considering a polygon in the Euclidean plane. Here is an example of such a polygon:

polygon.gif

We want to devise a method for somehow “measuring” the area of this and similar polygons.

In order to do this, we begin by defining the area of a rectangle that is parallel to the coordinate axes as the product of its horizontal side by its vertical side.

We now define the area of a polygon to be the infimum of the sum of all the areas of sets of rectangles that constitute a covering of the polygon. An example of such a covering is depicted here: 

polygon_covering.gif

This is the Lebesgue measure of the polygon. This area happens to have some very desirable properties:

  • It is invariant to translations (i.e. a polygon’s area will remain the same if it is moved to a different location in the plane).
  • It is invariant to rotations (i.e. a polygon’s area will remain the same if it is rotated).
  • It is invariant to reflections (i.e. a  polygon’s area will remain the same if it is reflected over any straight line).
  • It has the finite-additivity property (explained below).

I will not prove these properties here as it is very easy and somewhat tedious to do so.

Lets focus on the last property, namely finite-additivity. Finite-additivity means that if we take two polygons that are disjoint (actually, that have disjoint interior) and regard their union as one new polygon, then the area of the new polygon will be equal to the sum of the areas of the two original polygons.

Lets call two polygons, P and Q, congruent in parts, if we can divide P to polygons p1,…,pn and Q to polygons q1,…,qn such that pi can be obtained from qi by a finite number of translations, rotations and reflections.

The properties of area listed above, give the following important (though a little obvious) conclusion:

Theorem 1: Two polygons P and Q that are congruent in parts have the same area.

It suffices, by finite-additivity, to show that the areas of the polygons pi and qi are the same for all i. But by the invariance of the area under rotations, translations and reflections, this is obvious.

Now an interesting question arises: Given two polygons P and Q that have an equal area, are they congruent in parts?

The question can be given an affirmative answer. Lets give an outline of the proof:

  1. Every polygon can be divided to finitely many triangles. This is obvious in the case of a convex polygon. In the case of a concave polygon, select any diagonal that lies completely inside the polygon. This diagonal divides the polygon into two polygons each with fewer sides than the original. By induction we can divide the polygon into finitely many triangles. It still remains to be shown that we can always find a diagonal that lies completely inside our polygon, but that is not hard (I will leave this as an exercise to the reader :-) ).
  2. Every triangle is congruent in parts to a rectangle (see illustration below).
  3. Every rectangle is congruent in parts to a square.
  4. Two squares are congruent in parts to a square whose area is the sum of their areas.

An illustration showing that a triangle is congruent in parts to a rectangle follows:triangle_rectangle.gif

We have shown that every polygon is congruent in parts to a square of the same area. We thus have the inverse of Theorem 1, namely,

Theorem 2: Two polygons that have the same area are congruent in parts.

On to three dimensions. We define a polyhedron’s volume (a polyhedron is the three dimensional analogue of a polygon), it’s Lebesgue measure, in a completely analogous way to the two dimensional Lebesgue measure. Given a box whose facets are parallel to the coordinate plane we define its volume to be the product of three of its perpendicular sides. Given any polyhedron we now define its volume as the infimum of the sum of the volumes of all the boxes in a covering of the polyhedron, where the infimum is taken over all the countable coverings of the polyhedron by boxes with facets parallel to the coordinate planes.

We say that two polyhedrons are congruent in parts if we can divide them into polyhedrons p1,…,pn and q1,…,qn such that the pi’s have disjoint interiors (and similarly for the qi’s) and such that qi can be obtained from pi by translations, rotations and reflections (around planes).

The three dimensional Lebesgue measure has the same properties as the two dimensional Lebesgue measure mentioned above. Thus Theorem 1 can be generalized to:

Theorem 3: Two polyhedrons that are congruent in parts have the same volume.

It was conjectured by Gauss that Theorem 3′s inverse (the generalization of Theorem 2 to three dimensions) does not hold. This question later became known as Hilbert’s Third Problem (see Hilbert’s Problems).

Lets state it again, for clarity:

Hilbert’s Third Problem: Are two polyhedrons that have the same volume necessarily congruent in parts?

One might try to answer the question affirmatively by employing a method similar to the one we used for Theorem 2. I.e. find a “generic” 3D polyhedron (the 3D analogue of the 2D triangle) and divide the polyhedrons to finitely many such “generic” polyhedrons. The problem is that, contrary to the 2D triangle, the existence of such a “generic” polyhedron is not obvious.

The problem was indeed given a negative answer by Hilbert’s student Max Dehn. Dehn managed to construct a function D of polyhedrons that remained invariant under congruence in parts (i.e. two polyhedrons that are congruent in parts produce the same value of D). He managed to show that the values of D on the regular tetrahedron and on the cube are different even if they have the same volume.

Dehn’s invariant, D, is defined as follows:

D(P) = Σl(m)f(q(m))

Where the Σ is taken over all edges m of the polyhedron, l(m) denotes the length of the edge, q(m) denotes the angle between the two planes formed by the facets m connects, and finally f:R->Q is a linear transformation (from the real line into the field of rational numbers, where R is regraded as an infinite dimensional vector space over Q) with the following properties:

  1. f(1) = 0
  2. f(qx) = qf(x), q in Q, x in R
  3. f(x+y) = f(x) + f(y)
  4. f(pi) = 0
  5. f(x) != 0, if x is not linearly dependent on pi and 1 (again, regarding R as a vector space over Q).

Cautious readers might notice that the existence of such a function depends on the axiom of choice (as it relies on the possibility of constructing a base for R, when regarded as a vector space over Q). Such a construction is not really necessary, i.e. the theorem does not depend upon the axiom of choice. I leave the matter of arranging a finite dimensional vector space over Q that is sufficient for the proof as an exercise to the reader.

In the next installment of the series I will prove that Dehn’s invariant is indeed an invariant (under congruence in parts), and calculate its value for the cube and for the tetrahedron, thus finishing the proof.

To Be Continued…

Valid Planar Pairing

Rating: 3
November 24th, 2007

pairing_icon.gifThere are 100 red dots and 100 blue dots on the plane (a lot of planar riddles lately). The dots are arranged such that no three are on the same line.

pairing of the red dots and the blue dots is a one-to-one function that assigns one blue dot to each red dot.

A valid pairing is a pairing such that when paired dots are connected with a straight line segment, no line intersections occur. This is depicted here:

pairing.gif

Prove that there exists a valid pairing.

I do not really like the solution I found for this riddle. If you find an elegant one, please post it.

Thanks to Nadav Sherman for giving me this one.

Uncountable Union

Rating: 4.5
November 21st, 2007

easyriddle.gifA very interesting riddle for those of you with some basic background in Set Theory.

Prove or Disprove the Following Claim:

There exists a subset B of P(N), such that |B| = A, and B is completely ordered by the subset relation.

Notes:

N denotes the set of natural numbers.

A denotes the cardinality of the continuum (i.e. A = aleph = 2^aleph 0).

“B is completely ordered by the subset relation” means that for every two elements a, b of B, either a is a subset of b or b is a subset of a.

Expanding Frogs

Rating: 3.5
November 21st, 2007

frog.jpgA very easy riddle. Four frogs are sitting on the corners of the unit square (i.e. they have coordinates (0,0), (0,1), (1,1) and (1,0) ). Each turn, a frog can jump over any other frog, thereby transferring itself to the symmetrical point on the other side of the static frog. For example, if the frog at (0,0) jumps over the frog at (1,1) it will land on (2,2).

easyriddle.gifProve that the frogs cannot transfer themselves to a square with a side of length 2.

I know of two different solutions to this one (both easy ;-) ).

Thanks to Nadav Sherman for asking me this riddle.

Peons

Rating: 3.5
November 5th, 2007

peons_explain.jpg This is a cute puzzle. Consider an infinite checkerboard divided in two with an infinite line lying along the x-axis, as depicted below:

peons_board.jpg

You have several peons at your disposal. In each turn you can move the peons by making one of them jump over the other, thereby killing it (removing it from the board). The peon movement is demonstrated here:

peons_explain.jpg

When the game starts, all the peons are below the line. Your goal is to place a peon above the line. The example shows how to accomplish this starting with two peons. Indeed it is obvious that in order to make a peon cross the line, two peons are needed (clearly one is not enough as it cannot move at all).

How many peons are needed in order to not only cross the line, but place a peon two squares above it? As is demonstrated below, four peons are sufficient, and indeed this is the minimum (check that you see why 2 and 3 peons cannot do it).

peons_one_square.jpg

Now for the riddle. How many peons are needed (again, starting below the line) in order to place a peon 5 squares above the line?

Set it Straight!

Rating: 3.5
October 31st, 2007

sat_choice.jpgIn this post I will give a basic example of the necessity (or uselessness – its up to you to choose – pun intended) of the axiom of choice. I will also present a less known proof of the Cantor–Bernstein–Schroeder theorem.

The Cantor–Bernstein–Schroeder Theorem

Let A and B be sets. Let f:A->B and g:B->A be injective functions. Then there exists a bijection h:A->B.

In order to proof the theorem, lets consider the following definition:

A function f:P(A)->P(A) is called monotonic if X is contained in Y => f(X) is contained in f(Y), for all X,Y in P(A).

Lemma 1

Every monotonic function f:P(A)->P(A) has a fixed point, i.e. a set X in P(A) such that f(X) = X. (The proof of this lemma is very simple, if you have the time, try to prove it yourself before reading on).

Proof of Lemma 1 

Let T be the set { X | F(X) is contained in X }. Then T is not empty (as A is in T). Let S be the intersection of all the elements of T (it exists as T is not empty). Then since S is contained in all X in T, F(S) is contained in F(X) for all X in T (by the monotonicity of F). But, by the very definition of T, F(X) is contained in X for all X in T, so F(S) is contained in X for all X in T, so F(S) is contained in the intersection of all the elements of T, so F(S) is contained in S, giving that S is an element of T. Now, by the monotonicity of F, F(F(S)) is contained in F(S) so F(S) is also an element of T. This gives that S in contained in F(S) (as S is the intersection of all the elements of T) and so S = F(S) and S is a fixed point of F.

Now we are ready to proof the Cantor–Bernstein–Schroeder theorem.

Proof of the Cantor–Bernstein–Schroeder Theorem

First, note that it is enough to prove that if A1 is contained in B, and B is contained in A, and f is a bijection from A to A1, then there exists a bijection g from A to B (make sure you understand why this claim and the C-B-S theorem are indeed equivalent).

In what follows I shall use the notation A U B to denote the union of A and B.

Define h:P(A)->P(A) as h(X) = {f(x) | x is in X} U (A\B). It is trivial that h is monotonic. By lemma 1, there is a fixed point C of h. Define a function g:A->B as

g(x) = f(x) if x is in C, x otherwise.

Then g is a bijection between A and B. Indeed note that g stabilizes both C and A\C (i.e. g(x) is in C i.f.f. x is in C). Indeed, if x is in C, then g(x) = f(x) which is in f(C), which is contained in f(C) U (A\B) = h(C) = C. So x is in C => g(x) is in C. Otherwise x is not in C, and so g(x) = x, and obviously g(x) is not in C.

Now, if g(x) = g(y), by what was said above, either both x and y are in C, or both are in A\C. Either way, this gives x = y. So g is injective.

To show sujectivity of g, we note that if x is in B and in C, then by the structure of C (= f(C) U (A\B)), x is in f(C). Now g(C) = f(C), so x is in the image of g. Otherwise, if x is in B and not in C, then g(x) = x and again x is in the image of g. We showed that g is surjective. Surjectivity and injectivity imply that g is a bijection, and the C-B-S theorem is proved.

Let N denote the set {0, 1, 2, …} of natural numbers. Lets begin by showing that the sets N and NxN (i.e. the set of pairs of natural numbers) are of the same cardinality (i.e. that there exists a bijection f:NxN->N). We call sets of the same cardinality as N countable (note that sets whose cardinality is the same or of lesser than that of N are called at most countable). I will give two proofs of the following theorem (which can be rephrased to: “NxN is a countable set”).

Theorem 1

The sets N and NxN are of the same cardinality.

Proof 1 - Using the C-B-S Theorem

Consider the functions, f:N->NxN and g:NxN->N given by: f(n) = (n,0) and g(n,m) = (2^n)*(3^m). It is obvious that f and g are injective. By the C-B-S theorem N and NxN are of the same cardinality.

Proof 2 - Actual Construction

Lets build such a function f. Let f(m,n) = (2*n+1)*2^m – 1. Now 

f(m,n) = f(u,v)

=> (2*n+1)*2^m – 1 = (2*v+1)*2^u – 1

=> (2*n+1)*2^m = (2*v+1)*2^u

=> (2*n+1)*2^m = (2*v+1)*2^u (mod 2^m)

=> 0 = (2*v+1)*2^u (mod 2^m)

=> 0 = 2^u (mod 2^m)

=> m <= u.

Symmetrically, u <= m, so u = m. Then,

=> (2*n+1) = (2*v+1)

=> n = v

So f is injective. If n is in N \ {0}, then n = s*2^k, where s is odd. Then there exists t, such that s = 2*t + 1. Then f(k,t) = n. If n = 0, then f(0,0) = 1-1 = 0 = n. So f is surjective.

We have shown f to be a bijection between NxN and N and so these sets are of the same cardinality.

The Axiom of Choice

The reason I have gone through all of the above, although I know that to many of you it may seem trivial, is to emphasize the need for the axiom of choice. Lets consider the following theorem:

Theorem 2

A union of countably many disjoint countable sets is countable.

Now, a little reflection may suggest that this is a trivial consequence of what was proved before. We have a countable family of sets A = {A1, A2, …} where each set Ai is countable. So for each i, there is a bijection Fi:N->Ai. So there is a bijection g:NxN->U(Ai) given by g(i,j) = Fi(j). g is indeed a bijection as the Ai’s are disjoint.

Well, surprisingly the above does not constitute a proof of theorem 2!

Not only that, I claim that theorem 2 cannot be proved without employing the axiom of choice. Before going on any further, lets write down this axiom of choice we keep talking about:

The Axiom of Choice

Let S be a family of non-empty sets. Then there exists a function f:S->U(S) such that f(S) is in S.

Note that U(S) denotes the union of all the elements of S.

This may seem trivial. After all, since all the elements of S are non-empty, they each contain at least one member. So there must be a function that maps to each set one of its elements!

Indeed, for the case that S is finite, we can prove the constructability of the function f in the axiom of choice (thus the axiom of choice for the case that S is finite, no longer needs to be an axiom). Say S = {A1, A2, …, An}, where each Ai  is non-empty (1 <= i <= n). Then there exist elements {a1, a2, …, an} such that ai is in Ai. Now define f(Ai) = ai.

As it turns out, for infinite S, that this construction is possible is no longer a consequence of the other axioms of set theory. Say S = {A1, A2, …} and A1, A2, … are non-empty so that ai is a member of Ai. We can define a function f:S->U(S) such that f(Ai) = a1 for all i. We can also define a function f, such that f(A1) = a1, and f(Ai) = a2 for all i not in {1}. We can continue with this process for any finite number of steps, creating a function f:S->U(S) such that f(Ai) = ai for all i < n, and f(Ai) = a1 for i >= n. The point here is that we cannot perform this process for an infinite number of steps.

Some Examples

To put the axiom of choice in very popular terms: we can perform any finite number of arbitrary choices of elements from sets, but we cannot perform an infinite number of arbitrary such choices. 

For example, say that S = {A1, A2, …} and each set Ai={1,i}. Then the following functions f,g:S->U(S) exist without assuming the axiom of choice

f(Ai) = 1, g(Ai) = i.

If it is know that each set Ai is of size 2 containing the element 1, then the following functions can also be shown to exist w/o assuming the axiom of choice:

f(Ai) = 1, g(Ai) = the unique element of Ai\{1}.

Now, if it is known that the sets Ai are disjoint and of each contains exactly two natural numbers, then, again, the functions

f(Ai) = max(Ai), g(Ai) = min(Ai)

can  be shown to exist without assuming the axiom of choice. This principal can be extended to the case where the sets Ai are contained in a set A having an order relation with the property that each non-empty subset of A has a least element (we say that the order relation is a well ordering of A). Then we can define:

f(Ai) = least element of Ai.

If on the other hand all that is known on the sets Ai is that they contain exactly two members, the existence of a function f as in the axiom of choice is not a consequence of any of the other axioms.

The previous two examples suggest an alternative formulation for the axiom of choice, namely that we can define a well ordering on any set. If this were true, we could define a well ordering on the set U(Ai) and define f as above. Indeed it turns out that the axiom of choice is equivalent to the principal that we can define a well ordering on any set (in the sense that using the other axioms of set theory and any one of the above we can prove the other).

On the Existance of Sets

To avoid paradoxes (such as Russell’s paradox), we must constrain ourselves to assuming the existence of only those objects which can be shown to exist from the axioms. For example, let S be the set of all sets (i.e. every set is a member of S). Then by Cantor’s theorem (that P(X) is always of a larger cardinality than X) S cannot exist!

The point to note here is that the mere description of a set does not guaranty its existence.

Now, the same applies to functions (for the purists of you – this is simply a special case of the preceding sentence as all functions are actually sets). The mere description of a function does not guaranty its existence.

As it turns out, the existence of the function f which appears in the axiom of choice, in the case that S is infinite, does not follow from the other axioms of set theory (it can in fact be shown that instead of accepting the axiom of choice, accepting an axiom declaring the in-existence of the function f does not lead to any contradictions with the other axioms of set theory!).

Now, lets consider a valid proof of theorem 2 using the axiom of choice.

Proof of Theorem 2

S = { A1, A2, … } is a countable set of disjoint countable sets. That Ai is countable is equivalent to the statement: Fi = { f | f is a bijection between N and Ai } is a non-empty set. Consider the set T = {F1, F2, …}. T is a set of non-empty sets, so by the axiom of choice there is a function g:T->U(T) such that g(Fi) is in Fi.

Define h:NxN->S as follows:

h(i,j) = g(Fi)(j)

Then obviously h is a bijection.

The reason h does exist, is subtle. It relies on the fact that for each input a clear procedure for calculating the output of the function can be given. I know this is vague, and should you be interested, I will give a full construction of h, demonstrating its existence, in a future post.

Some Thoughts

So, does the above example demonstrate that the axiom of choice really is necessary?

Not really. Consider the following theorem:

Theorem 2b

Let S be a countable family of pairs {(Ai,Fi)} such that the Ai’s are countable and disjoint and Fi is a bijection from N to Ai. Then U(Ai) is countable.

Proof w/o the Axiom of Choice

Define f:NxN->U(Ai) as follows:

f(i,j) = Fi(j).

Obviously f is injective and surjective. Thus we have constructed a bijection from NxN to U(Ai), and by theorem 1, U(Ai) is countable.

Now, in all practical cases (that I can think of – please comment onthis if you have a good counter example) we never use the fact that the set is countable without explicitly having a bijection between the set and N. So, in all practical cases, the weaker version of theorem 2 (namely theorem 2b) can be used.

Final Note 

The acceptance of the axiom of choice seems very logical. Nevertheless, it should be mentioned that it results in some very undesirable consequences, such as unmeasurable sets and the Banach-Tarski paradox (which is very interesting and on which I shall write in a future post).

Finally, I included the axioms of set theory, for your convenience:

zfc.gif

Find the Duplicate

Rating: 4
October 13th, 2007

Dany Valevsky gave me this very cool riddle.

You are given a vector of size N, the elements of which are numbers in the range 1,…,N-1. I.e. there is at least one repeating element. Give an algorithm that finds a repeating element (it does not matter which one, in case there are several) with O(N) time complexity and O(1) memory complexity.

NOTE - the time and memory complexities are calculated in integers. I.e. the input is of size N, not N*logN.

That’s it!

Pirates!

Rating: 3
October 13th, 2007

easyriddle.gifA ship in the plane has integer coordinates. It also has integer velocity (again in ZxZ). Each turn the ship advances according to its velocity. Here is an example of a ship with velocity (3, 1).

pirates.gif

You, as a pirate, want to blow up the ship. You do not know the original position of the ship nor its velocity. You can only shoot one bomb per turn, which blows up exactly one point of the plane. What strategy should you employ in order to be sure you will eventually hit the ship?

It is very recommended to watch to this video while solving the riddle. It provides great inspiration!

Thanks to Nadav Sherman for giving me this riddle.

Turing Machines in Action

Rating: 2.5
October 7th, 2007

tmbinaryaddreverse.gif In this post I will define turing machines and demonstrate a simple one in action.

Definition of a Turing Machine

A turing machine (TM) description consists of a tuple of 7 objects: a set of states Q, an input alphabet I, a tape alphabet T, a transition function F, a set of accept states Qaccept, a set of reject states Qreject and a start state Qstart.

The following requirements must hold: I is contained in T. ‘ ‘ (the space symbol) is a member of T but not of I. F is a function from QxT->QxTx{1,-1}. Qaccept and Qreject are disjoint subsets of Q. Qstart is a member of Q.

The turing machine itself consists of: its description (as defined above), an infinite tape, a head and a current state.

The tape is an infinite list of symbols from T (it is only infinite to the right, i.e. it has indexes 0, 1, 2, … but no negative indexes). The tape initially contains the input (a string of symbols from I) starting at index 0 followed by infinitely many ’ ‘ (space) symbols. As we required that the ‘ ‘ symbol is not a member of I the TM can detect the end of the input.

The head is simply a pointer to the tape. It initially points to tape position 0.

The current state is a member of Q, and is initialized to Qstart.

Operation

A TM with a description {Q, I, T, F, Qaccept, Qreject, Qstart} is initialized as indicated above. The result of its operation consists of the contents of the tape as well as an additional flag which can be either ACCEPT or REJECT. In the description below, the action accept means setting this flag to the ACCEPT state and halting. The same goes for the reject action.

The TM operates as follows:

  1. If current_state is in Qaccept, accept. If it is in Qreject, reject.
  2. Read a symbol from the tape at the position indicated by the head. Denote it as current_symbol.
  3. Set next_state, next_symbol, next_head_pos to F(current_state, current_symbol).
  4. Write next_symbol to the tape at the current head position.
  5. If next_head_pos is 1 move the head one position to the right.
  6. If next_head_pos is -1 move the head one position to the left. If the head tries to move to the left from position 0, do not move the head.
  7. Set current_state to next_state.

And that’s it!

An Example

The following is the description of a TM that multiplies two integers in unary notation. For example, if its input is of the form ’111*11=’ it will accept with an output tape of ’111*11=111111′.

The code of the machine:

TMUnaryMultiply = {
    'description' : """The input must be of the form '11*111='. The output tape will look like '11*111=111111'.""",
    'start_state' : 'mark_start',
#input verification
    'mark_start' :
        {
            '1':['verify_input_1','#',1],
        },
    'verify_input_1' :
        {
            '1':['verify_input_1','1',1],
            '*':['verify_input_*','*',1]
        },
    'verify_input_*' :
        {
            '1':['verify_input_2','1',1]
        },
    'verify_input_2' :
        {
            '1':['verify_input_2','1',1],
            '=':['verify_input_ ','=',1]
        },
    'verify_input_ ' :
        {
            ' ':['move_head_to_left',' ',-1]
        },
    'move_head_to_left':
        {
            '1':['move_head_to_left','1',-1],
            '=':['move_head_to_left','=',-1],
            '*':['move_head_to_left','*',-1],
            '#':['read_ones','1',-1] # this will try to move the head off the tape
        },
# actual multiplication
    'read_ones' :
        {
            '1':['wait_for_*', '#', 1],
            '*':['halt','*', -1],
        },
    'wait_for_*' :
        {
            '1':['wait_for_*', '1', 1],
            '*':['copy_ones', '*', 1]
        },
    'copy_ones' :
        {
            '1':['wait_for_=', '#', 1],
            '=':['move_left2','=', -1],
        },
    'wait_for_=' :
        {
            '1':['wait_for_=','1',1],
            '=':['wait_for_ ','=',1]
        },
    'wait_for_ ' :
        {
            '1':['wait_for_ ','1',1],
            ' ':['move_left','1',-1]
        },
    'move_left':
        {
            '1':['move_left','1',-1],
            '=':['move_left','=',-1],
            '#':['copy_ones','1', 1]
        },
    'move_left2':
        {
            '1':['move_left2','1',-1],
            '*':['move_left2','*',-1],
            '#':['read_ones','1', 1]
        },
    'halt' : {}
}

I think the format of the description is self explanatory (for the python connoisseurs among you, it consists of some nested dictionaries). It can be fed into my Python TM Emulator (the code of which is included at the end of this article). The emulator can also automatically generate a flow graph of the TM. Here is a sample ouput:

tmunarymultiply.gif

Running the emulator (in verbose mode) with the TMUnaryMultiply description above on the input ’111*11=’ results in the following:

Initialing TM [The input must be of the form '11*111='. The output tape will look like '11*111=111111'.]...
Running tm with input [111*11=]...
                          V
<          mark_start>: ['1', '1', '1', '*', '1', '1', '=', ' ']
                               V
<      verify_input_1>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                    V
<      verify_input_1>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                         V
<      verify_input_1>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                              V
<      verify_input_*>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                                   V
<      verify_input_2>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                                        V
<      verify_input_2>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                                             V
<      verify_input_ >: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                                        V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                                   V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                              V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                         V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                    V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                               V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                          V
<   move_head_to_left>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                          V
<           read_ones>: ['1', '1', '1', '*', '1', '1', '=', ' ']
                               V
<          wait_for_*>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                    V
<          wait_for_*>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                         V
<          wait_for_*>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                              V
<           copy_ones>: ['#', '1', '1', '*', '1', '1', '=', ' ']
                                                   V
<          wait_for_=>: ['#', '1', '1', '*', '#', '1', '=', ' ']
                                                        V
<          wait_for_=>: ['#', '1', '1', '*', '#', '1', '=', ' ']
                                                             V
<          wait_for_ >: ['#', '1', '1', '*', '#', '1', '=', ' ']
                                                        V
<           move_left>: ['#', '1', '1', '*', '#', '1', '=', '1']
                                                   V
<           move_left>: ['#', '1', '1', '*', '#', '1', '=', '1']
                                              V
<           move_left>: ['#', '1', '1', '*', '#', '1', '=', '1']
                                                   V
<           copy_ones>: ['#', '1', '1', '*', '1', '1', '=', '1']
                                                        V
<          wait_for_=>: ['#', '1', '1', '*', '1', '#', '=', '1']
                                                             V
<          wait_for_ >: ['#', '1', '1', '*', '1', '#', '=', '1']
                                                                  V
<          wait_for_ >: ['#', '1', '1', '*', '1', '#', '=', '1', ' ']
                                                             V
<           move_left>: ['#', '1', '1', '*', '1', '#', '=', '1', '1']
                                                        V
<           move_left>: ['#', '1', '1', '*', '1', '#', '=', '1', '1']
                                                   V
<           move_left>: ['#', '1', '1', '*', '1', '#', '=', '1', '1']
                                                        V
<           copy_ones>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                                                   V
<          move_left2>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                                              V
<          move_left2>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                                         V
<          move_left2>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                                    V
<          move_left2>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                               V
<          move_left2>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                          V
<          move_left2>: ['#', '1', '1', '*', '1', '1', '=', '1', '1']
                               V
<           read_ones>: ['1', '1', '1', '*', '1', '1', '=', '1', '1']
                                    V
<          wait_for_*>: ['1', '#', '1', '*', '1', '1', '=', '1', '1']
                                         V
<          wait_for_*>: ['1', '#', '1', '*', '1', '1', '=', '1', '1']
                                              V
<           copy_ones>: ['1', '#', '1', '*', '1', '1', '=', '1', '1']
                                                   V
<          wait_for_=>: ['1', '#', '1', '*', '#', '1', '=', '1', '1']
                                                        V
<          wait_for_=>: ['1', '#', '1', '*', '#', '1', '=', '1', '1']
                                                             V
<          wait_for_ >: ['1', '#', '1', '*', '#', '1', '=', '1', '1']
                                                                  V
<          wait_for_ >: ['1', '#', '1', '*', '#', '1', '=', '1', '1']
                                                                       V
<          wait_for_ >: ['1', '#', '1', '*', '#', '1', '=', '1', '1', ' ']
                                                                  V
<           move_left>: ['1', '#', '1', '*', '#', '1', '=', '1', '1', '1']
                                                             V
<           move_left>: ['1', '#', '1', '*', '#', '1', '=', '1', '1', '1']
                                                        V
<           move_left>: ['1', '#', '1', '*', '#', '1', '=', '1', '1', '1']
                                                   V
<           move_left>: ['1', '#', '1', '*', '#', '1', '=', '1', '1', '1']
                                              V
<           move_left>: ['1', '#', '1', '*', '#', '1', '=', '1', '1', '1']
                                                   V
<           copy_ones>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1']
                                                        V
<          wait_for_=>: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1']
                                                             V
<          wait_for_ >: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1']
                                                                  V
<          wait_for_ >: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1']
                                                                       V
<          wait_for_ >: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1']
                                                                            V
<          wait_for_ >: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1', ' ']
                                                                       V
<           move_left>: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1', '1']
                                                                  V
<           move_left>: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1', '1']
                                                             V
<           move_left>: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1', '1']
                                                        V
<           move_left>: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1', '1']
                                                   V
<           move_left>: ['1', '#', '1', '*', '1', '#', '=', '1', '1', '1', '1']
                                                        V
<           copy_ones>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                                                   V
<          move_left2>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                                              V
<          move_left2>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                                         V
<          move_left2>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                                    V
<          move_left2>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                               V
<          move_left2>: ['1', '#', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                                    V
<           read_ones>: ['1', '1', '1', '*', '1', '1', '=', '1', '1', '1', '1']
                                         V
<          wait_for_*>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1']
                                              V
<           copy_ones>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1']
                                                   V
<          wait_for_=>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1']
                                                        V
<          wait_for_=>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1']
                                                             V
<          wait_for_ >: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1']
                                                                  V
<          wait_for_ >: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1']
                                                                       V
<          wait_for_ >: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1']
                                                                            V
<          wait_for_ >: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1']
                                                                                 V
<          wait_for_ >: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', ' ']
                                                                            V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                                                       V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                                                  V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                                             V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                                        V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                                   V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                              V
<           move_left>: ['1', '1', '#', '*', '#', '1', '=', '1', '1', '1', '1', '1']
                                                   V
<           copy_ones>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1', '1']
                                                        V
<          wait_for_=>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1']
                                                             V
<          wait_for_ >: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1']
                                                                  V
<          wait_for_ >: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1']
                                                                       V
<          wait_for_ >: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1']
                                                                            V
<          wait_for_ >: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1']
                                                                                 V
<          wait_for_ >: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1']
                                                                                      V
<          wait_for_ >: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', ' ']
                                                                                 V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                                            V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                                       V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                                  V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                             V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                        V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                   V
<           move_left>: ['1', '1', '#', '*', '1', '#', '=', '1', '1', '1', '1', '1', '1']
                                                        V
<           copy_ones>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1', '1', '1']
                                                   V
<          move_left2>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1', '1', '1']
                                              V
<          move_left2>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1', '1', '1']
                                         V
<          move_left2>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1', '1', '1']
                                    V
<          move_left2>: ['1', '1', '#', '*', '1', '1', '=', '1', '1', '1', '1', '1', '1']
                                         V
<           read_ones>: ['1', '1', '1', '*', '1', '1', '=', '1', '1', '1', '1', '1', '1']
accept

Finally, here is the code of my Python TM Emulator:

class TM:
    def __init__(self, states, accept_states, reject_states):
        print 'Initialing TM [%s]...' % (states['description'])
        self.states = states
        self.accept_states = accept_states
        self.reject_states = reject_states
        self.halt_states = accept_states + reject_states
def run(self, input,verbose=False):
        head_pos = 0
        current = self.states['start_state']
        tape = [x for x in input]+[" "]
        while current not in self.halt_states:
            if verbose:
                print ' '*(22+1+1+1+5*(head_pos)+1)+'V'
                print '<%20s>: %s' % (current, tape)
            try:
                next_state, next_symbol, head_movement = self.states[current][tape[head_pos]]
            except KeyError:
                return "reject"
            tape[head_pos] = next_symbol
            head_pos += head_movement
            if head_pos < 0:
                head_pos = 0
            elif head_pos == len(tape):
                tape.append(" ")
            current = next_state
        print ''.join(tape)
        if current in self.accept_states:
            return 'accept'
        else:
            assert(current in self.reject_states)
            return 'reject'

And that of a simple test method:

def test():
    mul = TM(TMUnaryMultiply, ["halt"], [""])
    inputs = ['bad_input','111*11=','1*1=','*1=','1*=','11111111*1111111=']
    for input in inputs:
        print 'Running tm with input [%s]...' % (input)
        print mul.run(input)

A sample output:


>>> test()
Initialing TM [The input must be of the form '11*111='. The output tape will look like '11*111=111111'.]...
Running tm with input [bad_input]...
reject
Running tm with input [111*11=]...
111*11=111111
accept
Running tm with input [1*1=]...
1*1=1
accept
Running tm with input [*1=]...
reject
Running tm with input [1*=]...
reject
Running tm with input [11111111*1111111=]...
11111111*1111111=11111111111111111111111111111111111111111111111111111111
accept

A second, more interesting TM performs binary addition:


TMBinaryAddReverse = {
    'description' : """The input must be of the form '1010+0010=' (i.e. in reversed binary notation, operands of same size padded with an extra 0). The output will look like '1010+0010=1110'.""",
    'start_state' : 'mark_start_carry_0',
    'halt': {},
    'mark_start_carry_0' :
    {
        '1':['sum_1','#',1],
        '0':['sum_0','#',1],
        '+':['halt','+',1]
    },
    'mark_start_carry_1' :
    {
        '1':['sum_2','#',1],
        '0':['sum_1','#',1],
        '+':['halt','+',1]
    },
    'sum_0' :
    {
        '1':['sum_0','1',1],
        '0':['sum_0','0',1],
        '+':['sum_0_get_next','+',1],
        '=':['write_0','=',1],
    },
    'sum_1' :
    {
        '1':['sum_1','1',1],
        '0':['sum_1','0',1],
        '+':['sum_1_get_next','+',1],
        '=':['write_1','=',1],
    },
    'sum_2' :
    {
        '1':['sum_2','1',1],
        '0':['sum_2','0',1],
        '+':['sum_2_get_next','+',1],
        '=':['write_2','=',1],
    },
    'sum_3' :
    {
        '1':['sum_3','1',1],
        '0':['sum_3','0',1],
        '=':['write_3','=',1],
    },
    'sum_1_get_next':
    {
        '#':['sum_1_get_next','#',1],
        '1':['sum_2','#',1],
        '0':['sum_1','#',1],
    },
    'sum_0_get_next':
    {
        '#':['sum_0_get_next','#',1],
        '1':['sum_1','#',1],
        '0':['sum_0','#',1],
    },
    'sum_2_get_next':
    {
        '#':['sum_2_get_next','#',1],
        '1':['sum_3','#',1],
        '0':['sum_2','#',1],
    },
    'write_0':
    {
        '1':['write_0','1',1],
        '0':['write_0','0',1],
        ' ':['move_left_carry_0','0',-1],
    },
    'write_1':
    {
        '1':['write_1','1',1],
        '0':['write_1','0',1],
        ' ':['move_left_carry_0','1',-1],
    },
    'write_2':
    {
        '1':['write_2','1',1],
        '0':['write_2','0',1],
        ' ':['move_left_carry_1','0',-1],
    },
    'write_3':
    {
        '1':['write_3','1',1],
        '0':['write_3','0',1],
        ' ':['move_left_carry_1','1',-1],
    },
    'move_left_carry_0':
    {
        '0':['move_left_carry_0','0',-1],       
        '1':['move_left_carry_0','1',-1],       
        '=':['move_left_carry_0','=',-1],       
        '#':['move_left_carry_0','#',-1],       
        '+':['move_left_carry_0_part_2','+',-1],       
    },
    'move_left_carry_1':
    {
        '0':['move_left_carry_1','0',-1],       
        '1':['move_left_carry_1','1',-1],       
        '=':['move_left_carry_1','=',-1],       
        '#':['move_left_carry_1','#',-1],       
        '+':['move_left_carry_1_part_2','+',-1],       
    },
    'move_left_carry_0_part_2':
    {
        '0':['move_left_carry_0_part_2','0',-1],       
        '1':['move_left_carry_0_part_2','1',-1],
        '#':['mark_start_carry_0','#',1],
    },
    'move_left_carry_1_part_2':
    {
        '0':['move_left_carry_1_part_2','0',-1],       
        '1':['move_left_carry_1_part_2','1',-1],
        '#':['mark_start_carry_1','#',1],
    },
}

Its graph is included here for your convenience (click to enlarge):

tmbinaryaddreverse.gif

And a sample run:


>>> mul = TM(TMBinaryAddReverse, ["halt"], [])
Initialing TM [The input must be of the form '1010+0010=' (i.e. in reversed binary notation, operands of same size padded with an extra 0). The output will look like '1010+0010=1110'.]...
>>> mul.run('1110+1100=',True)
                                  V
<mark_start_carry_0>          : ['1', '1', '1', '0', '+', '1', '1', '0', '0', '=', ' ']
                                       V
<sum_1>                       : ['#', '1', '1', '0', '+', '1', '1', '0', '0', '=', ' ']
                                            V
<sum_1>                       : ['#', '1', '1', '0', '+', '1', '1', '0', '0', '=', ' ']
                                                 V
<sum_1>                       : ['#', '1', '1', '0', '+', '1', '1', '0', '0', '=', ' ']
                                                      V
<sum_1>                       : ['#', '1', '1', '0', '+', '1', '1', '0', '0', '=', ' ']
                                                           V
<sum_1_get_next>              : ['#', '1', '1', '0', '+', '1', '1', '0', '0', '=', ' ']
                                                                V
<sum_2>                       : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', ' ']
                                                                     V
<sum_2>                       : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', ' ']
                                                                          V
<sum_2>                       : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', ' ']
                                                                               V
<sum_2>                       : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', ' ']
                                                                                    V
<write_2>                     : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', ' ']
                                                                               V
<move_left_carry_1>           : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                                          V
<move_left_carry_1>           : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                                     V
<move_left_carry_1>           : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                                V
<move_left_carry_1>           : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                           V
<move_left_carry_1>           : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                      V
<move_left_carry_1>           : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                 V
<move_left_carry_1_part_2>    : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                            V
<move_left_carry_1_part_2>    : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                       V
<move_left_carry_1_part_2>    : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                  V
<move_left_carry_1_part_2>    : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                       V
<mark_start_carry_1>          : ['#', '1', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                            V
<sum_2>                       : ['#', '#', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                 V
<sum_2>                       : ['#', '#', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                      V
<sum_2>                       : ['#', '#', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                           V
<sum_2_get_next>              : ['#', '#', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                                V
<sum_2_get_next>              : ['#', '#', '1', '0', '+', '#', '1', '0', '0', '=', '0']
                                                                     V
<sum_3>                       : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0']
                                                                          V
<sum_3>                       : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0']
                                                                               V
<sum_3>                       : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0']
                                                                                    V
<write_3>                     : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0']
                                                                                         V
<write_3>                     : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', ' ']
                                                                                    V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                               V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                          V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                     V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                           V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                      V
<move_left_carry_1>           : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                 V
<move_left_carry_1_part_2>    : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                            V
<move_left_carry_1_part_2>    : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                       V
<move_left_carry_1_part_2>    : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                            V
<mark_start_carry_1>          : ['#', '#', '1', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                 V
<sum_2>                       : ['#', '#', '#', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                      V
<sum_2>                       : ['#', '#', '#', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                           V
<sum_2_get_next>              : ['#', '#', '#', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                V
<sum_2_get_next>              : ['#', '#', '#', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                     V
<sum_2_get_next>              : ['#', '#', '#', '0', '+', '#', '#', '0', '0', '=', '0', '1']
                                                                          V
<sum_2>                       : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1']
                                                                               V
<sum_2>                       : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1']
                                                                                    V
<write_2>                     : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1']
                                                                                         V
<write_2>                     : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1']
                                                                                              V
<write_2>                     : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', ' ']
                                                                                         V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                                    V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                               V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                          V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                     V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                           V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                      V
<move_left_carry_1>           : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                 V
<move_left_carry_1_part_2>    : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                            V
<move_left_carry_1_part_2>    : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                 V
<mark_start_carry_1>          : ['#', '#', '#', '0', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                      V
<sum_1>                       : ['#', '#', '#', '#', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                           V
<sum_1_get_next>              : ['#', '#', '#', '#', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                V
<sum_1_get_next>              : ['#', '#', '#', '#', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                     V
<sum_1_get_next>              : ['#', '#', '#', '#', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                          V
<sum_1_get_next>              : ['#', '#', '#', '#', '+', '#', '#', '#', '0', '=', '0', '1', '0']
                                                                               V
<sum_1>                       : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0']
                                                                                    V
<write_1>                     : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0']
                                                                                         V
<write_1>                     : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0']
                                                                                              V
<write_1>                     : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0']
                                                                                                   V
<write_1>                     : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', ' ']
                                                                                              V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                                                         V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                                                    V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                                               V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                                          V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                                     V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                                V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                           V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                      V
<move_left_carry_0>           : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                 V
<move_left_carry_0_part_2>    : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
                                                      V
<mark_start_carry_0>          : ['#', '#', '#', '#', '+', '#', '#', '#', '#', '=', '0', '1', '0', '1']
####+####=0101
'accept'

I think this machine can be simplified by writing the carry directly to the tape (i.e. writing a ’0′, a ’1′ or a ’2′ to the result part of the tape, and later converting the ’2′s to ’1′ and ’0′ as appropriate).

If anyone of you manages to generate an equivalent machine with a shorter description (less states) please post it!

Galois Theory for Dummies – Part I

Rating: 3.5
September 27th, 2007

galois.jpg Galois Theory is an algebraic theory providing a powerful connection between fields and groups. Many complicated problems involving fields can be converted to (possibly) simpler problems involving groups.

Galois Theory, albeit being extremely beautiful (it answers some very elementary questions, which were all open problems until its arrival), is far from being wildly known (i.e. by non-mathematicians). One of the reasons for this, in my opinion, is that is it a vast subject and most books on the topic are filled with too many details and so are inadequate for a quick read.

I wrote this short article as an introduction to Galois Theory. It is not very easy to read, but as it is very short (only 4 pages), I believe mathematically oriented readers, without excessive mathematical background will be able to read it. I chose to write it as a Latex document, instead of simple HTML residing directly in my site, as it contains a lot of complex mathematical notations.

The article consists of only the first part of the introduction to the topic. I believe that perhaps two more parts (of about equal length) will be necessary to complete the introduction.

If you like this article, let me know (this will enhance the chances of me actually writing down the remaining parts). If anything is not clear – please ask.

Enjoy!

Seam Carving

Rating: 5
August 27th, 2007

carved_pagoda.jpg I recently watched this interesting video by Shai Avidan and Ariel Shamir from MERL. They developed an extremely cool image resizing technique called seam carving. They explain all about it in this article.

One of the coolest things about their idea is that is it easy to implement. I implemented a semi-optimized version of their algorithm in a couple of hours of coding. All the images in this post were generated by my code.

The following image of a pagoda will assist me in demonstrating the ideas to follow:

pagoda.jpg

First lets discuss the two image resizing methods currently used, namely scaling and cropping. Scaling consists of uniformly resizing the image. If for example we were to reduce an image 100 pixels wide to an image 50 pixels wide, we could simply average every two neighbouring pixels (a generalization of this technique can be applied even when the image’s new width does not divide the image’s original width, and of course to resizing both the width and the height of the image).

Scaling is efficient and, in some sense, you do not lose a lot of information from the original image. There are two big problems with scaling however:

  • If the new dimensions of the image are not proportionate to the original dimensions, the scaled image is distorted.
  • The method does not take into account the contents of the image at all (I will clarify this point shortly).

An example of the pagoda image scaled:

scaled_pagoda.jpg

A second widely used form of image resizing is cropping. Cropping consists of simply selecting a sub image of the desired proportions from the original image – i.e. removing the borders of the image. An important question now arises: how do we select which box of the image to keep (or equivalently, how do we select which borders to remove)?

Lets say we assign a weight to each of the pixels of the image, representing its importance. Then selecting the cropping box amounts to simply selecting the box containing the pixels whose sum of weights is highest. We will refer to the weights of the pixels as the energy of the pixels. So optimal cropping would select the box with the highest energy.

Now, all we are left to do is assign the weights to the pixels. Following Avidan and Shamir, we can use simple gradient magnitude as our energy function. Gradient magnitude simply means calculating the gradient at each pixel (regarding the image as a function from R2 to R) and taking its magnitude at each pixel. Here is the result of applying gradient magnitude on the pagoda image:

grad_mag.JPG

The perceptive reader would have noticed that a color image is usually a function from R2 to R3 and not to R (as there are usually three independent color channels). There are several ways to calculate the gradient magnitude in this case. The one I used in my code is to calculate the magnitudes of the three gradients separately and average the results. Note that averaging before calculating the gradients is a bad idea as a lot of information is lost in the process!

Note that the gradient magnitude of a pixel constitutes some measure of how likely that pixel is to be a border pixel (and thus more important).

Using all of this, an optimal cropping of the pagoda image is shown here:

croped_pagoda.jpg

The cropping is optimal in the sense that this is the sub-image of the given dimensions containing the most energy.

Note that all the water from the bottom of the image, and the flowers from the top are missing, as well as the left side of the white house on the left of the pagoda.

Now, unlike the scaling method, the cropping technique just presented takes into account the contents of the image. But what if the image contained two pagodas, one at each side of the image? Cropping would only be able to select one of them.

So what we are looking for is some way to automatically remove the uninteresting parts in the middle of the image, and bring the interesting bits closer together.

A naive solution, similar to regular cropping, would be to remove the columns of the image with the least amount of energy not necessarily from the border. This technique causes severe artifacts.

Avidan’s and Shamir’s seam carving technique is an improvement of this. They define a vertical seam as an 8-connected path from top-to-bottom of the image containing one pixel in each row. 8-connectedness means that if pixel (x,y) is in the vertical seam, then exactly one of the pixels (x+1, y+1), (x, y+1) or (x-1, y) is in the vertical seam as well (unless of course y = the height of the image). A horizontal seam is defined similarly.

Now instead of deleting the column (row) with the least energy, they suggest deleting the vertical (horizontal) seam with the least energy. The seam with the least energy can be easily found using dynamic programming.

The first seam to be removed in the pagoda image is depicted below:

seam.JPG

The process of reducing an image’s size via repeated deletions of the least important seams is called seam carving. The result of applying some seam carving to the pagoda image is shown here:

carved_pagoda.jpg

Note that although some artifacts are present in the image, it is the best among all the reduced images (at least – a mon avis).

A problem with seam carving, compared to scaling, is efficiency. I have some algorithmic ideas that I think might substantially reduce computation costs. I will let you know when I will have time to test them.

Another example is shown. The original image:

pagodas_small.jpg

Optimal cropping:

pagodas_small_cropped.jpg

Scaling:

pagodas_small_scaled.jpg

And finally, seam carving:

pagodas_small_carved.jpg

And just to convince you that the method doesn’t only work on pagodas:

hopper_landscape.jpg

And the carved version:

hopper_landscape_carved.jpg

Piece of Cake

Rating: 3.5
August 23rd, 2007

In this article you will find a collection of riddles. They are all either very well known or extremely easy. Enjoy!

Cutting the Cake

cake1.jpgHow can you cut a circular cake to eight even pieces with 3 cuts? What is the maximum number of pieces possible with 4 cuts?

Colorblind Pills

red_or_blue_pill.jpgYou have a rare and fatal disease. The only cure for your disease is taking one Biaxin pill (which is blue) and one Robaxin pill (which is red) exactly at midnight, for two days. I.e. you have to take one of each today at midnight and one of each tomorrow at midnight, for a total of four pills. You have exactly four pills (two of each kind) at your disposal. The pills, apart from being in different colors, are identical in appearance.

Now, in addition to your fatal illness, you are also colorblind. How can you save yourself?

Note that in order to be saved you have to take the pills exactly as described above – e.g. taking all four pills at the same time will get you killed!

And no - there aren’t other people that can help you, etc…

Cup of Coffee, Fuck of Tea 

glass-of-milk.jpgYou have a cup of coffee and a cup of milk, both containing the same (unspecified) amount of liquid. You take one spoon full of milk and put it in the coffee. You mix well. You then return a spoon full of the mixture to the cup of milk. Which is more: milk in coffee or coffee in milk?

The title is taken from the famous Israeli movie Operation Grandma. Thanks to Misha Seltzer for this riddle.

The Virgin Islands

british_virgin_islands_property.jpg You are shipwrecked on the Virgin Islands (since Chuck Norris visited them, they are known simply as The Islands). You know that there are three tribes in the Islands: The Liatu tribe, that always lies, the Honestu tribe, that always tells the truth, and the Randomu tribe that answers randomly to every question asked. As you wander around the Islands, you come across 3 tribesmen. By the way they are dressed you can tell that they belong to different tribes (i.e. there is one member of each tribe) but you do not know to which tribe they belong. The native language on the Islands is French, which you speak fluently. It is well known that in French you can only ask “yes-no” questions. You are also familiar with an ancient tradition on the Islands, that says that it is impolite to ask more than 2 questions (each addressed to one specific person) per conversation, no más. You are standing at a fork and need to decide whether to go left or right. How can you get your information?

Maximal Partitions

Rating: 2
August 17th, 2007

When considering the extra-credit of the 23 and 2000 riddle, I thought of an additional interesting problem.

Let N be a positive integer. A partition of N into m parts (an m-partition of N) is a multiset of m positive integers, such that their sum equals N. A multiset is a set which may contain repeated elements.

So if N is 5, then the multiset {1, 1, 1, 2} is a 4-partition of N.

Now, for a multiset A of numbers, define the mul of A, denoted mul(A), as the product of all the numbers. So mul({1, 1, 1, 2}) = 2.

Now, let N be a positive integer. An m-partition P of N is called a maximal partition of N,  if for all positive integers k, and all k-partitions of N, Pk, mul(P) >= mul(Pk).

E.g. the 2-partition {2, 3} is a maximal partition of 5 (its the only one). Note that as the number of partitions is finite there always is at least one maximal partition.

Finally, define maxmul(N) to be the mul of a maximal partition of N.

Now, we finally got to the point: given a positive integer N, what is maxmul(N)?

If you want to think about this one yourself, stop reading now, because the rest of this post discusses the solution. Anyway, try to have at least an initial intuition on the answer before reading on.

I must admit that my initial intuition about this problem was that if N is even, then a maximal partition of N will be a N/2-partition of the form {2, 2, …, 2}. This is wrong.

As it turns out, for all N > 1, the maximal partition of N depends on the value of N modulo 3.

If N = 0 (3) then the maximal partition is of the form {3, 3, …, 3}.

If N = 1 (3) then the maximal partition is of the form {3, 3, …, 3, 2, 2}.

If N = 2 (3) then the maximal partition is of the form {3, 3, …, 3, 2}.

I leave the reader the details of the proof. It is not too complicated using induction. I will now try to explain the reason behind this fact.

Notice a weird fact about the number 3. It is, in some way, the densest integer. Why is that so?

As you might have guessed, this is because 3 is the integer closest to e (=~2.718).

Lets consider the continuous equivalent of this problem. Say that N is a real-number. For each integer m, define an m-partition of Nas a multiset of m positive real numbers, whose sum is N. Note that now the number of partitions is infinite (for each m>1, even the number of m-partitionsis infinite) and so we are no longer guaranteed of the existence of a maximal partition (although it is easy to show that it indeed exists).

Using some basic analytic tools (such as Lagrange multipliers) it is easy to show that given m, the partition whose mul is maximal is the partition {N/m, …, N/m}.

So now we are left with finding an integer m such that the expression (N/m)^mis maximal. Again, lets consider the continuous equivalent of this. For each 1<x<∞, define f(x) = (N/x)^x. Using elementary calculus, it is clear that this function has a single maximum for x = N/e. A graph of the function f(x) is shown here (the blue line indicates e):

maximal_partitions.GIF

Using x = N/e gives a partition with a constant chunk size of e.

This explains (at least intuitively) the reason for 3 being the densest integer.

Extra credit

  1. How many partitions does a positive integer N have?
  2. How big is the set {mul(P) | P is a partition of N}?

Note that the definitions used in this post were invented by me and are not universally known (i.e. do not be alarmed if someone does not understand the meaning of the term maximal partition!).

23 and 2000

Rating: 3
August 13th, 2007

easyriddle.gifYou are given 23 whole numbers, not necessarily distinct, in a row.

You cannot change the order of the numbers.

Prove that there exists an arrangement of the symbols ’+’, ‘×’, ‘(‘ and ‘)’ in-between the 23 numbers, such that the final result is a valid formula, whose evaluated value equals 0 mod 2000.

Extra Credit 

  1. Is 23 a tight bound? Can you find a sequence of 22 numbers such that all arrangements of the symbols ‘+’, ‘×’, ‘(‘ and ‘)’ in-between them will result in numbers that are different from 0 mod 2000? I haven’t thought about this one yet, so please post your ideas!
  2. Consider a more general case. Replace in the riddle above the number 23 by K and the number 2000 by N. Describe all the pairs, K, N, for which a solution to the riddle exists.

Thanks to Misha Seltzer, for sending me this cool riddle!

SIMD (Fire)Works!

Rating: 3.5
August 12th, 2007

When I started yaniv.leviathanonline.com, I declared it to be a site about “Mathematics and Hacking”. As of now, most of its content is Mathematical. I thought it was time for a little change…

Those of you with no interest in assembly or low-level programming are welcome to skip this post entirely, although expanding your horizons is never a bad idea!

A few years ago I gave a week-long x86 assembly course for some fellow programmers. As I wanted them to learn how to use the Pentium’s SIMD extensions, I asked them to program a simple fireworks demo and try to apply some filters to it using MMX. In this article I will detail my solution to the SIMD fireworks exercise.

The following paragraph is intended for those of you without assembly knowledge. All of you assembly masters are welcome to skip to the fun stuff below!

Some References 

Well, for starters I recommend reading the ”The Art of Assembly Language Programming” (AoA) which covers a lot of the basics. And yes – Assembly Programming isan Art (maybe more so than programming in other languages). Next, you should also read Intel’s manuals: Volume 1: Basic Architecture, Volume 2A: Instruction Set Reference, A-M, Volume 2B: Instruction Set Reference, N-Z, Volume 3A: System Programming Guide, Volume 3B: System Programming Guide and Architectures Optimization Reference Manual. Yes, I know its a lot of reading but they are really interesting and you don’t have to read the whole thing in order (although I recommend it – so many interesting facts lye in those pages). Finally, if you are going to develop in assembly (e.g. assemble the source included in this article) I recommend using NASM, as it is the simplest and cleanest assembler (although some people prefer MASM, which is kind of similar, but a bit more clumsy). If you are going to do some 16-bit DOS programming, you will find Ralf Brown’s Interrupt List (RBIL) a useful tool.

Throughout this article I will assume basic assembly knowledge (the first chapters of AoA will more than cover everything).

The Buzzwords

SIMD stands for Single Instruction Multiple Data.  It is clear that most algorithms require repeating the same operation over and over again (each time on different sets of data). From this observation stemmed the idea of a vector computer or SIMD. We will shortly see that indeed our fireworks program will repeat the same operations many times and so we will be using the Pentium’s SIMD capabilities for that purpose.

MMX was Intel’s first attempt at bringing SIMD capabilities to the Pentium. MMX does not stand for anything, but it has been offered several meanings since its appearance (such as Multimedia Extensions). MMX adds 8 new registers (mm0,…,mm7) to the CPU. These are 64-bit registers. The main advantage of these registers (compared to the x86 general purpose registers) is that the MMX registers can function as vector registers. This means that each of these registers can represent one 64-bit integer value, two 32-bit integer values, four 16-bit integer values or eight 8-bit integer values. For example, the command:

psubusb mm0, mm1

is essentially equivalent to 8 byte subtractions (it stands for Packed SUBtract Unsigned Saturated Byte).

Note that in order to be backward compatible with existing OS’s (mainly Windows) Intel aliased the MMX registers to the FPU registers. This means that when accessing one of them the contents of the other one are altered as well. This allowed the OS’s to keep using their old context switching mechanisms.

SSE stands for Streaming SIMD Extensions. This was Intel’s second attempt at bringing SIMD to the PC (in response to AMD’s 3DNow!). SSE added many instructions to the instruction set as well as 16 128-bit vector registers (xmm0,…,xmm15). Unlike the 8 MMX registers, the SSE registers can contain four 32-bit floating point values. As these registers are not aliased to any existing registers (and so require special care by the OS) they are disabled by default and are only turned on when the OS explicitly commands it.

Mode 13h

The easiest to program DOS graphics mode is probably mode 13h, which we shall therefore use. In mode 13h, the screen is represented as a 320*200 byte buffer (located in address 0xA000). Each byte in the buffer is an index to a 256 bytes long palette. When mode 13h is enabled, simply writing bytes memory in the range 0xA000-0x19A00 (0x19A0 = 0xA000 + 0xFA00, 0xFA00 = 320*200) will draw pixels on the screen. You should not draw directly to this buffer though, as when the video card is drawing the screen, it scans through this buffer, thereby stalling your write commands. What you should do, is allocate an off-screen buffer and render to this buffer and once in a frame spill the contents of this buffer as quickly as possible to the memory block starting at 0xA000. We will get to this later.

Entering mode 13h is very easy, and is done by calling BIOS function 0×10, as such:

mov ax, 0x13
int 0x10

Calling BIOS functions is done through raising an interrupt. The parameters are passed in the registers (for more information see RBIL).

The default video mode for DOS is mode 3h. It is the mode of choice for displaying text. So upon program exit we should restore the video mode to mode 3h, as such:

mov ax, 3
int 0x10

The Palette 

Now, in order to render properly colored fireworks, we need to have a decent color palette. In mode 13h the palette is set by outputting the palette index number to port 0x3c8 and then outputting 3 color values (for the red, green and blue components) to port 0x3c9. Note that if the entire palette is to be set in one go, only index 0 needs to be outputted to port 0x3c8, as it is incremented automatically after three writes to port 0x3c9. Each of the 3 color components is represented by a 7 bit value, it is a number in the range 0-63, where a value of 0 means that the component is not present at all in the color, and a value of 63 means that the component is present with full opacity in the color. So the color pure green will be represented by the tuple (0,63,0). The palette we shall use will consist of a simple linear interpolation from black to red to white. Index 0 will be total black, index 128 will be total red and index 255 will be total white.

A simple program that sets such a palette and then draws it to the screen can be found here (its source is here) Our fireworks program will use most of this code. Its output is:

palette.JPG

The Basic Fireworks Program

Enough with the introductions – lets get to business!

The design for the fireworks program is very simple. We will manage a buffer of particles, each having the following qualities: x, y, x_speed, y_speed and color (although this version uses a constant white color for all particles, which looks better). Each particle’s position will be updated according to its speed, which will in turn be updated according to gravitation.

The full source of the program can be found here. The program itself can be found here. It begins with some defines (you can play with the number of particles by changing the NUM_PARTS define). I then allocate a buffer big enough to contain a copy of the entire screen (we will draw to this buffer, so as to avoid the stalling inherent in writing to the video memory itself). We access the buffer though the es register.

We then enter mode 13h, set the palette and zero the buffer.

We then use a little very useful trick. We call the RDTSC instruction, which reads the CPU’s 64-bit time-stamp register (this time-stamp register is zeroed upon CPU initialization – i.e. when you turn on your computer – and is incremented after every instruction! What do you say, is there a chance of overflow?) into registers EDX:EAX. As NASM does not support the RDTSC instruction I coded it as:

db 0x0f, 0x31

This trick is very useful, as when writing C++ programs for the PC, it can be used to generate very accurate time-stamps very efficiently. The following code will do the trick in Visual Studio:


int RDTSC()
{
     __asm _emit 0x0f;
     __asm _emit 0x31;
}

Lets get back to the fireworks program. We use this trick to generate a random seed for the random number generator (RNG). This RNG will later be used to initialize the particles.

Next, is the init_particle function, which as its name implies, initializes a particle’s properties. In order to make the particles look like fireworks I initialize the position of bunches of particles to the same value (using the global_x and global_y variables). The init_particle function accepts a single argument (a pointer to a particle) passed through the bp register.

The MAIN function updates the particles (and calls init_particle when needed) and then draws the particles to our off-screen buffer.

After all of the particles are drawn, the MAIN function calls two functions which apply filters on our buffer, before displaying it on the screen (by copying our video buffer to address 0xA000).

The Filters 

These filters are very important (and are implemented using MMX!). There are three: mmx_shade, mmx_blur and mmx_blur_right.

The particles are drawn by the MAIN function with a solid white color (i.e. an index of 255). Instead of wiping the screen after each frame, we call the mmx_shade function which decreases the value of all the pixels on the screen by 1. Its code is very simple:


mmx_shade:
mov cx, 320*200/8
xor di, di
movq mm1, [sub_mask]
.lop:
 movq mm0, [es:di]
 psubusb mm0, mm1
 movq [es:di], mm0
 add di, 8
 loop .lop
ret

Where,

sub_mask: dd 0x01010101, 0x01010101

Note how simple this is! First of all we gain a big speed factor as we are processing 8 pixels in each iteration. Even more importantly we use the extremely convenient saturated subtraction instruction. Saturated subtraction means that if the result of the subtraction is negative, then instead of getting the value mod 256, we get 0!

Running the fireworks program with only the mmx_shade filter on, results in this:

shade_only.JPG

Not very impressive yet, but it does look like fireworks! (at least when animated…)

In order to make our program much nicer, we use two more blurring filters. The mmx_blur function averages the value of all pixels with that of their 4 immediate neighbours (giving twice more weight to the value already in the current pixel). Its code is:

mmx_blur:
mov cx, (320*200-330*2)/8
mov di, 328
.lop:
 movq mm0, [es:di]
 movq mm1, [es:di+1]
 movq mm2, [es:di-1]
 movq mm3, mm0
 movq mm4, [es:di-320]
 movq mm5, [es:di+320]
 pavgb mm0, mm1
 pavgb mm3, mm2
 pavgb mm4, mm5
 pavgb mm3, mm4
 pavgb mm0, mm3
 movq [es:di], mm0
 add di, 8
 loop .lop
ret

Notice that we skip the first and last lines (plus several extra bytes) in order to keep the function as simple as possible. Those of you with sharper eyes will observe a small issue with this function. In order to properly perform these averages, we need to use a third buffer. As we are scanning through our buffer, we use the results of previous blurs as inputs to the following ones, instead of using the original contents of the buffer. This is not very noticeable, and as it saves us the need to allocate a third buffer it is worth it.

Running the program with the mmx_blur filter applied, as well as the mmx_shade filter, results in this:

mmx_blur.JPG

Much nicer! Finally we have the mmx_blur_right function, which is a slight variation of mmx_blur, that gives the fireworks comet like appearance. Its code is self-explanatory:


mmx_blur_right:
mov cx, (320*200-330*2)/8
mov di, 328
.lop:
 movq mm0, [es:di]
 movq mm1, [es:di+1]
 movq mm2, [es:di+320]
 movq mm3, [es:di+321]
 pavgb mm0, mm1
 pavgb mm3, mm2
 pavgb mm0, mm3
 movq [es:di], mm0
 add di, 8
 loop .lop
ret

The MAIN program switches between the two blur functions once in a while. The result of applying the mmx_blur_right function is:

mmx_blur_right.JPG

Exercises to the Reader

  1. Change the three filter functions so that they use the 16-byte SSE registers, thus gaining a factor of 2 on these functions’ running time. This is very easy!
  2. Add another offscreen buffer and correctly implement the blurring.
  3. Make the program shorter/cleaner/nicer. Play with the parameters. Change the palette. Enjoy!

I hope that you now have a decent understanding of SIMD and how to use it when programming the Pentium!

PS – in case you want to assemble the source use the following command:

nasmw.exe fireworks.asm -o fireworks.com

You are in my Seat!

Rating: 3.5
July 25th, 2007

plane_seats.jpg There are 100 seats in an airplane.

There are 99 male passangers with reserved seats and one female passager that does not have a ticket.

The female passanger enters the plane first, selects a random seat (out of the total 100 seats) and sits in it.

Then the first man enters. If his seat is not taken he sits in it. If it is taken he selects a non-occupied random seat in the plane and sits there.

The rest of the men enters and does the same – each of them first tries to take his own seat (if it is available) and otherwise sits in a random non-occupied seat.

What is the probability that the last man to enter the plane gets his original seat?

easyriddle.gif

Blindfolded Flipping

Rating: 2.5
July 25th, 2007

blindfold_small.gif There are 100 coins on the table. 90 of the coins have their Heads face up, the other 10 have their Tails face up. Your task is to split the coins into two groups, such that both groups contain the same number of coins with their Heads face up. You are even allowed to flip the coins…

The catch is that you cannot see the coins – you are blindfolded!

What is your strategy?

easyriddle.gif

Spot The Not

Rating: 4
July 25th, 2007

This one is a riddle of my own invention. It gives a very good counter-example to something we tend to take for granted. Do not be discouraged – I posed it to one of my smartest professors and he did not find the answer! (at least not in the first five minutes…).

The riddle requires some knowledge of Topology and Real-Analysis.  For those of you lacking it, all the relevant definitions are included at the end (I recommend skimming through them before reading the riddle itself).

This seemingly trivial list of claims leads to a contradiction. Can you find the error?

  1. If A is an open subset of R, the R-A is a closed subset.
  2. If A is a subset of R, then bdy(A) = bdy(R-A).
  3. If T is the set of all irrational numbers in the segment [0,1], then u(A) = 1 (u is the Lebesgue measure).
  4. There exists a closed set B contained in the set T (T as in 3) such that u(B) > 0.9.
  5. If A is an open subset of R, then A is a countable union of open intervals.
  6. If I is an open interval then bdy(I) contains at most 2 points.
  7. From 5 and 6 it follows that if A is an open subset of R, then bdy(A) is countable.
  8. From 1, 2 and 7 it follows that if A is a closed subset of R then bdy(A) is countable.
  9. From 8 it follows that if A is a closed set then u(A) = u(A-bdy(A)).
  10. From 9 it follows that u(B-bdy(B)) > 0.9 (where B is as in 4).
  11. Let C=B-bdy(B). Then C is an open set contained in T with u(C) > 0.9. This is clearly impossible since the only open set contained in T is the empty set with Lebesgue measure of 0!

Can you “Spot the Not”?

Definitions Used by the Riddle

  1. Open set – a set is called open if each point of the set is an interior point of the set.
  2. Interior point - a point p is called an interior point of a subset A of R if there exists e>0 such that the interval (c-e,c+e) is contained in A.
  3. Closed set – a set is called closedif the limit of every converging sequence contained in the set is also in the set. It can  be shown that if A is an open set in R, then R-A is a closed set in R, and vise-versa. This is actually very easy - try to prove it!
  4. Measure of an interval – the measure of an interval I=(a,b) is denoted m(I) and is equal to b-a.
  5. Lebesgue outer measure – let A be a subset of R. The Lebesgue outer measure of A is defined as inf { Σm(Pi) | Pi are countably many open intervals and A is contained in union of the Pis }. Those of you unfamiliar with the definition of the Lebesgue measure, can replace the occurrences of Lebesgue measure with Lebesgue outer measure in the claims above.
  6. Boundary – the boundary of a subset A of R, denoted bdy(A) is the set { x | for all e>0, (x-e,x+e) ∩ A, contains both a point from A and from R-A }.

Good luck!

lebesgue.gif

Higher Dimensions

Rating: 4
July 22nd, 2007

cube_4d.GIFA couple of years ago I was very interested in visualizing objects of more than three dimensions. I reasoned that if I stared at such objects for long enough time, I was bound to get a good intuition on how they look.

But how can one see objects of more than three spatial dimensions?

Our goal in this post will be to render an object residing in a coordinate system of arbitrary dimensions on a 2D computer screen. We will try to do this in such a way as to lose as little information as possible on the structure of the object, while maintaining its 2D description as simple as possible (so that indeed, we will get some form of intuition on the shape of the object).

The 4D Cube

Before we begin with the actual rendering process, lets explore some properties of the 4D cube. Lets start with the obvious:

A 0D cube is simply a point.

A 1D cube is a line segment. It has 1 * 2 = 2 faces each consisting of a 0D cube.

A 2D cube is a square. It has 2 * 2 = 4 faces each consisting of a 1D cube.

A 3D cube is a cube (I hope that the repeated use of the word cube for two slightly different meanings does not confuse you). It has 3 * 2 = 6 faces each consisting of a 2D cube.

Finally a 4D cube has 4 * 2 = 8 faces, each consisting of a 3D cube.

Here is a list of some interesting figures about the 4D cube. Can you figure out why they are correct?

These values can also be generated by my program, which is described below.

4D cube:

Number of edges:  32
Number of facets:  8
Facets per edge:  3
Edges per facet:  12

3D cube (for comparison): 

Number of edges:  12
Number of facets:  6
Facets per edge:  2
Edges per facet:  4

Dimensionality Reductions

I will now present several methods for encoding an N-dimensional object in (N-1)-dimensional space. Call such an encoding a 1-reduction. It will then be possible to encode any object on a 2D screen, by repeated applications of 1-reductions.

Lets start by considering a very simple special case. Lets say that our N-D (N-dimensional) object is actually an (N-1)-D object residing in N-D space. An example of this is an object with a constant coordinate. More complex examples exist (the object may not have any constant coordinate, but after applying some rotations, one of the coordinates may become fixed).

A basic 1-reduction can then consist of simply dropping the extra coordinate.

This may seem trivial, but I have a point here – when considering an inanimate object one of the coordinates is always fixed – that of the time!

Our first approach will then be trading a spatial dimension for a time dimension. I.e. encoding the Nth coordinate of an inanimate object by some form of animation.

A simple such encoding consists of slicing the N-D object. Consider the 3D case. If we denote the coordinate system of the object by x, y and z and the time coordinate by t, then we can select only points with a constant z coordinate equal to t. We would then get an animated 2D object representing our 3D object.

A key feature of this 1-reduction method is that no structural information about the object is lost by this encoding. Lets define this formally:

Definition – A 1-reduction of an object is called lossless if it can be reversed uniquely.

Using this definition, we can say that the method of animated-slices proposed above is lossless. Note that animated-slices is obviously not the only lossless encoding (can you think of others? 8-) ).

We will now restrict our discussion to smooth and bounded objects.

Definition – an object is called smooth if it can be approximated well by discrete boxes.

Definition – an object is called bounded if there exists a cube of finite sides that contains the object.

Cubes, spheres and rings are examples of smooth and bounded objects.

The smoothness of an object allows us to encode it using discrete samples. For example, when we display an animated slice of a 3D object on a computer screen, the animation consists of discrete frames each consisting of discrete pixels. A smooth object will be represented faithfully with such a discrete representation. More formally, lets define:

Definition – A 1-reduction is called casi-lossless if it can be reversed, and the difference between the inversion result and the original object is small.

So, encoding a smooth object as a sequence of frames is casi-lossless.

The advantage of considering only smooth and bounded objects is that we can encode two spatial coordinates on the same axis.

Lets start by demonstrating this with the simple case of a 3D cube (we will get to 4D cubes shortly!). This encoding is depicted here:

slice_illustration.GIF

The image shows the 2D slices of the 3D cube (rotated by 45 degrees around two of the axis).

Note that the encoding is indeed casi-lossless: the entire structure of the cube can be reconstructed uniquely from the slices (again, neglecting small differences due to the fact samples are discrete).

Note that employing this method on a 4D object, instead of on a 3D one, results in 3D slices instead of the 2D ones. We can then apply another casi-lossless 1-reduction to the 3D slices, and get a casi-lossless 2-reduction from four dimensions to two dimensions (I leave to the reader to formulate the definition of a casi-lossless 2-reduction).

Projection 

The method above, generalized slightly, is the most important 1-reduction method - the Projection. Projecting an object consists of simply ignoring one of its coordinates. It is actually simpler than what we discussed above – instead of converting one dimension to another (i.e. z-coordinate to t-coordinate) we simply ignore one of the coordinates.

This method by itself is obviously not lossless (it is not even casi-lossless). This means that there are many different N-D scenes that result in the same (N-1)-D projection. The method does work on a much larger scale of objects though, and this is the method we shall employ.

A projection of the rotated 3D cube is depicted here:

slice_illustration_proj.GIF

Note that unlike the slices version above, there is no way to tell whether this cube is rotated around 2 of the axes or just one (actually – we cannot even know whether it is a cube!).

After projecting an object we can try to rescue some of the information lost by encoding it in the color channel.

In the picture below I encoded the same 3D cube as above, but this time the color of the object represents the depth at the pixel. The brighter the color, the bigger the intersection of the cube with the segment perpendicular to the slice plane:

slice_depth2.GIF

Using the color channel thus does not make the encoding lossless. There is no way to tell whether the shape above is that of a cube or of a pyramid. It does considerably reduce the set of possible source-objects for the encoding – it cannot be a cube rotated about only a single axis (as may be the case for the colorless version).

The main advantage of the projection method is that our brain is used to reversing projections. When you close one of your eyes, your brain recieves a completely flat two-dimensional image. Even with both eyes open, you do not see a true three-dimensional image. You merely see two 2D images. Two projections of the 3D space in front of you!

Your brain does a great job at guessing the real 3D structure of objects by their 2D projections (it tries to calculate the most probable inverse of the projection). As we noted before, the projection method is not lossless, and thus your brain can be fooled. These are a couple of examples of the works of Felice Varini (see http://www.varini.org/) demonstrating mistakes made by the brain at reconstructing the true 3D structure of the scene from a 2D projection:

3d_room_04.jpg

3d_room_05.jpg

And also:

3d_room_10.jpg

3d_room_11.jpg

The photos show that there can easily be two completely different 3D images resulting in the same 2D projection. That said, in most usual scenarios your brain correctly guesses the 3D structure of things based on insufficient 2D inputs. It is important to say that your brain uses a lot of apriori information when reconstructing such scenes. In the absence of such apriori information (as is the case with our 4D cube) our brain’s work is indeed much harder.

Ok, so now that we understand some of the properties of 4D cubes and how projections work, lets get to the cool stuff – render some 4D images!

The Program

If you are not interested in an implementation of all of the above, you are welcome to skip this section.

I implemented the projection reduction method with a few lines of python. My program displays an arbitrary shape (box, ring, spiral) of arbitrary dimensions in a universe of arbitrary dimensions (of course dim(universe) >= dim(object)!).

The objects are rotated in real time. A color encoding of the high coordinates is also used.

At this point I should add that I wrote this program quite a while ago and for internal use only (i.e. I never thought someone else would look at it – I am actually quite surprised that I even found it and that is still works! [as unattended code tends to rot]). I did spend a few minutes in order to enable you to run the program from the command line (instead of from a python shell), but in order to access all of the options a shell is still required. Also, the code was not at all tested, and running it with the wrong arguments can cause it to throw some exceptions. It is very short though, and all of you with basic python knowledge are welcome to browse it. 

A very important part of the code consists of the create_rect and gen_line_pairs functions, defined as follows:

def create_rect(dim = 4):
    if dim == 1:
        return [[-1],[1]]
    else:
        res = []
        r1 = create_rect(dim-1)
        for i in r1:
            res += [i + [-1]]
            res += [i + [1]]
        return res

def gen_line_pairs(rect,dist=1):
    res = []
    for i in range(len(rect)):
        for j in range(i):
            dif = count_dif(rect[i],rect[j])
            if dif <= dist:
                res.append((i,j))
    return res

Can you figure out how they work? :-)

Note that in order to run the program, you will need python, as well as the pygame package (which I use for drawing and managing user input). You can try running the program with the following arguments:

>4d.py 4 rect 4d auto normalize

Some keys you can try:

  • Up, down, left, right, home, pageup, return, space, end, pagedown - manually rotating the object (around the four first axis). This only works when automatic rotation is off.
  • r – toggles automatic rotation mode.
  • 4 – toggles the high-dimensions mode of the automatic rotation. When the automatic rotation is enabled, the object will be rotated around random axis. When the high-dimensions mode is turned off, the rotation will only take place around the first 3 coordinates. This mode allows appreciating a 3D slice of a high dimension object.
  • c – change the coordinate that is used for the color-encoding.

The complete source of the 4D renderer can be found here.

Results 

To give all of you without access to python a taste of how some basic primitives do look like in higher dimensions I attached these images. I must say that viewing the objects in motion is much more insightful. I really recommend running the program!

This image shows a regular 3D cube, residing in a 4D universe (after it was rotated a little about all of its axes):

rect_4_reside.GIF

Notice that this is indeed a cube (all its sides are equal!) the reason it appears to be a rectangular box is because the rotation about the 4th axis deforms it. A simpler to grasp, but similar phenomenon occurs while residing a 2 dimensional square inside a 3D universe. After some rotations it may look like this:

facet.GIF

This image shows a 4 dimensional cube:

cube_4d.GIF

This is perhaps the most interesting object (maybe I will add some more pictures of it).

The same cube, this time viewed from a different angle and having its faces interconnected:

rect_full_4d.GIF

A 5D cube:

cude_5d.GIF

Two rings in a 4D universe:

rings_4d.GIF

And finally, a very simple 10D spiral in a 10D universe (notice that all the lines are perpendicular!):

spiral10.GIF

Again, these images are much more intuitive when the complete animation is viewed, so download the program and run it!

Enjoy!

coke-balancing.gif

hold-the-sun.gif

What are the Odds?

Rating: 3.5
July 20th, 2007

coin.gifThis is the first riddle I post in my site that I haven’t yet solved.

Because of this, I give it the “very hard” icon. It may turn out to be easy – do not be intimidated!

veryhard.gif

You are given a natural number N and two coins. You can set the probabilities of getting heads or tails on both coins as you wish. You are then asked to generate a random number uniformly distributed on the set { 1, 2, …, N }. The catch is that you must have a finite upper bound on the number of coin tosses you use.

Lets clarify the requirements of the riddle by a simple example. Say N=6. Setting the probabilities for the coins to 2/3 and 1/2 will do the trick, as we can generate a number uniformly distributed on the set { 1, 2, 3, 4, 5, 6 } with a maximum of 3 tosses of the two coins (can you see how?).

Good luck! 

I’m Back (and so is Phrack)!

July 19th, 2007

blond_small.jpg Hi everyone!

As you may have noticed, yaniv.leviathanonline.com has not been updated for a while (exactly one month, pour être précis). This was largely due to my intense exam period.

First, I want to thank you for all the interesting ideas you sent me. This extended period of not updating my site resulted in an extremely long idea-list – so stay tuned!

Now, apart from me, phrack-logo.jpg Phrack, which was also on a long vacation, is now back.

I must admit that the new issue #64 does not appear to contain anything ground-breaking, but it deserves a quick read (if only for times past).

For those of you that don’t know Phrack, I can only recommend that you browse its outstanding articles’ archive since the very first issue on November 17, 1985 (I was exactly 3 years and one day old then…).

Should you not be interested in hacking or general anarchy, and thus choose not to read all of the issues, at least be sure not to miss the famous Phrack #49′s Smashing The Stack For Fun And Profit. To get an idea of the Phrack spirit, I also recommend reading the Hacker’s Manifesto and Screwing over your local McDonald’s.

See you soon!

Out of the Norm

Rating: 4
June 19th, 2007

vivaelnorm.jpgI am taking a course about Hilbert Spaces this semester. A very basic notion in a Hilbert Space is that of the norm. For a while now, I kept several questions regarding norms in the back of my mind, and as I finally got to think about them, I wanted to share my conclusions with you.

Introductory Concepts 

Lets start from the beginning (those of you familiar with Normed Spaces can skip this section). A linear space is a set of objects with two operations:

  1. Addition (marked by the symbol +) which acts on two elements of the space and returns an element of the space.
  2. Multiplication (marked by the symbol *) which acts on an element of the space and a real number and returns an element of the space.

The operations must follow these rules (x and y are elements of a linear space, a and b are real numbers):

  1. x + y = y + x                              (+ commutativity)
  2. x + (y + z) = (x + y) + z           (+ associativity)
  3. Existence of a zero element (marked by the symbol 0) such that for every element of the space: 0+x = x+0 = x
  4. For every element x of the space, existence of a negative element (marked by -x) such that: -x + x = 0
  5. (a*b)*x = a*(b*x)                    (* associativity)
  6. For every element x of the space, 1*x = x
  7. a*(x+y) = a*x + a*y                (distributive law 1)
  8. (a+b)*x = a*x + b*x                (distributive law 2)

A simple example of a linear space is the Euclidean space of n dimensions (for any n >= 1).

Note that we define the minus symbol, -, as follows:

x – y = x + (-1 * y)

Now, a norm is essentially a function that somehow tells us how far away an element of our space is from the neutral element (0).

We write ||x|| for the norm of x.

The norm can also tell us how far apart two elements of the space are. We say that the norm generates a distance. This is done as follows:

dist(x, y) = ||x – y||

Formally, a norm is a function from the linear space to R+ (the set of non-negative real numbers), such that (x and y are elements of the linear space, a is a real number):

  1. ||a*x|| = |a| * ||x||                             (homogeneity)
  2. ||x+y|| <= ||x|| + ||y||                      (triangle inequality)
  3. ||x|| >= 0 and ||x|| = 0 i.f.f. x = 0

Examples 

Now lets consider the 2-dimensional euclidean plane as our linear space (call it R2). An example of a norm in R2 is the euclidean norm (where u=(x,y) is a member of R2):

||u|| = sqrt(x^2 + y^2)

We shall call this norm, the l2 norm.

For those of you not familiar with the concept of the norm, verifying that this is indeed a norm may prove insightful.

The l2 norm is very useful as it measures actual distances in a plane.

Now lets consider another norm, called the l1 norm:

||u|| = |x| + |y|

Again, it can be easily verified that this is indeed a norm. This norm obviously generates a different distance than that of the plane. Does it have a “real-world” use? You are encouraged to take a minute of reflection upon this before reading on…

 Well, yes, it does have a use. Lets say that we are standing in Midtown-Manhattan.

manhattan_midtown.gif

What is the distance between us? The aerial distance is the regular euclidean distance (neglecting the roundness of the Earth) but the walking distance is the distance generated by the l1 norm! This is caused by the fact that we can only walk along the streets (i.e. parallel to the axes!). For this reason the distance generated by the l1 norm is sometimes referred to as the Manhattan-Distance.

Now, lets consider a third norm called l∞, defined as:

||u|| = max(|x|,|y|)

Again, this is obviously a norm (check it!).

Now lets start with the fun stuff…

First, can you figure out the meaning of the names of the norms (l1, l2, l∞)?

For all p >= 1, lets define the function fp:R2->R+ as:

fp(u) = (x^p + y^p)^(1/p)

Then fp is a norm on R2 and we call it lp (I feel like I am repeating myself, but again, this can be easily checked).

The names of l1 and l2 become obvious right away, and so does the name of l∞ (after a little thought :-) ).

If you still can’t figure it out - l∞(x) is the point-wise limit of lp(x) as p->∞ for all the elements x of R2. Actually, the functions lp converge to l∞ locally-uniformly (can you prove this?).

This brings up an interesting question: Do norms other than l1 and l2 have any “real-world” meaning? If you have any insights on this, please comment!

Unit Circles

The unit circle of a normed space is the subset of elements of the space such that their norm is equal to 1.

Can you figure out how l1, l2 and l∞ unit circles look like in R2?

As I found it difficult to visualize the unit circles for norms lp, p>2, I jotted down a few lines of C in order to create the following image:

norms.JPG

The red, green, light-blue and blue shapes denote the unit circles for the l1, l2, l3, and l∞ norms respectively. The other colors denote the unit circles for lp norms for various values of p.

In the picture I included values of p that are not whole numbers. This should not pose any problems.

I also included values of p that are less than 1 (the shapes inside the l1 circle). These are there for illustration purposes only, as they do not constitute norms!

Note: the glowing effect was generated by my program to overcome jagged edges caused by precision issues.

Something else to think about is the volume of the unit circles. The volume of the unit circle under the l1 norm is 2, under the l2 norm Π and under the l∞ norm, 4.

Can anyone supply some more values?

That’s it for now!

I might update this article, so check the updates section!

Ants Revamped

Rating: 3.5
June 14th, 2007

ant-drawing2.jpgPlease make sure you have read (and solved) the original Ants riddle before reading on!

 easyriddle.gif

Here are two sequel questions to the original Ants riddle:

  1. Which is the last ant to fall off the stick?
  2. What is the maximum number of ant collisions possible?

I want to thank Danny Valevsky for bringing these to my attention!

Divide and Conquer

Rating: 2.5
June 9th, 2007

divide_and_conquer.gif Yesterday, Zohar Gilboa, with whom I take a PDE course, asked me the following riddle. I must admit it is probably the riddle I like the least on my site thus far. It is very easy though, and as Zohar told me he really liked it himself, I decided to publish it anyways.

Consider the following image:

divide_and_conquer.gif

Your goals are:

  1. Divide shape 1 (bluish) to two equal parts.
  2. Divide shape 2 (greenish) to three equal parts.
  3. Divide shape 3 (yellowish) to four equal parts.
  4. Divide shape 4 (purplish) to five equal parts.

All the division parts mentioned should be continuous shapes.

The second page contains the solution! Please only visit it after giving the riddle some thought…

Secure Computing – Part I

Rating: 4
June 7th, 2007

Wheres Waldo?This article is the first among a series of articles describing the subject of Secure Computing. The material was mainly taken from a Cryptography course by Amos Fiat.

Say we have a function F(x1,x2,…,xn)->(y1,y2,…,yn) that takes n inputs and has n outputs. Me and n-1 of my friends (i.e. we are n people in total) have secret numbers. Lets number the people p1,…,pn and the secret numbers x1,…,xn (i.e. person pi knows secret number xi). We want to perform a process after which person pi will know yi, where yi is the i’th output of F computed on the xi’s, but he must not know anything else. Let me give you an example:

Say that there are 3 people, each having a monthly salary of xi dollars (for i=1,2,3). They want person number 1 to know the sum of the salaries, person 2 to know the product of the salaries and person 3 to know the maximum of the salaries. But they do not want to reveal their salaries to each other!

A simple solution would be to find a new and neutral person that all of the men trust. They would then reveal to him all of their salaries, one by one, and he will perform the calculation and tell them the results. The goal of this series of articles is to solve this problem without using such a trusted party.

Before we begin to think of a solution, lets notice some inherent problems:

  • The men can lie about their salaries. Since it is assumed that only person pi knows xi, he can lie about it (i.e. input another number, say xi’ instead of xi). There is obviously nothing that can be done about it. We will therefore assume all the men (while they may try to cheat at other parts of the protocol) will input their true numbers to it.
  • The function itself may leak back information about the secret numbers of the other people. For example, if there are only 2 people, and y1 is the sum of the xi’s and y2 is their product, then after the process each of the men will know the secret number of the other (as the function is reversible). This is simply not considered a problem, as our goal was to do without the trusted party, and if for some reason the function is reversible (or semi-reversible) then the same problem will arise with the trusted party protocol, which is what we are trying to emulate.

The solution will consist of several parts (which will be dealt with in the other parts of this topic):

  1. Performing a Zero Knowledge Proof (ZKP) of a generic NPC problem. 
  2. The “Oblivious Transfer” protocol.
  3. Transforming a general function f(x1,…,xn) to a circuit consisting only of XOR gates and AND gates.
  4. Solving our problem for the specific cases where our function is a XOR gate or an AND gate.

So now, after we have set our goals, lets dive in to the the first step:

Zero Knowledge Proof of an NPC Problem

At this point, I rather not formally define the exact meaning of a ZKP. I will instead try to explain is intuitively. Say I know the answer to a complicated puzzle and you don’t. The process of me convincing you that I know the answer, without you learning anything about it, is called a Zero Knowledge Proof.

Let me give you this neat example (taken from Applied Kid Cryptography). Say we are playing the game of Where’s Waldo? with this picture:

waldo.jpg

After minutes of contemplation, I suddenly find him! The following dialog then takes place between us:

Me: “Yes! I found him! Wow, he was really well hidden.”

You: “I don’t believe you! Its too hard! You must be lying!”

Me: “No, seriously. He’s right there, next to the…”

You: “Go on – Where is he?!”

Me: “Well, I do not want to ruin this for you… This is such a cool puzzle…”

You: “I knew it! You are lying – you haven’t found him.”

The situation becomes frustrating for me! How can I prove to you that I know where Waldo is without revealing his location?

I can give you a general fact about his location (for example, that he is near the center of the picture and not near the borders) and later when you will find him too, you will be able to verify that I was indeed right.

There are obvious drawbacks with this method:

  • While not revealing exactly where Waldo is, a lot of information is still leaked (you now know not to look for him in the sides of the image).
  • I can lie to you with a good probability – even if I haven’t found him my general statement may be right.
  • And finally, the biggest drawback is that you may still not be convinced (i.e. we have to wait for you to find him as well).

Suddenly it hits me! I know what to do! I take a big white piece of paper (much bigger than the original image, say twice as big in every direction). I then place the original image at a random location beneath it and cutout Waldo’s location from the paper. I then place another big piece of paper on top of everything (this time, without the cutout):

waldo_ilust.gif

In my illustration the covering papers have the same size as the original image, but they should  be much bigger in reality.

You then choose a number: 1 or 2. If you choose 1, I remove both pieces of paper, revealing to you the entire image. If you choose 2, I remove only the top piece of paper, revealing to you that I know where Waldo is, as such:

waldo_cover.gif (click thumbnail for big image)

What does this accomplish? Well, first of all, if indeed I know where Waldo is I can answer both of your questions. If, on the other hand, I was lying, and I put the original image under the papers, then I could not have cut out Waldo’s location. In order to avoid this, I could have replaced the underlying image all together (as you do not see it at all!) with another image, in which I know where Waldo is. In this case, should you ask me to show you where Waldo is, I will be able to do so, but I will get busted if you selected 1, and made me reveal to you the entire image.

What we got is this:

  • If I know where Waldo is, I can deliver a correct answer that you can verify 100% of the time.
  • If I am lying and I do not know where Waldo is, you will bust me (no matter what I’ll do) with a probability of at least 1/2.

Well, being busted with a probability of only 1/2 is not very bad, but notice that we can repeat the process any number of times (each time placing the white papers at different locations over the original image). My chances of being correct are independent in each of the times, and so I will get busted with a probability (1/2)^n (where n is the number of iterations). If n is chosen high enough, you can be quite certain that I am not lying (10 iterations suffice for the probability of missing a liar to be less the 1/1000).

Note that the process is not really “Zero Knowledge” as by looking at the cutout version of Waldo you get an idea of his immediate surroundings and of his own posture (indeed, my cutout image above helps a little in finding him).

In the next installment of the series I will further discuss the notion of ZKP, especially in relation to solutions of NPC problems.

To be continued…

Understanding Soccer

Rating: 5
June 3rd, 2007

soccer_ball.gifIn 1998 (I can’t believe it was so long ago!) I was working at MATE (Media Access Technologies), an Israeli start-up at the time, specialising in face recognition and video searching. It was then that I came to know one of the coolest tricks in image recognition - the Hough Transform (pronounced “huf transform“).

In this article, I will demonstrate the algorithm with the following example:

Say you want to write a program that watches soccer. The input to your program is a sequence of frames from the game. The output should be the locations of all the players at all times. Given a frame of soccer, you first need to locate the players in the frame (i.e. you should know that a certain player occupies a certain set of pixels). Next, you need to know what location in the world is represented by those pixels. In order to do that, you need to figure out where the camera is looking at. In this article I will focus on this part of the problem (understanding the view point and direction). I will not deal with the (easier) part of determining what pixels are occupied by players.

Consider the following image of Ronaldinho:

nike-ronaldinho.jpg

I manually transformed it to this:

nike-ronaldinho_small.jpg

(the portion of the image covered by grass). I think it is clear how this step can be automatized.

Next I applied an automatic algorithm that got me this:

bw.gif

(the same image converted to black and white, for all the pixels that were bright enough). My algorithm simply turned all the pixels that are brighter than the color (80,80,20) (Hough Thresh 1) on a per channel basis to white, and all the other pixels to black. Note that my threshold color requires a high value of the red component, which is not present in the green grass, in order to turn the grass itself to black.

Lets return to our original problem – determining the point and direction of view (POV). We have big clues in the picture for determining the POV, namely the lines on the ground. Note that from the lines on the ground in the black and white picture, it is easy to see that Ronaldinho is standing in the center of the field. Can we somehow detect them?

Lets formulate the following abstact problem:

You are given a matrix M over Z2 (i.e. a matrix of bits). The matrix represents an image. Pixel Pij of the image is white if Mij is 0 and black if Mijis 1. The image contains straight lines. The lines are not perfect (i.e. they may overlap, they may not be continuous and there is a lot of noise). Your goal is to develop an efficient algorithm that receives the matrix and outputs the line equations.

Please take some time to think about this before continuing to read. If you spend enough time thinking about it, you will probably reach the conclusion that this is a really though problem.

A solution to this problem is the Hough Transform. The Hough Transform consists of noticing a duality of the Euclidean plane. We Consider a the straight line equation:

y = m * x + b

In this equation m and b are constants and x and y are variables. But what happens if we make x and y constants and m and b variables?

Well, the set of m’s and b’s that satisfy the equation represents all the lines that pass through our point (x, y)!

It is easy to see that for each point in the x-y plane we get a line in the b-m plane, namely, the line:

b = -x * m + y

Now each line in the x-y plane is matched with a point in the b-m plane! The duality depicted:

Duality

So the algorithm basically says:

  • Run through the original matrix.
  • For each pixel (x,y) that is white draw the line that matches it in the dual plane.
  • Then count the number of line intersections in the dual plane.
  • All points at which there is an intersection of more than T lines (these correspond to lines in the x-y plane on which we have more than T points in our original image) are considered valid lines.

To better illustrate the algorithm, I jotted down a few lines of python. The results follow (I must say that I was really surprised that it worked perfectly the first time around! All the threshold values I picked were somehow correct on the first go ;-) ).

Note that there is an implementation problem with using the b-m plane - Precision issues. For example, a near vertical line has huge values of m. In order to solve this, I used another duality (which is completely equivalent): the duality of the x-y plane and the a-c plane, where a is the angle of the line with the x axis (i.e. m = tan(a)), and c is b*cos(a).

The Hough map (i.e. the result of the Hough Transform) my script produced is:

hough_org.gif

Since it is kind of hard to distinguish the different color values, I generated a more human-friendly version (all the values are multiplied by 15):

hough.gif

And finally, the result (after converting back to the x-y plane):

res.gif

And the same result overlayed on the original image:

merged.gif

A perfect match! 

This is a big subject, and I only covered a small part of it. I will update this article in the near future with a more detailed analysis. If you have specific questions please post them as comments and I promise I will answer!

Meanwhile, let me just point out that the same method can be used to detect other shapes (e.g. circles). The generalization is trivial.

I will finish by showing the results of running the script on another image. Note that in this case, the threshold values are a little off. The results are nevertheless impressive:

a_small.jpg

The black and white image generated was:

a_bw.gif

The Hough map:

a_hough.GIF

The result:

a_res.GIF

And, finally, the overlayed result:

a_merged.GIF

To be updated…

Two Envelopes

Rating: 4
June 2nd, 2007

EnvelopesYou write down 2 numbers on 2 pieces of paper (one number on each piece). You put each paper in a sealed envelope. I choose one of the envelopes randomly and open it. I then carry out a certain procedure at the end of which I know with a probability greater than 1/2 whether I received the larger or the smaller of the numbers. I can do this even if when you initially wrote down the numbers, you knew my decision procedure!

How can I do that? What is my trick?

Guess my Number

Rating: 4
May 26th, 2007

easyriddle.gifWell, it is time for a cool and easy riddle. 100 men are each assigned a number between 1-100 with repetitions (e.g. all of them may be assigned the number 17). Each of the men sees all the numbers assigned to all the other 99 men, but none of them sees the number assigned to himself.

Each of them needs to guess his own number (of course no information is exchanged between them – no one hears what others have guessed etc.). What strategy can they employ in order to make sure at least one of them makes a correct guess?

Some followup thoughts:

  • Can they make sure more than one person succeeds?
  • How many different solutions to the riddle are there (i.e. how many strategies can the men employ?).

Whole Rectangles

Rating: 3.5
May 25th, 2007

veryhard.gifA short introduction to Graph Theory is needed for this one. If you already are familiar with Graph Theoretic constructs feel free to skip it.

A graph G, is a pair (V,E) where V is a set of vertices and E is a set of edges. Each edge is an unordered pair of the form (u, v) where u and v are vertices (i.e. they belong to V). The degree of a vertex t (denoted deg(t)) is the number of edges containing it:

deg(t) = #{ e | e = (s, t) ^ s belongs to V }

In this post I only consider finite and simple graphs. A finite graph is a graph whose vertex-set V is a finite set. A simple graph is a graph in which there are no loops (i.e. edges of the form (u, u)) and no multiple edges (i.e. E is a proper set so it cannot contain the same element twice).

The following is a trivial claim:

The sum of the degrees in a graph equals twice the number of edges.

It is trivial to prove this claim by induction on the number of edges (on a graph with no edges it is clear, and by adding an edge to the edge set of the graph the sum of degrees increases by two).

Use this claim to solve the following riddle:

A rectangle is called whole if at least one of its sides is an integer. For example, a rectangle of 2 by 3/5 is whole as well as a rectangle of sqrt(5) by 3. A rectangle of 1/2 by 1/2 is not whole. Examples:

Rectangle Exmaple

A set T of rectangles constitues a tiling of a rectangle R if the rectangles in T are disjoint and for every point p in R there is a rectangle S in T such that p belongs to S. Example:

Tiling Example

Prove that if R is a rectangle and T is a tiling of R consisting only of whole rectangles (i.e. every rectangle S in T is whole) then R is whole.

Smart Disagreement

Rating: 3.5
May 23rd, 2007

DiceThis one is a riddle of my own invention. I am really proud of it as it sounds impossible at first, but it can be resolved in a very satisfying fashion. I must admit that had I heard about it from someone else I would not have gone through the trouble of trying to solve it, as it sounds like a ”cheap-trick” riddle. I can only ask you to trust me that it is indeed a purely mathematical riddle. Although it did provoke several harsh disputes with some very smart people (Matan, Eyal, how are you guys?), I was able to finally convince them of the mathematical validity of my solution.

The riddle goes like this:

A group of very smart men perform an experiment. There are no restrictions on the men’s behaviour during the experiment. After the experiment is over, each of the men reveals to all the others everything that he knows. Despite the fact that none of the men lies and they all know of everything that happened, there are 2 individuals in the group that reach different scientific conclusions. Your task is to devise such an experiment.

Note that the experiment is scientific in that it does not concern individual tastes or opinions. All the men are assumed to come from similar backgrounds and have similar likings. The different conclusions regard real aspects of nature.

The dice picture is a (vague) reference to one variant of my solution.

The Mighty Jungle

Rating: 2
May 23rd, 2007

I visited my parents yesterday, for the Jewish holiday of Shavuot. I found there a picture of myself and some friends from the time we sailed on the Tuichi river in Bolivia on a raft we built ourselves with machetes. Here is the picture and a legend:

Legend

Tuichi Raft

Note that the picture is very dark, as it was taken in the night time.

I have to admit that looking at the picture, it seems that we are escaping from a prison. To make this post more in the spirit of my site I could have added one of the many prison riddles here, but I must say I am happy with this post just the way it is (maybe I will update it with a prison-riddle in the future).

This other cool picture I stumbled upon in my parents house, I call “What’s for Dinner?”:

Baby

I took it in the market of the city of Cusco, Peru.

Happy holidays!

Constructions with a Straight-Edge and a Compass – Part II

Rating: 2.5
May 21st, 2007

Make sure you have read the first part of this post here.

Now we are ready to define the rules for our second, more complex game – “Construction with a Straight-Edge and a Compass“. These are very similar to the rules of “Constructions with a Straight-Edge Only” (defined in the previous post) except that now, in addition to creating infinite straight lines, you can also create circles, as follows: given 2 points, a and b, you can create the circle with center at a that passes through b. A point is added to the set of points if it is the intersection of 2 lines (as before), of two circles or of a circle and a line.

Now, the addition of the new tool (the compass) enables us to construct many more points than before. If the initial set of points contains only one point then, as before, no new points can be constructed. But if the initial set of points contains two points then a the set of constructable points is infinite (we will shortly prove it). The subject for this second part of the topic will be analysing the set of constructable points in our new game.

Note that it is enough to examine the constructable-points’ set, P, from the initial set A = { (0,0), (1,0) }. This is because if the initial set of points contains the two points a and  b(instead of (0,0) and (1,0)), then there is an affine operator M acting on the euclidean-plane and consisting of translations and rotations, such that a becomes the origin and bbecomes (1,0). Denote the inverse of M by N. Then the set of points constructable from the initial set { a, b } is the same as the set { N(x) | x belongs to P }.

So our question is, “what points belong to P, the set of constructable points, from the initial set A = { (0,0), (1,0) }?”

In order to answer this question we shall develop some tools:

Mid-Point Perpendicular

Given two points, a and b, we can construct the line passing through their mid-point that is perpendicular to the line passing through a and b. This is done as depicted here (the yellow dot is the constructed mid-point):

Mid Point Perpendicular Construction

Point Reflection

Given a point a and a point b, we can create the point c (that is different from a) lying on the line through a and b and whose distance from b is the same as the distance from b to a. This can be done by using the straight-edge to create the line through a and b and then use the compass with center at b and leg at a. cis the intersection of this circle and line:

Point Reflection Construction

Creation of the Axis

The x-axis is easily constructable from the two initial points by using the straight-edge. The point (-1, 0) is constructable from the initial points by using Point Reflection. The y-axis is then constructable from (-1, 0) and (1, 0) by using Mid-Point Perpendicular.

Axis Flip

Given a point on the x-axis of the form (a, 0), create the point (0, a) on the y-axis, and vise versa. Given the Creation of the Axis, as above, performing an Axis Flip is very easy.

Coordinate Selection

Given a point (x, y) create the point (x, 0). By placing the compass’ center on (x, y) and its leg on the origin, its other intersection with the x-axis is the point (2x, 0). Using the Mid-Point Perpendicular we can construct (x, 0).

Now we shall note that the notion of constructable points in our new game can be changed a bit to the notion of constructable numbers. A number x is constructable if the point (x, 0) is constructable. Note that if a point (x, y) is constructable than both x and y are constructable (by using Axis-Flip). If on the other hand x and y are constructable numbers then again by using Axis Flip we get that (x, 0) and (0, y) are constructable points. By using a trick similar to what we did for creating the y-axis the point (x, y) is shown to be constructable. So we shall now only consider the simpler notion of constructable numbers instead of constructable points.

We have shown a relation between constructable points and constructable numbers. We can also consider the notion of constructable distances. A number d is a constructable distance, if it is the distance between to constructable points. It can be shown (I might do this in the future) that a number d is a constructable number i.f.f. (if and only if) d is a constructable distance. One direction is easy. Can you prove the other?

Lets continue to build our tools, this time for constructable numbers.

The Integers

All the integers are constructable numbers. 0 and 1 are constructable (as (0,0) and (1, 0) are in the initial set). Given the integers i and and i-1 integer i+1 can be constructed. Just place the compass’ center on the point (0, i) and its leg on (0, i-1) and the circle’s intersection with the x-axis yields i+1. Integer i-2 can be constructed in a similar fashion. By induction all integers (positive and negative) are constructable.

Negative

If x is constructable then -x is constructable. This is obvious (can you see why?).

Addition and Subtraction

Given constructable numbers a and b, a+b is a constructable number. We can construct -b, and so a+b is a constructable distance. Using the equivalence stated above (which I have not yet proved) between constructable distances and constructable numbers, a+b is constructable. Addition and Negative lead to Subtraction.

Multiplication and Division

 If a and b are constructable numbers, so are a×b and a/b. The construction for multiplication is depicted below (division is similar):

Multiplication Consturction

As you can see,  all the rational numbers are constructable.

Actually, we have reached a more important conclusion: the constructable numbers constitute an extention field of Q, the field of rational numbers!

We are getting closer to the answer we are looking for, as we already know quite a bit about the set P of constructable points.

To be continued…

Total Fun

Rating: 2.5
May 21st, 2007

This post is about a conversation I once held with a good friend of mine, Nadav Sherman. It is not a very serious topic, so take it lightly. We were wondering how to calculate the total amount of fun a person has during his life time. We reasoned as follows:

The absolute state score (a.s.s.) of a person is a number ranking the person’s current state and possessions (it is a function of time). For example, if you have 500,000$ and a girl-friend, then your a.s.s. is higher than that of someone identical to you without a girlfriend, but lower than that of someone identical to you with an additional 1,000,000$. Note that we do not specify exactly how to calculate the a.s.s., but we believe that it can be defined such that claim 1 below will be true. The actual value of the a.s.s. is not directly correlated to the fun a person has – a person with a million dollars and a beautiful girl-friend can be sadder than a poor lonely guy.

We define the effective fun of a person as a measurement of the instantaneous fun a person has in a given moment (again, it is a function of time). Our first claim is:

claim 1: The effective fun equals the derivative of the a.s.s..

Again, this obviously depends on the exact function used to calculate the a.s.s.. Instead of proving the claim above, I will give an intuitive reasoning as to why it makes sense. It is obvious that if your a.s.s. is going up right now, you are happy, regardless of the actual value of the a.s.s.. Conversely, if your a.s.s. goes down you are sad, regardless of the value of your a.s.s..

Lets give an example. If you get a new girlfriend, your a.s.s. goes up and so your effective fun (its derivative) is positive. If you lose 100$ your a.s.s. goes down and so your effective fun is negative. The actual value of the a.s.s. does not matter – only changes in the a.s.s. influence your effective fun.

Please make sure you understand claim 1 before continuing to read (see also the following figure).

Fun Graph

Now, in order to calculate the total amount of fun a person has in a given period we obviously integrate the effective fun function over the given period. We use the fundamental theorem of calculus, stating that the integral of the derivative of a function equals the function itself (note that as this is not the true fundamental theorem of calculus, which speaks solely of continuous functions, but its more in the spirit of it) to get the following:

claim 2: The total fun a person has in a given period is the a.s.s. value at the end of the period minus the a.s.s. value at the beginning of the period.

This leads us to our final (morose) conclusion:

The total fun a person has in his lifetime equals his a.s.s. on the day he was born minus his a.s.s. on the day he dies (no matter how long he lived or what happened during his lifetime).

Perl vs. Python One-Liner

Rating: 3.5
May 20th, 2007

A few years ago a friend of mine asked me the following Perl riddle. Unfortunately, in order to solve it you must know Perl. As I like Python much better, I translated the riddle to Python. Attached are both versions.

I admit the Perl version is a bit more cryptic and if you know both Perl and Python you should try to solve the Perl version (but use Python for everything else in life :-) ).

Oh, and try to solve the riddle without running it (run it only as a last resort).

Perl Logo

perl -wle 'print "True" if (1 x shift) !~ /^1?$|^(11+?)\1+$/' [number]

Python Logo

python -c "import sys, re; print None == re.match('^1?$|^(11+?)\\1+$','1'*int(sys.argv[1]))" [number]

To alleviate all doubt – [number] denotes a numeric command line argument (e.g. 17).

Constructions with a Straight-Edge and a Compass – Part I

Rating: 3
May 18th, 2007

This topic is about defining a weird game and exploring its properties. When I first encountered it I could not help but wonder, why is it interesting?

Well, this game (as you shall soon find out for yourself) has barely any practical implications. But it is an interesting subject none the less. There are 2 main reasons for that:

  1. This game was analysed (with limited success) by many important mathematicians, since the times of the Greeks.
  2. Using the right tools, the complete answer to the questions relating this matter is achievable and it turns out to be beautiful.

Now lets dive in. We will begin with a simplified game I call ”Constructions with a Straight-Edge Only“. BTW, a straight-edge means a ruler without markings (the only thing you can do with a straight-edge is connect two points with an infinite line).

These are the rules of our simplified game: The board is the 2 dimensional euclidean plane. You start with an initial set of points and during the game you make it grow. Lets call this set of points P. On each turn you can connect two points in your set P by an infinite straight line. If two of these lines intersect at one point that is not already in P, the new point is added to your set.

A point of the euclidean plane is called constructable if it can be added to the set. Note that the notion of constructability depends on the initial set of points (i.e. some points may be constructable from one initial set of points but not from another).

Lets demonstrate the rules with some simple examples:

  1. If the initial set of points consists of only a single point then this single point is also the only constructable point (as no lines can be drawn). The same applies for an initial set containing 2 points (as only one line can be drawn, and so there are no line intersections!).
  2. If the initial set contains 3 points then 3 lines can be drawn (as shown here). But no new point is created.
    Triangle Exmaple
  3. If the initial set is a rectangle, one new point can be created.
    Rectangle Image

A more complex example is depicted below. Well, as you can see, many new points can be created. I added to the illustration only the first few, but you get the point.

Complex Example

Let me raise an interesting question I have not yet had time to think about: what initial sets of points (if any) lead to an infinite set of constructable points?

Note that we have not specified that the initial set should be finite. For example, we can start out with all the points on the x-axis. It is clear that the set of points constructable from this set is equal to the x-axis (i.e. no new point can be constructed).

The above example suggests another interesting question – what initial sets of points are unexpandable? I.e. no new points can be constructed from them. The sets containing fewer than 4 points are obviously unexpandable, as are the sets contained in a line and the set consisting of the entire euclidean-plane. But are there others? 

Now lets assume our initial set contains only points with rational coordinates (lets call them rational points). What can be said about the set P of constructable points?

A nice observation is that P contains only rational points. To see this, observe that the line passing through points (x1, y1) and (x2, y2) satisfies the equation:

y = (y2 – y1) / (x2 – x1) × (x – x1) + y1

Opening all the parenthesis, and noting that the set of rational numbers constitutes a field, the line equation becomes: 

y = m × x + b

Where m and b are rational numbers.

Note that if the line is vertical it does not satisfy the equation but since it is clear how to handle this special-case I will not elaborate.

We need to show that the point of intersection of two lines of the form:

y = m × x + b

and

y = m’ × x + b’

is rational (i.e. it has rational coordinates). Since the point of intersection (x0, y0) satisfies both line equations, we get:

m × x0 + b = m’ × x0 + b’

or

x0 = (b’ – b) / (m – m’)

and so x0 is rational (again, I ignore the special case m = m’ in which the lines are parallel).

And then y0 = m × x0 + b is also rational. So indeed the intersection of two lines passing through two rational points is a rational point.

This concludes the first part of this topic.

To be continued…

Find Your Name

Rating: 4.5
May 13th, 2007

I had a lot of thinking before deciding to include this riddle in my site. The argument for not including it, is that it is too complicated. I ended up deciding it deserves its place in my riddles section (which means I believe it is one of the most beautiful riddles ever) because, well, I think it is one of the most beautiful riddles ever. So enjoy! (but be ready for a hard one…):

veryhard.gif

There are 100 men, 100 boxes and 100 notes with the men’s names on them. The 100 boxes are arranged in a line in a room. Each of the notes with the names is put in a different box randomly. The men are put together and are allowed to decide upon a strategy. When they are done, they are taken, each in his turn, to the room with the boxes. Each one is allowed to open 50 of the boxes. No information whatsoever is shared between the men.

The goal of the men is that each of them finds his own name among the 50 boxes he opened. If but one of them fails to find his own name, they all get killed (why is it, that in so many riddles people end up dead?). You are required to find a method with which they are all saved with a probability of above 30%.

If you understood this riddle properly, it should seem impossible to you at first.

Because the riddle contains many details and to make sure I explained it properly, lets consider two bad strategies for the men:

1. All the men decide to open the same 50 boxes – this is probably the worst strategy for them, as all of them will die with probability 1. This is because 50 of them are sure not to find their names in the boxes they open.

2. Every man open 50 boxes randomly – although a better strategy than the former one, it is a very bad strategy. Each person is likely to find his own name with a probability of 1/2. As all the events of the men finding their names are independent, they will be saved with a probability of (1/2)^100.

As this number is smaller than 0.00000000000000000000000000000079 (which is obviously smaller than 30%) it is a very bad strategy indeed.

BTW - to calculate the above number I used the following python statement:
>>> '%1.32f'%0.5**100

Solution now available on next page!

Wish Me Luck!

May 13th, 2007

Hello all!

I am very excited to announce that my site is up and I really hope you enjoy it. I will always appreciate your feedback (and especially now, that its all brand-new!). Do not hesitate to contact me at yaniv@leviathanonline.com and please register and post comments!

Yaniv

Ants

Rating: 5
May 11th, 2007

This is probably the best riddle I know of.

There are 1000 ants walking on a horizontal stick, 1m long.

Each ant is walking at a constant rate of 1m/hour.

The ants start out in random locations on the stick, some walking to the left and some to the right.

Whenever two ants meet they reverse their direction, as follows:

Riddle Illustration

Whenever an ant reaches one of the ends of the stick, it falls off (eventually all ants will fall).

What is the longest time it will take the last ant to fall off the stick?

The Ants riddle is one of the best riddles ever. It requires an imaginative mind, but does not require any knowledge of mathematics. A talented 6 year old can solve it!That is not to say it is very easy, but you can solve it yourself. DO NOT DESPAIR!

If you already solved the riddle and want to see the solution – visit the second page below.