MonaTweeta II

Preliminary result of a little competition between me and Ralph Hauwert (who had the initial idea) with the goal to write an image encoder/decoder that allows to send an image in a tweet. The image on the left is what I currently manage to send in 140 characters via twitter.


This is the tweet for the image:



I am using chinese characters here since in UTF-8 encoding they allow me to send 210 bytes of data in 140 chars. In theory I could use the whole character code range from 0x0000-0xffff, but there are several control chars among them which probably could not be sent properly. With some tweaking and testing it would be possible to use at least 1 or 2 more bits which would allow to sneak 17 or 35 more bytes into a tweet, but the whole encoding would be way more nasty and the tweets would contain chars that have no font representation.


Besides this char hack there are a few other tricks at work in the encoding. I will reveal them over time. For now I just mention the difficulties involved here:


A typical RGB color needs 24 bits which is 3 bytes. This means if you just stored raw colors you could send 70 colors. Unfortunately you couldn't send anything else. At least that would allow you to send a 7x10 pixel matrix.


The worst way to store one full x/y coordinate would be 2 times 4 bytes, which is 26 coordinates in one tweet. That's 8 triangles. Obviously you have to do some concessions with the precision here. 2 bytes per number maybe? Gives you 52 points or 17 triangles. Unfortunately those come without color info.


--- Additional info added on May 12th --

Looks like my little project got a bit of attention lately, so I guess I should explain a few more of the details.


The image file format currently looks like this:


[0x00-0x17] 8 color lookup table, each RGB color is 24 bit


[0x18] approximate image proportions, stored in 2 x 4 bits, the proportion is (v >> 4) / (v&4) - which means the actualy physical size of the image is not stored, which is not necessary since it gets rendered in vectors anyway. So the height will be derived from the available width.


[0x19-0xD0] 61 points with color info each stored in 3 bytes:

The first two bytes are the x and the y position whereby their final position is calculated byte / 0xff * displayWidth and byte / 0xff * displayHeight


The color info is stored in the third byte and the way it is done is quite nifty I think: since my lookup table stores only 8 colors I just need 3 bits to store an index to a color. This would leave me with 5 unused bits. So I use these additional bits to give me a wider range of colors by creating blends between the colors in the table. So additionally to one color index I store another color index in the same byte. The remaining 2 bits I use as the blending factor. 2 bits allow for 4 different values. The ones I pick are 0 = 0.125, 1=0.25, 2=0.375, 4 = 0.5. I don't need any higher values since I can simply switch the order of the "upper" and "lower" color to get the same result as e.g. 0.75. I also do not need 0 or 1 since if I want a full color I just mix two times the same color. The 0.5 is a bit of a waste since it means I get the same mix in both directions, maybe it would be smarter to use 0.45 in this case. Overall this trick means that instead of just 8 colors I have a choice of about 256 shades of color.


The actual creation of the image is an evolutionary algorithm. I start by quantizing the image's colors to get 8 representative colors. And I scatter the 61 points over the image area. At each point I read the pixel color of the blurred image and choose the closest shade I can create with my extended color table. With this data I greate a binary "gene" ( the encoded version of is the chinese twitter tweet). From the gene I create a voronoi diagram which is the image you see on the left.


In order to get the best representation (meaning best positions of points and their choice of color) I compare the rendered image with the original by summing up the squared difference of the pixel colors and dividing it by the amount of pixels. The result is the fitness value. The ideal value would be 1 which meant that there is no difference at all between original and rendered image, but obviously that is impossible to reach for most images.


After calculating the fitness value I clone the gene and make a few random mutations to it. Once again I calculate the fitness of the mutation and if it is higher than of its parent the mutation becomes the new parent. This process can run indefinitely but usually the rate of improvement decreases rapidly after a few minutes.


My current goal is to figure out the optimum ways to get good results quickly.


243 faves
Uploaded on May 10, 2009