MonaTweeta II


    Preliminary result of a little competition between me and Ralph Hauwert (who had the initial idea), with the goal of writing an image encoder/decoder that makes it possible to send an image in a tweet. The image on the left is what I currently manage to send in 140 characters via Twitter.

    This is the tweet for the image:
    圑嘌婂搒孵怤實恄幖戰怴搝愩娻屗奊唀唭嚟帧啜徠山峔巰喜圂嗊埯廇嗕患嚵幇墥彫壛嶂壋悟声喿墰廚埽崙嫖嘵奰恛嬂啷婕媸姴嚥娐嗪嫤圣峈嬻尤囮愰啴屽嶍屽嶰 寂喿嶐唥帑尸庠啞彐啯廂喪帄嗆怠嗙开唅恰唦慼啥憛幮悐喆悠喚忐嗳惐唔戠啹媊婼捐啸抃岖嗅怲幀嗈拀唹坭嵄彠喺悠單囏庰抂唋岰媮岬夣宐彋媀恦啼彐壔姩宔嬀

    I am using Chinese characters here since in UTF-8 encoding they allow me to send 210 bytes of data in 140 characters. In theory I could use the whole character code range from 0x0000-0xFFFF, but there are several control characters among them which probably could not be sent properly. With some tweaking and testing it would be possible to use at least 1 or 2 more bits per character, which would allow me to sneak 17 or 35 more bytes into a tweet, but the encoding would be far nastier and the tweets would contain characters that have no font representation.
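    As a rough illustration (not necessarily the exact scheme used here), such a packing could look like this in Python: 12 bits of payload per character, offset into the CJK Unified Ideographs block starting at U+4E00. The base offset is an assumption, chosen because that block has glyphs in most fonts and contains no control characters.

        # Hypothetical sketch: pack 3 bytes of payload into 2 CJK characters
        # (12 bits each). The base offset U+4E00 is an assumption.
        BASE = 0x4E00

        def encode(data):
            assert len(data) % 3 == 0                  # 3 bytes -> 2 characters
            chars = []
            for i in range(0, len(data), 3):
                n = int.from_bytes(data[i:i + 3], "big")   # 24 bits
                chars.append(chr(BASE + (n >> 12)))        # high 12 bits
                chars.append(chr(BASE + (n & 0xFFF)))      # low 12 bits
            return "".join(chars)

        def decode(text):
            out = bytearray()
            for i in range(0, len(text), 2):
                n = ((ord(text[i]) - BASE) << 12) | (ord(text[i + 1]) - BASE)
                out += n.to_bytes(3, "big")
            return bytes(out)

    With this scheme, 210 bytes become exactly 140 characters.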

    Besides this character hack there are a few other tricks at work in the encoding. I will reveal them over time. For now I will just mention the difficulties involved:

    A typical RGB color needs 24 bits, which is 3 bytes. This means that if you just stored raw colors you could send 70 of them, but unfortunately nothing else. At least that would allow you to send a 7x10 pixel matrix.

    The worst way to store one full x/y coordinate would be 2 times 4 bytes, which gives 26 coordinates in one tweet. That's 8 triangles. Obviously you have to make some concessions on precision here. 2 bytes per number maybe? That gives you 52 points or 17 triangles. Unfortunately those come without color info.
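    The byte-budget arithmetic from the two paragraphs above, spelled out (assuming the 210-byte budget from the character hack):

        # Byte-budget arithmetic for a 210-byte tweet payload
        BUDGET = 210

        print(BUDGET // 3)               # 70 raw 24-bit RGB colors
        print(BUDGET // (2 * 4))         # 26 coordinates at 4 bytes per axis
        print(BUDGET // (2 * 4) // 3)    # = 8 triangles
        print(BUDGET // (2 * 2))         # 52 points at 2 bytes per axis
        print(BUDGET // (2 * 2) // 3)    # = 17 triangles, still without color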

    --- Additional info added on May 12th ---
    Looks like my little project got a bit of attention lately, so I guess I should explain a few more of the details.

    The image file format currently looks like this:

    [0x00-0x17] 8-color lookup table, each RGB color is 24 bits

    [0x18] approximate image proportions, stored as 2 x 4 bits; the proportion is (v >> 4) / (v & 0xF). This means the actual physical size of the image is not stored, which is not necessary since it gets rendered as vectors anyway, so the height will be derived from the available width.

    [0x19-0xD0] 61 points with color info, each stored in 3 bytes:
    The first two bytes are the x and y positions; the final position is calculated as byte / 0xFF * displayWidth and byte / 0xFF * displayHeight.

    The color info is stored in the third byte, and the way it is done is quite nifty I think: since my lookup table stores only 8 colors, I just need 3 bits to store an index to a color. This would leave me with 5 unused bits. So I use these additional bits to give me a wider range of colors by creating blends between the colors in the table. In addition to one color index I store a second color index in the same byte. The remaining 2 bits I use as the blending factor. 2 bits allow for 4 different values. The ones I pick are 0 = 0.125, 1 = 0.25, 2 = 0.375, 3 = 0.5. I don't need any higher values since I can simply swap the order of the "upper" and "lower" colors to get the same result as e.g. 0.75. I also do not need 0 or 1, since if I want a pure color I just blend the same color with itself. The 0.5 is a bit of a waste since it gives the same mix in both directions; maybe it would be smarter to use 0.45 in this case. Overall this trick means that instead of just 8 colors I have a choice of about 256 shades of color.
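    As an illustration of the two packed bytes described above, here is a minimal Python sketch. Only the 3+3+2 split and the blend factors are stated above; the exact bit positions and the mixing direction are assumptions.

        # Hypothetical bit layout: bits 7-5 first color index, bits 4-2
        # second color index, bits 1-0 blend factor.
        BLEND = [0.125, 0.25, 0.375, 0.5]

        def unpack_color(byte, palette):
            """palette is the 8-entry RGB lookup table from the header."""
            a = palette[byte >> 5]            # first color index
            b = palette[(byte >> 2) & 0x7]    # second color index
            t = BLEND[byte & 0x3]             # blend factor
            # mix t parts of color a into (1 - t) parts of color b
            return tuple(round(ca * t + cb * (1 - t)) for ca, cb in zip(a, b))

        def unpack_proportion(v):
            # the 2 x 4 bit proportion byte: width-to-height ratio
            return (v >> 4) / (v & 0xF)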

    The actual creation of the image is done by an evolutionary algorithm. I start by quantizing the image's colors to get 8 representative colors, and I scatter the 61 points over the image area. At each point I read the pixel color of the blurred image and choose the closest shade I can create with my extended color table. With this data I create a binary "gene" (the encoded version of which is the Chinese Twitter tweet). From the gene I create a voronoi diagram, which is the image you see on the left.
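    A nearest-point (voronoi) rendering like the one described could look like this in Python. It assumes each point has already been decoded to (x, y, color) in pixel coordinates; this is a naive sketch, not the actual renderer.

        def render(points, width, height):
            """points: list of (x, y, (r, g, b)); returns rows of RGB tuples."""
            image = []
            for py in range(height):
                row = []
                for px in range(width):
                    # each pixel takes the color of its nearest point,
                    # which yields the voronoi-cell look of the image
                    nearest = min(points,
                                  key=lambda p: (px - p[0]) ** 2 + (py - p[1]) ** 2)
                    row.append(nearest[2])
                image.append(row)
            return image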

    In order to get the best representation (meaning the best positions of the points and their choice of color) I compare the rendered image with the original by summing up the squared differences of the pixel colors and dividing by the number of pixels. The result is the fitness value. The ideal value would be 1, which would mean that there is no difference at all between the original and the rendered image, but obviously that is impossible to reach for most images.
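    One plausible reading of that fitness measure, normalized so that a pixel-perfect match scores 1 (the exact normalization is not stated above and is assumed here):

        def fitness(rendered, original):
            """Both images as flat lists of (r, g, b) tuples of equal length."""
            total = 0
            for (r1, g1, b1), (r2, g2, b2) in zip(rendered, original):
                total += (r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2
            mse = total / len(original)
            # 3 * 255**2 is the largest possible per-pixel squared difference,
            # so the value lies in [0, 1], with 1 meaning a perfect match
            return 1 - mse / (3 * 255 ** 2)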

    After calculating the fitness value I clone the gene and make a few random mutations to it. Then I calculate the fitness of the mutation, and if it is higher than its parent's, the mutation becomes the new parent. This process can run indefinitely, but usually the rate of improvement drops off rapidly after a few minutes.
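    In code, the mutate-and-keep-if-better loop might look roughly like this. The mutation operator and rates are placeholders, and render and fitness stand in for the voronoi renderer and the comparison described above.

        import random

        def evolve(gene, render, fitness, original, iterations=10000):
            best = bytearray(gene)
            best_score = fitness(render(best), original)
            for _ in range(iterations):
                child = bytearray(best)
                # mutate a few random bytes of the 210-byte gene
                for _ in range(random.randint(1, 3)):
                    child[random.randrange(len(child))] = random.randrange(256)
                score = fitness(render(child), original)
                if score > best_score:    # keep the mutation only if it improves
                    best, best_score = child, score
            return bytes(best), best_score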

    My current goal is to figure out the optimum ways to get good results quickly.



    1. Quasimondo 60 months ago | reply

      It looks like Neil Graham has already done some very interesting research into image compression with vectors:
      www.screamingduck.com/Article.php?ArticleID=46&Show=ABCE

      screamingduck.com/Lerc/evopic.html

    2. Quasimondo 60 months ago | reply

      @ingagilchrist Oh did you write this: www.flickr.com/photos/pleribus/3553944624/? Thank you! To answer your question - yes I might do that, but right now I have to do some real work - unfortunately no time for recreational coding.

    3. Sam Hocevar 60 months ago | reply

      Hi there! I love your idea, it's probably one of the first interesting uses for Twitter.

      Here's my first attempt:

    4. bumbolo_bill 60 months ago | reply

      This is extremely impressive, and I'd like to make some of these on my own. Problem is I have almost no knowledge of algorithms or generating vectors, even after your explanation. Is there any application or executable I could use that would create an automatic result, or a mutation program of some sort?

    5. Furee 60 months ago | reply

      that is intense, very clever competition.

    6. Quasimondo 60 months ago | reply

      BTW, encoded as a QR code the Mona Tweeta text looks like this:

    7. Sam Hocevar 60 months ago | reply

      @kristophrr: are you really a computer vision professional? Compressing an image with vector-valued components is an established approach in the field, especially for very lossy compression. See the works of Joachim Weickert, for instance, especially his paper titled "Towards PDE-based image compression". Quasimondo's approach is IMHO good, it just needs work 1) in the bit allocation, and 2) in the image reconstruction itself (smoothing the whole thing a bit).

    8. Quasimondo 60 months ago | reply

      Another great link - thank you very much Sam! It definitely shows that I am just a dilettante in this field.

    9. Brian Campbell 60 months ago | reply

      I rather like the effect that Quasimondo's approach gives you; while you might get something closer to the original by applying some smoothing, I think the mosaic-like effect is actually very visually appealing. I do like the edge-enhancing diffusion from Joachim Weickert's work, though. I think it would be really interesting to apply that to this problem.

    10. tsevis 60 months ago | reply

      Great idea. Congrats.

    11. bjornblog 60 months ago | reply

      love it :-D

    12. moi_fotografii 59 months ago | reply

      That's very very smart!!
      Do you think you can create Voronoi diagrams while controlling the cells, so that smaller cells are created in areas where more image detail is present? (smaller cells in the face area at the expense of larger ones in less interesting areas)

      Here is my version of Image Twitter, which I created a few weeks ago
      uzhin.info/imagetwitter/

      My thought was to create images in 144px (12x12) :)

    13. ThenAndAgain 59 months ago | reply

      Sorry for the delay, here's the source of the project we worked on.

      www.darrellnoice.com/download/utfsmuggler.tar.gz

      The project was done in NetBeans, if you don't want to wrestle with the files manually. Once you compile it, run it with --help and you should be good to go.

      From a data point of view, you get a guaranteed 20 bits of real data for every UTF-8 character, and only the last character of the stream is lost to overhead.

      I hope this helps! Good luck with your project.

    14. alxflickrrr 55 months ago | reply

      Thanks for sharing!

    15. guaritore2 [deleted] 46 months ago | reply

      Fashionably late to the party...

      @Everyone: Wow. I'm blown away that I know *just* enough about math, encoding, and computer functionality to at least try to appreciate what this actually represents. I see a potentially huge impact on the area of "data/concept representation" in general and consequently on software programming. If ideas and processes can be represented in novel ways, then mobile apps (for example) could be built to allow unprecedented access to knowledge, processes, and logistical synchronization (for events, say).

      @{Away until inspiration comes}: Ditto to your comment. Translating a photo into words is quite an expressive feat!

      @Dieter Paul-Serge: I want a t-shirt too!

      @Quasimondo: 1. What's the idea for the other project involving translating the Chinese characters? 2. I think you may have brought much attention to an incredible yet sadly ignored art form! Translating a picture into words/poetry? Amazing!
