Tuesday, August 22, 2006

Patterns for solving the Adobe San Jose Semaphore cryptograph

After putting this down for a few days to work on a new session on Adobe’s Security Architecture for the upcoming Max 2006 show, I came back to the Semaphore website and observed some patterns. This time, I have the benefit of being able to both see the semaphore and hear the codes. For each 7.2-second change, I noted the position of each of the four glyphs and the spoken letter-number code. The visual and audio clues repeat in a synchronized pattern. The following characters denote each glyph’s position: | for vertical; \ for top left to bottom right; / for bottom left to top right; and - for horizontal.

-|\- K02
\--- M14
-/// L09
\|// O10
|\\/ B01
-||| K8
-/-/ C10
\\|/ E2
/|-- N11
\//\ G8
|/|/ K2 (note that even though K2 repeats, glyphs are different)
/\\/ M14
|--| L (5 or 9 – was not sure)
//-/ O10
-||| B1
|//\ K8
|-\| C10
/|/| N11
/--- G8
--/| K2
\||| M14
-\\\ L9
\-\| O10
|//\ B01
---- K08
-\|\ C10
\/-\ E2
/-|| N11
\\\/ G8
|\-\ K2
///\ M14
|||- L9
/\|\ O10
---- B1 (note the glyphs repeat but not the alpha-numeric)
|\\/ K8
||/- C10
/-\- E02
/||| G8
-|\- K02 (First in sequence of complete repeat. Duplicates the first entry.)

From here the entire cycle repeats. When it started repeating a third time in a row, I got an idea. Perhaps it repeats a certain section of the code specific to each client. Some aspect of the interaction between the client and the server for Semaphore “seeds” the semaphore to produce a specific set of codes. Being in the mood to test, I hit “reload” and, lo and behold, the alphanumerics were the same but the glyphs were different. CAVEAT: I did not go through the entire cycle to verify it.
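One way to check the repetition is to normalize the logged codes (so that K02 and K2 compare equal) and look for the shortest period in the sequence. This is only a minimal Python sketch, assuming every code is one letter followed by a number; `normalize` and `shortest_period` are names made up for illustration, and the sample is just the first twelve alphanumeric codes from the log above.

```python
import re
from typing import List, Optional, Tuple


def normalize(code: str) -> Tuple[str, int]:
    """Turn a logged code such as 'K02', 'K2' or 'M 14' into (letter, number)."""
    match = re.fullmatch(r"\s*([A-Za-z])\s*(\d+)\s*", code)
    if match is None:
        raise ValueError(f"unrecognized code: {code!r}")
    return match.group(1).upper(), int(match.group(2))


def shortest_period(seq: List[Tuple[str, int]]) -> Optional[int]:
    """Return the smallest p with seq[i] == seq[i + p] for all valid i, else None."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return None


# First twelve alphanumeric codes as written down in the log above.
first_codes = ["K02", "M 14", "L09", "O10", "B01", "K8",
               "C10", "E2", "N11", "G8", "K2", "M14"]
print(shortest_period([normalize(c) for c in first_codes]))
```

A sample this short only overlaps by two entries, so a longer and gap-free log would be needed before reading much into the result.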

Hypothesis:

If Semaphore reacts to its environment, it might use some unique aspect of web-based interactions to seed the pattern and so avoid pattern detection between multiple clients. Maybe this hypothesis is too nerdy and over-analytical, but I would be very interested to hear whether others see the same thing. I would also be interested to see if the real-world live Semaphore is synchronized with the simulcast. If someone could go in front of the Adobe building with a laptop connected to the internet and visually verify whether or not the Semaphore’s glyphs are the same as on their laptop screen, it would be useful information.

Anyone else get the same patterns online?

16 comments:

  1. I found a pattern in the semaphore, and then found your blog through google. The pattern in the semaphore I found supports your theory that the glyphs are different for each "run". I found that for each run, the glyphs and the alphanumerics match. Once it's a new run, the glyphs for the alphanumerics change but stay consistent throughout the run until a new run occurs. I haven't actually written down the beginning and end of the run, but I was taking note of both the audio and the visual. I, too, started out with a 16x16 table with Alpha-Papa and 1-16. I am also interested in overlaying the ascii table onto it. Also, the live broadcast seems to be different than the simulcast. I am heading over tomorrow night to sit there with my laptop and try to analyze it some more. Do you know if there's wireless there?

  2. This is a great clue. I also observed that the glyphs stay consistent for each run and even some of the meta patterns are repeated. For example, on one run, three glyphs were the same and the fourth was not. In the next run, three glyphs again matched each other and the fourth did not, although all of them were different from the previous run.

    I have to ask a question. Did you trigger a new run or did the new run just occur by itself?

    Another clue (or not) might be to see if there is any correlation between the airplanes flying overhead and the new run.

  3. I have the swf file loaded on a separate browser and I shrink it to sit on the bottom corner. Once in a while, I'll take a 5 minute break and just take down the glyphs on my 16x16 chart.

    So to answer your question, the new run occurs by itself.

    I was actually taking it down when I noticed the sudden change in pattern. I didn't click refresh or anything.

  4. Looking at the source of the encryption algorithm, I think it is interesting to see who is credited for this...

    Mark Hansen who worked on the encryption is interesting - responsible in part for MPG compression
    http://www.stat.ucla.edu/~cocteau/cv/cv.pdf

    As is the other person credited with helping on the encryption, Dan Wallach
    http://www.cs.rice.edu/~dwallach/

  5. My son and I are sitting next to each other using our laptops. Here are some observations.

    Alpha to Papa are used (first 16 letters of the alphabet). The verbal numbers are also 0-16.

    While both listening, we each got completely different combinations of letters, glyphs, etc.

    I took notes for a while. I don't stand a chance of figuring this out, but I'm certainly willing to help with any data collection.

    There are also several audio tracks that will have to be recorded. There's a plucked instrument (like a guitar or cello, although sometimes it sounds like a piano), the blips, the beeps (sound like radar). There's also an underlying tone that increases/decreases at varying points in the signal.

    I'm also convinced that there's added patterns in the static over the "alpha" voice and there's got to be something to the tone of the woman saying the numbers.

    Sometimes her voice is flat, sometimes high-pitched and almost sung.

    Thanks for doing this blog.

  6. The numbers 1-16 and the letters A-P could each correspond with the hex digits 0 through F. Taken together you have 00 - FF, or an 8 bit value. I'm sure an ASCII or Unicode lookup is too simple?

    Each of the 4 possible positions of the circles can represent two bits, so the four circles together give you an 8 bit value. I don't think it's this simple since they also rotate clockwise or counterclockwise, which hasn't been mentioned thus far. This gives you at least 4 more bits of potential info. Also, rather than simply encoding info as a position and a spin, the info could be encoded as a delta from the previous position. So, a clockwise rotation of 45 degrees could represent +1, while a counterclockwise rotation of 90 degrees could represent -2, etc.
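The two-bits-per-circle idea can be sketched as follows. This is a hedged illustration only: the `GLYPH_BITS` assignment is an arbitrary assumption (nothing in the observations says which position means which bit pair), and the function names are invented here. The positions are numbered in clockwise 45-degree steps starting from vertical, so the difference between two positions is only known modulo 4; the actual spin direction cannot be recovered from a log of static positions.

```python
# Assumed (arbitrary) 2-bit value per position, in clockwise 45-degree steps:
# vertical -> "/" -> horizontal -> "\".
GLYPH_BITS = {"|": 0b00, "/": 0b01, "-": 0b10, "\\": 0b11}


def glyphs_to_byte(glyphs: str) -> int:
    """Pack four glyph positions (leftmost first) into one 8-bit value."""
    if len(glyphs) != 4:
        raise ValueError("expected exactly four glyph characters")
    value = 0
    for glyph in glyphs:
        value = (value << 2) | GLYPH_BITS[glyph]
    return value


def clockwise_steps(previous: str, current: str) -> int:
    """Number of clockwise 45-degree steps from previous to current, modulo 4."""
    return (GLYPH_BITS[current] - GLYPH_BITS[previous]) % 4


print(hex(glyphs_to_byte("-|\\-")))   # first logged position in the post
print(clockwise_steps("|", "/"))      # one clockwise step
```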

    How many different notes are played on the guitar? Doesn't seem like there's 16 of those... How many different beep/boop boop/beep tones? There seem to be two tones used in the voicing of the numbers, i.e., 1 bit.

    Oh Crap! I just realized that the clicking sequence is different in each left/right channel! They seem to be tied to the circle rotation, but are they?

    Here is the number/letter/decimal/hex mapping (read downwards):

    NUM 0000000001111111
        1234567890123456
        ||||||||||||||||
    LTR ABCDEFGHIJKLMNOP
        ||||||||||||||||
    DEC 0000000000111111
        0123456789012345
        ||||||||||||||||
    HEX 0123456789ABCDEF

    So a code such as K02 is A1 in hex (K is the 11th letter, giving A, and 02 gives 1), which is an upside-down exclamation mark in Unicode. The previous message recoded this way is still gibberish, but it is a potential datastream.

    K02 A1 ¡
    M14 CD Í
    L09 B8 ¸
    O10 E9 é
    B01 10 (control character DLE, not printable)
    K8 A7 §
    C10 29 )
    E2 41 A
    N11 DA Ú
    ...

    The real task is going to be figuring out how all these different forms of data correlate.
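The letter/number-to-hex mapping above can be expressed as a short Python sketch. It assumes exactly the table in this comment (A-P give the high hex digit 0-F, and 1-16 give the low hex digit 0-F); `code_to_byte` is a name invented for illustration, and Latin-1 is used for display because its byte values coincide with the first 256 Unicode code points.

```python
def code_to_byte(code: str) -> int:
    """Map a code such as 'K02' to a byte: letter -> high nibble, number -> low nibble."""
    code = code.strip()
    letter = code[0].upper()
    number = int(code[1:])
    if not "A" <= letter <= "P":
        raise ValueError(f"letter out of range A-P: {letter!r}")
    if not 1 <= number <= 16:
        raise ValueError(f"number out of range 1-16: {number}")
    high = ord(letter) - ord("A")   # A..P -> 0..15
    low = number - 1                # 1..16 -> 0..15
    return (high << 4) | low


for code in ["K02", "M14", "L09", "O10", "B01", "K8", "C10", "E2", "N11"]:
    value = code_to_byte(code)
    char = bytes([value]).decode("latin-1")
    # repr() keeps control characters such as 0x10 visible as escapes.
    print(f"{code:>3} {value:02X} {char!r}")
```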

  7. oh, sorry, that last comment really requires a fixed width font to make sense. If you care, copy it elsewhere and change the font to Courier New...

  8. I haven't had a chance to study this in more detail, but reading the background, and seeing how this is Adobe, perhaps there is a visual element as per the semaphore telegraphs?

  9. I just saw all of this for the first time, and I honestly have no idea about any of it -- but I did notice through my LiveHTTP headers in Firefox that XML is being loaded...

    Initially, my browser grabs this XML: http://www.sanjosesemaphore.org/finnegan/index.xml

    That file has one number that apparently serves as the seed. From there, your browser will grab another XML file based on that seed.

    For example, if the current data is "28282", then you will get the 28282.xml file from the folder above: http://www.sanjosesemaphore.org/finnegan/28282.xml

    That file has 10 "entries", each with 4 "Disc" elements, and 6 "Sound" elements. I have no idea what they do. And I don't know how the file numbering system works -- I do know that they increase after a certain number of disc moves.

    Couldn't you theoretically download the SWF, and grab all of the possible XML files, and fake the seed numbers to see the results?

    Not saying that'd be useful, but who knows.
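A rough sketch of how the per-seed files could be inspected once downloaded. Only the URL pattern comes from the comment above; the XML layout in `SAMPLE_XML` is invented purely to exercise the code (the real element nesting and attributes are unknown), and `run_url` and `count_tags` are hypothetical helper names. No network access is attempted here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

BASE_URL = "http://www.sanjosesemaphore.org/finnegan/"


def run_url(seed: str) -> str:
    """Build the URL of the XML file for a given seed number."""
    return f"{BASE_URL}{seed}.xml"


def count_tags(xml_text: str) -> Counter:
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


# Made-up stand-in for a downloaded file; NOT the real format.
SAMPLE_XML = "<entries><entry><Disc/><Disc/><Disc/><Disc/><Sound/></entry></entries>"

print(run_url("28282"))
print(count_tags(SAMPLE_XML))
```

Counting element names this way would be a quick check of the "10 entries, 4 Disc, 6 Sound" observation against a real file.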

  10. TO WHOMEVER PUTS SPAM ON THIS POST:

    If I ever catch you, you will pay a very large price. Do not spam this blog. If you do, I will hunt you down as I am really pissed. I will not be responsible for my actions.

    Think very carefully about this!

  11. Sunjun:

    Thank you. The links on Falun Gong you sent me were very impressive. I didn't know you were into Falun Gong so much. The Chinese Ministry of the Interior friends I have were very surprised when they learned your identity. It seems they have been looking for the person who uses falun gong and pornography. I decided to help them find you.

    Expect a knock on your door soon.

    Have a nice day!

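  12. Wenling Jingwei Assembly Line Manufacturing Factory is located in the Linchuan Industrial Zone of Wenling City, on the coast of the East China Sea. It is a specialized manufacturer that has grown to integrate the research, design, production, and sale of production lines.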

  13. Sunjun:

    You are an idiot. You are an asshole. You are a spammer. You should die. I delete all your ads but I know who you are now.

    You should not sleep too well at night.

    Duane


Do not spam this blog! Google and Yahoo DO NOT follow comment links for SEO. If you post an unrelated link advertising a company or service, you will be reported immediately for spam and your link deleted within 30 minutes. If you want to sponsor a post, please let us know by reaching out to duane dot nickull at gmail dot com.