Friday, August 25, 2006

San Jose Semaphore Solution Theory

After reviewing the posts from Jo, Kelly, Ben and Joann, I think I have a theory that might solve the cryptogram. The critical pieces of information are that the glyphs and the alphanumeric content do not repeat in a synchronized pattern (other than short-term repeats), and that two people can observe totally different streams simultaneously. If two people are seeing different broadcasts, then there must be a dynamic key. There are also mechanisms that can force a pattern to break (reloading a web page or a plane flying overhead).

Here is the theory:

Assuming the 16 by 16 grid concept is legitimate, each repeating letter-number block points at a coordinate. However, that coordinate is only a partial solution. The glyphs modify the partial solution by stating a path that the solver should take on the grid from that position. I suspect that the glyphs, read in sequential order, describe a path that you would map. For example, if the sequence Kilo 02 is stated with the glyphs - - / \, you might go to the grid position K2, then go one square to the left or right, another square to the left or right, one square diagonally up and to the right, and one square diagonally down and to the right. That would land you on the correct answer.
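To make the idea concrete, here is a minimal Python sketch of the path walk. Everything about the grid layout is my assumption, not something the Semaphore confirms: letters A to P index the 16 columns, numbers 1 to 16 index the rows, rows count downward, and a path that leaves the grid is discarded. Because a glyph names an axis rather than a direction, the sketch tries both directions for every glyph and collects every square the path could end on. The names parse_code and candidate_endpoints are made up for illustration.

```python
from itertools import product

GRID_SIZE = 16  # assumed 16 by 16 grid

# Each glyph is ambiguous, so it maps to two possible (d_col, d_row) steps.
# Assumption: rows increase downward, so "up" is d_row = -1.
GLYPH_MOVES = {
    "-": [(-1, 0), (1, 0)],     # left or right
    "|": [(0, -1), (0, 1)],     # up or down
    "/": [(1, -1), (-1, 1)],    # up-right or down-left
    "\\": [(1, 1), (-1, -1)],   # down-right or up-left
}


def parse_code(code):
    """Turn a code such as 'K02' into zero-based (column, row).

    Assumption: the letter picks the column and the number picks the row.
    """
    code = code.strip()
    column = ord(code[0].upper()) - ord("A")
    row = int(code[1:]) - 1
    return column, row


def candidate_endpoints(start, glyphs):
    """Return every square reachable by following the glyphs in order."""
    endpoints = set()
    options = [GLYPH_MOVES[g] for g in glyphs]
    for path in product(*options):
        col, row = start
        for d_col, d_row in path:
            col += d_col
            row += d_row
            if not (0 <= col < GRID_SIZE and 0 <= row < GRID_SIZE):
                break  # assumption: paths that leave the grid are invalid
        else:
            endpoints.add((col, row))
    return endpoints


start = parse_code("K02")
ends = candidate_endpoints(start, "--/\\")
print(sorted(ends))
```

With four glyphs there are at most 16 direction choices, so the candidate list stays small enough to check by hand against whatever the grid turns out to contain.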

This theory is supported by Kelly’s observation that “While both listening, we each got completely different combinations of letters, glyphs, etc”. This probably indicates that the observable signals are computer generated and random, and that they combine two or more keys with the ciphertext. This would be a good cryptographic technique because it keeps anyone from seeing patterns and may even introduce patterns that mislead someone trying to solve it. Given that Ben pointed out that the creators have cryptographic experience, I suspect they would not use linear keys (too easy to break).

The questions to test this theory:

Given the glyphs are ambiguous as to direction, I suspect that there is another key in the voice or music tones. For example, if the woman’s voice “sings” the number, perhaps that signals that all horizontal moves are left to right, rather than right to left. The tones might also come into play. If the tone is higher than the previous tone, the vertical moves are from bottom to top; if it is lower than the previous tone, they are from top to bottom.
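Written as a rule, that guess looks like the Python sketch below. It is only an illustration of the guess: the function name resolve_move, the sung flag and the tone_change number are made up, rows are assumed to count downward, and the cases the guess does not cover (diagonal glyphs, or a tone equal to the previous one) return None rather than inventing an answer.

```python
def resolve_move(glyph, sung, tone_change):
    """Pick one direction for an ambiguous glyph from the audio clues.

    glyph        -- "-" (horizontal) or "|" (vertical)
    sung         -- True if the voice "sings" the number
    tone_change  -- current tone minus previous tone (any numeric scale)

    Returns (d_col, d_row), or None where the guess says nothing.
    Assumption: rows increase downward, so bottom to top is d_row = -1.
    """
    if glyph == "-":
        # Sung number: left to right. Otherwise: right to left.
        return (1, 0) if sung else (-1, 0)
    if glyph == "|":
        if tone_change > 0:
            return (0, -1)  # higher tone: bottom to top
        if tone_change < 0:
            return (0, 1)   # lower tone: top to bottom
    return None  # diagonals and unchanged tones are not covered


print(resolve_move("-", sung=True, tone_change=0))
print(resolve_move("|", sung=False, tone_change=-3))
```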

The question also remains of what characters the grid itself contains. Is it simply the alphabet repeated over and over? Is it the ASCII table? There might be two grids as well: one with the alphabet laid out vertically and one with it laid out horizontally.

So far I have put about 5 hours into solving this (perhaps a bit more, given that I think about it sometimes). I suspect that it cannot be this easy, given the statement that it should take about two years to solve. Either that or Ben, Joann, Kelly, Jo and I are a good team.

I probably won’t have time to test the theory this week given I have to write several presentations for Adobe Developer Days (Yes - I am Adobe's security technical evangelist) in London the week after next, but I’ll try to map this out and test the theory.

Tuesday, August 22, 2006

Patterns for solving the Adobe San Jose Semaphore cryptograph

After putting this down for a few days to work on a new session on Adobe’s Security Architecture for the upcoming Max 2006 show, I came back to the Semaphore Website and observed some patterns. This time, I have the benefit of being able to both see the semaphore and hear the codes. For each 7.2 second change, I noted the position of each of the four glyphs and the letter-number code. There is a repeated, synchronized pattern of both the visual and audio clues. The following characters denote each glyph’s position: | for vertical; \ for top left to bottom right; / for bottom left to top right; and - for horizontal.

-|\- K02
\--- M14
-/// L09
\|// O10
|\\/ B01
-||| K8
-/-/ C10
\\|/ E2
/|-- N11
\//\ G8
|/|/ K2 (note that even though K2 repeats, glyphs are different)
/\\/ M14
|--| L (5 or 9 – was not sure)
//-/ O10
-||| B1
|//\ K8
|-\| C10
/|/| N11
/--- G8
--/| K2
\||| M14
-\\\ L9
\-\| O10
|//\ B01
---- K08
-\|\ C10
\/-\ E2
/-|| N11
\\\/ G8
|\-\ K2
///\ M14
|||- L9
/\|\ O10
---- B1 (note the glyphs repeat but not the alpha-numeric)
|\\/ K8
||/- C10
/-\- E02
/||| G8
-|\- K02 (First in sequence of complete repeat. Duplicate of the first entry.)

From here the entire cycle repeated. Given that it started repeating a third time in a row, I got an idea. Perhaps it repeats a certain section of the code specific to each client. Some aspect of the interaction between the client and the server for Semaphore “seeds” the semaphore to produce a specific set of codes. Being in the mood to test, I hit “reload” and, lo and behold, the alphanumerics were the same but the glyphs were different. CAVEAT: I did not go through the entire cycle to verify it.
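For anyone who wants to compare notes, a small Python sketch like the one below can tidy a log written in the format above. It treats K02 and K2 as the same code (my assumption, since I wrote them both ways), strips stray spaces from the glyphs, and groups the glyph patterns seen for each code. The names parse_line and group_by_code are made up for illustration, and SAMPLE is just a few lines copied from the list above.

```python
import re
from collections import defaultdict

# Four glyph characters (spaces tolerated), then a letter and a number.
LINE_RE = re.compile(r"^([-|/\\ ]+)([A-Za-z]) ?(\d+)")


def parse_line(line):
    """Split one log line into (glyphs, code), e.g. ('-|\\-', 'K2')."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # line is not in the expected format
    glyphs = match.group(1).replace(" ", "")
    # Assumption: leading zeros are noise, so K02 and K2 are the same code.
    code = match.group(2).upper() + str(int(match.group(3)))
    return glyphs, code


def group_by_code(lines):
    """Map each normalized code to the glyph patterns seen with it, in order."""
    seen = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            glyphs, code = parsed
            seen[code].append(glyphs)
    return dict(seen)


SAMPLE = [
    r"-| \ - K02",
    r"\ --- M 14",
    r"|/|/ K2 (note that even though K2 repeats, glyphs are different)",
    r"--/| K2",
    r"|\-\ K2",
]

by_code = group_by_code(SAMPLE)
for code, patterns in by_code.items():
    print(code, patterns)
```

Grouping this way makes it easy to see whether a given code ever comes back with the same glyphs inside one cycle.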

Hypothesis:

If Semaphore reacts to its environment, it might use some unique aspect of web-based interactions to seed the pattern to avoid pattern detection between multiple clients. Maybe this hypothesis is too nerdy and over-analytical, but I would be very interested to hear whether others can see this. I also would be interested to see if the real-world live Semaphore is synchronized with the simulcast. If someone could go in front of the Adobe building with a laptop connected to the Internet and visually verify whether or not the Semaphore’s glyphs are the same as on their laptop screen, it would be useful information.
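If two of us log the same stretch, a comparison along these lines would show whether the codes line up while the glyphs differ. This is only a sketch in Python: compare_logs is a made-up helper, each log is assumed to be a list of (glyphs, code) pairs starting at the same moment, and the second log below is placeholder data I invented to show the shape of the result, not a real observation.

```python
def compare_logs(first, second):
    """Compare two logs of (glyphs, code) pairs entry by entry.

    Assumption: both logs start at the same moment and have no gaps.
    """
    same_length = len(first) == len(second)
    pairs = list(zip(first, second))
    return {
        "codes_match": same_length and all(a[1] == b[1] for a, b in pairs),
        "glyphs_match": same_length and all(a[0] == b[0] for a, b in pairs),
    }


# First three entries from my log above.
mine = [("-|\\-", "K2"), ("\\---", "M14"), ("-///", "L9")]

# PLACEHOLDER data, invented for illustration only (not an observation).
someone_else = [("||||", "K2"), ("----", "M14"), ("/\\/\\", "L9")]

print(compare_logs(mine, someone_else))
```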

Anyone else get the same patterns online?