Tokenizing is kind of a given. :)

I've begun using a different (unpublished) form of CODEC that is based on a cypher used in an episode of Mission: Impossible from the late 1960s. :D What's cool about this cypher is that it is not susceptible to hacking based on letter-frequency methods. For example, the word "cat" could potentially be represented by "zzz" (unlikely, but possible!). The cypher key is numeric and can be composed of thousands of digits.


PS - here's a sample cypher from the simple version:
kgt+,x*6]1io^t7nq8 w|6a⌂XyOi%f+(~~z{o$u⌂^e{⌂m1,\ )ozdx% `l{'⌂" xhvv
Note the "~~" and "vv", which are actually "ro" and "e!" in the clear text. There are three occurrences of "ss" in the original text, yet they cypher to 'o$', ')o', and '~"'. This cypher used a 3-digit key with a 13-byte rotation, although I can use byte rotations that number in the thousands (I usually use 32768 when encoding files).
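
For anyone wondering how a repeated letter pair can come out as different symbols, here is a minimal Python sketch of a position-dependent shift cypher. To be clear, this is NOT the unpublished algorithm described above. The function names, the printable-ASCII alphabet, and the way the key digits are combined with the rotation counter are all assumptions made up for illustration. It only shows the general idea that the shift applied to a character depends on where it sits in the text.

```python
# Illustrative sketch only: a position-dependent shift over printable ASCII.
# The shift schedule below is a hypothetical stand-in, not the real CODEC.

PRINTABLE_START = 32  # ord(' ')
PRINTABLE_SIZE = 95   # ' ' through '~'


def _shift(i, key, rotation):
    """Assumed schedule: the key digit for position i plus a counter
    that wraps around every `rotation` characters."""
    return int(key[i % len(key)]) + (i % rotation)


def _transform(text, key, rotation, sign):
    if not key.isdigit():
        raise ValueError("key must be a non-empty string of digits")
    out = []
    for i, ch in enumerate(text):
        code = ord(ch) - PRINTABLE_START
        if not 0 <= code < PRINTABLE_SIZE:
            raise ValueError(f"unsupported character: {ch!r}")
        # Shift within the 95-character alphabet, wrapping at the ends.
        code = (code + sign * _shift(i, key, rotation)) % PRINTABLE_SIZE
        out.append(chr(code + PRINTABLE_START))
    return "".join(out)


def encode(text, key, rotation=13):
    return _transform(text, key, rotation, +1)


def decode(text, key, rotation=13):
    return _transform(text, key, rotation, -1)


if __name__ == "__main__":
    secret = encode("pass the glass, miss", "271")
    print(secret)
    print(decode(secret, "271"))
```

With a 3-digit key and a rotation of 13, the shift pattern in this sketch only repeats every 39 characters, so the same letter usually lands on different symbols, and two different letters can land on the same one.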

Edited by Glenn Barnas (2010-03-23 09:09 PM)
Edit Reason: added example
Actually I am a Rocket Scientist! :D