Code Break News

Copiale Cipher Cracked!

"When you get a new code and look at it, the possibilities are nearly infinite. Once you come up with a hypothesis based on your intuition as a human, you can turn over a lot of grunt work to the computer," said computer scientist Kevin Knight of the USC Viterbi School of Engineering, part of the international team that finally cracked the Copiale Cipher.

After trying 80 languages, the cryptography team realized the Roman characters were "nulls," intended to mislead the reader. It was the abstract symbols that held the message.

The team then tested the hypothesis that abstract symbols with similar shapes represented the same letter, or groups of letters. Eventually, the first meaningful words of German emerged: "Ceremonies of Initiation," followed by "Secret Section."

For more information about the method of decipherment, visit http://stp.lingfil.uu.se/%7Ebea/copiale/

Knight is now targeting other coded messages, including ciphers sent by the Zodiac Killer, a serial murderer who sent taunting messages to the press and has never been caught. Knight is also applying his computer-assisted codebreaking software to other famous unsolved codes such as the last section of "Kryptos," an encrypted message carved into a granite sculpture on the grounds of CIA headquarters, and the Voynich Manuscript, a medieval document that has baffled professional cryptographers for decades.

But for Knight, the trickiest language puzzle of all is still everyday speech. A senior research scientist in the Intelligent Systems Division of the USC Information Sciences Institute, Knight is one of the world's leading experts on machine translation -- teaching computers to turn Chinese into English or Arabic into Korean. "Translation remains a tough challenge for artificial intelligence," said Knight, whose translation software has been adopted by companies such as Apple and Intel.

With researcher Sujith Ravi, who received a PhD in computer science from USC in 2011, Knight has been approaching translation as a cryptographic problem, which could not only improve human language translation but could also be useful in translating languages that are not currently spoken by humans, including ancient languages and animal communication.

Mathematics of ancient carvings reveals lost language

11:24 01 April 2010 by Kate Ravilious

Thanks to Kate Ravilious of New Scientist for writing this article. Be sure to check out the link to her full story.

From New Scientist Online. The full article is available on their website.

Elaborate symbols and ornate depictions of animals carved in stone by an ancient Scottish people have given up their secret – to mathematics. Statistical analysis reveals that the shapes are a forgotten written language. The method could help interpret many other enigmatic scripts – and even analyse animal communication.

Conventional statistical methods for analysing scripts calculate the entropy or "orderedness" of the symbols: Shakespeare's prose would have a higher entropy than Egyptian hieroglyphs or Morse code, for example. However, such analysis only works for datasets large enough to capture most of the vocabulary in a language.
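As a rough illustration of the entropy measure mentioned above, here is a minimal sketch that computes the Shannon entropy of a sequence of symbols, in bits per symbol. It treats each symbol independently, which is an assumption made for illustration; it is not the method Lee and colleagues used, and the function name `shannon_entropy` is hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Return the Shannon entropy of a symbol sequence, in bits per symbol.

    Each symbol is treated independently (no pairs or context), which is
    a simplifying assumption for this sketch.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the observed symbols.
    return sum((n / total) * log2(total / n) for n in counts.values())


# A sequence that repeats one symbol carries no information per symbol;
# a sequence that uses many symbols equally often carries more.
repetitive = shannon_entropy("aaaaaaaa")
varied = shannon_entropy("abcdabcd")
```

The caveat in the text applies here too: with only a handful of symbols, the observed counts are a poor estimate of the true frequencies, so the result is unreliable for very short inscriptions.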

To overcome this problem, Rob Lee of the University of Exeter, UK, and colleagues have devised a way to compare small undeciphered scripts with known texts. The team compared symbols created by the Picts – a Scottish Iron Age society that flourished from the fourth to the ninth centuries AD – with over 400 known ancient and modern language texts.

The most common pairings of symbols used by the Picts are the mirror and comb, the crescent/"V rod" and mirror, and the crescent/"V rod" and Pictish beast (Image: Rob Lee)

Number crunching

They standardize the texts by calculating their ratio of paired characters to single characters. They then insert this term into a two-stage calculation. The first stage measures how repetitive a script is: Pictish turned out to be much less repetitive than pictorial scripts and codes, strongly indicating that it was a written language, rather than religious imagery or heraldic arms as has been speculated in the past.

The next part of the calculation reveals the difference between words, syllables and letters. Pictish symbols that were contrasted with texts analyzed at the level of whole words were found to be comparable to a modern language with a small vocabulary. "It's equivalent to the language used in the 'Janet and John' learning-to-read books," says Lee.

The meaning of the Pictish words is still a mystery, but the researchers suspect the stones are memorials to the dead. Contemporary stones carved in Old English and Latin have been found across the UK.

Unlocking other languages

Katherine Forsyth, an expert on Pictish symbols at the University of Glasgow, is delighted with the findings. "It confirms exactly what I have deduced, but uses a rigorous and context-free method to do so," she says.

Rajesh Rao of the University of Washington in Seattle is also enthusiastic. Last year he used entropy analysis to study the undeciphered script of the ancient Indus valley civilisation and concluded that it was a written language. Now he has applied the new technique to his Indus data. "[Lee and colleagues'] method predicts that the Indus symbols represent words rather than heraldic or political symbols, which is consistent with our earlier work suggesting that the Indus script represents language," he says.

Lee and his colleagues are now keen to analyse other undeciphered ancient scripts, such as the "cup and ring" marks from the northern UK and Bronze Age petroglyphs from Scandinavia.

They think the technique could also be adapted to analyse animal communication, assessing how much meaning dolphins can convey with their whistles, for example.


Examples of Pictish symbols found in Scotland (Image: Patrick Dieudonne/Robert Harding/Rex Features)

************************************************************

Two Centuries On, a Cryptologist Cracks a Presidential Code

Unlocking This Cipher Wasn't Self-Evident; Algorithms and Educated Guesses

By RACHEL EMMA SILVERMAN

(The complete text, with interactive graphics of the actual cipher as penned by Robert Patterson, is on the Wall Street Journal site.)

For more than 200 years, buried deep within Thomas Jefferson's correspondence and papers, there lay a mysterious cipher -- a coded message that appears to have remained unsolved. Until now.

The cryptic message was sent to President Jefferson in December 1801 by his friend and frequent correspondent, Robert Patterson, a mathematics professor at the University of Pennsylvania. President Jefferson and Mr. Patterson were both officials at the American Philosophical Society -- a group that promoted scholarly research in the sciences and humanities -- and were enthusiasts of ciphers and other codes, regularly exchanging letters about them.

In this message, Mr. Patterson set out to show the president and primary author of the Declaration of Independence what he deemed to be a nearly flawless cipher. "The art of secret writing," or writing in cipher, has "engaged the attention both of the states-man & philosopher for many ages," Mr. Patterson wrote. But, he added, most ciphers fall "far short of perfection."

To Mr. Patterson's view, a perfect code had four properties: It should be adaptable to all languages; it should be simple to learn and memorize; it should be easy to write and to read; and most important of all, "it should be absolutely inscrutable to all unacquainted with the particular key or secret for decyphering."

Mr. Patterson then included in the letter an example of a message in his cipher, one that would be so difficult to decode that it would "defy the united ingenuity of the whole human race," he wrote.

There is no evidence that Jefferson, or anyone else for that matter, ever solved the code. But Jefferson did believe the cipher was so inscrutable that he considered having the State Department use it, and passed it on to the ambassador to France, Robert Livingston.

The cipher finally met its match in Lawren Smithline, a 36-year-old mathematician. Dr. Smithline has a Ph.D. in mathematics and now works professionally with cryptology, or code-breaking, at the Center for Communications Research in Princeton, N.J., a division of the Institute for Defense Analyses.

A couple of years ago, Dr. Smithline's neighbor, who was working on a Jefferson project at Princeton University, told Dr. Smithline of Mr. Patterson's mysterious cipher.

Robert Patterson (Image: University of Pennsylvania Archives)

Dr. Smithline, intrigued, decided to take a look. "A problem like this cipher can keep me up at night," he says. After unlocking its hidden message in 2007, Dr. Smithline described his puzzle-solving techniques in a recent paper in the magazine American Scientist; they are also covered in a profile in Harvard Magazine, his alma mater's alumni journal.

The code, Mr. Patterson made clear in his letter, was not a simple substitution cipher. That's when you replace one letter of the alphabet with another. The problem with substitution ciphers is that they can be cracked by using what's termed frequency analysis, or studying the number of times that a particular letter occurs in a message. For instance, the letter "e" is the most common letter in English, so if a code is sufficiently long, whatever letter appears most often is likely a substitute for "e."
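The frequency-analysis idea can be sketched in a few lines. The example below counts letters in a ciphertext and guesses that the most common one stands for "e". The helper names and the sample sentence are hypothetical, and the demonstration uses a simple three-letter shift as a stand-in for a substitution cipher; real messages need to be much longer for the guess to be reliable.

```python
from collections import Counter
import string


def letter_frequencies(ciphertext):
    """Return (letter, count) pairs, most common first, ignoring non-letters."""
    letters = [c for c in ciphertext.lower() if c in string.ascii_lowercase]
    return Counter(letters).most_common()


def guess_e(ciphertext):
    """Guess which ciphertext letter stands for 'e': the most frequent one."""
    ranked = letter_frequencies(ciphertext)
    return ranked[0][0] if ranked else None


# Demonstration with an illustrative substitution: shift every letter by 3.
alphabet = string.ascii_lowercase
shift3 = str.maketrans(alphabet, alphabet[3:] + alphabet[:3])

plain = "meet me here we seek eleven secret letters"
cipher = plain.translate(shift3)

# 'e' dominates the plaintext, so its substitute dominates the ciphertext.
guessed = guess_e(cipher)
```

Once one letter is pinned down, the same ranking suggests candidates for the next most common letters, which is why simple substitution offers little protection for long messages.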

Our Thanks to Rachel Emma Silverman of the Wall Street Journal for this article and accompanying graphics. (July 2, 2009)

Because frequency analysis was already well known in the 19th century, cryptographers of the time turned to other techniques. One was called the nomenclator: a catalog of numbers, each standing for a word, syllable, phrase or letter. Mr. Jefferson's correspondence shows that he used several code books of nomenclators. An issue with these tools, according to Mr. Patterson's criteria, is that a nomenclator is too tough to memorize.

Jefferson even wrote about his own ingenious code, a model of which is at his home, Monticello, in Charlottesville, Va. Called the wheel cipher, the device consisted of cylindrical pieces, threaded onto an iron spindle, with letters inscribed on the edge of each wheel in a random order. Users could scramble and unscramble words simply by turning the wheels.

But Mr. Patterson had a few more tricks up his sleeve. He wrote the message text vertically, in columns from left to right, using no capital letters or spaces. The writing formed a grid, in this case of about 40 lines of some 60 letters each.

Then, Mr. Patterson broke the grid into sections of up to nine lines, numbering each line in the section from one to nine. In the next step, Mr. Patterson transcribed each numbered line to form a new grid, scrambling the order of the numbered lines within each section. Every section, however, repeated the same jumbled order of lines.

The trick to solving the puzzle, as Mr. Patterson explained in his letter, meant knowing the following: the number of lines in each section, the order in which those lines were transcribed and the number of random letters added to each line.

The key to the code consisted of a series of two-digit numbers. The first digit indicated the line number within a section, while the second was the number of letters added to the beginning of that row. For instance, if the key was 58, 71, 33, that meant that Mr. Patterson moved row five to the first line of a section and added eight random letters; then moved row seven to the second line and added one letter, and then moved row three to the third line and added three random letters. Mr. Patterson estimated that the number of potential combinations to solve the puzzle was "upwards of ninety millions of millions."
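The steps described above can be sketched in code. This is a reconstruction from the article's description only, not a transcription of Mr. Patterson's own procedure: the function names are hypothetical, the grid height of 14 rows is arbitrary, and the handling of a short final section (lines the section does not have are skipped) is an assumption. The key used is the one the article reports Dr. Smithline recovered.

```python
import random
import string


def write_in_columns(message, n_rows):
    """Write the letters down successive columns; return the grid's rows."""
    letters = [c for c in message.lower() if c in string.ascii_lowercase]
    # Letter k lands in row k % n_rows, so row r holds every n_rows-th letter.
    return ["".join(letters[r::n_rows]) for r in range(n_rows)]


def read_columns(rows):
    """Read a grid back column by column, top to bottom."""
    width = max((len(r) for r in rows), default=0)
    return "".join(r[c] for c in range(width) for r in rows if c < len(r))


def encipher(rows, key, rng):
    """Reorder the lines of each section and prefix random letters."""
    out = []
    size = len(key)
    for start in range(0, len(rows), size):
        section = rows[start:start + size]
        for line, pad in key:
            if line <= len(section):  # assumption: skip lines a short section lacks
                junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(pad))
                out.append(junk + section[line - 1])
    return out


def decipher(cipher_rows, key):
    """Undo encipher: strip the padding and restore the line order."""
    out = []
    size = len(key)
    for start in range(0, len(cipher_rows), size):
        chunk = cipher_rows[start:start + size]
        used = [(line, pad) for line, pad in key if line <= len(chunk)]
        section = [""] * len(chunk)
        for (line, pad), row in zip(used, chunk):
            section[line - 1] = row[pad:]
        out.extend(section)
    return out


# Key reported in the article: 13, 34, 57, 65, 22, 78, 49.
key = [divmod(k, 10) for k in (13, 34, 57, 65, 22, 78, 49)]
message = "In Congress, July Fourth, one thousand seven hundred and seventy six."

rows = write_in_columns(message, 14)  # 14 rows is an arbitrary choice
cipher_rows = encipher(rows, key, random.Random(0))
recovered = read_columns(decipher(cipher_rows, key))
```

With the key in hand, decipherment is mechanical. Without it, a solver must guess the section size, the line order and each line's padding at once, which is where Mr. Patterson's large count of combinations comes from.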

After explaining this in his letter, Mr. Patterson wrote, "I presume the utter impossibility of decyphering will be readily acknowledged."

Undaunted, Dr. Smithline decided to tackle the cipher by analyzing the probability of digraphs, or pairs of letters. Certain pairs of letters, such as "dx," don't exist in English, while some letters almost always appear next to a certain other letter, such as "u" after "q".

To get a sense of language patterns of the era, Dr. Smithline studied the 80,000 letter-characters contained in Jefferson's State of the Union addresses, and counted the frequency of occurrences of "aa," "ab," "ac," through "zz."
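Counting digraphs is straightforward to sketch. The function below tallies every adjacent letter pair in a text after dropping spaces and punctuation, matching the cipher's lack of spaces. The function name and the short sample are hypothetical; the corpus Dr. Smithline used is not reproduced here.

```python
from collections import Counter
import string


def digraph_counts(text):
    """Count adjacent letter pairs, ignoring case, spaces and punctuation."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


# Tiny stand-in corpus, for illustration only.
sample = "the quick question"
counts = digraph_counts(sample)

# Pairs that never occur simply have a count of zero.
qu_count = counts["qu"]
dx_count = counts["dx"]
```

Because spaces are removed before counting, pairs that straddle a word boundary (such as "kq" in the sample) are counted as well, which is what a solver needs when the cipher text itself has no word breaks.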

Dr. Smithline then made a series of educated guesses, such as the number of rows per section, which two rows belong next to each other, and the number of random letters inserted into a line.

To help vet his guesses, he turned to a tool not available during the 19th century: a computer algorithm. He used what's called "dynamic programming," which solves large problems by breaking puzzles down into smaller pieces and linking together the solutions.
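One way a single guess might be scored is sketched below. Because the message was written down the columns, the letter directly below another in the original grid is the next letter of the text, so a correct guess about which row sits under which should produce plausible digraphs. This covers only the scoring of one guess; it is not the dynamic-programming search itself, and the function names, the add-one smoothing and the tiny corpus are all assumptions made for illustration.

```python
from collections import Counter
from math import log
import string


def digraph_counts(text):
    """Count adjacent letter pairs, ignoring case, spaces and punctuation."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


def adjacency_score(upper, lower, counts, pad_upper=0, pad_lower=0):
    """Mean log-probability of the vertical digraphs formed by two rows.

    pad_upper and pad_lower are the guessed numbers of random letters at
    the start of each row. Add-one smoothing (an assumption of this
    sketch) keeps unseen digraphs from scoring minus infinity.
    """
    total = sum(counts.values())
    pairs = list(zip(upper[pad_upper:], lower[pad_lower:]))
    if not pairs:
        return float("-inf")
    logs = [log((counts[a + b] + 1) / (total + 26 * 26)) for a, b in pairs]
    return sum(logs) / len(logs)


def best_lower_row(upper, candidates, counts):
    """Pick the candidate row most likely to sit directly below `upper`."""
    return max(candidates, key=lambda row: adjacency_score(upper, row, counts))


# Tiny stand-in corpus, for illustration only.
counts = digraph_counts("the queen thought the quiet thing")

# "tqt" over "huh" spells th, qu, th down the columns; "xzx" gives tx, qz, tx.
choice = best_lower_row("tqt", ["xzx", "huh"], counts)
```

A full solver would repeat this scoring over every candidate row pair and padding amount and then combine the best local choices into one consistent key, which is the part the article attributes to dynamic programming.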

The overall calculations necessary to solve the puzzle were fewer than 100,000, which Dr. Smithline says would be "tedious in the 19th century, but doable."

After about a week of working on the puzzle, the numerical key to Mr. Patterson's cipher emerged -- 13, 34, 57, 65, 22, 78, 49. Using that digital key, he was able to unfurl the cipher's text:

"In Congress, July Fourth, one thousand seven hundred and seventy six. A declaration by the Representatives of the United States of America in Congress assembled. When in the course of human events..."

That, of course, is the beginning -- with a few liberties taken -- to the Declaration of Independence, written at least in part by Jefferson himself. "Patterson played this little joke on Thomas Jefferson," says Dr. Smithline. "And nobody knew until now."