Welcome to Anoraks Corner, an introduction to the statistics used to attack Lorenz
The breaking of Lorenz depended on finding the wheel start positions by doing
counts of various cross correlation runs down the whole length of the cipher
text. The basis for this was that when patterns of bits with the correct
start positions were added to the cipher bits, some of the enciphering
bits generated by the Lorenz machine were stripped off revealing a non-random
count due to the plain language non-uniformity.
BREAM PLAIN LANGUAGE
............................................................H
........................................................H.H.H
........................................................H.H.H
........................................................H.H.H
........................................................H.H.H
........................................................H.H.H
........H...............................................H.H.H
........H...............................................H.H.H
........H.........................H.....................H.H.H
H.......H...............H.H.......H.....................H.H.H
H.......H.......H.......H.H.......H.H.H.................H.H.H
H.....H.H...H.H.H.....H.H.H.H...H.H.H.H.H.H.H.....H.....H.H.H
H.H.H.H.H.H.H.H.H...H.H.H.H.H.H.H.H.H.H.H.H.H.H...H.....H.H.H
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 3 4 - + . /
Notice the usual high counts of letters A E R, in German as in English,
and the very large counts of teleprinter editing characters, figure/letter shift,
carriage return etc., at the right of the histogram.
However, the Germans had chosen the wheel patterns on the Lorenz cipher machine
so that the cipher text was nearly flat random
BREAM CIPHER
..................H.........H.........H...H...............H.H..
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 3 4 - + . /
But the Lorenz cipher machine added two obscuring characters to the plain
text to get the cipher text. These characters were generated by two sets
of 5 wheels, known in Bletchley Park as the Chi(K) wheels and the Psi(S) wheels.
Of these two the Chi wheels moved regularly but the Psi wheels moved intermittently
under control of the two so-called motor wheels. Thus stripping off the Chi
wheel added characters was the first part of the attack on the cipher. When this
had been done with the correct Chi patterns and correct Chi starts, the
result was known as the De-Chi.
BREAM DECHI
............H.H.......H.............H.....H.....H.H...H........
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 3 4 - + . /
This is also nearly flat random. However what Bill Tutte found was that
if the DELTA was used things were very different. The DELTA results when
one character is added to the next. If these are the same this gives /
(all dots)
..............................................................H
..............................................................H
..............................................................H
..............................................................H
..............................................................H
..............................................................H
..............................................................H
..............................................................H
..............................................................H
..................H.....................H.................H...H
..................H.....................H.................H...H
............H.....H.....................H...........H...H.H...H
H.........H.H.....H.H.H.....H...H.H.H.H.H.......H.H.H.H.H.H...H
H...H...H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 3 4 - + . /
Again the uneven distribution of Delta plain language shows, but what also
shows is the very large count of "/" resulting from the large number of repeated
characters in the language and in the teleprinter editing characters.
Whilst the Delta cipher text is still nearly flat random:
DELTA BREAM CIPHER
......H.....H.....H...H.H...H...........H...........H..........
H.H.H.H.H.H.H.H.H.H.H.H.H...H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 3 4 - + . /
The Delta of the DeChi shows a residue of the excess of the "/" character
DELTA BREAM DECHI
..............................................................H
..............................................................H
....................H...H...H...H.......H...............H.H...H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H.H
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 3 4 - + . /
If the Chi wheel patterns are known, then the correct start positions can
be found by trying every possible combination of wheel starts, adding
the resultant characters to the cipher text and counting the number of
resulting "/"s. This will be a maximum when the pattern starts are correct for
the cipher text.
But the Chi wheels have lengths of 41, 31, 29, 26 and 23 so the total
combination of possible wheel starts is about 22 Million!! so even today
exhaustive search would take a long time.
However what Bill Tutte also found was that it was possible to do the search
in smaller segments. If wheels K1 and K2 are taken first, this is only
1271 possible starts. If K4 and K5 are then set given K1 and K2 in the correct
position, this is only 598 starts and if then K3 is set given all the others
correct this is only 29 starts. So the resultant start search space is
only 1,898 compared to 22 Million!
It was the discovery of this partitioned method of attack which allowed
Colossus to find the Chi wheel starts on a cipher text in about 30 minutes.
When a score (count) had been obtained it was vital to be able to tell whether
this score could have occured by random chance or whether it was a sufficiently
significant score to indicate the finding of the correct wheel start position.
The measure of "goodness" of a score was how far it was away from the expected
random score for the algorithm being run.
For the setting algorithm for K1 and K2, 1+2 = (dot), random probabilty
p = 1/2
for the setting algorithm, 1=2=/4=5, for K4 and K5, random probability p = 1/8
for a single character count of "/", as used for setting K3, random probability p = 1/32
(In the following x is times, sqrt is square root, ^ is to the power of,
/ is divided by ).
if n = number of characters in cipher text
random (mean) score r = n x p
(1) standard deviation, sigma = sqrt(n x p x (1 - p))
maximum score = c the probability of this score occuring randomly is:-
(2) p x e^-(((c-r)^2) / 2(sigma^2))
If c is expressed as r + (F x sigma) this reduces to:-
(3) probability of score = p x e^-((F^2) / 2)
This enables the probability of a score to be tabulated against F, the number
of standard deviations the score is away from the mean or random score.
The odds, o = 1/probability
(4) o = e^((F^2)/2)
(5) deciban of odds = (F^2) x log10(e)/2 = (F^2) x 4.34/2 = (F^2) x 2.17
result decibans = decibans "on" due to F - decibans "against" due to wheels
(6) result decibans = 2.17 x F^2 - 10log10 w + 3
where w = total wheel length search space
for wheels 1 and 2 w = 1271 = 30.8 decibans
for wheels 4 and 5 w = 598 = 27.8 decibans
so for a result 50 : 1 on (certain) i.e. 17 decibans, F needs to be 4.5 for
wheels 1 and 2
for 6:1 on (good) i.e. 8 decibans, F needs to be 4.0 for wheels 1 an 2
but another way of looking at this is as follows:-
F x sigma = result score bulge = c - r = lambda
then F = lambda/sigma and sigma for two wheels = 1/2 x (n)^1/2
(7) so F = 2 x lambda x (n)^-1/2
then from (4) odds = o = 4 (lambda)^2 / n
(8) count decibans = 2.17 x 4 x (lambda)^2 / n
(9) result decibans = 8.68 x (c - r)^2 / n - 10log10 w + 3
This gives the result decibans directly from the count bulge and wheel length
without having to calculate sigma.
Some runs result in counts that are very close and all significantly
away from the random. These are contenders for the correct wheel start
positions and have to be resolved by further runs involving the disputed
wheels.
Scenario 2 shows a good example of this.
(With grateful thanks to the Walter Fried Fish Notes without which I would
have had difficulty presenting this)
This page was originally created by the late Tony Sale, the original curator of the Bletchley Park Museum,