Homework 4 Notes

Home > CS178 > Homework 4

I wasn't planning on posting notes this week but here goes...

Some people are having problems deciphering the homework problem, though they can do last year's sample (from this site). As far as I could tell, my program works for both, so it SHOULD be possible. If you have a little bug in your final correlation coefficient step, try printing out the highest values instead of just taking the very highest value. From that, you might see a word or other reasonable key.

> I'm done with the program =), but I hate not understanding what I'm doing
> What exactly is p(t-j) doing? The one gram probability minus the
> alphabetical letter you are computing a probability for?
>
> Why?

remember that whatever's inside the parentheses is what you're computing
the onegram of.  so it's the onegram of p-j, not (onegram of p) minus j.

In any case, doing p-j is like decrypting... you're in effect doing a
trial decryption.  So p is ciphertext, j is key, and p-j is giving you
plaintext.  that's why onegram makes sense... onegram is for English, and
you're using it with decrypted text.

> > I guess I also don't understand the two lines above the table, the first
> > y0 to yn-1-j and the second yj to yn-1. I thought we compared the whole
> > ciphertext against the whole shifted ciphertext (except for dangling head
> > and tail chunks that have no partner letter due to the shift) to find
> > coincidences. Are we supposed to cut it in half and shift one half?
>
> no

When (I think) Ted said to shift the ciphertext and compare, that was
supposed to be a general bird's-eye view of what to do.  When you're
coding it, you can just index into the ciphertext array at different
places and compare the letters to see if they are the same.  You don't
actually shift, cut, etc. the character array.

> > i have the key.
> >
> > and when i'm trying to subtract the ciphertext letter - key letter
> > i get something that does look english.
> >
> > what i'm i doing wrong?
> >
>
> If the key is WINE, build a table like this:
>
>  ABCDEFGHIJKLMNOPQRSTUVWXYZ
> Wwxyzabcdefghijklmnopqrstuv
> Iijklmnopqrstuvwxyzabcdefgh
> Nnopqrstuvwxyzabcdefghijklm
> Eefghijklmnopqrstuvwxyzabcd
>
> If the ciphertext is:
>
> omcenigi
>
> WINEWINE
>
> SEPARATE
>
> Use key 'W' for 'o' to get S.

Ted's method is exactly how the Vignere's table is supposed to work.
Once you understand that, you could try to do the following.

Every letter can be thought of as a number from 0 to 25, with A=0 and
Z=25.  Conversely, every number 0..25 can be thought of as a letter.

1. Make a function ord() that takes a char and returns the ordinal value
of the character, from 0 to 25.  In pseudocode, you might have something
like

ord(c) { return (int)tolower(c)-(int)'a'; }

2. Make a function chr() that does the opposite.

3. For every ciphertext char, subtract the correct key letter:

  chr(ord(ciphertext_letter)-ord(key_letter))

Of course, all math is done modulo 26, so add 26 if the difference is
negative.

> > also if you have two letters that are the same , when u subtract them
> > you will get 0 does this 0 mean a space?

From the above, we see that 0 means the letter A.

Once you think you have the key, try deciphering a few letters by hand.
Write out the numerical(ordinal) positions of the letters, and it should
go pretty quickly.  Having deciphered the first word or two, you can then
check your work against your program, and use the program to decipher the
rest.

> Could you clarify the correlation equation?  I've implemented code that
> performs the equation on part 4 from the slides (~konheim/CS178/VIG.pdf), but I
> keep getting nearly the same values when testing the example.  Should I use the
> equation on part 6?  Then what is S(i)?  Does the following look like it needs
> to be fixed (generally)?
>
> for(...), for(...), for(...)
> if( k-j >= 0 )
> correlation[i][j] += subfiles[i][k] * onegm[k-j] / size[i];
> else
> ... onegm[26-j] ...;
>
> Where subfile contains the number of occurances of a letter donoted by 'j', 'i'
> is the different subfiles, onegm is based on the actual cyphertext
> probablilities, and size is the number of letters in the subfile.

I hope onegm is based on plaintext, and not ciphertext, probabilities.
Secondly, instead of doing onegm[26-j] do onegm[k-j+26].

Remember that you are still doing k-j ... it's just that you are doing mod
26 math, so you would add 26 to the result, not replace k with 26.

If you fix that, there will be a much bigger chance of it working.
You might get similar values... as long as they aren't all the same.
There should be a pretty obvious max value for each subfile, and Konheim's
key is usually a word (though I won't tell you if it is in this case ;)

> > Hi, I'm doing homework 4 and I need to find the kappa-value; in the reader is the formula:
> > (1/n) * sum of all values i=0 to i=n-1-s of (X(yi+s=yi))
> >
> > What is that "X" part?  I'm not sure what it is I'm summing.

y    is the ciphertext
yi+s is the character at position i+s in the ciphertext
yi   is the character at position i in the ciphertext
X    is 1 if the two are the same character, and 0 otherwise

Keep the big picture in mind: you are counting the number of times that
ciphertext characters that are s apart match.

so if s is 2, then

y=sadadf


when i is 0, you have no match (comparing s and d) so X = 0
when i is 1, you have a match (comparing a and a)  so X = 1
when i is 2, you have a match (comparing d and d)  so X = 1
when i is 3, you have no match (comparing a and f) so X = 0

So for this ciphertext, if s = 2, then the sum of all your X values is 2.

For the final kappa value, you divide this sum by the length of the
ciphertext.

Compute kappa for all values s (i recommend going from 1 to 30).  then
find local maxima etc. etc. as shown in section/class/etc.

Top | Home > CS178 > Homework 4