Home > CS178 > Homework 2

>       I'm looking at homework assignment 2 and am scratching my head. I 
> understand the concept of using the probability of letters appearing 
> next to each other to determine the order of columns. Given these 
> probabilities I feel confident I can determine the column orders. It's 
> finding these probabilities that I find troublesome. Is there a table 
> somewhere with the probability of characters appearing next to each 
> other? I am assuming the markov1 file on the website is the frequency 
> of each character appearing in text, given there are 26 numbers. 
> Markov2 however, has a table of all 26 letters but only 13 entries for 
> each letter.

First of all please remember to always address your query to both TAs
for all of the following reasons: we can both provide input, we both know 
what students are having questions with and can give consistent replies, 
and you have a better chance of getting a quick answer.

That said...

You're right for the most part, but notice there are 52 rows.
Dr. Konheim's 2-gram markov table is structured as follows:

   A....M
  A
  :
  :
  Z
   N....Z
  A
  :
  :
  Z

(The column headers A...M and N...Z are not shown, but the row 
headers make it clear that there is a break in the table).

Hint 1:  Since we don't need to look at your code, you can go into a text 
editor and mangle the table.  For example:

   A....Z
  A
  :
  :
  Z

Only you will know, and no one that knows will care. =)

Hint 2: A simple way of checking to make sure you have your table right is 
to check the value of QU. QU should be one, and since it isn't on the main 
diagonal you will know if you switched your table axes by accident.
To expand on what I said above and to answer the question more clearly, there are in fact two tables that are to be used. Dr. Konheim might have changed naming schemes from markov0,markov1 to markov1,markov2 but the theory is all the same. Refer to Equation 10 in the middle of page 15 of your reader.

The lower numbered markov file (currently at http://www.cs.ucsb.edu/~konheim/CS178/markov0) has 26 values used in scoring the first letter of a word. This is the "pi" that you see in the equation. The higher number markov file (currently at http://www.cs.ucsb.edu/~konheim/CS178/markov1) has 26x26 values that are transitional probabilities. Each entry in this (bigger) table is the probability of finding an character given that you've already seen another character, and is "P" in equation 10.

This is why there is a row of all zeros except for a single 1. In English, Q can only be followed by a U (usually). So given Q, the probability of finding a U is one (and so the entry is 1). Given Q, the probability of finding any other letter is zero.

Equation 10 will be explained in lecture/section in case you have any questions.

> if N=7 then ciphertext =
> 
> 0: Tve__oyp_odaocsun_rhesaoiaiI_r_nstd_iuyrnrpyiI_,th_ee_bgta
> 1: o_riAuna_n_o_iyasi_clcuisalft_iese_astrtmyi_eerdem_x,ost6r
> 2: rwptbmkunatlas)tdirserthrriloepimtidainn_fdmlsn_o_otldkrte
> 3: s_,eecr_9cg_p_aonfo_aemeMaekelhinwcg_pttitma_ntdeenarmanon
> 4: f,ci_e_ast_es_l_en1_oyadnsa_itmt_nTndntanmnhthadr_mcinfeye
> 5: th_uccrt_eeeuiinsdbbuaombr_esppsi_e_nc_tdeh(_ebgienicpp_ep
> 6: to_no_mlf_wtg_rtc.hmhm_may_i_cefa_u.h0yaaloleinsuelc_,t_nm
> 7: .
> 
> If i check 3 and 4 I take characters from 5 and put on the end of 4 when I
> change the offset.
> Now i want to check 5 and 6. I only have the "." to take from 7. Am I
> supposed to continue to take characters from 0 and put at the end of 6??

Nope.  Stop comparing when you run off the end of the ciphertext.
The reason you use values from the next column is because some rows may 
be short and some rows may be long, so you don't know where the boundaries
between the columns are.

However, you're sure that the end of the ciphertext represents the end of
some column, and the start of the ciphertext represents the start of some
column.  Thus, you don't need to break those boundaries (and in fact you 
shouldn't).

Actually, you don't really have to break the ciphertext into different
arrays.  You can use something like the equation on page 15 to index  
directly into your ciphertext string, and it will take care of offsets,
boundaries, etc.  I went over this in office hours today (Thursday).  Come
by tomorrow if you want an explanation.

> I'm horrible at mathimatical notations so I am just double checking what
> this means  YiL+a+k. Now I understand L+a+k are constants but this all 
> happens in the ith column?

Short answer: yes.

Long ansswer:

Y is the array/string/vector/etc. containing the ciphertext.  Everything 
else (in other words, the subscript) is the index into that structure.

i is the column id (j is the other column id)
L is the length of a column
a is the shift used for the current iteration (b is the other shift)
k is the position in the column for the current character.

This is basically a way of looking at a 1-dimensional structure in a 2-d 
way.  You have a single-dimensional array, but you can think of it in 
terms of columns and positions within columns.  i*L is the column, and a+k 
represents where you are within that column.

If something is unclear, send me e-mail and tell your friends the same.  I 
might hold an office hour this weekend if many ppl are still confused.

> Ok, so I think I found the answer to a cyclic shift.  I found blah blah blah 
> blah blah blah blah blah.  Maybe you can confirm this.  The last question I 

What you should do is decipher the text.  If you can do this, then you 
know you have the correct width and permutation!  

> have is about page 17 in the reader.  I am not sure how that inverse 
> thing works.  I couldn't make it to office hours or I would have asked.  
> Thank you in advance!

What you get as a result of making the tables shown on page 16 is the 
inverse ordering up to a cyclic shift.  It doesn't give you the exact 
ordering (make sure you understand why this is the case).

My pet example:

plaintext: Hello, how are you?
N=4
tau=3201
ciphertext: lhayl___?Hooroe,weu

You might find (for example) that

  0 belongs to the right of 1
  1 belongs to the right of 3
  2 belongs to the right of 0
  3 belongs to the right of 2

so the inverse permutation is 

  2310

If finding the inverse permutation seems weird, notice that the four 
statements of position are true for that inverse permutation, e.g. 3 does 
appear to the right of 2, etc.

So now that we have the inverse, how do we get to the actual permutation?

What you do is consider each number from 0 to 3 and see where it appears 
in the inverse permutation.

  Where does 0 appear? position 3
  Where does 1 appear? position 2
  Where does 2 appear? position 0
  Where does 3 appear? position 1

So the permutation is 3201, which is the permutation used to encipher.

Now that you have the permutation, you can do a trial decipherment.  It 
turns out that this is the correct one, but you may have to try several 
because the inverse permutation might be any of the following:

Top | Home > CS178 > Homework 2