Basic text encryption in bash

This will be a repost of an assignment I had in my network security class last semester, mostly to test out embedding code in WordPress since I’ve never used it before. I don’t think this counts as plagiarism but who knows?

There are a variety of online cipher tools that demonstrate different cryptographic algorithms. Visit the website Cipher Tools (rumkin.com/tools/cipher/) and explore the different tools. Select three tools, one of which is mentioned in this chapter (ROT13, One-Time Pad, etc.). Experiment with the three different tools. Which is easy to use? Which is more difficult? Which tool would you justify to be more secure than the others? Why? Write a one-page paper on your analysis of the tools.

I chose to experiment with PlayFair, Atbash Cipher, and Caesarian Shift. To see which is the easiest to use and in the pursuit of answering the questions thoroughly and specifically, I decided to implement each of them in bash and note which was the most complicated to implement and which was most complicated to use for an end user. Since our responses are only supposed to be 300 words, you can go straight to the conclusion.

PlayFair works using a key agreed upon by the message sender and the receiver. Repeated letters in the key are discarded, so each letter appears only once. A 5×5 table containing the letters of the alphabet is used, except that J is generally replaced with I to give 25 letters, which lets us use a 5×5 grid instead of some non-square grid. The keyword is written into the start of the table, and any letters not used in the keyword are added alphabetically after it. An encoding table with the keyword “SECURITY” would look like this (notice J is missing):

S E C U R
I T Y A B
D F G H K
L M N O P
Q V W X Z
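As a minimal sketch of that construction, the key square can be built as a single 25-letter string read row by row. The function name build_square is made up for illustration and is not part of the script further down; it assumes the usual J-to-I convention and ignores anything that is not a letter.

```bash
#!/bin/bash
# Hypothetical helper, not part of the original script: prints the
# 25-letter key square as one string, rows laid end to end.
build_square() (
    alphabet=ABCDEFGHIKLMNOPQRSTUVWXYZ    # 25 letters, no J
    # Uppercase the keyword, fold J into I, then append the alphabet so
    # every letter the keyword did not use follows in order.
    letters=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr 'J' 'I')$alphabet
    square=""
    while [ -n "$letters" ]; do
        ch=${letters%"${letters#?}"}      # first remaining character
        letters=${letters#?}              # drop it from the queue
        # skip digits, spaces and punctuation
        case $alphabet in *"$ch"*) ;; *) continue ;; esac
        # add the letter only if it is not in the square yet
        case $square in *"$ch"*) ;; *) square=$square$ch ;; esac
    done
    printf '%s\n' "$square"
)

build_square SECURITY              # SECURITYABDFGHKLMNOPQVWXZ
build_square SECURITY | fold -w 5  # the same square, five letters per row
```

Keeping the square as one string means a letter's row and column are just its position divided by 5 and the remainder.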

In this example, we will use the SECURITY keyword to encrypt the phrase RAIL. First, the plaintext phrase is split into pairs of two letters. Here, it would be RA IL. If there had been an odd number of characters, the last letter would be padded with an X (or any letter you like). If there had been double letters, the algorithm requires that we either drop the second letter or separate the two identical letters with an X in our plaintext. Next, we look at the table and create a rectangle out of the letter pairs. So the first two letters will be RA. Each letter serves as a corner for the rectangle.
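The pairing step can be sketched on its own. This is only an illustration under stated assumptions: the name make_pairs is hypothetical, the input is assumed to be uppercase letters only, and it takes the "separate doubles with an X" option, whereas the script further down drops the second letter instead. It does not try to handle a doubled X.

```bash
#!/bin/bash
# Hypothetical helper: split uppercase text into Playfair pairs,
# putting an X between doubled letters and padding an odd tail with X.
make_pairs() (
    text=$1
    pairs=""
    while [ -n "$text" ]; do
        a=${text%"${text#?}"}       # first remaining letter
        text=${text#?}
        b=${text%"${text#?}"}       # next letter, empty at the very end
        if [ -z "$b" ] || [ "$a" = "$b" ]; then
            b=X                     # pad; a doubled letter stays for the next pair
        else
            text=${text#?}          # consume the second letter too
        fi
        pairs="$pairs $a$b"
    done
    printf '%s\n' "${pairs# }"
)

make_pairs RAIL       # RA IL
make_pairs BALLOON    # BA LX LO ON
```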

For each letter in the pair, you slide horizontally to the other corner of the rectangle, staying in the letter's own row. R slides to U and A slides to B, so our ciphertext for this pair becomes UB. Then, we move to the next pair: IL. They line up in a column, and the rule is to pick the letter beneath the one you’re encrypting, wrapping back up to the top if needed. Pairs that share a row work the same way, taking the letter to the right instead, but none show up here. Below I is D, and below L is Q. The ciphertext after going through the algorithm is UBDQ.
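The three rules for a single pair can be sketched as follows. The function name encrypt_pair and the SQUARE variable are assumptions made for this example; the square is the one for the keyword SECURITY written as one string, and the pair is assumed to be two different uppercase letters with no J.

```bash
#!/bin/bash
# Key square for the keyword SECURITY, rows laid end to end.
SQUARE=SECURITYABDFGHKLMNOPQVWXZ

# Hypothetical helper: encrypt one two-letter pair.
encrypt_pair() (
    a=${1%?}                              # first letter of the pair
    b=${1#?}                              # second letter of the pair
    pa=${SQUARE%%"$a"*}; pa=${#pa}        # 0-based position of each letter
    pb=${SQUARE%%"$b"*}; pb=${#pb}
    ra=$((pa / 5)); ca=$((pa % 5))
    rb=$((pb / 5)); cb=$((pb % 5))
    if [ "$ra" -eq "$rb" ]; then          # same row: step right, wrapping
        ca=$(( (ca + 1) % 5 )); cb=$(( (cb + 1) % 5 ))
    elif [ "$ca" -eq "$cb" ]; then        # same column: step down, wrapping
        ra=$(( (ra + 1) % 5 )); rb=$(( (rb + 1) % 5 ))
    else                                  # rectangle: swap the two columns
        t=$ca; ca=$cb; cb=$t
    fi
    # cut counts from 1, hence the + 1
    printf '%s%s\n' \
        "$(printf '%s' "$SQUARE" | cut -c "$((ra * 5 + ca + 1))")" \
        "$(printf '%s' "$SQUARE" | cut -c "$((rb * 5 + cb + 1))")"
)

encrypt_pair RA    # UB
encrypt_pair IL    # DQ
```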

Decrypting is the same algorithm in reverse: letters in a shared row or column move left or up instead of right or down, and the rectangle rule is unchanged. PlayFair is an old algorithm and is quite easy to do by hand, which made it popular in times before computer aids. For a handwritten cipher it is fairly strong, but it is not impossible to break and it suffers from having patterns: a given pair of plaintext letters always encrypts to the same pair of ciphertext letters, so it provides very little diffusion. It is easy to use, and likely even a child could understand how it works if they had a decent teacher.
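The script below only encrypts, so here is a rough sketch of the reverse direction. The names cell and playfair_decrypt are hypothetical, and the SECURITY key square is hard-coded as a string. Stepping one place left or up is written as adding 4 modulo 5, which avoids negative numbers.

```bash
#!/bin/bash
# Key square for the keyword SECURITY, rows laid end to end.
SQUARE=SECURITYABDFGHKLMNOPQVWXZ

# Letter at 0-based row $1, column $2 of the square (cut counts from 1).
cell() { printf '%s' "$SQUARE" | cut -c "$(( $1 * 5 + $2 + 1 ))"; }

# Hypothetical helper: decrypt an even-length uppercase ciphertext.
playfair_decrypt() (
    text=$1
    [ $(( ${#text} % 2 )) -eq 0 ] || exit 1   # ciphertext comes in pairs
    out=""
    while [ -n "$text" ]; do
        rest=${text#??}
        pair=${text%"$rest"}                  # next two letters
        text=$rest
        a=${pair%?}; b=${pair#?}
        pa=${SQUARE%%"$a"*}; pa=${#pa}        # 0-based positions
        pb=${SQUARE%%"$b"*}; pb=${#pb}
        ra=$((pa / 5)); ca=$((pa % 5))
        rb=$((pb / 5)); cb=$((pb % 5))
        if [ "$ra" -eq "$rb" ]; then          # same row: step left
            ca=$(( (ca + 4) % 5 )); cb=$(( (cb + 4) % 5 ))
        elif [ "$ca" -eq "$cb" ]; then        # same column: step up
            ra=$(( (ra + 4) % 5 )); rb=$(( (rb + 4) % 5 ))
        else                                  # rectangle: the swap undoes itself
            t=$ca; ca=$cb; cb=$t
        fi
        out=$out$(cell "$ra" "$ca")$(cell "$rb" "$cb")
    done
    printf '%s\n' "$out"
)

playfair_decrypt UBDQ    # RAIL
```

Any padding X added before encryption is still present in the output and has to be removed by the reader.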

#!/bin/bash
#playfair.sh
#Jordan Thomas
#comments added and some error checking removed to make everything
#easier to understand for myself

usage() {
    echo "Usage: `basename $0` keyword encryptable_text"
    echo "Note:  only use uppercase letters or I will break"
    echo "-h    show this message"
}

if [ "$1" == "-h" ]; then
    usage
    exit 0
fi

if [ $# -le 1 ]; then
    usage
    exit 1
fi

keyword=$1
encryptme=$2

#arrays for our key and text to encrypt
declare -a keys
declare -a plaintext

# only uppercase will be implemented for simplicity's sake
characters="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

#take our one dimensional key array and print it in 2D
#just make a new line at every 5th letter, using 5x5 key
printKey() {
    for((i=0; i<${#keys[@]}; i++))
    do
        #key square is 5x5, so put a linebreak every 5th letter
        if [ `echo $i%5 | bc` -eq 0 ]; then
            echo ""
        fi
        echo -n ${keys[$i]}
    done
    echo ""
}

#add letter passed in into the key array
#not doing any sort of error checking assuming it's all good
#do need to check if the letter is already in there though so
#later we do not add a letter that was not in the keyword ($1)
addKeyCharacter() {
    letter=$1

    #rumkin.com says that usually the letter J is removed
    #and replaced by the letter I, so if we get a J change to I
    #this is to give us 25 characters and a 5x5 array
    if [ "$letter" == 'J' ]; then
        letter='I'
    fi

    #If the character has been added do not add, just quit
    if [ `echo ${keys[@]} | grep -c "$letter"` -ne 0 ]; then
        return;
    fi

    keys=( ${keys[@]} $letter )
}

#takes in a character
#returns the index of the key array where the letter is found
#used to work out a letter's row and column in the key square
findLetterInKey() {
    letter=$1
    for((j=0; j < ${#keys[@]}; j++))
    do
        if [ "${keys[$j]}" == "$letter" ]; then
            return $j
        fi
    done
}

#time now to put our keyword into the key square
for((i=1; i <= ${#keyword}; i++))
do
    #range from current character to current character
    #top answer here https://unix.stackexchange.com/questions/9468/how-to-get-the-char-at-a-given-position-of-a-string-in-shell-script
    current=${keyword:`expr $i - 1`:1}
    addKeyCharacter $current
done

#add the letters not in the keyword into the square
#we do the checking if it was already added in the add function
#so nothing special needs to be done here
for((i=1; i <= ${#characters}; i++))
do
    current=${characters:`expr $i - 1`:1}
    addKeyCharacter $current
done

echo "encoding tableau:"
printKey

tmp=""

for((i=1; i <= ${#encryptme}; i++))
do
    letter=${encryptme:`expr $i - 1`:1}
    #letter=`echo $encryptme | cut -c $i`
    #move J to I
    if [ "$letter" == 'J' ]; then
        letter='I'
    fi

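    #skip anything that is not an uppercase letter
    #(-F makes grep treat the character literally, so "." is not a wildcard)
    if [ `echo $characters | grep -cF -- "$letter"` -eq 0 ]
    then
        continue
    fi

    #a doubled letter inside a pair is dropped rather than split with an X
    if [ "$tmp" == "$letter" ]; then
        continue
    fi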

    #make our chunks
    tmp="$tmp$letter"
    #echo "TMP $tmp"
    if [ ${#tmp} -eq 2 ]; then
        plaintext=( ${plaintext[@]} $tmp )
        tmp=""
    fi
done

#pad with an X if a single letter is left over (odd number of letters)
if [ ${#tmp} -eq 1 ]; then
    tmp=$tmp"X"
    #echo "X ADD $tmp"
    plaintext=( ${plaintext[@]} $tmp )
fi
echo ""
echo "pairs: ${plaintext[*]}"
echo "encoded message:"
#iterate and encode using the rectangle technique
for((i=0; i < ${#plaintext[@]}; i++))
do
    letter1=`echo ${plaintext[$i]} | cut -c 1`
    letter2=`echo ${plaintext[$i]} | cut -c 2`

    #below, $? gives the most recent return value
    findLetterInKey $letter1
    p1=$?

    findLetterInKey $letter2
    p2=$?

    row1=`echo "(($p1)/5)" | bc`
    col1=`echo "(($p1)%5)" | bc`

    row2=`echo "(($p2)/5)" | bc`
    col2=`echo "(($p2)%5)" | bc`
    
    #case of same row: take the letter to the right, wrapping around
    if [ $row1 -eq $row2 ]; then
        col1=`echo "(($col1+1)%5)" | bc`
        col2=`echo "(($col2+1)%5)" | bc`

    #case of same column: take the letter below, wrapping around
    elif [ $col1 -eq $col2 ]; then
        row1=`echo "(($row1+1)%5)" | bc`
        row2=`echo "(($row2+1)%5)" | bc`
    
    #regular case, choose corners of rectangle
    else
        tmp=$col1
        col1=$col2
        col2=$tmp
    fi

    p1=`echo "($row1*5)+$col1" | bc`
    p2=`echo "($row2*5)+$col2" | bc`

    letter1=${keys[$p1]}
    letter2=${keys[$p2]}

    echo -n $letter1$letter2
done
echo ""

Next up is the Atbash cipher. It is incredibly simple and similar in spirit to ROT13 or the Caesar cipher I will discuss next, in that each letter is swapped for the letter at a fixed position elsewhere in the alphabet. Atbash differs in that it reverses the alphabet rather than shifting it: each character in the input is swapped for the character the same distance from the other end of the list of characters. For instance, if your input space was all lowercase letters, you would have your possible inputs as such:

 a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
 z  y  x  w  v  u  t  s  r  q  p  o  n  m  l  k  j  i  h  g  f  e  d  c  b  a
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

The top row is an array of unencrypted characters, the second row is the array of ciphered characters, and the bottom is a list of the indexes for each character. This doesn’t have to be arrays, but it’s how I chose to implement it. A hash table would be just as effective, and since the alphabet does not repeat, a second array is not technically needed for this input. One could simply pull the item from array[lengthOfArray - 1 - indexOfPlainTextCharacter] (counting indexes from 0). Because I chose to put both uppercase and lowercase letters in one array, though, that shortcut would have swapped uppercase letters for lowercase ones and vice versa. It would still decrypt correctly, since Atbash is its own inverse: the same function used to encrypt is used to decrypt.

As an example, if we wanted to use the cipher to encrypt “STEAL”, we would get the index at each letter from the top array and grab the matching element from the second array.

This would give us “HGVZO”. Atbash is incredibly easy to implement and even on very weak hardware could be very efficient. Using it is easier than PlayFair as the user does not need to share keys with the receiver, yet this also makes it less secure. As security increases, convenience decreases and vice versa.
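For comparison with the array version below, the same mapping can be sketched with tr, which swaps each letter for the one at the same position in a reversed alphabet. The function name atbash is just a label for this example. Unlike my script, anything that is not a letter passes through unchanged.

```bash
#!/bin/bash
# Hypothetical one-function Atbash: tr maps each letter onto the
# reversed alphabet, keeping uppercase and lowercase separate.
atbash() {
    printf '%s\n' "$1" | tr \
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' \
        'ZYXWVUTSRQPONMLKJIHGFEDCBAzyxwvutsrqponmlkjihgfedcba'
}

atbash STEAL               # HGVZO
atbash "$(atbash STEAL)"   # STEAL, applying it twice undoes it
```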

#!/bin/bash
#atbash.sh
#Jordan Thomas
#very simple implementation of atbash cipher
#Build two arrays, one forward alphabet and one backward
#for each letter in input, pull the corresponding letter
#from the backwards array and print it out

usage() {
    echo "Usage: `basename $0` encryptable_text"
    echo "Note:  only use alphabetical letters"
    echo "-h    show this message"
}

if [ "$1" == "-h" ]; then
    usage
    exit 0
fi

if [ $# -le 0 ]; then
    usage
    exit 1
fi

plaintext=$1
declare -a unencryptedChars
declare -a encryptedChars
unencryptedChars=( {A..Z} )
unencryptedChars+=( {a..z} )
encryptedChars=( {Z..A} )
encryptedChars+=( {z..a} )

#takes in a character
#returns the index of the array where the letter is found
findLetterPos() {
    letter=$1
    # echo "searching for $letter"
    for((j=0; j < ${#unencryptedChars[@]}; j++))
    do
        if [ "${unencryptedChars[$j]}" == "$letter" ]; then
            # echo "found at $j"
            return $j
        fi
    done
}

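for((i=1; i <= ${#plaintext}; i++))
do
    current=${plaintext:$((i - 1)):1}
    #anything that is not a letter is printed unchanged, otherwise
    #findLetterPos would report index 0 and we would print a Z
    if [[ "$current" != [A-Za-z] ]]; then
        echo -n "$current"
        continue
    fi
    findLetterPos "$current"
    index=$?
    echo -n "${encryptedChars[$index]}"
done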
echo ""

Lastly, I looked at the Caesarian Shift. It is ROT13 generalized to an arbitrary shift value. Caesarian Shift is very easy to break: once someone suspects this cipher was used, all they need to do is brute-force it. In my implementation, where only uppercase letters are translated, one would only need to try 25 shifts and look through the results until one made sense, which is guaranteed to find the correct deciphered text. This was the easiest to implement, and incredibly easy to use. It is slightly more secure than Atbash as it does at least need a key (the offset), but due to how easily it can be brute-forced it cannot be recommended for anything more than learning purposes.

In a Caesarian Shift, a B would become an L if we were using 10 as our offset value.
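Both the shift and the brute-force attack described above fit in a few lines. This is a sketch rather than my actual script: the function names caesar and brute_force are made up for the example, and only uppercase letters are shifted.

```bash
#!/bin/bash
# Hypothetical helper: shift the uppercase letters in $1 by offset $2.
caesar() (
    alphabet=ABCDEFGHIJKLMNOPQRSTUVWXYZ
    n=$(( ($2 % 26 + 26) % 26 ))      # normalise, so negative offsets work too
    # take 26 letters starting n places into a doubled alphabet
    shifted=$(printf '%s' "$alphabet$alphabet" | cut -c "$((n + 1))-$((n + 26))")
    printf '%s\n' "$1" | tr "$alphabet" "$shifted"
)

# Hypothetical helper: undo every possible shift and print each candidate.
brute_force() (
    n=1
    while [ "$n" -le 25 ]; do
        printf '%2d: %s\n' "$n" "$(caesar "$1" "-$n")"
        n=$((n + 1))
    done
)

caesar B 10          # L
caesar HELLO 3       # KHOOR
brute_force KHOOR    # 25 candidates; the line for shift 3 reads HELLO
```

With only 25 candidates to read through, the key adds almost nothing once the attacker guesses which cipher was used.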

Below is my implementation of a Caesarian Shift using bash, with tr doing most of the heavy lifting. Again, for the sake of simplicity in implementation, only uppercase letters are handled. As a not necessarily intended consequence, numbers do not break anything; they are just left unshifted.

#!/bin/bash
#caesar.sh
#Jordan Thomas
#Caesar Cipher, basically rot13 with arbitrary rotation distance

usage() {
    echo "Usage: `basename $0` plaintext offset"
    echo "Note:  only use uppercase letters or I will break"
    echo "-h    show this message"
}

if [ "$1" == "-h" ]; then
    usage
    exit 0
fi

if [ $# -le 1 ]; then
    usage
    exit 1
fi

#going past 26 is the same thing and this will ensure that
#we don't go past the end of our available translation characters
#(the extra +26 keeps a negative offset in range as well)
offset=$(( ($2 % 26 + 26) % 26 ))
plaintext=$1

#2 sets of characters to let us shift by up to 26
#if we don't, tr will just repeat the last letter after translation
#this way, letters beyond the 26th are available.
characters="ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"
#encoder="${characters:n}${characters:0:n}"

#there just happens to be a tool on most systems for translating characters
#called tr. 
echo "$plaintext" | tr "${characters:0:26}" "${characters:${offset}:26}"

Conclusions:

In this project, I compared the PlayFair, Atbash, and Caesar ciphers on their ease of use and levels of security.

Cipher     Ease of Use   Relative Security   Recommended?
PlayFair   Decent        Good                Yes, if information loss is ok
Atbash     Very Good     Very Bad            No
Caesar     Good          Very Bad            No

PlayFair was the only algorithm I would consider recommending because it was the only one of the three which provided even decent cryptographic security. It is only useful, however, for word-based messaging. Since a repeated letter pair collapses the rectangle to 1×1, the corners are the same letter and the pair is not encoded. Without modifying the algorithm or the input, the encryption is not complete. The most common way around it is either to insert the letter X between double letters and remove the X’s from your decrypted message later, or to simply omit one of the repeated letters. This means you cannot encrypt your original message as-is, which makes the cipher more difficult to use and prone to user mistakes.

Atbash was the easiest of the three to use as it only required a plaintext input and no key of any sort. Simply feed in the string to encrypt and receive a ciphered message. Since there is no key required, you can send this ciphertext to anyone who knows Atbash was used and they will be able to decrypt it. It’s incredibly easy to defeat though, and somehow the human brain can look at Atbash ciphertext of English text and sense that the letters were flipped about the middle of the alphabet. That’s a very bad trait for a cipher to have. It should be nigh-on impossible to tell what actions were performed on the plaintext.

Caesarian shifts lie somewhere in the middle on ease of use. The only key that needs to be stored is a number describing how far to shift the characters down the alphabet. There is a limited amount of security in that, but once it is known that this shift was applied, it takes at most as many attempts as there are letters in the alphabet to guarantee you have the correct decrypted string.
