Assignment

Problems

Assume you run a testing center where students arrive throughout the day to take the GRE. Your testing center has a classroom of size 2i, for each i ≥ 0. You only haveone proctor; however, so all students at the test center at any given time must be in the same classroom.
You have also noticed that students perform poorly if a room seems empty, so you insist on the rule thatthe room containing the students at any given time must have more full than empty seats.
To accommodate these constraints, you use the following algorithm to handle each new arrival of astudent:
• Start the day using the smallest available classroom (which is size 20 = 1).
• Once you run out of space in a classroom of size 2i (e.g., the room is full and then a new studentarrives), move all of the existing students and the new student to the classroom of size 2 · 2i = 2i+1.
You are concerned about the cost of this room scheduling policy. In particular, you calculate the cost ofeach student arrival to be the number of students that must be moved (including the new student) toaccommodate the arrival. More formally, consider a sequence of n arrivals. Let ci be the cost of the ithstudent arrival. You can define ci = 1 + mi where:
is a power of 2


The four questions that follow ask about this above scenario. The purpose of these questions is to explorehow amortized analysis might help you more accurately understand the arrival costs.
1.

To better understand the worst case behavior of this scheduling algorithm answer the following:
(a) Provide a concise and clear argument why the definition of ci properly captures the arrival costof new students.
(b) Argue that in a sequence of n operations, ci = Θ(n) in the worst case.
2.

Prove that arrivals have a constant amortized cost by using the aggregate analysis technique.
3.

Prove that arrivals have a lognamortized cost, assuming n total students arrive, by using theaccounting method technique.
4.

Refine your answer from the preceding problem to show that arrivals have constant amortized costusing the accounting method.
5.

We discussed the aggregate analysis and accounting method techniques for amortized analysis. Athird such technique is called the potential method. Answer the following question:
Consider a simple unary counter that counts from 1 to k before wrapping back around to 1 (forsome k ≥ 1). One way to implement this counter is with an array A of size k. Initially A[1] = 1 andA[j] = 0, for 2 ≤ j≤ k. A variable p keeps track of the smallest position in the array that currentlystores a 0. If there are no 0’s in the array, then p = k + 1. We initialize p ← 2.
To INCREMENT the counter, there are two cases. If p = k +1, loop through A and set each positionto 0, then set A[1] ← 1 and p ← 2. Otherwise, if p ≤ k, then set A[p] ← 1 and p ← p + 1.
Let ci be the cost of the ithINCREMENT. If we focus only on changing array entries when calculatingcost, we can use the following definition for ci:
(
1 if i is not a multiple of k, ci =k+1 else.
In the worst case, therefore, ci = k + 1. Prove that INCREMENT has constant amortized cost usingthe potential method.
6.

In the notes on hashing, we introduced the load factor α as a way to measure the potentialperformance of a hash table with m entries containing n total elements. Assume we claim to havea great hash function h that guarantees the following: if you use h to implement a hash table withchaining, then every list is always strictly smaller than the load factor. Prove this is impossible.
7.

At the end of the notes on hashing, we discussed implementing hash tables with open addressing.
Open addressing requires an extended hash function that maps each key to a permutation of thevalues 0,1,…,m – 1. Consider the following simple definition of an extended hash function h:
∀k ∈U : h(k) = h0,1,2,…,m – 1i.
Given an open address hash table implemented with h, how many probes are required for anunsuccessful SEARCH, assuming that α<1? 

Solution 

  1. To better understand the worst case behavior of this scheduling algorithm, answer the following:

(a)  Provide a concise and clear argument why the definition of ci properly captures the arrival cost of new students.

Answer:

If the number of students (n) in classroom is power of 2, all students in classroom have to be moved to new classroom with arrival of new student. ith student comes in this time, (i-1) students have to move so the arrival cost of new students is i.

If there are places for new students that arrive, the students who arrived before don’t need to move. So in this case the arrival cost of new students is 1.

(b)  Argue that in a sequence of n operations, ci = Θ(n) in the worst case.

Worst case is when all students have to move to new classroom.

In this case arrival cost is n. It means ci cost is Θ(n) in the worst case. 

  1. Prove that arrivals have a constant  amortized  cost  by  using  the  aggregate  analysis  technique.

At first we need to obtain the upper bound on actual total cost.

If the number of students is power of 2, all the students move to new classroom so that the running time of this operation is linear to n.

The worst is when all the students have to move. The number of students that should be moved to are at most n. For any value of n, the ith operation takes a total of O(n) time.

The average cost of each operation is O(n)/n = O(1), So is the constant amortized cost.

3.Prove that  arrivals  have  a  logn  amortized  cost,  assuming  n  total  students  arrive,  by  using  the accounting method technique.

Only when the number of students in classroom is power of 2, students have to move. The number of students who have to move is at most n.

The MOVMENT operation is performed logn times.

Summing this fact gives the total amortized cost of .

Thus, amortized cost is 

  1. Refine your answer from the preceding problem to show that arrivals have constant amortized cost

using the accounting method.

For the amortized analysis, let us charge an amortized cost of 2 dollars for a new student arrival.

When a new student arrives, we use 1 dollar (out of the 2 dollars charged) to payfor his arrival, and we place the other dollar on the one of the students who arrived beforeexpansion as credit tobe used he moves to new classroom.

At any point in time, at least one student in the classroom has a dollar of credit, and thus we can charge nothing to move him to new classroom; we just pay for the new students’ arrivals. Accordingly, the amount of credit stays nonnegative at all times.

Thus, for n student’s arrival, the total amortized cost is O(n).

The average amortized cost O(n)/n is O(1), that is constant.

  1. We discussed the aggregate analysis and accounting method techniques for amortized analysis. A

third such technique is called the  potential method. Answer the following question:

Consider a simple unary counter that counts from 1 to k before wrapping back around to  1  (for

some k ≥ 1). One way to implement this counter is with an array A of size k. Initially  A[1] = 1 and

A[j] = 0, for  2 ≤ j  ≤ k. A variable  p keeps track of the smallest position in the array that currently

stores a 0. If there are no 0’s in the array, then p = k + 1. We initialize p ← 2.

To INCREMENT the counter, there are two cases. If p = k +1, loop through  A and set each position

to 0, then set A[1] ← 1 and p ← 2. Otherwise, if p ≤ k, then set A[p] ← 1 and p ← p + 1.

Let ci be the cost of the ith INCREMENT. If we focus only on changing array entries when calculating

cost, we can use the following definition for ci:

(1)if i  is not a multiple of  k,  ci = k+1 else.

In the worst case, therefore, ci  = k  + 1. Prove that INCREMENT has constant amortized cost using the potential method.

We define the potential function Ф on an array to be the number of 1s in the array.

For the array with k zeros Do , we have Ф(Do) = 0. Since the number of 1s in the array is never negative, the array  that results after the ith operation has nonnegative potential, and thus

Ф(Di) 0 = Ф(Do)

The total amortized cost of n operations with respect to Ф therefore represents an upper bound on the actual cost.

If INCREMENTAL operation is performed on , then the potential difference is

Ф(Di) – Ф(Di-1) = (i + 1) –i = 1

The amortized cost of this INCREMENTAL operation is

=  + Ф(Di) – Ф(Di-1) = 1 + 1 = 2

The kth operation on the array is to reset all the array members to zero and the potential difference is Ф(Di) – Ф(Di-1) = –k

Thus, the amortized cost of the kth operation is

=  + Ф(Di) – Ф(Di-1) = k – k =0

The amortized cost of each operation is O(1), and thus the total amortized cost of a sequence of n operations is O(n). Since we have already argued that Ф(Di) 0 = Ф(Di-1), the total amortized cost of n operations is an upper bound on the total actual cost. The worst-case cost of n operations is therefore O(n).

Thus INCREMENTAL operation has the constant amortized cost O(1).

  1. In the  notes  on  hashing,  we  introduced  the  load  factor  α  as  a  way  to  measure  the  potential performance of a hash table with m entries containing n total elements. Assume we claim to have a great hash function h that guarantees the following: if you use h to implement a hash table with chaining, then every list is always strictly smaller than the load factor. Prove this is impossible.

We can prove that this is impossible to make every list of entries in hash table is strictly smaller than α.

We adopt α to avoid of multiple keys hashing to the same entry in hash table.

Let’s assume every list – , is strictly smaller than α.

In that case the total number of entries that hash function can handle would be

By inequality above, 

It means, at least two entries will return the same value in the range of [0, …, m].

That’s why it’s absolutely impossible that every list is smaller than .

  1. At the end of the notes on hashing, we discussed implementing hash tables with open addressing.

Open addressing requires an extended hash function that maps each key to a permutation of the

values 0,1,…,m − 1. Consider the following simple definition of an extended hash function h:

∀k ∈U : h(k) = h0,1,2,…,m − 1i.

Given  an  open  address  hash  table  implemented  with  h,  how  many  probes  are  required  for  an unsuccessful SEARCH, assuming that α < 1?  

Given an open address hash table with load factor α = n/m < 1, implemented with a “good”
hash function, the expected number of probes required in an unsuccessful search is at most 1/(1 – α). 

Project 2:

1
A Simple, Encrypted P2P Instant Messenger
youwill be extending your earlier unencrypted messaging application with encryption! We’llcall this new program EncryptedIM.py.
Your program should encrypt messages using AES-256 in CBC mode, and use HMAC with SHA256 formessage authentication. Initialization Vectors should be generated randomly. Your program must use aMAC-then-encrypt scheme.

Your program should have the following command-line options:

python EncryptedIM.py [–s|–c hostname] [–confkey K1] [–authkey K2]
where the –s argument indicates that the program should wait for an incoming TCP/IP connection on port9999; the –c argument (with its required hostname parameter) indicates that the program should connectto the machine hostname (over TCP/IP on port 9999). –confkey specifies the confidentiality key (K1) usedfor encryption, and –authkey specifies the authenticity key (K2) used to compute the HMAC.
To force the keys to be 256 bits long, you may use the SHA-256 hash function on the K1 and K2 parameters.
As an example, you may run “python EncryptedIM.py –s –confkey FOOBAR –authkey CSISAWESOME” in oneterminal window, and then start “python EncryptedIM–c127.0.0.1 –confkey FOOBAR –authkey CSISAWESOME” in another terminal window. Note that theinstance with the –s option must be started before the other instance.
Along with your code, you must submit a brief protocol document in plain ASCII that describes the format of your messages. In particular, the document should describehow/where the IV is transmitted, and the locations of the ciphertext and HMAC in the messages.

Additional requirements and hints.Please make sure that your program conforms to the following:
• Your program should verify that the HMAC is correct. If it is not, it should exit with an error message.You should test that authentication is working properly by specifying different authentication keysfor your client and server instance: this should produce your error message and cause the programto exit!
• You may use only the pycryptolibrary for encryption and authentication. DO NO CREATE YOUROWN AES OR SHA-256 IMPLEMENTATIONS.
• You may use, with proper citations as code comments, code and documentation from https:
//www.dlitz.net/software/pycrypto/.

2
• Your program should not take in any additional command-line options other those described above.The –confkey and –authkey arguments are mandatory; they are not optional.
• Your program can terminate either when the user presses CTRL-C, or when end-of-file (EOF) is
received. To generate EOF from the terminal, press CTRL-D.
• You can test whether your program is actually encrypting data by using tcpdumpto capture trafficon your loopback address using the command tcpdump -i <loopback interface>-w <output file>.
This will listen on the loopback address and create a pcap file at the give file address that you canopen and view with Wireshark. To determine what your loopback interface is, use the commandtcpdump -D.

 UnencryptedIM.py.txt

 import sys

import socket

import select

importargparse

port = 9999

parser = argparse.ArgumentParser()

# arguments are mutually exclusive

group = parser.add_mutually_exclusive_group(required=True)

group.add_argument(“–s”, dest=’server’, action=’store_true’)

group.add_argument(“–c”, dest=’hostname’)

args = parser.parse_args()

#create an INET, STREAMing socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

ifargs.server:

# makesoket non-blocking

sock.setblocking(0)

# reuse socket

sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR,1)

sock.bind((”, port))

# become a server socket

sock.listen(1)

else: # client

# connect to the host

sock.connect((args.hostname, port))

inputs = [sock, sys.stdin]

outputs = []

input_msg = ”

output_msg = ”

while inputs:

readable, writable, exceptional = select.select(inputs, outputs, inputs)

for s in readable:

if s is sock and args.server:     # accept new connection

conn, address = s.accept()

# make client non-blocking as well

conn.setblocking(0)

# accept input from client

inputs.append(conn)

elif s is sys.stdin:            # user input from stdin

output_msg = sys.stdin.readline()

ifoutput_msg:

if s not in outputs:

if not args.server:     # client sends to server via sock

outputs.append(sock)

elif conn:              # server sends to client via conn

outputs.append(conn)

else:                          # receive input from the other side

input_msg = s.recv(2048)

ifinput_msg and sys.stdout not in outputs:

outputs.append(sys.stdout)      # pass input to stdout

for s in writable:

msg = input_msg if s is sys.stdout else output_msg

ifmsg:

if s is sys.stdout: # write to stdout

sys.stdout.write(input_msg)

sys.stdout.flush()

else:   # send to the other side

s.send(msg)

outputs.remove(s)

# close connection on any errors

for s in exceptional:

inputs.remove(s)

if s in outputs:

outputs.remove(s)

s.close() 

Solution 

EncryptedIM.py 

import sys

import socket

import select

importargparse

fromCrypto.Hash import SHA256, HMAC

fromCrypto.Cipher import AES

from Crypto import Random

port = 9991

parser = argparse.ArgumentParser()

# arguments are mutually exclusive

group = parser.add_mutually_exclusive_group(required=True)

group.add_argument(“–s”, dest=’server’, action=’store_true’)

group.add_argument(“–c”, dest=’hostname’)

parser.add_argument(“–confkey”, dest=’confkey’, required=True)

parser.add_argument(“–authkey”, dest=’authkey’, required=True)

args = parser.parse_args()

confkey_hash = SHA256.new(args.confkey).digest()

authkey_hash = SHA256.new(args.authkey).digest()

mac = HMAC.new(authkey_hash)

#create an INET, STREAMing socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

def pad(s):

return s + (AES.block_size – len(s) % AES.block_size) * chr(AES.block_size – len(s) % AES.block_size)

defunpad(s):

return s[:-ord(s[len(s)-1:])]

ifargs.server:

# makesoket non-blocking

sock.setblocking(0)

# reuse socket

sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR,1)

sock.bind((”, port))

# become a server socket

sock.listen(1)

else: # client

# connect to the host

sock.connect((args.hostname, port))

inputs = [sock, sys.stdin]

outputs = []

input_msg = ”

output_msg = ”

while inputs:

readable, writable, exceptional = select.select(inputs, outputs, inputs)

for s in readable:

if s is sock and args.server:     # accept new connection

conn, address = s.accept()

# make client non-blocking as well

conn.setblocking(0)

# accept input from client

inputs.append(conn)

elif s is sys.stdin:            # user input from stdin

output_msg = sys.stdin.readline()

ifoutput_msg:

if s not in outputs:

if not args.server:     # client sends to server via sock

outputs.append(sock)

elif conn:              # server sends to client via conn

outputs.append(conn)

else:                          # receive input from the other side

input_data = s.recv(2048)

ifinput_data and sys.stdout not in outputs:

# extract HMAC hexdigest

verify = input_data[:32]

# extract AES encrypted message

input_msg = input_data[32:]

mac.update(input_msg)

if verify != mac.hexdigest():

sys.stdout.write(“Error: authentication failed\n”)

sys.exit()

else:

outputs.append(sys.stdout)      # pass input to stdout

for s in writable:

msg = input_msg if s is sys.stdout else output_msg

ifmsg:

if s is sys.stdout: # write to stdout

# extract initialization vector

iv = input_msg[:AES.block_size]

# create AES CBC cipher for (confkey_hash, iv) pair

cipher = AES.new(confkey_hash, AES.MODE_CBC, iv)

# decrypt the message

decrypted = cipher.decrypt(input_msg[AES.block_size:])

# remove message padding

unpadded = unpad(decrypted)

sys.stdout.write(unpadded)

sys.stdout.flush()

else:   # send to the other side

# make a random initialization vector

iv = Random.new().read(AES.block_size)

# create AES CBC cipher for (confkey_hash, iv) pair

cipher = AES.new(confkey_hash, AES.MODE_CBC, iv)

# make sure a message is a multiple of AES.block_size

padded = pad(msg)

# encrypt the message using IV

encrypted = iv + cipher.encrypt(padded)

mac.update(encrypted)

s.send(mac.hexdigest() + encrypted)

outputs.remove(s)

# close connection on any errors

for s in exceptional:

inputs.remove(s)

if s in outputs:

outputs.remove(s)

s.close()