Packet Transmission

reject Packet Transmission CIS307: Packet Transmission

Packets: For a variety of reasons data in networks is transmitted in packets, which are sequences of octets (i.e. bytes). [Some reasons: cheaper to deal with loss or corruption of small packets than with long full messages; easier to share communication channels between concurrent communicating entities since there are no long messages for whose transmission all have to wait; usually less total transfer time in a store-and-forward multi-hop network.]
Usually packets are transmitted asynchronously, so we need to know when the packet starts and when it ends [in the case of RS232 characters we had a start and an end bit]. In general the problem of recognizing start and end of a packet is called the framing problem.

One possible way to represent a packet is by starting it with a special character, say SOH = 0x01, and ending it with a special character, say EOT=0x04. Then one can recognize a packet by looking for these characters.
Of course, we need to make sure that SOH and EOT appear only at the beginning and end of packets. This is done with byte stuffing: that is replacing in the data each occurrence of SOH, EOT, and ESC=0x1B, with ESC SOH, ESC EOT, and ESC ESC respectively (in bit oriented transmission like in HDLC one encounters bit stuffing, where one uses the pattern 01111110 to start and end a packet. Of course then in the data one has to prevent the occurrence of 6 consecutive 1s. This is done always inserting a 0 after five data 1s.).

Error Detection: Error detection is a form of error control [other forms of error control, for error recovery, are Forward Error Recovery and Automatic Repeat Request [ARQ]. ARQ involves Positive Acknowledgement, Retransmission after Timeout, Negative Acknowledgement and Retransmission. We will talk more about ARQ when we discuss data link level protocols.]. In general when a frame is sent, it may be lost in transit or it may be delivered corrupted (one hopes it gets there unchanged). The loss-in-transit of a frame is recognised by the sender by setting a time-out on the expected reception of an acknowledgement from the receiver.
Here we describe three methods that allow the receiver to detect errors, that is, to know with some probability that it has received what the sender sent.

Parity Bits: Suppose we send the following sequence of bits: 00101100 then if we send the sequence using even parity we will append one 1 and if are using odd parity we will append one 0. At the receiving end it will be easy to check if there has been a recognizable error. Instead of providing a one bit parity check for a whole data sequence we may provide an octet as parity check by dividing the data sequence into octets and then providing parity for each bit position. Parity checks are easily done but with very limited power to detect errors. Here we show the even parity octet for the following four octets

   00011100
   11011110
   11100110
   10101001
   =======
   10001101 Even parity octet

Checksum: Now instead of interpreting the data as a sequence of octets, we can treat it as a sequence of 16 bit integers and then add to the sequence the sum of all these integers summing in also the carry bits. The reason for adding up also the carry bit is so that the checksum can be computed in exactly the same manner and with the same result in little-endian and big-endian machines. For example

|48|65| "He"

|6C|6C| "ll"

|6F|20| "o "

|77|6F| "wo"

|72|6C| "rl"

|64|2E| "d."

======

|71|FC| Checksum

Alternatively the checksum can be computed by adding each individual byte and using the carry as an additional addend. Checksums are easy to compute and have a fairly good capacity to detect errors. Unfortunately they do not detect simple forms of errors. For example, suppose that errors reverse the value of the second bit of each word being transmitted. If these words had the same number of 1s and 0s as second bits then the error will not be detected. Experimental studies indicate that about 1/5000 packets are with errors (the majority of errors are introduced at the routers). Consequently since the checksum can fail to detect an error at best in 1/64k cases, we have that one packet in (5000*64k) =~330M packets will be with an error and its checksum will say that there is no error! So something else must be done to detect such errors (for example as we will see in security we can use digests).

Here is a routine to compute the checksum of an array of short unsigned integers:

       unsigned short checksum(unsigned short *buffer, int count)
       {   
          unsigned long int csum = 0;
          while (count--) {
             csum += *buffer++;
             if (csum & 0xFFFF0000){ /* There is a carry bit */
                csum &= 0x0000FFFF;  /* Remove the carry bit */
                csum++;              /* and add it to csum   */
             }
          }
          return (csum & 0xFFFF); /* The checksum computed in the 
                       internet is really ~(csum & 0xFFFF)
                       that is the 1-complement, so that at 
		       receiving end the total sum, including
                       received checksum, is zero. Also, since
                       1-complement has two representations for 0, 
                       0000...00 and 1111...11, this can give a bit
                       of information, a flag that can be used 
                       by the network protocols.
                       */
       }

Cyclic Redundancy Checks (CRC): The best method for detecting transmission errors is to use Cyclic Redundancy Checks, or CRC. CRC is founded on the idea that each word can be interpreted as a polynomial. For example the word |48|65| has the binary digits 100 1000 0110 0101 and can be interpreted as the polynomial

X**14 + X**11 + X**6 + X**5 + X**2 + 1

Then an irreducible polynomial P (it is for polynomials the same as a prime for numbers) is used to compute the modulo of the data polynomial. This modulo is taken as the CRC and appended to the data. The nice thing is that the modulo can be computed using simple circuits: XOR gates and shift registers initially containing 0. Here is an example of circuit for computing the CRC using as P the polynomial

X**16 + X**12 + X**5 + 1 When the all the data has been inputted the shift registers will contain the CRC.

|In
v X**16            X**12                      X**5                      1
X------------------>+--------------------------->+----------------------+
^                   |                            |                      |
|   +--+--+--+--+   v   +--+--+--+--+--+--+--+   v   +--+--+--+--+--+   v
+<--|  |  |  |  |<--X<--|  |  |  |  |  |  |  |<--X<--|  |  |  |  |  |<--+
|   +--+--+--+--+       +--+--+--+--+--+--+--+       +--+--+--+--+--+   
vout

To understand this structure, imagine that we are transmitting a single 1. Then, since it will be appended to the remainder, it is as if the 1 was multiplied by X**16, the degree of P. That is we obtain

   X**16            1
    |               |
    v               v
    10000000000000000

the remainder of this when dividing by P is

      X**12   X**5  1
        |      |    |
        v      v    v
     0001000000100001

which is just P eliminating its leading term. The Xor gates are positioned in the places where we have the 1s. When the message (data plus remainder) is received, the data will cause that same remainder to be computed in the receiver's register. When the original remainder is received, the two remainders will cancel each other leaving 0s in the registers.

Here is how things work out for our original data 100100001100101

    Arriving |
    Bits     | Content of registers
    =================================
      1         0000 0000000 00000
      0         0001 0000001 00001
      0         0010 0000010 00010
      1         0100 0000100 00100
      0         1001 0001001 01001
    ---------------------------------
      0         0011 0010011 10011
      0         0110 0100111 00110
      0         1100 1001110 01100
      1         1000 0011101 11001
      1         0000 0111011 10010
    ---------------------------------
      0         0001 1110110 00101
      0         0011 1101100 01010
      1         0111 1011000 10100
      0         1110 0110000 01001
      1         1101 1100001 10011
    ---------------------------------
                1011 1000011 00110

Thus the message (data plus remainder) is 100100001100101 1011100001100110

More formally, if P(x) is the irreducible polynomial and M(x) is the message we are sending, the circuit operates as if we were sending the code C(x) = (x**r)*M(x) + R(x), where r is the degree of P(x) and R(x) is the remainder of the divisions of (x**r)*M(x) by P(x). Since the dividend is equal to the divisor*quotient + remainder, C(x) is just P(x)*quotient [since R(x) + R(x) is just 0 in binary arithmetic, and P(x) is the divisor). Thus the receiver will receive just P(x)*quotient and dividing it by P(x) will have a zero remainder. Of course if the code is corrupted during transmission the remainder at the receiver will not be 0. In fact, if the error is E(x), the remainder will be E(x)/P(x).

If you use a CRC polynomial of degree r, then that use will allow us to detect errors in the message of up to r bits. If there are more than r bit errors, then there is no guaranty that we will detect the error, but it is likely that we do so (the probability of undeteced error is 1/2^r).

You can find additional information about CRC and even an animation of the algorithm here.

Though Cyclic Redundancy Checks may be greatly superior to checksums, checksums may be preferred when computation speed is of the essence. For example in the IP protocol (that we study later) they use checksums [and to do them faster the sender computes the 1-complement checksum, the receiver computes the checksum and then adds it to the sender's checksum to obtain 0 - thus saving on the time it would take if we did checksums at both sender and receiver and then compared]. Ethernet uses 32-bit CRC.

Forward Error Recovery

By using redundancy (i.e. transmitting in addition to the data a number of extra bits], it is possible to not only detect but also recover from certain kinds of errors without need of retransmission. Forward Error Recovery used to be called just Error Correction. Here are some basic ideas.

Suppose I have two bit vectors, for instance 0110101 and 0100111, their Hamming Distance is the number of bit where they differ (in our case the Hamming distance is 2). Then, if we use a set S of bit vectors such that the minimal Hamming distance between two elements of S is d, then

we will be able to detect up to d-1 bit errors, and
we will be able to correct up (d-1)/2 bit errors

when transmitting elements of S.
The argument for 1. is easy: by definition of distance we know that if we change up to d-1 bits in an element of S we do not get an element of S, thus the result of the change will not be in S.
The argument for 2. is a bit subtler (if follows what is called The Maximum Likelihood Principle): we claim that we can correct up to e errors where 2e+1 <= d. In fact if we have e errors in an element of S we will remain at distance at least e+1 from any other element of S, so it is rational to identify the erroneous vector with the closest element in S, the one at distance e.

Here is an example of 1 bit error correcting code (minimum distance = 3): We are given the following 4-bit vectors.

   0000		0001		0010		0011
   0100		0101		0110		0111
   1000		1001		1010		1011
   1100		1101		1110		1111

We represent them with the 7-bit vectors (the original 4 plus 3)

   0000 000	0001 101	0010 111	0011 010
   0100 011	0101 110	0110 100	0111 001
   1000 110	1001 011	1010 001	1011 100
   1100 101	1101 000	1110 010	1111 111

where the three additional bits are parity bits obtained considering, in order,

   bits 1     3  4  to compute the 5^thbit
   bits 1  2  3     to compute the 6^thbit
   bits    2  3  4  to compute the 7^thbit

that is

   (bit1        + bit3 + bit4 + bit5               = 0 modulo 2)
   (bit1 + bit2 + bit3               + bit6        = 0 modulo 2)
   (       bit2 + bit3 + bit4               + bit7 = 0 modulo 2)

You can check by looking at each pair of 7-bit vector that the Hamming Distance now is 3. Thus we can correct a one bit error. You might, in a night when sleep is late in coming, think why parity bits computed this way worked!
Also, do you see a fast way in the case of a single bit error to recognize what is the bit in error? can you suggest how to reorder the bits so that it is easier to identify the faulty bit? [You may want to check on the web a discussion of Hamming Codes.]

Automatic Repeat Request (ARQ):

A basic way of dealing with transmission errors, detected either by the receiver (error detection as described earlier), or at the transmitter (no acknowledgement within timeout) is to retransmit data. Methods based on this idea are called ARQ (Automatic Repeat Request) methods. Three such methods are in common use:

Stop-and-Wait: a transmitter after sending a frame waits for a message (ACK/NACK) back from receiver indicating if the previous frame was or not received correctly. If not, the frame is retransmitted.
Go-Back_N: Now frames are numbered and the transmitter is allowed to send a number of frames (up to N) without waiting for acknowledgement ( Sliding Window Protocol), but when the receiver sends a "NACK i" to indicate an error with frame i, then frame i and all its transmitted successors are retransmitted.
Selective-Repeat: Similar to Go-Back-N, but now for the response "NACK i" only frame i is retransmitted.

Performance of ARQ Techniques

We can express the time required to transmit a frame using Stop-and-Wait ARQ in terms of two basic times: the Transmission Time Tt and the Propagation Time Tp. The former is easily computed by dividing the size of the frame by the data rate of the channel. So a 1500 Bytes frame in Ethernet will take 1.2 ms (1,500B = 12,000b, data rate = 10Mbps). The latter is easily computed by dividing the distance between sender and receiver by the propagation speed of the signal (usually 200,000Km/s). So the propagation time in Ethernet at the distance of 200 meters will be 1 microsecond.
In Stop-and-Wait the delivery time of a frame, Td, is thus (assuming that the time to transmit an ACK or NACK is negligible and assuming that the time to process the ACK or NACK at the receiver is also negligible)Td = Tt + 2*Tp. This says that the utilization of the communication channel is

       Tt          1                 Tp   propagationTime
    --------- = --------  where a = --- = ----------------- =
    Tt + 2*Tp    1 + 2*a             Tt   transmissionTime
                                    propagationTime * dataRate
                                  = -------------------------- =
                                           frameSize
                                    sizeofDataInTransit
                                  = -------------------
                                        frameSize

So the utilization in our example in Ethernet is almost 100%. Note the delay-throughput-product = sizeofDataInTransit in "a". As the ratio of this product to the size of the frame grows, so does the efficiency of the transmission diminish.
Suppose we use a T1 line (1.5Mbps) to transmit a 100 Byte frame to Los Angeles from Philadelphia (5000 kilometers), we get Tp = 25ms, Tt = 0.5ms for a utilization of about 1%.
Things become even worse if we take into consideration the probability of errors. Say that the probability of loss of the frame is P (we assume the ACK/NACK never fails), then the utilization becomes

      (1 - P)
     ---------
      1 + 2*a

[this result is obtained reasoning as when we computed the round robin time in the notes on performance evaluation.]

If we consider instead the Go-Back-N ARQ or the Selective-Repeat ARQ the transmission of ACKs is overlapped with the transmission of the frames. Thus the utilization, without worrying about errors, becomes

 
                            N
                      /  -------  if  N < 1 + 2*a
                      |  1 + 2*a
      utilization =   |
                      |
                      \  1        otherwise

This can be understood with the following observations:

When N*Tt is less than (Tt + 2*Tp) the acknowledgement of the first frame is not yet received by the transmitter when the transmitter terminates sending all the N frames. Thus the transmitter has to wait before sending more frames and the utilization is less than 1.
When instead N*Tt is greater or equal than (Tt + 2*Tp) the acknowledgement for the first frame has been received before all the frames are sent, thus the transmitter can keep on transmitting without idle time.

The relationship between Tp, Tt, and N is captured by the following timeline diagram: