CIS307: Bridges, IP Addresses, Routers

[Chapters 10, 12, 16 - Comer (1999)]

 

Bridges

A Bridge is a box with ports (usually two) to LAN segments. It operates in promiscuous mode at the data link layer (i.e. at the level of frames, not signals), it examines all frames and it recognizes where they came from, and where they are going to. It selectively (frame filtering) transfers frames from any port to other ports. It does not propagate noise signals and defective frames as it was the case for repeaters (at the physical layer). It adaptively recognizes which machines are reacheable from a port. It reduces traffic on each port and it improves security since each port will only transmit frames directed to nodes reacheable from that port (thus one does not overhear irrelevant traffic).

Bridges are normally used to connect LAN segments within a limited geographic area (local bridges), like a building or a campus. But they are also used in the network of an enterprise to interconnect LANs: a bridge in a LAN is connected through some long distance channel (for example, a line provided by a common carrier) to a bridge in a distant LAN (remote bridges). Usually bridges connect segments using the same data link protocol, but some modern bridges can convert between different data link protocols (for example, ethernet and token ring).
Bridges move packets to the correct physical addresses no matter what is the network protocol that sent them. Though modern routers alleviate the need for bridges, bridges remain useful because of their simplicity and high performance.

Bridges can be transparent (usually in Ethernet lans), also called spanning tree bridges, where the aim is to minimize all work required to set up a bridge and have instead the hardware and software set up the bridge with the information required for routing frames correctly. Or bridges can be source routing bridges (usually in token ring lans) where the route from sender to receiver are preset at the sender and included in the frame. We will only discuss transparent bridges.
All the nodes reacheable from a node through segments and bridges will receive broadcast messages sent by that node. They constitute the broadcast domain of the given node.

Bridges are able to filter frames on the basis of any information available at the data link level in the frame. For example, since an Ethernet frame has a field with information about the higher level protocol used in the data portion of the frame, a bridge could be programmed to filter out frames that use selected protocols.

Loop-free LANs and the Backward Learning Algorithm

We will consider first the case where LANs are connected by bridges in a way that does not create loops (loops are dangerous since on them frames may keep on circulating). In this case it is well defined what nodes are reacheable from each port and there is a unique path between any two nodes. This connectivity information, once collected, is used to decide how to route frames. The information is collected and used with the following algorithm called the Backward Learning Algorithm:

  The algorithm is run independently at each bridge.
  It makes the natural assumption: If a frame from a node
  j is received through port i then messages to j will go
  out through i.
  It keeps a cache of pairs of the form [i,j]
  where i is a port (of the current bridge) and j 
  is the address of a node (usually an ethernet address) 
  reacheable from the current bridge through port i. 
  This cache is initially empty. 

  When the bridge receives a frame from a port i it determines the 
  physical addresses of its source, j, and of its destination, k. 
  If k is a multicast address, then the frame is forwarded through
  all the ports except the one through which it was received 
  (flooding).

  If the pair [i,j] is not already in the cache, it is added to it.

  If [i,k] is in the cache then 
     the frame is discarded.
  else if there is a pair [h,k] in the cache then 
     the frame is forwarded through port h
  else 
     it is forwarded to all ports (flooding) except i.

Example

This algorithm does not assume any knowledge on the part of a bridge about the structure of the network. It just uses the address information in the frames about senders and receivers. With this information, for example, a message from 3 to 5 will not be forwarded by bridge A. Also, if 2 had also sent a message, then local traffic between nodes 1 and 2 would not be forwarded by either bridge A or B. Since bridges operate below the network layer, they do not us IP addresses, instead addressing is done using physical addresses, for instance ethernet addresses.

LANs with Loops and the Distributed Spanning Tree Algorithm

We may want to connect LAN segments with bridges in a way that causes loops to occur. We may want to do so in order: Since the backward learning algorithm does not work in the presence of loops, one computes using the Distributed Spanning Tree Algorithm a spanning tree (i.e. a tree connecting all the nodes of the network) of the network. Then one uses the paths defined in this tree (in a tree there is only one path between any two nodes). Only the ports included in the spanning tree will be used for communication (they are said to be forwarding or active). All other ports are said to be blocked or inactive.

The intent of the Distributed Spanning Tree Algorithm is to identify the node (i.e. bridge) with smallest id, the root-bridge of the network; Then for every other node, to identify the port, root-port, through which goes the shortest path to the root-bridge. Finally for every lan segment to choose the bridge in the shortest path to the root-bridge, the designated bridge of the segment, and the port through which that segment accesses that bridge, the designated port. The spanning tree will include only the bridge ports that are either root-ports or designated-ports. For example, if we take as cost of a path the number of bridges it traverses, the following network is changed as indicated.

The Distributed Spanning Tree Algorithm is interesting as an example of the kind of algorithm that works in a distributed environment where no node has full knowledge of the network. Later in the course we will discuss more in detail the difficulties that arise in distributed environments.

In the Distributed Spanning Tree Algorithm bridges exchange messages using the standard set by IEEE 802.1. These messages, called configuration messages, use multicasting to a multicast group consisting of all and only the bridges on the same segment as the transmitting bridge. These messages are sent at network power-up to acquire information on the network topology, and then again whenever changes in topology are detected. Note that these configuration messages represent control traffic, that is overhead. A configuration message identifies, among other things:

Message M1 is said to be better than message M2 if

Distributed Spanning Tree Algorithm

   Initially each bridge sends a configuration message on each
   of its ports. This message has this bridge's id as root-bridge and as
   transmitter and has 0 as cost. This message is saved at each port as
   configuration for that port and as configuration for the bridge. 

   When a message is received at a port, 
   IF the received message is better than the current configuration 
      of the port THEN
      it becomes the new configuration of this port;
      IF it becomes the port with the best configuration THEN
         it is said to be the root-port of the bridge and to be active.
         [The root-port leads to the root-bridge and the first bridge 
          next on that path acts as the designated bridge of the current 
          bridge.] 
         The configuration of the root-port, with the transmitting bridge 
         field set to this bridge, and the cost incremented by one, if 
         better than the configuration of the bridge, becomes the new 
         configuration of the bridge. Then this new configuration is 
         compared to the configurations of all the non root ports. If the 
         configuration of a port is worse than the configuration 
         of that bridge, the port it is said to be active, 
         and the bridge configuration is transmitted through that port, 
         otherwise the port will become inactive. 
         [In reality if a port is to go from the inactive to the active 
          state, this transition will be delayed some time to make sure 
          that other ports that were supposed to go from active to 
          inactive have actually done so. This is required to avoid
          transient loops between bridges.] 
      ELSE
         this port becomes inactive;
   END IF

   The Spanning Tree consists of the bridges and their active ports
   and the segments thus connected.

Example

Extension

The Distributed Spanning Algorithm as specified can take care of the case where new bridges come on line (have the new bridge send configuration messages and rerun the algorithm), but it does not know what to do in case of failure on bridges or segments. Here is an extension that solves that problem:
   Each stored configuration message keeps an extra field representing
   the age of the message, i.e. the time since the root bridge
   sent the configuration message upon which this message is based. 
   It is incremented each unit of time
   (timer-tick, usually 2 seconds) and when it reaches a preset 
   maximum value maxAge (usually 20 seconds), the stored 
   configuration for the port is reset to the initial value (current 
   bridge as root bridge and source, cost and age set to 0).
   Then at this bridge is recomputed the best configuration (and
   root, cost to root, and root-port), thus another previously inactive
   port can become active.

   The root bridge sends at regular intervals Hello-time an heart-beat,
   an Hello-message to the bridges for which it is the designated 
   bridge. 

   When a bridge receives the Hello  message it resets the age field 
   for the receiving port to 0 and forwards its own configuration with 
   age set to zero to the bridges for which it is the designated bridge. 

   [maxAge should be larger than Hello-time plus the propagation time for the
   hello messages from the root to all the nodes in the spanning tree.]
The effect of the hello message from the root bridge is to eliminate reconfiguration unless necessary.

Notice the interesting story we have seen about bridges:

Routers

A router is a box (often a regular computer) with (at least) two ports, used to connect possibly dissimilar networks and help packets go from a source to a destination. It differs from bridges since it operates at the network level. [It will also use different addresses. For example a bridge may use Ethernet addresses while a router uses IP addresses.] It does all the transformations that may be required by the transfer of packets across the networks it connects.
A router is concerned with where to send next packets as they move from a source to a destination through a set of interconnected networks.
It is convenient to distinguish two activities: Unfortunately common terminology does not always preserve this distinction, and too often one sees "routing" used in place of "forwarding".

Routing (we really mean forwarding) could be of three kinds (at least!):

We will only consider packet routing.

Routing tables contain information that will indicate for each packet on the basis of its final destination (usually an IP address) where to go next (next-hop forwarding - the address of the next router). Cycles can exist in the graph that has routers as nodes and links as edges. Routing tables are built to work well also in the presence of cycles. It is important that routing tables not be too large.

Evaluation of Routing Algorithms:

Routing characteristics:

IP Addresses

IP Version 4 addresses, also called IPv4 addresses, or just IP addresses, are 32 bit integers. [IPv6, which is the new version of IP, and which we do not study, use an IP address with 128 bits.] They are normally written as 4 small integers representing the bytes of the number separated by periods. For example 155.247.182.1 is an IP address. Each IP address consists of two portions, a network identifier and a host identifier. There are 5 classes of IP addresses: A number of IP addresses have a standard meaning:
+------------+------------+----------+-------------------------------+
| Network    | Host       | Type of  | Purpose                       |
| Identifier | Identifier | Address  |                               |
+------------+------------+----------+-------------------------------+
| all 0s     |  all 0s    | this     | Used during bootstrap to      |
|            |            | computer | ask for own's IP address      |
+------------+------------+----------+-------------------------------+
| Network    |  all 0s    | specified| The specified network,        |
| Identifier |            | network  | independent of its hosts      |
+------------+------------+----------+-------------------------------+
| Network    |  all 1s    | specified| Broadcast address for the     |
| Identifier |            | network  | specified network.            |
+------------+------------+----------+-------------------------------+
| all 1s     |  all 1s    | local    | Broadcast on local network    |
|            |            | network  |                               |
+------------+------------+----------+-------------------------------+
| 127        | anything   | loopback | Testing of TCP/IP while not   |
|            |            |          | using the network             |
+------------+------------+----------+-------------------------------+

Forwarding with a Simple Routing Table

Here is a portion of a (real) routing table:
	Destination      Gateway            Interface
	================================================
	155.247.71/24    155.247.71.60      ln0
	127.0.0.1        127.0.0.1          lo0
	default          155.247.71.1       ln0
	================================================
155.247.71/24 is the name of the local network, packets to it should be sent out through the interface ln0 to the IP address 155.247.71.60. 127.0.0.1 is the loopback address, we can use it to test our networking software even without a network: it is sent through the interface lo0. For any other destination, the packet will be sent to IP address 155.247.71.1 through the interface ln0. [The routing table of a Unix machine can be obtained with the command netstat -rn.]

In general, if T is a routing table with entries with fields [destination, gateway, interface], and D is the destination, then we execute the program:

	for each row R of the routing table T
	    if (D == T[R].destination)
		send packet to T[R].gateway through interface T[R].interface;
		return;
	send packet to T[default].gateway through its interface.
Routing and routing tables are more concerned with reaching networks that with reaching hosts. So in the routing table the destination will denote a network, not an host. Once one reaches the correct network, the local system will worry about local delivery [think of delivery to a host on a LAN, the last step involves translation from IP address to physical address and transmission on the shared medium].

Subnetting

The granularity of IP address classes leads often to poor utilization of the address space and to limited ability to address subgroups within a network. The solution is to use Subnetting. Assume that we have a class B network like 155.247. We can partition the host space into 10 bits for subnet id and 6 bits for host id. Thus we have 1024 subnets each with up to 62 hosts (64 - 1 network - 1 broadcast). Subnetting is based on the use of masks. In our example, the subnet mask is 255.255.255.192. The bitwise AND of an IP address with the submask will result in the subnet identity. In our example, if we have the IP address 155.247.182.98, then the subnetwork id, also called extended-network-prefix, is 155.247.182.34 and the subnet is known as 155.247.182.34/26 to stress that it uses 26 bits, leaving the remaining 6 bits for the host-id. Notice that from far away packets will go to the network 155.247/16, and once there packets will go to the specific subnet, and from there to the intended host. [The address 155.247.71/24 we encountered earlier, means that the class B network 155.247 is split into 256 subnets each with 254 IP host addresses.]

To account for subnetting a routing tables T takes the form:

[subnet-id, subnet-mask, next-hop] where the subnet-id is uniquely defined for a network (i.e. all the subnets of a network share the same mask, i.e. they have the same number of bits). The next-hop is the port of the router through which the current packet should be forwarded.

Then when an IP address A has to be routed the algorithm used is:

   For each row i of routing table T
       Let D = T[i].subnet-mask BitwiseAnd IP;
       If (D == T[i].subnet-id) then
       {
          Forward packet to T[i].next-hop;
          return;
       }
   Forward packet to default;
Normally, routing moves packets across the internet until the packet arrives to the destination network. Then the packet is directed to a specific host. With subnetting it becomes possible to route packets across the internet to arrive to a specific subnet, and then to move within the subnet to a specific host.

The ideas of masks and subnetting have been generalized to allow more complex partitions of networks than the one we have just discussed. In particular, variable length subnet masks have been used. This is done with the Classless Inter Domain Routing (CIDR). Now the masks used in routing can be of any size and in matching IP addresses one aims for the longest match. For example, suppose that in a routing table we have a row for the network 1101011110110 and a row for the network 11010111101 then, if we are looking for the destination 11010111101101111110010111010010 we will use the first row since it matches the given destination and it is more specific than the second row. CIDR helps reduce two kinds of problems: the fact that IP addresses are not efficiently allocated using the class oriented schema; and the fact that routing table may grow to be very large.

Some demonstrations of routing from Central Queensland University:

Two Routing Algorithms

We consider two routing algorithms, the link state routing (Dijkstra's shortest path) algorithm, and the vector distance routing (Bellman-Ford) algorithm. In the routing protocol implementing link state routing, each node will send to the whole network information about its neighbors. Instead in distance vector routing each node will send to its neighbors its own idea of what is the distance to each node in the network.

Dijkstra's Shortest Path Algorithm

Given a graph (i.e. we know its vertices and edges with non-negative weights associated with its edges) and a designated source vertex s determine the shortest paths from s to all other vertices and their lengths.

   Initialize arrays R and D so that for each vertex v of the graph, R[v] = NIL,
   and D[v] is infinity except for s where D[s] = 0. 
   Finally let the set S consist of all the vertices of the graph.
   Our intent is that for all vertices v,  D[v] represents the current best 
   estimate of the cost of the paths from s to v (or viceversa) and 
   R[v] is the next node (next-hop) in the current best path from 
   v to s. 


   While S is not empty
      Let u be an element of S with minimal D value and remove it from S.
      For each element v in the neighborhood of u that is in S
         Let w = D[u] + cost-of-edge[u,v];
         If D[v] is greater than w then //We are improving current path to v
            Set D[v] to w;
            Set R[v] to u;

Example


Note that the array R tells the next hop for going back to s from any other node, while the routing table must tell us the next hop for going from s to any other node. You should think of a simple algorithm for computing the routing table given R.

The Dijkstra algorithm is run independently at each vertex after the vertex has collected all the information on the structure of the graph. [There has to be a routing protocol (see below) to make this possible. The routing protocol associated with the Dijkstra algorithm (usually OSPF - Open Shortest Path First) allows each node to inform all other nodes of its own identity and of its links to its neighbors, its Link State.] Link state information is exchanged periodically (every few seconds), or when a new neighbor comes on line, or when it is recognised that a link is disabled. The message traffic generated is limited. The Dijkstra algorithm computes the routing tables fairly quickly, but it tends to use substantial memory resources.

Note that Dijkstra's Algorithm creates a spanning tree with as root the node where the algorithm is run (s in the discussion above). The algorithm will result in different trees when run at different nodes - in fact each router node should run the algorithm). These spanning trees are not necessarily minimal (i.e. the sum of the cost of the branches is not minimal). For example:

However the tree built by the Dijkstra algorithm at s will give the paths of minimal length from to each of the other nodes.

Vector Distance Routing Algorithm

In this algorithm (due to Bellman and Ford), each router exchanges routing information, the Distance Vector, with its immediate neighbors, not with all routers. The distance vector algorithm results in the same distances as the Dijkstra algorithm.
   Now each vertex s only knows itself.
   Each vertex keeps a set of triples of the form 
   [destination, next-hop, distance]. This set is the distance vector.
   Initially this set is {[s,NIL,0]} and it is transmitted to each neighbor.

   When a vertex u receives a distance vector from its neighbor v
      For each triple [d,n,c] in the received distance vector
         Let w = c + distance from u to neighbor v (assumed to be 1);
         [d,v,w] represents the new found path from u to d with
         next step v;
         If there is no triple of the form [d,x,y] in the distance
            vector of u or 
            there is such a triple and (y > w or x=v) then
            Remove [d,x,y], if there, and add [d,v,w] to the distance 
            vector of u. [Note that if x=v, we replace the old triple [d,x,y]
            with the new triple [d,v,w] even if y<=w. This is so because it 
            is assumed that v has a more recent estimate of the path to d.]

   When a vertex u recognizes that the link to a neighbor v has failed, it
      Requests distance vector information from its remaining neighbors and
      Recalculate the distance vector using the new neighborhood information.

   A vertex sends a copy of its distance vector to its neighbors 
   whenever there has been a change in its own distance vector
   or a failure in its neighborhood
Here is what the Vector Distance Algorithm does in the case of the graph above:
Step 0:
  s  {[s,nil,0]}
  A  {[A,nil,0]}
  B  {[B,nil,0]}
  C  {[C,nil,0]}
  D  {[D,nil,0]}
Step 1:
  s  {[s,nil,0],[A,A,9],[C,C,5]}
  A  {[A,nil,0],[s,s,9],[B,B,1],[C,C,2]}
  B  {[B,nil,0],[A,A,1],[C,C,9[,[D,D,6]}
  C  {[C,nil,0],[s,s,5],[A,A,2],[B,B,9],[D,D,4]}
  D  {[D,nil,0],[C,C,4],[B,B,6]}
Step 2:
  s  {[s,nil,0],[A,C,7],[B,A,10],[C,C,5],[D,C,9]}
  A  {[A,nil,0],[s,C,7],[B,B,1],[C,C,2],[D,C,6]}
  B  {[B,nil,0],[s,A,10],[A,A,1],[C,A,3],[D,D,6]}
  C  {[C,nil,0],[s,s,5],[A,A,2],[B,A,3],[D,D,4]}
  D  {[D,nil,0],[s,C,9],[A,C,6],[C,C,4],[B,B,6]}
Step 3:
  s  {[s,nil,0],[A,C,7],[B,C,8],[C,C,5],[D,C,9]}
  A  {[A,nil,0],[s,C,7],[B,B,1],[C,C,2],[D,C,6]}
  B  {[B,nil,0],[s,A,8],[A,A,1],[C,A,3],[D,D,6]}
  C  {[C,nil,0],[s,s,5],[A,A,2],[B,A,3],[D,D,4]}
  D  {[D,nil,0],[s,C,9],[A,C,6],[C,C,4],[B,B,6]}

Notice that in the example we have glibly talked of "step 1", "step 2", etc. In reality distance vectors are transmitted asynchronously without any requirement that all nodes proceed in lockstep.

A problem with the distance vector algorithm is its slowness in propagating the recognition of link failures (it is called the count-to-infinity problem). For example in the following graph

      +---+          +---+           +---+
      | A |----------| B |-----------| C |
      +---+          +---+           +---+
suppose that we have the following distance vectors:
   At A:    {[A,NIL,0], [B,B,1], [C,B,2]}
   At B:    {[A,A,1], [B,NIL,0], [C,C,1]}
   At C:    {[A,B,2], [B,B,1], [C,NIL,0]}
and the link from B to C fails. So B discards the triple [C,C,1] and recomputes the vector using the information from A, thus it adds the triple [C,A,3]. This change in turn is propagated to A that has to change its C triple to [C,B,4], that causes B to change to [C,A,5], .. and so on until "infinity" is reached and A and B can finally conclude that C is unreacheable! An heuristic used is called split-horizon: A would send to B the triple [C,B,+infinity] instead of [C,B,2]. This triple will be ignored by B as long as B is connected directly to C. But when C becomes inaccessible that +infinity informs B that there is no alternative path to C. [This heuristic does not solve more complex cases of count-to-infinity problems.]

Routing algorithms, i.e. algorithms used to exchange the information needed for computing routing tables, are implemented using routing protocols. Examples of such protocols are RIP (Routing Information Protocol), EGP (Exterior Gateway Protocol), IGRP (Interior Gateway Routing Protocol). The packets exchanged in the routing protocols are called routing packets and they contain control information, i.e. they are overhead. ICMP (Internet Control Message Protocol) is a protocol used to propagate echo and reply messages that test the reacheability of nodes in the network, and to report loss of packets due to time expiration. IRDP (ICMP Router Discovery Protocol) is used to identify routers and to report their identity. The routing protocol used with the Shortest Path Algorithm is OSPF(Open Shortest Path First), with the Distance Vector is RIP (Routing Information Protocol).

An Autonomous System is the technical term for somebody's network (or a group of related networks). The exchange of routing information within an autonomous system and among autonomous systems is done using a variety of gateway protocols. An autonomous system may be isolated and protected from its environment through the use of a firewall, which usually is a computer system that controls and limits the traffic between the autonomous system and its environment.

Packet Switch

A packet switch is a box with a number of ports, some to other packet switches, some to computers. Usually the connection to computers are slower than the connections to other packet switches. A packet switch is used to interconnect networks with similar or dissimilar structures and it operates above the data link level. It is a generic term that includes bridges, routers, and gateways, though it used to mean an IMP in the old ARPANET. It tends to stress the switching functionality (i.e. how packets are moved across the network) above the ability to determine routing information. A packet switch uses a store-and-forward strategy with the messages it receives. Thus congestions in the use of a connection do not result (unless we overrun the available buffers) in the loss of data, only in some delays. Packet switches may use hierarchical addresses, with a [switch part, port part]. So a computer will be known by the identity of the packet switch it is connected to and of the position of the port it uses on the switch.

Gateways

It used to mean the same as packet switch, now it usually means a device that works above the network layer and can perform complete translations between different protocol stacks.