CIS 307: Introduction to Distributed Systems I

[SISD,SIMD,MISD,MIMD], [Computational Speedup], [Some Characteristics of Communication Primitives], [Network OS vs Distributed OS], [OSI Architecture], [TCP/IP Architecture], [Client-Server Architecture]

These notes are additions|comments|summaries of material covered in Chapters 9 and 10 (you are not required to know anything on Group Communication) of Tanenbaum's textbook.

SISD, SIMD, MISD, MIMD

A classification often used for computer systems is due to Flynn:

SISD: Single Instruction, Single Data: it is the traditional computer, where a single instruction is executed at a time on scalar individual values.
SIMD: Single Instruction, Multiple Data: a Vector Machine. It can, for example add in parallel the corresponding elements of two vectors.
MISD: Multiple Instructions, Single Data: Not a popular choice!
MIMD: Multiple Instructions, Multiple Data, i.e. the concurrently executing instructions are each with their own arguments. MIMD are distinguished by some into Multiprocessor Systems in the case that they have processors that share memory, and Multi-Computers, i.e. systems that consist of computers that communicate through switches and not shared memory.

We do not consider any of these systems to be Distributed Systems. A distributed system [for us] consists of computer systems that are connected by networks, those other systems are parallel computers. [Of course, there are other kinds of "things" that exhibit parallelism: think of superscalar computers (for instance Pentium). Here you have a single "big" instruction containing "little" regular instructions that are executed simultaneously.]

Computational Speedup

Suppose we run a program using a single processor and it takes time T(1). We then run the program using n processors (we assume the program is written to take advantage of the available number of processors) and it takes time T(n). What is the relationship between T(1), T(n) and n? For example, if the program takes 100sec on one processor, how long should it take on four processors? May be 25sec? in fact T(n) at best should be T(1)/n. [Cases where T(n) is less than T(1)/n, i.e. in our example less than 25 seconds, are called cases of superlinearity. Such cases can be explained and analyzed separately in terms of OS or computational characteristics.]

We call Computational Speedup s the ratio:

s = T(1)/T(n)

and Efficiency e the ratio:

e = s/n

Amdhal has suggested that any computation can be analyzed in terms of a portion that must be executed sequentially, Ts, and a portion that can be executed in parallel, Tp. Then

T(n) = Ts + Tp/n

This is known as Amdhal's Law. It says that we can improve only the parallel part of a computation. Going back to our example, if the computation requires 40 seconds of sequential computation and 60 of parallel computation, then the best that T(n) will ever be is 40sec, even for n growing to infinity. In practice there are many problems where the sequential components are minimal and thus these problems can be effectively parallelized. In general though speedup due to parallelism tends to be not too great. Some people suggest that the speedup grows with the square root of n, i.e, if we quadruple the number of processors, we only double the performance. [Notice that in this case efficiency will change as 1 over the square root of n, i.e. the more the processors, the less efficiently they will be used.] The maximum speedup assuming Amdhal's Law is for n going to infinity. That is

s = (Ts + Tp)/Ts = 1 + Tp/Ts

Some Characteristics of Communication Primitives

Here are some characteristics of communication primitives. Naming, i.e. how is a name associated to a communicating entity? Some possibilities are : the name is hardwired, the name is chosen at random, there is a name server (i.e. somebody responsible for giving names). Blocking, i.e. does the sender wait until message is sent, or just enqueues it and goes on? This is the distinction between synchronous and asynchronous calls [semi-synchronous calls are asynchronous calls where a signal is received at the time the message is actually sent]. Buffering, i.e. what to do when a message arrives and there is no waiting receiver. Reliability, i.e. whether messages are or not acknowledged.

Network OS vs Distributed OS

A Network Operating System is a network of computers with possibly different operating systems that are able to easily communicate with each other. Agent processes take care of the intercomputer communication. In a NOS 1) each computer has a local OS, 2) users work on specific computers 3) no transparency 4) no extra reliability. Example: SunOS.

A Distributed Operating System is a network of computers that appears to the user as a single computer and manages all the resources of the physical computers. The user is not aware of where computations or data are located. The system is intended to have high reliability. Example: Amoeba.

OSI Architecture

The textbook covers this topic. Here are a few observations. Standards and Protocols are used to define representation and interaction modes and make certain functionalities generally available. Standards and protocols are continuously being introduced. They fit within general frameworks called Architectures Thus the OSI architecture is not a specific set of protocols and standards, it is the definition of the functionality of protocols and standards that should be used. Of course, in practice people tend to associate architectures with their most popular standards and protocols.

There is a fundamental difference between the lowest three levels (the communication subnet) and the top four levels of the OS architecture. The bottom layers are between directly connected hosts or involve all the hosts in a path from sender to receiver. The top four layers are end-to-end protocols, that is, the communication is stated in terms of only the original sender and the final destination, independent of how many intermediate hosts are traversed. Intermediate nodes do not participate at all in the processing of the higher level protocols, to them it is data. This has a direct impact on efficiency: for example, error checking in protocols at this level is only done at the sender and receiver, not at the intermediate hosts.

Each message (or frame, or packet) consists of data being transmitted plus information required by the protocol for addressing, error detection, etc. This extra information appears as a header before that data and (may be) a trailer after the data, i.e. the data is encapsulated in the message. The message sent at layer i will be transmitted as data by the layer below it. Assuming that the layer below can transmit this data as a single message we will have the situation

Note that the headers and tails consitute transmission overhead, reducing the utilization of the bandwidth of the communication channel. Of course this is only part of the communication overhead: retransmissions and acknowledgements further reduce bandwidth.

Communication can be connection-oriented or connectionless. In the former case a connection is established between the communicating agents and on that connection messages are exchanged. In the latter instead each message is treated as a free standing entity (datagram). Related concepts at the communication level are circuit switched and packet switched. In the former communication between interlocutors always follows the same path [this may be "forever" or it may be only for a single connection(virtual circuit)]. In the latter individual packets follow their own independent routes. Another related concept is a session. It consists of one or more connections. For example, a program on machine A may be involved in communication with a program on machine B, the connection drops due to communication problems. When communication is reestablished the programs continue from where they were in the session using a new connection.

The transmission units at different layers may be of different maximum sizes. We have here the same distinction that exists between "logical" and "physical" records in file systems. One message at a layer may have to be fragmented over multiple messages at a lower layer. This "packing" and "unpacking" can be complex when different packets take different routes, arrive in undesired order, or completely disappear.

Routing is the process by which a message moves from a sender to a receiver across multiple intermediate hosts. It is a very complex activity worth of its own separate course. Part of the complexity is that we want to have a system that keeps on working while hosts and links fail and come on line at their own pace. Routers are placed between neighboring networks to decide which of the incoming packets should be kept on the local network and which should me moved to the other network. It is not an optimal process: packets keep a hop count that is increment with each transfer. When it gets too big, as in a loop, the packet is discarded.

A number of issues should be kept in mind when analysing a protocol and its implementation. Among them:

Framing: How do we recognise the beginning and end of a message (or packet, or frame, or ..);
Flow Control: How do we make sure we don't send too much information to the receiver so that it cannot handle it and has to "drop it on the floor";
Multiplexing: How a single communication channel can be shared for more than one conversation;
Addressing: How do interlocutors address each other [as we have already seen, host interfaces are known by their IP address (a 32 bit integer); unfortunately in ethernet the address understood by the hardware is the Ethernet address (a 48 bit integer) of the connector; thus there is need of a way to translate from IP to Ethernet addresses. In general, this requires an address resolution protocol.
Error Detection and Correction: what we do to detect if an error has occurred in transmission and how to correct it.

Examples of protocols used at the various levels:

Application layer: telnet, ftp, finger, ..

Presentation layer: XDR

Session layer: RPC

Transport layer: TCP (connection oriented), UDP (connectionless)

Network layer: X.25 (connection oriented), IP (connectionless)

Data link layer: HDLC, ethernet

Physical layer: RS-232-C, ethernet

TCP/IP Architecture

The TCP/IP architecture was developed mainly in the US. It is the one that has the largest number of users. It represents pragmatic solutions to problems as they arose. It involves only four layers: The application layer (same as in OSI), the end-to-end layer (TCP or UDP or ...), the internet layer (IP or ..), and the net access layer (ethernet, or ...).

Client-Server Architecture

Though the OSI architecture arrives up to the "application layer", much work these days is happening above that level, in what people call middleware with things like the Distributed Computing Environment (DCE) and the Common Object Request Broker Architecture(CORBA). These systems help create an infrastructure to support the construction and use of distributed applications.

A typical way in which people organize distributed applications is represented by the Client-Server Model. In this model a program at some node acts as a Server which provides some Service, things like a Name Service, a File Service, a Data Base Service, etc. The Client sends requests to the Server and receives replies from it. The Server can be Stateful or Stateless, i.e. may remember information across different requests from the same client. The Server can be iterative or concurrent. In the former case it processes one request at a time, in the latter it will fork a separate process (or create a separate thread) to handle each requests. Also the Client could be iterative or concurrent, depending if it makes requests to different servers sequentially or concurrently. The traditional way of interaction from the client to the server is with Remote Procedure Calls (RPC).

ingargiola.cis.temple.edu