CIS 307: Introduction to Distributed Systems I
[SISD,SIMD,MISD,MIMD],
[Computational Speedup],
[Some Characteristics of Communication Primitives],
[Network OS vs Distributed OS],
[OSI Architecture],
[TCP/IP Architecture],
[Client-Server Architecture]
These notes are additions, comments, and summaries of material
covered in Chapters 9 and 10 of Tanenbaum's textbook (you are not
required to know anything about Group Communication).
A classification often used for computer systems is due to Flynn:
- SISD: Single Instruction, Single Data: the traditional
computer, where a single instruction is executed at a time on
individual scalar values.
- SIMD: Single Instruction, Multiple Data: a Vector Machine.
It can, for example add in parallel the corresponding elements of two
vectors.
- MISD: Multiple Instructions, Single Data: Not a popular choice!
- MIMD: Multiple Instructions, Multiple Data: the concurrently executing
instructions each have their own arguments. MIMD systems are sometimes
subdivided into Multiprocessor Systems, in the case that they have
processors that share memory, and Multicomputers, i.e. systems that
consist of computers communicating through switches rather than
shared memory.
We do not consider any of these systems to be Distributed Systems.
A distributed system [for us] consists of computer systems that are
connected by networks; those other systems are parallel computers.
[Of course, there are other kinds of "things" that exhibit parallelism:
think of superscalar computers (for instance the Pentium), which issue
several regular instructions simultaneously.]
Suppose we run a program using a single processor and it takes time T(1).
We then run the program using n processors (we assume the program is written
to take advantage of the available number of processors) and it takes
time T(n). What is the relationship between T(1), T(n) and n? For example,
if the program takes 100 seconds on one processor, how long should it take
on four processors? Maybe 25 seconds? In fact T(n) at best should be T(1)/n.
[Cases where T(n) is less than T(1)/n, i.e. in our example less than 25
seconds, are called cases of superlinearity. Such cases can be
explained and analyzed separately in terms of OS or computational
characteristics.]
We call Computational Speedup s the ratio:
s = T(1)/T(n)
and Efficiency e the ratio:
e = s/n
Amdahl has suggested that any computation can be analyzed in terms
of a portion that must be executed sequentially, Ts, and a portion
that can be executed in parallel, Tp. Then
T(n) = Ts + Tp/n
This is known as Amdahl's Law. It says that we can improve only the
parallel part of a computation. Going back to our example, if the computation
requires 40 seconds of sequential computation and 60 of parallel computation,
then the best that T(n) will ever be is 40sec, even for n growing to infinity.
In practice there are many problems where the sequential components are minimal
and thus these problems can be effectively parallelized. In general though
speedup due to parallelism tends to be not too great. Some people suggest
that the speedup grows with the square root of n, i.e., if we quadruple
the number of processors, we only double the performance. [Notice that
in this case efficiency will change as 1 over the square root of n, i.e.
the more processors there are, the less efficiently they will be used.]
The maximum speedup assuming Amdahl's Law is for n going to infinity.
That is
s = (Ts + Tp)/Ts = 1 + Tp/Ts
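These formulas can be checked with a small sketch (the function names
are ours), reusing the example from the text: 40 seconds of sequential
computation and 60 of parallel computation.

```python
# Amdahl's Law: T(n) = Ts + Tp/n, speedup s = T(1)/T(n), efficiency e = s/n.
# Defaults reuse the example in the text (Ts = 40 s, Tp = 60 s).

def t(n, ts=40.0, tp=60.0):
    """Running time on n processors under Amdahl's Law."""
    return ts + tp / n

def speedup(n, ts=40.0, tp=60.0):
    return t(1, ts, tp) / t(n, ts, tp)

def efficiency(n, ts=40.0, tp=60.0):
    return speedup(n, ts, tp) / n
```

With these numbers t(4) = 40 + 60/4 = 55 seconds, so the speedup on
four processors is 100/55, about 1.8, and no number of processors can
push the speedup past (Ts + Tp)/Ts = 2.5.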
Here are some characteristics of communication primitives:
- Naming: how is a name associated with a communicating entity? Some
possibilities are: the name is hardwired, the name is chosen at random,
or there is a name server (i.e. somebody responsible for giving names).
- Blocking: does the sender wait until the message is sent, or just
enqueue it and go on? This is the distinction between synchronous and
asynchronous calls [semi-synchronous calls are asynchronous calls
where a signal is received at the time the message is actually sent].
- Buffering: what to do when a message arrives and there is no waiting
receiver.
- Reliability: whether messages are acknowledged or not.
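The blocking and buffering choices can be contrasted in a toy sketch
(the names mailbox, send_async, send_sync and receive are ours, not a
real message-passing API):

```python
import queue
import threading

# Buffering: the queue holds messages that arrive before a receiver is ready.
mailbox = queue.Queue()

def send_async(msg):
    """Asynchronous send: enqueue the message and go on immediately."""
    mailbox.put((msg, None))

def send_sync(msg):
    """Synchronous send: block until a receiver has taken the message."""
    delivered = threading.Event()
    mailbox.put((msg, delivered))
    delivered.wait()             # sender waits here

def receive():
    msg, delivered = mailbox.get()
    if delivered is not None:    # unblock a synchronous sender, if any
        delivered.set()
    return msg
```

An asynchronous sender may finish long before its message is received;
a synchronous sender cannot proceed until the matching receive happens.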
A Network Operating System is a network of computers with possibly
different operating systems that are able to easily communicate with each
other. Agent processes take care of the intercomputer communication. In a
NOS (1) each computer has a local OS, (2) users work on specific computers,
(3) there is no transparency, and (4) there is no extra reliability.
Example: SunOS.
A Distributed Operating System is a network of computers
that appears to the user as a single computer and manages all the resources
of the physical computers. The user is not aware of where computations or
data are located. The system is intended to have high reliability. Example:
Amoeba.
The textbook covers this topic. Here are a few observations. Standards
and protocols are used to define representation and interaction modes
and to make certain functionalities generally available. Standards and
protocols are continuously being introduced. They fit within general
frameworks called Architectures. Thus the OSI architecture is not a
specific set of protocols and standards; it is the definition of the
functionality of the protocols and standards that should be used. Of
course, in practice people tend to associate architectures with their
most popular standards and protocols.
There is a fundamental difference between the lowest three layers
(the communication subnet) and the
top four layers of the OSI architecture.
The bottom layers operate between directly connected hosts
or involve all the hosts in a path from sender to receiver.
The top four layers are end-to-end protocols, that is, the communication
is stated in terms of only the original sender and the final destination,
independent of how many intermediate hosts are traversed.
Intermediate nodes do not participate at all in the processing of the
higher-level protocols; to them it is just data.
This has a direct
impact on efficiency: for example, error checking in these protocols
is done only at the sender and receiver, not at the intermediate hosts.
Each message (or frame, or packet) consists of data being transmitted
plus information required by the protocol for addressing, error
detection, etc. This extra information appears as a header before that data
and (may be) a trailer after the data, i.e. the data is encapsulated
in the message.
The message sent at layer i will be transmitted as data by the layer below it.
Assuming that the layer below can transmit this data as a single message,
the entire layer-i message (header, data, and trailer) becomes the data
portion of the message at the layer below, which adds its own header
and trailer.
Note that the headers and trailers constitute transmission overhead, reducing
the utilization of the bandwidth of the communication channel. Of course this
is only part of the communication overhead: retransmissions and
acknowledgements further reduce bandwidth.
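Encapsulation can be illustrated with a toy frame format of our own
design (the field layout, sizes, and function names are invented for
the example, not any real protocol):

```python
import struct
import zlib

# A toy layer wraps the payload from the layer above with its own
# header (addressing and length) and trailer (a CRC for error detection).

def encapsulate(payload, src, dst):
    header = struct.pack("!BBH", src, dst, len(payload))  # src, dst, length
    trailer = struct.pack("!I", zlib.crc32(payload))      # error detection
    return header + payload + trailer

def decapsulate(frame):
    src, dst, length = struct.unpack("!BBH", frame[:4])
    payload = frame[4:4 + length]
    (crc,) = struct.unpack("!I", frame[4 + length:8 + length])
    assert crc == zlib.crc32(payload), "transmission error detected"
    return src, dst, payload
```

Here every payload pays a fixed 8 bytes of header-plus-trailer
overhead, which is exactly the bandwidth lost to the protocol itself.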
Communication can be connection-oriented or connectionless.
In the former case a connection is established between the communicating
agents and messages are exchanged on that connection. In the latter
each message is treated as a free-standing entity (datagram).
Related concepts at the communication level are circuit-switched
and packet-switched. In the former, communication between interlocutors
always follows the same path [this may be "forever" or it may be only
for a single connection (virtual circuit)].
In the latter, individual packets follow their own independent routes.
Another related concept is a session.
It consists of one or more connections.
For example, a program on machine A may be involved in communication
with a program on machine B when the connection drops due to communication
problems. When communication is reestablished the programs continue from
where they were in the session, using a new connection.
The transmission units at different layers may be of different maximum sizes.
We have here the same distinction that exists between "logical" and "physical"
records in file systems. One message at a layer may have to be fragmented over
multiple messages at a lower layer. This "packing" and "unpacking" can be
complex when different packets take different routes, arrive out of
order, or disappear completely.
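A minimal sketch of this packing and unpacking (the MTU value and the
function names are our own assumptions): each fragment carries its
offset so the receiver can reassemble even when fragments arrive out
of order.

```python
MTU = 8  # assumed maximum payload per lower-layer packet

def fragment(message):
    """Split a message into (offset, chunk) packets no larger than the MTU."""
    return [(i, message[i:i + MTU]) for i in range(0, len(message), MTU)]

def reassemble(packets):
    """Rebuild the message; packets may arrive in any order."""
    return b"".join(chunk for _, chunk in sorted(packets))
```

Handling fragments that disappear entirely is the hard part the sketch
omits: real protocols need timeouts and retransmission for that.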
Routing is the process by which a message moves from a
sender to a receiver across multiple intermediate hosts. It is a very
complex activity, worthy of its own separate course. Part of the
complexity is that we want a system that keeps on working while hosts
and links fail and come back on line at their own pace. Routers are
placed between neighboring networks to decide which of the incoming
packets should be kept on the local network and which should be moved
to the other network. It is not an optimal process: packets keep a hop
count that is incremented with each transfer. When it gets too big, as
in a loop, the packet is discarded.
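The hop-count rule can be sketched in a few lines (the limit MAX_HOPS
and the packet representation are our own assumptions, not a specific
protocol's):

```python
MAX_HOPS = 16  # assumed limit; real protocols choose their own value

def forward(packet):
    """Return the packet with its hop count incremented, or None when
    the count gets too big, as happens to a packet caught in a loop."""
    hops = packet["hops"] + 1
    if hops > MAX_HOPS:
        return None                    # discard the looping packet
    return {**packet, "hops": hops}
```

A packet bouncing forever between two misconfigured routers is thus
guaranteed to die after MAX_HOPS transfers instead of circulating
indefinitely.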
A number of issues should be kept in mind when analysing a protocol
and its implementation. Among them:
- Framing: How do we recognise the beginning
and end of a message (or packet, or frame, or ..);
- Flow Control:
How do we make sure we don't send the receiver more information
than it can handle, forcing it to "drop it on the floor";
- Multiplexing: How a single communication channel can be shared for
more than one conversation;
- Addressing: How do interlocutors address each other?
[As we have already
seen, host interfaces are known by their IP address (a 32-bit integer);
unfortunately on an Ethernet the address understood by the hardware is the
Ethernet address (a 48-bit integer) of the connector; thus there is need
of a way to translate from IP to Ethernet addresses. In general, this
requires an address resolution protocol.]
- Error Detection and Correction: what we do to detect if an error has
occurred in transmission and how to correct it.
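As a concrete instance of error detection, here is a sketch of the
one's-complement checksum used by the Internet protocols (the standard
algorithm; the function name is ours):

```python
def internet_checksum(data):
    """One's-complement sum of 16-bit words, as used for error
    detection in IP, TCP and UDP headers."""
    if len(data) % 2:
        data += b"\x00"                           # pad to an even length
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add each 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF
```

The defining property: if the sender appends the checksum to the data,
the receiver's checksum over the whole thing comes out zero, so any
single-bit error in transit is detected.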
Examples of protocols used at the various levels:
Application layer: telnet, ftp, finger, ..
Presentation layer: XDR
Session layer: RPC
Transport layer: TCP (connection oriented), UDP (connectionless)
Network layer: X.25 (connection oriented), IP (connectionless)
Data link layer: HDLC, Ethernet
Physical layer: RS-232-C, Ethernet
The TCP/IP architecture was developed mainly in the US. It is the one that
has the largest number of users. It represents pragmatic solutions to
problems as they arose. It involves only four layers:
the application layer (same as in OSI), the end-to-end layer
(TCP or UDP or ...), the internet layer (IP or ...),
and the net access layer (Ethernet, or ...).
Though the OSI architecture goes up only to the "application layer", much
work these days is happening above that level, in what people call
middleware, with things like the
Distributed Computing Environment (DCE)
and the Common Object Request Broker Architecture(CORBA).
These systems help create an infrastructure to support the construction
and use of distributed applications.
A typical way in which people organize distributed applications is
represented by the Client-Server Model. In this model
a program at some node acts as a Server which provides
some Service, things like a Name Service, a File Service,
a Data Base Service, etc. The Client sends requests to the Server and receives
replies from it. The Server can be Stateful or
Stateless, i.e. it may or may not remember information across different
requests from the same client. The Server can be iterative
or concurrent. In the former case it processes one request
at a time; in the latter it forks a separate process (or creates a separate
thread) to handle each request. The Client too can be
iterative or concurrent, depending on whether
it makes requests to different servers sequentially or concurrently. The
traditional way of interaction from the client to the server is with
Remote Procedure Calls (RPC).
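A concurrent server and its client can be sketched with ordinary
sockets (a toy echo-style service of our own; the names handle, server
and request are illustrative, and real servers need error handling the
sketch omits):

```python
import socket
import threading

def handle(conn):
    """Serve one client request, then close the connection."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"reply: " + data)

def server(sock):
    """Concurrent server: fork a thread per incoming connection."""
    while True:
        conn, _ = sock.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

def request(port, msg):
    """Client side: send one request and wait for the reply."""
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(msg)
        return c.recv(1024)
```

An iterative server would simply call handle(conn) inline instead of
starting a thread, serving one client at a time. This server is
stateless: nothing is remembered between requests.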
ingargiola.cis.temple.edu