\section{Protocols and Sessions}\label{sec:upi}

The two main classes of objects supported by the $x$-kernel are {\it
protocols} and {\it sessions}. Protocol objects represent just what
you would expect---protocols such as IP or TCP. Session objects
represent the local end-point of a channel, and as such, typically
implement the code that interprets messages and maintains any state
associated with the channel. The protocol objects available in a
particular network subsystem, along with the relationships among these
protocols, are defined by a protocol graph at the time a kernel is
configured.  Session objects are created and destroyed dynamically as
channels are opened and closed.  Loosely speaking, protocol objects export
operations for opening channels---resulting in the creation of a
session object---and session objects export operations for sending and
receiving messages.

The set of operations exported by protocol and session objects is
called the {\it uniform protocol interface}---it defines how protocols
and sessions invoke operations on each other. At this stage, the
important thing to know about the uniform protocol interface is that
it specifies how high-level objects invoke operations on low-level
objects to send outgoing messages, as well as how low-level objects
invoke operations on high-level objects to handle incoming messages.
For example, consider the specific pair of protocols TCP and IP in the
Internet architecture. TCP sits directly above IP in this
architecture, so the $x$-kernel's uniform protocol interface defines
the operations that TCP invokes on IP, as well as the operations IP
invokes on TCP, as illustrated in Figure~\ref{upi}.

Keep in mind that the following discussion defines a common {\it
interface} between protocols, that is, the operations one protocol is
allowed to invoke on the other. This is only half the story, however.
The other half is that each protocol has to provide a routine that
{\it implements} this interface. Thus, for an operation like {\var
xOpen}, protocols like TCP and IP must each include a routine that
implements {\var xOpen}; by convention, we name these routines {\var
tcpOpen} and {\var ipOpen}, respectively. Therefore, the following
discussion not only defines the interface to each operation, but it
also gives a rough outline of what every protocol's implementation of
that operation is expected to do.
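
One common way to connect a generic interface operation to per-protocol
routines in C is a table of function pointers. The following is a
minimal sketch under that assumption, not the $x$-kernel's actual
implementation: the struct layouts, the two-argument {\var sketchOpen},
and the single embedded session per protocol are hypothetical
simplifications introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the x-kernel types; the real
 * Protl and Sessn definitions are not shown in this section. */
typedef struct Protl Protl;
typedef struct Sessn Sessn;

struct Sessn {
    Protl *owner;                 /* protocol that created this session */
};

struct Protl {
    const char *name;
    /* Per-protocol routine that implements the generic open operation. */
    Sessn *(*open)(Protl *self, Protl *hlp);
    Sessn  session;               /* one embedded session, for brevity */
};

/* Protocol-specific routines, named by the tcpOpen/ipOpen convention. */
static Sessn *ipOpen(Protl *self, Protl *hlp)
{
    (void)hlp;                    /* unused in this sketch */
    self->session.owner = self;
    return &self->session;
}

static Sessn *tcpOpen(Protl *self, Protl *hlp)
{
    (void)hlp;                    /* unused in this sketch */
    self->session.owner = self;
    return &self->session;
}

/* Generic interface operation: forwards to whatever routine llp
 * registered. The real xOpen takes more arguments than this. */
Sessn *sketchOpen(Protl *hlp, Protl *llp)
{
    assert(llp != NULL && llp->open != NULL);
    return llp->open(llp, hlp);
}

Protl ipProtl  = { "ip",  ipOpen,  { NULL } };
Protl tcpProtl = { "tcp", tcpOpen, { NULL } };
```

With these definitions, {\var sketchOpen(\&tcpProtl, \&ipProtl)} ends up
in {\var ipOpen} without the caller naming that routine, which is the
separation between interface and implementation described above.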

\begin{figure}[ht]
\centering
\leavevmode\hbox{\epsfig{file=upi.ps,height=1.5in}}
\caption{Uniform Protocol Interface.}\label{upi}
\end{figure}

\subsection{Configuring a Protocol Graph}\label{sec:config}

Before presenting the operations that protocol and session objects
export, we first explain how a protocol programmer configures a
protocol graph. Standardization bodies like the ISO and the IETF
define a particular network architecture that includes a specific set
of protocols. In the Internet architecture, for example, TCP depends
on IP, by definition.  This suggests that it is possible to
``hard-code'' TCP's dependency on IP into the TCP implementation.
While this could be done in the case of TCP, the $x$-kernel supports a
more flexible mechanism for configuring a protocol graph. This makes
it easy to plug protocols together in different ways. This flexibility
is powerful, but one has to be careful that it actually makes sense for
any two protocols placed adjacent to each other in the protocol graph
to be composed.

Quite simply, a user who wants to configure a protocol graph
specifies the graph with a text file, called {\var graph.comp}, of the
following form:

\begin{quote}
{\var name=lance;}\\
{\var name=eth protocols=lance;}\\
{\var name=arp protocols=eth;}\\
{\var name=ip protocols=eth,arp;}\\
{\var name=icmp protocols=ip;}\\
{\var name=udp protocols=ip;}\\
{\var name=tcp protocols=ip;}
\end{quote}

\noindent This specification results in the protocol graph depicted in
Figure~\ref{prot_graph}. In this example graph, {\var lance} and {\var
eth} combine to implement an Ethernet device driver: {\var lance} is
the device-specific half and {\var eth} is the device-independent
half. Also, {\var arp} is the Address Resolution Protocol (it is used
to translate IP addresses into Ethernet addresses) and {\var icmp} is
the Internet Control Message Protocol (it sends error messages on
behalf of TCP and IP). The {\var name} field in each line specifies a
protocol (by name) and the {\var protocols} field says which other
protocols this protocol depends on. Not shown in this example is a
{\var dir} field that identifies the directory where the named
protocol implementation can be found; by default, it is the same as
the name of the protocol.  The $x$-kernel build program, called {\var
compose}, parses this specification, and generates some C code that
initializes the protocol graph when the system is booted.
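
To make the file format concrete, here is a small sketch of how one
line of {\var graph.comp} could be parsed in C. It is not the code used
by {\var compose}, whose internals are not described here; the {\var
GraphEntry} record, the size limits, and the function {\var
parseGraphLine} are assumptions made for this example, and the {\var
dir} field is ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 32
#define MAX_DEPS 8

/* Hypothetical record for one line of graph.comp. */
typedef struct {
    char name[MAX_NAME];
    char deps[MAX_DEPS][MAX_NAME];   /* protocols this one depends on */
    int  ndeps;
} GraphEntry;

/* Parse "name=X;" or "name=X protocols=A,B;".
 * Returns 0 on success and -1 on a malformed line. */
int parseGraphLine(const char *line, GraphEntry *out)
{
    char list[128] = "";
    int  n;

    assert(line != NULL && out != NULL);
    out->ndeps = 0;

    /* The protocols field is optional, so one conversion is enough. */
    n = sscanf(line, " name=%31[^ ;] protocols=%127[^ ;]", out->name, list);
    if (n < 1)
        return -1;

    /* Split the comma-separated dependency list, if one was present. */
    for (char *tok = strtok(list, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (out->ndeps == MAX_DEPS || strlen(tok) >= MAX_NAME)
            return -1;
        strcpy(out->deps[out->ndeps++], tok);
    }
    return 0;
}
```

A build tool could call such a routine once per line and then emit
initialization code that visits each protocol after the protocols it
depends on.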

\begin{figure}[ht]
\centering
\leavevmode\hbox{\epsfig{file=prot_graph.ps,height=2.5in}}
\caption{Example Protocol Graph.}\label{prot_graph}
\end{figure}

\subsection{Operations on Protocol Objects}\label{sec:protl}

The primary operation exported by a protocol object allows a
higher-level entity to {\it open} a channel to its peer. The return value
from an open operation is a session object. Details about the session
object are discussed in the following subsection. For now, think of a
session as a convenient object for gaining access to the channel;
the module that opened the session object can send and receive
messages using the session object. An object that lets us gain access
to something abstract is sometimes called a {\em handle}---you can
think of it as the thing that makes it easy to grab something that is
otherwise quite slippery. Thus, a session object provides a handle on a
channel. 

In the following discussion, we need a generic way to refer to the
entity that opens a channel, since sometimes it is an application
program, and sometimes it is another protocol.  We use the term {\it
participant} for this purpose. That is, we think in terms of a pair of
participants at level $i$ communicating over a channel implemented by
a protocol at level $i-1$ (where the level number decreases as you
move down the stack).

Opening a channel is an asymmetric activity.  Generally, one
participant initiates the channel (we call it the client).  This
local participant is therefore able to identify the remote
participant, and is said to do an {\it active} open.  In contrast, the
other participant accepts the channel (we call it the server).  This
participant does not know what clients might try to talk to it until
one of them actually makes contact. The server, therefore, does a {\it
passive} open---it says to the lower-level protocol that it is willing
to accept a channel, but does not say with whom the channel will be.

Thus, the exact form of the open operation depends on whether the
higher-level entity is doing an active open or a passive open. In the
case of an active open, the operation is:

\begin{quote}
{\var Sessn xOpen(Protl hlp, Protl hlpType, Protl llp, Part *participants)}
\end{quote}

\noindent This operation says that high-level protocol {\var hlp}
is opening low-level protocol {\var llp} so as to establish a channel
with the specified {\var participants}. For a typical channel between
a pair of participants, this last argument would contain both the
local participant's address and the remote participant's address.  The
low-level protocol does whatever is necessary to establish the
channel, which quite often implies opening a channel on a still
lower-level protocol. Notice that {\var Protl} and {\var Sessn} are
the C type definitions for protocol and session objects, respectively.

The {\var hlpType} argument to {\var xOpen} is a bit subtle. What is
really happening is that {\var hlp} is opening a session associated
with {\var llp} {\it on behalf of} high-level protocol {\var hlpType}.
Typically, {\var hlp} and {\var hlpType} refer to the same protocol,
although as we will see in Section~\ref{virtual}, there are some cases
in which {\var hlp} and {\var hlpType} are not equivalent.

A high-level protocol passively opens a low-level protocol with a pair
of operations:

\begin{quote}
{\var XkReturn xOpenEnable(Protl hlp, Protl hlpType, Protl llp, Part *participant)}\\
\\
{\var XkReturn xOpenDone(Protl hlp, Protl llp, Sessn session, Part *participants)}
\end{quote}

\noindent {\var xOpenEnable} is used by high-level protocol {\var hlp}
to inform low-level protocol {\var llp} that it is willing to accept a
connection. (Argument {\var hlpType} has the same meaning as in {\var
xOpen}.) In this case, the high-level protocol usually specifies only
a single {\var participant}---itself. The {\var xOpenEnable} operation
returns immediately; it does not block waiting for a remote site to
try to connect to it. The low-level protocol remembers this enabling;
when some remote participant subsequently connects to the low-level
protocol, {\var llp} calls the high-level protocol's {\var xOpenDone}
operation to inform it of this event. The low-level protocol {\var
llp} passes the newly created {\var session} as an argument to
high-level protocol {\var hlp}, along with the complete set of {\var
participants}, thereby informing the high-level protocol of the
address for the remote entity that just connected to it. {\var
XkReturn} is the return value of all the uniform protocol interface
operations presented here except for {\var xOpen} and {\var xPush};
it indicates whether the operation was successful ({\var XK\_SUCCESS})
or not ({\var XK\_FAILURE}).
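
The passive-open handshake can be sketched in C as follows. This is an
illustration under stated assumptions rather than the $x$-kernel code:
the struct fields, the numeric values of {\var XK\_SUCCESS} and {\var
XK\_FAILURE}, the integer {\var remoteAddr} standing in for a
participant list, and all {\var sketch...} names are hypothetical, and
the low-level protocol remembers only a single enable.

```c
#include <assert.h>
#include <stddef.h>

/* Numeric values are assumed; only the names appear in the text. */
typedef enum { XK_FAILURE = -1, XK_SUCCESS = 0 } XkReturn;

typedef struct Protl Protl;
typedef struct Sessn { Protl *owner; int remoteAddr; } Sessn;

struct Protl {
    /* Kept by a low-level protocol: who has enabled passive opens. */
    Protl *enabledHlp;
    /* Supplied by a high-level protocol to learn of new sessions. */
    XkReturn (*openDone)(Protl *self, Protl *llp, Sessn *s);
    /* Kept by a high-level protocol: the last session handed to it. */
    Sessn *accepted;
    Sessn  slot;                      /* storage for the one session */
};

/* hlp tells llp it will accept a connection; returns without blocking. */
XkReturn sketchOpenEnable(Protl *hlp, Protl *llp)
{
    assert(hlp != NULL && llp != NULL);
    if (llp->enabledHlp != NULL)
        return XK_FAILURE;            /* this sketch allows one enable */
    llp->enabledHlp = hlp;
    return XK_SUCCESS;
}

/* Runs inside llp when a remote participant connects (simulated). */
XkReturn sketchIncomingConnect(Protl *llp, int remoteAddr)
{
    Protl *hlp = llp->enabledHlp;
    if (hlp == NULL || hlp->openDone == NULL)
        return XK_FAILURE;            /* nobody is willing to accept */
    llp->slot.owner = llp;
    llp->slot.remoteAddr = remoteAddr;
    return hlp->openDone(hlp, llp, &llp->slot);   /* upcall to hlp */
}

/* Example high-level handler: just remember the new session. */
XkReturn serverOpenDone(Protl *self, Protl *llp, Sessn *s)
{
    (void)llp;
    self->accepted = s;
    return XK_SUCCESS;
}
```

Note that the high-level protocol never waits: it registers interest,
returns, and is later called back with the session and the address of
the peer that connected.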

In addition to these operations for opening a connection, $x$-kernel
protocol objects also support an operation for demultiplexing incoming
messages to the appropriate channel (session). In this case, a
low-level session invokes this operation on the high-level protocol
that at some earlier time had opened it. The operation is:

\begin{quote}
{\var XkReturn xDemux(Protl hlp, Sessn lls, Msg *message)}
\end{quote}

\noindent It will be easier to understand how this operation is used
after we look at session objects in more detail.

\subsection{Operations on Session Objects}\label{sec:sessn}

As already explained, a session can be thought of as a handle on a
channel that is implemented by some protocol. One can also view it as
an object that exports a pair of operations: one for sending messages,
and one for receiving messages:

\begin{quote}
{\var XkHandle xPush(Sessn lls, Msg *message)}\\ 
\\
{\var XkReturn xPop(Sessn hls, Sessn lls, Msg *message, void *hdr)}
\end{quote}

\noindent The implementation of {\var xPush} and {\var xPop} is where 
the real work of a protocol is carried out---it's where headers are
added to and stripped from messages, and then interpreted. In short,
these two routines implement the algorithm that defines the protocol.

The operation of {\var xPush} is fairly straightforward. It is invoked
by a high-level session to pass a {\var message} down to some
low-level session ({\var lls}) that it had opened at some earlier
time. {\var lls} then goes off and does what is needed with the
message---perhaps using {\var xPush} to pass it down to a still
lower-level session. This is illustrated in Figure~\ref{bsdpush}. In this
figure we see three sessions, each of which implements one protocol in
a stack, passing a message down the stack using {\var xPush}.

Passing messages back up the stack using {\var xPop} is more
complicated. The main problem is that a session does not know what
session is above it---all it knows is the protocol that is above it.
So, a low-level session {\var lls} invokes the {\var xDemux} routine
of the protocol above it. That protocol, since it did the work of
opening the high-level session {\var hls} to which this message needs
to go, is able to pass the message to {\var hls} using its {\var xPop}
routine. How does a protocol's {\var xDemux} routine know which of its
potentially many sessions to pass the message up to? It uses the {\it
demultiplexing key} found in that protocol's header on the incoming
message.

In addition to the {\var hls} that is being called and the {\var
message} being passed to it, {\var xPop} takes two other arguments.
First, {\var lls} identifies the low-level session that handed up this
message via {\var xDemux}. Second, since {\var xDemux} has to inspect
the message header to select the session on which to call {\var
xPop}---i.e., it has already gone to the effort of extracting the
protocol's header---it passes the header ({\var hdr}) as the final
argument to {\var xPop}. This chain of events is illustrated in
Figure~\ref{bsdpop}.

To see how this works in practice, imagine we want to send a message
using the TCP and IP protocols. An application program opens a channel
by performing {\var xOpen} on TCP; TCP returns a session object to the
application. TCP opens an IP channel by performing {\var xOpen} on IP; IP
returns a session object to TCP. When the application wants to send a
message, it invokes the {\var xPush} operation of the TCP session; this
session in turn invokes the {\var xPush} operation of the IP session, which
ultimately causes the message to be sent.

Now suppose an incoming message is delivered to the IP session. This
session has no idea about the TCP session above it, so it does the
only thing it knows how to do---it passes the message up to the TCP
{\em protocol} using {\var xDemux}. The TCP protocol knows about all
the TCP sessions, and so passes the message up to the appropriate TCP
session using {\var xPop}. TCP's {\var xDemux} uses the demux key it
found in the TCP header to select among all the TCP sessions.
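
The round trip just described can be sketched in C with a toy two-layer
stack. Everything here is a simplification assumed for illustration and
is not the $x$-kernel's message or session representation: the message
is a flat buffer, the upper protocol's header is a single byte holding
the demux key, the lower layer is a loopback ``wire,'' and the {\var
sketch...} functions take fewer arguments than {\var xPush}, {\var
xDemux}, and {\var xPop}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SESSNS 4
#define MAX_MSG    64

/* Hypothetical message: first byte is the upper protocol's header. */
typedef struct { unsigned char data[MAX_MSG]; size_t len; } Msg;

typedef struct {
    unsigned char key;                 /* demux key for this session */
    unsigned char last[MAX_MSG];       /* payload most recently delivered */
    size_t        lastLen;
} Sessn;

typedef struct {
    Sessn  sessns[MAX_SESSNS];
    size_t count;
} Protl;

/* Stand-in for the lowest layer: "sending" copies onto a fake wire. */
static Msg wire;
static void lowerPush(const Msg *m) { wire = *m; }

/* Push: prepend this session's header, then hand the message down. */
int sketchPush(const Sessn *hls, const void *payload, size_t len)
{
    Msg m = { {0}, 0 };
    if (len + 1 > MAX_MSG)
        return -1;
    m.data[0] = hls->key;
    memcpy(m.data + 1, payload, len);
    m.len = len + 1;
    lowerPush(&m);
    return 0;
}

/* Pop: the session consumes the payload; demux already parsed hdr. */
static int sketchPop(Sessn *hls, const Msg *m, const unsigned char *hdr)
{
    assert(*hdr == hls->key);
    hls->lastLen = m->len - 1;
    memcpy(hls->last, m->data + 1, hls->lastLen);
    return 0;
}

/* Demux: called by the layer below; selects a session by header key. */
int sketchDemux(Protl *hlp, const Msg *m)
{
    if (m->len < 1)
        return -1;
    for (size_t i = 0; i < hlp->count; i++)
        if (hlp->sessns[i].key == m->data[0])
            return sketchPop(&hlp->sessns[i], m, &m->data[0]);
    return -1;                         /* no session for this key */
}

/* Accessor so a caller can loop the fake wire back up the stack. */
const Msg *sketchWire(void) { return &wire; }
```

The sender needs only the session it holds, whereas the receiver must
search the protocol's sessions by key before it can call the pop
routine; that search is the work {\var xDemux} does at each level.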

The final operation that we need to be able to perform is one to close
a session, which in effect closes the channel to the other machine.

\begin{quote}
{\var XkReturn xClose(Sessn session)}
\end{quote}
 
\begin{figure}[ht]
\centering
\leavevmode\hbox{\epsfig{file=bsdpush.ps,height=2in}}
\caption{Using {\var xPush} to pass a message down a stack.}\label{bsdpush}
\end{figure}

\begin{figure}[ht]
\centering
\leavevmode\hbox{\epsfig{file=bsdpop.ps,height=2.0in}}
\caption{Using {\var xDemux} and {\var xPop} to pass a message up a stack.}\label{bsdpop}
\end{figure}

In addition to sessions and protocols, this discussion has introduced
two other $x$-kernel object classes: messages and participants. Both
classes represent exactly what you think they do---the messages that
protocols send to each other (corresponding to type definition {\var
Msg}), and the addresses of participants that are communicating over
some channel (corresponding to type definition {\var Part}). Message
and participant objects are discussed in more detail in later sections.

\subsection{Asynchronous versus Synchronous Protocols}

As described so far, the $x$-kernel supports {\it asynchronous}
protocols---protocols that do not block waiting for a reply from their
peer. Some protocols, however, are {\it synchronous}---the caller
blocks until a reply can be returned. Clearly, the {\var
xPush/xDemux/xPop} paradigm just described is not going to work for
synchronous protocols since it makes no provision for a return value.
The $x$-kernel accommodates synchronous protocols by providing a
parallel set of operations for sending and receiving messages:

\begin{quote}
{\var XkReturn xCall(Sessn session, Msg *request, Msg *reply)}\\
\\
{\var XkReturn xCallPop(Sessn session, Msg *request, Msg *reply, void *hdr)}\\
\\
{\var XkReturn xCallDemux(Protl hlp, Sessn session, Msg *request, Msg *reply)}
\end{quote}

\noindent The key difference, of course, is that each operation now 
returns a reply message ({\var reply}); for clarity, we refer to the
message given as an argument as {\var request}. The operations are
synchronous in the sense that each cannot return until the reply
message is available.

So far so good: some protocols are purely asynchronous (they export
{\var xPush}, {\var xPop} and {\var xDemux} operations), and some are
purely synchronous (they export {\var xCall}, {\var xCallPop} and {\var
xCallDemux} operations). However, if every protocol were either purely
asynchronous or purely synchronous, then the entire protocol graph
would have to consist of only one kind: an asynchronous protocol can
only call {\var xPush} on an adjacent protocol, meaning it could never
be composed with a synchronous protocol.

Fortunately, there can be hybrid protocols that are half synchronous
and half asynchronous. This does not mean that they support all six
operations, but rather that they look like a synchronous protocol to
higher-level protocols, and like an asynchronous protocol to lower-level
protocols. Such a protocol supports the {\var xCall} operation rather
than {\var xPush} on top, while from below it still supports the
asynchronous {\var xDemux/xPop} interface. In other words, it turns an
underlying asynchronous communication service into a synchronous
communication service. It does this by having the sending process
(caller) block on a semaphore until the reply message arrives.

\subsection{Process Models for Protocols}

As we have said, protocol implementors typically have to be concerned
about a lot of operating system issues. This subsection introduces one
of the most important of these issues---the process model.

Most operating systems provide an abstraction called a {\em process},
or alternatively, a {\it thread}.  Each process runs largely
independently of other processes, and the OS is responsible for making
sure that resources, such as address space and CPU cycles, are
allocated to all the current processes. The process abstraction makes
it fairly straightforward to have a lot of things executing
concurrently on one machine; for example, each user application might
execute in its own process, and various things inside the OS might
execute as other processes. When the OS stops one process from
executing on the CPU and starts up another one, we call this a {\em
context switch}.

When designing a protocol implementation framework, one of the first
questions to answer is: ``Where are the processes?''  There are
essentially two choices, as illustrated in Figure~\ref{process}. In
the first, which we call the {\it process-per-protocol} model, each
protocol is implemented by a separate process.  This implies that as a
message moves up or down the protocol stack, it is passed from one
process/protocol to another---the process that implements protocol $i$
processes the message, then passes it to protocol $i-1$, and so on.
How one process/protocol passes a message to the next process/protocol
depends on the support the host OS provides for interprocess
communication.  Typically, there is a simple mechanism for enqueuing a
message with a process.  The important point, however, is that a
context switch is required at each level of the protocol
graph---typically a time-consuming operation.

\begin{figure}[ht]
\centering
\leavevmode\hbox{\epsfig{file=process.ps,height=3.0in}}
\caption{Alternative Process Models.}\label{process}
\end{figure}

The alternative, which we call the {\it process-per-message} model,
treats each protocol as a static piece of code, and associates the
processes with the messages. That is, when a message arrives from the
network, the OS dispatches a process to be responsible for the message
as it moves up the protocol graph. At each level, the procedure that
implements that protocol is invoked, which eventually results in the
procedure for the next protocol being invoked, and so on. For
out-bound messages, the application's process invokes the necessary
procedure calls until the message is delivered.  In both directions,
the protocol graph is traversed in a sequence of procedure calls.

Although the process-per-protocol model is sometimes easier to think
about---I implement my protocol in my process and you implement your
protocol in your process---the process-per-message model is generally
more efficient. This is for a simple reason: a procedure call is an
order of magnitude more efficient than a context switch on most
computers. The former model requires the expense of a context switch
at each level, while the latter model costs only a procedure call per
level.

The $x$-kernel uses the process-per-message model. Tying this model
back to the operations outlined above, this means that once a session
(channel) is open at each level, a message can be sent down the
protocol stack by a sequence of calls to {\var xPush}, and up the
protocol stack by alternating calls to {\var xDemux} and {\var xPop}.
This asymmetry---{\var xPush} going down and {\var xDemux}/{\var xPop}
going up---is unappealing, but necessary. When sending a message out,
each layer already knows which low-level session to invoke {\var
xPush} on, since there is only one choice. In the incoming case, by
contrast, the {\var xDemux} routine at each level must first
demultiplex the message to decide which session's {\var xPop} to call.

Notice that the high-level protocol does not reach down and {\it
receive} a message from the low-level protocol. Instead, the low-level
protocol does an {\em upcall}---a procedure call up the stack---to
deliver the message to the high-level protocol. This is because a
receive-style operation would imply that the high-level protocol is
executing in a process that is waiting for new messages to arrive,
which would then result in a costly context switch between the
low-level and high-level protocols. By having the low-level protocol
deliver the message to the high-level protocol, incoming messages can
be processed by a sequence of procedure calls, just as outgoing
messages are.

We conclude this discussion of processes by introducing three
operations that the $x$-kernel provides for process synchronization:

\begin{quote}
{\var void semInit(Semaphore *s, int count)}\\
\\
{\var void semSignal(Semaphore *s)}\\
\\
{\var void semWait(Semaphore *s)}
\end{quote} 

\noindent These operations implement conventional counting semaphores.
Specifically, every invocation of {\var semSignal} increments
semaphore {\var s} by one, and every invocation of {\var semWait}
decrements {\var s} by one, with the calling process blocked
(suspended) if decrementing {\var s} causes its value to become
less than zero. A process that is blocked during its call to {\var
semWait} is allowed to resume once a matching {\var semSignal} has been
performed; each {\var semSignal} releases at most one blocked process,
so a negative value of {\var s} counts the processes still waiting.
Operation {\var semInit} initializes the value of {\var s} to
{\var count}. 
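
For readers who want to see these semantics in code, the following is
one possible sketch of a counting semaphore whose value may go
negative, built on a POSIX mutex and condition variable. It assumes a
POSIX threads environment and is not the $x$-kernel's implementation,
which is not shown in this text; the type {\var SketchSem}, the {\var
sketchSem...} function names, and the {\var wakeups} field are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical semaphore; not the x-kernel's Semaphore type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int count;      /* when negative, -count processes are blocked */
    int wakeups;    /* signals not yet consumed by a blocked waiter */
} SketchSem;

void sketchSemInit(SketchSem *s, int count)
{
    assert(s != NULL);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->count = count;
    s->wakeups = 0;
}

/* Decrement; block if the value has become less than zero. */
void sketchSemWait(SketchSem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count--;
    if (s->count < 0) {
        /* Sleep until some signal has left a wakeup for a waiter. */
        do {
            pthread_cond_wait(&s->cond, &s->lock);
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

/* Increment; if a process was blocked, release exactly one. */
void sketchSemSignal(SketchSem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    if (s->count <= 0) {
        s->wakeups++;
        pthread_cond_signal(&s->cond);
    }
    pthread_mutex_unlock(&s->lock);
}
```

A hybrid protocol of the kind described earlier could initialize such a
semaphore to zero, wait on it after pushing a request, and signal it
from the routine that receives the reply.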
