michael.eichberg@dhbw.de, Room 149B
1.0
The programming abstractions offered by middleware hide some of the heterogeneity and manage some of the complexity that programmers of a distributed application have to deal with:
Middleware always masks the heterogeneity of the underlying networks and hardware.
Middleware usually masks the heterogeneity of operating systems and/or programming languages.
Some middleware even masks the heterogeneity between implementations of the same middleware standard by different vendors.
Middleware ...
A software layer above the operating system and below the application program that provides a common programming abstraction in a distributed system.
A building block at a higher level than the APIs provided by the operating system (e.g. sockets)
History
Old middleware standards - such as CORBA - were very complex and the implementations of different manufacturers were usually not fully compatible.
Middleware provides transparency (when programming) in relation to one or more of the following dimensions:
Location
Concurrency
Replication
Failures (but only to a limited degree)
Behind programming abstractions is a complex infrastructure that implements these abstractions
Middleware platforms can be very complex software systems.
As the programming abstractions reach ever higher levels, the underlying infrastructure that implements the abstractions must grow accordingly.
Additional functionality is almost always implemented through additional software layers.
The additional software layers increase the scope and complexity of the infrastructure required to use the new abstractions.
For decades, middleware has repeatedly grown more complex until the complexity was barely manageable. At those points, new approaches were often developed that reduced the complexity, until these approaches were in turn absorbed into more complex middleware products.
Approaches such as REST have proven to be quite successful, but present developers with new challenges.
Non-functional Requirements
The infrastructure takes care of non-functional properties that are normally ignored by data models, programming models and programming languages:
Performance
Availability
Resource management
Reliability
etc.
Supporting Development and Operations
Middleware supports additional functions that make development, maintenance and monitoring easier and more cost-effective (excerpt):
Logging
Recovery
(Programming) Language primitives for transactional delimitation
(e.g., advanced transaction models such as transactional RPC, or transactional file systems)
Today, the generation of stubs and skeletons - if required at all - is typically carried out automatically by the middleware.
Middleware aims to hide the details of hardware, networks and distribution at a low level.
Continuing trend towards ever more powerful primitives (events) that have additional properties or allow more flexible use of the concept.
How middleware evolves and how it appears to the programmer is dictated by the trends in programming languages:
RPC and C
CORBA and C++
RMI (CORBA) and Java
"Classic" web services and XML
RESTful web services and JSON
Focus: hiding network communication.
A process can call a procedure whose implementation is located on a remote computer:
Distributed system programmers no longer have to worry about all the details of network programming (i.e. no more "explicit" sockets).
Bridging the conceptual gap between calling local functionality via procedures and calling remote functionality via sockets.
A server is a program that implements certain services.
Clients want to use these services:
Communication takes place by sending messages (no shared memory, no shared disks, etc.)
Some minimal guarantees must be given (handling of errors, call semantics, etc.)
Question
Should remote calls be transparent or non-transparent for the developer?
Remark
A remote call is completely different from a local call; should the programmer be aware of this?
Question
How can data be exchanged between machines that may use different representations for different data types?
Complex data types must be linearized:
Marshalling: the process of preparing the data into a form suitable for transmission in a message.
Unmarshalling: the process of restoring the data on arrival at its destination in order to obtain a faithful representation.
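The two steps can be sketched as follows. This is only an illustration: the record layout (an id, a price and a name) and the function names `marshal` and `unmarshal` are assumptions and do not come from any particular middleware. Network byte order is used so that machines with different native representations agree on the format.

```python
import struct

# Hypothetical record layout: id (unsigned 32-bit), price (64-bit float),
# and the byte length of the UTF-8 encoded name (unsigned 16-bit).
# "!" selects network byte order (big-endian) without padding.
_HEADER = struct.Struct("!IdH")


def marshal(record_id: int, price: float, name: str) -> bytes:
    """Linearize the record into bytes suitable for a message."""
    encoded_name = name.encode("utf-8")
    return _HEADER.pack(record_id, price, len(encoded_name)) + encoded_name


def unmarshal(data: bytes) -> tuple[int, float, str]:
    """Restore the record from the bytes received at the destination."""
    record_id, price, name_length = _HEADER.unpack_from(data, 0)
    start = _HEADER.size
    name = data[start:start + name_length].decode("utf-8")
    return record_id, price, name


message = marshal(42, 34.5, "DIS")
print(unmarshal(message))
```

Both sides must agree on the layout in advance; in RPC systems this agreement is typically derived from an interface description.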
How do you find and bind the service you actually want in a potentially large collection of services and servers?
Remark
The aim is that the client does not necessarily need to know where the server is located or even which server offers the service (location transparency).
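A directory service that supports this kind of lookup could look roughly like the following sketch. The class `Directory`, the service name and the endpoints are hypothetical; real naming services add replication, caching and failure handling.

```python
class Directory:
    """A toy naming service: maps service names to server endpoints."""

    def __init__(self) -> None:
        self._endpoints: dict[str, list[str]] = {}

    def register(self, service: str, endpoint: str) -> None:
        """Called by a server to announce that it offers a service."""
        self._endpoints.setdefault(service, []).append(endpoint)

    def lookup(self, service: str) -> str:
        """Called by a client that only knows the name of the service."""
        endpoints = self._endpoints.get(service)
        if not endpoints:
            raise LookupError(f"no server offers {service!r}")
        return endpoints[0]  # simplest policy: first registered server


directory = Directory()
directory.register("stock-quotes", "tcp://10.0.0.7:5556")
directory.register("stock-quotes", "tcp://10.0.0.8:5556")

# The client binds to whatever endpoint the directory returns.
endpoint = directory.lookup("stock-quotes")
print(endpoint)
```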
How to deal with errors more or less gracefully:
Server is down
Communication is disrupted
Server is busy
Duplicate requests ...
For programmers, a "remote" procedure call looks and works almost identically to a "local" procedure call - this is how transparency is achieved.
To achieve transparency, RPC introduced many concepts of middleware systems:
Interface Description Language (IDL)
Directory and naming services
Dynamic binding
Marshalling and unmarshalling
Opaque references to refer to the same data structure or entity on the server for different calls.
(The server is responsible for providing these opaque references).
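How a stub and a skeleton make a remote procedure look like a local one can be sketched as follows. The classes `Stub` and `Skeleton`, the JSON message format and the in-process "transport" are assumptions made for illustration; a real RPC system would send the request over the network and generate both sides from an IDL.

```python
import json


class Skeleton:
    """Server side: unmarshals a request, calls the procedure, marshals the reply."""

    def __init__(self, procedures):
        self._procedures = procedures

    def handle(self, request: bytes) -> bytes:
        call = json.loads(request)
        try:
            result = self._procedures[call["proc"]](*call["args"])
            reply = {"ok": True, "result": result}
        except Exception as exc:  # report the failure to the caller
            reply = {"ok": False, "error": str(exc)}
        return json.dumps(reply).encode("utf-8")


class Stub:
    """Client side: turns a method call into a request message."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def __getattr__(self, proc):
        def remote_call(*args):
            request = json.dumps({"proc": proc, "args": args}).encode("utf-8")
            reply = json.loads(self._transport(request))
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return remote_call


skeleton = Skeleton({"add": lambda a, b: a + b})
stub = Stub(skeleton.handle)  # stands in for a socket-based transport
print(stub.add(2, 3))
```

The caller writes `stub.add(2, 3)` as if `add` were local, yet the call can still fail in ways a local call cannot, which is why errors surface as exceptions here.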
Suppose a client makes an RPC request to a service of a particular server. After the timeout expires, the client decides to resend the request. The final behavior depends on the semantics of the call (aka Call Semantics):
Maybe (no guarantee)
The target method may have been executed and the response message(s) were lost or the method was not executed at all because the request was lost.
XMLHttpRequest and fetch() in web browsers use this semantics.
At least once
The procedure will be executed as long as the server does not fail permanently.
However, it is possible that it will be executed more than once if the client has resent the request after a timeout.
At most once
The procedure is either executed once or not at all. Sending the request again does not result in the procedure being executed more than once.
Exactly once
The system guarantees the same semantics as for local calls under the assumption that a crashed server will restart at some point.
Orphaned calls, i.e. calls on crashed servers, are retained so that they can later be taken over by a new server.
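The difference between "at least once" and "at most once" can be illustrated with a small simulation. Everything here is a sketch under assumptions: the `Server` class, the request ids and the simulated lost replies are hypothetical, and duplicate filtering via a reply cache is only one possible way to obtain at-most-once behavior.

```python
class Server:
    """Counts executions; optionally filters duplicates by request id."""

    def __init__(self, deduplicate: bool) -> None:
        self.deduplicate = deduplicate
        self.executions = 0
        self._replies: dict[int, int] = {}  # request id -> cached reply

    def handle(self, request_id: int, amount: int) -> int:
        if self.deduplicate and request_id in self._replies:
            return self._replies[request_id]  # resend reply, do not re-execute
        self.executions += 1
        reply = amount * 2
        self._replies[request_id] = reply
        return reply


def call_with_retries(server: Server, request_id: int, amount: int, lost_replies: int) -> int:
    """Resend the same request until a reply gets through."""
    while True:
        reply = server.handle(request_id, amount)
        if lost_replies > 0:
            lost_replies -= 1  # simulated timeout: the reply was lost
            continue
        return reply


at_least_once = Server(deduplicate=False)
at_most_once = Server(deduplicate=True)
call_with_retries(at_least_once, request_id=1, amount=21, lost_replies=2)
call_with_retries(at_most_once, request_id=1, amount=21, lost_replies=2)
print(at_least_once.executions, at_most_once.executions)
```

With two lost replies, the server without duplicate filtering executes the procedure three times, while the filtering server executes it once and answers the retries from its cache.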
RPC provides a mechanism to implement distributed applications in a simple and efficient way.
RPC enables the modular and hierarchical structure of large distributed systems:
Client and server are separate entities
The server encapsulates and hides the details of the backend systems (such as databases)
RPC is not a standard, but has been implemented in many different ways.
RPC enables developers to set up distributed systems, but only solves selected aspects.
The Network File System (NFS) and SMB are two well-known applications based on RPC.
The problems of enabling cross-company point-to-point integration led to the development of the next generation of middleware technologies.
Based on Web Services - Concepts, Architectures and Applications; Alonso et al.; Springer 2004
Each company uses its own "concrete" message broker(s) - if we want to communicate with multiple companies, we need to implement and maintain multiple adapters/solutions.
Web services are self-contained, modular business applications that have open, internet-oriented, standards-based interfaces.
(UDDI Consortium)
SOAP is the protocol of classic web services and enables communication between applications.
SOAP comprises the following parts:
A message format that describes how a message can be wrapped in an XML document (envelopes, headers, body...)
A set of encoding rules for data
A description of how a SOAP message is transported using the underlying transport protocol, i.e. how it is embedded in an HTTP request or in an e-mail (SMTP).
A set of rules to follow when processing a SOAP message and the entities involved in this processing; which parts of the messages should be read and by whom, and what action these entities should take if they do not understand the content.
SOAP is a further development of XML-RPC and originally stood for Simple Object Access Protocol.
SOAP (from version 1.2) is a standard of the W3C.
Messages are envelopes in which the application's user data is enclosed.
A message has two main components:
Header: intended for infrastructural data such as security or reliability.
Body: intended for application-level data.
Each part can be divided into blocks.
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">

  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="ws-transactions-URI" SOAP-ENV:mustUnderstand="1">
      57539
    </t:Transaction>
  </SOAP-ENV:Header>

  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DEF</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>

</SOAP-ENV:Envelope>
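How a receiver might take such a message apart can be sketched with Python's standard XML parser. The snippet is an illustration, not a SOAP implementation: it only separates header blocks from body blocks and collects the header blocks marked `mustUnderstand="1"`, which the receiver is obliged to understand. The message is a condensed copy of the example above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

MESSAGE = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="ws-transactions-URI" SOAP-ENV:mustUnderstand="1">57539</t:Transaction>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DEF</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""

envelope = ET.fromstring(MESSAGE)
header = envelope.find(f"{{{SOAP_ENV}}}Header")
body = envelope.find(f"{{{SOAP_ENV}}}Body")

# Header blocks that the receiver must understand (mustUnderstand="1").
mandatory = [
    block.tag for block in header
    if block.get(f"{{{SOAP_ENV}}}mustUnderstand") == "1"
]

# The first body block carries the application-level request.
request = body[0]
print(mandatory)
print(request.tag, request.findtext("symbol"))
```

ElementTree reports qualified names as `{namespace-URI}local-name`, which is why the lookups spell out the envelope namespace.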
POST /StockQuote HTTP/1.1
Host: www.stockquoteserver.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "Some-URI"

<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">

  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">

  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
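The HTTP binding shown above can be sketched as a function that wraps an envelope in a POST request. The function name `soap_http_request` is hypothetical and nothing is sent over the network; the point is only to show where the `nnnn` of `Content-Length` comes from and how headers and body are separated.

```python
def soap_http_request(host: str, path: str, soap_action: str, envelope: str) -> bytes:
    """Embed a SOAP envelope in an HTTP/1.1 POST request."""
    body = envelope.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        'Content-Type: text/xml; charset="utf-8"\r\n'
        f"Content-Length: {len(body)}\r\n"  # length in bytes, not characters
        f'SOAPAction: "{soap_action}"\r\n'
        "\r\n"  # an empty line separates the HTTP headers from the body
    )
    return head.encode("ascii") + body


envelope = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    '<SOAP-ENV:Body><m:GetLastTradePrice xmlns:m="Some-URI">'
    "<symbol>DIS</symbol></m:GetLastTradePrice></SOAP-ENV:Body>"
    "</SOAP-ENV:Envelope>"
)
request = soap_http_request("www.stockquoteserver.com", "/StockQuote", "Some-URI", envelope)
print(request.decode("utf-8"))
```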
ZeroMQ is a messaging infrastructure without an explicit server ("broker").
ZeroMQ supports connection-oriented but asynchronous communication.
ZeroMQ is based on classic sockets, but adds new abstractions to enable the following messaging patterns:
request-reply
pub-sub (publish-subscribe)
pipelining (parallel processing)
ZeroMQ enables N-to-N communication.
ZeroMQ supports many programming languages; the user is responsible for the appropriate marshalling or unmarshalling.
If, for example, the server is written in Java and the client in C, then the understanding of how a string is transferred may be different (e.g. terminated with null or provided with an explicit length).
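The two string conventions mentioned above can be compared in a short sketch. The function names and the 16-bit length field are assumptions chosen for illustration; neither is prescribed by ZeroMQ, which simply transports the bytes it is given.

```python
import struct


def encode_length_prefixed(text: str) -> bytes:
    """Explicit 16-bit length (network byte order) followed by the bytes."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def decode_length_prefixed(message: bytes) -> str:
    (length,) = struct.unpack_from("!H", message, 0)
    return message[2:2 + length].decode("utf-8")


def encode_null_terminated(text: str) -> bytes:
    """C-style: the bytes followed by a terminating zero byte."""
    data = text.encode("utf-8")
    if b"\x00" in data:
        raise ValueError("a null-terminated string cannot contain a zero byte")
    return data + b"\x00"


def decode_null_terminated(message: bytes) -> str:
    return message[:message.index(b"\x00")].decode("utf-8")


wire = encode_length_prefixed("status")
print(decode_length_prefixed(wire))
# Decoding with the other convention does not restore the original string:
print(decode_null_terminated(wire) == "status")
```

Sender and receiver therefore have to agree on one convention, or delegate the decision to a serialization format that both sides support.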
Request-reply: enables the "usual" communication between a client and a server. However, buffering may take place if the server is not available.
Pub-sub: allows clients to subscribe to a specific topic and then receive all messages published on that topic. A message with a specific topic is sent to all registered clients.
Pipelining: enables a task to be sent to exactly one worker from a set of (homogeneous) workers.
import static java.lang.Thread.currentThread;
import org.zeromq.SocketType;
import org.zeromq.ZMQ;
import org.zeromq.ZContext;

public class Publisher {
    public static void main(String[] args) throws Exception {
        try (ZContext context = new ZContext()) {
            ZMQ.Socket publisher = context.createSocket(SocketType.PUB);
            // <endpoint>, <some zipcode> and <some msg> are placeholders.
            publisher.bind("tcp://*:5556");
            publisher.bind("ipc://" + <endpoint>);

            while (!currentThread().isInterrupted()) {
                int zipcode = <some zipcode>;
                // Send to all subscribers
                String update = String.format("%05d %s", zipcode, <some msg>);
                publisher.send(update, 0);
            }
        }
    }
}
import java.util.StringTokenizer;
import org.zeromq.SocketType;
import org.zeromq.ZMQ;
import org.zeromq.ZContext;

public class Subscriber {
    public static void main(String[] args) {
        try (ZContext context = new ZContext()) {
            ZMQ.Socket subscriber = context.createSocket(SocketType.SUB);
            subscriber.connect("tcp://localhost:5556");
            subscriber.subscribe(<zipcode(Str)>.getBytes(ZMQ.CHARSET));
            while (true) {
                String string = subscriber.recvStr(0);
                // e.g. take string apart:
                // part1: zipcode
                // part2: message
                System.out.println(string);
            }
        }
    }
}
import signal
import time
import zmq

# Use the default action for SIGINT so that Ctrl+C terminates the process.
signal.signal(signal.SIGINT, signal.SIG_DFL)

context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind('tcp://*:5555')

for i in range(5):
    socket.send(b'status 5')
    socket.send(b'All is well')
    time.sleep(1)
import signal
import zmq

# Use the default action for SIGINT so that Ctrl+C terminates the process.
signal.signal(signal.SIGINT, signal.SIG_DFL)

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect('tcp://localhost:5555')
socket.setsockopt(zmq.SUBSCRIBE, b'status')

while True:
    message = socket.recv_multipart()
    print(f'Received: {message}')
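A ZeroMQ subscription is a prefix filter on the message bytes. The following pure-Python sketch (no ZeroMQ involved; the helper `matches` is hypothetical) mimics that rule to show which of the two messages sent by the publisher above would reach a subscriber that subscribed to `b'status'`.

```python
def matches(subscriptions: list[bytes], message: bytes) -> bool:
    """A message is delivered if it starts with any subscribed prefix."""
    return any(message.startswith(prefix) for prefix in subscriptions)


subscriptions = [b"status"]
for message in (b"status 5", b"All is well"):
    print(message, matches(subscriptions, message))
```

Because the publisher sends `b'status 5'` and `b'All is well'` as two separate messages, only the first one passes the filter. An empty prefix would match every message.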
For signal handling in Python, see also: https://docs.python.org/3/library/signal.html#signal.signal
MOM or message-queueing systems support persistent asynchronous communication.
Very large messages are supported.
The only guarantee is that messages are eventually placed in the recipient's queue and that they arrive in the correct order.
(In particular, there is no guarantee that the message will be read).
The sender and recipient are not necessarily active at the same time.
Messages always have a unique recipient and virtually arbitrary content.
| Operation | Description |
|---|---|
| PUT | Places a message in a specific queue. |
| GET | Blocks at a specific queue until a message is available. Removes the first message. |
| POLL | Checks whether a message is available in a specific queue. Removes the first message if one is available. POLL never blocks. |
| NOTIFY | Registers a handler (callback) that is called when a message is added to a specific queue. |
Queue managers are the central building block of message-queueing systems. In general, there is (at least conceptually) one local queue manager per process. A queue manager is a process that stores and manages messages in queues. If required, it can manage several queues and forward messages to other queue managers.
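The four operations can be sketched with an in-process queue manager built on Python's standard `queue` module. The class `QueueManager` and its method names are assumptions for illustration. Unlike a real message-queueing system, this sketch is neither persistent nor distributed.

```python
import queue
from typing import Callable, Optional


class QueueManager:
    """A toy, in-process queue manager offering PUT, GET, POLL and NOTIFY."""

    def __init__(self) -> None:
        self._queues: dict[str, queue.Queue] = {}
        self._handlers: dict[str, list[Callable[[str], None]]] = {}

    def _queue(self, name: str) -> queue.Queue:
        if name not in self._queues:
            self._queues[name] = queue.Queue()
        return self._queues[name]

    def put(self, name: str, message: bytes) -> None:
        self._queue(name).put(message)
        for handler in self._handlers.get(name, []):
            handler(name)  # NOTIFY: tell the handler that a message arrived

    def get(self, name: str) -> bytes:
        return self._queue(name).get()  # blocks until a message is available

    def poll(self, name: str) -> Optional[bytes]:
        try:
            return self._queue(name).get_nowait()  # never blocks
        except queue.Empty:
            return None

    def notify(self, name: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(name, []).append(handler)


manager = QueueManager()
arrivals = []
manager.notify("orders", arrivals.append)
print(manager.poll("orders"))
manager.put("orders", b"order-1")
manager.put("orders", b"order-2")
print(manager.get("orders"), manager.poll("orders"), arrivals)
```

The handler receives only the queue name and has to fetch the message itself with GET or POLL, which keeps notification separate from removal.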