An Architecture for a Collaborative Computing Grid

Grid computing may be thought of as a set of distributed services that are accessed securely by various remote clients. Example services include secure remote command execution, file management, database access, and access to queuing systems. To create, run, and access such diverse services in a consistent way, the Open Grid Services Architecture (OGSA) defines a framework to follow when building and running new services. Grid clients put these services to work. Computing web portals are typical Grid clients: they connect to and invoke remote services in a point-to-point fashion, addressing each service by host name or IP address. Grid computing has an extensive literature, including several overviews and a current snapshot of the state of the art.

We believe an important component is missing that should link the services and client environments: a software layer that manages the messages and data traffic between the two. In current systems, this traffic consists of point-to-point messages. This makes it difficult to build collaboration services, to provide reliability and redundancy when services go down, to work through firewalls, and to address many other problems, such as logging and backups. We propose that by inserting a message-relaying framework between the clients and services, we may elegantly add collaboration, resiliency, and redundancy to Grid computing. Many of the services required by VLab, such as job submission and file and metadata management, are standard Grid tools that we will adopt. Our innovation will be to place these services within a natural collaborative framework.

Our proposed architecture is based on the publish/subscribe paradigm, which abstracts all connections to the middleware in terms of topics. A simple analogy is a news service. One or more publishers submit news items to newsgroups (identified by topics), while subscribers choose which newsgroups they wish to read, that is, they subscribe to one or more topics. The user is not interested in where the news items are actually stored or which machines execute the requests, except insofar as this affects efficiency. VLab will be driven by a collection of servers whose sole purpose is to accept input messages from publishers and redirect them to subscribers. Unless specifically disallowed, several publishers will be able to submit to the same topic, and several subscribers will be able to retrieve the contents of the same topic. This many-to-many delivery is what supports collaboration.
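
The topic abstraction described above can be illustrated with a minimal sketch. It is a toy, in-process broker written for this explanation, not the VLab middleware: the class name Broker, the handler signature, and the topic string "vlab/jobs/status" are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical handler type: receives the topic and the raw payload.
Handler = Callable[[str, bytes], None]


class Broker:
    """Toy in-process broker that routes messages purely by topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Any number of subscribers may register for the same topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        # The publisher never learns who, or how many, received the message.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


# Two collaborators follow the same (illustrative) topic.
broker = Broker()
inbox_a: List[bytes] = []
inbox_b: List[bytes] = []
broker.subscribe("vlab/jobs/status", lambda topic, payload: inbox_a.append(payload))
broker.subscribe("vlab/jobs/status", lambda topic, payload: inbox_b.append(payload))

# Two independent publishers could both call publish() on this topic.
broker.publish("vlab/jobs/status", b"job 17 queued")
broker.publish("vlab/jobs/status", b"job 17 running")
```

Neither publisher nor subscriber holds the other's address; each knows only the topic. That indirection is what allows a second collaborator to join a session without any change on the publishing side.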

We believe that the publish/subscribe approach supports all the stated requirements of VLab. In such a system, messages may use many encodings (SOAP-RPC, audio-video streams, binary data files, etc.) and may be transmitted by any one of several protocols (TCP, UDP). The only requirement is the presence of a topic in the message header. Resources (storage, computational, portals, project managers, databases, visualization servers, Grid services and infrastructure, audio-video services, etc.) connect to the middleware. Because all resources communicate only through topics, it is straightforward to add any number of useful utilities such as logging and backup services. These utilities would subscribe to a broader range of topics than the more specialized resources.
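
As a rough sketch of the two points above (the broker inspects only the topic, and a utility such as a logger listens more broadly than a specialized resource), consider the following. The Message fields, the PatternBroker class, the "vlab/..." topic names, and the use of shell-style wildcards for topic patterns are assumptions made for illustration; the source does not specify a header layout or a pattern syntax.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Callable, Dict, List, Tuple


@dataclass
class Message:
    topic: str                 # the only field the broker looks at
    payload: bytes             # opaque body: SOAP, media frames, binary data, ...
    headers: Dict[str, str] = field(default_factory=dict)  # hypothetical extras


class PatternBroker:
    """Toy broker whose subscriptions are shell-style topic patterns."""

    def __init__(self) -> None:
        self._subs: List[Tuple[str, Callable[[Message], None]]] = []

    def subscribe(self, pattern: str, handler: Callable[[Message], None]) -> None:
        self._subs.append((pattern, handler))

    def publish(self, message: Message) -> int:
        delivered = 0
        for pattern, handler in self._subs:
            # Routing depends on the topic alone, never on the payload encoding.
            if fnmatchcase(message.topic, pattern):
                handler(message)
                delivered += 1
        return delivered


broker = PatternBroker()
log: List[Message] = []         # logging utility: sees every VLab topic
frames: List[Message] = []      # visualization client: one specific topic
broker.subscribe("vlab/*", log.append)
broker.subscribe("vlab/viz/frames", frames.append)

broker.publish(Message("vlab/viz/frames", b"\x00\x01", {"encoding": "binary"}))
broker.publish(Message("vlab/jobs/submit", b"<soap/>", {"encoding": "SOAP-RPC"}))
```

Here the logger is added with a single subscribe call and no change to any publisher, which is the property the paragraph above relies on.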

The NaradaBrokering (NB) system, developed at the Community Grids Laboratory, is a distributed publish/subscribe system. Instead of a single message broker (i.e., server), NB uses a distributed brokering network that can perform sophisticated and failsafe message routing. Advanced features for security and reliable messaging are in development. Previous work by Dr. Pierce and co-workers has shown that these systems support traditional Grid and Web service systems.
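
To make the benefit of a broker network over a single broker concrete, the sketch below forwards each message to neighboring brokers and suppresses duplicates by message identifier. This is plain flooding, chosen for brevity; it is not NaradaBrokering's routing algorithm or API, and the BrokerNode class, its methods, and the topic name are hypothetical.

```python
from typing import Callable, Dict, List, Set


class BrokerNode:
    """Toy broker in a network; forwards by flooding with duplicate suppression."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True                      # set to False to simulate a failure
        self.peers: List["BrokerNode"] = []
        self.local_subs: Dict[str, List[Callable[[bytes], None]]] = {}
        self._seen: Set[str] = set()        # message ids already handled

    def link(self, other: "BrokerNode") -> None:
        # Links are bidirectional.
        self.peers.append(other)
        other.peers.append(self)

    def subscribe(self, topic: str, handler: Callable[[bytes], None]) -> None:
        self.local_subs.setdefault(topic, []).append(handler)

    def receive(self, msg_id: str, topic: str, payload: bytes) -> None:
        if not self.up or msg_id in self._seen:
            return                          # downed broker, or duplicate copy
        self._seen.add(msg_id)
        for handler in self.local_subs.get(topic, []):
            handler(payload)
        for peer in self.peers:
            peer.receive(msg_id, topic, payload)


# Three brokers in a triangle: a-b, b-c, a-c.
a, b, c = BrokerNode("a"), BrokerNode("b"), BrokerNode("c")
a.link(b)
b.link(c)
a.link(c)

results: List[bytes] = []
c.subscribe("vlab/results", results.append)

a.receive("m1", "vlab/results", b"one")   # all brokers up
b.up = False
a.receive("m2", "vlab/results", b"two")   # still arrives via the a-c link
```

With a single broker, its failure would stop delivery entirely; with redundant links, the subscriber attached to broker c still receives each message exactly once.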
