For decades, high performance computing devices have been available whose performance far exceeds that of traditional computers. Parallel and distributed computers have existed since the sixties, and with the advent of clusters of computers built from commodity parts, are becoming more widespread. More recently, reconfigurable computing technology based on FPGAs has also shown tremendous promise for providing performance far beyond that of sequential computers. Many fields in science and engineering have a need for the kind of computing power that these technologies promise. Yet, the fact remains that many scientists and engineers writing code to perform research in these fields still fail to make use of the available computing power. This phenomena appears to be the result of the one common characteristic of high performance computing (HPC) systems, whether parallel, distributed, reconfigurable or supercomputers: compared to traditional computers, designing applications is much more complicated.
Computer scientists have attempted to alleviate this problem in three ways: by creating compilers which can automatically detect parallelism, by creating new languages or extensions to existing languages to make the creation of parallel programs simpler, and by creating the application programs themselves in the form of general-purpose solvers or custom codes. Each of these approaches has shown shortcomings. Automatic detection of parallelism in dusty deck programs has proven to be somewhat of a holy grail for parallel computing. New languages, while parallel enough to express available concurrency, have not seen widespread acceptance among the Fortran using scientific community, largely because they still require the programmer to understand and make explicit the parallelism available in the problems. Creating custom applications exposes the programming problems in reverse; computer scientists seldom have the inclination to spend years becoming an expert in the field in which the research is taking place.
Recently, more attention has been paid to the idea of creating problem solving environments for fields in computational science and engineering. Problem solving environments attempt to bridge this widening gap between computer scientists and application domain scientists. However, many of the current environments fall short in either scope or power; they are either too restrictive to solve more than a few specific problems, or lack enough of a bridge to prevent the application domain scientist from having to understand the complexities of parallel programming.
What follows is a proposed infrastructure for creating problem solving environments for high performance computing systems. The environment infrastructure can be customized to create environments to support a wide range of applications on a wide range of computer systems. The infrastructure is flexible enough to allow the creation of tools that operate on the same design at different levels of abstraction, allowing the complexities of the design of the application to be hidden when possible, but available when necessary. The fundamental component of the environment is the Algorithm Description Format (ADF) which essentially is an electronic representation of the structure of the task at hand. An interface has been developed to allow for the creation of agents, tools which interact with ADF to manipulate and transform the design. The environment infrastructure consists of ADF itself, an ADF manager and library, and a core set of these agents which should be useful in most environments. In addition, extensive support is provided for the creation of new agents to customize the environment, including a parameterized user interface which allows new agents to easily share or build upon the interface of existing agents.
The final sections of the paper demonstrate the utility of the environment infrastructure by presenting two examples of custom environments targeted at specific classes of environments built from this infrastructure, specifically an environment for creating electromagnetic simulations using integral equations to run on dedicated cluster computers, and an environment to generate satellite image processing applications to run on a reconfigurable computing system.
There have been many attempts in the past to manage the complexity of creating applications for HPC systems. One approach is to create libraries to extend existing languages, as has been done most successfully by PVM and MPI. While both of these message passing libraries have gained widespread support, they still require the user to explicitly understand and express the parallelism in their applications, so they do not solve the whole problem. New languages have been created, such as Data Parallel C,Fortran D and its successors, SR, Mentat, and many proprietary varieties to hide the details of the actual message passing, however, the user still must understand the parallelism in the problems in order to use them successfully. A number of complete environments have been created to facilitate the creation of good applications, but many, such as Enterprise or Trapper are more programming environments than problem solving environments - they ease the process of programming in parallel dramatically, but still require the intelligent creation of parallel programs. Many others, such as GEMS and CHARM++ restrict dramatically either the kinds of problems to be solved or the kind of parallel structure which can be used to do it.
The key to the design of an effective problem solving environment is open and easy access to the structure of the problem and the implementation of the solution. To achieve this we have developed the Algorithm Description Format (ADF). The ADF specifies the algorithm in question as a directed graph. Each element in the graph has a list of attributes. The primary strengths of the ADF are it's flexibility and open representation. ADF is discussed in detail in section 3.1.
Once the structure of the algorithm has been laid bare through its description in the ADF, any number of tools, known as agents can act on the design, either augmenting or transforming it as needed. An ADF manager coordinates the actions of the agents on the ADF. An SQL database stores the library of designs. The current incarnation of the environment infrastructure is implemented in Java. In addition to the pieces of the infrastructure mentioned above, a number of completed agents written in Java are provided, as well as a set of Java classes to make the construction and incorporation of new agents into the system an easy process. A block diagram of the environment infrastructure is shown in figure 1.
The ADF specifies the algorithm in question as a directed graph. The nodes, the ports and the graph itself each have a list of attributes used to describe their salient properties. The attribute list provides much of the flexibility of the ADF. There are no constraints in the ADF limiting the number of attributes or requiring particular attributes. Each of the tools knows about a particular set of attributes upon which it operates and ignores the rest. This offers the potential for one ADF description to be implemented in multiple target architectures.
The top-level structure in ADF is the Design Unit (DU). The DU contains a single graph of nodes and links which describe an algorithm. The node represents a piece of computation. Attributes on the node describe how the node is to be implemented and properties of the computation. The granularity of a node can be as small as a single operation or as large as an entire application. Hierarchical designs can be created by having the attribute that specifies the implementation of the node point to another DU. Nodes also contain lists of input and output ports. Ports are abstractions of the data entering and leaving each piece of the computation. They also have attribute lists to describe the characteristics of the data being communicated. A binding between any pair of ports in the design form a link.
The manager is the guardian of the DU. The manager is responsible for all I/O related to the design unit, access to the library of design units, and coordinating the actions of agents attempting to operate on the DU. The manager serializes attempts by agents to change the design, and notifies running agents when the design is changed. The manager is also responsible for maintaining a list of all running agents. The ADF Library is an SQL database consisting of all the DU's in the system. Since each attribute is a field in the database records, attributes of the DU can be used to categorize the designs into many ``Virtual Libraries'' . For instance, an agent can present a user with a ``library'' consisting of all the DUs in the system which operate on double precision floating point data and have implementations suitable for Xilinx 4000 chips.
The ADF editor provides a graphical representation of any design specified in ADF. An example of a simple design displayed by the ADF editor is shown in figure 2. The editor can traverse hierarchical designs, can merge existing designs, and has complete editing capabilities for nodes, ports, and all attributes in the design. The editor communicates with the manager to keep its display of the design up to date, making it an excellent debugging tool to monitor the effects of other agents.
The ADF translator provides support for a text based ADF language. It provides support for translating from the ADF language to the ADF binary format shared by all the agents and back. The ADF language supports several syntactic constructs to make composing designs easier, such as the definition of named sets of attributes.
A number of Java classes are provided to support the creation of new agents. The most fundamental one is the AgentFrame. By inheriting from the AgentFrame, agents are assured of proper communication with the manager, easy access to the library, and proper interaction with the current design. AgentFrame contains methods to make the generation of menus simple. In addition, a set of view classes and a display engine are provided, which facilitates the construction of graphical interfaces to agents. The view is essentially a table which describes the interface to the display engine. Through the use of the view, the agent can either select from a library, provide custom code, or inherit from another agent the procedure for displaying ADF components and responding to the user interactions. Interfaces can be as simple as a few buttons on a panel or as complex as the ADF editor.
Computational ElectroMagnetics (CEM) is a prime candidate for taking advantage of the performance HPC offers. Our environment concentrates on the solution of Method-of-Moment (MoM) type problems, which involve solving integral equations. While numerically intense and possessing a regular structure, this particular class of problems remains ill-suited for general purpose solvers, as small changes in the geometry of the problem result in drastically different formulations of the solution.
The key to the CEM environment involves creating templates in which common structures of the integral equation problems are described as ADF graphs. An agent with an interface that uses terminology familiar to a CEM user presents the structure of the problem to the user, and asks them to fill in the details of the computation by selecting from a library or supplying custom code written in a traditional language (Fortran, C, or Matlab). A code generator agent then takes these implementation details along with the template, and generates a PVM program that calls the user supplied code when appropriate. Since the parallelism is embedded in the template, the CEM researcher can focus on the solution to the equations rather than the structure of the problem.
Figure 2 shows an example of a typical ADF template for a CEM simulation as it would appear in the ADF editor. Figure 3 shows the interface the CEM programmer would see. The details of the ADF graph are hidden, the user is simply asked to provide the code to compute the matrix elements, enter details about geometry, and choose a matrix solver and visualization method. The user also selects the machine the resulting parallel code should be run on from a list of machines currently available to CEM researchers at Clemson University. Just as the details of the structure of the program are hidden from the CEM researcher, the details of the implementation are hidden from the template writer, making it possible to create several templates for the same problem each optimized for a different type of computer system, ranging from the shared-memory Sun multiprocessor system to a cluster of PCs,or a distributed memory Intel Paragon.
Reconfigurable computing is an excellent target for many satellite image processing algorithms. These algorithms often exhibit both fine grain and coarse grain parallelism and are usually compatible with the fixed-point data representations typically used in reconfigurable computers. Creating applications for reconfigurable computers, however, is significantly different than writing standard software. The programmer is required to think in terms of basic operations, consider device area used, consider clock rates, and manage data in fixed-point formats. The reconfigurable computing environment abstracts many of these problems through the use of several agents. An editor allows the programmer to enter the design as a data flow described in ADF. A precision tool is provided which allows the user to specify the precision of input data and any constraints on output precision. The precision tool then calculates the bit widths necessary along the data path. Optimizers can then be used to balance the area permitted for each operation with overall application performance, while also considering the performance impacts of moving operations to the host. A component selector is then invoked which chooses an implementation for each operation fitting the constraints given. Finally, the design is converted into a VHDL representation which can be synthesized and is in some sense optimal for a particular architecture.
The infrastructure proposed here provides a toolkit for the creation of problem solving environments. The environments which can be created from this infrastructure can operate at many levels of abstraction, attack a wide range of applications, and be used on many types of HPC systems. The full paper will contain a more thorough description of the use of ADF and the environment infrastructure, and performance results of applications created in the target environments.
2. Message Passing Interface Forum, ``MPI: A Message-Passing Interface Standard'',Journal of Supercomputing Applications, Vol. 8, No. 3/4, 1994.
3 ``C* Programming Guide'', Technical Report, Thinking Machines Corporation, May, 1993.
4 G. Fox, S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C.W. Tseng, and M. Wu. ``Fortran D Language Specification'', Technical Report TR90-141, Dept. of Computer Science, Rice University, December 1990.
5 Ronald A. Olsson, Gregory R. Andrews, Michael H. Coffin, Gregg M. Townsend, ``SR: A Language for Parallel and Distributed Programming'', Technical Report, TR 92-09, Dept. of Computer Science, University of Arizona, March, 1992.
6 A. S. Grimshaw, W. T. Strayer, and P. Narayan,''Dynamic Object-Oriented Parallel Processing'',IEEE Parallel & Distributed Technology: Systems & Applications, pp. 33-47, May, 1993.
7 Jonathan Schaeffer, Duane Szafron, Greg Lobe, Ian Parsons, ``The Enterprise Model for Developing Distributed Applications'', Parallel and Distributed Technology, 1995.
8 Lorenz Schafers, Christian Scheidler, and Ottmar Kramer-Fuhrmann, Software Engineering for Parallel Systems: The TRAPPER Approach'', HICSS28, Proc. of the 28th Hawaiian International Conference on System Sciences, January, 1995.
9 Bernd Bruegge, Erik Reidel Armistead G. Russell, and Gregory J. McRae, ``Developing GEMS: An Environmental Modeling System'',IEEE Computational Science and Engineering, Fall, 1995.
10 John Browne and James Werth,''Experimental Evaluation of a Reusability-oriented Parallel Programming Environment'',IEEE Transactions on Software Engineering,pp. 111-120, Vol. 16, Feb, 1990.