HECIOS: The High End Computing I/O Simulator


Motivation

As high end computing systems (HECs) grow to several tens of thousands of nodes, file I/O is becoming a critical performance issue. Current parallel file systems such as PVFS2 can reasonably stripe data across a hundred nodes and achieve good performance for bulk transfers involving large aligned accesses. Serious performance limits exist, however, for small unaligned accesses, metadata operations, and accesses affected by consistency semantics (any time one process writes data that is read by another).
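
To illustrate why alignment matters under striping, the following is a minimal sketch of a round-robin mapping from a byte range to I/O servers. The function name, the 64 KiB strip size, and the four-server layout are illustrative assumptions, not the configuration or code of PVFS2 or HECIOS.

```python
def servers_touched(offset, length, strip_size=65536, num_servers=4):
    """Split a byte range into (server, byte_count) pieces.

    Assumes simple round-robin striping: strip k lives on server
    k % num_servers. All parameters are hypothetical defaults.
    """
    pieces = []
    end = offset + length
    while offset < end:
        strip = offset // strip_size          # index of the strip holding this byte
        server = strip % num_servers          # round-robin placement
        strip_end = (strip + 1) * strip_size  # first byte of the next strip
        count = min(end, strip_end) - offset  # bytes that fall inside this strip
        pieces.append((server, count))
        offset += count
    return pieces


# An aligned, strip-sized access is served by a single server.
aligned = servers_touched(0, 65536)

# A 200-byte access that straddles a strip boundary needs two servers.
unaligned = servers_touched(65536 - 100, 200)
```

Under these assumptions, the small unaligned request involves two servers to move 200 bytes, while the aligned request moves 64 KiB with a single server, which is one way to see why per-request overhead dominates for small unaligned accesses.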

Proposal Summary
Accepted Proposal

Simulator Details

HECIOS, the high end computing I/O simulator, is a simulation package being developed at Clemson's PARL in order to improve parallel file I/O performance. Large cluster computers with proportionally large I/O storage subsystems are presently extremely rare, so we are developing HECIOS to make it possible to experiment with high end I/O configurations in simulation. Our goal is to provide a freely available cluster storage system simulator capable of highly detailed simulations of I/O performance for parallel and scientific applications.

HECIOS is implemented using the OMNeT++ simulation library and leverages existing networking and disk components to provide a detailed simulator for MPI-IO applications using a cluster storage system such as a parallel file system. OMNeT++ provides extensive capabilities for developing generic simulation models, scheduling events, building state machines, running parallel simulations, and capturing and visualizing relevant simulation data.
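
As a rough illustration of the event scheduling that a discrete-event simulation library handles, here is a minimal sketch in Python. It is not the OMNeT++ API and not HECIOS code; the `Simulator` class, the handler names, and the delay values are all hypothetical.

```python
import heapq


class Simulator:
    """A tiny discrete-event loop: events run in timestamp order."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so events at equal times keep insertion order

    def schedule(self, delay, action):
        """Queue `action` to run `delay` time units after the current time."""
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        """Pop events in time order, advancing the clock to each event."""
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action(self)


def demo():
    sim = Simulator()
    log = []

    def disk_done(s):
        log.append((s.now, "disk write complete"))

    def request_arrives(s):
        log.append((s.now, "request arrives at server"))
        s.schedule(5.0, disk_done)  # hypothetical disk service time

    sim.schedule(1.0, request_arrives)  # hypothetical network delay
    sim.run()
    return log
```

Calling `demo()` records the request arriving at time 1.0 and the disk write completing at time 6.0. A full simulator layers network, disk, and file system models on top of this kind of event loop.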

To provide a detailed simulation environment for cluster internetworking, we rely on the INET package provided for use with OMNeT++. INET provides flexible network simulation components that simulate link-level detail for TCP/IP traffic over switched Ethernet. We intend to eventually add networking components capable of simulating high speed cluster interconnects such as Myrinet and InfiniBand.

Downloads

HECIOS is available in source form only.
HECIOS 0.1.0

To use HECIOS, you will need to either generate traces using the LANL MPI Tracing framework or use the traces we have already collected and translated to the HECIOS internal trace format:
FLASH I/O 8 processes
FLASH I/O 16 processes
FLASH I/O 32 processes
FLASH I/O 64 processes
FLASH I/O 128 processes
FLASH I/O 256 processes
FLASH I/O 512 processes
FLASH I/O 1024 processes
FLASH I/O 2048 processes
ANL's MPI I/O Test 8 processes
ANL's MPI I/O Test 16 processes
ANL's MPI I/O Test 32 processes
ANL's MPI I/O Test 64 processes
ANL's MPI I/O Test 128 processes
ANL's MPI I/O Test 256 processes
MPI Tile Read 8 processes
MPI Tile Write 8 processes
MPI Tile Read 512 processes
MPI Tile Write 512 processes

Members

HECIOS is developed at Clemson University's Parallel Architecture Research Lab (PARL).
Current team members include:
Prof. Walter Ligon
Brad Settlemyer
Michael Bassilly
Pooja Verma

Support

HECIOS is being developed under the NSF CISE/CCF Division's HECURA program, award #CCF-0621441.
Project Title: HECURA: Improving Scalability in Parallel File Systems for High End Computing

Resources

PVFS2 Website
OMNeT++ Community Site
OMNeT++ Manual
INET Documentation
UMd I/O Trace files