I had a bit of a conundrum over what title to give this list. I gather that “Grid” and “Cluster” computing would resonate better than “Distributed” and “Parallel”. There’s been a lot of activity in this area lately; however, the projects I’m most keenly interested in are the ones that integrate with Amazon’s S3 and Elastic Compute Cloud (EC2) efforts. These tools are really great, but they can be made more appealing if they can be deployed in the cloud. Here are some of the great tools I have discovered so far. (Note: see “Open Source Distributed Cache” for other related clustering tools.)
- GridGain – GridGain is a computational grid framework that aims to improve the performance of processing-intensive applications by splitting and parallelizing the workload. GridGain’s unique feature is that it is an Aspect Oriented Programming (AOP) based grid solution. With AOP-based grid enabling, you simply attach an annotation to a method and it is automatically grid enabled. The goal is to combine a simple programming model with state-of-the-art grid computing features. GridGain comes with grid topology management, customizable failover and collision resolution, split-and-aggregate, in-grid and external invocation, pluggable deployment, checkpoints and automatic peer-to-peer deployment.
- Hadoop – Hadoop is a platform that lets one easily develop applications that process vast amounts of data. Hadoop can reliably store and process petabytes. By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks when failures occur. Hadoop implements Google’s MapReduce in Java. MapReduce divides applications into many small blocks of work. Hadoop has been demonstrated on clusters with 2,000 nodes.
- Rio – Rio provides a dynamic architecture for developing, deploying and managing distributed applications composed of services. The differentiating features of Rio are a set of dynamic capabilities and a reliance on policy-based and QoS mechanisms. Rio is based on Jini and enhances it by introducing a simple component model.
- JPPF – Java Parallel Processing Framework (JPPF) is a grid framework that integrates with existing J2EE application servers. It has a programming model that abstracts the complexity of distributed and parallel processing. It provides graphical tools for fine-grained monitoring and administration of the grid, as well as redundancy, recovery and failover capabilities. JPPF also includes a screensaver node that enables the automatic use of idle computers.
- ProActive – ProActive is a platform for parallel, distributed and multi-threaded computing. ProActive features an Eclipse-based IDE, a resource acquisition and deployment framework and a parallel programming framework. ProActive supports parallel programming concepts such as Master-Slave, Branch and Bound, Single Program Multiple Data processing, composable structured patterns and active objects. ProActive also supports fault tolerance, load balancing, mobility, and security.
- Cougaar – The Cognitive Agent Architecture (Cougaar) project provides a framework for distributed multi-agent systems. Cougaar supports the construction of large-scale distributed agent-based applications. It is the result of a multi-year DARPA research project into large-scale agent systems. It also includes a variety of demonstration, visualization and management components to simplify the development of complex, distributed applications.
- GridEngine – The Grid Engine project provides distributed resource management software for requirements ranging from compute farms to grid computing. Grid Engine provides policy-based workload management and dynamic provisioning of application workloads. A Grid Engine master can manage a grid of up to ten thousand hosts, meeting the scalability needs of even the largest grids. Grid Engine supports flexible resource quotas, boolean expressions for requesting resources, DRMAA 1.0 Java and C-language bindings and DTrace support.
- Cleversafe – Cleversafe uses Cauchy Reed-Solomon Information Dispersal Algorithms to separate data into unrecognizable Data Slices and distribute them, via secure Internet connections, to multiple storage locations on a Dispersed Storage Network. Cleversafe ensures that data remains secure even if transmissions are intercepted and decrypted, or if storage is stolen and decrypted.
- Darkstar – Darkstar is a server that supports the development of massively multi-player online games. Darkstar allows developers to scale their applications without needing to know distributed computing. The work done by the server can nevertheless be spread across any number of machines in a scalable and fault-tolerant manner. The system is furthermore optimized for the low latencies required by game applications. It is a fully distributed, fault-tolerant communication and event processing system. Darkstar servers have notable grid-like properties: no zones or shards are required for scaling, and the system offers automatic fail-over, transparent persistence and a custom enterprise-level transactional data storage system.
- H2O – H2O is a scalable, stateless, and lightweight platform for building and deploying distributed applications. H2O is an open container that allows any authorized third party to deploy services into the container. H2O features APIs for remote component deployment and management, and for inter-component communication. H2O components can communicate via synchronous or asynchronous remote method invocations or through a publisher-subscriber distributed event model. The communication layer offers a selection of messaging protocols (JRMP, SOAP, RPC) and customizable transport stacks (SSL, compressed sockets, JXTA sockets, single-port tunneling, in-process sockets, all of which can be mixed in many combinations). H2O has been designed to support a wide range of distributed programming paradigms, including self-organizing applications, widely distributed applications, massively parallel applications, task farms, component composition frameworks, and more. H2O is focused on security, ensuring the safety of shared resources and of users’ data via well-established technologies like SSL, JSSE, JAAS and the Java Platform Security.
- Ibis – The Ibis project currently consists of several sub-projects that include communication libraries, programming models, a grid interface toolkit and a peer-to-peer grid framework. Satin is a programming model that makes it convenient to develop divide-and-conquer style programs. MPJ is a pure Java implementation of the MPI message-passing standard. GMI is a group-based implementation for RMI. JavaGAT interfaces with other grid middleware, such as Globus, Unicore, SSH and Zorilla, and provides a uniform interface for file access, job submission, monitoring, and access to information services. Zorilla is a peer-to-peer overlay network.
- Pegasus – Pegasus (Planning for Execution in Grids) is a workflow mapping engine developed and used as part of several NSF ITR projects. Pegasus bridges the scientific domain and the execution environment by automatically mapping high-level workflow descriptions onto distributed infrastructures such as the TeraGrid and the Open Science Grid. Pegasus is used in a variety of scientific applications, ranging from astronomy and biology to earthquake science and gravitational-wave physics.
- Java CoG Kit – The Java CoG Kit framework provides basic and advanced abstractions that aid in the development of Grid applications. These abstractions include job executions, file transfers, workflow abstractions, and job queues, and can be used by higher-level abstractions for rapid prototyping. The Java CoG Kit is extensible: users can include their own abstractions and enhance its functionality. Grid providers allow different Grids to be integrated into the framework.
- JGrid – JGrid is a service-oriented grid where components (services and clients) can dynamically join and leave the system at any time, and users can have simple and seamless access to grid services. The infrastructure is fault-resilient and reliable. JGrid provides a wide-area, distributed service discovery mechanism with content-based query routing and powerful matching semantics. The system allows users to access and use services, execute sequential or parallel programs in a secure environment, and interface with legacy systems. The secure communication and execution environment employs standard security mechanisms (X509, Kerberos, SSL).
- JOMP – The goal of the JOMP project is to define and implement an OpenMP-like set of directives and library routines for shared memory parallel programming in Java.
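To make the MapReduce idea from the Hadoop entry a little more concrete, here is a minimal in-process sketch of how a job can be divided into small blocks of work whose partial results are then merged. It uses only the standard Java library and is not the Hadoop API; the class name `MapReduceSketch` and the methods `mapBlock`, `reduce` and `wordCount` are made up for illustration, and a parallel stream stands in for the cluster nodes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: a single-JVM word count in the map/reduce style.
// None of these names come from Hadoop or any other project in the list.
public class MapReduceSketch {

    // "Map" step: process one block of input independently of the others
    // and produce a partial word count for that block.
    static Map<String, Integer> mapBlock(String block) {
        Map<String, Integer> partial = new TreeMap<>();
        for (String word : block.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                partial.merge(word, 1, Integer::sum);
            }
        }
        return partial;
    }

    // "Reduce" step: merge the partial counts from every block into one total.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }

    // Blocks are mapped in parallel (threads here, nodes on a real cluster),
    // then the partial results are reduced sequentially.
    static Map<String, Integer> wordCount(List<String> blocks) {
        List<Map<String, Integer>> partials = blocks.parallelStream()
                .map(MapReduceSketch::mapBlock)
                .collect(Collectors.toList());
        return reduce(partials);
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList("grid cluster grid", "cluster cloud", "grid");
        System.out.println(wordCount(blocks)); // prints {cloud=1, cluster=2, grid=3}
    }
}
```

Because each block is mapped without reference to any other block, the map step can run wherever the data happens to live, which is the property that lets a framework like Hadoop move the computation to the data and rerun a single block if a node fails.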
So by now you may be a bit confused as to what Grid or Cluster computing means. Well, you are not alone. Tim Bray wrote a good piece to resolve this ambiguity. Additional Grid information can be found here. Nevertheless, if you know of any more projects that fit the ‘Grid’ or ‘Cluster’ computing category, please feel free to contact me.