AROM is a distributed processing framework based on Data Flow Graphs (DFG). It has been designed specifically to address Big Data problems.
The programming model proposed by AROM is based on directed acyclic graphs (DAG). In this model, the nodes of the DAG represent operators, each of which performs an operation on the data, and the edges define the flow of data between the operators. The AROM processing framework then schedules each of the operators contained in the DFG on the worker nodes of an AROM cluster and coordinates the data transfers between the nodes.
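To make the model above concrete, here is a minimal single-process sketch in Python. It is not AROM's API: the `Operator` class and the `run_dfg` function are hypothetical names introduced for illustration, and the "scheduling" is simply a topological ordering (Kahn's algorithm) rather than placement on a cluster.

```python
from collections import deque


class Operator:
    """Hypothetical DFG node: applies `fn` to the list of its upstream outputs."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn


def run_dfg(operators, edges, source_data):
    """Run every operator once, in topological order (Kahn's algorithm).

    operators:   list of Operator
    edges:       list of (upstream_name, downstream_name) pairs
    source_data: input handed to operators that have no upstream edge
    Returns a dict mapping each operator name to its output.
    """
    upstream = {op.name: [] for op in operators}
    downstream = {op.name: [] for op in operators}
    for src, dst in edges:
        upstream[dst].append(src)
        downstream[src].append(dst)

    # An operator becomes ready once all of its upstream operators have run.
    pending = {name: len(ups) for name, ups in upstream.items()}
    ready = deque(name for name, count in pending.items() if count == 0)
    by_name = {op.name: op for op in operators}
    results = {}

    while ready:
        name = ready.popleft()
        inputs = [results[u] for u in upstream[name]] or [source_data]
        results[name] = by_name[name].fn(inputs)
        for nxt in downstream[name]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(results) != len(operators):
        raise ValueError("graph contains a cycle; a DFG must be acyclic")
    return results


# A diamond-shaped graph: one source, two parallel branches, one merge.
ops = [
    Operator("read", lambda ins: ins[0]),
    Operator("double", lambda ins: [x * 2 for x in ins[0]]),
    Operator("square", lambda ins: [x * x for x in ins[0]]),
    Operator("merge", lambda ins: sorted(ins[0] + ins[1])),
]
edges = [("read", "double"), ("read", "square"),
         ("double", "merge"), ("square", "merge")]
out = run_dfg(ops, edges, [1, 2, 3])
print(out["merge"])  # [1, 2, 4, 4, 6, 9]
```

In a real distributed framework the two middle operators could run on different worker nodes at the same time, since neither depends on the other; the sketch only preserves the ordering constraint that the edges express.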
The DFG-based AROM programming model imposes only a few constraints and gives the user flexibility in the design of jobs. DFG models are often considered to be at a lower level than MapReduce, which makes it possible to translate MapReduce jobs directly into DFGs and run them on a DFG framework. To this end, AROM also provides a MapReduce API that allows MapReduce jobs to be run on the DFG framework.
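The translation from MapReduce to a DFG can be illustrated with a small Python sketch in which a job becomes a three-operator chain: map, shuffle, reduce. This is an illustration only, not AROM's MapReduce API; all function names below are hypothetical.

```python
from collections import defaultdict


def map_operator(records, map_fn):
    """Map stage: emit (key, value) pairs for every input record."""
    return [pair for record in records for pair in map_fn(record)]


def shuffle_operator(pairs):
    """Shuffle stage: group values by key between the map and reduce stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_operator(groups, reduce_fn):
    """Reduce stage: collapse each key's list of values into one result."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def run_mapreduce_as_dfg(records, map_fn, reduce_fn):
    """Express a MapReduce job as a linear DFG: map -> shuffle -> reduce."""
    pipeline = [
        lambda data: map_operator(data, map_fn),
        shuffle_operator,
        lambda data: reduce_operator(data, reduce_fn),
    ]
    for operator in pipeline:
        records = operator(records)
    return records


# Word count, the usual MapReduce example.
word_counts = run_mapreduce_as_dfg(
    ["big data", "data flow graph", "big graph"],
    map_fn=lambda line: [(word, 1) for word in line.split()],
    reduce_fn=lambda key, values: sum(values),
)
print(word_counts)  # {'big': 2, 'data': 2, 'flow': 1, 'graph': 2}
```

The chain is just one particular graph shape. A DFG job is free to add further operators or branches, which is the sense in which the DFG model sits at a lower level than MapReduce.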
AROM preferably uses distributed storage as the source of the data it processes, as it aims to place each processing operator close to its data. For the moment, AROM fully supports the Hadoop Distributed File System (HDFS) used by Hadoop.
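The co-location idea can be sketched as a simple placement rule. The Python function below is a hypothetical illustration, not AROM's scheduler: it assumes the storage layer can report which workers hold a replica of a data block, and it prefers the least-loaded of those workers before falling back to any worker.

```python
def place_operator(block_locations, worker_load):
    """Pick a worker for an operator that reads one data block.

    block_locations: names of workers holding a replica of the block
    worker_load:     dict mapping worker name to operators already assigned
    Prefers the least-loaded worker that holds a replica; if none of the
    replica holders is a known worker, falls back to the least-loaded worker.
    Ties are broken by worker name so the choice is deterministic.
    """
    local = [w for w in block_locations if w in worker_load]
    candidates = local or list(worker_load)
    if not candidates:
        raise ValueError("no workers available")
    return min(candidates, key=lambda w: (worker_load[w], w))


load = {"w1": 2, "w2": 0, "w3": 1}
print(place_operator(["w1", "w3"], load))  # w3: holds the data, lighter than w1
print(place_operator(["w9"], load))        # w2: no local replica, least loaded
```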
AROM is currently in its incubation phase within EURA NOVA (http://euranova.eu). Once the code is mature and stable enough, AROM will be made available via this website under the Apache License, Version 2.0. If you would like an early preview of the code, please contact EURA NOVA.
AROM was presented to BeScala in November 2012. The presentation used on that occasion is published below.
To view this presentation properly, you will need a browser that supports the SVG format.
You can also access the full-size AROM presentation.
AROM is the continuation of the master's thesis of Arthur Lesuisse, then a student at the Université Libre de Bruxelles (ULB), Brussels, Belgium.
For more information, you can read the full master's thesis that led to AROM: