A Hadoop distribution for engineering simulation
Abstract: In this paper, we discuss the VELaSSCo project (Visualization for Extremely LArge-Scale Scientific Computing). This project aims to develop a dedicated platform to store scientific data from FEM (Finite Element Method) and DEM (Discrete Element Method) simulations. The engineering community uses both kinds of simulation to evaluate the behavior of a 3D object (for example, fluid simulation in a silo). These simulations produce large files, each composed of the successive time steps of a simulation. The amount of data produced is too large to fit on a single node. Some strategies distribute the data across nodes, but after several time steps some data has to be discarded to free memory.
In this project, we aim to develop a platform that enables the scientific community to store huge amounts of data on any kind of IT system. We target arbitrary IT systems because most scientists have access to modern computation nodes but not to large storage nodes. Our platform tries to fill the gap between these two worlds.
We give an overview of the VELaSSCo project and describe our platform and its deployment software. The platform can be deployed on any kind of IT system (dedicated storage nodes, HPC nodes, etc.) and is specifically designed to store data from DEM and FEM simulations. We also present a performance analysis of our deployment tool compared with the well-established myHadoop tool. Using containers and virtualization, our tool is able to increase computation capabilities.
To cite this version:
Benoit Lange, Toan Nguyen. A Hadoop distribution for engineering simulation. [Research Report] INRIA Grenoble - Rhône-Alpes. 2014.