Abstract
The goals of a large number of complex scientific and industrial applications are deeply linked to the effective use of high-performance computing (HPC) infrastructures and the efficient extraction of knowledge from vast amounts of data. However, the way complex workflows are developed is currently fragmented across multiple components, using different programming models and different processes for computing and data management.
eFlows4HPC aims to deliver a workflow software stack and an additional set of services to enable the development and integration of HPC simulation and modelling with big data analytics and machine learning in scientific and industrial applications. The software stack will make it possible to develop innovative adaptive workflows that use computing resources efficiently and that also take advantage of innovative storage solutions. To widen access to HPC for newcomers, the project will provide HPC Workflows as a Service (HPCWaaS), an environment for sharing, reusing, deploying and executing existing workflows on HPC systems. With these goals, the project will build a set of catalogs, repositories and registries to store data sets and software components, including whole workflow instances, which will be leveraged by the HPCWaaS.
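To make the idea of such a composite workflow concrete, the sketch below is a minimal illustration in plain Python: it chains a toy simulation stage with a data-analytics stage and a machine-learning stage, the kind of end-to-end pipeline the workflow software stack and HPCWaaS are meant to package, share and execute on HPC systems. All function names, parameters and values are hypothetical illustrations, not eFlows4HPC components.

```python
# Minimal sketch of a composite simulation + analytics + ML workflow.
# All names and values here are hypothetical illustrations, not actual
# eFlows4HPC components.
import numpy as np


def run_simulation(n_steps: int, seed: int) -> np.ndarray:
    """Stand-in for an HPC simulation: produce a time series of state values."""
    rng = np.random.default_rng(seed)
    drift = 0.05
    noise = rng.normal(scale=0.2, size=n_steps)
    return np.cumsum(drift + noise)


def extract_features(series: np.ndarray, window: int) -> np.ndarray:
    """Stand-in for the data-analytics stage: windowed summary statistics."""
    windows = series[: len(series) // window * window].reshape(-1, window)
    return np.column_stack([windows.mean(axis=1), windows.std(axis=1)])


def train_surrogate(features: np.ndarray, targets: np.ndarray) -> np.ndarray:
    """Stand-in for the ML stage: least-squares fit of a linear surrogate model."""
    design = np.column_stack([features, np.ones(len(features))])
    coeffs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coeffs


if __name__ == "__main__":
    series = run_simulation(n_steps=1000, seed=42)
    features = extract_features(series, window=50)
    targets = features[:, 0] * 2.0 + 1.0  # toy target derived from the features
    model = train_surrogate(features, targets)
    print("surrogate coefficients:", model)
```

In a real deployment the individual stages would be distributed tasks on an HPC system rather than local function calls, but the structure of the pipeline is the same.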
The workflow technologies and the associated machine learning and big data libraries used in the project build on previous open-source European initiatives. Specific optimization tasks for the use of accelerators (FPGAs, GPUs) and the European Processor Initiative (EPI) will be performed in the project use cases by selecting specific kernels from the pillar workflows.
For demonstrating the workflow software stack, the eFlows4HPC consortium has selected three use cases for three thematic pillars on manufacturing (Pillar I), climate (Pillar II) and natural hazards (Pillar III). Pillar I focuses on the construction of Digital Twins for the prototyping of complex manufactured objects, integrating state-of-the-art adaptive solvers with machine learning and data mining, and contributing to the Industry 4.0 vision. Pillar II will develop innovative adaptive workflows for climate modelling that make efficient use of computing resources by dynamically pruning simulations; it will also leverage the workflow software stack to study Tropical Cyclone (TC) tracks through (i) multi-model analysis in the context of the CMIP6 experiment and (ii) in-situ analytics, integrating and comparing data-intensive and data-driven approaches from different perspectives. Pillar III explores the modelling of natural catastrophes, in particular earthquakes and their associated tsunamis, shortly after such an event is recorded. Leveraging two existing workflows, the Pillar will work on integrating them with the eFlows4HPC software stack and on producing policies for urgent access to supercomputers.
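As a purely illustrative sketch of what dynamic pruning of an ensemble of simulations could look like, the following Python snippet advances a toy ensemble and periodically discards the members that drift farthest from a reference trajectory, freeing their resources for the remainder of the run. The pruning criterion, names and numbers below are assumptions for illustration only, not Pillar II's actual method.

```python
# Toy sketch of dynamic pruning of an ensemble of simulations.
# The pruning criterion and all names and values are illustrative assumptions;
# the actual Pillar II workflow may differ.
import numpy as np


def step_member(state: float, rng: np.random.Generator) -> float:
    """Advance one ensemble member by a single (toy) simulation step."""
    return state + rng.normal(loc=0.01, scale=0.1)


def run_pruned_ensemble(n_members: int = 16, n_steps: int = 100,
                        check_every: int = 20, keep_fraction: float = 0.5) -> dict:
    """Run an ensemble, periodically dropping members that stray from a reference."""
    rng = np.random.default_rng(0)
    members = {i: 0.0 for i in range(n_members)}
    for step in range(1, n_steps + 1):
        for i in members:
            members[i] = step_member(members[i], rng)
        if step % check_every == 0 and len(members) > 1:
            reference = 0.01 * step  # expected drift of the toy model at this step
            # Rank members by distance to the reference and keep the closest fraction.
            ranked = sorted(members, key=lambda i: abs(members[i] - reference))
            keep = ranked[: max(1, int(len(ranked) * keep_fraction))]
            members = {i: members[i] for i in keep}
    return members


if __name__ == "__main__":
    survivors = run_pruned_ensemble()
    print(f"{len(survivors)} members survived pruning:", survivors)
```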
The project will involve application stakeholders (as partners) in the process of defining the pillar use cases, with the goal of specifying complex workflows that combine modelling and simulation with high-performance data analytics (HPDA) and AI. The final goal is the adoption of the solutions defined in the project by the community, both by enabling impact in industrial cases and by fostering their exploitation in future HPC systems, through secure and simple access to these solutions, in some cases in the form of community services. To reinforce this, the Centers of Excellence (CoEs) in the relevant areas will be involved in the process, in some cases as partners in the project.