This section reflects the 2015 vision and the recommendations of more than 150 worldwide experts in scientific and industrial applications and in all sciences and technologies required for Exascale.
It takes into account existing strengths in the European HPC and ICT communities. It addresses key strategic areas for which there is an urgent need for funded programs of work, beyond the classical and conservative HPC approaches, to develop and improve European competitiveness and to achieve leadership.
The EESI data-centric vision was elaborated in 2014. Though very new in Europe, it is essential for approaching the ultra-complex and interdependent challenges of Extreme Computing and Extreme Data.
Discover the three pillars to be funded by the European Commission to build efficient Exascale applications.
Recommendations in this pillar concern programming models and methods, heterogeneity management, software engineering and cross-cutting issues like resilience, validation and uncertainty quantification with a strong focus on the specificity of Exascale in these domains.
Recommendations encompass the following programs:
- High productivity programming models for Extreme Computing,
- Holistic approach for extreme heterogeneity management of Exascale supercomputers,
EESI2 experts consider that Exascale will require heterogeneity at an unprecedented level. This means embedding in the same system near-data, data-parallel and function-specific accelerators, general-purpose multicores with multiple ISAs or with the same ISA but multiple energy-performance trade-offs, networking accelerators, as well as volatile and non-volatile memories. Such recommendations are coherent with the European Technology Platform for High Performance Computing (ETP4HPC) vision.
This set of recommendations aims to design and develop new, efficient HW/SW APIs for the integrated management of heterogeneous systems, near-data technologies and energy-aware devices, to enable Exascale-ready applications.
- Software Engineering Methods for High-Performance Computing,
To cope with the complexity of Exascale systems (very large numbers of nodes, heterogeneous resources, energy awareness, fault tolerance, to mention just a few challenges), new productivity-enhancing methods and tools are needed to use the manpower available for the development and maintenance of HPC software more effectively, maximizing the potential of applications under given resource constraints.
- Holistic approach to resilience,
The evolution of the software and hardware technologies will lead to an increase of fault rates that will translate into higher error and failure rates at Exascale. HPC hardware itself cannot detect and correct all errors and failures. Maintaining European resilience capabilities is mandatory to be able to develop efficient Exascale applications. But besides the development of resilience techniques, a holistic detection/recovery approach covering and orchestrating all layers from the hardware to the application appears necessary to be able to run simulations and data analytics executions to completion and produce correct results.
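To make the idea of an orchestrated detection/recovery stack concrete, here is a minimal sketch of application-level checkpoint/restart, one of the building blocks such a holistic approach would coordinate. The code is purely illustrative: the fault model, names and rates are assumptions for the example, not part of the EESI2 recommendations.

```python
import copy
import random

def run_with_checkpoints(steps, checkpoint_every, fail_prob=0.0, seed=0):
    """Toy time-stepping loop with application-level checkpoint/restart.

    Every `checkpoint_every` successful steps the full state is copied
    aside; a simulated transient fault rolls the loop back to the last
    checkpoint instead of aborting the whole run.
    """
    rng = random.Random(seed)
    state = {"t": 0, "value": 1.0}
    checkpoint = copy.deepcopy(state)
    restarts = 0
    while state["t"] < steps:
        if rng.random() < fail_prob:           # simulated transient fault
            state = copy.deepcopy(checkpoint)  # recover: roll back
            restarts += 1
            continue
        state["value"] *= 1.01                 # one "simulation" step
        state["t"] += 1
        if state["t"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)
    return state, restarts
```

Whatever fault rate is injected, the loop reaches the same final state; the cost of resilience appears only as recomputed steps, which is precisely the overhead a holistic hardware-to-application orchestration tries to minimise.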
- Verification, Validation and Uncertainty Quantification (VVUQ) tools evolution, for better exploitation of Exascale capacities
The mathematical background behind uncertainty analysis is very strong and comes from the field of statistics. Europe can claim world-leading experts on these topics, but the link between this community and the HPC community must be strengthened. The recommendation aims at preparing a unified European VVUQ package for Exascale computing by identifying and solving problems limiting the usability of these tools on many-core configurations; facilitating access to VVUQ techniques for the HPC community by providing software that is ready for deployment on supercomputers; and making methodological progress on VVUQ methods for very large computations.
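As a deliberately tiny illustration of why UQ workloads are natural candidates for Exascale machines, the sketch below propagates an uncertain input through a toy model with plain Monte Carlo sampling. The model and its parameters are invented for the example; a production VVUQ package would add variance reduction, surrogate models and parallel job management.

```python
import random
import statistics

def monte_carlo_uq(model, sampler, n_samples, seed=0):
    """Plain Monte Carlo uncertainty propagation: draw uncertain inputs,
    evaluate the (possibly very expensive) model on each sample, and
    summarise the output distribution.  Each model evaluation is an
    independent job, which is why UQ studies parallelise so naturally."""
    rng = random.Random(seed)
    outputs = [model(sampler(rng)) for _ in range(n_samples)]
    return statistics.mean(outputs), statistics.stdev(outputs)

# Invented toy model: a drag-like quantity 0.5 * v**2 with an uncertain
# velocity v ~ N(10, 1).  Analytically E[0.5 * v**2] = 0.5 * (10**2 + 1) = 50.5.
mean, std = monte_carlo_uq(
    model=lambda v: 0.5 * v * v,
    sampler=lambda rng: rng.gauss(10.0, 1.0),
    n_samples=10_000,
)
```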
Recommendations in this area concern specific and disruptive algorithms for Exascale computing, taking a step change beyond "traditional" HPC. This work will lead to the design and implementation of extremely efficient, scalable solvers for a wide range of applications.
The following recommendations are proposed for funding by the European Commission:
- Algorithms for Communication and Data-Movement Avoidance
The objective of this recommendation is to coordinate the multiple groups working on many different aspects and in different algorithmic areas.
The scope is to explore novel algorithmic strategies, far beyond the well-known communication hiding techniques, to minimize data movement as well as the number of communication and synchronization instances in extreme computing; minimization should happen at both local (e.g. within a multiprocessor, across a memory hierarchy) and remote (e.g. network) levels.
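A small local-level example of the idea: cache blocking a matrix multiply reduces traffic across the memory hierarchy without changing the arithmetic performed. The sketch below, in plain Python for clarity rather than speed, illustrates the principle that communication-avoiding algorithms generalise from the memory hierarchy to distributed memory.

```python
def matmul_tiled(a, b, tile=32):
    """Cache-blocked (tiled) matrix multiply.

    A classic local-level communication-avoidance technique: by working
    on tile x tile blocks small enough to stay in fast memory, each
    element of A and B crosses the memory hierarchy O(n/tile) times
    instead of O(n) times, while the floating-point work is unchanged.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, m, tile):
            for jj in range(0, p, tile):
                # Update one block of C using one block of A and B.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, p)):
                            c[i][j] += aik * b[k][j]
    return c
```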
- Parallel-in-Time: a fundamental step forward in Exascale Simulations (disruptive approach).
The efficient exploitation of Exascale systems will require massive increases in the parallelism of simulation codes, and today most time-stepping codes make little or no use of parallelism in the time domain; the time is right for a coordinated research program exploring the huge potential of Parallel-in-Time methods across a wide range of application domains.
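A minimal sketch of one such method, the Parareal iteration, applied to a scalar ODE; the problem and step counts are illustrative. A cheap coarse propagator predicts the solution at time-slice boundaries, an accurate fine propagator corrects it, and the fine sweeps over all slices are the part that runs in parallel on a real machine.

```python
def parareal(f, y0, t0, t1, n_slices, n_fine, n_iters):
    """Minimal Parareal iteration for dy/dt = f(y).

    G: one explicit Euler step per time slice (cheap, sequential).
    F: n_fine Euler steps per slice (accurate; parallel in a real run,
    emulated sequentially here).
    """
    dt = (t1 - t0) / n_slices

    def euler(y, h, steps):
        for _ in range(steps):
            y = y + h * f(y)
        return y

    G = lambda y: euler(y, dt, 1)
    F = lambda y: euler(y, dt / n_fine, n_fine)

    # Initial coarse sweep over the slice boundaries.
    u = [y0]
    for _ in range(n_slices):
        u.append(G(u[-1]))

    for _ in range(n_iters):
        fine = [F(u[n]) for n in range(n_slices)]  # parallel in practice
        u_new = [y0]
        for n in range(n_slices):
            # Parareal correction: new coarse + (fine - old coarse).
            u_new.append(G(u_new[-1]) + fine[n] - G(u[n]))
        u = u_new
    return u
```

After at most n_slices iterations the iteration reproduces the sequential fine solution exactly; the speed-up comes from needing far fewer iterations than slices in practice.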
This area links extreme computing and extreme data. For the transition to Exascale, current data life cycle management techniques must be fully rethought, as described in the first joint document "Software for Data Centric Approaches to Extreme Computing", which is more a vision than a concrete recommendation. This pillar gathers together key strategic issues for Exascale applications which have not been sufficiently addressed in Europe until now.
Ensuing from the EESI holistic vision of "Software for Data Centric Approaches to Extreme Computing", this section describes the final roadmap and recommendations issued by EESI2 experts for critical R&D challenges to be funded by the European Commission in order to develop efficient Exascale applications.
- Towards flexible and efficient Exascale software couplers,
As stated by application experts in many recent reports, the rise of extreme computing with data-intensive capacities will allow only a few "hero" applications to scale out to such full systems in capability mode (one simulation scaling to billions of cores). The major potential of such upcoming architectures will rely on capacity simulations based on multi-scale and multi-physics scientific codes running individually on hundreds of thousands of cores and smartly coupled together into heavily loaded models that exchange data at high frequency.
Such coupled models are challenging to develop due to the need to coordinate execution of the independently developed model components while resolving both scientific and technical heterogeneities. Despite some existing specific initiatives, there is a crucial need to develop new and common European-wide coupling methodologies and tools in order to support major scientific challenges in research (evolution of the climate, astrophysics and materials) and engineering (combustion, catalysis, energy, …).
To improve the performance of coupled applications in terms of usability and scalability on Exascale machines, the recommendations are to work on coupling libraries as well as on coupled models and their environment.
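To fix ideas, here is a deliberately tiny coupling loop between two toy single-field "solvers"; the class name and the relaxation scheme are invented for the example. Real couplers add interpolation between non-matching grids, unit conversion and parallel data redistribution, but the control flow is recognisably this one.

```python
class HeatSolver:
    """Toy single-field 'code': relaxes its interface value toward the
    value received from the partner code at each coupling step."""
    def __init__(self, temp):
        self.temp = temp

    def step(self, partner_temp):
        self.temp += 0.5 * (partner_temp - self.temp)
        return self.temp

def couple(solver_a, solver_b, n_steps):
    """Explicit coupling loop: both codes advance once per coupling step
    and the coupler exchanges the interface values afterwards.  In a
    real coupler the two step() calls run concurrently on separate
    partitions; only the exchange synchronises them."""
    a_out, b_out = solver_a.temp, solver_b.temp
    for _ in range(n_steps):
        a_new = solver_a.step(b_out)   # each code sees the partner's
        b_new = solver_b.step(a_out)   # value from the previous exchange
        a_out, b_out = a_new, b_new
    return a_out, b_out
```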
- In Situ Extreme Data Processing and better science through I/O avoidance in High-Performance Computing systems,
The ultimate goal of in situ extreme data processing is to promote new data transformations and compressions that drastically reduce the extreme raw data generated during HPC simulations, preserving the information required for a particular analysis while sacrificing almost everything else, so that only the relevant data are stored.
All these theoretical ideas should be aligned with the practical challenges of in-situ, in-transit and real-time high-performance computation, where extreme data must be processed under severe communication and memory constraints.
The goal of this recommendation is to fund R&D programs that move data analysis frameworks from a post-processing-centric approach to a close-to-real-time concurrent approach, based on either in-situ or in-transit processing of the raw data of numerical simulations as they are computed by massively parallel post-petascale and Exascale applications.
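A minimal sketch of the in-situ reduction idea: instead of writing every raw snapshot to disk for later post-processing, the simulation feeds each value to a streaming accumulator (here Welford's algorithm plus a running maximum, chosen for the example), so that only a handful of numbers per field ever need to be stored.

```python
class InSituStats:
    """In-situ reduction sketch: accumulate streaming statistics while
    the simulation runs, avoiding I/O of the raw time series entirely.
    Welford's algorithm keeps mean/variance numerically stable in one
    pass; a running max stands in for 'feature extraction'."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0
        self.maximum = float("-inf")

    def update(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        self.maximum = max(self.maximum, value)

    @property
    def variance(self):
        # Population variance of everything seen so far.
        return self.m2 / self.n if self.n else 0.0
```

The simulation loop would call `update()` on each freshly computed value; what leaves the node is the accumulator, not the data.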
- Declarative processing frameworks for big data analytics, extreme data fusion,
Exascale systems produce an enormous amount of synthetic data that needs to be processed (e.g. for visualization) in order to fully understand what they simulate; conversely, data acquisition systems gather an enormous amount of diverse real data that needs to be processed to better understand the situations it was gathered from. Computer scientists and statisticians have traditionally managed and processed these data. The current Variety, Volume and Velocity of data demand synergy and collaboration between different fields of science in order to extract full intelligence and knowledge from these data in close to real time.
The rise of multi-petascale and upcoming Exascale HPC facilities will allow turbulent simulations based on LES and DNS methods to address high-fidelity complex problems in climate, combustion, astrophysics or fusion. These massive simulations, performed on tens to hundreds of thousands of threads, will generate a huge volume of data, which is difficult and inefficient to post-process asynchronously afterwards by a single researcher. The proposed approach consists in post-processing these raw data on the fly with smart tools able to automatically extract pertinent turbulent flow features, store only a reduced amount of information, or provide feedback to the application in order to steer its behaviour.
As data mining in large-scale turbulent simulations applied to climate, combustion, traditional CFD, astrophysics, fusion, etc. will become more and more difficult because of the size and complexity of the data generated, it is mandatory to develop a complete toolbox of efficient parallel algorithms.
Not all of the recommendations are at the same level of generalization but they are complementary and linked to each other by their global common objective: enabling the emergence of a new generation of intensive data and extreme computing applications. Some of them are fully disruptive; all need to go beyond known HPC technologies and methods.
All these recommendations should be supported and funded. Some of these recommendations could be addressed in part by being strategic themes for new Centres of Excellence (CoEs).
Read full EESI 2014 recommendations.
Key R&D programs for Exascale
EESI expertise and deep studies help to build an Exascale computing capability in Europe and lead the way at a worldwide level in the transformation of society and industry.
Experts have investigated the state of the art of worldwide developments to detect disruptive technologies, cross-cutting issues, numerical processing and software engineering. Such knowledge has made it possible to build a gap analysis and to provide orientation and monitoring of R&D actions and training.
Following its vision of a data centric approach, EESI2 has led the way toward exascale by funding and chairing an international initiative on Big Data, and issuing its recommendations.
EESI2 proposes 8 recommendations for R&D programs to implement efficient exascale applications and the development of supporting software environments:
5 absolutely critical programs:
- Ultra scalable algorithms,
- Big Data,
- High productivity programming models.
3 high priorities:
- Mini apps,
- Software Engineering Methods for High-Performance Computing,
- Verification, Validation and Uncertainty Quantification.
At the end of EESI2, the new data-centric approach gives more recommendations related to the three pillars for Exascale.
Industrial application recommendations
Large petaflop computers are operational now in several companies in Europe.
Big data management is already a reality. Nevertheless, the European ecosystem is outpaced by other regions.
Indeed, users of such infrastructures are mainly large companies or academia, and most computing resources and applications are developed outside Europe. Actions need to be undertaken to keep Europe competitive.
For the industrial applications, the EESI1 roadmap is followed by most companies with a slight delay. In aeronautics, there is a need to couple LES simulations and to develop the simulation codes as well as flexible couplers and optimised coupling techniques.
For the Weather/Climate and Solid Earth Sciences, the climate community is now preparing all the workflows and simulation codes for the next IPCC campaign. This will require addressing data management issues, improving the scalability of the applications, developing dynamical cores, coupling multiple codes, and assessing the uncertainties of the models.
In fundamental sciences, many massive simulations in cosmology are raising crucial issues for handling very large amounts of data that must be treated on the fly during the computation and made available for years to worldwide communities.
In Life Sciences, the European Commission announced in early 2013 the launch of two major Flagships: the Human Brain Project and Graphene, with funding of more than €1B over 10 years for each project. These two initiatives will lead to the creation in Europe of strongly structured communities highly connected with existing research infrastructures such as PRACE or GEANT.
Disruptive technologies in the following fields:
- Simulation and Optimization with uncertainties
- Simulation and Optimization in Complex Networks
- Data Driven Simulations
- Bridging scales by ultra-large simulations
- Advanced Adaptive Resolution
- Multiscale: hybrid and beyond
Additional recommendations are the following:
- To reinforce the PRACE training program across Europe and the support of HPC and numerical simulation training from education (undergraduate programs) to continuing education. MOOC tools need to be developed to complement in-person training in order to reach more people from new disciplines or from remote locations.
- To fund co-educational programs at the master level which provide a broad knowledge of hardware technology and system and application software, not only for computer scientists but also for students in scientific computing.
- To foster educational programs for training postdoctoral and senior scientists, who have to follow the trends and developments in hardware architectures, software, and the optimization and tuning of codes.
- With the rise of Open Innovation between academia and industry, it becomes mandatory to industrialize the codes developed by academia in order to make them usable by industry at a higher TRL. For the European Commission as well as other funding agencies, there is a need to invest in structures/teams able to provide software engineering methodologies for developing new standard components.
Enabling technologies recommendations
Enabling technologies are the technologies that underpin any Exascale computation and are necessary to realise the potential of future Exascale systems. Recommendations encompass the following:
- To establish a single initiative for the study of numerical algorithms for exascale computing at worldwide level
- To follow up on new paradigms for parallel programming that are able to support the requirements of exascale applications
- To follow up on the evolution of memory technologies that can largely increase the available bandwidth
- To align and merge efforts with the ETP4HPC forum, organizing common meetings in order to coordinate positions and optimize resources.
Cross cutting issues recommendations
Big Data for extreme computing, resilience, and uncertainties and validation are key issues for which the working groups concluded by proposing urgent recommendations.
This section reports the work done by EESI2 in the first year of activity.
The activity first focused on the state of the art of the topics addressed, then continued by examining the evolution of these domains and identifying a gap analysis and some recommendations for approaching the Exascale goal.
As the activity was new to EESI2, the gap analysis will have to be refined in the next reporting period.
Discover the WP5/ D5.1 gap analysis and recommendations