Director, National Center for Supercomputing Applications
Grainger Distinguished Chair in Engineering
Siebel School of Computing and Data Science
University of Illinois Urbana-Champaign
Urbana, Illinois
Looking for the head (chair) of the CS Department? You want the Director of the Siebel School of Computing and Data Science, Nancy Amato.
Phone: | 217 244 6720 |
Fax: | 217 265 6738 |
Email: | wgropp at illinois.edu |
ORCID: | orcid.org/0000-0003-2905-3029 |
American Association for the Advancement of Science
I am a Council Member for Section T - Information, Computing, and Communication. AAAS plays a vital role in advocating for science and in informing policy makers and the public about science and its impact on society. Computing and data, especially but not only AI and ML, are transforming science and society, and it is more important than ever that the AAAS speak up for both the opportunities and the risks in applying computing technologies.
Computing Research Association and Computing Community Consortium
These organizations provide community input on computing and computer science, and contribute to the discussion of research directions and policy in the US. I serve on the Computing Research Association (CRA) board as one of the IEEE Computer Society representatives. I am also a member of the Computing Community Consortium (CCC) and serve on its executive committee.
Research Interests
My interest is in the use of high performance computing to solve problems that are too hard for other techniques. I have concentrated in two areas: the development of scalable numerical algorithms for partial differential equations (PDEs), especially for the linear systems that arise from approximations to PDEs, and the development of programming models and systems for expressing and implementing highly scalable applications. In each of these areas, I have led the development of software that has been widely adopted. PETSc is a powerful numerical library for the solution of linear and nonlinear systems of equations. MPI is the most widely used parallel programming system for large-scale scientific computing. The MPICH implementation of MPI is one of the most widely used and is the implementation of choice for the world's fastest machines.
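To give a flavor of how these libraries are used, here is a minimal sketch in C (not code from any particular project) that uses PETSc to assemble a simple 1D Laplacian and solve the resulting linear system with a Krylov method; the problem size, the matrix, and the use of the newer PetscCall error-checking macro are illustrative choices only.

```c
/* Minimal PETSc sketch: assemble a 1D Laplacian and solve Ax = b with a
   Krylov method.  Problem size and solver choices are illustrative only. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PetscInt i, n = 100, Istart, Iend;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Create a sparse matrix distributed across all processes. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* Right-hand side and solution vectors compatible with A. */
  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0));

  /* Krylov solver; the method and preconditioner can be selected at
     run time with -ksp_type and -pc_type. */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

Because the sketch calls KSPSetFromOptions, the Krylov method and preconditioner can be changed at run time (for example, -ksp_type cg -pc_type jacobi) without recompiling, which is one reason libraries like PETSc have been so widely adopted.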
Current Major Research Projects
These are some of my major research projects. I have other projects and collaborations, particularly in parallel I/O and parallel numerical algorithms, as well. I lead several projects that have deployed cyberinfrastructure that supports research. The major projects are two GPU-rich supercomputers. Delta is funded by the National Science Foundation and is designed to accelerate the adoption of new computing technologies such as GPUs and non-POSIX file systems by the computational science community. This resource is available through the ACCESS program.
DeltaAI was awarded by NSF in May 2023, and provides an AI/ML-optimized supercomputer that is connected to Delta, allowing users to take advantage of the capabilities of both systems. Some of the funding for the operations of DeltaAI came from the National Discovery Cloud for Climate (NDC-C), and climate researchers are encouraged to make use of both Delta and DeltaAI.
A smaller AI/ML system for Illinois researchers, deployed in 2016, is HAL, the Deep Learning Major Research Instrument, which combines my interests in HPC software, numerics, and high-performance I/O with the revolution in machine learning.
A related project is Illinois Computes, which provides both systems and expert support for Illinois Researchers.
The Center for Exascale-enabled Scramjet Design is funded by the US Dept. of Energy in the Predictive Science Academic Alliance Program (PSAAP III). The project has openings for graduate students with interests in numerical analysis, scientific computing, parallel programming, performance, and I/O, among others.
The Midwest Big Data Hub (MBDH) is one of four NSF-funded regional big data innovation hubs, which serve to bring communities together to apply data science to a wide range of challenges. I led the MBDH from 2017 through mid-2020, and currently am one of the co-principal investigators.
The MPI Forum is the organization that continues to develop the Message-Passing Interface standard, which is the dominant programming system for massively parallel applications in science and engineering. I lead several of the chapter committees and am the overall editor. I also work on MPI implementation and design.
High performance I/O is another focus of my research, and I have several projects looking at everything from using and implementing collective I/O to using database ideas to better manage data from simulations. One feature that these have in common is that they do not require POSIX I/O semantics - specifically, the strong consistency semantics that contribute to complex implementations and that lead to performance and robustness problems. The Delta and DeltaAI systems, mentioned above, feature a fast, non-POSIX file system.
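As a small, hedged illustration of collective I/O (not code from any of these projects), the sketch below has each MPI process write its own contiguous block of a shared file with a single collective MPI-IO call; the file name and block size are placeholder choices.

```c
/* Sketch of collective I/O with MPI-IO: each rank writes its own block of a
   shared file in one collective call.  File name and block size are
   placeholders, not taken from any specific project. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int  count = 1 << 20;              /* doubles per process */
    int        rank;
    double    *buf;
    MPI_File   fh;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = (double *)malloc(count * sizeof(double));
    for (int i = 0; i < count; i++) buf[i] = rank;   /* sample data */

    /* Each rank writes a disjoint, contiguous block; the collective call
       lets the MPI library aggregate and reorder requests for the file
       system rather than issuing many independent writes. */
    offset = (MPI_Offset)rank * count * sizeof(double);
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}
```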
Research Opportunities
I have an active research program and currently have openings for graduate students, both Ph.D. and Masters students, and scientific programmers. A brief description of these is given below; more information can be found here.
Achieving high performance requires paying close attention to data movement and taking a quantitative approach to performance analysis. These projects address different challenges in achieving high performance in applications.
- Process mapping for MPI programs. How MPI processes are assigned to cores, sockets, and nodes can impact the amount of data that moves between each level of that hierarchy - and thus the scalability and performance of an application. This project looks at ways to improve that mapping of MPI processes and thus provide better performance and scalability with little to no work by the application programmer; a starting-point sketch using node-local communicators appears after this list. This project is appropriate for an advanced undergraduate or a masters student; generalizations of this to more complex process mappings are appropriate for Ph.D. students.
- Data movement optimizations for I/O and data management. Modern systems have complex and hierarchical data storage systems, and applications rarely make good use of these capabilities. This project is part of a larger effort to improve tools for managing data for science applications. This project is appropriate for a Ph.D. student.
- Programming tools for heterogeneous systems. While there has been much research in this area, there are few working systems. The first project is to evaluate the current technology by creating a version of the Top500 HPL benchmark for multi-GPU nodes. This will enable use of "evergreen" (frequently updated) parallel hardware for challenging applications.
Related to this project is the development of a new set of MPI benchmarks appropriate for today's heterogeneous nodes - both CPU-only nodes with large numbers of cores (e.g., Delta's CPU nodes have 128 cores) and GPU-CPU nodes with multiple GPUs (e.g., Delta's GPU nodes have 4 or 8 GPUs); a sketch of a simple point-to-point measurement appears after this list. This project is appropriate for an advanced undergraduate or a masters student.
- Code transformations for performance. The reality is that compilers often don't have enough information to produce the best performing code. This project uses code transformations to improve the performance of applications, particularly for codes limited by memory performance; a small loop-blocking example appears after this list. This project is appropriate for an advanced undergraduate or a masters student; generalizations of this to more complex code transformations are appropriate for Ph.D. students.
- MPI datatypes provide a language for describing data movement, particularly for non-contiguous data; a short example appears after this list. This project seeks to improve the performance of applications that make use of MPI datatypes, especially for applications using GPUs and multicore CPUs. This project is appropriate for an advanced undergraduate or a masters student; generalizations of this to more complex data movement are appropriate for Ph.D. students.
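For the process mapping project, a natural first step is simply discovering how ranks are laid out across nodes. The sketch below, which assumes nothing beyond standard MPI-3, uses MPI_Comm_split_type to build a per-node communicator; a real mapping tool would go further and reorder ranks based on the application's communication pattern.

```c
/* Sketch: discover the node-local layout of MPI processes using only
   standard MPI-3 calls; a mapping tool would go on to reorder ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int      rank, node_rank, node_size;
    MPI_Comm nodecomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Group together all ranks that can share memory, i.e., the ranks
       on the same node. */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &nodecomm);
    MPI_Comm_rank(nodecomm, &node_rank);
    MPI_Comm_size(nodecomm, &node_size);

    printf("global rank %d is node-local rank %d of %d\n",
           rank, node_rank, node_size);

    MPI_Comm_free(&nodecomm);
    MPI_Finalize();
    return 0;
}
```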
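For the MPI benchmark project, the sketch below shows the flavor of a basic point-to-point (ping-pong) measurement between ranks 0 and 1; the message size and iteration count are arbitrary, and a benchmark for today's heterogeneous nodes would also place buffers in GPU memory and vary the placement of the two processes.

```c
/* Sketch of a ping-pong latency test between ranks 0 and 1.  Run with at
   least 2 processes; message size and iteration count are illustrative. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int nbytes = 1 << 20, iters = 100;
    int       rank;
    char     *buf;
    double    t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = (char *)malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("avg round trip: %g us for %d bytes\n",
               (t1 - t0) / iters * 1e6, nbytes);

    free(buf);
    MPI_Finalize();
    return 0;
}
```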
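For the code transformation project, loop blocking (tiling) is a representative example of a transformation that helps memory-bandwidth-limited code; the matrix transpose below is a generic illustration, with the matrix and block sizes chosen arbitrarily.

```c
/* Sketch: loop blocking (tiling) applied to a matrix transpose, a classic
   transformation for memory-bandwidth-limited code.  N and the block size
   are illustrative; the best block size depends on the cache hierarchy. */
#include <stddef.h>
#include <stdlib.h>

#define N  4096
#define BS 64          /* block (tile) size; tune for the target cache */

/* Straightforward transpose: strides through b with stride N, which makes
   poor use of cache lines for large N. */
void transpose_naive(double *restrict a, double *restrict b)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            b[j * N + i] = a[i * N + j];
}

/* Blocked transpose: works on BS x BS tiles so that both the source and
   destination tiles fit in cache, reducing memory traffic. */
void transpose_blocked(double *restrict a, double *restrict b)
{
    for (size_t ii = 0; ii < N; ii += BS)
        for (size_t jj = 0; jj < N; jj += BS)
            for (size_t i = ii; i < ii + BS; i++)
                for (size_t j = jj; j < jj + BS; j++)
                    b[j * N + i] = a[i * N + j];
}

int main(void)
{
    double *a = calloc((size_t)N * N, sizeof(double));
    double *b = calloc((size_t)N * N, sizeof(double));
    transpose_naive(a, b);     /* baseline version */
    transpose_blocked(a, b);   /* tiled version */
    free(a);
    free(b);
    return 0;
}
```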
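Finally, for the MPI datatypes project, the sketch below uses MPI_Type_vector to describe a non-contiguous column of a row-major matrix so it can be sent in a single call; the matrix size and the choice to exchange a column are illustrative only.

```c
/* Sketch: describe one column of a row-major NxN matrix with MPI_Type_vector
   so the non-contiguous data can be sent in a single MPI call.  Run with at
   least 2 processes; sizes are illustrative only. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int    N = 1024;
    int          rank;
    double      *matrix;
    MPI_Datatype column;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    matrix = (double *)calloc((size_t)N * N, sizeof(double));

    /* N blocks of 1 double, separated by a stride of N doubles: one column. */
    MPI_Type_vector(N, 1, N, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    if (rank == 0)
        MPI_Send(&matrix[0], 1, column, 1, 0, MPI_COMM_WORLD);   /* column 0 */
    else if (rank == 1)
        MPI_Recv(&matrix[0], 1, column, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    MPI_Type_free(&column);
    free(matrix);
    MPI_Finalize();
    return 0;
}
```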
Of Special Interest
I was named one of the 35 HPC Legends by HPCwire in 2024.
I was honored to receive the HPCwire Readers' Choice Award for Outstanding Leadership in HPC in 2023.
The MPI 4.1 Standard has been released! MPI is the dominant programming system for distributed memory systems in scientific research, and is used in everything from aircraft design to training AI models (MPI was used in training ChatGPT, for example). There is also an unofficial HTML version of MPI 4.1.
Our report on Next Generation Earth Systems Science at the National Science Foundation (2022) is available. I served on the study committee for this report and I am particularly happy with the community's recognition of the importance of computing and computing professionals for progress in this area.
Our report on Future Directions for NSF Advanced Computing Infrastructure to Support U.S. Science and Engineering in 2017-2020 is available! The report is freely available at that link, and provides a framework for NSF's advanced computing for the future (not just until 2020). It continues to be referenced in policy documents and legislation, most recently in the CHIPS Act.
Current Conference Committees
Conference | Role
---|---
SC25 | Test of Time (chair)
IPDPS'25 | Program Committee
EuroMPI'24 | Program Committee