For Immediate Release
The Accelerated Box of Flash: Accelerating Intensive Data Operations with Computational Storage
Radically new approach to storage acceleration aids data manipulation for research and discovery
San Diego, March 18, 2022
- Los Alamos National Laboratory, NVIDIA (Mellanox), Eideticom, Aeon Computing and SK hynix co-developed the “Accelerated Box of Flash,” or ABOF, platform.
- The ABOF platform incorporates accelerator technology to offload performance-critical storage functions from host systems.
Data is a vital part of solving complicated scientific questions, in endeavors ranging from genomics, to climatology, to the analysis of nuclear reactions. However, an abundance of data is often only as good as the ability to efficiently store, access and manipulate that data. To facilitate discovery with big data problems, researchers at Los Alamos National Laboratory, in collaboration with industry partners, have developed an open storage system acceleration architecture for scientific data analysis, which can deliver 10 to 30 times the performance of current systems. The architecture enables offloading of intensive functions to an accelerator-enabled, programmable and network-attached storage appliance called an Accelerated Box of Flash or simply ABOF. ABOF systems are destined to be a key component of the Laboratory’s future HPC platforms.
“Scientific data and the data-driven scientific discovery techniques used to analyze that data are both growing rapidly,” said Dominic Manno, researcher with Los Alamos National Laboratory’s High Performance Computing division. “Performing the complex analysis to enable scientific discovery requires huge advances in the performance and efficiency of scientific data storage systems. The ABOF programmable appliance enables high-performance storage solutions to more easily leverage the rapid performance improvements of networks and storage devices, ultimately making more scientific discovery possible. Placing computation near storage minimizes data movement and improves the efficiency of both simulation and data-analysis pipelines.”
Scalable computing systems are adopting Data Processing Units (DPUs) placed directly on the data path to accelerate intensive functions between CPUs and storage devices; however, the ability to leverage DPUs within production-quality storage systems for use in complex HPC simulation and data-analysis systems has proven difficult. While DPUs have specialized computing capabilities that are tailored to data processing tasks, their integration into HPC systems has not fully realized available efficiencies.
The ABOF appliance is the product of hardware and storage system software co-design. It simplifies the use of NVIDIA BlueField-2 DPUs and other accelerators for offloading intensive operations from host CPUs without major storage system software modifications, and it allows users to benefit from these offloads and the resulting speedups with no application changes. The current ABOF implementation applies specialized accelerators to three functional areas critical to storage systems – compression, erasure coding and checksums. Each of these functions costs time, money and energy in storage systems. The appliance uses BlueField-2 DPUs with 200 Gb/s InfiniBand networking. The performance-critical functions of the popular Linux Zettabyte File System (ZFS) are offloaded to the accelerators in the ABOF. This ZFS offload is accomplished through a new ZFS Interface for Accelerators, which is available on GitHub. The Linux DPU Services Module, also on GitHub, is a Linux kernel module that enables the use of DPUs directly from within the kernel, irrespective of where they sit along the data path.
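For readers who want a concrete picture of the three offloaded functions, the following is a minimal illustrative sketch in Python, not the ABOF software or the ZFS Interface for Accelerators. Every name in it (SoftwareOffloader, write_block, recover_stripe) is hypothetical. It assumes zlib compression, a CRC-32 checksum and single XOR parity as simple stand-ins for the algorithms a production file system would use, and it runs everything on the host CPU. The point it shows is that the write path calls one small interface, so an accelerator-backed provider could replace the software one without changes to the caller.

```python
import zlib


class SoftwareOffloader:
    """Host-CPU fallback. A hypothetical accelerator-backed provider
    would expose the same three methods."""

    def compress(self, block: bytes) -> bytes:
        return zlib.compress(block)

    def checksum(self, block: bytes) -> int:
        # CRC-32 is a stand-in chosen for brevity.
        return zlib.crc32(block)

    def parity(self, stripes) -> bytes:
        # Single XOR parity: the simplest form of erasure coding.
        out = bytearray(len(stripes[0]))
        for stripe in stripes:
            for i, byte in enumerate(stripe):
                out[i] ^= byte
        return bytes(out)


def write_block(block: bytes, offloader, n_data: int = 4) -> dict:
    """Compress, checksum, then split into n_data stripes plus one parity stripe."""
    compressed = offloader.compress(block)
    stripe_len = -(-len(compressed) // n_data)  # ceiling division
    padded = compressed.ljust(stripe_len * n_data, b"\0")
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len] for i in range(n_data)]
    return {
        "length": len(compressed),
        "checksum": offloader.checksum(compressed),
        "stripes": stripes,
        "parity": offloader.parity(stripes),
    }


def recover_stripe(record: dict, lost: int, offloader) -> bytes:
    """Rebuild one missing data stripe from the survivors and the parity stripe."""
    survivors = [s for i, s in enumerate(record["stripes"]) if i != lost]
    return offloader.parity(survivors + [record["parity"]])
```

In this sketch, moving the work to an accelerator would mean passing a different provider object to write_block; the calling code stays the same, which mirrors the "no application changes" goal described above.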
The project underwent a successful internal demonstration following the January release of the ABOF appliance hardware and its supporting software. Collaborators included NVIDIA, which built the data processing units and provided a scalable storage fabric; Eideticom, which created the NoLoad computational storage stack used to accelerate data-intensive operations and minimize data movement; Aeon Computing, which designed the storage enclosure and integrated each component into it; and SK hynix, which partnered on providing fast storage hardware.
“HPC is solving the world’s most complex problems, as we enter the era of exascale AI,” said Gilad Shainer, senior vice president of networking at NVIDIA. “NVIDIA’s accelerated computing platform dramatically boosts performance for innovative exploration by pioneers such as Los Alamos National Laboratory, allowing researchers to drastically speed up breakthroughs in scientific discoveries.”
“The Next Generation Open Storage Architecture enables a new level of performance and efficiency thanks to its hardware-software co-design, open standards and innovative use of technologies such as DPUs, NVMe and Computational Storage,” said Stephen Bates, Chief Technology Officer at Eideticom. “Eideticom is proud to work with Los Alamos National Laboratory and the other partners to develop the computational storage stack used to showcase how this architecture can achieve these new levels of performance and efficiency. The efficient use of accelerators, coupled with innovative software and open standards are key to the next generation of data-centers.”
“Developing a cutting-edge storage product with an end-user has been a very positive experience,” said Doug Johnson, cofounder of Aeon Computing. “Working together with the technology vendors and end-user in collaboration allowed for rapid iteration and enhancement of a new type of storage product that will serve the most important goal a product can have, acceleration of the end-user’s workflow.”
“SK hynix joined this collaboration building ABOF because we understand the need for a new flash memory-based system that can accelerate data analysis,” said Jin Lim, vice president of the Solution Lab at SK hynix. “Building on this showcase technology, we are committed to work with the collaboration partners in further defining the new architecture of the computational storage device and requirements that are critical to its best use cases.”
Building on the file system acceleration project, researchers next plan to integrate a set of common analysis functions into the system. That functionality would allow scientists to analyze data with their existing programs, potentially avoiding additional data movement and the need for additional supercomputing resources. It would be specialized and tailored to the scientific community – another robust tool for tackling the complicated, data-intensive questions that underlie the challenges in our world.
About Aeon Computing
Aeon Computing is based in San Diego, California and has over 55 years of staff experience in high performance computing, enterprise computing architectures, and data storage, with a focus on architecting perfectly suited customer solutions. Their customers include academic, government, and commercial institutions that prefer high performance design over stock solutions.
About Los Alamos
Los Alamos National Laboratory, a multidisciplinary research institution engaged in strategic science on behalf of national security, is operated by Los Alamos National Security, LLC, a team composed of Bechtel National, the University of California, BWX Technologies, Inc. and URS for the Department of Energy’s National Nuclear Security Administration. Los Alamos enhances national security by ensuring the safety and reliability of the U.S. nuclear stockpile, developing technologies to reduce threats from weapons of mass destruction, and solving problems related to energy, environment, infrastructure, health, and global security concerns.
Co-founder, Aeon Computing
Follow Aeon at @AeonComputing