Catalog Record

Report from S&T Project 22041: Evaluation of file formats for storage and transfer of large datasets in the RISE platform

The Reclamation Research and Development Office, through the Science & Technology Program, funded an evaluation of file formats for large datasets in RISE. A team of Reclamation scientific and information technology (IT) subject matter experts evaluated multiple file formats commonly used for scientific data through literature review and independent benchmarks. The Network Common Data Form (netCDF) and Zarr formats were identified as open-source options that could meet a variety of Reclamation use cases. Both formats support metadata, data compression, subsetting, and appending in a single file using an efficient binary format. In addition, the Zarr format is optimized for cloud storage applications. While supporting both formats would provide the most flexibility, the maturity of the netCDF format led to its prioritization as the preferred RISE file format for large datasets. This report documents the evaluation and selection of large data file formats for the RISE platform. It also provides a preliminary list of changes to the RISE platform needed to support the netCDF format. The intent is to frame future RISE development by providing a roadmap for supporting large datasets within the platform.
Generation Effort S&T Project 22041: Evaluation of file formats for storage and transfer of large datasets in the RISE platform
Location Worldwide
Themes Water, Water Quality, Environmental
Tags Modeling, Open Data, Storage
Reclamation Project
Reclamation Program Science and Technology Program

Location Information

Location Name Worldwide
Location Description
Location Tags World, Western US, North America, South America, Europe, Asia, Australia
Location Parent
State(s) Alaska, Alabama, Arkansas, Arizona, California, Colorado, Connecticut, District of Columbia, Delaware, Florida, Georgia, Hawaii, Iowa, Idaho, Illinois, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Maryland, Maine, Michigan, Minnesota, Missouri, Mississippi, Montana, North Carolina, North Dakota, Nebraska, New Hampshire, New Jersey, New Mexico, Nevada, New York, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Virginia, Vermont, Washington, Wisconsin, West Virginia, Wyoming
Unified Region(s) North Atlantic-Appalachian, South Atlantic-Gulf, Great Lakes, Mississippi Basin, Missouri Basin, Arkansas-Rio Grande-Texas-Gulf, Upper Colorado Basin, Lower Colorado Basin, Columbia-Pacific Northwest, California-Great Basin, Alaska, Pacific Islands
Elevation [ N/A ]
Vertical Datum [ N/A ]
Coordinates (lat, long) See Location Details
Horizontal Datum WGS84