DHPC Adelaide

DHPC Technical Report DHPC-021

On-Line Data Archives

K.A. Hawick, P.D. Coddington, H.A. James, C.J. Patten

Archived: 1 December 1997

Published in Proc. of Hawai'i International Conference on System Sciences (HICSS-34), Maui, January 2001.

Abstract

Digital libraries and other large archives of electronically retrievable and manipulable material are becoming widespread in both commercial and scientific arenas. Advances in networking technologies have led to a greater proliferation of wide-area distributed data warehousing. This presents particular challenges associated with distributed data management. We review the available tools and technologies for supporting distributed on-line data archives and explain the key concept of "active" data archives, in which data can be processed on demand prior to delivery.

We present a summary of our On-Line Data Archives (OLDA) program of work in developing wide-area data warehousing software infrastructure. Our system primarily targets geographically distributed archives of large scientific data sets, such as satellite image data, that are stored hierarchically on disk arrays and tape silos and accessed by a variety of scientific and decision support applications. We discuss the issues faced in building such an infrastructure, and the key areas that are the subject of current research, such as efficient bulk data storage, processing and delivery.

Interoperability is a major issue for distributed data archives, and requires standards for server interfaces and metadata. There is currently considerable activity in developing such standards for different application areas. We provide an overview of some of this work, and of our experiences in implementing an active data archive of satellite images based on evolving interface standards for accessing and processing geospatial image data.



