Last modified 6/23/1999

Data and Information Access Link (DIAL)

A Web Based Data Distribution System for Scientific Data

Ken McDonald, NASA GSFC
Ramachandran Suresh, Liping Di, Lakshmi Kumar, Douglas Ilg, Raytheon ITSS


 
 


Abstract

This paper describes DIAL, a web-based system consisting of a package of tools, specifications, and documentation that allows a data provider to create a data server for cataloging, documenting, and distributing scientific data and to provide data display and analysis tools. The DIAL software is freely available. The system was developed as part of the National Aeronautics and Space Administration’s (NASA) EOSDIS technology prototype efforts, in collaboration with the National Center for Supercomputing Applications (NCSA). It was developed primarily for the Earth science community but could be useful to other science and education communities. Because DIAL is a modular, extensible system, tools developed by others can easily be integrated with it.


 


Introduction and Background

WWW technologies have greatly increased the on-line accessibility of science data. Thousands of Web pages have been created worldwide by government agencies, universities, and private industry. However, no single package of tools is available for creating a scientific data server, building an inventory of data holdings, and displaying and analyzing the data.

The Earth Observing System (EOS) is a very large, ambitious project funded by NASA as part of the Mission to Planet Earth. The EOS Data and Information System (EOSDIS) is the portion of the project charged with handling the vast amounts of data gathered by EOS. Possibly the most visible function of EOSDIS is the archiving and distribution of these enormous amounts of data. The EOS-AM satellite, planned for launch in the middle of 1999, will generate nearly one terabyte of data per day from its five onboard scientific instruments. The current plan calls for the processing and distribution of the EOS-AM data to be carried out by eight Distributed Active Archive Centers (DAACs). Each of these DAACs will contain a large, complex data system designed to handle a large volume of data search and ordering transactions.

EOSDIS will produce and distribute over 200 data products. Historically, science data have been archived in many different native and standard formats (e.g., HDF, CEOS Superstructure, FITS, netCDF, CDF, BUFR, and GRIB). The WWW community has benefited from adopting just a few standards (e.g., HTML, HTTP, and GIF). Similarly, adopting a few standard formats will greatly facilitate access to and exchange of science data in EOS. Therefore, NASA’s EOSDIS project adopted HDF-EOS as the standard format for the production and distribution of science data from the EOS project. HDF-EOS, an extension of NCSA's Hierarchical Data Format (HDF), adds three data models, namely Swath, Grid, and Point, to HDF. The HDF-EOS software library is available on UNIX and Windows platforms. More information about HDF-EOS and sample HDF-EOS data sets can be found at http://hdf-eos.gsfc.nasa.gov.

Recent recommendations by the National Research Council (NRC) suggested that an EOSDIS built from many small interoperable data systems working loosely together, rather than a few tightly coupled large data systems, may be desirable. NASA is experimenting with NRC's recommendations by implementing the Earth Science Information Partnership (ESIP) program to build a federation of Earth science information providers.

All of the implicit and explicit requirements above call for a simple, compact, scalable, flexible, interoperable, and standards-compliant data system for distributing Earth science data through the Web. In response to these requirements, we have developed a system called Data and Information Access Link (DIAL). This paper describes the architecture and functionality of DIAL.


 

Objectives and Functionality

The main objective of DIAL is to provide an integrated package of software tools to data providers for distributing data through the Web. With DIAL, a data provider can

set up a low-end workstation (Windows 95/NT or UNIX) as a Web server,

populate it with data and metadata,

establish Web pages to provide search and selection of data,

provide client tools to be used with Web browsing software to further examine or manipulate the data.

By connecting to a DIAL site through their web browsers, data users can perform the following operations:

spatial, temporal and parameter based search

view data and metadata

browsing, subsetting and subsampling of data

on-line downloading of data in multiple formats

x-y plotting of tabular data

DIAL is flexible, scalable, and interoperable. Data providers ranging from individual researchers to large data centers can use it. Once a DIAL site is set up, any user with a Web browser can access data and need not know the technical details of the server-side software.

DIAL has potential applications wherever a federation of data systems needs to interoperate. DIAL is a compact suite of software, specifications, and documentation, developed and assembled primarily from off-the-shelf public domain software and easily customizable by the site administrator. It is a Web-based client-server system that takes full advantage of Web technologies.

DIAL supports both HDF and HDF-EOS formats. An earlier version of DIAL, called Scientific Data Browser (SDB) and developed by NCSA, works with FITS and netCDF data. DIAL can also easily be extended to work with other data formats.


Applications of DIAL

The current version of DIAL is 2.0, which provides a very low cost Web-based client/server package to individuals and groups desiring to provide access to collections of science data.

DIAL can be used as:

  • A data distribution system
  • A data search and access system
  • An extended catalog system

Although the current DIAL implementation is tailored to work with Earth science data, its architecture makes it easy to extend DIAL to other science data. Users of DIAL might include principal investigators of scientific research, field-campaign data collectors, developers of special or experimental data products, and K-12 and university Earth science educators. With continuing development and implementation, the eventual potential of such a system is nearly as large as that of the Web itself.
 
 

Availability of DIAL

Currently, DIAL supports the following platforms: DEC Alpha, SGI, Sun Solaris, HP, and Windows 95 and NT. The DIAL package is less than 5 MB and is freely available from the web site at http://laits.gmu.edu/DownloadInterface.html. The only requirement for DIAL is that the data to be distributed must be in HDF or HDF-EOS.
 
 

The Architecture

The Web client-server based DIAL architecture is shown in Figure 1. DIAL consists of a number of client helper application tools as well as server utilities and CGI executables. The client-side browsers are capable of directly accessing and manipulating files on a DIAL server. Some components of the architecture (on both the client and server sides) are available as off-the-shelf public domain tools, while others will be developed specifically for DIAL.


 
 
 

Figure 1. DIAL architecture


 
 
 

One of our design goals was to have the more generic functions implemented on the server and the more application specific functions on the client side as helper applications. The server, then, is responsible for helping the user to reduce his/her network bandwidth requirements by providing easy identification of the desired data file and a first approximation of the exact data within that file. The helper applications can then aid the user in performing discipline specific analysis on the portion of the file actually downloaded to the local system. This philosophy will preserve the generality of the server, while reducing network bandwidth requirements.
 
 

Server Side Components

Currently, two server-side components, namely the DIAL Server System and Data Management, are available. The following sections discuss these components in some detail.
 
 

Data Management

The functions of data management include the ingestion of data into DIAL, the creation of inventory for data search, and the administration of data and the software.

Data Ingestion and Inventory Creation

If the data to be distributed are already in HDF or HDF-EOS, no conversion is needed. Otherwise, the data must first be converted into one of these DIAL-supported formats. The main function of the ingestion component is to prepare data files to be added to the server. Such preparation includes translating "foreign" data formats into HDF or HDF-EOS and adding a standardized metadata block (in PVL or ODL) to the files. Any tool that can output an HDF file can serve in this capacity in conjunction with a simple metadata extraction/encoding tool. Currently, DIAL provides data translators to convert the ARC/INFO exchange format into HDF-EOS. The GeoTIFF and Shape translators will be available in 1999. More data translators and some generic data description tools are planned. In addition, data providers can develop their own translators to convert their specific data into HDF or HDF-EOS. Both the HDF and HDF-EOS libraries are freely available for building customized data translators.

To add the standard metadata block to HDF or HDF-EOS files, DIAL provides a tool called "meta". DIAL uses these metadata to automatically create the searchable inventory tables. Data providers can also use DIAL without creating metadata, simply to view HDF and HDF-EOS files. In addition, metadata stored as global attributes within an HDF file can be viewed through DIAL. We are planning to update the metadata tools in the next release.

The inventory tables are created in DIAL by a tool called "crinv", which builds the inventory table from the metadata in the individual data files. A data provider can customize the inventory tables for their data files through the configuration file.
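As a rough illustration of what a standardized metadata block might hold, the sketch below parses a flat PVL-style block of `KEY = VALUE` lines into a dictionary. The keyword names and values are hypothetical and are not the ones written by the "meta" tool; real PVL and ODL also allow nested groups and objects, which this sketch ignores.

```python
def parse_flat_pvl(text):
    """Parse flat 'KEY = VALUE' lines into a dict; stop at a bare END line."""
    record = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("/*"):
            continue  # skip blank lines and comment lines
        if line.upper() == "END":
            break  # END terminates the block
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that carry no assignment
        # Values are kept as strings; surrounding double quotes are dropped.
        record[key.strip()] = value.strip().strip('"')
    return record


# Hypothetical metadata block (keyword names are assumptions).
SAMPLE_BLOCK = '''
/* hypothetical metadata block */
PARAMETER = "sea_surface_temperature"
START_DATE = "1998-07-01"
STOP_DATE = "1998-07-31"
WEST = -180.0
EAST = 180.0
SOUTH = -90.0
NORTH = 90.0
END
'''

metadata = parse_flat_pvl(SAMPLE_BLOCK)
print(metadata["PARAMETER"], metadata["START_DATE"], metadata["STOP_DATE"])
```

Fields of this kind (parameter name, time range, bounding box) are the sort of information a searchable inventory needs from each file.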

DIAL provides two options to store inventory tables:

For data providers with no access to commercial database packages, DIAL provides a simple flat-file database (as a Vdata object in the inventory HDF file) for storing the inventory tables.

For data providers with access to JDBC-capable databases, "crinv" will store the inventory table in their databases through the JDBC connection.
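The idea of collecting per-file metadata into one searchable table can be sketched as follows. This is not how "crinv" is implemented: the table layout, column names, and file names are hypothetical, and Python's built-in `sqlite3` module stands in for the JDBC-capable database mentioned above.

```python
import sqlite3

# Hypothetical per-file metadata, as might be read from each data file.
FILE_METADATA = [
    {"file_name": "sst_199807.hdf", "parameter": "sea_surface_temperature",
     "start_date": "1998-07-01", "stop_date": "1998-07-31",
     "west": -180.0, "east": 180.0, "south": -90.0, "north": 90.0},
    {"file_name": "ndvi_us_199806.hdf", "parameter": "ndvi",
     "start_date": "1998-06-01", "stop_date": "1998-06-30",
     "west": -125.0, "east": -66.0, "south": 24.0, "north": 50.0},
]

COLUMNS = ["file_name", "parameter", "start_date", "stop_date",
           "west", "east", "south", "north"]


def build_inventory(conn, records):
    """Create the inventory table and insert one row per data file."""
    conn.execute(
        "CREATE TABLE inventory ("
        "file_name TEXT PRIMARY KEY, parameter TEXT, "
        "start_date TEXT, stop_date TEXT, "
        "west REAL, east REAL, south REAL, north REAL)"
    )
    rows = [tuple(rec[col] for col in COLUMNS) for rec in records]
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory stand-in database
build_inventory(conn, FILE_METADATA)
row_count = conn.execute("SELECT COUNT(*) FROM inventory").fetchone()[0]
print(row_count)
```

Keeping one row per file, keyed by file name, means that adding or deleting a data file maps to a single insert or delete on the table.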

Administration Tools

Administration tools consist of inventory maintenance and data ingest support tools. These tools will aid the site administrator in populating the server and tailoring the interface to the specific needs of the target user community. The users' view of the server contents can be customized through the choice of inventory fields presented and the creation of indexes to support specific access paths. An advertisement function can also be added to the administration tools. For instance, in EOSDIS, we plan to extract information from the inventory and include it as part of a higher-level directory system located elsewhere on the Web. Utilities available or planned include:

sorting utilities

various data maintenance utilities

data availability advertisement tools

software maintenance utilities


These tools will help data providers keep track of their data holdings and facilitate user access. They can be used to create the inventory and to add, delete, and edit its records.

The Server System

The DIAL server system consists of two major components: dib_search and dib_view. Both run as CGI programs. dib_search works interactively with users to find the data they want by searching the inventory tables against their search criteria. Currently, DIAL supports combined spatial, temporal, and parameter-based search. dib_view can talk either to an ODBC-capable database through its JDBC interface or to the flat-file inventory table. Both HTML and Java-based user interfaces are provided.
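A combined spatial, temporal, and parameter-based search can be thought of as a filter over inventory records: a record matches when its parameter equals the requested one, its bounding box overlaps the search box, and its time range overlaps the search period. The sketch below illustrates that logic only; the record fields, file names, and function names are hypothetical, and it does not handle boxes that cross the 180° meridian.

```python
from datetime import date

# Hypothetical inventory records (field names are assumptions).
INVENTORY = [
    {"file": "sst_199807.hdf", "parameter": "sea_surface_temperature",
     "start": date(1998, 7, 1), "stop": date(1998, 7, 31),
     "west": -180.0, "east": 180.0, "south": -90.0, "north": 90.0},
    {"file": "ndvi_us_199806.hdf", "parameter": "ndvi",
     "start": date(1998, 6, 1), "stop": date(1998, 6, 30),
     "west": -125.0, "east": -66.0, "south": 24.0, "north": 50.0},
    {"file": "ndvi_eu_199807.hdf", "parameter": "ndvi",
     "start": date(1998, 7, 1), "stop": date(1998, 7, 31),
     "west": -10.0, "east": 40.0, "south": 35.0, "north": 70.0},
]


def boxes_overlap(rec, west, east, south, north):
    """True when the record's bounding box intersects the search box."""
    return (rec["west"] <= east and rec["east"] >= west
            and rec["south"] <= north and rec["north"] >= south)


def times_overlap(rec, start, stop):
    """True when the record's time range intersects the search period."""
    return rec["start"] <= stop and rec["stop"] >= start


def search(inventory, parameter, box, period):
    """Return file names matching all three criteria."""
    west, east, south, north = box
    start, stop = period
    return [rec["file"] for rec in inventory
            if rec["parameter"] == parameter
            and boxes_overlap(rec, west, east, south, north)
            and times_overlap(rec, start, stop)]


hits = search(INVENTORY, "ndvi",
              box=(-100.0, -80.0, 30.0, 45.0),
              period=(date(1998, 6, 15), date(1998, 7, 15)))
print(hits)
```

Here only the North American NDVI file satisfies all three criteria; the European file fails the spatial test and the sea surface temperature file fails the parameter test.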

Once the required data are found, users can work with the data through dib_view, which provides the following data manipulation and access functions:

Geographic, temporal, parameter, array coordinate, and record-based subsetting, subsampling, and downloading.

On-the-fly browse image generation and display.

Multi-variant X-Y plots.

On-the-fly reformatting for users to download data in HDF, HDF-EOS, plain binary, ASCII, HTML, and GIF formats.

Metadata display.

Currently, on the server side, dib_view works only with data in HDF or HDF-EOS formats. However, we plan to add more DIAL-supported formats in the near future.
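Array-coordinate subsetting and subsampling, followed by reformatting to ASCII, can be sketched on a small in-memory grid. This is an illustration of the operations only, not dib_view's code: the function names are hypothetical, and a nested Python list stands in for an array read from an HDF file.

```python
def subset_and_subsample(grid, row_range, col_range, stride=1):
    """Keep cells inside the half-open index ranges, taking every stride-th one."""
    r0, r1 = row_range
    c0, c1 = col_range
    return [row[c0:c1:stride] for row in grid[r0:r1:stride]]


def to_ascii(grid):
    """Render a 2-D grid as space-separated text, one row per line."""
    return "\n".join(" ".join(str(value) for value in row) for row in grid)


# A small 6 x 8 grid whose cell value encodes its position (row * 10 + col).
GRID = [[r * 10 + c for c in range(8)] for r in range(6)]

# Rows 1..4 and columns 2..7, keeping every second row and column.
subset = subset_and_subsample(GRID, row_range=(1, 5), col_range=(2, 8), stride=2)
print(to_ascii(subset))
```

Of the 48 cells in the full grid, only 6 are returned, which is the kind of reduction that keeps downloads small when a user needs only part of a file.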

Client Side Components

The client-side tools will be able to access an HTTP server containing HDF and HDF-EOS files. The client will consist of a Java GUI, a web browser, and a suite of helper applications to enhance and extend the capabilities of the server. The Java GUI is composed of Java applets displaying spatial, temporal, and attribute search panels. The spatial panel also includes a two-dimensional world map.

Helper applications to display data, to dump and extract data, and to analyze data can all be linked to this system via MIME types and proper browser configuration. Many such applications, such as the Java HDF Viewer (JHV) and hdp (an HDF dumper utility), are already freely available on the Web. Commercial tools and programs using IDL and other image processing packages can also be linked to the system.
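The MIME-type mechanism amounts to two lookups: the file is mapped to a content type, and the content type is mapped to a helper application. The sketch below shows this with Python's standard `mimetypes` module. The `application/x-hdf` type, the `.hdf` extension mapping, and the `jhv` command name are assumptions made for illustration, not settings taken from the DIAL documentation.

```python
import mimetypes

# Hypothetical association of a MIME type with a helper application command.
HELPERS = {"application/x-hdf": "jhv"}

# A private registry, so the example does not alter global MIME settings.
registry = mimetypes.MimeTypes()
registry.add_type("application/x-hdf", ".hdf")  # assumed type for HDF files


def helper_for(filename):
    """Return the helper command configured for a file, or None if there is none."""
    mime_type, _encoding = registry.guess_type(filename)
    return HELPERS.get(mime_type)


print(helper_for("sst_199807.hdf"))
print(helper_for("notes.txt"))
```

In a browser, the same two associations are set up through the server's content-type configuration and the browser's helper application settings.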

Browsers: Netscape and MS Internet Explorer

Helper Applications:

Java-based HDF Viewer (JHV)

LinkWinds

EOSView, which can display HDF files

hdp (HDF dumper), which provides limited subsetting capabilities

Many other useful helper applications supporting UNIX and PC can be added to this list

Configuration documentation for several browsers (Netscape Navigator and MS Internet Explorer)

Software tools to subset data

Conclusion

Although the focus of the current architecture is to provide access to HDF and HDF-EOS files, the server concept can be extended so that the system eventually provides access to data in many other formats.

Some DIAL test sites:

EOSDIS test site
ACE
NOAA PEML
JPL DAAC
NASDA
DERA



Related Publications

MTPE EOS Reference Handbook, 1995. EOS Project Science Office, NASA Goddard Space Flight Center.

Encyclopedia of Graphics File Formats, 1994. J. D. Murray and W. Van Ryper

Data Transport Within the Distributed Oceanographic Data System, 1995. James Gallagher and G. Milkowski, WWW Journal, Fourth International WWW Conference proceedings, Dec 11-14, 1995

HDF Users Guide, Version 4.0, 1996. National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign


Acknowledgement

The authors thank NASA for financial assistance through the technology prototype grant. Several people have participated in the development of the software and design. We would like to thank Ted Meyer, Mike Folk, Nancy Yeager, G. Ponnamparampillai, Jon Pals, Radhakrishna Garge, and Khoa Doan.
 

 
