Description

The Compass DataBase (CDB) is a lightweight system designed for storing experimental data from the COMPASS tokamak (IPP Prague). It can equally well be used for any other tokamak or, more generally, any device that repeatedly produces experimental data.

CDB uses HDF5 (or NetCDF 4) files to store numerical data and a relational database, currently MySQL, to store metadata. The core application is implemented in Python (pyCDB). Cython is used to wrap the Python code in a C API, on top of which clients for Matlab, IDL, etc. can be built.

There are several major advantages of this scheme:

  • The vast majority of the required functionality is already implemented and available for numerous operating systems and applications (high/low level data input/output, database functionality).
  • Data can be stored on any file system (local or remote); no specific protocol is needed.
  • Rapid, platform independent development in Python.

CDB can store information about the data acquisition sources (channels) of the data. The database records the associations of DAQ channels with physical quantities.

An important feature of CDB is its never-overwrite design. Anything stored in CDB cannot be overwritten (at least not through the standard API); instead, corrections are made by creating revisions.
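The never-overwrite idea can be illustrated with a minimal sketch. It is not the pyCDB implementation: it uses an in-memory SQLite database instead of MySQL, and the table layout, the revision column and the helper names store_signal and latest_revision are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical, heavily simplified table; the real CDB schema differs.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE data_signals (
           record_number INTEGER NOT NULL,
           generic_signal_id INTEGER NOT NULL,
           revision INTEGER NOT NULL,
           note TEXT,
           PRIMARY KEY (record_number, generic_signal_id, revision)
       )"""
)


def store_signal(conn, record_number, generic_signal_id, note=""):
    """Append a new revision; existing rows are never updated or deleted."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) FROM data_signals "
        "WHERE record_number = ? AND generic_signal_id = ?",
        (record_number, generic_signal_id),
    ).fetchone()
    revision = latest + 1
    conn.execute(
        "INSERT INTO data_signals VALUES (?, ?, ?, ?)",
        (record_number, generic_signal_id, revision, note),
    )
    return revision


def latest_revision(conn, record_number, generic_signal_id):
    """Return (revision, note) of the newest revision, or None if absent."""
    return conn.execute(
        "SELECT revision, note FROM data_signals "
        "WHERE record_number = ? AND generic_signal_id = ? "
        "ORDER BY revision DESC LIMIT 1",
        (record_number, generic_signal_id),
    ).fetchone()


# A correction adds a second row instead of modifying the first one.
first = store_signal(conn, 4073, 12, "initial upload")
second = store_signal(conn, 4073, 12, "corrected calibration")
```

Readers normally want the newest revision, while older revisions remain available for auditing because nothing is ever updated in place.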

The data model

A relational database is used to store metadata of the numerical data. The metadata include the names and units of physical quantities and information about their axes (axes are themselves physical quantities and are the same kind of entity as any other quantity).
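As a rough illustration of axes being ordinary quantities, the sketch below keeps all quantities in one registry and lets a quantity refer to its axis by id. The class GenericSignal, the field name axis1_id and the example values are hypothetical and are not taken from the CDB schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GenericSignal:
    generic_signal_id: int
    name: str
    units: str
    # Id of another generic signal used as the axis (None means no axis).
    axis1_id: Optional[int] = None


def describe(signals: Dict[int, GenericSignal], signal_id: int) -> str:
    """Format a quantity together with its axis, if it has one."""
    sig = signals[signal_id]
    text = f"{sig.name} [{sig.units}]"
    if sig.axis1_id is not None:
        # The axis is looked up in the same registry as any other quantity.
        axis = signals[sig.axis1_id]
        text += f" vs {axis.name} [{axis.units}]"
    return text


# Illustrative entries only.
signals = {
    1: GenericSignal(1, "time", "s"),
    2: GenericSignal(2, "I_plasma", "A", axis1_id=1),
}
```

Because an axis is just another entry in the registry, it carries its own name and units and can be shared by many quantities.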

generic_signals
Describes the physical quantities stored in the database. In particular, it contains names, units, axis ids (axes are treated like any other signals), description, signal type (FILE or LINEAR), the validity range of record numbers, and the data source.
data_sources
Used as the primary grouping criterion. Each data source has a name, a description and the name of the directory holding its data files.
data_files
List of all data files in the system. Each file can contain one or more signals. Data files are stored under the main CDB directory, in the subdirectories specified in the data source. files_status states whether a file is ready for reading.
data_signals

Data signals are instances of generic signals. A data signal either points to data in a data file or contains only the coefficients of a linear function (LINEAR signal). Each data signal carries the record number to which its data belong. Revisions can be created when a signal needs to be corrected.

Each data signal contains an offset and a coefficient, used either to construct a linear signal or to apply a linear transformation to data stored in a file. See Linear signals with get_signal_data for details. time0 specifies the time of the first data point for time-dependent signals.

shot_database
Contains record numbers, unique numbers each identifying a data set. A record is usually a tokamak shot, but it can also be a simulation, a DAQ system test, etc. Tokamak shots also have shot_numbers. Data files for a particular shot are stored in record_directories.
FireSignal_Event_IDs
CDB can be used as a storage system for FireSignal. In this case, this table relates FireSignal ids to CDB record numbers.
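The two uses of the offset and coefficient stored with each data signal can be sketched as follows. The formula offset + coefficient * x, the use of the sample index as x for LINEAR signals, and the function names are assumptions made for illustration; the authoritative behaviour is the one documented under Linear signals with get_signal_data.

```python
from typing import List, Sequence


def transform_file_data(
    raw: Sequence[float], offset: float, coefficient: float
) -> List[float]:
    """Apply a linear transformation to raw values read from a data file."""
    return [offset + coefficient * value for value in raw]


def linear_signal(n_points: int, offset: float, coefficient: float) -> List[float]:
    """Build a LINEAR signal from its two coefficients alone (no data file).

    Assumption: the independent variable is the sample index 0, 1, 2, ...
    """
    return [offset + coefficient * index for index in range(n_points)]


# FILE signal: raw values (for example ADC counts) scaled to physical units.
calibrated = transform_file_data([0, 2, 4], offset=-1.0, coefficient=0.5)

# LINEAR signal: for example an evenly sampled axis that needs no stored data.
axis = linear_signal(4, offset=0.0, coefficient=0.25)
```

Storing only two numbers for a regular axis avoids writing a large, fully predictable array to disk.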

Data acquisition management

CDB can track information about DAQ A/D channels. Each DAQ channel can be associated with ("attached" to) the physical quantity (generic_signal) it outputs.

DAQ_channels

List of all available DAQ channels. A channel is uniquely identified by computer_id, board_id and channel_id. When CDB is used with FireSignal, nodeuniquid, hardwareuniqueid and parameteruniqueid relate the two databases.

Each DAQ channel has a default_generic_signal_id, the id of a generic signal, unique to the channel, that is used when no other generic signal is attached to the channel. This is needed because, in CDB, every data_signal must have a generic_signal_id.

channel_setup
This table records which generic_signal_id a DAQ channel is associated with (attached to). It is an event-style table containing association events (date, time, user id and a note for each association of a DAQ channel with a generic signal).
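A minimal sketch of how such an event-style table could be resolved is given below. It assumes that the most recent association event at or before a given time determines the attachment, and that the channel's default generic signal applies when no such event exists. The class AttachEvent, the function attached_signal and all example values are hypothetical and not part of the pyCDB API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple

# (computer_id, board_id, channel_id) identifies a DAQ channel.
ChannelKey = Tuple[int, int, int]


@dataclass(frozen=True)
class AttachEvent:
    channel: ChannelKey
    generic_signal_id: int
    timestamp: datetime
    uid: str
    note: str = ""


def attached_signal(
    events: List[AttachEvent],
    defaults: Dict[ChannelKey, int],
    channel: ChannelKey,
    when: datetime,
) -> int:
    """Return the generic signal id attached to `channel` at time `when`."""
    past = [e for e in events if e.channel == channel and e.timestamp <= when]
    if not past:
        # No association event yet: fall back to the channel's default signal.
        return defaults[channel]
    # Assumption: the latest event at or before `when` wins.
    return max(past, key=lambda e: e.timestamp).generic_signal_id


# Illustrative data only.
channel = (1, 2, 3)
defaults = {channel: 900}
events = [
    AttachEvent(channel, 12, datetime(2012, 3, 1, 9, 0), "operator", "first attachment"),
    AttachEvent(channel, 15, datetime(2012, 6, 1, 9, 0), "operator", "channel rewired"),
]
```

Keeping the events rather than a single current value preserves the attachment history, so the quantity measured by a channel at the time of an older record can still be determined.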

Database structure

Figure: CDB database structure (_images/CDBdb.png)