Introduction:Capability - MdsWiki
Capabilities and Concepts

Data Management

Data is stored in a user-defined hierarchical structure: This allows logical associations within a complex data set to be reflected in the structure of the archive - thus the data derives some of its meaning from its context. The result is that the data can be more easily interpreted and has an extended useful life. A casual user can often find the data they are interested in by browsing through the archive. The hierarchical architecture also allows generic applications to be written that are keyed to structure and context. When additional data items are added, these applications can treat them exactly like other items that occur at the same place in the hierarchy. The hierarchical structure, along with support for a wide variety of data types, allows users to put all related data into the same structure.

Data on a particular MDSplus server is divided into distinctly named TREES, each of which contains a number (perhaps a very large number) of NODES arranged in a hierarchical structure. Each tree exists under the server's operating system as a collection of three files: one defines the structure, a second contains the data, and a third contains node CHARACTERISTICS, the metadata that describes the type and size of each data item along with the time of storage, ownership, etc. The trees themselves may have hierarchical relationships or may be independent. For each tree, many SHOTS typically exist. These correspond to particular runs of an experiment or of a code.
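The relationship between servers, trees, shots, and nodes can be pictured with a small in-memory model. This is only an illustrative sketch: the classes Server and Node, the tree name "demo", and the node names MAGNETICS and IP are all invented for the example and are not part of the MDSplus API or its on-disk format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Node:
    """One node in the hierarchy; children are keyed by name (hypothetical model)."""
    name: str
    data: Any = None
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, name: str, data: Any = None) -> "Node":
        child = Node(name, data)
        self.children[name] = child
        return child


class Server:
    """Holds named trees; each (tree name, shot number) pair has its own root node."""

    def __init__(self) -> None:
        self._trees: Dict[Tuple[str, int], Node] = {}

    def create(self, tree: str, shot: int) -> Node:
        root = Node("TOP")
        self._trees[(tree, shot)] = root
        return root

    def open(self, tree: str, shot: int) -> Node:
        return self._trees[(tree, shot)]

    def shots(self, tree: str) -> List[int]:
        return sorted(s for (t, s) in self._trees if t == tree)


# Two shots of the same tree share a structure but hold different data.
server = Server()
for shot in (1001, 1002):
    root = server.create("demo", shot)
    magnetics = root.add("MAGNETICS")
    magnetics.add("IP", data=[0.0, 0.5 * shot])
```

Keying the store on the (tree, shot) pair mirrors the idea that a shot is one run recorded in a tree of a given name.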

All data is available through the same interface: The MDSplus API (application programming interface) consists of a small number of simple calls. (These are described in the beginner's guide and in more detail in the user's manual.) A user issues commands to specify the server and tree, and then evaluates an EXPRESSION, which in most cases will simply be the name of a node in the data hierarchy. Nodes are specified by their full path in the hierarchy, by a path relative to a current "default" location, or by a TAG NAME - a hierarchy-independent alias for a particular data element. Node characteristics are obtained by evaluating expressions that specify the node name and the name of the desired characteristic.
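The three ways of naming a node (full path, path relative to a default location, and tag name) can be sketched with a toy resolver. The path syntax used here, with "/" separators and an "@" prefix for tags, is invented for the example and is not the MDSplus path syntax; the Resolver class and the node names are likewise hypothetical.

```python
from typing import Dict


class Resolver:
    """Resolves a node reference to a full path in a toy hierarchy."""

    def __init__(self, nodes: Dict[str, object], tags: Dict[str, str]) -> None:
        self.nodes = nodes      # full path -> stored data
        self.tags = tags        # tag name -> full path
        self.default = "/"      # current "default" location

    def set_default(self, path: str) -> None:
        self.default = path

    def resolve(self, ref: str) -> str:
        if ref.startswith("@"):     # tag name (hypothetical '@' prefix)
            return self.tags[ref[1:]]
        if ref.startswith("/"):     # full path, used as given
            return ref
        # Anything else is taken relative to the default location.
        return self.default.rstrip("/") + "/" + ref

    def get(self, ref: str) -> object:
        return self.nodes[self.resolve(ref)]


resolver = Resolver(
    nodes={"/magnetics/ip": [1.0, 2.0, 3.0]},
    tags={"IP": "/magnetics/ip"},
)
resolver.set_default("/magnetics")

by_full_path = resolver.get("/magnetics/ip")
by_relative = resolver.get("ip")
by_tag = resolver.get("@IP")
```

All three references reach the same data element; only the tag survives unchanged if the node is later moved within the hierarchy, provided the tag table is updated.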

A built-in expression evaluator extends the capabilities of MDSplus: In fact, all interfaces to MDSplus data are based on the evaluation of an expression. These expressions are written in a language called TDI (tree data interface) which supports a large number of functions and commands. The simplest expressions are just node names and the evaluation returns the data in that node. Simple mathematical and logical operations are supported, along with string manipulation, simple programming instructions and commands to analyze or create specific MDSplus constructs. External routines written in other programming languages can be invoked, providing almost limitless flexibility. This capability has been widely used to provide access to legacy data through the MDSplus API.
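To make the idea of expression-based access concrete, here is a minimal evaluator in which a bare name returns the data held under that name and simple arithmetic combines such values. It is a sketch only: it parses Python syntax with the standard ast module rather than TDI, and the function evaluate and the node names RAW and OFFSET are assumptions made for the example.

```python
import ast
import operator

# Arithmetic operators the toy evaluator accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr: str, nodes: dict):
    """Evaluate a small arithmetic expression whose names refer to entries in `nodes`."""

    def ev(n):
        if isinstance(n, ast.Expression):
            return ev(n.body)
        if isinstance(n, ast.Name):          # a bare node name returns its data
            return nodes[n.id]
        if isinstance(n, ast.Constant) and isinstance(n.value, (int, float)):
            return n.value
        if isinstance(n, ast.BinOp) and type(n.op) in _OPS:
            return _OPS[type(n.op)](ev(n.left), ev(n.right))
        raise ValueError("unsupported expression element: " + ast.dump(n))

    return ev(ast.parse(expr, mode="eval"))


nodes = {"RAW": 5.0, "OFFSET": 1.0}
simple = evaluate("RAW", nodes)
combined = evaluate("(RAW - OFFSET) * 2", nodes)
```

Anything outside the whitelisted node types is rejected, which keeps the sketch from executing arbitrary code.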

TDI expressions can be stored as the "data" in a node of an MDSplus tree. When these nodes are referenced, the expressions are evaluated, returning the results of the TDI commands. This feature can be used to give users transparent access to data that is processed as the commands are issued rather than calculated ahead of time and stored. For example, raw data can be corrected for offsets and calibrations (all of which are stored in the trees), providing users with calibrated data without storing essentially redundant information. Further, if the calibrations were to be corrected, the corrections would apply automatically whenever the data were accessed.
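The calibration example can be sketched as follows. A Python callable stands in for a stored TDI expression, and the Tree class and the node names RAW, OFFSET, GAIN, and CALIBRATED are invented for the illustration; this is not how MDSplus stores or evaluates expressions internally.

```python
from typing import Any, Dict


class Tree:
    """Toy tree whose nodes hold either plain data or an expression to evaluate on read."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Any] = {}

    def put(self, name: str, value: Any) -> None:
        self._nodes[name] = value

    def get(self, name: str) -> Any:
        value = self._nodes[name]
        # A callable plays the role of a stored expression: it is evaluated
        # at access time, so it always sees the current contents of the tree.
        return value(self) if callable(value) else value


tree = Tree()
tree.put("RAW", [10.0, 12.0, 14.0])
tree.put("OFFSET", 10.0)
tree.put("GAIN", 0.5)
tree.put(
    "CALIBRATED",
    lambda t: [(x - t.get("OFFSET")) * t.get("GAIN") for x in t.get("RAW")],
)

before = tree.get("CALIBRATED")   # computed from the current GAIN

tree.put("GAIN", 2.0)             # the calibration is corrected later
after = tree.get("CALIBRATED")    # the same reference now reflects the new GAIN
```

Only the raw data and the calibration constants are stored; the calibrated values are never written, so they cannot go stale.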

The MDSplus data structure is self-descriptive: In addition to the data itself, a substantial amount of information is available for every node in each tree. This metadata can include: the data type, array dimensions, data length, units, independent axes, the place of the node in the overall hierarchy, tag names, the date when the data was stored, the name of the user who wrote the data, and so forth. The data structure can be TRAVERSED independently of reading any particular data item. The hierarchy provides further self-documentation through the structural relations and naming choices which have been defined explicitly by users. For example, every data node in a particular branch of the tree may have child nodes with comments, labels, etc.
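Traversing the structure without touching the data can be illustrated with a toy hierarchy in which each node carries its metadata directly and reads its payload only through a separate loader. The Node class, the walk function, and the metadata keys are assumptions made for this sketch, not MDSplus names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Optional, Tuple


@dataclass
class Node:
    name: str
    meta: Dict[str, object] = field(default_factory=dict)
    loader: Optional[Callable[[], object]] = None   # reads the payload on demand
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, prefix: str = "") -> Iterator[Tuple[str, Dict[str, object]]]:
    """Yield (full path, metadata) for every node, depth first, without loading data."""
    path = prefix + "/" + node.name
    yield path, node.meta
    for child in node.children:
        yield from walk(child, path)


reads: List[str] = []   # records every payload read, to show that none happen


def load_ip() -> List[float]:
    reads.append("ip")
    return [1.0, 2.0]


root = Node("top", children=[
    Node("magnetics", children=[
        Node("ip",
             meta={"dtype": "float32", "shape": (2,), "units": "A"},
             loader=load_ip),
    ]),
])

listing = [(path, meta.get("units")) for path, meta in walk(root)]
```

After the traversal the reads list is still empty: the browser has seen every path and its units, yet no data item was loaded.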

MDSplus data is available remotely via the client/server model: A user with appropriate permissions can access data from an MDSplus server anywhere on the internet. Access is "service" rather than file oriented. Data on remote servers is read directly from within applications; no file transfers are used. (The underlying technology is the TCP/IP sockets API.) Since MDSplus trees are dynamic rather than static data structures, this approach obviates problems of data consistency that arise when data is shared through file replication. For a user, there is no difference between the commands used to access data on a remote vs a local server aside from the initial specification of that server.

MDSplus supports a rich set of primitive and composite data types: The simple data types include signed and unsigned integers (1, 2, 4, 8, and 16 bytes), single and double precision real numbers, single and double precision complex numbers, and character data. Arrays of these data types with up to 7 dimensions can be represented. The most important composite types include: SIGNALS, which contain data plus associated independent axes (e.g. temperature vs space and time); and DEVICES, which are used to associate setup parameters, actions (task descriptions), and data for data acquisition or automated analysis. MDSplus devices provide a well-structured mechanism for implementing data-driven applications.
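The idea behind a signal, data bundled with its independent axis, can be sketched with a small class. The example is deliberately restricted to one axis, whereas the text describes signals with several; the Signal class, its field names, and the at method are hypothetical and do not reproduce the MDSplus signal type.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Signal:
    """Values together with the independent axis they were sampled on (1-D sketch)."""
    values: Sequence[float]
    axis: Sequence[float]
    units: str = ""
    axis_units: str = ""

    def __post_init__(self) -> None:
        # Data and axis only make sense together if they have the same length.
        if len(self.values) != len(self.axis):
            raise ValueError("values and axis must have the same length")

    def at(self, x: float) -> float:
        """Return the value at the axis point nearest to x."""
        i = min(range(len(self.axis)), key=lambda k: abs(self.axis[k] - x))
        return self.values[i]


temperature = Signal(
    values=[290.0, 310.0, 305.0],
    axis=[0.0, 0.1, 0.2],
    units="K",
    axis_units="s",
)
nearest = temperature.at(0.12)
```

Because the axis travels with the values, a consumer can look up or plot the data without needing a separate, implicitly matched time base.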

Data Acquisition and Automated Analysis

MDSplus provides a set of tools for performing data acquisition and analysis for pulsed experiments: The entire data acquisition and analysis process is driven by an experimental MODEL. (The MDS acronym stands for Model Driven System.) The models, which contain all of the structure and setup data for an experiment, serve as templates for SHOTS which contain the setup information plus all the data stored and processed after the experimental pulse. The tasks which make up the data acquisition cycle are managed by a DISPATCHER which collects ACTIONS and oversees their execution by SERVER processes. In the tree, each action defines what is to be done, when, and by which server. Actions can be dispatched sequentially, asynchronously, or conditionally and can issue a named EVENT when they complete. A typical data acquisition cycle might involve:

  • Initialize
    • copy model file to shot file
    • find all the action descriptions and order them into a dispatch table
    • dispatch all initialization actions
  • Pulse
    • fire high-speed timing system which runs experiment and data acquisition hardware
  • Store
    • dispatch all store actions
    • dispatch all analysis actions
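The cycle outlined above can be sketched as a small, purely sequential dispatcher. This is an illustration under simplifying assumptions: asynchronous and conditional dispatch and events are left out, and the names Action, build_dispatch_table, run_cycle, the phase labels, and the server names are all invented for the example rather than taken from MDSplus.

```python
import copy
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

PHASES = ("INIT", "STORE", "ANALYSIS")


@dataclass
class Action:
    name: str
    phase: str                      # when the action runs
    sequence: int                   # order within its phase
    server: str                     # which server process would execute it
    task: Callable[[dict], None]    # what is to be done


def build_dispatch_table(actions: List[Action]) -> Dict[str, List[Action]]:
    """Group the actions by phase and order each group by sequence number."""
    table: Dict[str, List[Action]] = {phase: [] for phase in PHASES}
    for action in actions:
        table[action.phase].append(action)
    for group in table.values():
        group.sort(key=lambda a: a.sequence)
    return table


def run_cycle(model: dict, actions: List[Action],
              fire_pulse: Callable[[dict], None]) -> Tuple[dict, List[str]]:
    shot = copy.deepcopy(model)             # copy model to shot
    table = build_dispatch_table(actions)   # order actions into a dispatch table
    log: List[str] = []

    def dispatch(phase: str) -> None:
        for action in table[phase]:
            action.task(shot)
            log.append(phase + ":" + action.name + "@" + action.server)

    dispatch("INIT")
    fire_pulse(shot)                        # stands in for the timing system
    log.append("PULSE")
    dispatch("STORE")
    dispatch("ANALYSIS")
    return shot, log


model = {"setup": {"gain": 2.0}}
actions = [
    Action("analyze", "ANALYSIS", 1, "analysis_server",
           lambda s: s.update(mean=sum(s["raw"]) / len(s["raw"]))),
    Action("store_digitizer", "STORE", 1, "daq_server",
           lambda s: s.update(raw=[v * s["setup"]["gain"] for v in s["hw"]])),
    Action("arm_digitizer", "INIT", 1, "daq_server",
           lambda s: s.update(armed=True)),
]
shot, log = run_cycle(model, actions,
                      fire_pulse=lambda s: s.update(hw=[1.0, 2.0, 3.0]))
```

The model is deep-copied before anything runs, so it remains a clean template for the next shot while the shot accumulates setup, raw, and analyzed data.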