Basic Control Environment API


This document provides a documentation-style specification for the Dax control environment from the perspective of the Dax end user; that is, it covers the public interface to the control environment. Private members and other implementation-specific features do not belong in this document.

Basic Objects

All objects in Dax, with perhaps a few minor exceptions, should have a consistent creation, reference, and delete mechanism. For example, VTK has a mechanism where all concrete objects have a static New method for creation, are always referenced by pointer, and are managed and deleted by reference counting.

All Dax control objects should directly or indirectly inherit from dax::cont::BaseObject. This object will serve as a location to implement some of the conventions that all Dax control objects should follow.

Originally (back with the first OpenCL implementation) we specified that all Dax objects should be held by reference in smart pointer objects that are typedef’ed to the object type. Lately, this convention has not been consistently followed, partly because of technical difficulties. Thus, I present multiple options here, one of which will eventually win out. --Kenneth Moreland 12:24, 13 December 2011 (EST)

Option 1:

All Dax objects are stored and passed by value. That is, they generally are not referenced via pointers or smart pointers. Any reference and value sharing is handled internally with pointer ivars. Shallow copies must be handled by internal reference counting within some Dax object.

Pros: Simplest syntax for the user. Not error prone (from the user’s perspective) – there are no pointer mistakes to make. Matches the convention for the execution environment. Creation can provide (or require) arguments.

Cons: Harder to implement. How do you handle circular references? Semantics for stored arrays are unclear: if you copy an array class and modify values in the copy, are the values in the original also modified? Virtual methods cannot be used polymorphically without references.

Option 2:

All Dax objects are stored and passed by reference. All references are maintained with smart pointers. Every object implementation creates a typedef for an appropriate smart pointer to itself that will be easy to type.

Pros: Copy semantics are simple, explicit, consistent, and widely applicable.

Cons: The implementation gets screwy when templates get involved: you cannot typedef the smart pointer, and safe dynamic casting can get weird. Smart pointers are a pain to trace in a debugger.

Basic Arrays

Arrays are the basic storage container used to build up more complicated structures. The basic array container is dax::cont::Array, and it manages an array much like std::vector. (In fact, it will probably be a thin veneer over std::vector.) It is a templated class, where the template type dictates what type of data the array holds. The interface should look something like this. (Note that this code, like all other class declarations in this document, contains no implementation details. It contains only interface method declarations, and probably a partial set at that.)

namespace dax { namespace cont {

template<class T>
class Array : public dax::cont::BaseObject
{
public:
  typedef T ValueType;

  dax::Id GetSize() const;

  const ValueType &GetValue(dax::Id index) const;
  void SetValue(dax::Id index, const ValueType &value);

  // Iterators?
  // Operators?
  // Resize?
};

} }

Data Sets

A data set object is a collection of arrays that have semantic meaning applied to them to define the topology and fields. The basic structure of the data sets follows closely that in VTK. A DataSet superclass contains all the basic interface methods to get the number of cells and points and to manipulate the point and cell field arrays.

namespace dax { namespace cont {

class DataSet : public dax::cont::BaseObject
{
public:
  virtual dax::Id GetNumberOfPoints() const = 0;
  virtual dax::Id GetNumberOfCells() const = 0;

  dax::cont::FieldData &GetPointData();  // const version too
  dax::cont::FieldData &GetCellData();   // const version too
};

} }

The field data is basically just a collection of named arrays. The arrays are assumed to be the same size as the number of points or cells (depending on whether it is point or cell data).

namespace dax { namespace cont {

class FieldData : public dax::cont::BaseObject
{
public:
  dax::cont::Array<dax::Scalar> &GetFieldScalar(const std::string &name); // const version too
  const std::map<std::string, dax::cont::Array<dax::Scalar> > &GetAllFieldScalar() const;
  void AddFieldScalar(const std::string &name, dax::cont::Array<dax::Scalar> &array);
  void RemoveFieldScalar(const std::string &name);

  // Repeat for Vector3, Vector4, Id, and any other possible field types.
};

} }

The most obvious difference between the Dax field data and the VTK field data is that the Dax FieldData object maintains a separate associative array for each value type (scalar, vector, etc.) whereas VTK lumps together all tuple sizes and basic types. The reason for separating the field types is to make it easier to manage the templating of the worklets.

It is not really clear that this separation solves all of our problems. If it doesn’t, we will probably have to revisit this interface. --Kenneth Moreland 12:24, 13 December 2011 (EST)

Another more minor difference is that Dax does not maintain named attributes (like scalars, vectors, normals, tensors, texture coordinates, etc.). This named attribute mechanism is fairly antiquated anyway and usually just gets in the way of things.

Another implied minor difference is that instead of maintaining the field names in the Array class, they are maintained in an associative array (std::map) in the FieldData object. Maybe it’s just me, but I’ve never really bought into the idea that the array holds the name and the field data object has to search through them to find it. Also, by maintaining the names in the FieldData object, it can easily return an std::map of all the arrays from which you can support queries and related operations on the user side. --Kenneth Moreland 12:24, 13 December 2011 (EST)

Subclasses to DataSet define sets for specific topologies. A UniformGrid defines extent, origin, and spacing information. A CurvilinearGrid defines extent and point coordinates. An UnstructuredGrid defines point coordinates and connectivity arrays. There is probably no point in having a separate polygon data class (which would just be an unstructured grid with no 3D polyhedra).

Does an unstructured grid have separate arrays for each cell type? If it doesn’t, how do you invoke the correct template instance of a worklet? If it does, how do you reorganize the output of worklets that generate cells of different types? Maybe a compromise is to have a group of unstructured grid classes, each holding a single cell type (for example, only tetrahedra or hexahedra or triangles) and then have one (or a small set) of unstructured grid class that handles a general and mixed cell type. --Kenneth Moreland 12:24, 13 December 2011 (EST)

Composite data sets can be built of hierarchies of these basic data types.

Pipeline Modules

There are three types of pipeline modules. Sources feed the pipeline with some given data set. Filters process data by running worklet algorithms on them. Sinks do something with the results.


A Source object simply holds a DataSet and provides the mechanisms (i.e. output ports) to connect it to other module objects. There are no other exposed methods.

namespace dax { namespace cont {

class Source : public dax::cont::BaseObject
{
public:
  const dax::cont::DataSet &GetDataSet() const;
  void SetDataSet(const dax::cont::DataSet &data);
};

} }


The Dax build should create a separate filter object for every worklet defined. The filter object created is probably a typedef of a more general Filter class. It could look something like this:

typedef dax::cont::Filter<dax::cont::internal::ModuleFieldMap<worklets::Elevation> > FilterElevation;

Details of how filters are connected and how they otherwise interface with the worklet parameters are given later.


A Sink object is connected to the bottom of a pipeline (it has an input but no output). The Sink object does something with the results of the execution. There can be different types of Sink objects to perform different operations. A common Sink object would be to deliver the data back to the control environment.

namespace dax { namespace cont {

class SinkCollect : public dax::cont::BaseObject
{
public:
  void Update();
  dax::cont::DataSet GetResult();
};

} }

Once again, perhaps this should be templated. Also, this interface implies that the pipeline always has pull semantics. Is that what we want? --Kenneth Moreland 12:24, 13 December 2011 (EST)

There could be other types of Sinks as well. Perhaps a render sink would be useful, but probably unnecessary in the short term.

Module Connections

Sources, filters, and sinks are connected together with the dax::cont::Connect function. The first argument is the upstream module; the second is the downstream module. An optional third argument specifies the named input port for filters that take multiple inputs. (This feature can wait, as we do not yet have any worklet types that can support such a thing.)

dax::cont::Source source;
FilterSlice slice;	// Hmm.  In what namespace should filter objects be defined?
FilterStreamlines streamlines;
dax::cont::SinkCollect sink;
dax::cont::Connect(source, slice);
dax::cont::Connect(source, streamlines, "mesh");
dax::cont::Connect(slice, streamlines, "seeds");
dax::cont::Connect(streamlines, sink);

On second thought, I’m wondering if this is the best way to make connections. The issue I am thinking of is: what if you want to get information about the data object that is passed from one module to another? For example, what is its type and what fields are available? That might be easier if there were a connection object. --Kenneth Moreland 12:24, 13 December 2011 (EST)

Worklet Interfaces

Worklet function parameters must somehow be translated to the control environment’s filters in order to hook them up. There are three types of input parameters: work objects, fields, and constant arguments (such as the isovalue of a contour).

The first parameter of a worklet is always a work object. The work object defines what type of work is being done and implicitly sets the type of filter object (and/or associated classes). Input fields are identified by their argument names. The filter object contains a mechanism to associate a named input array with each argument name. Constants are handled similarly, except that the corresponding filter methods take an actual value instead of a field name.

namespace dax { namespace cont {

class Filter : public dax::cont::BaseObject
{
public:
  std::vector<std::string> GetArgumentNamesField() const;
  void SetArgumentField(const std::string &arg_name, const std::string &field_name); // Should this have a type associated with it?

  std::vector<std::string> GetArgumentNamesScalar() const;
  void SetArgumentScalar(const std::string &arg_name, dax::Scalar value);

  // Duplicate for other valid types (Vector3, Vector4, Id)
};

} }

What about argument lists? How do you specify multiple isovalues? Should you specify multiple isovalues? Maybe a better mechanism is to call the worklet multiple times. But then you have to revisit the same memory multiple times. --Kenneth Moreland 12:24, 13 December 2011 (EST)

Putting it all together, here is what the code might look like to connect a source to a contour filter, assuming the contour filter is defined such that it takes a field named "scalars" and an argument named "isovalue", and you want a contour of the field "Temp" at the value 400.

dax::cont::Source source;
FilterContour contour;
dax::cont::Connect(source, contour);
contour.SetArgumentField("scalars", "Temp");
contour.SetArgumentScalar("isovalue", 400);


Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.


SAND 2011-9332P