Dax Design/Proposed Simplified Worklet Data Model and API


This document is a brain dump of thoughts about changes to the interface provided for Dax worklets. My main contention with the current API is that it is too complicated. The current data model requires odd syntax that can scare people away from adoption, and even the most simple and common topological connections are relatively complicated and error-prone to express. These are my own personal preferences and not all may really be justified, so push back on anything.

This document specifically addresses the execution environment API (for worklets run on a device). The rest of the discussion is implicitly talking about this API.

Naming Conventions

The identifiers Dax provides fall into three categories: functions, types, and modifiers. Each should be identifiable as part of the Dax API, and the three categories should be distinguishable from each other.

Functions

All function names should be prefixed with "dax" and use camel case to distinguish words. The current API already does this. In addition, I recommend adhering to the following rules.

  • Words in the name should always go from general groups to specific. For example, use "daxMatrixAdd" and "daxMatrixMultiply" instead of "daxAddMatrices" and "daxMultiplyMatrices." Although this sometimes leads to more awkward phrasing, it makes it much easier to find functions in alphabetized documentation and easier to predict how functions are named.
  • Functions that operate on an object of a particular type should identify the name of the object right after "dax." Thus, any function that operates on a work object will start with daxWork, any function that operates on a cell object will start with daxCell, etc. This rule is basically a refinement of the previous rule and makes sure to avoid name clashes.
  • Functions that get or set state in an object should be named "dax<object-type>Get<data-name>" or likewise for Set. For example, "daxCellGetNumberOfPoints" and "daxArraySetValue3." The rationale follows that of the previous rule.
  • Never use "Get" or "Set" in a function that does not access the state of an object. Don’t use "Get" when computing values. For example, use "daxCellDerivative" instead of "daxGetCellDerivative."
Seems pretty reasonable. I like it. -- Utkarsh 10:59, 11 May 2011 (EDT)

Types

Types should be named like functions with camel case identifiers. However, they should start with "Dax" instead of "dax" to differentiate them.

I think this is a good idea, but if you have reason to leave type names with the lowercase dax, I won't fight it. --Kmorel 15:19, 10 May 2011 (EDT)
I think it's a good idea too. No objections. Utkarsh 11:07, 11 May 2011 (EDT)

Like functions, the words in type names should go from general groups to specific. For example, "DaxCellPolygon" instead of "DaxPolygonCell".

I also suggest that we have a Dax version of all basic types. These could be types like "DaxFloat", "DaxInt", "DaxDouble", "DaxFloat3", etc. However, instead we might consider more generic types like "DaxId", "DaxScalar", "DaxVector3", "DaxVector4", etc. The advantage of this latter approach is that we could easily change computation types between, for example, float and double to adjust to different hardware architectures or to trade off precision for speed and memory. The disadvantage is that the basic types become less expressive for algorithms.

DaxScalar, DaxId etc. gets my vote too. -- Utkarsh 08:21, 12 May 2011 (EDT)
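For concreteness, here is a minimal sketch of what these generic typedefs might look like when targeting OpenCL; the concrete mappings are only an assumption and would be chosen per platform.

/* Sketch only: mapping the generic Dax types onto OpenCL built-in types.
   Switching DaxScalar to double (and DaxVector3 to double3) would retarget
   every worklet without changing its source. */
typedef int    DaxId;
typedef float  DaxScalar;
typedef float3 DaxVector3;
typedef float4 DaxVector4;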

All types for opaque objects (i.e. for work, cell, etc.) should encapsulate any pointer within them. There should be no reason to ever have a pointer to a DaxCell. Thus, objects should not need "*" in their declaration or need "&" when calling a function.

OpenCL inlines all functions, so pass-by-value/pass-by-reference doesn't matter. But if we build for a platform where not all functions are inlined, don't we want to be careful about passing by value versus by reference? -- Utkarsh 08:21, 12 May 2011 (EDT)
No, because you can embed the pointer in the typedef. For example, instead of typedef struct {...} DaxObject have typedef struct {...} *DaxObject. If you encapsulate the pointer like this, it gives us the freedom to use either one. There may be issues with not using the "&" though. If so, we can relax that one. --Kmorel 10:23, 12 May 2011 (EDT)
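To illustrate, here is a minimal sketch of the pointer-hiding typedef; the struct members are placeholders.

/* Sketch only: the handle type is itself a pointer, so worklet code declares
   "DaxCell cell;" and passes it by value, yet the implementation is free to
   treat it as a reference. */
typedef struct { DaxId cell_index; /* placeholder members */ } *DaxCell;

DaxId daxCellGetNumberOfPoints(DaxCell cell);  /* no "&" needed at the call site */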

Modifiers

Modifiers should be named like macros (since they actually will be macros and experienced programmers will recognize this familiar usage). They should be all caps with words separated by underscores. All modifiers should start with "DAX_". For example, the modifier for worklet functions should be "DAX_WORKLET" (as opposed to "__worklet__").

Work Objects

There is currently a single type of work object, daxWork, that encapsulates the location of the data to be accessed by the worklet instance. I propose that the function of the work object be broadened to also identify the type of worklet. Thus, there will be multiple types of work objects, one for each of the different types of worklet execution. Given the current capabilities, there should be three different work objects:

  • DaxWorkMapField: The worklet can read field scalars/vectors at a single point or cell (or face or edge) and create a new value for a field at the same place.
  • DaxWorkMapCell: The worklet runs on a cell, can query the structure of the cell, and can read field data for any part of the cell (such as point or cell). The worklet creates a new value for a cell field.
  • DaxWorkMapPoint: The worklet runs on a point, can read field values for that point, and can also read field values on adjacent cells. (I think it makes sense to NOT allow querying the cell structure, but maybe it makes no difference.) The worklet creates a new value for a point field.
I'm not enamored with these names, so feel free to propose new ones. --Kmorel 15:58, 10 May 2011 (EDT)

We will soon need to create new work types as we tackle new problems like contours.

The current implementation infers the worklet type from the parameters given and their dependencies. I think identifying the worklet type by the work object has several advantages. First, it is easier to catch when a user makes a mistake: if the arguments do not fit the work being done, the system can give a specific error rather than silently generating kernels that do not do what you want. Second, we do not need to define dependencies amongst parameters; the work object's type specifies what we mean. Finally, it assures that we can always determine what type of worklet is being created. I am not confident that we will always be able to identify worklets from the existing data model.

Work objects should also encapsulate any topology information that is applicable to it. Thus, it is not necessary to declare "in_connections" as one of the parameters to the worklet. Instead, cell information can be retrieved directly from the work object.

DaxCell cell = daxWorkGetCell(work);
I was thinking of extending the "connections" to include both the CellArray and the CellLinks, in VTK terminology. CellLinks are needed for cell-to-point, for example. If the worklet does not explicitly tell us what it needs, we will always have to upload all the information (generation of cell links may not be trivial) or parse the body of the worklet to determine what arrays it may use. Making all that's required explicit in the arguments makes it easier for the executive to determine what's needed. Utkarsh 08:24, 12 May 2011 (EDT)
I think that this can be encoded in the worklet type as well. The DaxWorkMapCell only allows you to look at the topology of the local cell, so you cannot see neighbors and do not need CellLinks. Likewise, a DaxWorkMapPoint allows you to look at neighbor cells, but not see their topology, so you need CellLinks but not the CellArray. You can imagine other derivative worklet types that allow both; perhaps a DaxWorkMapCellNeighbors that creates a cell field and allows you to look at neighbors. It should be easier to determine what is needed from the fixed set of abilities each worklet type provides than from arbitrary requests in the parameters. The only issue might be the combinatorial growth of worklet types. --Kmorel 10:39, 12 May 2011 (EDT)

The daxWorkGetCell function could only be called on work objects that make sense. You could not call it on a DaxWorkMapField, but you could on a DaxWorkMapCell.

Here we see an example of a polymorphic function. Can we actually do this in C somehow? That is, could we make a daxWorkGetCell that returns an error for work types that do not work but succeeds for those that do? Do we need different names for each cell type (daxWorkMapCellGetCell)? --Kmorel 16:19, 10 May 2011 (EDT)
Alas, we cannot do this with C. We will have to provide different names. Alternatively, we can have DaxWorkMapCell, DaxWorkMapField, etc. be typedefs to the same thing, but with an ivar identifying the type. Since a DaxWork* is never created directly (one always calls some function to create it, or it is passed in as an argument), the type can be set then. Utkarsh 08:24, 12 May 2011 (EDT)
Now that I think about it, this might be an issue in the future. In talking about the project at workshops, several people have expressed that there will be a need for the ability to create new communication patterns in Dax, and that might mean creating new worklet types. That would be even more problematic if you have to also create a whole new set of functions in the execution environment. We should brainstorm on this for a bit. --Kmorel 10:39, 12 May 2011 (EDT)
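For reference, here is a rough sketch of the tagged-typedef alternative described above; the enum and member names are only illustrative.

/* Sketch only: all work handles alias the same struct, and an ivar records
   which kind of work it represents. */
typedef enum { DAX_WORK_MAP_FIELD, DAX_WORK_MAP_CELL, DAX_WORK_MAP_POINT } DaxWorkType;
typedef struct { DaxWorkType type; DaxId item_index; /* placeholder members */ } DaxWorkInternals;
typedef DaxWorkInternals *DaxWorkMapField;
typedef DaxWorkInternals *DaxWorkMapCell;
typedef DaxWorkInternals *DaxWorkMapPoint;
/* A shared daxWorkGetCell could then check work->type and report an error
   when handed a DAX_WORK_MAP_FIELD work object. */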

Field Objects

First, let's get rid of the daxArray type. I hate that. We want to enforce that the worklet operates on a single unit of the data. We give it local data, not arrays of everything.

Field objects point to fields specified by the work. Field objects come in different types based on the topological unit (point, cell, edge, or face) on which they are defined, plus a generic type to be used with DaxWorkMapField when the topological unit is indeterminate. Since all our worklet types visit a single element at a time (and I can't think of a use case otherwise), the topological unit completely determines the dependencies. Optionally, we can also have a special coordinates type, which is really the same thing as a point type but provides a hint to the underlying system to use the point positions as the field by default.

Field objects also have to be tagged as either input or output. I can think of two ways to do this. The first is to make separate field object types for input and output. So your total list of field object types is DaxInField, DaxInFieldPoints, DaxInFieldCells, DaxOutField, DaxOutFieldPoints, and DaxOutFieldCells (plus others if coordinates, edges, and faces are to be included). The nice thing about this approach is that we can enforce constraints about not writing to inputs and not reading from outputs. The not so nice thing is that it is a bit inelegant and requires managing twice as many types. It also does not address potential future worklets that might operate on values instead of fields (such as a statistics aggregation).

Another way to specify input or output is to use the modifiers DAX_IN and DAX_OUT. These modifiers wouldn't adjust the code in any way (although perhaps DAX_IN would be a macro for const), so they would not do much to enforce correct reading and writing. However, they could easily be picked up by the parser.

Modifiers get my vote. DaxInField/DaxOutField sounds a bit too cumbersome. We hardly ever do that in C/C++. Plus, like you said, we can easily enforce correctness by expanding DAX_IN to const. Utkarsh 08:52, 12 May 2011 (EDT)
Fair enough. I was thinking we would want to prevent reading from out variables, but if there is no technical reason not to allow this there might be valid reasons to do so. I'll change the examples. --Kenneth Moreland 10:52, 12 May 2011 (EDT)
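Here is a minimal sketch of what the modifiers could expand to; the exact expansions (including DAX_WORKLET expanding to nothing) are only assumptions.

/* Sketch only: modifiers as macros.  They add little or no semantics of their
   own but are easy for the Dax source parser to find. */
#define DAX_WORKLET
#define DAX_IN  const
#define DAX_OUT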

There should be ample functions that easily interpolate any point field within a cell (where available). You should be able to do that without copying all the point values to a temporary array.

This, I totally agree with on a conceptual level. In fact, that's what I tried to implement in the first pass too. However, I soon ran into a major implementation obstacle. Since OpenCL pretty much inlines all functions (although I could never find any official statement that says anything to this effect), "recursion" is not allowed. And by recursion, they don't mean just logical recursion; even syntactic recursion is rejected. For example, the following code causes the OpenCL compiler to hang without any sensible error messages.

void Call0();
void Call1();

/* Call, Call0, and Call1 never actually recurse (the chain terminates at
   type 2), but the call graph contains a cycle, which is enough to hang the
   OpenCL compiler. */
void Call(int type)
{
  switch (type)
    {
  case 0:
    Call0();
    break;

  case 1:
    Call1();
    break;

  case 2:
    break;
    }
}

void Call0()
{
  /* ... */
  Call(1);
}

void Call1()
{
  /* ... */
  Call(2);
}
You need this kind of chaining to implement a function such as daxWorkGetScalarValue(), where the call causes the input worklet to execute to generate the value(s) at the requested location. daxCellInterpolate, for example, will need to call daxWorkGetScalarValue internally to get the values for each point scalar. If the above code worked, the DaxCell (or DaxInFieldPoint) could carry information about which worklet is "producing" it, and we could have used that nicely inside the daxCellInterpolate function.
Since that's not possible, in the current implementation I did the following:
  • Every occurrence of daxWorkGetScalarValue would be replaced with daxWorkGetScalarValue_<Worklet Name>. This was done for all daxWorkGet... functions. Then, I provided implementations of daxWorkGetScalarValue_<Worklet Name> that had a switch similar to the one above, but including only the calls to worklets that are "producers" for this worklet, thus avoiding recursion as long as there are no loops in the pipeline. This meant that no daxWorkGetScalarValue call could be done internally by the framework; all of those have to happen in the worklet code.
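To make the workaround concrete, here is a rough sketch of what such a generated function might look like for a hypothetical worklet named Gradient whose input scalar is produced by a worklet named Elevation. All of the type names, constants, and helper functions here are illustrative only, not the actual generated code.

/* Illustrative sketch of generated code.  Only the producers of Gradient's
   inputs appear in the switch, so the inlined call chain is acyclic even
   though the source superficially looks recursive. */
DaxScalar daxWorkGetScalarValue_Gradient(DaxWork work, DaxField field, DaxId component)
{
  switch (daxFieldGetProducer(field))    /* hypothetical helper */
    {
  case DAX_PRODUCER_ELEVATION:           /* illustrative constant */
    return daxWorkGetScalarValue_Elevation(work, field, component);
  default:
    return daxFieldReadValue(work, field, component);  /* hypothetical: value already in memory */
    }
}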

Ick. This is another good topic for a brainstorm. Perhaps there is a way around the problem. What if the evaluation were less lazy? That is, rather than calling the previous worklet when a particular value is requested, first compute all values of the upstream worklet. Since we have broken worklets into types with particular access patterns, we have a pretty good idea what will and won't be used. For example, if you use a DaxWorkMapCell worklet, you're pretty much assured that all the point scalars will be used. It would be an odd algorithm that arbitrarily picked scalars to use. Thus, you might as well compute all scalar values before the worklet is called. In addition, computing scalars in advance could help with the issues we've been having with duplicate execution. --Kenneth Moreland 11:00, 12 May 2011 (EDT)
I just verified that such a call sequence is valid in CUDA and compiles without any issues. Utkarsh 14:17, 12 May 2011 (EDT)
CUDA also has function overloading. CUDA also supports classes. It looks like the upcoming version 4.0 even supports virtual functions. Perhaps we should seriously consider switching to CUDA for now and wait for OpenCL to catch up. --Kenneth Moreland 14:58, 12 May 2011 (EDT)
I couldn't agree more. I was experimenting with CUDA yesterday. I am going to play a bit with CUDA in the next week so we can make an informed decision at the "summit". Utkarsh 07:51, 13 May 2011 (EDT)

Examples

Here are some examples of what I would like worklet implementations to look like.

Calculator

A simple worklet that provides some calculator-like function would look like this.

DAX_WORKLET void CalculatorWorklet(
  DAX_IN DaxWorkMapField work,
  DAX_IN DaxField in_scalar,
  DAX_OUT DaxField out_scalar)
  {
  DaxScalar in_value = daxWorkGetScalarValue(work, in_scalar, 0);
  DaxScalar out_value = ...;
  daxWorkSetScalarValue(work, out_scalar, 0, out_value);
  }

One thing that is never addressed here is how to differentiate fields that are scalars from those that are vectors. I tried to address that above by adding a component argument to the get/set scalar value functions. You could imagine creating the same worklet that loops over components just in case it is given a vector.

DAX_WORKLET void CalculatorWorklet(
  DAX_IN DaxWorkMapField work,
  DAX_IN DaxField in_scalar,
  DAX_OUT DaxField out_scalar)
  {
  DaxId num_components = daxWorkGetScalarNumberOfComponents(work, in_scalar);
  for (DaxId component = 0; component < num_components; component++)
    {
    DaxScalar in_value = daxWorkGetScalarValue(work, in_scalar, component);
    DaxScalar out_value = ...;
    daxWorkSetScalarValue(work, out_scalar, component, out_value);
    }
  }

Of course, you should also be able to get vector values like so:

DAX_WORKLET void MagnitudeWorklet(
  DAX_IN DaxWorkMapField work,
  DAX_IN DaxField in_vector,
  DAX_OUT DaxField out_scalar)
  {
  DaxVector3 in_value = daxWorkGetScalarValue3(work, in_vector);
  DaxScalar out_value = daxVectorMagnitude(in_value);
  daxWorkSetScalarValue(work, out_scalar, 0, out_value);
  }

Perhaps a better approach would be to somehow identify the number of components desired and then have some pipeline mechanism to select or loop. However, that is just moving the complexity elsewhere and will probably make everything more difficult.

Point to Cell

Point to Cell is a simple cell map that uses what should be an interpolation function for point fields.

DAX_WORKLET void PointToCell(
  DAX_IN DaxWorkMapCell work,
  DAX_IN DaxFieldPoint in_scalar,
  DAX_OUT DaxFieldCell out_scalar)
  {
  DaxVector3 parametric_cell_center = (DaxVector3)(0.5, 0.5, 0.5);
  DaxCell cell = daxWorkGetCell(work);
  DaxId num_components = daxWorkGetScalarNumberOfComponents(work, in_scalar);
  for (DaxId component = 0; component < num_components; component++)
    {
    DaxScalar value = daxCellInterpolate(
      cell,
      parametric_cell_center,
      in_scalar,
      component);
    daxWorkSetScalarValue(work, out_scalar, component, value);
    }
  }

Cell To Point

I wonder if defining a DualCell would make it easier to deal with cell-links. In that case the Cell To Point can be written as follows:

DAX_WORKLET void CellToPoint(
  DAX_IN DaxWorkMapPoint work,
  DAX_IN DaxFieldCell in_scalar,
  DAX_OUT DaxFieldPoint out_scalar)
  {
  DaxVector3 parametric_cell_center = (DaxVector3)(0.5, 0.5, 0.5);
  DaxDualCell dual_cell = daxWorkGetDualCell(work);
  DaxId num_components = daxWorkGetScalarNumberOfComponents(work, in_scalar);
  for (DaxId component = 0; component < num_components; component++)
    {
    DaxScalar value = daxDualCellInterpolate(
      dual_cell,
      parametric_cell_center,
      in_scalar,
      component);
    daxWorkSetScalarValue(work, out_scalar, component, value);
    }
  }

Some observations. First, having a mechanism for dual grids is a good idea. It's a good way to treat cell data like point data to do things like isosurfaces. The recent VTK work on AMR grids has shown that dual grids are a convenient way of handling refinement interfaces. Interestingly, the dual grid would have to be an internal mechanism to implicitly change the topology. The worklet itself is exactly the same as the point to cell map. Thus, the work object should be DaxWorkMapCell, and this implementation would be exactly the same as the PointToCell worklet (and in fact you would use the same function in both cases).

That said, I'm not convinced that a dual grid is the best way to handle cell to point in general. I think it introduces a larger overhead than is required. It requires building cell links for this pseudo-cell and probably computing centroids of the neighboring cells to get pseudo-vertex coordinates. In this example of just averaging the point data, none of this cell building is necessary. In fact, it's not even correct, as the parametric center of the dual cell is not necessarily on the point of the original grid.

For the simple cell to point worklet DaxWorkMapPoint, I was thinking you could get a count of the neighboring cells and then get a cell data value for each neighboring cell (see the sketch below). Stephane Marchesin had a slightly more expressive idea that is a constrained map-reduce. The idea is that you would first execute a map worklet on each cell that would output a separate value for each point (or, equivalently, run on all neighboring cell-point pairs). Then you would run a reduce-type worklet on each point given the data generated for that point by all relevant map worklets.

--Kenneth Moreland 14:38, 13 May 2011 (EDT)
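For comparison, here is a rough sketch of the simpler neighbor-averaging worklet described above. The daxWorkGetNumberOfCells and daxWorkGetCellScalarValue accessors are hypothetical, not part of the API proposed earlier.

DAX_WORKLET void CellToPointAverage(
  DAX_IN DaxWorkMapPoint work,
  DAX_IN DaxFieldCell in_scalar,
  DAX_OUT DaxFieldPoint out_scalar)
  {
  /* Hypothetical accessors: count the cells incident on this point, read the
     cell value from each one, and store the average. */
  DaxId num_cells = daxWorkGetNumberOfCells(work);
  DaxScalar sum = 0;
  for (DaxId cell_index = 0; cell_index < num_cells; cell_index++)
    {
    sum += daxWorkGetCellScalarValue(work, in_scalar, cell_index, 0);
    }
  daxWorkSetScalarValue(work, out_scalar, 0, sum / num_cells);
  }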

Gradient

Gradient is structured about the same way as point to cell. The computation is different and returns a vector.

DAX_WORKLET void CellGradient(
  DAX_IN DaxWorkMapCell work,
  DAX_IN DaxFieldCoordinates in_positions,
  DAX_IN DaxFieldPoint in_scalar,
  DAX_OUT DaxFieldCell out_vector)
  {
  DaxVector3 parametric_cell_center = (DaxVector3)(0.5, 0.5, 0.5);
  DaxCell cell = daxWorkGetCell(work);

  DaxVector3 value = daxCellDerivative(
    cell,
    parametric_cell_center,
    in_positions,
    in_scalar,
    0);
  daxWorkSetScalarValue3(work, out_vector, value);
  }

Acknowledgements

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.


SAND 2011-3271P