ArrayHandle not Templated on Device Adapter

The current implementation of ArrayHandle has three template arguments: a base (component) type, a container tag, and a device adapter tag. Together these provide everything necessary to adapt ArrayHandle to any underlying data storage and to manage that data in the control and execution environments.

However, a problem with this approach is that the device used by the ArrayHandle must be set when the ArrayHandle is constructed. This should be unnecessary since the ArrayHandle is generally created with data in the control environment, so the decision on what device to use can be deferred until later. It is also awkward to match the devices of ArrayHandles with each other (multiple ArrayHandles are used together very commonly) and with other types of objects (such as dispatchers).

Thus, here is a proposal for a change to the ArrayHandle that removes the device adapter tag template parameter and instead allows it to polymorphically interface with a device when it is used.

Basic Approach

The gist of the change is to remove the template parameter for the device adapter. This, of course, will affect any part of the ArrayHandle that refers to the execution environment. Fortunately, there are only three places where this happens.

The first place is where the ArrayTransfer class is accessed. The ArrayTransfer is used as the "ExecutionArray" for transferring data to and from the device. The ArrayTransfer is now wrapped in a new class called ArrayHandleExecutionManager, whose base class exposes its operations through virtual methods (run-time polymorphism). More details on this are given in the following section.

The second place is the definition of the execution array portals.

typedef typename ArrayTransferType::PortalExecution PortalExecution;
typedef typename ArrayTransferType::PortalConstExecution PortalConstExecution;

Instead of providing these typedefs directly, they are wrapped in a templated structure that gives the portal for a particular device.

template <typename DeviceAdapter = DeviceAdapterTag>
struct ExecutionTypes {
  typedef typename ExecutionManagerType
      ::template ExecutionTypes<DeviceAdapter>::Portal Portal;
  typedef typename ExecutionManagerType
      ::template ExecutionTypes<DeviceAdapter>::PortalConst PortalConst;
};

The third place is in the Prepare* methods where data is transferred to the device (as necessary) and an execution array portal is returned. These methods change to templated methods that take the device adapter as an argument and push the data to that device. For example, the declaration for the PrepareForInput method becomes the following.

template<typename DeviceAdapterTag>
typename ExecutionTypes<DeviceAdapterTag>::PortalConst
PrepareForInput(DeviceAdapterTag) const
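
To make the effect of this declaration concrete, here is a minimal sketch of a handle whose device is chosen only when PrepareForInput is called. It is not the Dax implementation: DeviceTagA, DeviceTagB, PortalConstA, PortalConstB, PortalLookup, and SketchArrayHandle are all hypothetical names invented for the illustration, and no data is actually transferred anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device adapter tags (stand-ins, not the real Dax tags).
struct DeviceTagA {};
struct DeviceTagB {};

// Hypothetical read-only portals, one distinct type per device.
template <typename T>
struct PortalConstA {
  const T* Data;
  std::size_t Size;
  T Get(std::size_t index) const { return this->Data[index]; }
};
template <typename T>
struct PortalConstB {
  const T* Data;
  std::size_t Size;
  T Get(std::size_t index) const { return this->Data[index]; }
};

// Hypothetical lookup from a device tag to the portal type it uses.
template <typename T, typename Device> struct PortalLookup;
template <typename T>
struct PortalLookup<T, DeviceTagA> { typedef PortalConstA<T> PortalConst; };
template <typename T>
struct PortalLookup<T, DeviceTagB> { typedef PortalConstB<T> PortalConst; };

// The handle is templated only on the value type, not on the device.
template <typename T>
class SketchArrayHandle {
public:
  template <typename Device>
  struct ExecutionTypes {
    typedef typename PortalLookup<T, Device>::PortalConst PortalConst;
  };

  explicit SketchArrayHandle(const std::vector<T>& values) : Values(values) {}

  // The device is named at the call site. A real implementation would
  // transfer the data to that device here; this sketch only wraps the
  // control-side storage in the portal type selected by the tag.
  template <typename Device>
  typename ExecutionTypes<Device>::PortalConst PrepareForInput(Device) const {
    typename ExecutionTypes<Device>::PortalConst portal =
        { this->Values.data(), this->Values.size() };
    return portal;
  }

private:
  std::vector<T> Values;
};
```

With this shape, the same handle object can be passed to code written for either device, and the return type of PrepareForInput follows the tag given at the call.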

Polymorphic Execution Array Management

As stated previously, the direct use of ArrayTransfer is replaced with a new class named ArrayHandleExecutionManager. The names are a little confusing because three similar classes interact here: ArrayHandleExecutionManager wraps an ArrayTransfer, which wraps ArrayManagerExecution. All of these are necessary, and each serves a specific purpose (run-time polymorphism, specialization on container, and specialization on device, respectively).

The ArrayHandleExecutionManager has a base class (named, appropriately, ArrayHandleExecutionManagerBase). The base class has the same template parameters as ArrayHandle: value type and container. The base class also has pure virtual methods that match most of the functions in ArrayTransfer: GetNumberOfValues, LoadDataForInput, LoadDataForInPlace, AllocateArrayForOutput, RetrieveOutputData, Shrink, and ReleaseResources. Since the interfaces for these do not rely on the device adapter type, the ArrayHandleExecutionManager subclass (which is templated on value type, container, and device) easily specializes the functionality.
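
The following is a rough sketch of that arrangement, not the actual Dax classes. Only three of the methods listed above are shown, their signatures are simplified assumptions, and ManagerBase, Manager, Handle, and DeviceTagA are hypothetical names. A std::vector stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct DeviceTagA {};  // hypothetical device adapter tag

// Base class: templated on the value type only, so the handle can hold a
// pointer to it without knowing which device is in use.
template <typename T>
class ManagerBase {
public:
  virtual ~ManagerBase() {}
  virtual std::size_t GetNumberOfValues() const = 0;
  virtual void LoadDataForInput(const std::vector<T>& controlData) = 0;
  virtual void ReleaseResources() = 0;
};

// Subclass: additionally templated on the device. The interfaces above do
// not mention the device type, so overriding them is straightforward.
template <typename T, typename Device>
class Manager : public ManagerBase<T> {
public:
  std::size_t GetNumberOfValues() const { return this->DeviceCopy.size(); }
  void LoadDataForInput(const std::vector<T>& controlData) {
    this->DeviceCopy = controlData;  // stands in for a host-to-device copy
  }
  void ReleaseResources() { this->DeviceCopy.clear(); }

private:
  std::vector<T> DeviceCopy;  // stand-in for memory owned by the device
};

// The handle creates the device-specific manager only when a device is named.
template <typename T>
class Handle {
public:
  explicit Handle(const std::vector<T>& values) : Values(values) {}

  template <typename Device>
  void PrepareForInput(Device) {
    this->Execution.reset(new Manager<T, Device>());
    this->Execution->LoadDataForInput(this->Values);
  }

  std::size_t NumberOfValuesInExecution() const {
    return this->Execution ? this->Execution->GetNumberOfValues() : 0;
  }

private:
  std::vector<T> Values;
  std::unique_ptr<ManagerBase<T> > Execution;
};
```

The handle never names the device in its own type. All device-specific behavior sits behind the base-class pointer.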

The methods that retrieve an execution array portal (GetPortalExecution and GetPortalConstExecution) are exceptions. These methods return an array portal whose type is specified by the device adapter. Thus, they must be templated on the device adapter, which is something you cannot do with a virtual method.

The ArrayHandleExecutionManager gets around this problem with a pure virtual method that accepts a pointer to an empty array portal object as a void *. The subclass implementation casts the pointer to the appropriate type and copies the appropriate array portal over. The GetPortal* methods can then be templated in the base class: they create an empty array portal of the appropriate type and pass its address to the aforementioned virtual method.
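
A minimal sketch of this void * hand-off might look like the following. The class and method names (ExecutionManagerBase, ExecutionManager, GetPortalExecutionImpl, PortalA, PortalFor, DeviceTagA) are hypothetical and chosen only for the illustration; the real classes carry a container template parameter as well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DeviceTagA {};  // hypothetical device adapter tag

// Hypothetical portal type used by DeviceTagA.
template <typename T>
struct PortalA {
  T* Data;
  std::size_t Size;
};

// Hypothetical lookup from a device tag to its portal type.
template <typename T, typename Device> struct PortalFor;
template <typename T>
struct PortalFor<T, DeviceTagA> { typedef PortalA<T> Portal; };

template <typename T>
class ExecutionManagerBase {
public:
  virtual ~ExecutionManagerBase() {}

  // Templated and therefore non-virtual: creates an empty portal of the
  // right type and hands its address to the virtual method below.
  template <typename Device>
  typename PortalFor<T, Device>::Portal GetPortalExecution(Device) {
    typedef typename PortalFor<T, Device>::Portal PortalType;
    PortalType portal = PortalType();
    this->GetPortalExecutionImpl(&portal);
    return portal;
  }

protected:
  // Virtual and therefore not templated: the portal type is erased.
  virtual void GetPortalExecutionImpl(void* portalOut) = 0;
};

template <typename T, typename Device>
class ExecutionManager : public ExecutionManagerBase<T> {
public:
  explicit ExecutionManager(std::vector<T>& storage) : Storage(&storage) {}

protected:
  void GetPortalExecutionImpl(void* portalOut) {
    // The subclass knows the device, so it can restore the portal type.
    typedef typename PortalFor<T, Device>::Portal PortalType;
    PortalType* out = static_cast<PortalType*>(portalOut);
    out->Data = this->Storage->data();
    out->Size = this->Storage->size();
  }

private:
  std::vector<T>* Storage;  // stand-in for memory managed on the device
};
```

Note that the cast is only safe when the tag passed to the base class method is the same one the subclass was instantiated with. Nothing in this sketch checks that.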

Run-Time Identification of Device Adapters

One problem with using run-time polymorphism is that the compiler can no longer check to ensure that the device is “right.” It is sometimes necessary to identify and compare device adapter tags at run time. For example, in the process of getting an execution array portal in the ArrayHandleExecutionManager, the base class should first check with the subclass whether it is called with the appropriate device adapter. To do this, we need a general “id” that can be set at run time and compared.

The first thing we need is to ensure that all device adapters have a unique id associated with them. This is solved by creating a macro named DAX_CREATE_DEVICE_ADAPTER and using this macro in place of defining a device adapter directly. Obviously, DAX_CREATE_DEVICE_ADAPTER defines the tag. It also creates a traits class that has a method named GetId that returns an id in a generic type. The simplest way to define the type is to use a std::string and set the contents to the name of the tag.
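
As an illustration only, a macro of this kind could be sketched as follows. This is not the definition of DAX_CREATE_DEVICE_ADAPTER: the macro name SKETCH_CREATE_DEVICE_ADAPTER, the sketch namespace, the DeviceAdapterTraits class, and the SameDevice helper are assumptions made for the example. The id is a std::string holding the tag name, as suggested above.

```cpp
#include <cassert>
#include <string>

namespace sketch {
// Declared but not defined: only tags created through the macro get traits.
template <typename Tag> struct DeviceAdapterTraits;
}

// Defines the tag and a traits specialization whose GetId returns the tag
// name as a string. The macro opens the namespace itself, so it must be
// invoked at global scope and every tag ends up in the same namespace.
#define SKETCH_CREATE_DEVICE_ADAPTER(Name)                      \
  namespace sketch {                                            \
  struct Name {};                                               \
  template <> struct DeviceAdapterTraits<Name> {                \
    static std::string GetId() { return std::string(#Name); }   \
  };                                                            \
  }

SKETCH_CREATE_DEVICE_ADAPTER(DeviceAdapterTagFoo)
SKETCH_CREATE_DEVICE_ADAPTER(DeviceAdapterTagBar)

namespace sketch {
// Run-time comparison of two tags through their ids.
template <typename DeviceA, typename DeviceB>
bool SameDevice(DeviceA, DeviceB) {
  return DeviceAdapterTraits<DeviceA>::GetId() ==
         DeviceAdapterTraits<DeviceB>::GetId();
}
}
```

A comparison like SameDevice is the kind of check the base class could run before handing out an execution array portal.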

Since we are creating auxiliary features of device adapter tags anyway, one convenient thing we can add is a mechanism to check to see if a device adapter tag is valid at compile-time. This is done with a templated field that evaluates to false by default. When the tag is defined, a specialization that evaluates to true is created. This value can then be wrapped in a Boost static assert as a readable way to check the validity of device adapter tag arguments.
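
A small sketch of such a check is given below. It substitutes the C++11 static_assert keyword for the Boost static assert mentioned above, and the names IsValidDeviceAdapterTag, DeviceAdapterTagFoo, NotADevice, and UseDevice are hypothetical.

```cpp
#include <cassert>

// Evaluates to false by default: an arbitrary type is not a device tag.
template <typename Tag>
struct IsValidDeviceAdapterTag {
  static const bool Valid = false;
};

// When a tag is defined, a specialization that evaluates to true is
// created alongside it (in practice the tag-creation macro would do this).
struct DeviceAdapterTagFoo {};
template <>
struct IsValidDeviceAdapterTag<DeviceAdapterTagFoo> {
  static const bool Valid = true;
};

// A type that was never registered as a device adapter tag.
struct NotADevice {};

// Any function taking a device tag can guard its argument at compile time.
template <typename Device>
int UseDevice(Device) {
  static_assert(IsValidDeviceAdapterTag<Device>::Valid,
                "Argument is not a valid device adapter tag.");
  return 1;
}
```

Calling UseDevice(NotADevice()) would fail to compile with the message given in the assert, rather than with an unrelated error deep inside the templates.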

The Ugly

This change can simplify client code quite a bit by removing a lot of unnecessary device adapter templating. There are a few negative consequences, though. The first and most obvious is that the change touches just about every portion of code in Dax. But there are a few more subtle consequences as well.

Device Adapter Namespaces

Previously, device adapter tags were placed in a namespace that matched where the device resides. For example, DeviceAdapterTagTBB was defined as dax::tbb::cont::DeviceAdapterTagTBB. Because device adapter tags must now all be defined with the DAX_CREATE_DEVICE_ADAPTER macro, the tags are all placed in the dax::cont namespace regardless of where they are defined.

This could be either a good thing or a bad thing depending on your point of view. The good thing is that now you just know that all device adapter tags are in dax::cont, which means you should have to think less when you use them. The bad thing is that they no longer match the header file they are defined in as well as they did. For example, the TBB tag is created by including dax/tbb/cont/DeviceAdapterTBB.h. Then again, the tag and the header file never matched up perfectly anyway, so maybe it does not matter so much.

CopyInto Broken

The ArrayHandle has had a method named CopyInto, templated on an STL-compatible iterator, that allows you to copy data out of the ArrayHandle directly from the execution environment. The problem is that you cannot do that correctly through the ArrayHandleExecutionManager: it would require a templated virtual method. Thus, this functionality must be removed.

That said, it’s not particularly important. It is mostly used for testing and trivial code. Production code should define its own containers and write directly to them.

Divining Tags for OpenGL Interop

The OpenGL Interop feature takes an ArrayHandle and creates an OpenGL buffer with those contents. The method used depends on the device the ArrayHandle data is in.

When the ArrayHandle is templated on the device adapter, determining which interop method to use is straightforward. However, the device is now chosen at run time. The most straightforward way to use the interop is to provide the correct device adapter tag when it is invoked. This, however, is a poor solution, as it does not enforce a match between the interop function and where the data is actually stored.

A better solution is surely possible, but it will take some more careful design.


Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

SAND 2014-0883P