Array Portals

From Daxtoolkit
Jump to: navigation, search

An important internal mechanism of Dax is the ability to pass around a "pointer" to an indexed contiguous storage (array) so that values can be read and written. These pointers cannot be actual C pointers because the array may not be an actual C array. (For example, it could be a vtkDataArray with a wrapper around it to convert between VTK tuples and Dax tuples.) The document proposes the concept of an array portal as an implementation of an object that points to and accesses an array.

What is an Array Portal?

An array portal is a simple object that holds a reference to a random-access container object. It has a simple interface like a container, but because it is a reference it can be passed around like a pointer. The interface would look something like this.

class ArrayPortal {
public:
  typedef ___ ValueType;

  ValueType Get(dax::Id index) const;

  // Can be missing for const array portals
  void Set(dax::Id index, const ValueType &value) const;

  dax::Id GetNumberOfValues() const;

  typedef ___ IteratorType;
  IteratorType GetIteratorBegin() const;
  IteratorType GetIteratorEnd() const;
};

The intention of this interface is to satisfy two needs. First, the interface is easy to use. The majority of access code in Dax (not counting third party generic algorithms) loads/stores values from a specific index in a container. Second, the interface should be easy to implement. To demonstrate this second point, consider this implementation for a standard C array of scalars (which will later be subsumed by a more general iterator adapter).

class ArrayPortalCScalarArray
{
  dax::Scalar *Array;
  dax::Id Size;
public:
  typedef dax::Scalar ValueType;
  ArrayPortalCScalarArray(dax::Scalar *array, dax::Id size)
    : Array(array), Size(size) {  }
  dax::Scalar Get(dax::Id index) const { return this->Array[index]; }
  void Set(dax::Id index, dax::Scalar value) { this->Array[index] = value; }
  dax::Id GetNumberOfValues() const { return this->Size; }
  
  typedef dax::Scalar *IteratorType;
  dax::Scalar *GetIteratorBegin() const { return this->Array; }
  dax::Scalar *GetIteratorEnd() const { return this->Array + this->Size; }
};

Now for a slightly more complicated example: an array of scalars masquerading as an array of dax::Vector3 objects.

class ArrayPortalCVector3Array
{
  dax::Scalar *Array;
  dax::Id Size;
public:
  typedef dax::Vector3 ValueType;
  ArrayPortalCVector3Array(dax::Scalar *array, dax::Id size)
    : Array(array), Size(size) {  }
  dax::Scalar Get(dax::Id index) const {
    dax::Scalar *location = this->Array + index;
    return dax::make_Vector3(location[0], location[1], location[2]);
  }
  void Set(dax::Id index, dax::Scalar value) {
    dax::Scalar *location = this->Array + index;
    location[0] = value[0];
    location[1] = value[1];
    location[2] = value[2];
  }
  dax::Id GetNumberOfValues() const { return this->Size; }
  
  // Returning iterators discussed later.
};

As can be seen, this interface yields a very easy mechanism to pass values as Vector3 to and from an array of scalars. At this point we have left out a fairly complicated bit involving returning iterators. This is discussed later in #Converting portals to iterators.

Why not iterators?

Iterators are the de facto standard for this type of pointer/accessor object and are already used by many of the generic algorithms we leverage, so why not just use that? The simple answer is that they can be a pain to implement.

Iterators are designed to capture the interface to lots of different data structures that we simply do not care about like linked lists and sets. Because of this, the primary operations of iterators involve tracking relative movements to a current position. This iterator movement is really just a hassle to implement and only gets in the way of the Dax code. The nature of Dax often has a thread read a single or small range of values somewhere in the middle of the array. This access pattern requires the random access class of iterators. Although the simplest type of iterator, a C pointer, is random access, random access iterators are actually the most feature rich type and thus the most difficult to implement.

But the real problem with iterators is the mechanism used to set a value.

  *iter = value;

For this to work, the dereference operator must return an l-value that can later be set. If the value types in the storage match that returned by the iterator, then this is usually just returning a reference to the position in the array. However, when these two do not match, as is the case in the previous example treating a flat scalar array as an array of Vector3s, then returning such a reference is problematic because there is no actual memory location to write to. For this to work, there must be a "reference object" that will do the right thing when a value is assigned to it. Ugg.

Why not ranges?

The group has discussed multiple times the possibility of replacing our concept of iterators with the concept of ranges as described by Andrei Alexandrescu's though provoking keynote "Iterators Must Go." In fact, it was an exploration in this possibility that brought me to this proposal of array portals.

The first problem I ran in to is that the description of ranges, particularly a random access range that we would need in Dax, is not completely specified by Alexandrescu's slides. I searched for a paper that would discuss more details, but could not find one.

However, in reviewing the details that Alexandrescu does give, it occurred to me that ranges also suffer from the problem that they implement mechanisms not particularly helpful to Dax. Popping the front or back of a range is not really useful and the front/back access mechanisms are clumsy for our purposes.

Really, the one advantage ranges give us (in the Dax project) over iterators is that they provide methods that set values via arguments. The concept of array portals is born off that idea. It is a class that basically provides only those set methods (with associated get), and dispenses with all the other iteration mechanisms we don't really need.

Converting iterators to portals

Many useful data structures will come with iterators, so it would be helpful to have a generic adapter from iterators to array portals. As it turns out, this is a simple adapter class that implements an array portal with a pair of iterators.

template<class IteratorT>
class ArrayPortalFromIterators
{
public:
  typedef IteratorT IteratorType;

private:
  IteratorType Begin;
  IteratorType End;

public:
  typedef typename std::iterator_traits<IteratorType>::value_type ValueType;

  ArrayPortalFromIterators(const IteratorType *begin, const IteratorType *end)
    : Begin(begin), End(end) {  }

  ValueType Get(dax::Id index) const {
    IteratorType location = this->Begin;
    std::advance(location, index);
    return *location;
  }
  void Set(dax::Id index, ValueType value) {
    IteratorType location = this->Begin;
    std::advance(location, index);
    *location = value;
  }

  dax::Id GetNumberOfValues() const { return std::distance(this->Begin, this->End); }
  
  dax::Scalar *GetIteratorBegin() const { return this->Begin; }
  dax::Scalar *GetIteratorEnd() const { return this->End; }
};

Converting portals to iterators

Despite my assertion that array portals are more suited for Dax than iterators, our source code still requires iterators to leverage the generic algorithms in thrust. Thus, if we use array portals we need a mechanism to get to iterators.

Some of the previous examples have a trivial implementation of iterators and can just return that. However, that is not always the case. For example, in the scalar to Vector3 portal it is very difficult to build an appropriate iterator as described in the #Why not iterators? section. So if we have to implement this iterator anyway, what have we really gained?

A lot actually because we can create one generic implementation of a random access iterator on any array portal and reuse that on any array portal we wish.

template<class ArrayPortalType>
class IteratorFromArrayPortal;

I am skipping the implementation details because it is, as already described several times, complicated.

Once this adapter is created, any array portal can return iterators using this adapter. As an example, here is the completed ArrayPortalCVector3Array class.

class ArrayPortalCVector3Array
{
  dax::Scalar *Array;
  dax::Id Size;
public:
  typedef dax::Vector3 ValueType;
  ArrayPortalCVector3Array(dax::Scalar *array, dax::Id size)
    : Array(array), Size(size) {  }
  dax::Scalar Get(dax::Id index) const {
    dax::Scalar *location = this->Array + index;
    return dax::make_Vector3(location[0], location[1], location[2]);
  }
  void Set(dax::Id index, dax::Scalar value) {
    dax::Scalar *location = this->Array + index;
    location[0] = value[0];
    location[1] = value[1];
    location[2] = value[2];
  }
  dax::Id GetNumberOfValues() const { return this->Size; }

  typedef IteratorFromArrayPortal<ArrayPortalCVector3Array> IteratorType;
  IteratorType GetBeginIterator() const { return IteratorType(*this, 0); }
  IteratorType GetEndIterator() const { return IteratorType(*this, this->Size); }
};

If we have an IteratorFromArrayPortal class, why do we need array portal classes to return iterators in the first place? Why not just always create an IteratorFromArrayPortal? It is because IteratorFromArrayPortal introduces another level of indirection to access. Compilers may or may not optimize it away. Also, Thrust has special "fast path" mechanisms for copying certain basic iterators. For these reasons it can be beneficial to remove a layer of indirection and use a containers native iterator when available.

Acknowledgements

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

SandiaLogo.png DOELogo.png

SAND 2012-5992P