TClonesArray of objects which allocate memory within

mgmarino · February 20, 2011, 11:41am

Hi all-

I am currently using TClonesArrays extensively in our event data structure. The problem is, you can’t use them deeper than top-level without deleting them at the end of the event loop. For that matter you can’t use any object that allocated memory which means you almost certainly can’t have any other class inside (e.g. vector, string, etc. all allocate memory within ) unless you know exactly what’s going on in that class. This is a problem for non-pointer member objects, e.g.:

class MyClass : public TObject { // Can't put this in a TClonesArray // without calling TClonesArray::Delete() on each // event loop protected: std::vector<double> fMyData; };

Anyways, I know way to deal with this is either to give something of static size (for example an array of defined size), call TClonesArray::Delete(), or actively delete objects in Clear(), but this sort of defeats the point of a TClonesArray altogether. My question is, is this really necessary? For example:

The way to get a new object in a TClonesArray is to call this new function to place memory in the location defined by TClonesArray:

where a is of course a TClonesArray. This calls the object’s constructor, but places all the memory in a particular location. When the TClonesArray is cleared with Clear(“C”), it calls the Clear functions of all the TTrack members and resets its own size to 0. No destructors are called. However, when you insert an object via the new method above, the constructor is called again, etc…

My question is, if there is an object that has already been allocated in the TClonesArray, why not just grab a pointer to that old object? That is, your code would look like:

TClonesArray a; track = new(a[a.GetEntriesFast()]) TTrack(x,y,z,...); ... a.Clear('C'); // any subsequent calls would be like: if ( a.WasConstructorCalledAt(i) ) { track = a.GetOld(i); // Function would return } track->fX = myx; // etc...

which would get an old (but cleared and presumably in a pristine virgin state!) object if the constructor for that object has already been called. In principle, one could add a function to TClonesArray:

which would return an old (but cleared) object or a new object if size is less than i.

Anyways, I leave out a lot of implementation details, one thing that would need to be done in Clear() is resetting all the bits and unique id from TObject correctly (which get set normally during the constructor calls).

Am I missing something? This seems like it would speed things up a lot for classes which allocate data. What it would do is force a user to ensure that in the Clear functions, the allocated data is cleared (or reset to an acceptable state). I don’t think this is far beyond what is already required of users coding with TClonesArrays. For some users, this might not make a difference, but for example for users of vectors and other TClonesArrays, this would make a big difference. Both vectors and TClonesArrays can be cleared fast without deallocation of data.

Cheers,
Mike

mgmarino · February 21, 2011, 12:37am

I include here an attachment that displays the (known) issues of TClonesArrays discussed above. Download, unzip and type

make ./main

The problem is that the constructor call overwrites any information about allocated data, for example, from a std::vector the pointers of the class that retain information about the location of the allocated memory are overwritten. The proposal is to have a TClonesArray function do the following:

TObject* TClonesArray::GetNewOrClearedObjectAt(Int_t i)
{
    if (!fClass) {
        // error
        return NULL;
    }
    if (i >= fSize || !fIsConstructed[i]) {
       TObject* temp = static_cast<TObject*>(fClass->GetNew()((*this)[i]));
        fIsConstructed[i] = true;
        return temp;
    }
    return fCont[idx];
}

where fIsConstructed is an array of bools that says whether or not the constructor has ever been called. In the Clear function of the held class, the user must set the allocated data to a correct state. In this example, a call to std::vector::resize(0) would suffice. This doesn’t force a deallocation of memory, but rather sets it to a ‘clean’ state. In addition fBits and fUUID must be set to 0 as would be done in a TObject constructor (these aren’t reset by TObject::Clear()).

The limitation here is of course that nothing can be passed to the constructor. I don’t see this as a limitation, since it can be used the following way (normally this code would be wrapped by an additional Get function so users don’t have to always cast):

TTrack* track = static_cast<TTrack*>(fClones.GetObjectAt(0)); track->fX = 0.2; ...
clones_test.tar.gz (1.54 KB)

jasondet · February 21, 2011, 6:32am

I think Mike could get the functionality he wants with the existing TClonesArray::operator[] if a change is made in TStorage::ObjectAlloc() that should probably be there anyway.

The TObject destructor goes to the trouble of toggling the kNotDeleted bit to 0 so that one can test whether the destructor has been called on a region of memory allocated for a TObject. However, when TStorage::ObjectAlloc() first allocates the memory for the object in TClonesArray::operator[], it seems to just call the global-scope new with the appropriate number of bytes, update the heap head and tail, and return. I think TStorage::ObjectAlloc() should also include a line to zero the newly allocated memory so that the user can call ((TObject*) pointer)->TestBit(TObject::kNotDeleted) before the first time the TObject’s constructor is called and get a well-defined response of “false”. This will at least be a safe call when TStorage::IsOnHeap(pointer) returns true. I think this line, added before the AddToHeap() call in TStorage::ObjectAlloc(), will suffice:

memset((void*) space, 0, sz);

Then Mike might be able to do what he wants via

if ( a[i]->TestBit(TObject::kNotDeleted) ) track = a[i];
else track = new(a[i]) TTrack(x,y,z);

Thanks,
Jason

mgmarino · February 21, 2011, 9:09am

I think Jason’s mechanism is the eventual way to go. For the time being, I have a class derived from TClonesArray and I will implement the memory management within this class. To summarize some details that happened offline:

In MyTClonesArray:

TObject *&MyTClonesArray::operator[](Int_t idx)
{
...
if (!fKeep->fCont[idx]) {
      fKeep->fCont[idx] = (TObject*) TStorage::ObjectAlloc(fClass->Size());
      memset(fKeep->fCont[idx], 0, fClass->Size());
}
...
TObject *MyTClonesArray::GetNewOrCleanedObject(Int_t idx)
{
    TObject* temp = (*this)[idx];
    if ( temp && temp->TestBit(TObject::kNotDeleted) ) return temp;
    return New(idx);
}

In the TObject class that is contained within the TClonesArray, the following needs to be placed:

void MyTObject::Clear(Option_t*)
{
...
  // The following for my particular situation
  fVector.resize(0);

  // The following to take care of TRefs, etc.
  ResetBit( kIsReferenced );
  ResetBit( kHasUUID );
}

And of course the TClonesArray would always need to be cleared with:

myClones.Clear("C");

to ensure that the clear function of the objects within is called.

mgmarino · February 21, 2011, 9:58pm

After testing the code, I found a few issues with accessing protected variables. I also pulled the resetting of SetUniqueID and bits into the array class as this is difficult to maintain scattered through other classes. The derived class looks like now:

TObject *&MyTClonesArray::operator[](Int_t idx) { // Overload of TClonesArray::operator[], zeros memory allocated so // that calls like: // // temp = myClonesArray[i]; // temp->TestBit(TObject::kNotDeleted) // // will behave correctly. Bool_t mustZero = false; if (idx >= 0 && fClass && (idx >= fSize || !fKeep->UncheckedAt(idx))) { mustZero = true; } TObject *&temp = TClonesArray::operator[](idx); if ( temp && mustZero ) memset(temp, 0, fClass->Size()); return temp; }

[code]TObject *MyTClonesArray::GetNewOrCleanedObject(Int_t idx)
{
// Get a new or cleaned object from the array. Tests to see if the
// destructor has been called on the object. If so, or if the object
// has never been constructed the class constructor is called using
// New(). If not, return a pointer to the correct memory location.
// This explicitly to deal with TObject classes that allocate memory
// which will be reset (but not deallocated) in their Clear()
// functions. These classes must call the following in their Clear
// functions:
//
// to ensure that the bits related to references (TRefs, etc.) are
// reset.

TObject* temp = (this)[idx];
if ( temp && temp->TestBit(TObject::kNotDeleted) ) return temp;
return (fClass) ? static_cast<TObject>(fClass->New(temp)) : 0;
}[/code]

void MyTClonesArray::Clear(Option_t *option) { // // Properly reset the UUID, reference bits and reset the UniqueID of each // object. // if (option && option[0] == 'C') { Int_t n = GetEntriesFast(); for (Int_t i = 0; i < n; i++) { TObject *obj = UncheckedAt(i); if (obj) { obj->ResetBit( kHasUUID ); obj->ResetBit( kIsReferenced ); obj->SetUniqueID( 0 ); } } } TClonesArray::Clear(option); }

I think this code could be useful to anyone who wishes to add classes to a TClonesArray that allocate data, but don’t want to fully delete those classes incurring the overhead of deletion. Are there any comments from the ROOT crew?

pcanal · August 15, 2011, 4:37pm

Hi Mike,

This proposal can indeed speed the use of TClonesArray in many cases. This has been added to the trunk where the routine ‘GetNewOrCleanedObject’ has been named ‘ConstructedAt’.

Thanks,
Philippe.