Function to extract file name from path

Dear Rooters

I apologize for asking a mainly C/C++ question, but I am not sure whether my function:
TString Path2Name(const char *name, const char *sep, const char *exten)
which I use in many places in my program, is really correct.

Here are some examples of how I use this function:

root [1] Path2Name("/my/path/to/myfile.root", "/", "")
(class TString)"myfile.root"
root [2] Path2Name("/my/path/to/myfile.root", "/", ".")
(class TString)"myfile"
root [3] Path2Name("/my/path/to/myfile.root/mytree.txt;1", "/", "")
(class TString)"mytree.txt;1"
root [4] Path2Name("/my/path/to/myfile.root/mytree.txt;1", "/", ";")
(class TString)"mytree.txt"
root [5] Path2Name("/my/path/to/myfile.root/mytree.txt;1", "/", ".")
(class TString)"mytree"
root [6] Path2Name("/my/path/to/myfile.root/mytree.txt;1", "", ".root")
(class TString)"/my/path/to/myfile"

Here is the source code:

TString Path2Name(const char *name, const char *sep, const char *exten)
{
   // Extract name from full path
   // sep is path separator and exten is name extension

   TString outname = TString(name);
   char   *tmpname = new char[strlen(name) + 1];
   char   *delname = tmpname;

   tmpname = strtok(strcpy(tmpname,name),sep);
   while(tmpname) {
      outname = tmpname;
      tmpname = strtok(NULL,sep);
   }//while
      
   if (strcmp(exten,"") != 0) {
      Int_t i = outname.Index(exten);
      if (i > 0) {outname = outname.Remove(i);}
   }//if

   delete [] delname;

   return outname;
}//Path2Name
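
For comparison, the same logic (take the last token between separators, then cut at the first occurrence of the extension) can be sketched without strtok() and without a manually allocated buffer. This is only a sketch, not ROOT code: it uses std::string instead of TString so that it compiles outside ROOT, and the name Path2NameStd is made up for illustration. Like strtok(), find_last_of() treats every character of sep as a separator.

```cpp
#include <string>

// Hypothetical std::string counterpart of Path2Name(); not part of ROOT.
std::string Path2NameStd(const std::string &name, const std::string &sep,
                         const std::string &exten)
{
   std::string out = name;

   if (!sep.empty()) {
      // Last character that is not a separator = end of the last token.
      std::string::size_type end = name.find_last_not_of(sep);
      if (end != std::string::npos) {
         // Last separator before that character = start of the last token.
         std::string::size_type start = name.find_last_of(sep, end);
         start = (start == std::string::npos) ? 0 : start + 1;
         out = name.substr(start, end - start + 1);
      }
      // If name contains no token at all, out stays equal to name,
      // as in the strtok() loop of the original.
   }

   if (!exten.empty()) {
      // Same condition as "i > 0" in the original: an extension found
      // at position 0 is not removed.
      std::string::size_type i = out.find(exten);
      if (i != std::string::npos && i > 0) out.erase(i);
   }

   return out;
}
```

Since the result is returned by value and no raw buffer is involved, there is nothing to delete in this variant.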

I have been using this function in my program for a long time without any problems, but
recently I have been getting strange results when working with large datasets, which lead to very low
memory conditions. For example, when exporting all tree names from a file, one of the
trees is returned with a name such as:
"H\207\200\001H\207\200\001H\207\200\001H\207\200\001H\207\200\001H\207\200\001H\207\200\001H\207\200\001H\207\200\001H\207\200"
I am not sure what the reason for this could be, but to me it looks like a memory overflow,
so my questions for the moment are:

  • Is the code for function Path2Name() correct?
  • Is the line “delete [] delname;” correct?
  • What happens if I delete line “delete [] delname;”? (I see no difference)
  • Does the returned “outname” refer to valid memory?

Thank you in advance.

Best regards
Christian

Hi,

I don’t see anything wrong. But make sure that you don’t convert the returned TString to a const char* on the caller’s side - the TString is temporary, and its const char* will be invalid right after the call. So you should copy it to a TString on the caller’s side, or (more efficient) pass the TString to contain the result as an argument. If you remove “delete []delname” you should see a (small) memory leak.
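
The lifetime issue described above can be illustrated with a small sketch. It assumes std::string behaves like TString in this respect (c_str() playing the role of Data()); the function names MakeName, SafeLength and MakeNameInto are made up for the example. A temporary returned by value lives until the end of the full expression in which it was created, so using its buffer inside that same expression (for example directly as an argument of strcmp()) is fine, while storing the pointer for later use is not.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for a function returning a string by value,
// like Path2Name().
std::string MakeName() { return std::string("mytree"); }

// Unsafe pattern, shown only as a comment:
//    const char *p = MakeName().c_str();  // temporary dies at the end of this statement
//    printf("%s\n", p);                   // p dangles here

// Safe: copy the result into a named string that outlives every use
// of its buffer.
std::size_t SafeLength()
{
   std::string name = MakeName();
   const char *p = name.c_str();  // valid while 'name' is alive and unmodified
   return std::strlen(p);
}

// Alternative: the caller supplies the string that receives the result,
// which avoids creating a temporary.
void MakeNameInto(std::string &out) { out = "mytree"; }
```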

Axel.

Dear Axel

Thank you for your answer and your explanation.

Most of the time I do:

TString str = Path2Name(treename,".","");

According to your explanation this should be ok.

However, in order to extract the tree names for R, I have created the following C-function:

void GetTreeNames(char **exten, int *gettitle, char **treenames)
{
// Get number of trees
   int   ntrees = 0;
   TKey *key    = 0;
   TIter next(gDirectory->GetListOfKeys());
   while ((key = (TKey*)next())) {
      if (strcmp(key->GetClassName(), "TTree") != 0) continue;
      if (!(strcmp((Path2Name(key->GetName(),".",";")).Data(), exten[0]) == 0 ||
            strcmp(exten[0], "*") == 0)) continue;
      ntrees++;
   }//while

// Get tree names
   TString names[ntrees];
   ntrees = 0;
   key    = 0;
   next.Reset();
   while ((key = (TKey*)next())) {
      if (strcmp(key->GetClassName(), "TTree") != 0) continue;
      if (!(strcmp((Path2Name(key->GetName(),".",";")).Data(), exten[0]) == 0 ||
            strcmp(exten[0], "*") == 0)) continue;
      names[ntrees] = (*gettitle == 0) ? key->GetName() : key->GetTitle();
      ntrees++;
   }//while

   for (int i=0; i<ntrees; i++) {
      treenames[i] = (char*)(names[i].Data());
//?      treenames[i] = strcpy(treenames[i], (char*)(names[i].Data()));
   }//for_i
}

It is this function that sometimes returns the strange result.
Do you think that this use of Path2Name() is ok?
How should I copy names[i] to treenames[i] in the last for-loop?
(I have tried strcpy(), but this caused even more problems.)

P.S.: BTW, as far as I understand, the memory for treenames[ntrees] should have been reserved by R,
however sometimes I get the following bus error when running my R package:

 *** Break *** bus error
/Volumes/GigaDrive/CRAN/Workspaces/Exon/hutissues/u133p2/5163: No such file or directory.
Attaching to process 5163.
Reading symbols for shared libraries . done
Reading symbols for shared libraries ............................................................................................. done
0x90029a67 in wait4 ()

========== STACKS OF ALL THREADS ==========

Thread 1 (process 5163 thread 0x113):
#0  0x90029a67 in wait4 ()
#1  0x90046d9b in system ()
#2  0x04a3945d in TUnixSystem::StackTrace ()
#3  0x04a3c7c9 in TUnixSystem::DispatchSignals ()
#4  0x04a3c8fd in SigHandler ()
#5  <signal handler called>
#6  0x010c0647 in Rf_allocVector (type=10, length=1) at memory.c:1970
#7  0x010410c1 in do_is (call=0x1b096a8, op=0x1819f90, args=0x2f90b24, rho=0x2f90b5c) at coerce.c:1636
#8  0x01097f60 in Rf_eval (e=0x1b096a8, rho=0x2f90b5c) at eval.c:492
#9  0x0109aa25 in do_if (call=0x1b096e0, op=0x180e460, args=0x1b096c4, rho=0x2f90b5c) at eval.c:945
#10 0x01097cfd in Rf_eval (e=0x1b096e0, rho=0x2f90b5c) at eval.c:463
#11 0x01099f88 in do_begin (call=0x1b09718, op=0x180ef04, args=0x1b096fc, rho=0x2f90b5c) at eval.c:1156

This may look like an error in the R function “allocVector()”, which is responsible for allocating the character vector
for treenames[ntrees], but I cannot believe that this function has a bug.

Best regards
Christian

Hi,

Your use of Path2Name() looks OK to me. The way you pass the treenames out does not, though: you assign to them the buffers of the local variable “names”, which will be gone after the function returns. Instead you should copy the strings. Now the question is whether each treenames[i] entry points to allocated memory when the function is called. I assume it doesn’t, as this would explain the problems you saw with your second option (strcpy()) of passing names[] to treenames[]. Try this instead:

   for (int i=0; i<ntrees; i++) {
      treenames[i] = new char[names[i].Length() + 1];
      strcpy(treenames[i], names[i].Data());
   }
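
The copy suggested above can be tried outside ROOT with the following sketch. It assumes std::string in place of TString and a std::vector in place of the local array; CopyNames and FreeNames are made-up helper names. Buffers obtained with new char[] stay valid after the function returns, but somebody has to release them with delete [] eventually. Who should do that when the pointers are handed to R is not settled in this thread.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Copy each name into a freshly allocated, NUL-terminated buffer.
// 'out' must have room for names.size() pointers. The buffers belong
// to the caller, who must release each one with delete [].
void CopyNames(const std::vector<std::string> &names, char **out)
{
   for (std::size_t i = 0; i < names.size(); i++) {
      out[i] = new char[names[i].length() + 1];  // +1 for the terminating NUL
      std::strcpy(out[i], names[i].c_str());
   }
}

// Release the buffers allocated by CopyNames() and reset the pointers.
void FreeNames(char **out, std::size_t n)
{
   for (std::size_t i = 0; i < n; i++) {
      delete [] out[i];
      out[i] = 0;
   }
}
```

The copies remain readable even after the source strings have been destroyed, which is exactly what the plain pointer assignment could not guarantee.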

Axel.

Dear Axel

Thank you for your correction. In the meantime I had also come to assume that I have to allocate memory in this case.

I have tried replacing “TString names[ntrees]” with “TString *names = new TString[ntrees]”,
and with this change I no longer had the memory problems.
However, I like your solution better and will use it.

Best regards
Christian