Problem with TChain and https TURLs

Hi,

when using TFile::Open() with Google storage, we have to append “&#multirange=false&nconnections=10” to the signed TURLs (see below); otherwise davix cannot read the file correctly. This works fine for single files, but when experimenting with a TChain, it seems the TURL is mangled at the “#” sign and the reading fails as if “&#multirange=false&nconnections=10” were missing.

Does TChain accidentally interpret everything after the “#” as a comment in the TURL?

Cheers, Johannes

>>> import ROOT 
>>> filename="https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/data22_13p6TeV/DAOD_PHYSLITE.30136824._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1675437388&Signature=blabla&#multirange=false&nconnections=10" 
>>> chain = ROOT.TChain() 
>>> chain.AddFile(filename) 
>>> el=chain.GetListOfFiles()[0] 
>>> el.GetTitle() 
'https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/data22_13p6TeV/DAOD_PHYSLITE.30136824._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1675437388&Signature=blabla&' 
>>> el.GetName() 
'#multirange=false&nconnections=10' 

ROOT Version: 6.26/08 (Athena 23.0.14)
Platform: x86_64-centos7-gcc11-opt
Compiler: gcc 11.2


Hi,
I think @pcanal can help you on this

Cheers

Lorenzo

Apologies for the delay! I’m looking into it. I’ll keep you posted here.

I see the problem: TChain interprets the xyz in the #xyz part of the URL as the tree name. Let me think about how to address this.
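
For illustration, here is a minimal Python sketch (standard library only, and not ROOT's actual parsing code) of how a generic URL parser splits such a TURL at the “#”. The host and query values are made-up placeholders. Everything after the first “#” is treated as the URL fragment, which is the part that ends up being taken as the tree name here.

```python
from urllib.parse import urlsplit

# Placeholder TURL (hypothetical host and query values) with the davix
# options appended after '#'.
turl = ("https://some.domain:8443/path/to/file.root.1"
        "?a=b&x=y&#multirange=false&nconnections=10")

parts = urlsplit(turl)

# The query stops at the '#'; everything after it becomes the fragment.
print("query:   ", parts.query)
print("fragment:", parts.fragment)
```

So the davix options are not lost by the split itself; they are just no longer part of what is treated as the file URL.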

@elmsheus I just merged a patch in ROOT master that should fix the problem. Please let me know if this works for you.

Thank you @jblomer! I will give this a try as soon as it appears in the dev3LCG+Athena nightly in the coming day(s).

I’ve tested with the ROOT version from /cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Sat/ROOT/HEAD/x86_64-centos7-gcc11-opt/bin/root

root 
   ------------------------------------------------------------------
  | Welcome to ROOT 6.29/01                        https://root.cern |
  | (c) 1995-2022, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 04 2023, 00:52:00                 |
  | From heads/master@v6-29-01-768-gc554707                          |
  | With g++ (GCC) 11.3.0                                            |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

Unfortunately I still see a similar issue: the #multirange=false&nconnections=10 part still seems to be interpreted as the TTree name, and the files are not read correctly over https, whereas everything is fine when the files are on the local file system.

Did I pick up the wrong ROOT version?

Apologies, I missed this reply!

According to the build date, the fix should have been in this build. I just tried again, same path. The build is now from Mar 11. Here, I tried this in the ROOT prompt:

root [0] TChain c("defaultname");
root [1] c.Add("https://some.domain:8443/path/to/file.root.1?a=b&x=y&#multirange=false&nconnections=10")
(int) 1
root [2] c.GetListOfFiles()->At(0)->GetTitle();
root [3] auto n = c.GetListOfFiles()->At(0)->GetName();
root [4] n
(const char *) "defaultname"
root [5] auto t = c.GetListOfFiles()->At(0)->GetTitle();
root [6] t
(const char *) "https://some.domain:8443/path/to/file.root.1?a=b&x=y&#multirange=false&nconnections=10"
root [7]

So, in this build the fragment parameter is still part of the URL and the tree name was not modified. Could you try again?

Hi @jblomer,

sorry for the late reply

Unfortunately it does not work with:

$ /cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Mon/ROOT/HEAD/x86_64-centos7-gcc11-opt/bin/root

and

   ------------------------------------------------------------------
  | Welcome to ROOT 6.29/01                        https://root.cern |
  | (c) 1995-2022, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 20 2023, 00:22:00                 |
  | From heads/master@v6-29-01-916-g788ebb6                          |
  | With g++ (GCC) 11.3.0                                            |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

Here is my naive C/C++ style ROOT macro, with the signature of the signed TURLs replaced by “blablabla”:

{
  std::string filename1="https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&#multirange=false&nconnections=10";
  std::string filename2="https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000002.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391875&Signature=blablabla&#multirange=false&nconnections=10";

  TChain chain("CollectionTree");
  chain.AddFile(filename1.c_str());
  chain.AddFile(filename2.c_str());
  chain.Print();
  chain.Scan("AnalysisElectronsAuxDyn.pt");
  chain.Draw("AnalysisElectronsAuxDyn.pt");

  std::cout << chain.GetListOfFiles() << std::endl;
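  // Note: GetListOfFiles() returns a TObjArray*, so [0] on the pointer yields the
  // array itself, not its first element (that would be GetListOfFiles()->At(0)).
  auto el=chain.GetListOfFiles()[0];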
  std::cout << el.GetTitle() << std::endl;
  std::cout << el.GetName() << std::endl;

}

and the output (again signature replaced):

Processing test.C...
******************************************************************************
*Chain   :CollectionTree: https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla& *
******************************************************************************
******************************************************************************
*Chain   :CollectionTree: https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000002.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391875&Signature=blablabla& *
******************************************************************************
Error in <TChain::LoadTree>: Cannot find tree with name #multirange=false&nconnections=10 in file https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&
Error in <TChain::LoadTree>: Cannot find tree with name #multirange=false&nconnections=10 in file https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000002.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391875&Signature=blablabla&
Error in <TChain::LoadTree>: Cannot find tree with name #multirange=false&nconnections=10 in file https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&
0x1986b70
An array of objects
TObjArray

Cheers, Johannes

Strange, I’m taking a look.

I see, it is the @ character in the query part of the URL that confuses the URL parsing. I’ll follow up on that. Meanwhile, please try escaping the @ as %40, like this:

"https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod%40atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&#multirange=false&nconnections=10"

This may or may not work, depending on whether the Google service un-escapes the character. As far as I understand, it is not obliged to do so.
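
A minimal Python sketch of that workaround, assuming the @ only needs escaping in the query part and that the rest of the signed TURL must stay exactly as it is. The helper name escape_at_in_query and the shortened example TURL are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def escape_at_in_query(turl: str) -> str:
    """Return turl with every '@' in the query part replaced by '%40'.

    Hypothetical helper: only '@' is touched, so characters in the
    signature that are already percent-encoded are not encoded twice.
    """
    parts = urlsplit(turl)
    escaped_query = parts.query.replace("@", "%40")
    return urlunsplit(parts._replace(query=escaped_query))


# Shortened placeholder TURL; bucket path, account and signature are invented.
turl = ("https://storage.googleapis.com:443/bucket/file.root.1"
        "?GoogleAccessId=user@project.iam.gserviceaccount.com"
        "&Expires=1679391831&Signature=blablabla"
        "&#multirange=false&nconnections=10")

print(escape_at_in_query(turl))
```

The fragment with the davix options is left untouched, so the escaped string can be passed to TChain in place of the original TURL.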

That’s indeed the culprit! Using %40 instead of @ in the signed TURLs for the 2 files works, in both the Python and the C/C++ macros.

Cheers, Johannes