Problem with TChain and https TURLs

Hi,

when using TFile::Open() with Google storage, we have to append “&#multirange=false&nconnections=10” to the signed TURLs (see below); otherwise davix cannot read the file correctly. This works fine for single files, but when experimenting with a TChain, it seems the TURL is mangled at the “#” sign and the reading fails as if “&#multirange=false&nconnections=10” were missing.

Does TChain accidentally interpret everything after the “#” as a comment in the TURL?

Cheers, Johannes

>>> import ROOT 
>>> filename="https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/data22_13p6TeV/DAOD_PHYSLITE.30136824._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1675437388&Signature=blabla&#multirange=false&nconnections=10" 
>>> chain = ROOT.TChain() 
>>> chain.AddFile(filename) 
>>> el=chain.GetListOfFiles()[0] 
>>> el.GetTitle() 
'https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/data22_13p6TeV/DAOD_PHYSLITE.30136824._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1675437388&Signature=blabla&' 
>>> el.GetName() 
'#multirange=false&nconnections=10' 

ROOT Version: 6.26/08 (Athena 23.0.14)
Platform: x86_64-centos7-gcc11-opt
Compiler: gcc 11.2


Hi,
I think @pcanal can help you with this.

Cheers

Lorenzo

Apologies for the delay! I’m looking into it. I’ll keep you posted here.

I see the problem: TChain interprets the xyz in the #xyz URL part as tree name. Let me think about how to address this.
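
To make clear which part of the URL is meant, here is a minimal Python sketch that uses only the standard library (not ROOT) to split a URL of the same shape into its query and fragment. It shows the generic URL decomposition, not TChain's actual parsing code; the host and the parameters are placeholders.

```python
from urllib.parse import urlsplit

# Placeholder URL with the same shape as the signed TURLs in this thread.
url = ("https://some.domain:8443/path/to/file.root.1"
       "?a=b&x=y&#multirange=false&nconnections=10")

parts = urlsplit(url)

# The query is everything between '?' and '#'.
print("query:   ", parts.query)
# The fragment is everything after '#'; the davix options live here.
print("fragment:", parts.fragment)
```

The davix options sit in the fragment, which is the part that ends up in the tree name in the session above.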

@elmsheus I just merged a patch in ROOT master that should fix the problem. Please let me know if this works for you.

Thank you @jblomer! I will give this a try as soon as it appears in the dev3LCG+Athena nightly in the coming day(s).

I’ve tested with the ROOT version from /cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Sat/ROOT/HEAD/x86_64-centos7-gcc11-opt/bin/root

root 
   ------------------------------------------------------------------
  | Welcome to ROOT 6.29/01                        https://root.cern |
  | (c) 1995-2022, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 04 2023, 00:52:00                 |
  | From heads/master@v6-29-01-768-gc554707                          |
  | With g++ (GCC) 11.3.0                                            |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

and unfortunately I still see similar issues: the #multirange=false&nconnections=10 part seems to be interpreted as the TTree name, and the files are not read correctly over https, whereas everything works fine when the files are on the local file system.

Did I pick up the wrong ROOT version?

Apologies, I missed this reply!

According to the build date, the fix should have been in this build. I just tried again, same path. The build is now from Mar 11. Here, I tried this in the ROOT prompt:

root [0] TChain c("defaultname");
root [1] c.Add("https://some.domain:8443/path/to/file.root.1?a=b&x=y&#multirange=false&nconnections=10")
(int) 1
root [2] c.GetListOfFiles()->At(0)->GetTitle();
root [3] auto n = c.GetListOfFiles()->At(0)->GetName();
root [4] n
(const char *) "defaultname"
root [5] auto t = c.GetListOfFiles()->At(0)->GetTitle();
root [6] t
(const char *) "https://some.domain:8443/path/to/file.root.1?a=b&x=y&#multirange=false&nconnections=10"
root [7]

So, in this build the fragment parameter is still part of the URL and the tree name was not modified. Could you try again?

Hi @jblomer,

sorry for the late reply

Unfortunately it does not work with:

$ /cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Mon/ROOT/HEAD/x86_64-centos7-gcc11-opt/bin/root

and

   ------------------------------------------------------------------
  | Welcome to ROOT 6.29/01                        https://root.cern |
  | (c) 1995-2022, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 20 2023, 00:22:00                 |
  | From heads/master@v6-29-01-916-g788ebb6                          |
  | With g++ (GCC) 11.3.0                                            |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

Here is my naive C/C++ style ROOT macro, with the signature of the signed TURL replaced by “blablabla”:

{
  std::string filename1="https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&#multirange=false&nconnections=10";
  std::string filename2="https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000002.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391875&Signature=blablabla&#multirange=false&nconnections=10";

  TChain chain("CollectionTree");
  chain.AddFile(filename1.c_str());
  chain.AddFile(filename2.c_str());
  chain.Print();
  chain.Scan("AnalysisElectronsAuxDyn.pt");
  chain.Draw("AnalysisElectronsAuxDyn.pt");

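  std::cout << chain.GetListOfFiles() << std::endl;
  // Note: GetListOfFiles() returns a TObjArray*, so [0] yields the TObjArray
  // itself rather than its first element; ->At(0) would give the first chain element.
  auto el=chain.GetListOfFiles()[0];
  std::cout << el.GetTitle() << std::endl;
  std::cout << el.GetName() << std::endl;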

}

and the output (again signature replaced):

Processing test.C...
******************************************************************************
*Chain   :CollectionTree: https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla& *
******************************************************************************
******************************************************************************
*Chain   :CollectionTree: https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000002.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391875&Signature=blablabla& *
******************************************************************************
Error in <TChain::LoadTree>: Cannot find tree with name #multirange=false&nconnections=10 in file https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&
Error in <TChain::LoadTree>: Cannot find tree with name #multirange=false&nconnections=10 in file https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000002.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391875&Signature=blablabla&
Error in <TChain::LoadTree>: Cannot find tree with name #multirange=false&nconnections=10 in file https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod@atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&
0x1986b70
An array of objects
TObjArray

Cheers, Johannes

Strange, I’m taking a look.

I see, it is the @ character in the query part of the URL that confuses the URL parsing. I’ll follow up on that. Meanwhile, please try escaping the @ as %40, like this:

"https://storage.googleapis.com:443/atlas-europe-west1-datadisk/rucio/mc21_13p6TeV/DAOD_PHYSLITE.32048869._000001.pool.root.1?GoogleAccessId=atlas-rucio-prod%40atlas-rucio-prod.iam.gserviceaccount.com&Expires=1679391831&Signature=blablabla&#multirange=false&nconnections=10"

This may or may not work; it depends on whether the Google service un-escapes the character. As far as I understand, it wouldn’t be obliged to do so.
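
A minimal Python sketch of this workaround, assuming the only change needed is to replace @ with %40 in the query part while leaving the #... options untouched. The helper name escape_at_in_query and the example URL are hypothetical, and whether a given storage service accepts the escaped form is not guaranteed.

```python
def escape_at_in_query(turl: str) -> str:
    """Percent-encode '@' in the query part of a URL.

    The scheme/host/path before '?' and the fragment after '#' are
    returned unchanged. Hypothetical helper, not part of ROOT.
    """
    # Split off the fragment first so its content is never touched.
    head, hash_sep, fragment = turl.partition("#")
    # Then separate the query from everything in front of it.
    base, qmark, query = head.partition("?")
    return base + qmark + query.replace("@", "%40") + hash_sep + fragment


# Placeholder URL shaped like the signed TURLs in this thread.
example = ("https://some.domain:8443/path/to/file.root.1"
           "?GoogleAccessId=user@example.com&x=y"
           "&#multirange=false&nconnections=10")
print(escape_at_in_query(example))
```

The result could then be passed to chain.AddFile() in place of the original TURL.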

That’s indeed the culprit! Using %40 instead of @ in the signed TURLs for the two files works, both in the Python and the C/C++ macros.

Cheers, Johannes