
Performance of TAxis::SetTimeDisplay()

Hello, I have been struggling with the performance of TAxis::SetTimeDisplay().

To demonstrate this, I wrote the following PyROOT program:

from ROOT import TCanvas, TGraph, TMultiGraph, TLegend, TFile
from line_profiler import LineProfiler
from random import randrange, uniform
from array import array

# Unix timestamps (x) and random measurements (y), 5 million points each
x_values = [randrange(1625546062, 1626946062) for _ in range(5000000)]
y_values = [uniform(-500.0, 500.0) for _ in range(5000000)]
x_values.sort()

# create_multigraph expects [y values, timestamps]
valuelist = [y_values, x_values]

tfile = TFile('/home/akcope/CERN/CMS/mondb_rootfiles/profilefile.root', 'RECREATE', 'Demo ROOT file with histograms')

tfile.mkdir("mydir")
keydir = tfile.GetDirectory("mydir")
keydir.cd()

lp = LineProfiler()
lp_wrapper = lp(create_multigraph)
lp_wrapper('test', 'test', valuelist)
lp.print_stats()

tfile.Write()
tfile.Close()

The create_multigraph function, together with the line_profiler output, is:

Total time: 3.81037 s
File: /home/akcope/PycharmProjects/plot_profiler/plot_profiler.py
Function: create_multigraph at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     6                                           def create_multigraph(name_in_dir, title, valuelist):
     7         1     559296.0 559296.0     14.7      c1 = TCanvas(name_in_dir, '', 200, 10, 700, 500)
     8                                           
     9         1       1678.0   1678.0      0.0      c1.SetFillColor(0)
    10         1       2383.0   2383.0      0.1      c1.SetGrid()
    11                                           
    12         1       1972.0   1972.0      0.1      multigraph = TMultiGraph()
    13         1       3252.0   3252.0      0.1      leg = TLegend(0.1, 0.7, 0.48, 0.9)
    14         1         19.0     19.0      0.0      leg.SetFillColor(0)
    15                                           
    16         1          0.0      0.0      0.0      i = 1
    17         1          1.0      1.0      0.0      color = 0
    18                                           
    19                                           
    20         1          2.0      2.0      0.0      yvalues = valuelist[0]
    21         1          1.0      1.0      0.0      timestamps = valuelist[1]                   
    28         1      75418.0  75418.0      2.0      y = array('d', yvalues)
    29         1     792453.0 792453.0     20.8      x = array('d', timestamps)
    30                                           
    31                                           
    36                                           
    37         1          6.0      6.0      0.0      markerstyle = i + 1
    38                                           
    39         1          1.0      1.0      0.0      if markerstyle > 7 and markerstyle < 20:
    40                                                   markerstyle = markerstyle + 13
    41                                           
    42         1          3.0      3.0      0.0      color += 1
    43         1          1.0      1.0      0.0      if color == 10 or color == 5:
    44                                                   color += 1
    45                                           
    46         1      31441.0  31441.0      0.8      gr = TGraph(len(x), x, y)
    47         1       2042.0   2042.0      0.1      gr.Draw()
    48         1        952.0    952.0      0.0      gr.SetName('key')
    49         1        922.0    922.0      0.0      gr.SetTitle('key')
    50         1        724.0    724.0      0.0      gr.SetMarkerColor(color)
    51         1        642.0    642.0      0.0      gr.SetLineColor(color)
    52         1        695.0    695.0      0.0      gr.SetMarkerStyle(markerstyle)
    53         1        686.0    686.0      0.0      gr.SetMarkerSize(2)
    54         1      35012.0  35012.0      0.9      gr.GetXaxis().SetTitle('key')
    55         1       2317.0   2317.0      0.1      leg.AddEntry(gr, 'key', 'lp')
    56         1       1218.0   1218.0      0.0      multigraph.Add(gr)
    61                                           
    62         1        988.0    988.0      0.0      multigraph.Draw('Alp')
    63         1          7.0      7.0      0.0      multigraph.SetTitle(title)
    64         1    1359952.0 1359952.0     35.7      multigraph.GetXaxis().SetTimeDisplay(1)
    65         1       1022.0   1022.0      0.0      multigraph.GetXaxis().SetTimeFormat('%y-%m-%d %H:%M:%S %F1970-01-01 00:00:00')
    66                                               # multigraph.GetXaxis().SetTimeOffset(0, 'GMT')
    67                                           
    68         1       1387.0   1387.0      0.0      c1.SetGridy(0)
    69         1       1054.0   1054.0      0.0      leg.Draw()
    70                                               # c1.Update()
    71         1       2413.0   2413.0      0.1      c1.GetFrame().SetFillColor(0)
    72         1        825.0    825.0      0.0      c1.GetFrame().SetBorderSize(12)
    73         1     929584.0 929584.0     24.4      c1.Write()

As you can see, multigraph.GetXaxis().SetTimeDisplay(1) takes a considerable amount of time with respect to the rest of the code. How can I optimize this?



ROOT Version: 6.24/0
Platform: CentOS7
Compiler: linuxx8664gcc


Looking at your example it is not obvious to me where the access to SetTimeDisplay comes from. Moreover I do not think SetTimeDisplay is the one taking the time because it is simply doing:

virtual void       SetTimeDisplay(Int_t value) {fTimeDisplay = (value != 0);}

I would rather suspect TMultiGraph::GetXaxis(), which calls GetHistogram(), although in principle the histogram is created only when needed. It might be that your code deletes the underlying histogram of the TMultiGraph, which forces GetHistogram() to recreate it several times. But again, that kind of logic is not visible in the code you posted.

Thank you, this indeed did provide some insight, as after modifying the code I get the following:

    65         1    1465470.0 1465470.0     36.3      multigraph.GetXaxis()
    66         1       2257.0   2257.0      0.1      multigraph.GetXaxis().SetTimeDisplay(1)
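
A chained call such as multigraph.GetXaxis().SetTimeDisplay(1) is reported by line_profiler as a single line, so the cost of the getter ends up charged to the setter. A minimal sketch of timing each step on its own with the standard library follows. The names timed, Axis and SlowOwner are hypothetical stand-ins (not ROOT classes) so that the snippet runs without ROOT installed:

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.6f} s")
    return result, elapsed


class Axis:
    """Hypothetical stand-in for an axis object with a cheap setter."""

    def __init__(self):
        self.time_display = False

    def SetTimeDisplay(self, value):
        self.time_display = value != 0


class SlowOwner:
    """Hypothetical stand-in whose getter does the expensive work."""

    def GetXaxis(self):
        sum(range(200_000))  # placeholder for the expensive part
        return Axis()


owner = SlowOwner()
axis, getter_seconds = timed("GetXaxis", owner.GetXaxis)
_, setter_seconds = timed("SetTimeDisplay", axis.SetTimeDisplay, 1)
```

With a real TMultiGraph the same helper could be used as timed("GetXaxis", multigraph.GetXaxis), which separates the two costs in the same way as the profile lines above.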

If I remove the SetTimeDisplay call (and all other GetXaxis calls), I still get the desired result of a readable TMultiGraph (and in the production application I actually achieve a 10-fold speed-up), just with Unix timestamps instead of human-readable dates. Is there any way around this?

Just to note, I posted the entire program, there is no other logic. The lines which I did not include were just comments, notes to myself.

Ah yes … I see the profiler output … ok … hard to copy/paste.
Why don’t you get the axis only once? (Sorry, I prefer C++.)

auto xaxis = multigraph->GetXaxis();

Then you use:

xaxis->SetXXXXX()...

Trying this, it seems that is already optimized under the hood, as only the first call of GetXaxis() actually consumes that much time.

Besides, I am trying to optimize even the first call away, as it effectively makes the production application much slower. For example, if I do not include any calls to this function and leave just the Unix timestamps on the x axis, the time improves from 120 seconds to 20 seconds for my production test case.

I guess there is no way to call SetTimeDisplay() without calling GetXaxis()? Or is there a way to tell ROOT to perform some kind of lazy initialization under the hood, since probably not all of the initialization done by GetXaxis() is required for my use case?

In the end GetHistogram() will be called anyway … unless you never plot the multigraph. GetHistogram() is the one consuming the time because it loops over all the graphs in order to find the widest range that includes all of them. That is what TMultiGraph was made for. The axes belong to the underlying histogram created by GetHistogram().

You can try to call GetHistogram() once all the graphs are added. The time will be spent there, I guess.
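
To illustrate the behaviour described here (one expensive scan over all points on the first request, then a cached result), here is a toy model in plain Python. It is only a sketch of the idea and not ROOT's actual TMultiGraph code. ToyMultiGraph, get_x_range and scans are made-up names:

```python
class ToyMultiGraph:
    """Toy model only; this is not ROOT's TMultiGraph implementation."""

    def __init__(self):
        self.graphs = []      # each graph is represented by its list of x values
        self._x_range = None  # cached (min, max), built on the first request
        self.scans = 0        # counts how often all points were scanned

    def add(self, xs):
        self.graphs.append(list(xs))

    def get_x_range(self):
        """Return (min, max) over all graphs, scanning them only once."""
        if self._x_range is None:
            self.scans += 1
            lo = min(min(xs) for xs in self.graphs)
            hi = max(max(xs) for xs in self.graphs)
            self._x_range = (lo, hi)
        return self._x_range


mg = ToyMultiGraph()
mg.add([1625546062, 1625600000, 1626946062])
mg.add([1625550000, 1626000000])
first = mg.get_x_range()   # pays for the scan over every point
second = mg.get_x_range()  # served from the cache
```

In this model the cost grows with the total number of points, which matches the observation in this thread that only the first GetXaxis() call is slow.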

Hello, thanks for the replies.

After some more thought, I decided to change the program in the following way:

multigraph.GetXaxis().SetLimits(minstamp, maxstamp)
multigraph.GetXaxis().SetTimeDisplay(1)
multigraph.GetXaxis().SetTimeFormat('%y-%m-%d %H:%M:%S %F1970-01-01 00:00:00')

I do this before actually adding the sub-TGraphs, and I never call GetXaxis() again after that. I observed a massive speed increase: the run time went down from 110 seconds to 1.2 seconds for my production test case.

Do you think this is an okay approach or is there something I could be breaking unknowingly?

Yes it is fine.
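
For reference, the approach confirmed above could be wrapped in a small helper along these lines. This is a sketch under assumptions: configure_time_axis is a hypothetical name, the limits are taken as the minimum and maximum of the timestamps, and it relies on GetXaxis() returning a usable axis at that point, as reported in this thread. The stub classes exist only so the snippet runs without ROOT. With PyROOT one would pass a real TMultiGraph instead:

```python
TIME_FORMAT = '%y-%m-%d %H:%M:%S %F1970-01-01 00:00:00'


def configure_time_axis(multigraph, timestamps):
    """Fix the x-axis range and time display before any graph is added."""
    minstamp = min(timestamps)
    maxstamp = max(timestamps)
    xaxis = multigraph.GetXaxis()  # fetched once and reused
    if not xaxis:
        # Assumption: a missing axis shows up as a falsy object
        raise RuntimeError('GetXaxis() returned no axis')
    xaxis.SetLimits(minstamp, maxstamp)
    xaxis.SetTimeDisplay(1)
    xaxis.SetTimeFormat(TIME_FORMAT)
    return minstamp, maxstamp


class _RecordingAxis:
    """Stub axis that records the calls made on it (not a ROOT class)."""

    def __init__(self):
        self.calls = []

    def SetLimits(self, lo, hi):
        self.calls.append(('SetLimits', lo, hi))

    def SetTimeDisplay(self, value):
        self.calls.append(('SetTimeDisplay', value))

    def SetTimeFormat(self, fmt):
        self.calls.append(('SetTimeFormat', fmt))


class _StubMultiGraph:
    """Stub standing in for a TMultiGraph (not a ROOT class)."""

    def __init__(self):
        self.axis = _RecordingAxis()

    def GetXaxis(self):
        return self.axis


stub = _StubMultiGraph()
limits = configure_time_axis(stub, [1625546062, 1626946062, 1626000000])
```

If the timestamps are already sorted, as in the demo program, timestamps[0] and timestamps[-1] give the same limits without scanning the list.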