FFT: Problem with large dataset in memory

Hi,

I am working on a project dealing with a very large amount of data. I now have a problem doing the FFT of a very long time trace, a signal with over 300 million sampling points. After testing on my computer, I realise that I can store only 2^27 points in memory, which already needs 2 GB of RAM; with an array of 2^28 Double_t points the program crashes (“segmentation violation”). I tried the TVirtualFFT class with TVirtualFFT::SetPoint or TVirtualFFT::SetPointComplex, but they all need the data loaded into memory.

So the question is: is there some mechanism or algorithm to manage the array in a TTree, or somewhere else on the hard disk, and then load the data into a cache step by step? Something like a FileArray. Or do you have a better idea?
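For reference, this is roughly the in-memory pattern I have been trying (a minimal sketch; the sizes and the option string are illustrative):

[code]// Everything below must fit in RAM at once, which is exactly my problem.
Int_t n = 1 << 27;                          // 2^27 sampling points
Double_t *data = new Double_t[n];           // ... filled from the time trace ...

TVirtualFFT *fft = TVirtualFFT::FFT(1, &n, "R2C ES K");
for (Int_t i = 0; i < n; i++)
   fft->SetPoint(i, data[i]);               // copies every point into FFTW's own buffer
fft->Transform();

Double_t re, im;
fft->GetPointComplex(0, re, im);            // read back one bin of the result
delete [] data;[/code]

Note that the input array and FFTW's internal buffer coexist, so the memory footprint is roughly doubled.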
This is really urgent, so I would be very grateful to hear something from you!
THX!

Hi,

From what I know, the underlying FFTW needs to have all data in memory, so I don’t think there is anything we can do on the ROOT side. If you find out how to make FFTW work without all data in memory, let me know, and I’ll think of how to change the wrapper in ROOT.

How much data (in MB) do you need to process? If it’s a little more than the allowed 2GB, I can try to lower the consumption a bit here and there.

Cheers,
Anna

I think I may be having a similar problem - I have 12,500 scope traces with 10k points each. They are stored in a Tree that contains one 10k-point scope trace per entry. I am trying to FFT each scope trace, one at a time, and then sum the resulting power spectra as I go. I add each power spectrum to the histogram that holds the running total, and then delete it immediately before moving on to the next scope trace. The memory errors only go away if I comment out the FFT lines in the code.

Does this sound right, or could I be missing something else? Sample code below.

Thanks very much,
Penny

[code]TH1F *hscopetrace = new TH1F("hscopetrace", "Scope Trace; channels; Volts", (int)NBins, 0., NBins);
TH1 *hm_tot = 0; // Sum of FFT power spectra, to be filled in the loop below.
hm_tot = hscopetrace->FFT(hm_tot, "MAG_AVG");

for (Int_t i = 0; i < nentries; i++) // Loop through the scope traces contained in the Tree.
{
   b_voltagesI->GetEntry(i); // Get the next scope trace from the Tree.

   // Keep (re)filling the same histogram with the voltages for the scope trace.
   for (Int_t j = 0; j < (int)NBins; j++)
      hscopetrace->SetBinContent(j, voltages[j]);

   TH1 *hm = 0; // Histogram to be filled with the FFT of the scope trace.
   TVirtualFFT::SetTransform(0);
   hm = hscopetrace->FFT(hm, "MAG");
   for (Int_t j = 0; j < (int)NBins; j++) // Keep summing the FFT spectra.
      hm_tot->SetBinContent(j, hm_tot->GetBinContent(j) + hm->GetBinContent(j));

   delete hm; // Delete the histogram with the FFT of the scope trace.
}
hm_tot->Scale(1./(double)nentries); // Calculate the average FFT spectrum from the total.

delete hscopetrace;
delete hm_tot;[/code]

Hi Penny,

How much data (in MB) do you need to process? How high does the memory use of your process go? Is there a memory leak (i.e. memory keeps growing) or is it ‘just’ using too much memory for each iteration?

Philippe.

Hi Philippe,

I have 11 data files, each about 2 GB, each containing a Tree with 3 branches: VoltageI, VoltageQ, and Timestamp. I only need to process VoltageI right now, so I estimate that I need to process on the order of 10 GB in total. I was hoping to be able to process one file at a time.
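Roughly what I mean by one file at a time (a sketch; the file, tree, and branch names here are just placeholders):

[code]const char *files[2] = {"run01.root", "run02.root"}; // ... one entry per 2 GB file
for (int k = 0; k < 2; k++) {
   TFile f(files[k]);
   TTree *tree = (TTree*)f.Get("scopeTree");          // placeholder tree name
   TBranch *b_voltagesI = tree->GetBranch("VoltageI");
   // ... loop over the entries and FFT each trace, as in the code above ...
}  // f goes out of scope here, so only one file is open at a time[/code]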

If I comment out the following 4 lines from the code, the memory usage tops out at 41.5k regardless of how many scope traces I process:

[code]TVirtualFFT::SetTransform(0);
hm = hscopetrace->FFT(hm, "MAG"); // Do the FFT

for (Int_t j = 0; j < (int)NBins; j++)
   hm_tot->SetBinContent(j, hm_tot->GetBinContent(j) + hm->GetBinContent(j)); // Keep summing results.[/code]

If I put the lines back into the code, the memory usage keeps growing until I am using 873k after 9000 scope traces. The crash with the memory-related error occurs shortly after that.
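For what it's worth, the growth could also be tracked from inside the loop with ROOT's own process info (a sketch; I believe the sizes are reported in KB):

[code]ProcInfo_t pinfo;
gSystem->GetProcInfo(&pinfo); // fill pinfo with the current process statistics
printf("trace %d: resident memory = %ld KB\n", i, (long)pinfo.fMemResident);[/code]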

Thanks,
Penny

[quote]until I am using 873k after 9000 scope traces. The crash with the memory-related error occurs shortly after that.[/quote]Are you sure of the unit? Less than 1 MB is not a lot of memory and should not cause a memory-exhaustion problem (there might still be a memory-overwrite problem, though). Even if you are nearing 873 MB, this should not really be a problem (unless the memory is somehow doubled each time).

Could you try your example with valgrind?

Philippe.

Hi Philippe,

I think I meant “K” instead of ‘k’, according to my Windows Task Manager. So I think I am getting toward 800+ MB of memory at the crash, and I have 2 GB of RAM on my laptop. Even if 800 MB does not seem like a lot, it does increase steadily with the number of scope traces, and this seems eventually to cause a problem.

I am not sure what valgrind is, but I could possibly take a look at it later.

Thanks,
Penny

Hi Penny,

valgrind is a memory-debugging tool for Linux/macOS (see valgrind.org).
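A typical invocation might look like this (a sketch; ROOT ships a suppression file that cuts down on false positives, and the macro name is yours):

[code]valgrind --leak-check=full --suppressions=$ROOTSYS/etc/valgrind-root.supp \
    root.exe -b -q readrootfile.C[/code]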

[quote]it does seem to increase steadily with the number of scope traces and this seems to eventually cause a problem.[/quote]Can you provide a running example (including data files) showing the problems?

Thanks,
Philippe.

Hi Philippe,
I am attaching some code with a relatively small data file. Thanks again for your replies.
Regards,
Penny
Jun10-12-45-25_16bit.root (1.96 MB)
readrootfile.C (1.47 KB)
ScopeClass.C (3.93 KB)

Hi,

I strongly recommend that you compile your code (in particular the class ScopeClass). For example you could use:

[code]gROOT->ProcessLine(".L ScopeClass.C+");[/code]

In particular, it should point out at least the following (fatal in my case) problem:

[code]Double_t voltages[20000]; // Must be 2x as large as MemDepth
.....
delete voltages;[/code]

Calling delete on a stack array like this is fatal (and even a heap array would need delete[], not plain delete).
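A sketch of the two valid alternatives:

[code]{
   // Option 1: automatic (stack) array -- freed automatically, no delete at all.
   Double_t voltages[20000];
   // ... use voltages ...
}
{
   // Option 2: heap array -- must be paired with delete[] (note the brackets).
   Double_t *voltages = new Double_t[20000];
   // ... use voltages ...
   delete [] voltages;
}[/code]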

Cheers,
Philippe.

Hi,

The memory leak itself comes from:

[code]TVirtualFFT::SetTransform(0);[/code]

which does NOT delete the existing transform. To also delete the existing one, do:

[code]delete TVirtualFFT::GetCurrentTransform();
TVirtualFFT::SetTransform(0);[/code]
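Applied to your loop, the change would look roughly like this (sketch):

[code]for (Int_t i = 0; i < nentries; i++)
{
   // ... (re)fill hscopetrace as before ...

   TH1 *hm = 0;
   delete TVirtualFFT::GetCurrentTransform(); // free the transform left over from the previous FFT
   TVirtualFFT::SetTransform(0);              // now TH1::FFT creates a fresh one
   hm = hscopetrace->FFT(hm, "MAG");

   // ... sum hm into hm_tot as before, then ...
   delete hm;
}[/code]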

Cheers,
Philippe.

Hi Philippe,
Thank you so much.
Best regards,
Penny