I was wondering whether it is possible to use RooFit to generate toy data from simulated data, using the simulated data points themselves as the basis for the toys. I know a RooHistPdf can do something roughly similar, but is what I am describing possible?
Hi @StephanH, thanks for your reply. Sorry if I wasn’t too clear in my original post. I guess I want to generate ‘new’ toy data points based on the old data points, but these new data points would statistically have the same empirical form as the original data. I think maybe bootstrapping is what I am looking for?
It might actually be easier:
If you are happy with having a weighted dataset and just want to wiggle the relative probabilities, just re-throw the weights: for each weight w, generate w' = random_poisson(w).
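In case it helps, here is a rough sketch of the weight re-throwing in plain Python. It does not touch RooFit at all: the weights are assumed to have been read out into an ordinary list already, and `poisson` and `rethrow_weights` are names I made up for illustration. The Poisson draw uses Knuth's simple multiplication method, which is only sensible for smallish weights.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; meant for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def rethrow_weights(weights, seed=None):
    """Replace each weight w by a fresh Poisson(w) draw (hypothetical helper)."""
    rng = random.Random(seed)
    return [poisson(w, rng) for w in weights]


if __name__ == "__main__":
    # Original weights 3, 2, 1; every call with a new seed gives a new toy.
    print(rethrow_weights([3, 2, 1], seed=42))
```

The new weights would then be attached to the same events as before, so the toy is again a weighted dataset.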
If you want an unweighted dataset instead, retrieve all weights and build their cumulative sum. For weights 3, 2, 1, that would be 3, 5, 6.
Now throw a random number between 0 and 6 and find the interval it falls into. Say you get a 1: that's index 0, since it's smaller than 3. A 4 is index 1, since it is larger than 3 but smaller than 5, and a 5.3 is index 2.
Draw the event at this index from the original dataset and put it in the new dataset. Repeat this for as many events as you want in the toy.
You will get an (unweighted!) dataset where events pop up as often as their weights in the original dataset dictate.
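The cumulative-weight recipe above could look roughly like this in plain Python. Again this is only a sketch under assumptions: events and weights are taken to be ordinary lists already extracted from the dataset, weights are assumed non-negative with a positive sum, and `find_index` and `resample_by_weight` are illustrative names, not RooFit functions.

```python
import bisect
import itertools
import random


def find_index(cumulative, x):
    """Index of the interval of the cumulative weights that x falls into."""
    index = bisect.bisect_right(cumulative, x)
    # Guard against x landing exactly on the upper edge of the last interval.
    return min(index, len(cumulative) - 1)


def resample_by_weight(events, weights, n_draws, seed=None):
    """Draw n_draws events with replacement, with probability proportional to weight."""
    rng = random.Random(seed)
    cumulative = list(itertools.accumulate(weights))  # 3, 2, 1 -> 3, 5, 6
    total = cumulative[-1]
    resampled = []
    for _ in range(n_draws):
        x = rng.random() * total  # uniform between 0 and the sum of weights
        resampled.append(events[find_index(cumulative, x)])
    return resampled


if __name__ == "__main__":
    toy = resample_by_weight(["a", "b", "c"], [3, 2, 1], n_draws=10, seed=1)
    print(toy)
```

With the numbers from the example, `find_index([3, 5, 6], 1)` gives 0, `find_index([3, 5, 6], 4)` gives 1 and `find_index([3, 5, 6], 5.3)` gives 2. The binary search keeps each lookup cheap even for large datasets.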