Request for a way to define publicly shared datasets

Hi PROOF expert,

Currently, in ROOT-5.16.00-PROOF.00, a user can define his own dataset with the function CreateDataSet and access it with GetDataSet. However, the dataset is stored in kProof_WorkDir + username + kProof_DataSetDir, so other users have to explicitly specify the username of whoever created the dataset, e.g. GetDataSet("~username/datasetname"). It would be desirable to be able to create shared datasets that all users can access simply with GetDataSet("datasetname").

Would you please implement such a feature in CreateDataSet, with a corresponding change to GetDataSet? GetDataSet would search for a dataset in the user's private area first and, if it is not found there, search the publicly shared area.
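To make the requested lookup order concrete, here is a minimal standalone sketch. It is not PROOF code: the DataSetArea type and the FindDataSet function are hypothetical names invented for illustration, and the two areas are modelled as in-memory maps rather than directories under kProof_WorkDir.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a dataset area: dataset name -> description.
// Real PROOF keeps datasets on disk; that is not modelled here.
using DataSetArea = std::map<std::string, std::string>;

// Search the caller's private area first, then fall back to the shared
// public area. Returns nullptr if the name is found in neither.
const std::string* FindDataSet(const DataSetArea& privateArea,
                               const DataSetArea& publicArea,
                               const std::string& name) {
    auto it = privateArea.find(name);
    if (it != privateArea.end()) return &it->second;

    it = publicArea.find(name);
    if (it != publicArea.end()) return &it->second;

    return nullptr;
}
```

With this ordering, a private dataset always wins over a public one of the same name.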

–Shuwei

Hi Shuwei,

Thanks for the suggestion. We are already working on a redesign of the data set related functions, and access to public data sets will be improved. Groups, which were introduced to PROOF recently, also need common data sets of their own.
There will be a search function. Data sets could also be public by default. Let’s assume that the search would return the user’s private data sets first, then the group ones and then the global public ones. In that case what you are suggesting is that GetDataSet should run the search and return the first data set without checking what else was found.
I think that the name should uniquely identify the result of GetDataSet. Otherwise, for instance, a request for a global public data set could be auto-magically substituted by some group data set with the same name!
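The concern can be illustrated with a small standalone sketch. It is not PROOF code: the Catalog type, the ScopesHolding function and the scope labels "private", "group" and "public" are assumptions made for this example. Instead of returning the first match, the function reports every scope that holds the name, in the search order described above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: scope label -> names of the datasets in that scope.
using Catalog = std::map<std::string, std::set<std::string>>;

// Return every scope that holds `name`, in the assumed search order
// private, group, public. More than one hit means the bare name is
// ambiguous and a first-match lookup would hide the later scopes.
std::vector<std::string> ScopesHolding(const Catalog& catalog,
                                       const std::string& name) {
    static const char* const kSearchOrder[] = {"private", "group", "public"};
    std::vector<std::string> hits;
    for (const char* scope : kSearchOrder) {
        auto it = catalog.find(scope);
        if (it != catalog.end() && it->second.count(name) > 0)
            hits.push_back(scope);
    }
    return hits;
}
```

If a name exists in both the group and the public scope, a first-match search silently returns the group one, which is exactly the substitution that an explicit prefix would avoid.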

Would GetDataSet("[public/]name") to get the [public] data set called ‘name’ be OK with you?

Cheers,
Jan

Hi Jan,

It is fine with me to use GetDataSet("public/name") to get a public data set. Another feature I would like is to use "~" followed by "/" to refer to the current user, which is not in place in GetDataSet as far as I know.

Thanks,

–Shuwei

Hi Shuwei,

By default you access your own dataset folder, so I see no need to specify ~/ for that.
See also:
root.cern.ch/twiki/bin/view/ROOT/ProofDataSets

Jan

Hi Jan,

I think that we need some clarifications. It is not clear to me how to distinguish between private datasets, group shared datasets and global public datasets without specifying "~/" for private datasets. At least the following page does not explain it:

root.cern.ch/twiki/bin/view/ROOT/ProofDataSets

–Shuwei

Hi Shuwei,

Sorry. The possibility to add group and global datasets is only now being implemented, so it cannot be described in that HowTo.

For private and public datasets it says at the end of that page:
Methods for accessing and verifying the datasets now accept ‘dataSet’ argument of form:

* "[public/]dataSetName" in case of user's own datasets
* "~user/public/dataSetName" in case of other user's datasets

So the distinction is simple. By default a user accesses her own datasets, and ~username/ is added to access another user’s data sets. As I wrote before, the set of functions will be extended.
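As a rough illustration of the two argument forms quoted above, the following standalone sketch splits a ‘dataSet’ string into its parts. It is not how PROOF parses the argument: the DataSetRef struct and the ParseDataSetArg function are hypothetical, and the sketch treats both prefixes as optional without checking that another user's dataset is actually public.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for a parsed 'dataSet' argument.
struct DataSetRef {
    std::string user;  // empty means the current user
    bool isPublic;     // true if the "public/" component was present
    std::string name;  // remaining dataset name
};

// Split "[~user/][public/]dataSetName" into its components.
DataSetRef ParseDataSetArg(const std::string& arg) {
    DataSetRef ref{"", false, ""};
    std::string rest = arg;

    // Optional "~user/" prefix selects another user's area.
    // A leading '~' without any '/' is left untouched in the name.
    if (!rest.empty() && rest[0] == '~') {
        const std::string::size_type slash = rest.find('/');
        if (slash != std::string::npos) {
            ref.user = rest.substr(1, slash - 1);
            rest = rest.substr(slash + 1);
        }
    }

    // Optional "public/" component.
    const std::string kPublic = "public/";
    if (rest.compare(0, kPublic.size(), kPublic) == 0) {
        ref.isPublic = true;
        rest = rest.substr(kPublic.size());
    }

    ref.name = rest;
    return ref;
}
```

Under these assumptions, "dataSetName" and "public/dataSetName" both leave the user field empty (the current user), while "~user/public/dataSetName" yields the user "user", the public flag set and the name "dataSetName".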

Jan