TMVA and categorical variables


I would like to use TMVA to analyse a large data set that consists mostly of categorical variables, because none of the other programs I have tried can handle the amount of data.

The data set has around 15 attributes, at least 4 of which are categorical with more than 1000 distinct values. I tried binary encoding (for each categorical column with n distinct values, create n integer columns holding 0 or 1), but I constantly get errors saying that some row is constant, and the files become huge.
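To make clear what I mean by binary encoding, here is a minimal sketch in plain Python. It is not my actual preprocessing and it does not use TMVA; the function names `one_hot_encode` and `constant_columns` and the toy colour data are made up for illustration. It also shows one way a constant column can appear (encoding only part of the data against the full category list), although I do not know whether that is what triggers the error I see.

```python
def one_hot_encode(values, categories=None):
    """Turn a list of category labels into rows of 0/1 integers.

    Hypothetical helper: one output column per distinct category.
    """
    if categories is None:
        categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one column is set per row
        rows.append(row)
    return categories, rows


def constant_columns(rows):
    """Return the indices of columns that hold the same value in every row."""
    if not rows:
        return []
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if len({row[j] for row in rows}) == 1]


# Toy data: one categorical column with n = 3 distinct values.
colours = ["red", "green", "blue", "green"]
cats, encoded = one_hot_encode(colours)
# cats    == ['blue', 'green', 'red']
# encoded == [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]

# Encoding only the first two rows against the full category list:
# 'blue' never occurs in this subset, so its column is all zeros.
_, subset_rows = one_hot_encode(colours[:2], categories=cats)
print(constant_columns(subset_rows))  # [0]
```

Every distinct value adds one column, so a categorical column with more than 1000 values turns into more than 1000 integer columns, which is why the files get so large.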

Is there a better way to handle categorical data for machine learning in TMVA? (I would like to use naive Bayes, random forest and SVM.) Integer encoding might be an option, but there is no order within the categories. Any help is appreciated!
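For comparison, this is what I mean by integer encoding, again as a rough plain-Python sketch with a made-up helper name (`integer_encode`) and toy data, not TMVA code:

```python
def integer_encode(values):
    """Map each distinct category label to an integer code.

    Hypothetical helper: codes follow the sorted order of the labels,
    which is arbitrary and carries no meaning.
    """
    categories = sorted(set(values))
    code = {cat: i for i, cat in enumerate(categories)}
    return code, [code[value] for value in values]


colours = ["red", "green", "blue", "green"]
code, encoded = integer_encode(colours)
# code    == {'blue': 0, 'green': 1, 'red': 2}
# encoded == [2, 1, 0, 1]
```

This keeps a single column per attribute, but the codes suggest that blue < green < red, which is only an artefact of how the labels were sorted. That artificial order is what worries me.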