Using TMultiLayerPerceptron

Hi (Christophe?)

I wish to use TMultiLayerPerceptron to perform a classification. In the mlpHiggs tutorial, the output of the NN is a 0 or 1 corresponding to signal or background. Is it possible to use the NN to classify into more than two outcomes?

The data I am looking at contains 14 dimensions, and I am hoping to be able to build a classifier to distinguish the data into 4 types. I have read all the other forum posts and documentation and can’t tell if this has already been answered.

many thanks

Hi Peter,

Yes, there is no constraint on the number of classes.
There are several ways to achieve that:

  1. use a single output neuron and train the network to 0, 1, 2, 3, then pick the closest value.

  2. use four output neurons and train the outputs to (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1). Then, when you have an output (a,b,c,d), pick the most probable hypothesis.

  3. use three output neurons and train the network to the vertices of a tetrahedron: (1,1,1), (0,0,1), (0,1,0), (1,0,0), and for each output (a,b,c), pick the closest vertex.

Method 1 is probably the worst, since it implies an ordering of the categories. It only makes sense if 0 looks much like 1, less like 2, and even less like 3; if 1 looks like 0 and 2 but not like 3; and so on.

Method 2 is the most flexible. It is the method used in some character-recognition algorithms.

Method 3 has the advantage of a geometrical interpretation: you can plot the outputs as a cloud in 3D space and cut around one vertex to select one category.
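The decoding step for methods 2 and 3 can be sketched in plain Python (a hypothetical illustration of the idea only; the actual network would be a ROOT TMultiLayerPerceptron, and the function names here are made up):

```python
# Method 2: four output neurons trained to (1,0,0,0) ... (0,0,0,1).
# Classify by picking the index of the largest activation
# ("the most probable hypothesis").
def classify_one_hot(output):
    return max(range(len(output)), key=lambda i: output[i])

# Method 3: three output neurons trained to the tetrahedron vertices.
TETRAHEDRON = [(1, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)]

# Classify an output (a,b,c) by picking the closest vertex
# (squared Euclidean distance is enough for comparison).
def classify_tetrahedron(output):
    def dist2(vertex):
        return sum((o - v) ** 2 for o, v in zip(output, vertex))
    return min(range(len(TETRAHEDRON)), key=lambda i: dist2(TETRAHEDRON[i]))
```

For example, a four-neuron output of (0.1, 0.7, 0.15, 0.05) is decoded as class 1, and a three-neuron output of (0.9, 0.2, 0.1) lands closest to the vertex (1,0,0), i.e. class 3.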

Cheers,
Christophe.

Hi Christophe,

Thanks for coming back to me so quickly - that helped a lot, and I can now do what I wanted to do :slight_smile:

cheers
Peter

Hi Christophe,

I have another quick question if you don’t mind. I’ve attached an image showing the neural net training over 1000 iterations. As you can see, the training and test sample errors diverge over time. Should this be interpreted as overtraining on the training sample, or as a systematic difference between the training and test samples?

For reference, I’ve also attached the output of DrawNetwork for my four output neurons - as you can see, the separation actually looks quite nice.

Many thanks again




Hi,

I would still be most appreciative of an interpretation of the diverging error plot.

many thanks

Hi,

This is almost impossible to answer - if there is a systematic difference between your training and test samples, then it could cause what you see. However, I doubt that the gap between the test sample’s error and the training sample’s error would grow as steeply as it does for you: the net is not influenced by the test sample, so its error should not increase that much. In other words, I would assume that the net is overtrained.
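The usual remedy when the test error turns back up while the training error keeps falling is early stopping: keep the weights from the epoch where the test error was lowest. TMultiLayerPerceptron does not provide this as a built-in helper, so here is a hedged plain-Python sketch of the selection rule, with a made-up error curve for illustration:

```python
# Scan a list of per-epoch test errors and return the epoch with the
# lowest error, abandoning the scan once the error has failed to
# improve for `patience` consecutive epochs (a simple early-stopping rule).
def best_stopping_epoch(test_errors, patience=5):
    best_epoch, best_err = 0, test_errors[0]
    for epoch, err in enumerate(test_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop scanning
    return best_epoch

# Synthetic curve shaped like a diverging test error: it falls, bottoms
# out, then rises again while the training error would keep dropping.
test_err = [1.0, 0.8, 0.6, 0.5, 0.45, 0.44, 0.46, 0.5, 0.55, 0.6, 0.7, 0.8]
```

On this synthetic curve the rule picks epoch 5 (error 0.44) - training beyond that point only memorizes the training sample.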

Cheers, Axel.