An ordinal classification problem (confusingly, also called ordinal regression) is one where the goal is to predict a class label in situations where the labels have an ordering. For example, you might want to predict the price of a house, based on things like area in sq. feet, where the house price in the training data is 0 = low, 1 = medium, 2 = high, 3 = very high.

You could just use regular neural classification techniques, but that doesn't take advantage of the ordering information in the data. Put differently, if a true class label is 2 = high, the error for a prediction of 0 = low should be greater than the error for a prediction of 1 = medium.

For ordinal classification, I use a technique that I haven't seen described anywhere else. But the idea is obvious so maybe the technique is used under some fancy name. If the training data has ordinal class labels like 0, 1, 2, 3 then I convert them to float targets of 0.125, 0.375, 0.625, 0.875. I create a neural network that emits a single numeric value between 0.0 and 1.0 and use mean squared error to compare a computed output with the associated float target. If you think this through, you'll see how the ordering information is used.
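To make the "think this through" step concrete, here is a minimal NumPy sketch of how squared error against the float targets penalizes a far-off prediction more than a near miss. The helper name `squared_error` is just for illustration and is not part of the demo program.

```python
import numpy as np

k = 4                               # number of ordinal classes
labels = np.array([0, 1, 2, 3])     # low, medium, high, very high

# Midpoint of each of k equal-width bins on [0, 1]
targets = (labels + 0.5) / k        # 0.125, 0.375, 0.625, 0.875

def squared_error(output, label):
    # Squared difference between a network output and the label's float target
    return (output - targets[label]) ** 2

true_label = 2                                  # "high", target 0.625
near = squared_error(targets[1], true_label)    # network output lands on "medium"
far = squared_error(targets[0], true_label)     # network output lands on "low"
print(near, far)
```

Being off by one class costs 0.0625 while being off by two classes costs 0.25, so training pushes outputs toward the correct bin and, failing that, toward a neighboring bin.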

I recently upgraded my Keras code library to version 2.6 and so I figured I'd code up a demo of ordinal classification using that version. I generated a 200-item set of synthetic training data that looks like:

  -1   0.1275   0   1   0   2   0   0   1
   1   0.1100   1   0   0   3   1   0   0
  -1   0.1375   0   0   1   0   0   1   0
   1   0.1975   0   1   0   2   0   0   1
  . . .

Each item is a house. The first column is air conditioning, the second column is area in square feet (divided by 10,000), the next three columns are one-hot encoded style (1,0,0 = art_deco, 0,1,0 = bungalow, 0,0,1 = colonial), the next column is price (0 = low, 1 = medium, 2 = high, 3 = very high), and the last three columns are local school (1,0,0 = johnson, 0,1,0 = kennedy, 0,0,1 = lincoln).
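Because the price label sits in the middle of each row, the predictors have to be picked out from both sides of it. Here is a rough sketch of one way to do that with NumPy. It reads four sample rows from an in-memory string and assumes whitespace-delimited values; the real demo reads from a file whose name and delimiter aren't shown here.

```python
import io
import numpy as np

# Four sample rows in the column layout described above.
# Whitespace-delimited text is an assumption for this sketch.
sample = io.StringIO(
    "-1 0.1275 0 1 0 2 0 0 1\n"
    " 1 0.1100 1 0 0 3 1 0 0\n"
    "-1 0.1375 0 0 1 0 0 1 0\n"
    " 1 0.1975 0 1 0 2 0 0 1\n"
)
all_xy = np.loadtxt(sample, dtype=np.float32)

# Column 5 is the ordinal price label; everything else is a predictor
x_cols = [0, 1, 2, 3, 4, 6, 7, 8]
x = all_xy[:, x_cols]
y_labels = all_xy[:, 5].astype(np.int64)

print(x.shape)      # 4 rows, 8 predictor columns
print(y_labels)     # ordinal labels, still integers at this point
```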

The key to the ordinal classification technique I use is mapping ordinal labels to float targets. For k = 4 classes, the idea can be explained graphically:

    0------------1------------2------------3------------4
  0.00         0.25         0.50         0.75         1.00
        0.125        0.375        0.625        0.875

There are 4 bins, one for each class label. The float targets are the midpoints of the bins when the total range is normalized to 1.0. A function to compute targets for ordinal classification is:

  import numpy as np

  def make_float_targets(k):
    targets = np.zeros(k, dtype=np.float32)
    start = 1.0 / (2 * k)  # like 0.125
    delta = 1.0 / k        # like 0.250
    for i in range(k):
      targets[i] = start + (i * delta)
    return targets
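Going the other direction, a trained network's output has to be mapped back to a class label. One plausible way to do it, not necessarily what the demo program does, is to find which of the k bins the output falls into. The function name `output_to_label` is hypothetical.

```python
import numpy as np

def output_to_label(outputs, k):
    # Bin index is floor(output * k). The clip keeps an output of
    # exactly 1.0 in the top bin instead of a nonexistent bin k.
    outputs = np.asarray(outputs, dtype=np.float64)
    return np.clip(np.floor(outputs * k), 0, k - 1).astype(np.int64)

preds = output_to_label([0.05, 0.30, 0.62, 0.99, 1.0], k=4)
print(preds)
```

With k = 4, any output in [0.50, 0.75) maps to class 2, which is the bin whose midpoint is the 0.625 target.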

I coded up a demo using Keras 2.6 without too much trouble, other than the usual glitches that happen with any neural system. I noticed that when I computed classification accuracy, using an item-by-item approach was brutally slow. I suspect this is because there is a lot of conversion between NumPy arrays and Keras/TensorFlow tensors. Anyway, I wrote an accuracy function that uses a set approach, processing all items at once.
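A set-based accuracy function might look something like the sketch below. It is a guess at the general shape, not the code from the demo. It assumes the network outputs for all items were already computed in one call (for example, one `model.predict(x)` call in Keras) and arrive as an array with one value per item. The name `accuracy_set` is made up for this example.

```python
import numpy as np

def accuracy_set(outputs, labels, k):
    # outputs: one value in [0, 1] per item, shape (n,) or (n, 1)
    # labels:  integer class labels 0..k-1, one per item
    outputs = np.asarray(outputs, dtype=np.float64).reshape(-1)
    labels = np.asarray(labels, dtype=np.int64).reshape(-1)

    lo = labels / k               # lower edge of each item's correct bin
    hi = (labels + 1) / k         # upper edge of each item's correct bin
    is_top = labels == k - 1      # top bin also accepts an output of exactly 1.0

    # An item is correct if its output lies inside its label's bin
    correct = (outputs >= lo) & ((outputs < hi) | is_top)
    return float(np.mean(correct))

# Stand-in values for what a single batch prediction might return
outputs = np.array([[0.10], [0.40], [0.70], [0.90]])
labels = np.array([0, 1, 3, 3])
acc = accuracy_set(outputs, labels, k=4)
print(acc)
```

All the comparisons run on whole arrays, so there is only one round trip between the network and NumPy instead of one per item.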

Good fun. Neural network technologies have advanced quickly, but are still relatively crude. When more powerful computing engines become available (probably via quantum computing), neural networks will do things that are impossible to imagine today.



Advances in aircraft engines enabled amazing performance improvements in just a few years. Left: The British S.E.5a (1917) had a top speed of 130 mph. Center: Just 20 years later, the British Spitfire Mk I (1937) had a top speed of 360 mph. Right: Just 20 years later, the U.S. Vought F-8 Crusader (1957) had a top speed of 1,200 mph.


Code and data below. Long.