Trendlist: [New post] A Simplified Approach for Ordinal Classification

Friday, September 10, 2021

A Simplified Approach for Ordinal Classification

by jamesdmccaffrey

In a standard classification problem, the goal is to predict a class label. For example, in the Iris Dataset problem, the goal is to predict a species of flower: 0 = "setosa", 1 = "versicolor", 2 = "virginica". Here the class labels are just labels without any meaning attached to the order. In an ordinal classification problem (also called ordinal regression), the class labels have order. For example, you might want to predict the median price of a house in one of 506 towns, where price can be 0 = very low, 1 = low, 2 = medium, 3 = high, 4 = very high. For an ordinal classification problem, you could just use standard classification, but that approach doesn't take advantage of the ordering information in the training data.

I coded up a demo of a simple technique using the PyTorch code library. The same technique can be used with Keras/TensorFlow too.

I used a modified version of the Boston Housing dataset. There are 506 data items. Each item is a town near Boston. There are 13 predictor variables — crime rate in town, tax rate in town, proportion of Black residents in town, and so on. The original Boston dataset contains the median price of a house in each town, divided by $1,000 — like 35.00 for $35,000 (the data is from the 1970s when house prices were low). To convert the data to an ordinal classification problem, I mapped the house prices like so:

         price          class  count
  [$0      to $10,000)    0      24
  [$10,000 to $20,000)    1     191
  [$20,000 to $30,000)    2     207
  [$30,000 to $40,000)    3      53
  [$40,000 to $50,000]    4      31
                                ---
                                506
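The binning in the table can be sketched in a few lines of Python. The function name `price_to_class` is hypothetical, not taken from the demo, and the sketch assumes prices are given in the dataset's units of $1,000 (like 35.00), with the top bin closed at $50,000 as the table shows.

```python
def price_to_class(price_k):
    """Map a median price in $1,000s (e.g. 35.00) to an ordinal class 0..4."""
    if price_k < 0.0 or price_k > 50.0:
        raise ValueError("price outside the [0, 50] range of the table")
    # Each bin is 10 units wide; the top bin is closed, so 50.0 maps to class 4.
    return min(int(price_k // 10), 4)

print([price_to_class(p) for p in (5.0, 10.0, 24.3, 35.0, 50.0)])  # [0, 1, 2, 3, 4]
```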

I normalized the numeric predictor values by dividing by a constant so that each normalized value is between -1.0 and +1.0. I encoded the single Boolean predictor value (does town border the Charles River) as -1 (no), +1 (yes).
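As a rough sketch of that preprocessing, the helpers below divide a raw value by a per-column constant and encode the Boolean column as -1 or +1. The function names and the divisor of 100.0 are made up for illustration; the post doesn't list the constants actually used.

```python
def normalize(value, divisor):
    """Scale a raw predictor value by a fixed per-column constant."""
    return value / divisor

def encode_river(borders_river):
    """Encode the Charles River flag as -1.0 (no) or +1.0 (yes)."""
    return 1.0 if borders_river else -1.0

# Hypothetical divisor: a column whose raw values never exceed 100.
print(normalize(65.2, 100.0), encode_river(False), encode_river(True))
```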

The technique I used for ordinal classification is something I invented myself, at least as far as I know. I've never seen the technique I used anywhere else, but it's not too complicated and so it could exist under an obscure name of some sort.

For the modified Boston Housing dataset there are k = 5 classes. The class target values in the training data are (0, 1, 2, 3, 4). My neural network system outputs a single numeric value between 0.0 and 1.0 — for example 0.2345. The class target values of (0, 1, 2, 3, 4) generate associated floating point sub-targets of (0.1, 0.3, 0.5, 0.7, 0.9). When I read the data into memory as a PyTorch Dataset object, I map each ordinal class label to the associated floating point target. Then I use standard MSELoss() to train the network.
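The sub-targets (0.1, 0.3, 0.5, 0.7, 0.9) are the midpoints of k = 5 equal-width intervals on [0, 1], so class i maps to (2i + 1) / (2k). A minimal sketch of that mapping for any k follows. The function name `class_to_target` is hypothetical, and the midpoint formula is inferred from the five values above rather than stated in the post.

```python
def class_to_target(label, k):
    """Return the midpoint of the label-th of k equal intervals on [0, 1]."""
    if not 0 <= label < k:
        raise ValueError(f"label {label} is not in 0..{k - 1}")
    return (2 * label + 1) / (2 * k)

print([class_to_target(i, 5) for i in range(5)])  # [0.1, 0.3, 0.5, 0.7, 0.9]
```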

Suppose a data item has class label = 3 (high price). The target value for that item is stored as 0.7. The computed predicted price will be something like 0.66 (close to target, so low MSE error and a correct prediction) or maybe 0.23 (far from target, so high MSE error and a wrong prediction). With this scheme, the ordering information is used.
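To score a prediction as correct or wrong, the network output has to be mapped back to a class label. The post doesn't show that step, so the sketch below assumes the natural choice: pick the class whose sub-target is nearest, which (apart from ties at the interval boundaries) is the same as finding which of the k equal-width intervals the output falls in. The function name `output_to_class` is hypothetical.

```python
def output_to_class(output, k):
    """Map a network output in [0, 1] to the class whose interval contains it."""
    # Interval i covers [i/k, (i+1)/k); clamp so that 1.0 lands in class k-1.
    return max(0, min(int(output * k), k - 1))

print(output_to_class(0.66, 5))  # 3, matches the label of a class-3 item
print(output_to_class(0.23, 5))  # 1, so a class-3 item would be misclassified
```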

For implementation, most of the work is done inside the Dataset object:

  import numpy as np
  import torch as T

  # device is not defined in this excerpt; CPU is assumed here
  device = T.device("cpu")

  class BostonDataset(T.utils.data.Dataset):
    # features are in cols [0,12], median price as int in [13]

    def __init__(self, src_file, k):
      # k is for class_to_target_program()
      tmp_x = np.loadtxt(src_file, usecols=range(0,13),
        delimiter="\t", comments="#", dtype=np.float32)
      tmp_y = np.loadtxt(src_file, usecols=13,
        delimiter="\t", comments="#", dtype=np.int64)

      n = len(tmp_y)
      float_targets = np.zeros(n, dtype=np.float32)  # 1D

      for i in range(n):  # hard-coded is easy to understand
        if tmp_y[i] == 0: float_targets[i] = 0.1
        elif tmp_y[i] == 1: float_targets[i] = 0.3
        elif tmp_y[i] == 2: float_targets[i] = 0.5
        elif tmp_y[i] == 3: float_targets[i] = 0.7
        elif tmp_y[i] == 4: float_targets[i] = 0.9
        else: print("Fatal logic error ")

      float_targets = np.reshape(float_targets, (-1,1))  # 2D

      self.x_data = \
        T.tensor(tmp_x, dtype=T.float32).to(device)
      self.y_data = \
        T.tensor(float_targets, dtype=T.float32).to(device)

    def __len__(self):
      return len(self.x_data)

    def __getitem__(self, idx):
      preds = self.x_data[idx]  # all cols
      price = self.y_data[idx]  # all cols
      return (preds, price)     # tuple of two matrices

There are a few minor but very tricky details. They'd take much too long to explain in a blog post, so I'll just say that if you're interested, examine the code very carefully.

I don't think it's possible to assign a strictly numeric value to art. Here are two clever illustrations by artist Casimir Lee. I like the bright colors and combination of 1920s art deco style with 1960s psychedelic style.