I have had interesting discussions about distinction between static vs. dynamic classes with Kishore and Hal.
The distinction arises in multiclass prediction settings. A static set of classes is given by a set of labels {1,…,k} and the goal is generally to choose the most likely label given features. The static approach is the one that we typically analyze and think about in machine learning.
The dynamic setting is one that is often used in practice. The basic idea is that the number of classes is not fixed, varying on a per example basis. These different classes are generally defined by a choice of features.
The distinction between these two settings as far as theory goes, appears to be very substantial. For example, in the static setting, in learning reductions land, we have techniques now for robust O(log(k)) time prediction in many multiclass setting variants. In the dynamic setting, the best techniques known are O(k), and furthermore this exponential gap may be essential, at least without further assumptions.
Are there techniques for converting from dynamic multiclass to static multiclass? For example, we could embed a dynamic set of classes within a much larger static set ranging over all possible dynamic classes while eliminating all class-dependent features. In some cases, this approach may work well, but I’ve also seen it fail, with the basic problem being that a learning algorithm might easily choose an invalid class. We could of course force a learning algorithm to choose amongst the dynamically valid set, but I don’t know a general way to do that without making the running time at least scale with the number of valid classes.
So, a basic question that’s bothering me is: When and how can we effectively predict amongst a set of dynamically defined classes in sublinear time? A quick answer is “it’s not possible because simply reading off the set of dynamically defined classes require O(class count) time”. This answer isn’t satisfying, because there are many ways to implicitly specify a set in sublinear time. So the modified question is “Are there natural ways to dynamically define classes in sublinear time? And can these be used for sublinear prediction?”