Scaling Up Ordinal Embedding: A Landmark Approach Ordinal Embedding is the problem of placing n objects into R^d to satisfy constraints like "object a is closer to b than to c." It can accommodate data that embeddings from features or distances cannot, but is a more difficult problem. We propose a novel landmark-based method as a partial solution. At small to medium scales, we present a novel combination of existing methods with some new theoretical justification. For very large values of n optimizing over an entire embedding breaks down, so we propose a novel method which first embeds a subset of m << n objects and then embeds the remaining objects independently and in parallel. We prove a distance error bound for our method in terms of m and that it has O(dn log m) time complexity, and show empirically that it is able to produce high quality embeddings in a fraction of the time needed for any published method. Learning to select for a predefined ranking In this paper, we formulate a novel problem of learning to select a set of items maximizing the quality of their ordered list, where the order is predefined by some explicit rule. Unlike the classic information retrieval problem, in our setting, the predefined order of items in the list may not correspond to their quality in general. For example, this is a dominant scenario in personalized news and social media feeds, where items are ordered by publication time in a user interface. We propose new theoretically grounded algorithms based on direct optimization of the resulting list quality. Our offline and online experiments with a large-scale product search engine demonstrate the overwhelming advantage of our methods over the baselines in terms of all key quality metrics. Mallows ranking models: maximum likelihood estimate and regeneration This paper is concerned with various Mallows ranking models. We study the statistical properties of the MLE of Mallows' ϕ model. We also make connections of various Mallows ranking models, encompassing recent progress in mathematics. Motivated by the infinite top-t ranking model, we propose an algorithm to select the model size t automatically. The key idea relies on the renewal property of such an infinite random permutation. Our algorithm shows good performance on several data sets. Fast and Stable Maximum Likelihood Estimation for Incomplete Multinomial Models We propose a fixed-point iteration approach to the maximum likelihood estimation of the incomplete multinomial model, which provides a unified framework for analyzing ranking data. Incomplete observations cannot be distinguished as belonging to a unique category, but instead they fall in a subset of categories. We develop an minorize-maximization (MM) type of algorithm, which requires relatively fewer iterations and better time efficiency to achieve convergence. Under such a general framework, incomplete multinomial models can be reformulated to include several well-known ranking models as special cases, such as the Bradley--Terry, Plackett--Luce models and their variants. Experimental results show that our algorithm runs faster than existing methods on synthetic data and real data. The simple form of iteratively updating equations in our algorithm involves only basic matrix operations, which makes it easy to implement with large data. Fast Algorithm for Generalized Multinomial Models with Ranking Data We develop a framework of the generalized multinomial model, which includes both the popular Plackett--Luce model and Bradley--Terry model as special cases. We theoretically prove that the maximum likelihood estimator (MLE) under the generalized multinomial model corresponds to the stationary distribution of an inhomogeneous Markov chain uniquely. Based on this Markov chain, we propose an iterative algorithm that is easy to implement and interpret and certain to converge. Numerical experiments on synthetic data and real data demonstrate the advantages of our Markov chain based algorithm over existing ones that it converges to the MLE with fewer iterations and faster convergence rate. The new algorithm is readily applicable for problems such as page ranking, sports ranking data. Graph Resistance and Learning from Pairwise Comparisons We consider the problem of learning the qualities of a collection of items by performing noisy comparisons among them. Following the standard paradigm, we assume there is a fixed comparison graph'' and every neighboring pair of items in this graph is compared k times according to the Bradley-Terry-Luce model (where the probability than an item wins a comparison is proportional the item quality). We are interested in how the relative error in quality estimation scales with the comparison graph in the regime where k is large. We show that, asymptotically, the relevant graph-theoretic quantity is the square root of the resistance of the comparison graph. Specifically, we provide an algorithm with relative error decay that scales with the square root of the graph resistance, and provide a lower bound showing that (up to log factors) a better scaling is impossible. The performance guarantee of our algorithm, both in terms of the graph and the skewness of the item quality distribution, significantly outperforms earlier results. Learning Context-dependent Label Permutations for Multi-label Classification A key problem in multi-label classification is to utilize dependencies among the labels. Chaining classifiers are a simple technique for addressing this problem but current algorithms all assume a fixed, static label ordering. In this work, we propose a multi-label classification approach which allows to choose a dynamic, context-dependent label ordering. Our proposed approach consists of two sub-components: a simple EM-like algorithm which bootstraps the learned model, and a more elaborate approach based on reinforcement learning. Our experiments on three public multi-label classification benchmarks show that our proposed dynamic label ordering approach based on reinforcement learning outperforms recurrent neural networks with fixed label ordering across both bipartition and ranking measures on all the three datasets. As a result, we obtain a powerful sequence prediction-based algorithm for multi-label classification, which is able to efficiently and explicitly exploit label dependencies. Discovering Context Effects from Raw Choice Data Many applications in preference learning assume that decisions come from the maximization of a stable utility function. Yet a large experimental literature shows that individual choices and judgements can be affected by "irrelevant" aspects of the context in which they are made. An important class of such contexts is the composition of the choice set. In this work, our goal is to discover such choice set effects from raw choice data. We introduce an extension of the Multinomial Logit (MNL) model, called the context dependent random utility model (CDM) which allows for a particular class of choice-set effects. We show that the CDM can be thought of as a second-order approximation to a general choice system, can be inferred optimally using maximum likelihood and, importantly, is easily interpretable. We apply the CDM to both real and simulated choice data to perform principled exploratory analyses for the presence of choice set effects. On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference Our goal is for agents to optimize the right reward function, despite how difficult it is for us to specify what that is. Inverse Reinforcement Learning (IRL) enables us to infer reward functions from demonstrations, but it usually assumes that the expert is noisily optimal. Real people, on the other hand, often have systematic biases: risk-aversion, myopia, etc. One option is to try to characterize these biases and account for them explicitly during learning. But in the era of deep learning, a natural suggestion researchers make is to avoid mathematical models of human behavior that are fraught with specific assumptions, and instead use a purely data-driven approach. We decided to put this to the test -- rather than relying on assumptions about which specific bias the demonstrator has when planning, we instead learn the demonstrator's planning algorithm that they use to generate demonstrations, as a differentiable planner. Our exploration yielded mixed findings: on the one hand, learning the planner can lead to better reward inference than relying on the wrong assumption; on the other hand, this benefit is dwarfed by the loss we incur by going from an exact to a differentiable planner. This suggest that at least for the foreseeable future, agents need a middle ground between the flexibility of data-driven methods and the useful bias of known human biases. Learning Distance for Sequences by Learning a Ground Metric Thu Jun 13th 12:15 -- 12:20 PM @ Room 201 in Ranking and Preference Learning » Most existing metric learning methods are developed for vector representations. Learning distances that operate directly on multi-dimensional sequences is challenging because such distances are structural by nature and the vectors in sequences are not independent. Generally, distances for sequences heavily depend on the ground metric between the vectors in sequences. We propose to learn the distance for sequences through learning a ground Mahalanobis metric for the vectors in sequences. The learning samples are sequences of vectors for which how the ground metric between vectors induces the overall distance is given. The objective is that the distance induced by the learned ground metric affects large values for sequences from different classes and small values for those from the same class. We formulate the metric as a parameter of the distance and bring closer each sequence to an associated virtual sequence w.r.t. the distance to reduce the number of constraints. We develop a general iterative solution for any ground-metric-based sequence distance. Experiments on several sequence datasets demonstrate the effectiveness and efficiency of our method.