Dec 2, 2022
We present TabPFN, a trained Transformer model that can do tabular supervised classification for small datasets in less than a second, needs no hyperparameter tuning and is competitive with state-of-the-art classification methods. TabPFN is entailed in the weights of our network, which accepts training and test samples as a set-valued input and yields predictions for the entire test set in a single forward pass. TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to approximate Bayesian inference on synthetic datasets drawn from our prior. Our prior incorporates ideas from causal learning: it entails a large space of structural causal models with a preference for simple structures. Afterwards, the trained TabPFN approximates Bayesian prediction on any unseen tabular dataset, without any hyperparameter tuning or gradient-based learning. On 30 datasets from the OpenML-CC18 suite, we show that our method outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with a 70× speedup. This increases to a 3,200× speedup when a GPU is available. We provide all our code and the trained TabPFN at https://anonymous.4open.science/r/TabPFN-2AEE. We also provide an online demo at https://huggingface.co/spaces/TabPFN/TabPFNPrediction.
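To make the idea of "synthetic datasets drawn from a prior over structural causal models" more concrete, here is a minimal sketch of what one such draw could look like. It is a toy illustration only, not the prior used for TabPFN: the function name `sample_scm_dataset`, its parameters, the tanh nonlinearity, the Gaussian weights and noise, and the median threshold for labels are all assumptions made for this example.

```python
import math
import random


def sample_scm_dataset(n_samples=64, n_nodes=6, n_features=4, edge_prob=0.3, seed=0):
    """Draw one synthetic binary classification dataset from a toy SCM prior.

    Hypothetical helper for illustration; not the TabPFN prior.
    """
    rng = random.Random(seed)

    # Random DAG: node j may only depend on nodes i < j, so the graph is acyclic.
    # A small edge_prob makes sparse (simple) structures more likely.
    weights = {
        (i, j): rng.gauss(0.0, 1.0)
        for j in range(n_nodes)
        for i in range(j)
        if rng.random() < edge_prob
    }

    # Randomly choose which nodes are observed as features and which is the target.
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)
    feature_nodes, target_node = nodes[:n_features], nodes[n_features]

    rows, raw_targets = [], []
    for _ in range(n_samples):
        values = []
        for j in range(n_nodes):
            # Each node is a nonlinear function of its parents plus Gaussian noise.
            parent_sum = sum(weights.get((i, j), 0.0) * values[i] for i in range(j))
            values.append(math.tanh(parent_sum + rng.gauss(0.0, 1.0)))
        rows.append([values[k] for k in feature_nodes])
        raw_targets.append(values[target_node])

    # Threshold the continuous target near its median to get two roughly balanced classes.
    threshold = sorted(raw_targets)[n_samples // 2]
    labels = [int(t >= threshold) for t in raw_targets]
    return rows, labels


X, y = sample_scm_dataset()
```

In a PFN-style setup, many datasets of this kind would be sampled with different seeds, each split into training and held-out points, and the network would be trained to predict the held-out labels given the rest. That training loop is omitted here.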