Dec 10, 2023
Embeddings such as CLIP, GloVe, and word2vec are an integral part of large-scale machine-learning models, and there is evidence that embedding methods encode high-level semantic information in the vector-space structure of the embedding space. In this paper, we study the role of partial orthogonality in encoding meaning: given an embedding, we search for a "meaningful" subspace spanned by other embeddings, a construction that generalizes the notion of a Markov boundary to Euclidean space. Using this tool, we empirically study the semantic meaning of partial orthogonality in CLIP embeddings and find that it aligns well with conceptual semantic meaning. Complementary to this, we also introduce the concept of independence-preserving embeddings, in which an embedding preserves the conditional independence structure of a distribution, and we prove the existence of such embeddings and of approximations to them.
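The idea of conditioning on a set of embeddings can be made concrete. Below is a minimal sketch (not the paper's implementation) of one natural reading of partial orthogonality: two embeddings v and w are treated as partially orthogonal given a set S when their residuals, after projecting out span(S), are orthogonal. The function names and tolerance are illustrative assumptions.

```python
import numpy as np

def residual(v, S):
    """Component of v orthogonal to span of the columns of S,
    computed via a least-squares projection."""
    if S.size == 0:
        return v
    c, *_ = np.linalg.lstsq(S, v, rcond=None)  # min_c ||S c - v||
    return v - S @ c

def partially_orthogonal(v, w, S, tol=1e-6):
    """True if v and w are orthogonal after conditioning on
    (projecting out) the span of the columns of S."""
    rv, rw = residual(v, S), residual(w, S)
    denom = np.linalg.norm(rv) * np.linalg.norm(rw)
    if denom == 0:  # one residual vanishes: v or w lies in span(S)
        return True
    return abs(rv @ rw) / denom < tol

# Toy example: v and w overlap only through the direction s,
# so they become orthogonal once s is conditioned on.
s = np.array([1.0, 0.0, 0.0])
v = np.array([1.0, 1.0, 0.0])
w = np.array([1.0, 0.0, 1.0])
S = s[:, None]  # conditioning set as a column matrix

print(partially_orthogonal(v, w, np.empty((3, 0))))  # False: v . w = 1
print(partially_orthogonal(v, w, S))                 # True after projecting out s
```

This mirrors how partial correlation detects conditional independence for Gaussian variables, which is the intuition behind the Markov-boundary analogy above.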