Dec 15, 2023
Deep neural networks come in many sizes and architectures. The choice of architecture, in conjunction with the dataset and learning algorithm, affects the learned neural representations. Yet recent results have shown that different architectures learn representations with striking qualitative similarities. Vision transformers, for instance, align with human neural responses to natural images about as well as trained convolutional neural networks. Why might different systems learn similar representations? Here we derive an effective theory of representation learning under the assumption that the encoding map from input to hidden representation and the decoding map from representation to output are arbitrary smooth functions. This theory schematizes representation learning dynamics in the regime of complex, large architectures, where hidden representations are not strongly constrained by the parametrization. We show through experiments that the effective theory describes aspects of representation learning dynamics across a range of deep networks with different activation functions and architectures, and exhibits phenomena similar to the 'rich' and 'lazy' regimes. While many network behaviors depend quantitatively on architecture, our findings point to certain behaviors that are widely conserved once models are sufficiently flexible.
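The claim that different architectures learn similar representations is often quantified with a representational similarity metric such as linear centered kernel alignment (CKA). Below is a minimal, illustrative sketch (not the method used in this work): it compares the hidden representations of two hypothetical one-layer networks with different activation functions on the same inputs. The network shapes and random weights are assumptions for illustration only.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two feature matrices (rows = examples, cols = units)."""
    # Center each unit's activations across examples
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F); 1 = identical geometry
    num = np.linalg.norm(Y.T @ X, ord='fro') ** 2
    den = np.linalg.norm(X.T @ X, ord='fro') * np.linalg.norm(Y.T @ Y, ord='fro')
    return num / den

rng = np.random.default_rng(0)
inputs = rng.standard_normal((200, 10))       # 200 shared input examples
# Two toy "architectures": tanh vs. ReLU hidden layers, independent random weights
h_tanh = np.tanh(inputs @ rng.standard_normal((10, 50)))
h_relu = np.maximum(0.0, inputs @ rng.standard_normal((10, 50)))
print(f"cross-architecture CKA: {linear_cka(h_tanh, h_relu):.3f}")
```

A CKA value near 1 indicates that the two hidden layers encode the same representational geometry despite their different nonlinearities; comparing this score before and after training is one way to probe the conserved behaviors the abstract describes.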