Neural networks excel at processing language, yet their inner workings are poorly understood. One particular puzzle is how these models can represent compositional structures (e.g., sequences or trees) within the continuous vectors that they use as their representations. We introduce an analysis technique called DISCOVER and use it to show that, when neural networks are trained to perform symbolic tasks, their vector representations can be closely approximated using a simple, interpretable type of symbolic structure. That is, even though these models have no explicit compositional representations, they still implicitly implement compositional structure. We verify the causal importance of the discovered symbolic structure by showing that, when we alter a model’s internal representations in ways motivated by our analysis, the model’s output changes accordingly.
At this week’s MCQLL meeting on October 26 at 3:00 PM, Tom McCoy will give a talk titled “Discovering implicit compositional representations in neural networks.” An abstract of the talk follows.
If you haven’t already registered for the Zoom meeting, you can do so here.