>> In this case we build a model with 1T index which we lookup for every token to make prediction with much smaller model. <<

This index seems to be used to minimize the size of models.

I'm familiar with term indexing as described in the Handbook of Automated Reasoning, and I imagine that this index helps them recognize 'generalizations'.

Just as a single rewrite rule can reduce infinitely many expressions rather than just one, a single generalization can stand in for many specific cases, which would let the model itself be smaller.
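
To make the rewrite-rule analogy concrete, here is a minimal sketch of one rule, `?x + 0 -> ?x`, reducing any expression of that shape. The tuple encoding of expressions, the `?x` variable convention, and the function names are my own assumptions for illustration; none of this comes from the quoted work.

```python
# Expressions are nested tuples like ("+", "a", 0); pattern variables are
# strings starting with "?". This encoding is an illustrative assumption.

def match(pattern, expr, bindings):
    """Return bindings that make pattern equal expr, or None if no match."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            return bindings if bindings[pattern] == expr else None
        return {**bindings, pattern: expr}
    if (isinstance(pattern, tuple) and isinstance(expr, tuple)
            and len(pattern) == len(expr)):
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == expr else None

def substitute(template, bindings):
    """Fill the variables of template with their bound values."""
    if isinstance(template, str) and template.startswith("?"):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template

def rewrite(expr, lhs, rhs):
    """Apply the rule lhs -> rhs in one bottom-up pass over expr."""
    if isinstance(expr, tuple):
        expr = tuple(rewrite(e, lhs, rhs) for e in expr)
    bindings = match(lhs, expr, {})
    return substitute(rhs, bindings) if bindings is not None else expr

# One rule covers every expression of the form "<anything> + 0".
LHS, RHS = ("+", "?x", 0), "?x"
simple = rewrite(("+", "a", 0), LHS, RHS)
nested = rewrite(("+", ("+", ("*", "b", "c"), 0), 0), LHS, RHS)
```

The rule is written once but applies to `a + 0`, `(b * c) + 0`, and any other instance, which is the sense in which a generalization compresses.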

Generally, such an index would be some kind of prefix-tree.
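
For what I mean by a prefix-tree, here is a minimal sketch keyed on token sequences. The class names, the stored values, and the longest-prefix lookup are all hypothetical choices of mine; I have no idea how the actual index is built.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # token -> TrieNode
        self.value = None    # payload stored for the sequence ending here

class PrefixTree:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, tokens, value):
        """Store value under the given token sequence."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, TrieNode())
        node.value = value

    def longest_prefix(self, tokens):
        """Return (length, value) for the longest stored prefix of tokens,
        or (0, None) if no stored sequence is a prefix."""
        node = self.root
        best = (0, None)
        for depth, tok in enumerate(tokens, start=1):
            node = node.children.get(tok)
            if node is None:
                break
            if node.value is not None:
                best = (depth, node.value)
        return best

index = PrefixTree()
index.insert(["the", "cat"], "A")
index.insert(["the", "cat", "sat"], "B")
hit = index.longest_prefix(["the", "cat", "sat", "down"])
```

Sequences that share a prefix share nodes, so lookup cost depends on the length of the query rather than on how many entries are stored.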

Just a guess, guessing is fun