Generally, they don't. word2vec and GloVe, the two most popular word vector models in the circles I run in, don't have any solution to this at all: each word gets exactly one vector, no matter how many senses it has. As a result, when you do two-dimensional visualizations of these vector spaces, words with multiple commonly used senses get positions in the middle of nowhere.

In many situations, the downstream models that depend on these word vectors manage to perform well anyway, so for simplicity, we just live with this limitation.

That said, there has been work on handling polysemy (the property of one word having multiple meanings) in word vector models. The simplest method I've heard of is to do the word sense disambiguation out of band and then tag some kind of identifier onto the end of each word before you start training your word vector model.

As an example, you could run a part-of-speech tagger and then tag the part of speech onto the end of each word. So in the above example, we'd get "fork_verb" and "fork_noun". Part of speech doesn't fully disambiguate a word, so this only gets you part of the way there, but at least it's easy.
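Here's a rough sketch of that tagging step. It assumes you already have (word, part of speech) pairs from whatever tagger you like; the tagged sentences and the tag names below are made up for illustration, and `tag_tokens` is just a name I picked.

```python
def tag_tokens(tagged_sentence):
    """Turn (word, pos) pairs into 'word_pos' tokens for word vector training."""
    return [f"{word.lower()}_{pos.lower()}" for word, pos in tagged_sentence]


# Hypothetical tagger output; in practice this comes from a real POS tagger.
tagged_sentences = [
    [("Please", "INTJ"), ("fork", "VERB"), ("the", "DET"), ("repository", "NOUN")],
    [("Pass", "VERB"), ("me", "PRON"), ("a", "DET"), ("fork", "NOUN")],
]

# This is the corpus you would hand to the word vector trainer
# instead of the raw tokens.
corpus = [tag_tokens(sentence) for sentence in tagged_sentences]
```

The word vector model never needs to know anything changed: "fork_verb" and "fork_noun" are simply two different vocabulary entries, so each gets its own vector.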

You can do something similar with named entities. You can replace the two words "Larry" and "Page", for which the model would otherwise learn very generic vectors, with a single token like "Larry_Page" or "EntityID_192318" or whatever.
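A minimal sketch of the entity replacement, assuming you already have a lookup table of multi-word entities (from a named entity recognizer, a knowledge base, or wherever). The `merge_entities` helper and the table contents are hypothetical, and it just does a greedy longest match.

```python
def merge_entities(tokens, entities, max_len=3):
    """Replace known multi-word entities with a single token.

    `entities` maps a tuple of words to the replacement token.
    Longer matches win over shorter ones.
    """
    merged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            span = tuple(tokens[i:i + n])
            if span in entities:
                merged.append(entities[span])
                i += n
                break
        else:
            # No entity starts here, so keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical entity table; an ID such as "EntityID_192318" works just as well.
entities = {("Larry", "Page"): "Larry_Page"}
sentence = ["Larry", "Page", "founded", "Google"]
merged_sentence = merge_entities(sentence, entities)
```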

There has also been work on automatically detecting the different senses of a word. I've got at least one paper[1] in my notes that talks about this kind of thing. It does k-means clustering on the contexts that a particular word appears in, and learns a different representation for each cluster.
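To give a feel for the clustering step, here's a toy sketch. It is not the paper's implementation: I'm using plain bag-of-words counts as context features and a bare-bones k-means with a deterministic farthest-point start, both of which are my own simplifications, and the example contexts are made up.

```python
def context_vectors(contexts):
    """Encode each context (a list of words) as a bag-of-words count vector."""
    vocab = sorted({word for context in contexts for word in context})
    index = {word: i for i, word in enumerate(vocab)}
    vectors = []
    for context in contexts:
        vector = [0.0] * len(vocab)
        for word in context:
            vector[index[word]] += 1.0
        vectors.append(vector)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; returns one cluster label per input vector."""
    # Deterministic start: the first point, then repeatedly the point
    # farthest from every centroid chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up context windows around four occurrences of "fork".
contexts = [
    ["eat", "with", "a", "knife"],
    ["knife", "spoon", "plate"],
    ["git", "repository", "clone"],
    ["clone", "the", "repository", "on", "github"],
]
labels = kmeans(context_vectors(contexts), k=2)

# Relabel each occurrence with its cluster, e.g. "fork_0" or "fork_1",
# so that each cluster ends up with its own vector during training.
sense_tokens = [f"fork_{label}" for label in labels]
```

The cluster numbers themselves are arbitrary. What matters is that occurrences with similar contexts end up sharing a label.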

[1] - Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng, Improving word representations via global context and multiple word prototypes, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, July 8-14, 2012, Jeju Island, Korea