brianspiering / Word2vec Workshop
Word2vec Algorithm: Made as simple as possible, but no simpler
Description
A Pythonic introduction to the word2vec algorithm. Word2vec translates words (strings) into vectors (lists of floats). It is a relatively new algorithm (introduced in 2013) that has proven very useful for making sense of text data. You should walk out at the end with a conceptual understanding of the algorithm and be empowered to try it out on your favorite collection of text data.
“You shall know a word by the company it keeps” is a common refrain in Natural Language Processing (NLP). Word2vec does that by training a neural network to learn which words tend to co-occur, then embedding the words in a meaningful vector space. From these "word embeddings", it is possible to compare words with distance measures, add and subtract words to explore relationships between concepts, and cluster words to find semantically related groups. In fact, word2vec is a general-purpose algorithm that can encode any sequential data into meaningful vectors, including emojis!
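To make the "compare" and "add/subtract" ideas concrete, here is a minimal sketch in plain Python. The three-dimensional vectors in `toy_embeddings` are hand-made for illustration and are an assumption, not output from a trained model; real word2vec vectors are learned from text and typically have a hundred or more dimensions. The helper names `cosine_similarity` and `analogy` are likewise hypothetical.

```python
import math

# Hand-made toy vectors (assumption: NOT learned by word2vec).
# Loosely, the dimensions stand for "royalty", "maleness", "femaleness".
toy_embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, embeddings):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c).

    The three input words are excluded from the candidates, which is the
    usual convention for this kind of query.
    """
    target = [
        x - y + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(embeddings[w], target))


# Compare words with a similarity measure.
print(cosine_similarity(toy_embeddings["king"], toy_embeddings["queen"]))
print(cosine_similarity(toy_embeddings["king"], toy_embeddings["apple"]))

# Add/subtract words to explore a relationship between concepts.
print(analogy("king", "man", "woman", toy_embeddings))
```

With these toy vectors, "king" scores as more similar to "queen" than to "apple", and the query king - man + woman lands closest to "queen". With embeddings trained on real text, the same two operations apply unchanged; only the source of the vectors differs.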
Bio
Dr. Brian Spiering is a faculty member at GalvanizeU, which offers a Master of Science in Data Science. His passions are Natural Language Processing (NLP), deep learning, and building data products. He is active in the San Francisco Data Science community through volunteering and mentoring.
Drop him a line [email protected]
Disclaimer: These are interactive notebooks that are meant to be run. There might be elements not rendered correctly on static GitHub pages.