Modern NLP in Python at PyData DC
I had the pleasure of presenting the tutorial Modern NLP in Python at PyData DC. This 90-minute tutorial gives an overview of powerful tools and modeling techniques for natural language processing in Python, including spaCy, gensim, statistical phrase detection, Latent Dirichlet Allocation (LDA), and word vector embeddings with word2vec. I demonstrate each tool and technique using relatable examples from the excellent Yelp Dataset, a publicly available dataset of business profiles, customer reviews, and related metadata provided by the business search and rating service Yelp.
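To give a flavor of what statistical phrase detection means, here is a minimal pure-Python sketch. It is not the code from the tutorial, and it does not use gensim or spaCy. It assumes a simple count-based score for adjacent word pairs, similar in spirit to the scoring described in the word2vec literature. The function names `score_bigrams` and `merge_phrases`, the toy reviews, and the threshold value are all made up for illustration.

```python
from collections import Counter


def score_bigrams(sentences, min_count=1):
    """Score adjacent word pairs; a higher score means more phrase-like.

    Assumed score: (count(a, b) - min_count) * vocab_size / (count(a) * count(b)).
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    vocab_size = len(unigrams)
    return {
        (a, b): (n_ab - min_count) * vocab_size / (unigrams[a] * unigrams[b])
        for (a, b), n_ab in bigrams.items()
    }


def merge_phrases(tokens, scores, threshold):
    """Greedily join adjacent tokens whose pair score exceeds the threshold."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and scores.get(pair, 0.0) > threshold:
            merged.append("_".join(pair))
            i += 2  # skip the second word of the merged pair
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy stand-ins for tokenized reviews (hypothetical, not from the Yelp Dataset).
reviews = [
    "the ice cream was great".split(),
    "great ice cream and great service".split(),
    "the service was slow but the ice cream was worth it".split(),
]

scores = score_bigrams(reviews, min_count=1)
print(merge_phrases(reviews[0], scores, threshold=2.0))
# ['the', 'ice_cream', 'was', 'great']
```

On real review text you would typically run a pass like this over a much larger corpus and tune `min_count` and the threshold, so that only pairs that co-occur far more often than chance get merged into a single token before topic modeling or training word vectors.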
You can find the Jupyter notebook from the tutorial on GitHub and view it online with Jupyter nbviewer. I recommend nbviewer because it preserves the interactive visualizations in the notebook.
This post was originally published on datatheoretic.com.