When you have a large number of movie reviews, how can you tell whether they are compliments or criticisms? Because the dataset is large, reading every review by hand is impractical, so you need natural language processing tools to classify the sentiment of the text, using packages such as nltk and scikit-learn. In this project, I perform sentiment analysis on the IMDb movie reviews from the Sentiment Labelled Sentences Data Set in the UCI Machine Learning Repository.
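To make the task concrete, here is a minimal sketch of sentence-level sentiment classification with scikit-learn. It is not necessarily the approach taken in the notebooks: the bag-of-words features and logistic regression model are assumptions chosen for brevity, and the sentences are made-up toy examples rather than rows from the dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up toy sentences (NOT from the dataset): 1 = positive, 0 = negative.
texts = [
    "a wonderful and moving film",
    "great acting and a great story",
    "a boring waste of time",
    "terrible plot and awful dialogue",
]
labels = [1, 1, 0, 0]

# Turn each sentence into word counts, then fit a linear classifier on them.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Predict a label (0 or 1) for each unseen sentence.
predictions = model.predict(["a wonderful story", "an awful waste"])
print(predictions)
```

With only four training sentences the predictions are not meaningful; the point is the shape of the pipeline, which stays the same when the toy lists are replaced by the real reviews and labels.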
For pt.1 -- Basics, here are the Python notebook and the website. Here is the more reader-friendly Medium blog.
For pt.2 -- LSA, here are the Python notebook and the website. Here is the more reader-friendly Medium blog.
For pt.3 -- N-gram, here are the Python notebook and the website. Here is the more reader-friendly Medium blog.
For pt.4 -- BERT, here are the Python notebook and the website. Here is the more reader-friendly Medium blog.
Install these modules with pip:
- pandas: data processing
- numpy: linear algebra
- nltk: natural language processing
- scikit-learn: machine learning
- matplotlib: visualization
- torch: deep learning
- transformers: Huggingface transformers
- tqdm: progress bar
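Before running the notebooks, it can help to confirm that every module in the list above is importable. The following is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this project. Note that scikit-learn is installed as `scikit-learn` but imported as `sklearn`.

```python
from importlib.util import find_spec

# Map each pip package name to the module name used in an import statement.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "nltk": "nltk",
    "scikit-learn": "sklearn",  # pip name differs from import name
    "matplotlib": "matplotlib",
    "torch": "torch",
    "transformers": "transformers",
    "tqdm": "tqdm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if find_spec(import_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

The check only looks up whether each module can be located; it does not import the packages, so it stays fast even for heavy libraries such as torch.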