This project uses natural language processing (NLP) to classify restaurant reviews as positive or negative with a Gaussian Naive Bayes classifier.
- Goal: Predict sentiment (positive/negative) of restaurant reviews.
- Dataset: 1,000 reviews from `Restaurant_Reviews.tsv`.
- Text cleaning and preprocessing pipeline.
- Bag of Words model with feature limitation.
- Sentiment prediction using Gaussian Naive Bayes.
- Performance evaluation using accuracy and confusion matrix.
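
As a rough illustration of the dataset step, the sketch below reads a tab-separated review file with pandas. It parses a small in-memory sample rather than `Restaurant_Reviews.tsv` itself; the column names `Review` and `Liked` and the sample rows are assumptions made for the example, not something this README specifies.

```python
import csv
import io

import pandas as pd

# Illustrative rows standing in for Restaurant_Reviews.tsv.
# The column names "Review" and "Liked" are assumptions.
sample_tsv = (
    "Review\tLiked\n"
    "Wow... Loved this place.\t1\n"
    "Crust is not good.\t0\n"
    'The waiter said "sorry" twice.\t0\n'
)

# quoting=csv.QUOTE_NONE treats double quotes inside a review as literal text.
dataset = pd.read_csv(io.StringIO(sample_tsv), delimiter="\t", quoting=csv.QUOTE_NONE)

print(dataset.shape)
print(dataset.head())
```

To read the real file, the `io.StringIO(...)` argument would be replaced by the path to the TSV.
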
- Removed non-alphabetic characters.
- Converted to lowercase.
- Tokenized into words.
- Removed stopwords, keeping "not" so that negation is preserved.
- Applied Porter Stemming.
- Reconstructed into cleaned text corpus.
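
The cleaning steps above can be sketched roughly as follows. This is a minimal sketch, not necessarily the script's exact code: the function name `clean_review` is hypothetical, and a tiny hard-coded stopword set stands in for NLTK's English stopword list so the example runs without downloading corpus data.

```python
import re

from nltk.stem.porter import PorterStemmer

# Tiny stand-in stopword set (an assumption for this sketch). The full English
# list is available as nltk.corpus.stopwords.words("english") after
# nltk.download("stopwords"); "not" would be removed from it before filtering.
STOPWORDS = {"this", "is", "the", "a", "an", "and", "it", "was", "i"}

stemmer = PorterStemmer()


def clean_review(text):
    """Apply the cleaning steps to one review and return the cleaned string."""
    letters_only = re.sub("[^a-zA-Z]", " ", text)      # drop non-alphabetic characters
    tokens = letters_only.lower().split()               # lowercase, then tokenize
    kept = [t for t in tokens if t not in STOPWORDS]    # remove stopwords; "not" survives
    stems = [stemmer.stem(t) for t in kept]             # Porter stemming
    return " ".join(stems)                              # rebuild the cleaned text


reviews = ["Wow... Loved this place.", "Crust is not good."]
corpus = [clean_review(r) for r in reviews]
print(corpus)  # ['wow love place', 'crust not good']
```

Keeping "not" matters here: without it, "Crust is not good." would reduce to "crust good" and read as positive.
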
- Used Bag of Words model with `CountVectorizer`.
- Limited features to the 1,500 most frequent words.
- Classifier: Gaussian Naive Bayes
- Train/Test Split: 80/20
- Confusion Matrix and Accuracy Score used to assess performance.
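
The training and evaluation steps could look roughly like the sketch below. It runs on ten made-up cleaned reviews rather than the real dataset, and `random_state=0` is an assumption added for reproducibility, so the printed numbers say nothing about the project's actual performance.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Toy cleaned reviews with labels (1 = positive, 0 = negative); illustrative only.
corpus = [
    "love place", "great food", "amaz servic", "tasti fresh", "friendli staff",
    "not good", "bad servic", "never go back", "food cold", "not recommend",
]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# GaussianNB expects a dense array, hence toarray().
X = CountVectorizer(max_features=1500).fit_transform(corpus).toarray()

# 80/20 train/test split; random_state is an assumption for repeatable runs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Rows are true labels, columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
accuracy = accuracy_score(y_test, y_pred)
print(cm)
print(accuracy)
```

The accuracy score equals the sum of the confusion matrix diagonal divided by the number of test samples, so the two outputs can be cross-checked against each other.
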
- Clone this repository:
  `git clone https://github.com/MaddyRizvi/Natural-Language-Processing_sentiment_analysis.git`
  `cd Natural-Language-Processing_sentiment_analysis`
- Install dependencies:
  `pip install -r requirements.txt`
- Ensure you have the dataset `Restaurant_Reviews.tsv` in the project directory.
- Run the script:
  `python natural_language_processing.py`

- `Restaurant_Reviews.tsv`: Dataset file.
- `natural_language_processing.py`: Main script for training and testing.
- `README.md`: Project overview and instructions (this file).
- `CONTRIBUTING.md`: Guidelines for contributing.

- numpy
- pandas
- matplotlib
- scikit-learn
- nltk
We welcome contributions! Please read the CONTRIBUTING.md file for guidelines.