Machine Learning for NLP

The **Machine Learning for NLP** course bridges statistical modeling and modern language processing. Before diving into Transformers, it is important to understand how classical algorithms such as SVM and Naive Bayes handle unstructured text data.

Instructor: Zahra Amini

This repository walks through a structured pipeline from raw text to predictive models, covering the ground between “Text Analysis” and “Neural Networks”.


Course Syllabus

The curriculum is organized into logical modules, simulating a real-world NLP pipeline:

Module 1: Text Wrangling & Cleaning
- S00-S02: Tokenization, stop-word removal, and stemming vs. lemmatization using NLTK & spaCy.
Module 2: Feature Extraction (Text to Numbers)
- S03-S04: Implementing Bag of Words (BoW), TF-IDF, and N-Grams from scratch.
Module 3: Probabilistic & Geometric Models
- S07-S10: Text Classification using Logistic Regression, K-Nearest Neighbors (KNN), and Naive Bayes.
Module 4: Advanced Classification
- S11-S12: High-dimensional separation using Support Vector Machines (SVM) and LDA.
Module 5: The Neural Shift
- S13-S14: Introduction to Multi-Layer Perceptrons (MLP) for text data.
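
To give a feel for the first three modules, here is a minimal sketch of the path from raw text to a classifier, written with only the Python standard library. It is not taken from the course notebooks: the function names, the tiny stop-word list, the toy sentences, and the particular TF-IDF variant (`tf = count / document length`, `idf = log(N / df)`) are all assumptions chosen for illustration. The classifier is a multinomial Naive Bayes with add-one smoothing trained on raw counts.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK and spaCy ship much larger ones).
STOP_WORDS = {"a", "an", "the", "is", "was", "this", "and", "of", "to", "it"}

def tokenize(text):
    """Lowercase, keep runs of letters/apostrophes, and drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Return all n-grams of a token list, each joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(tokens, max_n=1):
    """Count every 1-gram up to max_n-gram in one document."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts

def tf_idf(docs_counts):
    """Weight each term by (count / doc length) * log(N / document frequency)."""
    n_docs = len(docs_counts)
    df = Counter()
    for counts in docs_counts:
        df.update(counts.keys())  # each document contributes at most 1 per term
    weighted = []
    for counts in docs_counts:
        total = sum(counts.values())
        weighted.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weighted

class NaiveBayes:
    """Multinomial Naive Bayes over term counts with add-one (Laplace) smoothing."""

    def fit(self, docs_counts, labels):
        self.class_counts = Counter(labels)
        self.term_counts = {label: Counter() for label in self.class_counts}
        for counts, label in zip(docs_counts, labels):
            self.term_counts[label].update(counts)
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, counts):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n_label / n_docs)  # log prior
            for term, c in counts.items():
                if term not in self.vocab:
                    continue  # ignore terms never seen in training
                p = (self.term_counts[label][term] + 1) / (total + len(self.vocab))
                score += c * math.log(p)  # log likelihood, once per occurrence
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical toy corpus, for illustration only.
train_texts = [
    "I loved this great movie",
    "a great and wonderful film",
    "this movie was terrible",
    "awful boring film",
]
train_labels = ["pos", "pos", "neg", "neg"]

train_counts = [bag_of_words(tokenize(t)) for t in train_texts]
weights = tf_idf(train_counts)
model = NaiveBayes().fit(train_counts, train_labels)

print(weights[0])
print(model.predict(bag_of_words(tokenize("a great film"))))
print(model.predict(bag_of_words(tokenize("terrible boring movie"))))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a single unseen word from zeroing out a class. In practice, library implementations such as those in scikit-learn use slightly different TF-IDF formulas, so their numbers will not match this sketch exactly.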
