Machine Learning for NLP

Bridging the gap between classical Machine Learning and Text Analytics. Covers Preprocessing, TF-IDF, SVM, and MLP.

The **Machine Learning for NLP** course acts as the essential bridge between statistical modeling and modern language processing. Before diving into Transformers, it is crucial to master how classical algorithms like SVM and Naive Bayes handle unstructured text data.
Instructor: Zahra Amini

This repository provides a structured pipeline from raw text to predictive models, progressing from classical text analysis toward neural networks.

🚀 VIEW PROJECTS & CODE ON GITHUB


Course Syllabus

The curriculum is organized into logical modules, mirroring a real-world NLP pipeline:

  • Module 1: Text Wrangling & Cleaning
    • S00-S02: Tokenization, Stop-word removal, Stemming vs. Lemmatization using NLTK & spaCy.
  • Module 2: Feature Extraction (Text to Numbers)
    • S03-S04: Implementing Bag of Words (BoW), TF-IDF, and N-Grams from scratch.
  • Module 3: Probabilistic & Geometric Models
    • S07-S10: Text Classification using Logistic Regression, K-Nearest Neighbors (KNN), and Naive Bayes.
  • Module 4: Advanced Classification
    • S11-S12: High-dimensional separation using Support Vector Machines (SVM) and LDA.
  • Module 5: The Neural Shift
    • S13-S14: Introduction to Multi-Layer Perceptrons (MLP) for text data.
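
The Module 1 cleaning steps can be sketched in plain Python. This is a simplified stand-in for the NLTK/spaCy tools the course uses: the stop-word list is a tiny illustrative subset, and `naive_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list (the real NLTK list is much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "of", "to", "in"}

def tokenize(text):
    """Lowercase the text and split on runs of non-alphabetic characters."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    """Crude suffix stripping -- a stand-in for NLTK's PorterStemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]

print(preprocess("The cats are chasing the mice in the garden"))
```

Note that stemming ("chasing" → "chas") is lossy by design, which is exactly the trade-off versus lemmatization that S02 examines.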
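
Module 2's from-scratch TF-IDF can be sketched as follows. This uses one common weighting variant (raw term frequency times `log(N / df)`); the exact scheme implemented in S03-S04 may differ, e.g. with smoothing or normalization.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute sparse TF-IDF vectors for a list of tokenized documents.

    TF is the raw count of a term in the document; IDF is log(N / df),
    where df is the number of documents containing the term.
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({term: count * math.log(n / df[term])
                        for term, count in tf.items()})
    return vectors

docs = [["dog", "bites", "man"],
        ["man", "bites", "dog"],
        ["dog", "eats", "meat"]]
vecs = tf_idf(docs)
```

A term that occurs in every document ("dog" here) gets weight `log(3/3) = 0`, which is the whole point of IDF: ubiquitous terms carry no discriminative signal.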
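
The Naive Bayes classifier from Module 3 can likewise be written from scratch. A minimal multinomial sketch with Laplace smoothing, trained on toy sentiment data (the labels and vocabulary here are purely illustrative):

```python
import math
from collections import Counter

class MultinomialNB:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        self.priors = {c: math.log(labels.count(c) / len(labels))
                       for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        self.vocab = {t for doc in docs for t in doc}
        return self

    def predict(self, doc):
        best, best_score = None, float("-inf")
        for c in self.classes:
            total = sum(self.counts[c].values())
            score = self.priors[c]
            for t in doc:
                # Laplace smoothing avoids zero probability for unseen terms.
                score += math.log((self.counts[c][t] + 1)
                                  / (total + len(self.vocab)))
            if score > best_score:
                best, best_score = c, score
        return best

train = [["great", "movie"], ["loved", "it"],
         ["terrible", "film"], ["hated", "it"]]
labels = ["pos", "pos", "neg", "neg"]
clf = MultinomialNB().fit(train, labels)
```

Working in log space keeps the product of many small per-term probabilities from underflowing, which is standard practice for text, where documents can contain thousands of tokens.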
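
Module 4's SVM seeks a maximum-margin separating hyperplane in the high-dimensional TF-IDF space. A minimal sketch of a linear SVM trained with the Pegasos subgradient method on a toy 2-D dataset; the data and hyperparameters are illustrative, and in practice S11-S12 would lean on a library implementation such as scikit-learn's `SVC`:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Train a linear SVM (hinge loss + L2 regularization) with the
    Pegasos stochastic subgradient method. Labels y must be in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)  # decaying learning rate
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1:  # hinge-loss subgradient step on margin violators
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Toy linearly separable data: one cluster per class, mirrored at the origin.
X = [[2.0, 2.5], [3.0, 3.0], [2.5, 3.5],
     [-2.0, -2.5], [-3.0, -3.0], [-2.5, -1.5]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
```

Real text data is not 2-D, but the same update rule applies unchanged to sparse TF-IDF vectors, which is why linear SVMs scale so well to document classification.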
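
Finally, the MLP of Module 5 maps a feature vector (e.g. TF-IDF) through one or more hidden layers to class probabilities. A minimal forward-pass sketch in plain Python; the weights are fixed for illustration rather than trained, and the layer sizes are arbitrary:

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]  # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def dense(x, W, b):
    """One fully connected layer: returns W @ x + b."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def mlp_forward(x, layers):
    """layers: list of (W, b, activation) tuples applied in order."""
    for W, b, act in layers:
        x = act(dense(x, W, b))
    return x

# Illustrative fixed weights: 3 input features -> 2 hidden units -> 2 classes.
hidden = ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], relu)
output = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], softmax)
probs = mlp_forward([1.0, 2.0, 0.5], [hidden, output])
```

The softmax output always sums to 1, so it reads directly as a distribution over classes; the conceptual shift from Module 4 is that the hidden layer learns its own feature transformation instead of operating on the raw TF-IDF space.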


GitHub · Portfolio · LinkedIn