Lecture Plan (11 Lectures)¶
Foundational (L01–L04)¶
L01: Setup + Debugging ✅ in progress¶
- Compressed dev env setup (prereq covered basics)
- Notebook hygiene and reproducibility
- Defensive programming, debugging in VS Code + Jupyter
- Adapt from:
lectures_25/01(setup),lectures_25/02(debugging)
L02: SQL for Data Analysis¶
- Core SQL: SELECT, JOIN, GROUP BY, window functions
- SQLite + pandas/sqlalchemy integration
- Adapt from:
lectures_25/03(was L03 last year)
L03: Larger-than-Memory Data¶
- Polars: lazy vs eager evaluation
- Out-of-core processing, chunking, parquet
- Database as one solution for large data (ties to L02)
- Adapt from:
lectures_25/02(was L02 last year—order swapped)
L04: NLP Foundations (student request)¶
- Text preprocessing, tokenization, embeddings
- Sentiment, NER, text classification
- Clinical/health text applications
- Sets up context for transformer/LLM lectures
- New content + foundations from
lectures_25/07
ML/AI Progression (L05–L09)¶
L05: Classification¶
- Train/test splits, evaluation metrics, cross-validation
- Logistic regression → Random Forest → XGBoost
- Handling imbalanced data, feature selection
- Adapt from:
lectures_25/05
L06: Neural Networks¶
- MLP, CNN, RNN/LSTM fundamentals (PyTorch)
- Training loop, backprop, regularization
- Adapt from:
lectures_25/06
L07: Transformers & Deep Learning¶
- Attention mechanism, transformer architecture
- Hugging Face basics, tokenization
- Adapt from:
lectures_25/07
L08: LLMs – DIY & Understanding¶
- Building intuition: nanoGPT walkthrough
- Embeddings, context windows, fine-tuning concepts
- Adapt from:
lectures_25/07demos
L09: LLMs – API, Agentic & Workflows¶
- OpenAI/Anthropic/local APIs
- Prompt engineering, structured outputs
- Tool use, agents, practical workflows
- Adapt from:
lectures_25/07+ new content
Applied / Student Choice (L10–L11)¶
L10: Time Series & Forecasting¶
- Time-based splits, lag features, rolling windows
- ARIMA basics, ML regressors for forecasting
- Adapt from:
lectures_25/04
L11: TBD – Student Vote¶
Candidates: - Computer Vision – CNNs, transfer learning, medical imaging (from lectures_25/08) - Visualization & Dashboards – Altair, Streamlit, MkDocs reports (from lectures_25/09) - Experimentation & A/B Testing – causal inference, power analysis (from lectures_25/10) - Distributed Computing – threads/processes, HPC intro - End-to-End Project – CRISP-DM, capstone guidance
Notes¶
- Each lecture: 90 min + additional demo time at 1/3, 2/3, end
- Demos in Jupyter (draft as markdown, convert via jupytext)
- Assignments: pass/fail, test understanding of core lecture content
- Reference last year's content in
lectures_25/for examples, datasets, demos