The course will cover the design, implementation, and deployment of large-scale prediction systems and the associated trade-offs. As part of the course, each student will be required to complete a group or individual project that demonstrates the concepts covered. In addition to the technical material, the related regulatory, privacy, and ethical issues will be discussed and debated.
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, by Cathy O'Neil
ISBN-10: 0553418831, ISBN-13: 978-0553418835
Python, Docker, Terraform and various AWS services.
Introduction, project problem statement, tools and setup
Cloud scale architecture(s) and trade-offs
Data collection, organization and monitoring
Feature engineering and labeling
Modeling
Research into prior art and model selection, privacy, red teaming, weaponization
How do you know you have it right?
Staging, Production, Blue/Green Deployment
Ethics in the age of democratized DS/ML
Project presentations
The 'All Singing, All Dancing' TensorFlow Docker image: it has everything I use for data cleaning, feature engineering, model development, document publication, and network packet analysis and construction. This is not something I would ever use in production; it's heavy and slow to pull. But I never say, 'Damn, I wish I had installed LaTeX.'
Installed and configured packages
Ubuntu 18.04 (Bionic Beaver)
TensorFlow 2.1.0 (CPU)
Python 3
Jupyter
Matplotlib
scikit-learn
Pandoc
Inkscape
LaTeX (TeX Live)
GitPython
Dnsdb
Shodan
Tldextract
Scapy
Featuretools
Deep Feature Synthesis
SageMaker / Boto
And lots more
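For readers who want to build something similar themselves, a minimal Dockerfile sketch along these lines could assemble such an image. This is a hypothetical reconstruction, not the actual Dockerfile: the base image tag, the exact apt and pip package names, and the Jupyter launch command are all assumptions.

```dockerfile
# Hypothetical sketch of an all-in-one data science image.
# Base image tag and package names below are assumptions.
FROM tensorflow/tensorflow:2.1.0-py3

# System-level tools: document publication and version control
RUN apt-get update && apt-get install -y --no-install-recommends \
        pandoc inkscape texlive git \
    && rm -rf /var/lib/apt/lists/*

# Python packages for notebooks, plotting, feature engineering,
# network/OSINT lookups, and AWS integration
RUN pip install --no-cache-dir \
        jupyter matplotlib scikit-learn \
        gitpython shodan tldextract scapy \
        featuretools sagemaker boto3

WORKDIR /workspace
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--allow-root"]
```

Installing everything into one image is exactly what makes it heavy and slow to pull; that trade-off is acceptable for a personal workbench but not for a production deployment, where a slim, purpose-built image is preferable.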