Repository for the Caltech CS11 TensorFlow track. Course outline here.
This course is intended as a fast-paced introduction to machine learning with TensorFlow and Keras, focused particularly on neural networks. It gives a whirlwind tour of several types of models, and brief coverage of several machine learning topics, but its primary goal is first and foremost to get you comfortable with writing, training, and deploying sophisticated models in TensorFlow. Along the way, we'll look at concepts from differentiable programming, an exciting new programming paradigm. By the end of this class, you should be able to write deep models to solve many kinds of problems.
What this course is not:
- An intro class in machine learning. You should probably take one of those first.
- A statistics or linear algebra class. To write models and understand what they're doing, you need both of these, but this class doesn't have time to cover them. You should take a class in each of those first.
- A class in neural network theory. They're the main model we'll focus on, but the focus is how to build them, not why they work.
- Going to cover all of the most recent models. We just don't have time.
The class is three units, pass-fail. That means readings and labs should take a total of three hours per week, for ten weeks. There are seven labs, plus some setup time. I've tried my best to hit the 3-unit mark, so if a week runs too short or too long, please let me know! That said, some weeks may be longer than you're used to for a 3-unit class, because the ten weeks are divided into seven sections and because we're trying to cover a lot of material in a limited time.
If you're planning to take the course, please read this entire document. It has some important info!
It goes without saying that you should be comfortable writing lots of Python code.
I'll also assume some working familiarity with machine learning (equivalent to CS156, "Learning From Data"), linear algebra, multivariable calculus, and the Python scientific computing stack (mostly `jupyter`, `numpy`, and `pyplot`).
As a result, I'll spend less time than some courses might explaining certain concepts (e.g. matrix operations, gradient descent, linear regression) in order to cover more interesting and useful material in a short time.
I'll mention it whenever I'm glossing over a concept, and provide links to explanations I like, but be prepared to do plenty of outside reading if you're not fresh on these concepts.
There are some incredible tutorials and explanations online, much better than I could give, so it makes more sense for you to read those blog posts than to read my own worse explanations.
That being said, "requirements" can be fake if you don't mind putting in the extra hours to learn.
The only things you really need are solid Python skills and a working understanding of `numpy` arrays. (But then, this course will be more than 3 units...)
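For a sense of the background being assumed, you should be able to follow something like this bare-bones gradient-descent linear regression without trouble. (A minimal sketch in plain Python; the data and learning rate are made up for illustration.)

```python
# Least-squares fit of y = w*x + b by gradient descent, in pure Python.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1

w, b = 0.0, 0.0
lr = 0.05
n = len(xs)

for step in range(2000):
    # Gradients of mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges near w=2, b=1
```

TensorFlow's job, as we'll see, is to compute gradients like these for you on much bigger models.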
All of the labs will have the same basic structure:
- A README with a high-level overview of the week's content
- Some reading on the week's material, presented as a `notes_*.ipynb` Jupyter notebook that can be read online or run locally. It will have lots of exposition, plus images and code snippets here and there.
- A `lab_*.ipynb` file to be filled out as the week's assignment
While doing the assignments, expect to spend some amount of time reading the TensorFlow documentation, especially pages on functions I mention you might need for a problem. This will be true every time you use TensorFlow. To quickly search for what you need, I recommend using https://devdocs.io/, which re-hosts the TensorFlow documentation in an easy-to-search way. The regular documentation is available here: https://www.tensorflow.org/api_docs/python/.
You should do this setup before the first lab.
For this class, we'll be using Python 3. The easiest way to complete the setup is in a virtual environment, which I walk you through below. Note that TensorFlow currently only supports Python 3.4, 3.5, and 3.6, so I will assume you have one of those versions installed, with pip set up correctly. I'm also assuming you're setting this up on a standard Linux system. If not, proceed carefully.
I don't really have the bandwidth to help with setup issues, so if you're having trouble, you can try:
- Using a clean virtualenv
- Using a clean Linux VM
- Asking a friend
- Using Colaboratory (see below)
Virtual environments simulate a "clean" Python install on your system so you don't need to worry about library conflicts and dependency issues. Therefore I recommend dedicating a virtual environment to this class. First, install virtualenv and virtualenvwrapper:
pip install virtualenv
pip install virtualenvwrapper
You might want to do additional setup here. Then, create and check the virtual environment:
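In particular, if `mkvirtualenv` isn't found after installing, virtualenvwrapper usually needs a few lines in your shell config. A typical sketch (the paths below are common defaults, not guaranteed; check `which virtualenvwrapper.sh` on your system):

```shell
# Add to ~/.bashrc (or your shell's equivalent), then open a new shell.
export WORKON_HOME=$HOME/.virtualenvs          # where environments are stored
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh     # path may differ on your system
```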
mkvirtualenv -p /usr/bin/python3.6 cs11-tensorflow # Point to the Python binary you'll be using
workon cs11-tensorflow
python --version # Should print "Python 3.x.y" where x is 4, 5, or 6
To activate the environment, use `workon cs11-tensorflow`. To deactivate it when you're done, use `deactivate`.
Once you're in the virtual environment, run
pip install numpy scipy matplotlib ipython jupyter pandas scikit-learn tensorflow==1.14.0 keras
Then, try running `python -c "import tensorflow"` in your shell. If the line executes successfully (printing nothing), your setup is probably fine. If you get an error message like `ModuleNotFoundError: No module named 'tensorflow'`, then something went wrong.
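If you'd rather check from Python without triggering a full import (which can be slow for TensorFlow), the standard library can tell you whether a package is importable at all. A small sketch:

```python
import importlib.util

def can_import(name):
    """Return True if a top-level module is importable in this environment,
    without actually importing it."""
    return importlib.util.find_spec(name) is not None

print(can_import("tensorflow"))
```

If this prints `False` inside your activated virtualenv, the `pip install` step didn't take effect there.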
You'll be doing assignments by modifying `.ipynb` notebook files and submitting them via git, so you need your own copy of the code.
Duplicate the repository (don't fork it, since forks can't be private), host it on GitHub, and add me (and the TAs, whose email addresses I'll send to you).
With the virtual environment active, run
jupyter-notebook
to host the notebook server. If you haven't used Jupyter notebooks before, here's a pretty good guide.
Apart from running a Jupyter notebook locally, Google also provides free "hosted notebooks" with Colaboratory. They're a particularly good choice if your computer is not too powerful (they offer free compute time on GPUs as long as you're not using them for too long at once) or if you're having a hard time setting up the dependencies (they come pre-loaded with all of the libraries we'll use).
The tradeoff is that loading and saving data and files will follow a different procedure than if you're running it locally, running TensorBoard is more complicated (look up the TensorBoardColab library), and the notebook is online so you may experience more latency.
I'm writing this class with the intent that everyone runs the notebooks locally, so if you want to use Colaboratory you may have to figure a lot of it out on your own. For the most part, any modern computer should be able to handle the processing required in a reasonable amount of time. For any assignment that takes serious processing power, I'll write with Colaboratory and its free acceleration in mind.
Brownie points to anyone who gets their code running on a TPU.
To submit assignments, make a private GitHub repository duplicating this one (remember: not a fork, since forks can't be private). Give read access to me and the TAs for the term you're taking it; then, for each assignment, email me the commit hash of your submission. I'll send out the relevant information (GitHub usernames, email addresses, etc.) once people have registered for the class.
The labs are due at the following times (11 PM Friday) each term:
- Lab 1: End of the second week.
- Lab 2: End of the third week.
- Lab 3: End of the fourth week.
- Lab 4: End of the sixth week (take week 5 off for midterms for your other classes).
- Lab 5: End of the seventh week.
- Lab 6: End of the eighth week.
- Lab 7: End of the ninth week.
At the start of term, I'll send out exact due dates.
Feel free to collaborate on concepts, algorithms, etc, but please don't share code in any way. This includes looking up code snippets that do what you're trying to do.
However: much of this class is about learning a programming framework, and sometimes the difficulty is in finding the right Operations to use. So, feel free to get help from me and others with searching the documentation, understanding syntax and common language forms, knowing which Operations to use, etc. You can also look up TensorFlow code snippets that do something similar to, but not the same as, what you're writing. It's the difference between getting help on a mini-project to "write linear regression" by searching Stack Exchange for "how to write linear regression in TensorFlow" (bad) vs "how to do matrix multiplication in TensorFlow" (good). Ask me if you have questions.
Ultimately, this is a pass-fail course and you're here to learn useful things, so only do what helps you do that.
- Documentation and TensorFlow resources:
- TensorFlow official API: The official documentation; hard to search effectively but the individual pages are good
- TensorFlow official API (devdocs version): An easier-to-search rehost of the official documentation, which I usually prefer to use
- TensorFlow official guide: A short, high-quality tutorial series by the TF developers
- "Tensorflow: The Confusing Parts" is a short and very readable guide to the key abstractions behind TensorFlow
- "A Practical Guide for Debugging TensorFlow Codes" is exactly what it sounds like
- Derivatives on a computational graph:
- Calculus on Computational Graphs: Backpropagation: Intuitively building up backpropagation on a graph from simple calculus
- Hacker's Guide to Neural Networks: The first part of this (awesome!) article explains backpropagation with examples in code and logic circuits
- Machine learning:
- The "Deep Learning Book" is in my mind the comprehensive reference on deep learning, from ML first principles to modern approaches. It's a must-read for anyone wanting to seriously get into deep learning. And it's free!
- Practical Deep Learning for Coders (fast.ai) is an incredibly popular online course in deep learning.
- Cool stuff:
TensorFlow 2.0 was just released.
This course was written for TensorFlow 1.0, and while I'd like to update it to 2.0 as soon as possible, it'll stay in 1.0 for now -- install version `tensorflow==1.14.0` with pip.
The concepts remain the same, so the class should still get you writing effective TensorFlow 2.0 code with a pretty easy transition.
A number of the important APIs (`tf.nn`, `tf.keras`) remain the same or similar too.