These are the notebooks for my PyCon PL talk about the basics of doing data science in Python, using Kaggle's "Predict survival on the Titanic" competition as an example task.
- Getting Started with Python: Kaggle's Titanic Competition
- Getting Started with Pandas: Kaggle's Titanic Competition
- Exploratory analysis in Python using Pandas
- Data Munging in Python using Pandas
- Learning From Data: short, good for beginners, and has an accompanying online course
- Machine Learning: A Probabilistic Perspective: larger, still current, and very popular
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction: heavy on theory, with a free PDF edition available
- Machine Learning by Andrew Ng (Coursera)
- Intro to Machine Learning by Sebastian Thrun (Udacity)
- dataquest.io
- Kaggle competitions and tutorials