RawIron/data-hacks

statistics, machine-learning, symbolic math done in Python

Jupyter Notebooks

  • prototype ML solutions
  • feature engineering
  • share data analysis results
  • tutorials, explanations
  • plus whatever you come up with

Get Started

  • clone this repo
  • change to the root of your repo clone
  • create a virtualenv
  • activate the virtualenv
  • install the requirements with "pip install -r requirements.txt" (a quick sanity check follows this list)
  • start the IPython notebook server with "./start_ipython.sh"
  • find some amazing patterns in data :)
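
To check that the environment is ready, the short sketch below imports the core libraries and prints their versions. The package list (numpy, pandas, scikit-learn, sympy) is an assumption based on the repo's topics and may differ from what requirements.txt actually lists.

```python
# Verify the virtualenv has the core libraries installed.
# The package list below is an assumption -- adjust it to match requirements.txt.
import numpy
import pandas
import sklearn
import sympy

for module in (numpy, pandas, sklearn, sympy):
    print(module.__name__, module.__version__)
```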

Best Practices

  • SQL or S3 CSV files?
    • SQL for ad-hoc analysis; re-run the notebook frequently to "monitor" the data
    • S3 CSV files for reproducible, shareable results; each re-run keeps a record of which data was used
  • for common tasks use the same layout and flow (see the sketch after this list)
    • start with imports
    • read the data
    • clean it
    • feature analysis
      • distribution function
      • outliers
    • validation
      • feature reduction, selection
      • model selection
      • estimator performance, over-fitting
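
The sketch below is one way to lay that flow out in a notebook, assuming pandas and scikit-learn are available; "data.csv", the "target" column, and the choice of estimator are placeholders, not part of this repo.

```python
# imports
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# read the data
df = pd.read_csv("data.csv")

# clean it
df = df.dropna().drop_duplicates()

# feature analysis: distributions and outliers
print(df.describe())
numeric = df.select_dtypes("number")
z_scores = (numeric - numeric.mean()) / numeric.std()
print("rows flagged as outliers:", int((z_scores.abs() > 3).any(axis=1).sum()))

# validation: feature reduction/selection, model selection, estimator performance
X = numeric.drop(columns=["target"])
y = df["target"]
pipeline = Pipeline([
    ("select", SelectKBest(f_classif, k=5)),       # feature selection
    ("model", LogisticRegression(max_iter=1000)),  # one candidate model
])
# cross-validation guards against judging performance on an over-fitted split
scores = cross_val_score(pipeline, X, y, cv=5)
print("cv accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```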

Data

  • how to read data with SQL is explained in the example_redshift notebook
  • for local or S3 CSV files the same idea applies (see the sketch below)
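
A minimal sketch for the local/S3 case, assuming pandas (with s3fs installed for the s3:// path); the file and bucket names are placeholders:

```python
import pandas as pd

# local CSV file
local_df = pd.read_csv("data/events.csv")

# CSV on S3 -- pandas reads s3:// URLs directly when s3fs is installed
s3_df = pd.read_csv("s3://my-bucket/exports/events.csv")

print(local_df.shape, s3_df.shape)
```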

github

  • easy to share
  • if the output is left in the .ipynb file, the notebook can be re-executed later and the new results compared against previous runs
  • github.com already renders the notebooks; it has nbviewer built in
  • github enterprise should soon render them too
