data-science-ipython-notebooks
Index
- deep-learning
- scikit-learn
- statistical-inference-scipy
- pandas
- matplotlib
- numpy
- python-data
- kaggle-and-business-analyses
- spark
- mapreduce-python
- amazon web services
- command lines
- misc
- notebook-installation
- credits
- contributing
- contact-info
- license
deep-learning
IPython Notebook(s) demonstrating deep learning functionality.
tensor-flow-tutorials
Additional TensorFlow tutorials:
- pkmital/tensorflow_tutorials
- nlintz/TensorFlow-Tutorials
- alrojo/tensorflow-tutorial
- BinRoot/TensorFlow-Book
Notebook | Description |
---|---|
tsf-basics | Learn basic operations in TensorFlow, a library for various kinds of perceptual and language understanding tasks from Google. |
tsf-linear | Implement linear regression in TensorFlow. |
tsf-logistic | Implement logistic regression in TensorFlow. |
tsf-nn | Implement nearest neighbors in TensorFlow. |
tsf-alex | Implement AlexNet in TensorFlow. |
tsf-cnn | Implement convolutional neural networks in TensorFlow. |
tsf-mlp | Implement multilayer perceptrons in TensorFlow. |
tsf-rnn | Implement recurrent neural networks in TensorFlow. |
tsf-gpu | Learn about basic multi-GPU computation in TensorFlow. |
tsf-gviz | Learn about graph visualization in TensorFlow. |
tsf-lviz | Learn about loss visualization in TensorFlow. |
tensor-flow-exercises
Notebook | Description |
---|---|
tsf-not-mnist | Learn simple data curation by creating a pickle with formatted datasets for training, development and testing in TensorFlow. |
tsf-fully-connected | Progressively train deeper and more accurate models using logistic regression and neural networks in TensorFlow. |
tsf-regularization | Explore regularization techniques by training fully connected networks to classify notMNIST characters in TensorFlow. |
tsf-convolutions | Create convolutional neural networks in TensorFlow. |
tsf-word2vec | Train a skip-gram model over Text8 data in TensorFlow. |
tsf-lstm | Train an LSTM character model over Text8 data in TensorFlow. |
theano-tutorials
Notebook | Description |
---|---|
theano-intro | Intro to Theano, which allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation. |
theano-scan | Learn scans, a mechanism to perform loops in a Theano graph. |
theano-logistic | Implement logistic regression in Theano. |
theano-rnn | Implement recurrent neural networks in Theano. |
theano-mlp | Implement multilayer perceptrons in Theano. |
keras-tutorials
Notebook | Description |
---|---|
keras | Keras is an open source neural network library written in Python. It is capable of running on top of either TensorFlow or Theano. |
setup | Learn about the tutorial goals and how to set up your Keras environment. |
intro-deep-learning-ann | Get an intro to deep learning with Keras and Artificial Neural Networks (ANN). |
theano | Learn about Theano by working with weights matrices and gradients. |
keras-otto | Learn about Keras by looking at the Kaggle Otto challenge. |
ann-mnist | Review a simple implementation of ANN for MNIST using Keras. |
conv-nets | Learn about Convolutional Neural Networks (CNNs) with Keras. |
conv-net-1 | Recognize handwritten digits from MNIST using Keras - Part 1. |
conv-net-2 | Recognize handwritten digits from MNIST using Keras - Part 2. |
keras-models | Use pre-trained models such as VGG16, VGG19, ResNet50, and Inception v3 with Keras. |
auto-encoders | Learn about Autoencoders with Keras. |
rnn-lstm | Learn about Recurrent Neural Networks (RNNs) with Keras. |
lstm-sentence-gen | Learn about RNNs using Long Short Term Memory (LSTM) networks with Keras. |
deep-learning-misc
Notebook | Description |
---|---|
deep-dream | Caffe-based computer vision program which uses a convolutional neural network to find and enhance patterns in images. |
scikit-learn
IPython Notebook(s) demonstrating scikit-learn functionality.
Notebook | Description |
---|---|
intro | Intro notebook to scikit-learn, a Python machine learning library that builds on NumPy's large, multi-dimensional arrays and matrices and its library of high-level mathematical functions. |
knn | Implement k-nearest neighbors in scikit-learn. |
linear-reg | Implement linear regression in scikit-learn. |
svm | Implement support vector machine classifiers with and without kernels in scikit-learn. |
random-forest | Implement random forest classifiers and regressors in scikit-learn. |
k-means | Implement k-means clustering in scikit-learn. |
pca | Implement principal component analysis in scikit-learn. |
gmm | Implement Gaussian mixture models in scikit-learn. |
validation | Implement validation and model selection in scikit-learn. |
statistical-inference-scipy
IPython Notebook(s) demonstrating statistical inference with SciPy functionality.
Notebook | Description |
---|---|
scipy | SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python. It adds significant power to the interactive Python session by providing the user with high-level commands and classes for manipulating and visualizing data. |
effect-size | Explore statistics that quantify effect size by analyzing the difference in height between men and women. Uses data from the Behavioral Risk Factor Surveillance System (BRFSS) to estimate the mean and standard deviation of height for adult women and men in the United States. |
sampling | Explore random sampling by analyzing the average weight of men and women in the United States using BRFSS data. |
hypothesis | Explore hypothesis testing by analyzing the difference of first-born babies compared with others. |
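
As a rough illustration of the effect-size idea in the table above, the sketch below computes Cohen's d, one common effect-size statistic, using only the Python 3 standard library. The function name `cohens_d` and the toy numbers are made up for illustration; they are not BRFSS data, and the effect-size notebook may use a different statistic or implementation.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Return Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


# Toy values for illustration only, NOT real survey data.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

Dividing by the pooled standard deviation expresses the gap between the two groups in units of their spread, which makes the result comparable across measurements with different scales.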
pandas
IPython Notebook(s) demonstrating pandas functionality.
Notebook | Description |
---|---|
pandas | Software library written for data manipulation and analysis in Python. Offers data structures and operations for manipulating numerical tables and time series. |
github-data-wrangling | Learn how to load, clean, merge, and feature engineer by analyzing GitHub data from the Viz repo. |
Introduction-to-Pandas | Introduction to Pandas. |
Introducing-Pandas-Objects | Learn about Pandas objects. |
Data Indexing and Selection | Learn about data indexing and selection in Pandas. |
Operations-in-Pandas | Learn about operating on data in Pandas. |
Missing-Values | Learn about handling missing data in Pandas. |
Hierarchical-Indexing | Learn about hierarchical indexing in Pandas. |
Concat-And-Append | Learn about combining datasets: concat and append in Pandas. |
Merge-and-Join | Learn about combining datasets: merge and join in Pandas. |
Aggregation-and-Grouping | Learn about aggregation and grouping in Pandas. |
Pivot-Tables | Learn about pivot tables in Pandas. |
Working-With-Strings | Learn about vectorized string operations in Pandas. |
Working-with-Time-Series | Learn about working with time series in Pandas. |
Performance-Eval-and-Query | Learn about high-performance Pandas: eval() and query(). |
matplotlib
IPython Notebook(s) demonstrating matplotlib functionality.
Notebook | Description |
---|---|
matplotlib | Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. |
matplotlib-applied | Apply matplotlib visualizations to Kaggle competitions for exploratory data analysis. Learn how to create bar plots, histograms, subplot2grid, normalized plots, scatter plots, subplots, and kernel density estimation plots. |
Introduction-To-Matplotlib | Introduction to Matplotlib. |
Simple-Line-Plots | Learn about simple line plots in Matplotlib. |
Simple-Scatter-Plots | Learn about simple scatter plots in Matplotlib. |
Errorbars.ipynb | Learn about visualizing errors in Matplotlib. |
Density-and-Contour-Plots | Learn about density and contour plots in Matplotlib. |
Histograms-and-Binnings | Learn about histograms, binnings, and density in Matplotlib. |
Customizing-Legends | Learn about customizing plot legends in Matplotlib. |
Customizing-Colorbars | Learn about customizing colorbars in Matplotlib. |
Multiple-Subplots | Learn about multiple subplots in Matplotlib. |
Text-and-Annotation | Learn about text and annotation in Matplotlib. |
Customizing-Ticks | Learn about customizing ticks in Matplotlib. |
Settings-and-Stylesheets | Learn about customizing Matplotlib: configurations and stylesheets. |
Three-Dimensional-Plotting | Learn about three-dimensional plotting in Matplotlib. |
Geographic-Data-With-Basemap | Learn about geographic data with basemap in Matplotlib. |
Visualization-With-Seaborn | Learn about visualization with Seaborn. |
numpy
IPython Notebook(s) demonstrating NumPy functionality.
Notebook | Description |
---|---|
numpy | Adds Python support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays. |
Introduction-to-NumPy | Introduction to NumPy. |
Understanding-Data-Types | Learn about data types in Python. |
The-Basics-Of-NumPy-Arrays | Learn about the basics of NumPy arrays. |
Computation-on-arrays-ufuncs | Learn about computations on NumPy arrays: universal functions. |
Computation-on-arrays-aggregates | Learn about aggregations: min, max, and everything in between in NumPy. |
Computation-on-arrays-broadcasting | Learn about computation on arrays: broadcasting in NumPy. |
Boolean-Arrays-and-Masks | Learn about comparisons, masks, and boolean logic in NumPy. |
Fancy-Indexing | Learn about fancy indexing in NumPy. |
Sorting | Learn about sorting arrays in NumPy. |
Structured-Data-NumPy | Learn about structured data: NumPy's structured arrays. |
python-data
IPython Notebook(s) demonstrating Python functionality geared towards data analysis.
Notebook | Description |
---|---|
data structures | Learn Python basics with tuples, lists, dicts, sets. |
data structure utilities | Learn Python operations such as slice, range, xrange, bisect, sort, sorted, reversed, enumerate, zip, list comprehensions. |
functions | Learn about more advanced Python features: Functions as objects, lambda functions, closures, *args, **kwargs, currying, generators, generator expressions, itertools. |
datetime | Learn how to work with Python dates and times: datetime, strftime, strptime, timedelta. |
logging | Learn about Python logging with RotatingFileHandler and TimedRotatingFileHandler. |
pdb | Learn how to debug in Python with the interactive source code debugger. |
unit tests | Learn how to test in Python with Nose unit tests. |
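
A few of the features named in the table above (closures, generators, `itertools`, `bisect`, `enumerate`, `zip`) can be sketched in a handful of lines. This is a minimal illustration rather than code from the notebooks: the helper names `make_counter` and `squares` are hypothetical, and the snippet assumes Python 3 (it uses `nonlocal`), whereas the notebooks were tested with Python 2.7.x.

```python
import bisect
import itertools


def make_counter(start=0):
    """Closure: the inner function remembers and updates `count`."""
    count = start

    def increment(step=1):
        nonlocal count
        count += step
        return count

    return increment


def squares(limit):
    """Generator: lazily yields the squares of 0..limit-1."""
    for n in range(limit):
        yield n * n


counter = make_counter()
counter()
counter(step=2)  # count is now 3

# islice pulls only the first three values from the generator.
first_three = list(itertools.islice(squares(10), 3))  # [0, 1, 4]

scores = [10, 20, 30]
bisect.insort(scores, 25)  # keeps the list sorted: [10, 20, 25, 30]

# zip stops at the shorter input, so only two rows are printed.
for index, (name, score) in enumerate(zip(["a", "b"], scores)):
    print(index, name, score)
```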
kaggle-and-business-analyses
IPython Notebook(s) used in Kaggle competitions and business analyses.
Notebook | Description |
---|---|
titanic | Predict survival on the Titanic. Learn data cleaning, exploratory data analysis, and machine learning. |
churn-analysis | Predict customer churn. Exercise logistic regression, gradient boosting classifiers, support vector machines, random forests, and k-nearest-neighbors. Includes discussions of confusion matrices, ROC plots, feature importances, prediction probabilities, and calibration/discrimination. |
spark
IPython Notebook(s) demonstrating Spark and HDFS functionality.
Notebook | Description |
---|---|
spark | In-memory cluster computing framework that is up to 100 times faster for certain applications and is well suited for machine learning algorithms. |
hdfs | Reliably stores very large files across machines in a large cluster. |
mapreduce-python
IPython Notebook(s) demonstrating Hadoop MapReduce with mrjob functionality.
Notebook | Description |
---|---|
mapreduce-python | Runs MapReduce jobs in Python, executing jobs locally or on Hadoop clusters. Demonstrates Hadoop Streaming in Python code with unit test and mrjob config file to analyze Amazon S3 bucket logs on Elastic MapReduce. Disco is another Python-based alternative. |
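
For readers new to the model, the map, shuffle, and reduce phases can be sketched in plain Python. This is only a single-process illustration of the idea with a word count; it does not use mrjob or Hadoop, and the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical rather than taken from the notebook.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce step: sum the counts for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


print(word_count(["spark and hadoop", "hadoop streaming"]))
```

In a real cluster the mapper and reducer run on different machines and the framework performs the shuffle; here all three steps run in one process so the data flow is easy to follow.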
aws
IPython Notebook(s) demonstrating Amazon Web Services (AWS) and AWS tools functionality.
Also check out:
- SAWS: A Supercharged AWS command line interface (CLI).
- Awesome AWS: A curated list of libraries, open source repos, guides, blogs, and other resources.
Notebook | Description |
---|---|
boto | Official AWS SDK for Python. |
s3cmd | Interacts with S3 through the command line. |
s3distcp | Combines smaller files and aggregates them together by taking in a pattern and target file. S3DistCp can also be used to transfer large volumes of data from S3 to your Hadoop cluster. |
s3-parallel-put | Uploads multiple files to S3 in parallel. |
redshift | Acts as a fast data warehouse built on top of technology from massive parallel processing (MPP). |
kinesis | Streams data in real time with the ability to process thousands of data streams per second. |
lambda | Runs code in response to events, automatically managing compute resources. |
commands
IPython Notebook(s) demonstrating various command lines for Linux, Git, etc.
Notebook | Description |
---|---|
linux | Unix-like and mostly POSIX-compliant computer operating system. Disk usage, splitting files, grep, sed, curl, viewing running processes, terminal syntax highlighting, and Vim. |
anaconda | Distribution of the Python programming language for large-scale data processing, predictive analytics, and scientific computing, that aims to simplify package management and deployment. |
ipython notebook | Web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media into a single document. |
git | Distributed revision control system with an emphasis on speed, data integrity, and support for distributed, non-linear workflows. |
ruby | Used to interact with the AWS command line and for Jekyll, a blog framework that can be hosted on GitHub Pages. |
jekyll | Simple, blog-aware, static site generator for personal, project, or organization sites. Renders Markdown or Textile and Liquid templates, and produces a complete, static website ready to be served by Apache HTTP Server, Nginx or another web server. |
pelican | Python-based alternative to Jekyll. |
django | High-level Python Web framework that encourages rapid development and clean, pragmatic design. It can be useful to share reports/analyses and for blogging. Lighter-weight alternatives include Pyramid, Flask, Tornado, and Bottle. |
misc
IPython Notebook(s) demonstrating miscellaneous functionality.
Notebook | Description |
---|---|
regex | Regular expression cheat sheet useful in data wrangling. |
algorithmia | Algorithmia is a marketplace for algorithms. This notebook showcases four different algorithms: Face Detection, Content Summarizer, Latent Dirichlet Allocation and Optical Character Recognition. |
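
To give a flavor of how regular expressions help with data wrangling, the sketch below splits a text record into named fields and collapses repeated whitespace using Python's `re` module. The record layout (`<date> <LEVEL> <message>`) and the names `LINE_PATTERN` and `parse_line` are assumptions made up for this example; they do not come from the regex notebook.

```python
import re

# Hypothetical record format: "<date> <LEVEL> <message>".
LINE_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<level>[A-Z]+)\s+(?P<message>.*)"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


record = parse_line("2015-06-01 ERROR disk   full")
print(record)

# Collapse runs of whitespace into single spaces, a common cleanup step.
print(re.sub(r"\s+", " ", record["message"]))
```

Named groups keep the parsing code readable, and returning `None` for non-matching lines lets the caller decide whether to skip or log malformed records.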
notebook-installation
anaconda
Anaconda is a free distribution of the Python programming language for large-scale data processing, predictive analytics, and scientific computing that aims to simplify package management and deployment.
dev-setup
For detailed instructions, scripts, and tools to set up your development environment for data analysis, check out the dev-setup repo.
running-notebooks
To view interactive content or to modify elements within the IPython notebooks, you must first clone or download the repository, then run the notebook. More information on IPython Notebooks can be found here.
$ git clone https://github.com/donnemartin/data-science-ipython-notebooks.git
$ cd data-science-ipython-notebooks
$ jupyter notebook
Notebooks tested with Python 2.7.x.
credits
- Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney
- PyCon 2015 Scikit-learn Tutorial by Jake VanderPlas
- Python Data Science Handbook by Jake VanderPlas
- Parallel Machine Learning with scikit-learn and IPython by Olivier Grisel
- Statistical Inference Using Computational Methods in Python by Allen Downey
- TensorFlow Examples by Aymeric Damien
- TensorFlow Tutorials by Parag K Mital
- TensorFlow Tutorials by Nathan Lintz
- TensorFlow Tutorials by Alexander R Johansen
- TensorFlow Book by Nishant Shukla
- Summer School 2015 by mila-udem
- Keras tutorials by Valerio Maggio
- Kaggle
- Yhat Blog
contributing
Contributions are welcome! For bug reports or requests please submit an issue.
contact-info
Feel free to contact me to discuss any issues, questions, or comments.
- Email: donne.martin@gmail.com
- Twitter: @donne_martin
- GitHub: donnemartin
- LinkedIn: donnemartin
- Website: donnemartin.com
license
This repository contains a variety of content; some developed by Donne Martin, and some from third-parties. The third-party content is distributed under the license provided by those parties.
The content developed by Donne Martin is distributed under the following license:
I am providing code and resources in this repository to you under an open source license. Because this is my personal repository, the license you receive to my code and resources is from me and not my employer (Facebook).
Copyright 2015 Donne Martin

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.