PY301: Python for Data Analysis

What you will learn in this course:

Part I: Python Language Essentials
The Python Interpreter
Python 2 and Python 3
The Basics
Language Semantics
Scalar Types
Control Flow
Data Structures and Sequences
Tuple
List
Built-in Sequence Functions
Dict
Set
List, Set, and Dict Comprehensions
Functions
Namespaces, Scope, and Local Functions
Returning Multiple Values
Functions Are Objects
Anonymous (lambda) Functions
Closures: Functions that Return Functions
Extended Call Syntax with *args, **kwargs
Currying: Partial Argument Application
Generators
Files and the operating system

Part II:
Chapter 1. Preliminaries
Why Python for Data Analysis?
Python as Glue
Solving the “Two-Language” Problem
Why Not Python?
Essential Python Libraries
NumPy
pandas
matplotlib
IPython and Jupyter
SciPy
scikit-learn
statsmodels
Installation and Setup

Chapter 2. Introductory Examples
1.usa.gov data from bit.ly
Counting Time Zones in Pure Python
Counting Time Zones with pandas
MovieLens 1M Data Set
Measuring rating disagreement
US Baby Names 1880-2010
Analyzing Naming Trends
Measuring the increase in naming diversity
The “Last letter” Revolution
Boy names that became girl names (and vice versa)
Conclusions and The Path Ahead

Chapter 3. IPython and Jupyter: Interactive Computing and Notebooks
IPython Basics
Running the IPython Shell
Running the Jupyter Notebook
Tab Completion
Introspection
The %run Command
Executing Code from the Clipboard
Terminal Keyboard Shortcuts
Exceptions and Tracebacks
About Magic Commands
Matplotlib Integration
Using the Command History
Searching and Reusing the Command History
Interacting with the Operating System
Software Development Tools
Interactive Debugger
Timing Code: %time and %timeit
Basic Profiling: %prun and %run -p
Profiling a Function Line-by-Line
Tips for Productive Code Development Using IPython
Reloading Module Dependencies
Code Design Tips
Keep relevant objects and data alive
Flat is better than nested
Overcome a fear of longer files
Advanced IPython Features
Making Your Own Classes IPython-friendly
Profiles and Configuration

Chapter 4. NumPy Basics: Arrays and Vectorized Computation
The NumPy ndarray: A Multidimensional Array Object
Creating ndarrays
Data Types for ndarrays
Operations between Arrays and Scalars
Boolean Indexing
Fancy Indexing
Transposing Arrays and Swapping Axes
Universal Functions: Fast Element-wise Array Functions
Loop-free programming with arrays
Expressing Conditional Logic as Array Operations
Mathematical and Statistical Methods
Methods for Boolean Arrays
Sorting
Unique and Other Set Logic
File Input and Output with Arrays
Storing Arrays on Disk in Binary Format
Saving and Loading Text Files
Linear Algebra
Pseudorandom Number Generation
Example: Random Walks
Simulating Many Random Walks at Once

Chapter 5. Getting Started with pandas
Introduction to pandas Data Structures
Series
DataFrame
Index Objects
Essential Functionality
Reindexing
Dropping entries from an axis
Indexing, selection, and filtering
Arithmetic and data alignment
Function application and mapping
Sorting and ranking
Axis indexes with duplicate values
Summarizing and Computing Descriptive Statistics
Handling Missing Data
Hierarchical Indexing

Duration: Est. 21 Hours