This week you’ll get an introduction to the field of data science, review common Python functionality and features that data scientists use, and be introduced to the Coursera Jupyter Notebook used for the lectures. All of the course information on grading, prerequisites, and expectations is in the course syllabus, and you can find more information about the Jupyter Notebooks on our Course Resources page.
Basic Data Processing with Pandas
In this week of the course you’ll learn the fundamentals of one of the most important toolkits Python has for data cleaning and processing: pandas. You’ll learn how to read data into DataFrame structures, how to query these structures, and the details of how such structures are indexed.
More Data Processing with Pandas
This week you’ll deepen your understanding of the Python pandas library by learning how to merge DataFrames, generate summary tables, group data into logical pieces, and manipulate dates. We’ll also refresh your understanding of scales of data and discuss issues with creating metrics for analysis. The week ends with a more significant programming assignment.
Answering Questions with Messy Data
In this week of the course you’ll be introduced to a variety of statistical techniques such as distributions, sampling, and t-tests. The week ends with two discussions of science and the rise of the fourth paradigm: data-driven discovery.