Python Data Analysis
.MP4, AVC, 1280x720, 30 fps | English, AAC, 2 Ch | 3h 43m | 521 MB
Instructor: Michele Vallisneri
Data science has transformed the way that government and industry leaders look at both specific problems and the world at large. Curious about how data analysis works in practice? In this course, instructor Michele Vallisneri explains what it takes to get started with data science using Python.
Michele demonstrates how to set up your analysis environment and provides a refresher on the basics of working with data structures in Python. Then, he jumps into the big stuff: the power of arrays, indexing, and tables in NumPy and pandas. He also guides you through two sample big-data projects: using NumPy to identify and visualize weather patterns and using pandas to analyze the popularity of baby names over the last century. Challenges issued along the way help you practice what you've learned. Plus, learn the skills behind the basic tasks of data analysis: importing and wrangling, summarizing and visualizing, modeling, and reasoning.
This course is integrated with GitHub Codespaces, an instant cloud developer environment that offers all the functionality of your favorite IDE without the need for any local machine setup. With GitHub Codespaces, you can get hands-on practice from any machine, at any time—all while using a tool that you’ll likely encounter in the workplace. Check out the “Set up: Using Codespaces” video to learn how to get started.
Learning objectives
- Demonstrate proficiency in using Python data structures, including tuples, lists, dictionaries, sets, and comprehensions, as well as advanced data structures like defaultdicts and data classes.
- Manipulate and analyze data effectively using NumPy, including creating arrays, indexing, performing mathematical operations, working with special arrays like records and dates, and leveraging the NumPy ecosystem.
- Utilize pandas to work with structured data, including creating and manipulating DataFrames and Series, indexing, performing mathematical and plotting operations, and conducting database operations.
- Import, wrangle, and preprocess data using pandas, including cleaning, filtering, reshaping, tidying, and simulating data, as well as linking and merging databases.
- Summarize and visualize data using various techniques, such as exploring data, summarizing quantitative and categorical data, visualizing distributions and categorical data, and comparing variables, leveraging tools like Plotly and Dash for interactive dashboards.
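
As a rough illustration of the first objective, the sketch below combines a data class, a defaultdict, and comprehensions using only the Python standard library. The NameRecord class and the sample records are made up for this example and are not taken from the course materials.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NameRecord:
    """One row of a hypothetical baby-names table (fields are assumptions)."""
    name: str
    year: int
    count: int


# Made-up sample records, for illustration only.
records = [
    NameRecord("Ada", 1990, 120),
    NameRecord("Ada", 1991, 150),
    NameRecord("Linus", 1990, 80),
    NameRecord("Linus", 1991, 95),
]

# defaultdict(int) starts every missing key at 0, so totals can be
# accumulated without first checking whether a name has been seen.
totals = defaultdict(int)
for rec in records:
    totals[rec.name] += rec.count

# Dict comprehension: keep only names whose total exceeds a threshold.
popular = {name: total for name, total in totals.items() if total > 200}

# Set comprehension: the distinct years present in the data.
years = {rec.year for rec in records}

print(dict(totals))
print(popular)
print(sorted(years))
```

The same grouping and filtering could be written with NumPy or pandas once the data grows; the standard-library version is shown here only because it needs no extra installation.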