Welcome to my data playground, a repository where I explore and learn to use new tools and technologies for dealing with a variety of data problems.

rtm010/data-playground

Data Playground

Welcome to my data playground, a repository where I explore and learn to use new tools and technologies for dealing with a variety of data problems. I've decided to publish my projects to this repository so that I can share what I've learned, and hopefully help others who are new to these topics.

I'll use real-world examples wherever possible, because there is often a significant gap between the toy examples in tutorials and training courses and real-world data problems.

I'll primarily use Python for my examples, but I may explore other tools and languages in the future.

Projects

1. Wrangling COVID-19 data
In this project I dive into web scraping and data wrangling techniques.

  • I'll use the Python libraries Requests and Beautiful Soup to automatically download PDFs from the WHO website containing daily COVID-19 data.
  • A second implementation uses Selenium, which is needed for dynamic websites that build their HTML only after the page loads in a browser, such as sites built with frameworks like React, Angular, and Vue.
  • Once downloaded, I'll read the table from the PDF using tabula-py and clean up the data using pandas.
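
As a rough illustration of the first step above, here is a minimal sketch of how PDF links might be collected with Beautiful Soup and then fetched with Requests. It is not the project's actual code: the function names `find_pdf_links` and `download_pdf`, the sample markup, and the `example.org` URLs are all hypothetical, and the real WHO page may be structured differently.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def find_pdf_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs for every <a> tag in `html` that points to a PDF."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        # Ignore any query string when checking the file extension.
        if href.lower().split("?")[0].endswith(".pdf"):
            # Resolve relative links against the page URL.
            links.append(urljoin(base_url, href))
    return links


def download_pdf(url: str, dest_path: str) -> None:
    """Download a single PDF and write it to `dest_path`."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    with open(dest_path, "wb") as fh:
        fh.write(response.content)


# Hypothetical markup standing in for a downloaded page (not the real WHO site).
SAMPLE_HTML = """
<ul>
  <li><a href="/docs/report-1.pdf">Report 1</a></li>
  <li><a href="/about">About</a></li>
  <li><a href="https://example.org/files/report-2.PDF?sfvrsn=1">Report 2</a></li>
</ul>
"""

pdf_links = find_pdf_links(SAMPLE_HTML, "https://example.org/reports")
print(pdf_links)
```

In practice the HTML would come from `requests.get(page_url).text`, and each resulting link would be passed to `download_pdf`. Parsing is kept separate from downloading so the link extraction can be checked without network access.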

Contact

Got any questions?
Reach out to me here or open an issue. Happy to have a chat! 😃
