# Introduction

Welcome to the course! Glad you're here :)

### Supporting The Project
* Star the repo 😎
    * Maybe share it with some people new to web scraping?
* Consider [sponsoring](https://github.com/sponsors/davidteather) me on GitHub
* Send me an email or a [LinkedIn](https://www.linkedin.com/in/davidteather/) message telling me what you enjoy in the course (and maybe what else you want to see in the future)
* Submit PRs for suggestions/issues :)

## Table Of Contents
1. [Welcome!](#welcome)
    1. [What I'm Known For](#what-im-known-for)
2. [Course Introduction](#course-introduction)
    1. [Learning Objectives](#learning-objectives)
    2. [How You Will Learn](#how-you-will-learn)
    3. [How To Learn Effectively](#how-to-learn-effectively)
    4. [Course Topics](#course-topics)
3. [Getting Started](#getting-started)
    1. [Prerequisites](#prerequisites)
    2. [Tools Required](#tools-required)

## Video For The Lesson
Consider checking out the [video](https://www.youtube.com/watch?v=KY3E-6wVOqA&list=PLmRtxHvzkEE8Ofiy4hnnXSoxw7gs4HOHt) for this introduction. It just presents the [slides](./slides.pdf) with commentary; later lessons are higher quality.

### Video Corrections
None so far

## Welcome

I'm David Teather, a software engineer specializing in data extraction.

If you'd like a more visual experience, check out the introduction video on [YouTube](https://www.youtube.com/watch?v=KY3E-6wVOqA&list=PLmRtxHvzkEE8Ofiy4hnnXSoxw7gs4HOHt), or pull up the introduction [slides](./slides.pdf).
### What I'm Known For
* [My research](https://theresponsetimes.com/yikyak-is-exposing-user-locations/) on YikYak (a social media app) that was featured in [Vice](https://www.vice.com/en/article/7kbnna/anonymous-social-media-app-yik-yak-exposed-users-precise-locations) and [The Verge](https://www.theverge.com/2022/5/13/23070696/yik-yak-anonymous-app-precise-locations-revealed)
* Creating various data extraction tools
    * My most popular is [TikTokApi](https://github.com/davidteather/TikTok-Api)
        * 600K+ Downloads
        * 2.3K+ Stars

## Course Introduction
### Learning Objectives
* Learners will understand the many different ways websites prevent web scraping
* Learners will be able to reverse engineer a real-world website for data extraction

### How You Will Learn
* Real website examples
    * These websites might change over time, which can break a lesson
* Websites I've created for this course
    * These will not change, so the lessons built on them won't break
* Each lesson will have a hands-on activity
    * In addition, most modules will have a `submission.py` file where you can write functions related to the lesson concept and run them against a test suite
    * These will primarily focus on extracting data from the websites created for this course
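
As a rough illustration of that workflow, a `submission.py` might look something like the sketch below. The function name `extract_titles`, the sample HTML, and the idea of collecting `<h2>` text are all made up for this example; each module defines its own function names and ships its own test suite.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside every <h2> tag."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.titles.append("")  # start a new title

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2:
            self.titles[-1] += data


def extract_titles(html):
    """Hypothetical lesson function: return the text of each <h2>."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


if __name__ == "__main__":
    # Hypothetical sample page, not one of the course websites.
    sample = "<h2>First</h2><p>body</p><h2> Second </h2>"
    print(extract_titles(sample))
```

A test suite would then import a function like this and compare its return value against the data the lesson website is known to serve.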

### How To Learn Effectively
* Everybody learns differently, so treat these as guidelines
* Take notes from the slides presented in the [videos](https://youtube.com/playlist?list=PLmRtxHvzkEE8Ofiy4hnnXSoxw7gs4HOHt) 
    * These will revolve around general concepts
    * Will be accompanied by programs to write
* Try the activities before watching the solution in the video
    * Treat the website folder as a black box, like you would a real website. You can figure out everything through the website itself

### Course Topics
* Forging API requests
* Proxies
* Captchas
* Storing data at scale
* Emulating human behavior
* And more 
    * Feel free to [tweet at me](https://twitter.com/david_teather) or file an issue with the `lesson-request` label with what you'd like to see

## Getting Started

Here's what you need to get started with this course!
### Prerequisites
* A basic understanding of programming
* Recommended
    * Some Python experience
        * We probably won't write much complex Python

### Tools Required
* [Docker](https://www.docker.com/)
    * And docker-compose (should be bundled)
* [Python](https://www.python.org/)
    * I'll be using 3.10
* A web browser
    * I'll be using [Brave](https://brave.com/) (Chromium-based)
    * It doesn't really matter which one you use, as long as you can view network traffic
* And the files in this git repo, so be sure to download it! (and maybe give it a star 😉)
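
If you'd like a quick sanity check of your setup, something like the sketch below could help. It is only an assumption-laden convenience script, not part of the course files: the helper name `check_tools` is hypothetical, and the 3.10 threshold simply mirrors the version I'll be using (older versions may work fine).

```python
import shutil
import sys


def check_tools(min_python=(3, 10)):
    """Return a dict of tool name -> bool for a quick environment check.

    min_python is an assumed threshold, not a hard course requirement.
    """
    return {
        # True when the running interpreter is at least min_python
        "python": sys.version_info >= min_python,
        # True when a `docker` executable is found on PATH
        "docker": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for tool, ok in check_tools().items():
        print(f"{tool}: {'ok' if ok else 'missing or too old'}")
```

This only confirms that a `docker` executable is on your `PATH`; it does not check that the Docker daemon is running or that docker-compose is available.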


Hope you'll enjoy the content in this course! You can either get started with [lesson 1](../001-introduction-to-forging-api-requests/), or check out the [course catalogue](../README.md#course-catalogue).