Scraping is a technique that allows you to extract information from websites.
In this Python web scraping tutorial, you will:
- Take your first steps in scraping (starting with requests and beautifulsoup)
- Understand the protection mechanisms against scraping and learn how to bypass them
- Learn the legal rules that apply to scraping
- Build an advanced project that will allow you to scrape any site
- Use AI (ChatGPT) to help you generate code
----------------------------------------------------------
Prerequisites:
----------------------------------------------------------
The links:
1️⃣ This video is made in partnership with Brightdata (a professional scraping solution), which offers you $15 of credit if you sign up through this link:
https://brdta.com/CodeAvecJonathan (this link does not earn me any commission; it simply lets you follow this tutorial for free)
----------------------------------------------------------
The program:
00:00:00 Introduction
00:01:50 Prerequisites
00:02:10 The program
00:02:31 PART 1 - Your first steps in scraping
00:04:18 Make an HTTP request (requests)
00:13:36 Extract information (title + description)
00:23:33 Retrieve multiple elements (ingredients)
00:27:40 Exercise: Preparation steps
00:30:22 Tips to go further (generate code with ChatGPT / practice with scrapethissite.com)
00:33:16 PART 2 - Protections against scraping
00:36:34 User-agent: masquerade as a browser
00:41:28 Issues related to JavaScript
00:44:13 Headless browsing: bypass issues related to JavaScript
00:45:42 Professional scraping solutions: IP rotation, proxies, anti-captcha...
00:48:03 PART 3 - Is scraping legal?
00:50:56 PART 4 - Advanced scraping project
00:52:23 Protected sites: Limits of the current script
00:55:51 The steps of the project
00:56:54 Creating your account on BrightData.com
00:58:18 Understanding the Web Unlocker and the Scraping Browser
00:59:37 Using the Web Unlocker
01:11:44 Using the Scraping Browser
01:17:39 Bypassing scraping mode
01:21:22 Extracting information (title)
01:26:53 Extracting information (number of reviews, price, description)
01:37:19 Multiple URLs, storing data, scheduler
01:42:53 Rephrasing content with the ChatGPT API
01:45:57 Conclusion
----------------------------------------------------------
About:
I am a passionate developer with over 19 years of professional experience. I currently work as a freelance developer specializing in iOS and Android mobile applications and web servers, and I work remotely with my clients.
On this channel, I invite you to discover programming in a different way, through my teaching approach and my professional techniques.
The goal? To help you learn programming, become a better developer, turn professional, and, why not, change your life.
Subscribe to the channel to access new videos on the following topics:
- Programming tutorials (Python, C#, .NET, ...)
- Becoming a freelance developer
- Using generative AI (ChatGPT, Midjourney...)