Requests and BeautifulSoup

Author

A demo by Breanna E. Green // Powered by Quarto

Requests

Simply put, Requests allows you to “ask” a webpage for its information/data by sending it an HTTP request.

This can be as easy (or as difficult) as you want to make it, but it is one of the O.G. options for pulling information from websites.
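As a minimal sketch of what "asking" looks like in code, the example below uses the `requests` library. The helper names `build_request` and `fetch_html` and the `example.com` URL are placeholders I made up for illustration, not part of the original demo. The first helper only prepares a request so you can inspect the final URL without touching the network; the second actually sends it.

```python
import requests


def build_request(url, params=None):
    """Prepare (but do not send) a GET request so the final URL can be inspected."""
    return requests.Request("GET", url, params=params).prepare()


def fetch_html(url, timeout=10):
    """Send a GET request and return the page's HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an error for 4xx/5xx responses
    return response.text


# Placeholder URL and query parameter, used purely for illustration.
prepared = build_request("https://example.com/search", params={"q": "soup"})
print(prepared.method, prepared.url)
```

Calling `fetch_html("https://example.com")` would return that page's HTML as a string, assuming you have network access and the site allows the request.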

BeautifulSoup

BeautifulSoup takes the HTML you get back from a request and makes parsing it agonizing… but then Beautiful.

No really, the setup for it can have a steep learning curve (well, for me at least). Once you figure it out and get it working the way you like, you can scrape for days!
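Here is a small sketch of what BeautifulSoup does once you have some HTML in hand. The HTML string is a made-up stand-in for a page you would normally fetch with Requests, and the class name `intro` and the link paths are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the text of a real response.
html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <h1>Hello</h1>
    <p class="intro">First paragraph.</p>
    <a href="/one">One</a>
    <a href="/two">Two</a>
  </body>
</html>
"""

# "html.parser" is Python's built-in parser, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                           # text inside <title>
intro = soup.find("p", class_="intro").get_text()   # first <p> with class "intro"
links = [a["href"] for a in soup.find_all("a")]     # href of every <a> tag

print(title, intro, links)
```

In a real scrape you would pass the text returned by Requests to `BeautifulSoup` in place of the hard-coded string.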

Need to Know

Let’s be clear: not all webpages WANT to be scraped, which can make these processes quite time-consuming up front. There are certain things that you will need to get well acquainted with, such as HTML tags:

HTML TAGS

HTML Tag              Explanation
<!DOCTYPE>            Defines document type
<html>                Defines HTML document
<head>                Main information about document
<title>               Title for document
<body>                Document body
<h1> to <h6>          Headings
<p>                   Paragraph
<br>                  Line break
<!--comment here-->   Comment
<img>                 Image
<a>                   Hyperlink
<ul>                  Unordered list
<ol>                  Ordered list
<li>                  List item
<style>               Style information for a document
<div>                 Section in a document (block-level)
<span>                Inline section in a document

However, that does not mean scraping is impossible! And I think it’s good practice for learning new skills, making solid decisions about what you are actually studying, etc.
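To make the tags in the table a bit more concrete, here is a rough sketch of how a few of them (`<div>`, `<h2>`, `<ul>`, `<li>`, `<span>`) show up when you search a page with BeautifulSoup. The HTML snippet and the `id="menu"` attribute are invented for this example.

```python
from bs4 import BeautifulSoup

# Invented snippet that uses several tags from the table above.
page = """
<div id="menu">
  <h2>Soups</h2>
  <ul>
    <li>Tomato</li>
    <li>Miso <span>(vegan)</span></li>
  </ul>
</div>
"""

soup = BeautifulSoup(page, "html.parser")

menu = soup.find("div", id="menu")                  # the <div> section
heading = menu.find("h2").get_text()                # heading text
# Join the pieces of text inside each <li> with a space and trim whitespace.
items = [li.get_text(" ", strip=True) for li in menu.find_all("li")]
notes = [span.get_text() for span in menu.find_all("span")]  # inline <span> text

print(heading, items, notes)
```

Knowing which tag wraps the data you want is usually the first step; inspecting the page source in your browser is one way to find it.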
