Web Requests and Scrapping

HTTP Status Codes

200 is success
404 is not found
1XX is information
2XX is success
3XX is redirect and other annoynaces
4XX and 5XX are failures

Curl

curl https://example.com

Python

import requests
r = requests.get('https://example.com/')
print(r.text) # HTML as string
print(r.status_code) # <int>

Beautiful Soup

Docs for Beautiful Soup 4

from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')