In this repository, I scrape data from a dummy website, since many real websites restrict scraping. The scraping is done with Requests and BeautifulSoup.
- Check a website's Terms and Conditions before scraping it, and read its statements about legal use of the data.
- Do not request data from the website too aggressively, and ensure that your program behaves in a reasonable manner.
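One simple way to keep requests reasonable is to throttle them so that consecutive calls are at least a fixed interval apart. The sketch below is illustrative, and the delay value is a hypothetical choice, not a rule from any particular site:

```python
import time

MIN_DELAY = 0.2  # seconds between requests; a hypothetical, site-friendly value

_last_request = 0.0

def throttle():
    """Sleep just long enough so consecutive calls are at least MIN_DELAY apart."""
    global _last_request
    wait = MIN_DELAY - (time.monotonic() - _last_request)
    if wait > 0:
        time.sleep(wait)
    _last_request = time.monotonic()
```

You would then call `throttle()` immediately before each `requests.get(...)` in your scraping loop.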
- You can download and open the Python file in your preferred editor.
- You can download and open the notebook in Jupyter Notebook or Google Colab.
- Inspect the page
- Obtain the HTML
- Choose a parser (`lxml`, `html5lib`, `html.parser`)
- Create a BeautifulSoup object
- Extract the tags we need
- Store the data in lists
- Build a DataFrame
- Export a CSV file containing all the scraped data
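The steps above can be sketched end to end as follows. This is a minimal example, not the repository's actual scraper: the HTML is inlined so it runs without a network connection, and the tag names and class names are hypothetical. For a live page you would fetch the HTML with `requests.get(base_site).text` instead:

```python
import pandas as pd
from bs4 import BeautifulSoup

# Inline HTML standing in for a downloaded page (hypothetical content).
html = """
<html><body>
  <div class="product"><h2>Widget</h2><span class="price">9.99</span></div>
  <div class="product"><h2>Gadget</h2><span class="price">19.99</span></div>
</body></html>
"""

# Create a BeautifulSoup object, using the stdlib parser (no extra install needed).
soup = BeautifulSoup(html, "html.parser")

# Extract the tags we need and store the data in lists.
names, prices = [], []
for card in soup.find_all("div", class_="product"):
    names.append(card.h2.get_text(strip=True))
    prices.append(float(card.find("span", class_="price").get_text()))

# Build a DataFrame and export it as a CSV file.
df = pd.DataFrame({"name": names, "price": prices})
df.to_csv("scraped_data.csv", index=False)
```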
Scrapers differ from one site to another. To reuse these scrapers, change the value of `base_site` to the desired URL and identify the tags to extract.
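In practice, adapting the scraper to a new site usually means changing only two pieces of configuration. The values below are purely hypothetical placeholders:

```python
# Hypothetical values: replace with the site and tags you are targeting.
base_site = "https://example.com/products"

# The tag name and attributes that wrap each record on that site.
record_tag = "div"
record_attrs = {"class": "product"}
```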
- `from bs4 import BeautifulSoup`
- `import requests`
- `import pandas as pd`