GitHub - GabrielDan92/eMag_Altex-WebScraping

eMag & Altex WebScraping

The Python script extracts every page of a given eMag link in under 20 seconds and saves the data to an Excel output file. In the example below, the script extracted all 25 pages, with approximately 1,500 products, from the eMag mobile phones page (https://www.emag.ro/telefoane-mobile/c).
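A minimal sketch of how many pages could be fetched concurrently with the `threading` module. This is not the repository's code: `scrape_pages`, its `fetch` parameter, and the `max_threads` limit are hypothetical names chosen for illustration, and the caller supplies the function that actually downloads a page.

```python
import threading


def scrape_pages(urls, fetch, max_threads=8):
    """Fetch every URL on its own thread and return the results in input order.

    `fetch` is any callable that takes a URL and returns the page content.
    If `fetch` raises for a URL, that slot in the result list stays None.
    """
    results = [None] * len(urls)
    # Cap how many downloads run at the same time.
    limiter = threading.Semaphore(max_threads)

    def worker(index, url):
        with limiter:
            results[index] = fetch(url)

    threads = [
        threading.Thread(target=worker, args=(index, url))
        for index, url in enumerate(urls)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each worker writes to its own slot in `results`, no lock is needed around the list, and the pages come back in the same order as the input URLs regardless of which download finishes first.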

By sending a different header with each request, using https://httpbin.org/user-agent, I managed to bypass the anti-spider/anti-scraping tools eMag is currently using. Otherwise, the website would lock me out after several requests.
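One way to vary the header per request is to cycle through a pool of User-Agent strings. The sketch below assumes such a pool; the strings in `USER_AGENTS` and the helper names `next_headers` and `fetch` are illustrative and are not taken from the original script.

```python
import itertools

# Hypothetical pool of User-Agent strings; the values used by the
# original script are not shown in this README.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

_ua_cycle = itertools.cycle(USER_AGENTS)


def next_headers():
    """Return request headers carrying the next User-Agent in the pool."""
    return {"User-Agent": next(_ua_cycle)}


def fetch(url, timeout=10):
    """Download a page, sending a different User-Agent on each call."""
    import requests  # third-party; imported here so the helpers above work without it

    response = requests.get(url, headers=next_headers(), timeout=timeout)
    response.raise_for_status()
    return response.text
```

Calling `fetch("https://httpbin.org/user-agent")` a few times is a quick way to check the rotation, since that endpoint echoes back the User-Agent header it received.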

Python tools used:

Pandas
Requests
Threading
BeautifulSoup
Regular Expressions/Regex
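
As an illustration of how regular expressions and Pandas can fit together in a script like this, the sketch below turns a price string into a number and writes rows to an Excel file. It assumes prices use the Romanian format with a dot as thousands separator and a comma before the decimals (for example `1.299,99 Lei`); the helper names, the row keys, and the output path are hypothetical.

```python
import re

# Either dot-grouped thousands ("1.299") or plain digits ("899"),
# optionally followed by a comma and two decimals (",99").
PRICE_RE = re.compile(r"(\d{1,3}(?:\.\d{3})+|\d+)(?:,(\d{2}))?")


def parse_price(text):
    """Return the first price found in `text` as a float, or None if absent."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    whole = match.group(1).replace(".", "")  # drop thousands separators
    decimals = match.group(2) or "00"
    return float(f"{whole}.{decimals}")


def save_to_excel(rows, path="output.xlsx"):
    """Write a list of dicts (one per product) to an Excel file."""
    import pandas as pd  # third-party; writing .xlsx also needs openpyxl

    pd.DataFrame(rows).to_excel(path, index=False)
```

A row could look like `{"name": "Example phone", "price": parse_price("1.299,99 Lei")}`, and a list of such rows can be passed directly to `save_to_excel`.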

Repository files (15 commits):

README.md
altex.py
emag.gif
emag.py
