The Wayback Machine - https://web.archive.org/web/20210901183917/https://github.com/topics/website-scraper
Here are 59 public repositories matching this topic...
Download website to local directory (including all css, images, js, etc.)
Updated
Jun 30, 2021
JavaScript
A new web development methodology for JavaScript & C# developers. A super fast and very easy to use CMS.
🕵️♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Updated
Jun 22, 2021
TypeScript
Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.
Plugin for website-scraper which returns html for dynamic websites using puppeteer
Updated
Jun 10, 2021
JavaScript
A server to collect & archive websites, also supports video downloads
Updated
Sep 1, 2021
TypeScript
Plugin for website-scraper which returns html for dynamic websites using PhantomJS.
Updated
Mar 25, 2021
JavaScript
Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.
🕸 Builds and serves RSS feeds via HTTP. Generate your own feeds or start instantly with the included configs.
Updated
Aug 26, 2021
Ruby
A spider to crawl webpages
Updated
Feb 16, 2020
Python
Now you can keep track of your followers from YouTube, Instagram and Twitter accounts - Followers scraper API on AWS serverless
Updated
Jun 5, 2021
TypeScript
JSON collection of scraped file extensions, along with their description and type, from FileInfo.com
Updated
Jul 17, 2021
Python
Bandwidth efficient scheduled downloads
Updated
Mar 28, 2018
Shell
Scraping websites made easy! A minimalistic yet powerful tool for collecting data from websites.
Updated
Jan 3, 2019
JavaScript
Website Penetration Testing Tool With DoS Attack Feature
Updated
Sep 5, 2020
Python
Alexa Bulk Website Rank Checker PHP Script 2020 Latest! You can grab the website rankings of 200+ URLs at once!
Article Dataset Generator for Internet News Sites. Crawls news sites, analyses them with NLP (sentiment analysis), and pushes to a database.
Updated
Oct 25, 2017
Jupyter Notebook
Download ALL the images (JPEG/GIF/PNG) from any Tumblr website! This project uses Python 3 and BeautifulSoup4 to scrape a Tumblr site (with the URL provided by the user) and download, page by page, all the images from the site's posts. Ideal for archiving other people's Tumblrs <3
Updated
Apr 9, 2018
Python
Scrapes any website to retrieve all hyperlinks from it in a matter of seconds. Scraping made easy!
Updated
Jan 10, 2018
Python
Simple library which parses web pages into objects using attributes
A Python-based website crawling script with random time intervals, User-Agent switching, and IP rotation through proxy servers, intended to evade the website's bot detection and avoid getting blocked.
Updated
Jun 2, 2021
Python
Plugin for website-scraper which allows saving resources to an existing directory
Updated
Jun 9, 2021
JavaScript
Run the following python code with a text file in the same directory containing the words for which you need the mnemonic.
Updated
Nov 20, 2017
Python
Simple program which is able to extract the video stream from online streaming sites and show it using VLC
Updated
Oct 4, 2020
Python
Python application that scrapes diverse sources for COVID-19 papers, applies NLP transformations, and stores them in a dataset for visualization in a Flask web application.
Updated
Aug 23, 2021
Jupyter Notebook
Universal Web-page Scraper for NodeJS
Updated
May 21, 2017
JavaScript
Updated
Aug 26, 2019
Python
Yet another telegram bot for flibusta library.
Updated
May 22, 2020
HTML
A website URL scraper built using Python.
Updated
Aug 16, 2021
Python
All match details of Pakistan Super League teams. All data is scraped from the www.psl-t20.com website using Python.
Updated
Nov 8, 2017
Python