
LinkCollector

⚡ A lightning-fast web crawler that collects the links of a given host recursively and categorizes them as internal or external.

Features

  • Parallel crawling: pages are requested concurrently, so sites with many pages are crawled much faster (see the sketch after this list)
  • RESTful API: run the server in the background and integrate it into your stack
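
As a rough illustration of the parallel idea, here is a minimal sketch (not LinkCollector's actual code) that fetches several pages concurrently. It assumes tokio and reqwest are added as dependencies in Cargo.toml, and uses a couple of hard-coded seed URLs where a real crawler would discover links recursively:

// Illustrative sketch only; assumes tokio = { version = "1", features = ["full"] }
// and reqwest = "0.11" in Cargo.toml.
use tokio::task::JoinSet;

#[tokio::main]
async fn main() {
    // Hypothetical seed URLs; a real crawler would discover these recursively.
    let urls = vec![
        "https://www.rust-lang.org/".to_string(),
        "https://www.rust-lang.org/learn".to_string(),
    ];

    // Spawn one task per URL so the pages are fetched concurrently.
    let mut tasks = JoinSet::new();
    for url in urls {
        tasks.spawn(async move {
            let body = reqwest::get(url.as_str()).await?.text().await?;
            Ok::<_, reqwest::Error>((url, body.len()))
        });
    }

    // Collect results as each fetch completes.
    while let Some(joined) = tasks.join_next().await {
        if let Ok(Ok((url, bytes))) = joined {
            println!("{url}: {bytes} bytes");
        }
    }
}

Each page is fetched in its own task, so a slow response does not block the rest of the crawl.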

Getting started

To get started, you will need Rust and Cargo installed on your machine. Then follow the commands below.

To run:

cargo run

To build:

cargo build --release

After building, the binary is located at target/release/link_collector; execute it directly:
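
./target/release/link_collector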

Server address: http://0.0.0.0:4000

Available requests

  • http://0.0.0.0:4000/links?url=https://my.website.com

The url query parameter takes the seed host whose links should be collected.
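
For example, one way to call the endpoint from a terminal (assuming curl is installed):

curl "http://0.0.0.0:4000/links?url=https://my.website.com"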

Demo

After starting the server, request http://0.0.0.0:4000/links?url=https://www.rust-lang.org to collect all links from the official Rust programming language website.

Contributing

LinkCollector is open source, so if you have ideas for improvement or want to contribute, feel free to open a pull request!
