Search Engine Scraper
Harvesting URLs from Search Engines
Search Engine Harvester
- Comes bundled with ScrapeBox – a great tool to have
- By far the fastest scraper available
- Runs multiple connections at the same time – up to 3,000 if you want
- Finds URLs on 30+ search engines
- Proxy Support – and you can even refresh them while you’re harvesting
- Is highly trainable – so you can get it to scrape any search engine you like
- Will harvest results from any search query you throw at it
- You can even use a custom user agent to scrape low-bandwidth mobile search engines
- Get detailed harvester stats while you’re working
- Keyword stats too
When we released ScrapeBox v2.0 we hit a major milestone: we built the fastest search engine scraper ever made. The speeds are pretty staggering, at over 1 million URLs per minute.
ScrapeBox comes with a custom search engine scraper that can be trained to scrape virtually any website that has a search function. So whether you’re after a load of URLs from a WordPress blog or a major search engine like Google, Bing or Yahoo, this is the tool for you.
We’ve got about 30 search engines already set up and ready to go. This means you can get started with your keyword scraping in no time: just add your keywords and start harvesting. And if you need a bit more help, we’ve also included a Keyword Scraper to get you started.
Some of the search engines we’ve included are Lycos, Ask.com, Rambler, AltaVista, Mojeek, Blekko, Excite, HotBot, IXQuick, DogPile and Blingo, plus some ISP-specific search engines like Charter, Verizon, Comcast and Orange.co.uk. There’s even a YouTube scraper to get at all those YouTube video URLs and an Alexa Topsites scraper to get domains with the highest traffic rankings.
ScrapeBox is also super flexible – you can run thousands of connections at once if you’ve got the internet speed to back it up. Or if you’re on a slower connection, you can dial it right back to conserve bandwidth.
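To give a feel for what a connection limit does, here is a minimal Python sketch of a harvester that never has more than a set number of requests in flight. It is not ScrapeBox's own code: the names `harvest`, `fake_fetch` and `max_connections` are made up for illustration, and `fake_fetch` stands in for a real HTTP request so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def harvest(keywords, fetch, max_connections=10):
    """Call fetch(keyword) for every keyword, with at most
    max_connections calls running at the same time."""
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # pool.map keeps results in the same order as the keywords
        results = list(pool.map(fetch, keywords))
    return dict(zip(keywords, results))


def fake_fetch(keyword):
    # Stand-in for a real search request (assumption): returns made-up URLs
    return [f"http://example.com/{keyword}/{i}" for i in range(2)]


# Dial max_connections up on a fast line, or down to save bandwidth
harvested = harvest(["seo", "blogs"], fake_fetch, max_connections=2)
```

Raising `max_connections` lets more requests run side by side, while lowering it trades speed for less bandwidth use.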
You can also configure all sorts of options – like proxy retries, removing dead proxies while harvesting, and refreshing proxies while you’re at it.
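The idea behind proxy retries, dead proxy removal and mid-harvest refreshing can be sketched roughly like this. Again, this is only an illustration under assumptions, not how ScrapeBox is implemented: `ProxyPool`, `fetch_with_retries` and the proxy addresses are hypothetical, and `fake_fetch` simulates a failing proxy instead of touching the network.

```python
class ProxyPool:
    """Hypothetical helper: tracks live proxies and drops ones that keep failing."""

    def __init__(self, proxies, max_failures=2):
        self.max_failures = max_failures
        self.failures = {proxy: 0 for proxy in proxies}

    def alive(self):
        return list(self.failures)

    def report_failure(self, proxy):
        if proxy not in self.failures:
            return
        self.failures[proxy] += 1
        # Remove the proxy once it has failed too many times
        if self.failures[proxy] >= self.max_failures:
            del self.failures[proxy]

    def refresh(self, proxies):
        # Add newly found proxies without resetting the existing ones
        for proxy in proxies:
            self.failures.setdefault(proxy, 0)


def fetch_with_retries(query, pool, fetch, retries=3):
    """Try the query up to `retries` times, using the first live proxy each time."""
    for _ in range(retries):
        live = pool.alive()
        if not live:
            break
        proxy = live[0]
        try:
            return fetch(query, proxy)
        except ConnectionError:
            pool.report_failure(proxy)
    return None


def fake_fetch(query, proxy):
    # Stand-in for a real proxied request (assumption)
    if proxy.startswith("bad"):
        raise ConnectionError(f"{proxy} is down")
    return [f"http://example.com/{query}"]


pool = ProxyPool(["bad:8080", "good:8080"], max_failures=1)
urls = fetch_with_retries("widgets", pool, fake_fetch)
pool.refresh(["new:8080"])  # top up the pool while harvesting
```

Here the first attempt fails on the bad proxy, which is then dropped, and the retry succeeds through the good one.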
Trainable Harvester
The Harvester is totally trainable
This means you can take the 30 search engines we’ve included and add your own custom search engines to the mix. So if you need to scrape a WordPress site with a search box, or a country-based search engine, or even a completely custom search engine – this is the tool for you.
Training new engines is pretty straightforward – many people have worked out how to do it just by looking at how our included search engines are set up. We’ve got a tutorial video to help you out, or our support staff are always happy to lend a hand.
You can even export engine files to share with your mates or work colleagues who use ScrapeBox.
For the power users among you, there are even more advanced options. For each engine you can customise all the header data ScrapeBox sends with each request. You can change the user agent to use low-bandwidth mobile search engines. You can set custom cookies, clear cookies before each request, follow redirects… the list goes on.
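As a rough picture of what per-engine request settings amount to, the sketch below builds (but does not send) a request with a custom user agent, extra headers and a cookie, using Python's standard `urllib`. The engine dictionary, its keys, the URL and the user agent string are all assumptions made up for this example; they are not ScrapeBox's engine file format.

```python
from urllib.parse import quote_plus
from urllib.request import Request


def build_request(engine, keyword):
    """Build a search request from a hypothetical engine definition."""
    url = engine["url_template"].format(query=quote_plus(keyword))
    headers = dict(engine.get("headers", {}))
    headers["User-Agent"] = engine["user_agent"]
    cookies = engine.get("cookies", {})
    if cookies:
        # Cookies travel as a single "name=value; name=value" header
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    return Request(url, headers=headers)


# Made-up engine definition for a low-bandwidth mobile search page
mobile_engine = {
    "url_template": "https://search.example.com/m?q={query}",
    "user_agent": "ExampleMobile/1.0",
    "headers": {"Accept-Language": "en"},
    "cookies": {"lang": "en"},
}

req = build_request(mobile_engine, "blue widgets")
```

Swapping the `user_agent` value is all it takes in this sketch to present the request as coming from a mobile browser.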
Harvester Stats
We love stats – and that’s why we’ve included detailed statistics when you’re harvesting. We know not everyone needs to scrape millions of URLs – some people just need a bit of granular data.
So we’ve included harvester statistics to help you log how many results were obtained for every keyword in every search engine. And to make life even easier, the harvester can save the keyword with each harvested URL so you can easily work out which keywords produced which results.
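If you want to picture the kind of bookkeeping this involves, here is a small Python sketch that counts results per engine and keyword and keeps the keyword alongside each URL. The function name `record_results`, the data layout and the sample URLs are assumptions for illustration only, not ScrapeBox's actual stats or export format.

```python
from collections import Counter

stats = Counter()   # (engine, keyword) -> number of results
tagged_urls = []    # (url, keyword) pairs, so each URL remembers its keyword


def record_results(engine, keyword, urls):
    """Log how many URLs a keyword produced on an engine and tag each URL."""
    stats[(engine, keyword)] += len(urls)
    tagged_urls.extend((url, keyword) for url in urls)


# Made-up sample results
record_results("Bing", "blue widgets", ["http://example.com/a", "http://example.com/b"])
record_results("Lycos", "blue widgets", ["http://example.com/c"])
record_results("Bing", "red widgets", [])

for (engine, keyword), count in sorted(stats.items()):
    print(f"{engine:6} {keyword:14} {count}")
```

Keywords that return nothing still show up with a count of zero, which makes it easy to spot the ones not worth harvesting again.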




