BizwikiBot

All about BizwikiBot

What is BizwikiBot?

BizwikiBot is a web crawler programmed to browse the internet in a methodical, automated manner. When visiting sites, BizwikiBot spiders pages and retrieves text as a means of providing up-to-date information about businesses to the many people who use the Bizwiki site.

This automatically spidered information is added to the existing listing content, giving the site's users additional detail.

BizwikiBot is designed to be a polite bot, following any instructions from the site's owners and using the minimum bandwidth and server resources.

What does BizwikiBot spider?

BizwikiBot is programmed to perform tasks such as confirming or updating vital contact details and retrieving and processing a short portion of text about companies.

At this stage, BizwikiBot is only spidering websites of companies in the United States and United Kingdom. This will change as international versions of the site are launched.

Please note that editors can update a company's contact details, so Bizwiki can list current contact information even if the company's website has not yet been updated. BizwikiBot will not overwrite hand-edited contact details, except where the domain has expired.

How do I make sure BizwikiBot sees my site?

If the current Bizwiki listing for your company does not have a website, please use the Report an Error form to let us know the website address. Once the address is added to the Bizwiki listing, it will be added to the list of sites for BizwikiBot to spider on its next pass.

Why has no additional information been taken from my website?

If you have a website and it is correctly listed on this website, chances are it will be spidered over the next few weeks. Please be patient - there is a lot of web to get through!

If you have waited more than a couple of weeks for additional information to appear, the site may have been unreadable when BizwikiBot tried to spider it, or it may have failed the bot's Quality Guidelines.

Common reasons a site cannot be read include content delivered only in Flash or images (bots only read text), too little text on the pages, no About page or contact details, and broken links. Remember that you can add information about a company at any time. If you think there is a problem and none of the reasons above apply, please contact us.

Why has no information been taken from my website at all?

This could be because content on your website does not meet our Quality Guidelines. A site might fail Bizwiki's Quality Guidelines if it contains adult material, vulgar language or other content inappropriate for a business directory, or if it has changed ownership, changed its address or is for sale.

Who made BizwikiBot?

We are fortunate to have some highly committed people on the Bizwiki team with years of experience in search and meta-search. These include Keith Hinde, Craig Sefton and Matt Aird, who between them have helped develop search products for Infospace, Thomson Directories, TradePage and Webcrawler metasearch.

This experience with sites that have many millions of visitors per month has honed a commitment to creating a webcrawler that gathers as much relevant information as possible while politely following webmasters' rules and using a bare minimum of server resources.

Managing BizwikiBot

How do I request that BizwikiBot not crawl parts or all of my site?

BizwikiBot, like most web-crawling robots, is designed to follow commands website owners write in a file called 'robots.txt'. This is a standard document that can tell BizwikiBot not to download some or all information from your web server.

BizwikiBot will obey any commands you make for all bots, as well as any specifically aimed at it.

For instance, to stop BizwikiBot spidering a folder called /private/ add this to the robots.txt file:
User-agent: BizwikiBot
Disallow: /private/

To stop all bots (including BizwikiBot) spidering a folder called /private/ add this to the robots.txt file:
User-agent: *
Disallow: /private/

To stop BizwikiBot spidering your entire site (this is generally not recommended) add this to the robots.txt file:
User-agent: BizwikiBot
Disallow: /

To stop all bots (including BizwikiBot) spidering your entire site (this is generally not recommended) add this to the robots.txt file:
User-agent: *
Disallow: /
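
If you would like to check a robots.txt rule before publishing it, the rules above can be tried out locally. The following is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the www.example.com URLs are illustrative assumptions, and Python's parser is not necessarily the one BizwikiBot uses, so treat the result as a sanity check rather than a guarantee of the bot's behaviour.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    is_allowed is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    # parse() takes the file contents as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Rule aimed only at BizwikiBot: keep it out of /private/.
BLOCK_PRIVATE_FOR_BIZWIKIBOT = """\
User-agent: BizwikiBot
Disallow: /private/
"""

# Rule aimed at all bots: keep every bot out of the entire site.
BLOCK_SITE_FOR_ALL_BOTS = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    site = "https://www.example.com"  # placeholder domain, not a real listing

    checks = [
        (BLOCK_PRIVATE_FOR_BIZWIKIBOT, "BizwikiBot", site + "/private/page.html"),
        (BLOCK_PRIVATE_FOR_BIZWIKIBOT, "BizwikiBot", site + "/about.html"),
        (BLOCK_SITE_FOR_ALL_BOTS, "BizwikiBot", site + "/about.html"),
    ]
    for rules, agent, url in checks:
        verdict = "allowed" if is_allowed(rules, agent, url) else "blocked"
        print(f"{agent} -> {url}: {verdict}")
```

Running the script prints one line per check, showing whether each URL would be allowed or blocked for the named user agent under the given rules.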

For further information on creating or editing a robots.txt file, please