culturebot

Culturebot is the generic name for Blacksearch’s web crawler, which mainly crawls and indexes black-related websites so that users can quickly find black-related information. The name covers two types of crawler: a desktop crawler that simulates a user on a desktop computer, and a mobile crawler that simulates a user on a mobile device.

Your website will probably be crawled by both culturebot Desktop and culturebot Smartphone. You can identify the subtype of culturebot by looking at the user agent string in the request. However, both crawler types obey the same product token (user agent token) in robots.txt, so you cannot selectively target either culturebot Smartphone or culturebot Desktop using robots.txt.
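Because both subtypes share one product token, a single robots.txt group governs them together. The following is a minimal sketch using Python's standard urllib.robotparser module; the robots.txt content, the /private/ path, and the helper name culturebot_may_fetch are illustrative assumptions, not rules published by Blacksearch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group addressed to the shared product token.
# There is no separate token for the desktop or the smartphone crawler.
ROBOTS_TXT = """\
User-agent: culturebot
Disallow: /private/

User-agent: *
Disallow:
"""


def culturebot_may_fetch(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules allow the culturebot token to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # The product token, not the full user agent string, is what is matched.
    return parser.can_fetch("culturebot", path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html"):
        print(path, culturebot_may_fetch(path))
```

Whatever this check returns for a path applies equally to the desktop and the smartphone crawler, since robots.txt sees only the one token.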

How culturebot accesses your site

For most sites, culturebot shouldn’t access your site more than once every few seconds on average. However, due to delays it’s possible that the rate will appear to be slightly higher over short periods.

culturebot was designed to run simultaneously on thousands of machines to improve performance and to scale as the web grows. To cut down on bandwidth usage, we also run many crawlers on machines located near the sites that they might crawl. As a result, your logs may show visits from several machines at blacksearch.net, all with the user agent culturebot. Our goal is to crawl as many pages from your site as we can on each visit without overwhelming your server’s bandwidth. If your site is having trouble keeping up with culturebot’s crawling requests, you can request a change in the crawl rate.

Generally, culturebot crawls over HTTP/1.1. However, since November 2020, culturebot may use HTTP/2 to crawl sites that support it and may benefit from it. This may save computing resources (for example, CPU and RAM) for both the site and culturebot, but otherwise it doesn’t affect the indexing or ranking of your site.

To opt out of crawling over HTTP/2, instruct the server that’s hosting your site to respond with a 421 HTTP status code when culturebot attempts to crawl your site over HTTP/2. If that’s not feasible, you can send a message to the culturebot team, although this solution is only temporary.
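How you return the 421 depends on your server software. As one possible illustration, the sketch below assumes a Python application served through an ASGI server that supports HTTP/2; the middleware name RefuseHttp2Crawl and the helper is_culturebot_over_http2 are hypothetical, and matching on the substring "culturebot" in the User-Agent header is an assumption about how the crawler identifies itself.

```python
from typing import Any, Awaitable, Callable, Dict

Scope = Dict[str, Any]
Send = Callable[[Dict[str, Any]], Awaitable[None]]

CRAWLER_TOKEN = b"culturebot"  # product token named in this document


def is_culturebot_over_http2(scope: Scope) -> bool:
    """Return True for an HTTP/2 request whose User-Agent mentions the crawler."""
    if scope.get("type") != "http" or scope.get("http_version", "1.1") != "2":
        return False
    # ASGI delivers headers as (name, value) byte pairs with lowercased names.
    for name, value in scope.get("headers", []):
        if name == b"user-agent" and CRAWLER_TOKEN in value.lower():
            return True
    return False


class RefuseHttp2Crawl:
    """Hypothetical ASGI middleware: answer 421 to the crawler on HTTP/2 only."""

    def __init__(self, app: Callable[..., Awaitable[None]]) -> None:
        self.app = app

    async def __call__(self, scope: Scope, receive: Any, send: Send) -> None:
        if is_culturebot_over_http2(scope):
            await send({
                "type": "http.response.start",
                "status": 421,  # Misdirected Request
                "headers": [(b"content-length", b"0")],
            })
            await send({"type": "http.response.body", "body": b""})
            return
        # Everything else (HTTP/1.1 crawls, ordinary visitors) passes through.
        await self.app(scope, receive, send)
```

Wrapping an existing application as RefuseHttp2Crawl(app) leaves HTTP/1.1 requests untouched, so culturebot can still crawl the site over HTTP/1.1.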

Blocking culturebot from visiting your site

It’s almost impossible to keep a web server secret by not publishing links to it. For example, as soon as someone follows a link from your “secret” server to another web server, your “secret” URL may appear in the HTTP Referer header and can be stored and published by the other web server in its referrer log. Similarly, the web has many outdated and broken links. Whenever someone publishes an incorrect link to your site or fails to update links to reflect changes on your server, culturebot will try to crawl the incorrect URL on your site.

If you want to prevent culturebot from crawling content on your site, you have a number of options. Be aware of the difference between preventing culturebot from crawling a page, preventing culturebot from indexing a page, and preventing a page from being accessible at all to both crawlers and users.

Verifying culturebot

Before you decide to block culturebot, be aware that the user-agent string used by culturebot is often spoofed by other crawlers. It’s important to verify that a problematic request actually comes from Blacksearch. The best way to verify that a request actually comes from culturebot is to use a reverse DNS lookup on the source IP of the request.
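A reverse DNS check can be sketched as follows. The code assumes that genuine crawler hosts resolve to names under blacksearch.net (the domain mentioned earlier for crawler machines), and it adds a forward lookup to confirm that the name maps back to the same IP, which is common practice rather than something this page specifies. The function names and the injectable resolver parameters are illustrative.

```python
import socket
from typing import Callable, List

# Assumption: crawler hosts have names ending in this suffix.
TRUSTED_SUFFIX = ".blacksearch.net"


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address to primary host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Forward DNS: host name to its IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_culturebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Return True only if reverse and forward DNS agree on a trusted host."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record or resolver failure
        return False
    if not hostname.endswith(TRUSTED_SUFFIX):
        return False
    try:
        # Forward-confirm, since the owner of an IP controls its PTR record.
        return ip in forward(hostname)
    except OSError:
        return False
```

Calling is_culturebot with the source IP from a log entry performs live DNS queries; the resolver parameters exist so the logic can be exercised without network access. Note that forward_lookup as written handles IPv4 only.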

culturebot and all respectable search engine bots will respect the directives in robots.txt, but some nogoodniks and spammers do not. Blacksearch actively fights spammers; if you notice spam pages or sites in Blacksearch Search results, you can report spam to Blacksearch.