Public Bots

Last updated: 2026-03-25 18:14:07

Overview

Manage traffic from automated programs on the Internet, such as active search engines, data crawlers, and website monitoring tools. Crawlers that benefit your website operations can be allowed with a single click by setting them to Skip. For websites with sensitive information, you can also deny these crawlers with one click.

Public Bots Category

Public bots on the platform are grouped into the following six categories:

  • Search Engines: Automated programs that crawl webpage content from the Internet and store it in a search engine's database to provide web search services. If you want your website to be found by more people, it is recommended to set all of them to Skip.
  • Site Monitor: Automated programs that regularly visit a website to monitor its availability, performance, and security. If you use any of these monitoring services, it is recommended to set the corresponding bots to Skip.
  • Marketing Analysis: Programs that collect and analyze webpage content for market analysis, helping customers improve their visibility in advertising, content, and other channels. If you want your website content to be used for market analysis, you can set all of them to Skip.
  • Page Preview: Programs that quickly extract and present key information from a target page for subsequent processing or display.
  • Feed Fetcher: Programs that help users track and aggregate information feeds, automatically fetching the latest content from different sources and pushing updates. If you want your website's content to be picked up promptly by feed aggregation applications, you can set all of them to Skip.
  • Practical Tools: Automated programs that provide practical assistance to websites, such as saving snapshots, analytics, and optimizing page load speed. If you use any of these tools, it is recommended to set the corresponding bots to Skip.

Response Actions

You can choose to Log, Deny, or Skip public bot traffic by category. The available actions are described below:

  • Not Used: This policy does not inspect the traffic; requests are still forwarded to the other inspection modules.
  • Log: Only log the request; it is forwarded as usual.
  • Deny: Deny the request and respond with a 403 status code.
  • Skip: Log the request and skip all subsequent Bot policy detection. Other protections, such as WAF and API security, still apply.
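The semantics of the four actions can be sketched as a simple dispatch. This is an illustrative model only, not the platform's actual implementation; in particular, whether Deny also writes a log entry is an assumption here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    logged: bool           # whether the request is written to the bot log
    status: Optional[int]  # HTTP status returned to the client, or None to keep forwarding
    skip_bot_rules: bool   # whether subsequent Bot policy checks are bypassed

def apply_action(action: str) -> Decision:
    """Map a configured action to its effect on a matched request."""
    if action == "Not Used":
        # This policy performs no inspection; other modules still run.
        return Decision(logged=False, status=None, skip_bot_rules=False)
    if action == "Log":
        # Record the request, then forward it unchanged.
        return Decision(logged=True, status=None, skip_bot_rules=False)
    if action == "Deny":
        # Block with a 403 response (logging of denied requests is assumed).
        return Decision(logged=True, status=403, skip_bot_rules=False)
    if action == "Skip":
        # Log, then bypass later Bot policy checks; WAF and API security still apply.
        return Decision(logged=True, status=None, skip_bot_rules=True)
    raise ValueError(f"unknown action: {action}")
```

Note that Skip only bypasses the Bot Management pipeline itself; it does not exempt the request from the other protection modules.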

Steps

  1. Log in to the console and go to the subscribed security product page.
  2. Go to Security Settings > Policies.
  3. Select the domain for which you wish to configure the security policy to enter the Security Policy editing page.
  4. Open the Bot Management tab and enable the master switch if it is turned off.
  5. Go to Public Bots and click the Configure button on the right to open the configuration page.
  6. Set an action for each Public Bots category according to your business needs: Not Used, Skip, Log, or Deny.
  7. Click Publish Changes at the bottom to publish the configuration. Changes take effect within 1–3 minutes.

Protection Recommendations

  • If your website relies on traffic acquisition or promotion, it is recommended to set public bots to Skip to avoid blocking beneficial crawlers by mistake.
  • For websites dealing with sensitive information or limited bandwidth, it is recommended to set public bots to Deny to mitigate crawler-induced strain.
  • For more granular protection, if you need to skip or block only specific crawlers, use a Custom Bots policy and specify categories and User-Agent keywords. For details, refer to the Custom Bots section.

Inclusion Criteria

To be included in the Public Bot Library, a bot must meet all of the following criteria:

  • Verifiable Identity: The bot operator must be clearly identified, possess a verifiable and hard-to-imitate identity, and demonstrate consistent access behavior and stable network characteristics over time.
  • Traffic Volume Threshold: The bot must generate stable traffic across the network (e.g., an average of more than 1,000 verified requests per day over the last month) to ensure the inclusion is broadly useful.
  • Traceable IP Intelligence: IP data used for bot identification must have a traceable source, such as official APIs or RDNS databases.
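The RDNS check mentioned above is typically a double lookup: reverse-resolve the source IP, confirm the hostname belongs to the claimed operator, then forward-resolve that hostname and confirm it maps back to the same IP. The sketch below illustrates the idea with Python's standard socket module; the suffix list and the example Googlebot domains are illustrative, and the platform's own verification pipeline may differ.

```python
import socket

def verify_bot_ip(ip: str, allowed_suffixes: tuple) -> bool:
    """Double DNS lookup: True only if ip reverse-resolves to a hostname under
    an allowed domain AND that hostname forward-resolves back to the same ip."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) lookup
    except OSError:
        return False  # no PTR record, or lookup failed
    if not hostname.endswith(allowed_suffixes):
        return False  # PTR record does not belong to the claimed operator
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward lookup
    except OSError:
        return False
    return ip in forward_ips  # a spoofed PTR record fails this step

# e.g. verify_bot_ip("66.249.66.1", (".googlebot.com", ".google.com"))
```

The forward step is what defeats spoofing: anyone can set an arbitrary PTR record on their own IP range, but only the real operator controls the forward zone of its bot domains.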

Policy Violations

A bot will be removed from the Public Bot Library if it no longer meets the criteria above. Examples of policy violations include:

  • Abusive Behavior: Failing to comply with reasonable rate limits, or frequently triggering WAF rules or high-risk risk-control rules.
  • Mismatched or Spoofed Identity: The User-Agent claims to be a well-known bot, but its IP address, RDNS, or official source information does not match the declared identity.

Public Bot Library Updates and Inclusion Requests

The Public Bot Library is continuously updated to include widely used bots for search engine indexing, SEO, market intelligence, website monitoring, and other common use cases. Category-based handling is supported.
If a specific Public bot is not yet included, you can contact technical support to request its addition to the library. To help speed up the review process, it is recommended that you provide the following information:

  • Bot name and purpose
  • Link to official documentation
  • User-Agent
  • Source IP/ASN range
  • Expected handling action

Public Bots List

Each entry lists the bot name and the User-Agent keyword(s) used to identify it, grouped by category.

Site Monitor
  • SiteLock Monitoring: SiteLockSpider
  • Pingdom Monitoring: pingdom.com_bot
  • Uptime Monitoring: uptimerobot
  • Downnotifier Monitoring: downnotifier

Search Engines
  • Facebook Crawler: www.facebook.com
  • Pinterest Crawler: www.pinterest.com
  • Ahrefs Crawler: ahrefsbot
  • Dataprovider Crawler: Dataprovider
  • Barkrowler Crawler: Barkrowler
  • Blex Crawler: blexbot
  • Google Crawler: Googlebot/
  • Google Crawler - Image: Googlebot-Image
  • Bing Crawler: bingbot, msnbot, BingPreview
  • Baidu Crawler: baiduspider
  • Sogou Crawler: Sogou web spider
  • Youdao Crawler: youdaobot
  • Yahoo Crawler: Yahoo! Slurp/
  • Yandex Crawler: YandexBot/, YandexImages/
  • Istella Crawler: istellabot
  • Yeti Crawler: Yeti/
  • Apple Crawler: applebot
  • Coccoc Crawler: coccocbot
  • Seznam Crawler: SeznamBot
  • Findx Crawler: Findxbot
  • 360 Crawler: 360Spider
  • Byte Crawler: Bytespider
  • Qwant Crawler: Qwantbot
  • Easou Spider: YisouSpider
  • Mail.ru Spider: Mail.RU_Bot
  • Mojeek Crawler: MojeekBot
  • Kakaotalk Crawler: kakaotalk-scrap

Marketing Analysis
  • Netvibes Crawler: Netvibes
  • Google Crawler - AdWords: AdsBot-Google, Google-Ads, Google-Adwords
  • LinkedIn Crawler: LinkedInBot
  • Semrush Crawler: SemrushBot
  • SEOkicks Crawler: SEOkicks
  • AwarioRss Crawler: AwarioRssBot

Feed Fetcher
  • Trendiction Crawler: trendictionbot
  • Archive.org Crawler: archive.org_bot
  • Blogtrottr: Blogtrottr
  • Feeder: feeder.co
  • ipip.net Crawler: ipip.net

Practical Tools
  • Google-Site-Verification Crawler: Google-Site-Verification
  • Google-PageRenderer Crawler: Google-PageRenderer
  • Google Web Preview Crawler: Google Web Preview
  • Google-AMPCrawler Crawler: Google-AMPCrawler
  • Google Docs Crawler: Google-Docs
  • Google Page Speed Insights Crawler: Google-Page-Speed-Insights
  • Google Read Aloud Crawler: Google-Read-Aloud

Page Preview
  • Zoom Crawler: Zoombot/
  • Telegram Crawler: TelegramBot
  • Twitter Crawler: Twitterbot
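Attribution by User-Agent keyword can be sketched as a substring match over the table above. The excerpt and case-insensitive matching below are assumptions for illustration; in practice the platform also validates source IPs (see Inclusion Criteria), since a User-Agent string alone is trivially spoofed.

```python
# Illustrative keyword matcher over a small excerpt of the Public Bots list.
PUBLIC_BOTS = {
    "Search Engines": {"Google Crawler": "Googlebot/", "Bing Crawler": "bingbot"},
    "Site Monitor": {"Uptime Monitoring": "uptimerobot"},
    "Page Preview": {"Twitter Crawler": "Twitterbot"},
}

def classify(user_agent: str):
    """Return (category, bot_name) for the first matching keyword, or None."""
    ua = user_agent.lower()
    for category, bots in PUBLIC_BOTS.items():
        for name, keyword in bots.items():
            if keyword.lower() in ua:
                return category, name
    return None

# classify("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
# → ("Search Engines", "Google Crawler")
```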