Topic: How to stop annoying/abusive crawlers in 2.4
Author
Steffen
Moderator


Joined: 15 Oct 2005
Posts: 3092
Location: Hilversum, NL, EU

Posted: Fri 28 Jun '13 10:23

User Xing has posted a nice list showing how he blocked quite a few crawlers based on their User-Agent in Apache 2.4. He left out Google, Bing and the other crawlers that he does want indexing his site.

If you have a User-Agent string that is worth blocking, please post it here.

Steffen

Code:
<Directory />
..
..
..
<RequireAll>
Require all granted
Require expr %{HTTP_USER_AGENT} !~ /LinkFinder/i
Require expr %{HTTP_USER_AGENT} !~ /GSLFbot/i
Require expr %{HTTP_USER_AGENT} !~ /sistrix/i
Require expr %{HTTP_USER_AGENT} !~ /zooms/i
Require expr %{HTTP_USER_AGENT} !~ /majesti/i
Require expr %{HTTP_USER_AGENT} !~ /omgili/i
Require expr %{HTTP_USER_AGENT} !~ /ows 98/i
Require expr %{HTTP_USER_AGENT} !~ /extrabot/i
Require expr %{HTTP_USER_AGENT} !~ /ahrefs/i
Require expr %{HTTP_USER_AGENT} !~ /Java/i
Require expr %{HTTP_USER_AGENT} !~ /youtech/i
Require expr %{HTTP_USER_AGENT} !~ /seokicks/i
Require expr %{HTTP_USER_AGENT} !~ /Seznam/i
Require expr %{HTTP_USER_AGENT} !~ /esri/i
Require expr %{HTTP_USER_AGENT} !~ /warebay/i
Require expr %{HTTP_USER_AGENT} !~ /libwww/i
Require expr %{HTTP_USER_AGENT} !~ /Solomo/i
Require expr %{HTTP_USER_AGENT} !~ /WWWC/i
Require expr %{HTTP_USER_AGENT} !~ /ip-web/i
Require expr %{HTTP_USER_AGENT} !~ /panopta/i
Require expr %{HTTP_USER_AGENT} !~ /curl/i
Require expr %{HTTP_USER_AGENT} !~ /Wget/i
Require expr %{HTTP_USER_AGENT} !~ /Spider/i
Require expr %{HTTP_USER_AGENT} !~ /ntegrome/i
Require expr %{HTTP_USER_AGENT} !~ /andwatch/i
Require expr %{HTTP_USER_AGENT} !~ /SearchBot/i
Require expr %{HTTP_USER_AGENT} !~ /spinn3/i
Require expr %{HTTP_USER_AGENT} !~ /BLEX/i
</RequireAll>
</Directory>
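A shorter way to write the same kind of check is to put several names into one regular expression. This is only a sketch with three names taken from the list above, not a replacement for Xing's full list; it goes inside the same <Directory> block.

```apache
<RequireAll>
    Require all granted
    # Illustrative subset: three names from the list above in one alternation.
    # The request is refused when the User-Agent matches any of them.
    Require expr "%{HTTP_USER_AGENT} !~ /(LinkFinder|GSLFbot|sistrix)/i"
</RequireAll>
```

One line per crawler is easier to comment out or extend; one alternation keeps the config short. Which one you prefer is a matter of taste.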
DnvrSysEngr



Joined: 15 Apr 2012
Posts: 226
Location: Denver, CO USA

Posted: Sun 30 Jun '13 3:08

This is what I have in my httpd.conf to stop crawling by unwanted crawlers/bots.
<Directory>
SetEnvIfNoCase User-Agent "AhrefsBot*" BadBot
SetEnvIfNoCase User-Agent "Baidu" BadBot
SetEnvIfNoCase User-Agent "BaiDuSpider*" BadBot
SetEnvIfNoCase User-Agent "Choopa" BadBot
SetEnvIfNoCase User-Agent "Cityreview*" BadBot
SetEnvIfNoCase User-Agent "crawl" BadBot
SetEnvIfNoCase User-Agent "Dotbot*" BadBot
SetEnvIfNoCase User-Agent "Java" BadBot
SetEnvIfNoCase User-Agent "MJ12bot" BadBot
SetEnvIfNoCase User-Agent "NG\ 1.x \(Exalead\)" BadBot
SetEnvIfNoCase User-Agent "Sogou" BadBot
SetEnvIfNoCase User-Agent "Sosospider" BadBot
SetEnvIfNoCase User-Agent "spider" BadBot
SetEnvIfNoCase User-Agent "Twiceler" BadBot
SetEnvIfNoCase User-Agent "Yandex" BadBot
SetEnvIfNoCase User-Agent "YandexBot" BadBot
SetEnvIfNoCase User-Agent "Yandex*" BadBot
<RequireAll>
Require all granted
<RequireNone>
Require env BlockCountry (REMOVED for this POST)
<Limit GET POST PUT HEAD>
Require env BadBot
</Limit>
</RequireNone>
</RequireAll>
</Directory>
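If you only need the bot check and not the country block, the environment-variable approach can be reduced to something like the sketch below. It assumes Apache 2.4 with mod_setenvif and mod_authz_core loaded, uses two of the names from the list above as samples, and belongs inside a <Directory> or <Location> block. Unlike the config above it has no <Limit>, so it applies to every request method.

```apache
# Tag the request when the User-Agent matches (case-insensitive regex)
SetEnvIfNoCase User-Agent "AhrefsBot" BadBot
SetEnvIfNoCase User-Agent "MJ12bot" BadBot

<RequireAll>
    Require all granted
    # Refuse any request that carries the BadBot variable
    Require not env BadBot
</RequireAll>
```

Because the SetEnvIfNoCase argument is a regular expression, a plain substring such as "Yandex" already matches "YandexBot"; a trailing * is not needed.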
glsmith
Moderator


Joined: 16 Oct 2007
Posts: 2268
Location: Sun Diego, USA

Posted: Sun 30 Jun '13 6:59

I've gone a different route, which saves the 403s from polluting the error log:

RewriteEngine on

# [Multi-Useragent]
RewriteCond %{HTTP_USER_AGENT} black.widow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^e?mail.?(collector|magnet|reaper|siphon|sweeper|harvest|collect|wolf) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^IE\ \d\.\d\ Compatible.*Browser$" [OR]
RewriteCond %{HTTP_USER_AGENT} ^net.?(ants|carta|mechanic|spider|vampire|zip) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^site.?(searcher|snagger|valet) [NC,OR]
# ODs
RewriteCond %{HTTP_USER_AGENT} ^web.?(auto|bandit|catcher|collage|collector|copier|copy|core|devil|downloader|fetch??|image|inator|hook|layers|linker|log|mole|miner|mirror|quest|reaper|sauger|site|snake|snarf|stolperer|stripper|sucker|vac|walk??|watch|whacker|weasel|zinger|zip) [NC,OR]

# [HTTP_USER_AGENT]
# ::ELNSB50 EmailHarvesting & GuestbookSpamming
RewriteCond %{HTTP_USER_AGENT} ^::ELNSB50 [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Acrobat [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^anarchie [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ASPSimply [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Atomz [OR]
RewriteCond %{HTTP_USER_AGENT} ^cherry.?picker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "compatible ; MSIE 6.0?" [OR]
# OD
RewriteCond %{HTTP_USER_AGENT} crescent [NC,OR]
# OD
RewriteCond %{HTTP_USER_AGENT} "^DA \d\.\d+" [OR]
RewriteCond %{HTTP_USER_AGENT} ^DataCha0s [OR]
# OD
RewriteCond %{HTTP_USER_AGENT} "^DTS Agent" [OR]
# OD
RewriteCond %{HTTP_USER_AGENT} ^Download [OR]
# OD
RewriteCond %{HTTP_USER_AGENT} ^EasyDL/\d\.\d+ [OR]
RewriteCond %{HTTP_USER_AGENT} "^ABCdatos BotLink" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Acme.Spider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Ahoy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Alkaline [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ananzi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Anthill [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Arachnophilia [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Arale [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Araneo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^AraybOt [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ArchitextSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Aretha [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ARIADNE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^arks [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ASpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^ATN Worldwide" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^AURESYS [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BackRub [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} "^Bay Spider" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Big Brother" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bjaaland [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bloodhound [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Borg-Bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Bot\ mailto:craftbot@yahoo.com" [OR]
RewriteCond %{HTTP_USER_AGENT} ^BoxSeaBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^bright.net [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^CACTVS [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Calif [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Cassandra [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Checkbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChristCrawler.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^churl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^cIeNcIaFiCcIoN.nEt [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^CMC/0.01 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Collective [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} "^Combine System" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^ComputingSite Robi/1.0" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Conceptbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ConfuzzledBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^CoolBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^crawlpaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Cusco [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^CyberSpyder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^CydralSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} "^Desert Realm " [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DeWeb\(c\) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Die Blinde Kuh" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DienstSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Digger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Digimarc MarcSpider" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Digimarc Marcspider/CGI" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Digital Integrity Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Direct Hit Grabber" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^DISCo\ Pump" [OR]
RewriteCond %{HTTP_USER_AGENT} ^DNAbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^DownLoad Express" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Download\ Demon" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Download\ Wonder" [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^DragonBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^DWCP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EbiNess [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^e-collector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} "^EIT Link Verifier Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ELFINBOT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Emacs-w3 Search Engine" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^esculapio [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Esther [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Evliya Celebi" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Express\ WebPictures" [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FastCrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Felix IDE" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FetchRover [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^fido [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} "^Fish search" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} "^Fluid Dynamics Search Engine" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Fouineur [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Freecrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FunnelWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^gammaSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^gazz [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GCreep [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetterroboPlus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetURL [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^Golem [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grapnel/0.01 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Griffon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Gromit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Gulper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Hämähäkki [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HamBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Harvest [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^havIndex [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HI.*Search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^HKU WWW Octopus" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} "^Hometown Spider Pro" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ht://Dig [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^html_analyzer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HTMLgobble [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^Hyper-Decontextualizer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^I, Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^iajaBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^IBM_Planetwide [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^image.kapsi.net [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Imagelock [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^IncyWincy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Informant [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InfoSeek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InfoSpiders [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Ingrid [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Inspector Web" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^IntelliAgent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} "^Internet Cruiser Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Internet Shinchakubin" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Internet\ Ninja" [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iron33 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Israeli-search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JavaBee [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JCrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Jeeves [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JoBo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Jobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC [OR]
RewriteCond %{HTTP_USER_AGENT} ^JoeBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JumpStation [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^Katipo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^KDD-Explorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Kilroy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^KIT-Fireball [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^KO_Yappo_Robot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LabelGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^legs [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} "^Link Validator" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkScan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Lockon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^logo.gif Crawler" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Lycos [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Mac WWWWorm" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magpie [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^marvin/infoseek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Mass\ Downloader" [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mattie [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MediaFox [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^MerzScope [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^MIDown\ tool" [OR]
RewriteCond %{HTTP_USER_AGENT} ^MindCrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister [OR]
RewriteCond %{HTTP_USER_AGENT} ^mnoGoSearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^moget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MOMspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Monster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Motor [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MSNBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Muncher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Muninn [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Muscat [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mwd.Search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NDSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NEC-MeshExplorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Nederland.zoek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetScoop [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZip [OR]
RewriteCond %{HTTP_USER_AGENT} ^newscan-online [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^NHSE Web Forager" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Nomad [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Northern Light Gulliver" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Novarra-Vision/unknown" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NPBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^nzexplorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ObjectsSearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Occam [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} "^Offline\ Explorer" [OR]
RewriteCond %{HTTP_USER_AGENT} ^OntoSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Open Text Index Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Openfind data gatherer" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Orb Search" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Pack Rat" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PageBoy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} "^Papa\ Foto" [OR]
RewriteCond %{HTTP_USER_AGENT} ^ParaSite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Patric [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^pegasus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PerlCrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^PGP Key Agent" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Phantom [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PhpDig [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PiltdownMan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Pimptrain.com's" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Pioneer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PlumtreeWebAccessor [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^Poppi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Popular Iconoclast" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Portal Juice Spider" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^PortalB Spider" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} "Python-urllib" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Raven Search" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^RBSE Spider" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} "^Resume Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RixBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Road Runner" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RoadHouse [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Robbie [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RoboCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RoboFox [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Robot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Robozilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Roverbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RuLeS [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SafetyNet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Search.Aus-AU.COM [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SearchProcess [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Senrigan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SG-Scout [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ShagSeeker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Shai'Hulud [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Sift [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Simmany [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteTech-Rover [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Skymob.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SLCrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Sleek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Smart Spider" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snooper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Solbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^Spanner [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Speedy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^spider_monkey [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SpiderBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Spiderline [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SpiderMan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SpiderView [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Spry [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} StumbleUpon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Suke [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^suntek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sven [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Sygol [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Tarantula [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^tarspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Tcl W3 Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TechBOT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Templeton [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TeomaTechnologies [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The Jubii Indexing Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The NorthStar Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The NWI Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The Peregrinator" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The Python Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The TkWWW Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The Web Moose" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The Web Wombat" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The Webfoot Robot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^the World Wide Web Wanderer" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^The World Wide Web Worm" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TITAN [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TitIn [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TLSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^UCSD Crawl" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^UdmSearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^UptimeBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^URL Check" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^URL Spider Pro" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Valkyrie [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Verticrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Victoria [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^vision-search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^void-bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Voyager [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^VWbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^w.pSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^W3M2 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^w3mir [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Walhello [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WallPaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Webcapture [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStolperer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^whatUseek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WhoWhere [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Wild Ferret" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^Wired Digital" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "^WWWC Ver 0.2.5" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^XGET [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^XYLEME [NC]

# Send them home to mama or a dead connection
RewriteRule ^/(.*) http://localhost/$1 [L,R]
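The same redirect can be written with far fewer lines by grouping names into one alternation. The sketch below uses a small sample of entries from the list above that all carry the NC flag; it is meant to show the pattern, not to replace the full list, and it assumes server or virtual-host context like the rules above.

```apache
RewriteEngine on

# One condition with alternation stands in for several [OR]-chained lines.
# The names are a small sample from the list above.
RewriteCond %{HTTP_USER_AGENT} ^(Acrobat|anarchie|cherry.?picker|larbin|w3mir) [NC]

# Server or virtual-host context: the matched URL path starts with "/".
RewriteRule ^/(.*) http://localhost/$1 [L,R]
```

If you put this in a .htaccess file instead, the leading slash is not part of what RewriteRule sees, so the pattern would need to be ^(.*) there.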