TurnitinBot (formerly SlySearch)

Thread in 'Other known search engines' started by Mitirapa, 8 January 2003.

  1. Mitirapa
    Mitirapa Passionate WRInaute
    Joined:
    10 July 2002
    Posts:
    1,176
    Likes received:
    0
    I've just been looking at my stats, and two IPs (64.140.49.66 and 64.140.49.68) had eaten more than 300 MB of traffic (out of my 10 GB allowance).
    It turns out to be TurnitinBot (formerly SlySearch), which I had never heard of:
    http://www.turnitin.com/robot/crawlerinfo.html

    From what they say, it's an education tool that checks whether students have plagiarised the essays they hand in... it was mentioned on TV a long time ago.
    So I'm being scanned, lol.

    Still, since it brings me nothing apart from wasted traffic, I might as well get rid of it...

    For anyone in the same situation, put this in a robots.txt at the root of your site:
    User-agent: TurnitinBot
    Disallow: /
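
To check what those two lines actually permit, here is a minimal sketch using Python's standard `urllib.robotparser`. The URL `http://www.example.com/page.html` and the helper name `can_fetch` are placeholders chosen for illustration. Keep in mind that robots.txt is only a request: it has an effect only on crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above.
ROBOTS_TXT = """\
User-agent: TurnitinBot
Disallow: /
"""

def can_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a well-behaved crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    url = "http://www.example.com/page.html"
    for agent in ("TurnitinBot", "Googlebot"):
        print(agent, can_fetch(agent, url))
```

Only the named crawler is excluded; every other user agent is still allowed everywhere.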
     
  2. WebRankInfo
    WebRankInfo Admin
    Staff member
    Joined:
    19 April 2002
    Posts:
    18,816
    Likes received:
    261
    I'm in the same situation... by the way, Mitirapa, are you hunting down bad robots or what?
     
  3. Kmacleod
    Kmacleod Passionate WRInaute
    Joined:
    28 November 2002
    Posts:
    2,468
    Likes received:
    0
    It shows up in my stats too.

    Thanks for this hunt for wasted bandwidth.
    It almost deserves a full guide of its own, even if the subject isn't strictly about SEO (well, arguably it is); it's a sensitive issue for many people, depending on their hosting provider.
     
  4. Mitirapa
    Mitirapa Passionate WRInaute
    Joined:
    10 July 2002
    Posts:
    1,176
    Likes received:
    0
    WRI> the overage charge I had to pay last month didn't go down well, so now I'm watching everything, lol.
    On top of that, my visitor numbers have gone up by a third, so I really have to spend my bandwidth carefully...
    I'm starting to understand why some sites don't want to be indexed at all... :wink:
     
  5. hetzeld
    hetzeld Passionate WRInaute
    Joined:
    2 December 2002
    Posts:
    1,603
    Likes received:
    0
    Mitirapa,

    On my side, I block IpiumBot, plus Chinese, Korean and Romanian hosts, in my .htaccess:
    Code:
    RewriteCond %{REMOTE_HOST} \.laurion\.net [NC,OR]
    RewriteCond %{REMOTE_HOST} \.cn$ [OR]
    RewriteCond %{REMOTE_HOST} \.kr$ [OR]
    RewriteCond %{REMOTE_HOST} \.ro$
    RewriteRule ^.*$ - [F]
    For information, the laurion.net robot (IpiumBot) permanently scans the web looking for plagiarism, on behalf of companies that request it.
    Since this @&§%$! robot doesn't respect the exclusion protocol and has no qualms about pulling 100 pages/minute, I simply banned the whole domain.
    They can whistle for it :lol:

    As for the others, I have no intention of selling them a real-estate site to help them list their straw huts, so same sentence for them...
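
The hostname tests above can be tried out with the short Python sketch below. It assumes the server has already resolved the client address to a name (reverse DNS); as far as I know, `REMOTE_HOST` otherwise holds just the IP address and the `\.cn$` style patterns never match. The hostnames used are invented for illustration, and `host_is_forbidden` is a hypothetical helper name.

```python
import re

# Patterns copied from the RewriteCond lines above.
# The second column mirrors the [NC] (case-insensitive) flag.
HOST_RULES = [
    (r"\.laurion\.net", re.IGNORECASE),
    (r"\.cn$", 0),
    (r"\.kr$", 0),
    (r"\.ro$", 0),
]

def host_is_forbidden(remote_host):
    """Return True if any of the ORed host conditions matches."""
    return any(re.search(pattern, remote_host, flags)
               for pattern, flags in HOST_RULES)

if __name__ == "__main__":
    # Invented hostnames, for illustration only.
    for host in ("crawler.laurion.net", "dial-12.example.cn",
                 "www.example.com", "64.140.49.66"):
        print(host, host_is_forbidden(host))
```

Note that a country-code test like this blocks every visitor whose reverse DNS name ends in that suffix, not only robots.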

    Dan
     
  6. WebRankInfo
    WebRankInfo Admin
    Staff member
    Joined:
    19 April 2002
    Posts:
    18,816
    Likes received:
    261
    What does the last line of your .htaccess mean?
     
  7. hetzeld
    hetzeld Passionate WRInaute
    Joined:
    2 December 2002
    Posts:
    1,603
    Likes received:
    0
    Olivier,

    The last line, RewriteRule ^.*$ - [F], means:
    for any string of zero or more characters (.*) between the start (^) and the end ($) of the requested path, do nothing (-) but return "Forbidden" ([F]).

    So it doesn't rewrite the URL, it just sends them packing :wink:
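
The behaviour described above can be mimicked in a few lines of Python. This is only a sketch of the logic, not of how Apache is implemented: `respond` is a hypothetical helper, and `client_is_blocked` stands for "the preceding RewriteCond lines matched".

```python
import re

# Same pattern as in the RewriteRule: any string, including the empty one.
ANY_PATH = re.compile(r"^.*$")

def respond(path, client_is_blocked):
    """Return the HTTP status that the rule would produce for this request."""
    if client_is_blocked and ANY_PATH.search(path):
        return 403  # [F]: Forbidden, the URL itself is left untouched ("-")
    return 200      # rule not applied, request served normally

if __name__ == "__main__":
    print(respond("", True))            # site root
    print(respond("index.html", True))
    print(respond("index.html", False))
```

Because the pattern matches every path, the conditions alone decide who gets the 403.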

    Dan
     
  8. BDGest
    BDGest Discreet WRInaute
    Joined:
    6 January 2003
    Posts:
    196
    Likes received:
    0
    Well, TurnitinBot is chewing through my entire site right now 8O
    Ah, if only Google crawled this thoroughly!!!
    Right, I'm updating my .htaccess.
     
  9. Jocelyn
    Jocelyn Occasional WRInaute
    Joined:
    6 November 2002
    Posts:
    382
    Likes received:
    0
    Me too!
    TurnitinBot visited my site today. Time to update my robots.txt and my .htaccess...

    Jocelyn
     
  10. perle d'argent
    perle d'argent Discreet WRInaute
    Joined:
    4 January 2003
    Posts:
    76
    Likes received:
    0
    And WebCopier v3.3, do you know it?
    Who's that one :?:
     
  11. BDGest
    BDGest Discreet WRInaute
    Joined:
    6 January 2003
    Posts:
    196
    Likes received:
    0
    It's a site ripper, of the worst kind :(
    Block it too!!! :evil:
     
  12. perle d'argent
    perle d'argent Discreet WRInaute
    Joined:
    4 January 2003
    Posts:
    76
    Likes received:
    0
    I have neither a .htaccess nor a robots.txt.
    Can I block it directly with META tags?
    Thanks
     
  13. BDGest
    BDGest Discreet WRInaute
    Joined:
    6 January 2003
    Posts:
    196
    Likes received:
    0
    No, you can't. You really have to go through your .htaccess, for example:
    Code:
    ErrorDocument 404 /monfichier404.php
    ErrorDocument 403 /accesrefuse.html
    RewriteEngine on
    RewriteCond %{REMOTE_HOST} \.laurion\.net [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Test [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Szukacz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [OR]
    RewriteCond %{REMOTE_HOST} \.cn$ [OR]
    RewriteCond %{REMOTE_HOST} \.kr$
    RewriteRule ^.*$ - [F]
    Or your robot.txt
    You can also write a small PHP script, but that won't stop the robot from requesting the pages (even if they are not displayed for it).
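
The user-agent conditions above all start with `^`, so they only match when the name is at the very beginning of the User-Agent header, and they are case-sensitive because none of them carries `[NC]`. The Python sketch below illustrates that; the sample user-agent strings and the helper name `agent_is_blocked` are made up for the example.

```python
import re

# Names taken from the HTTP_USER_AGENT conditions in the post above.
BLOCKED_AGENTS = [
    "Teleport", "WebStripper", "eCatch", "WebCopier", "LinkWalker",
    "ia_archiver", "DIIbot", "psbot", "Downloader", "Test",
    "Szukacz", "TurnitinBot",
]

# One alternation anchored at the start, equivalent to the ORed "^Name" tests.
BLOCKED_RE = re.compile("^(?:" + "|".join(map(re.escape, BLOCKED_AGENTS)) + ")")

def agent_is_blocked(user_agent):
    """Return True if the header starts with one of the blocked names."""
    return BLOCKED_RE.search(user_agent) is not None

if __name__ == "__main__":
    # Sample headers, invented for illustration.
    for ua in ("WebCopier v3.3",
               "Mozilla/4.0 (compatible; WebCopier v3.3)",
               "webcopier"):
        print(repr(ua), agent_is_blocked(ua))
```

A tool that announces itself behind a browser-style prefix, or simply changes its User-Agent, gets through this kind of filter.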
     
  14. WebRankInfo
    WebRankInfo Admin
    Staff member
    Joined:
    19 April 2002
    Posts:
    18,816
    Likes received:
    261
    By the way, the correct name is robots.txt, not robot.txt.
    I very often see 404 errors for that file on WRI.
    I hope those come from curious people making a typo rather than from badly programmed search engines?
     
  15. perle d'argent
    perle d'argent Discreet WRInaute
    Joined:
    4 January 2003
    Posts:
    76
    Likes received:
    0
    Thanks for this valuable information :!:
     
  16. crughon
    crughon Discreet WRInaute
    Joined:
    24 July 2005
    Posts:
    170
    Likes received:
    0
    htaccess and bandwidth

    Hello,
    Like many others, I track down bandwidth hogs. I've just read a lot of messages on the forum about robots.txt and .htaccess, several of them with sample file contents. Is there a post that offers the ultimate .htaccess file for blocking unwanted search engines? I had found a post showing WebRankInfo's robots.txt file, but I can't find it again :)
     
  17. Snoopy52
    Snoopy52 New WRInaute
    Joined:
    23 December 2004
    Posts:
    42
    Likes received:
    0
    Hello everyone,

    Here is my modest contribution on this subject :wink:

    Code:
    ################
    # Block certain robots
    ################
    RewriteCond %{HTTP_USER_AGENT} ^ADSARobot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ah-ha [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^aktuelles [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^amzn_assoc [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASSORT [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ATHENS [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Atomz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^attach [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^attache [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^asterias [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^bdfetch [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^big.brother [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Black.Hole [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Black\ Hole [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlowFish [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlowFish/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^bmclient [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Boston\ Project [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BravoBrian\ SpiderEngine\ MarcoPolo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bullseye [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bullseye/1.0 [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^bumblebee [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^BotALot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BuiltBotTough [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BunnySlippers [OR]
    RewriteCond %{HTTP_USER_AGENT} ^capture [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Cegbfeieh [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CheeseBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPickerElite/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPickerSE/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^clipping [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^CopyRightCheck [OR]
    RewriteCond %{HTTP_USER_AGENT} ^cosmos [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Crescent\ Internet\ ToolPak\ HTTP\ OLE\ Control\ v.1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^cyberalert [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Deweb [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^diagem [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Digger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Digimarc [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump\ 3.1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCoFinder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DittoSpyder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DTS.Agent [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EasyDL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ecollector [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^efp@gmx\.net [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Email\ Extractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EroCrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FavOrg [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Favorites\ Sweeper [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Fetch [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^FEZhead [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet\ WebWasher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlickBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^fluffy [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Foobot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FrontPage [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GalaxyBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Generic [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Getleft [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWebPage [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Girafabot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Googlebot-Image [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GornKer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grabber [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Green\ Research [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Harvest [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Harvest/1.5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^hhjhj@yahoo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^hloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HomePageSearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
    RewriteCond %{HTTP_USER_AGENT} ^http\ generic [OR]
    RewriteCond %{HTTP_USER_AGENT} ^httplib [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack\ 3.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^humanlinks [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^IBM_Planetwide [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^imagefetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^IncyWincy [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InfoNaviRobot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^informant [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ingelin [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetLinkAgent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetSeer\.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Irvine [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JBH.Agent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JennyBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Kenjin.Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Kenjin\ Spyder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Keyword.Density [OR]
    RewriteCond %{HTTP_USER_AGENT} ^KWebGet [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Lachesis [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
    RewriteCond %{HTTP_USER_AGENT} ^libwww [OR]
    RewriteCond %{HTTP_USER_AGENT} ^likse [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Link.Sleuth [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LINKS\ ARoMATIZED [OR]
    RewriteCond %{HTTP_USER_AGENT} ^libWeb/clsHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkextractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a.Unix [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a\ Unix [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LWP [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial/1.34 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mac\ Finder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mata\ Hari [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MCspider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control\ -\ 5.01.4511 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control\ -\ 6.00.8169 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mirror [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Missigua\ Locator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIIxpc [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIIxpc/4.2 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister.PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MJ12bot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MMMtoCrawl\/UrlDispatcherLLL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^moget [OR]
    RewriteCond %{HTTP_USER_AGENT} ^moget/2.1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^multithreaddb [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^nationaldirectory [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAttache [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAttache\ Light\ 1.1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetCarta [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
    RewriteCond %{HTTP_USER_AGENT} ^netprospector [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetResearchServer [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZip\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZippy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NEWT [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NPBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^OpaL [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline.Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Openfind\ data\ gathere [OR]
    RewriteCond %{HTTP_USER_AGENT} ^OpenTextSiteCrawler [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^OrangeBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^PackRat [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PersonaPilot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Proxy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ProPowerBot/2.14 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PSurf [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^puf [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PushSite [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^QuepasaCreep [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^QueryN.Metasearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^QueryN\ Metasearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^replacer [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey\ Bait\ &\ Tackle/v1.01 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RMA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Robozilla [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Rover [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^RPT-HTTPClient [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Rsync [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SearchExpress [OR]
    RewriteCond %{HTTP_USER_AGENT} ^searchhippo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^searchterms\.it [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Second\ Street\ Research [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Shai [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ShopWiki [OR]
    RewriteCond %{HTTP_USER_AGENT} ^sitecheck [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^snagger [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpankBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^spanner [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Spegla [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpiderBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SqWorm [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot/2.6 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SurfWalker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^suzuran [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Szukacz/1.4 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tarspider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Templeton [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Test [OR]
    RewriteCond %{HTTP_USER_AGENT} ^The.Intraformant [OR]
    RewriteCond %{HTTP_USER_AGENT} ^The\ Intraformant [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TheNomad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TightTwatBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Titan [OR]
    RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^True_Robot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^True_Robot/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^turingos [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot/1.5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TV33_Mercator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^UIowaCrawler [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^UrlDispatcher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^URL_Spider_Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^URLy.Warning [OR]
    RewriteCond %{HTTP_USER_AGENT} ^URLy\ Warning [OR]
    RewriteCond %{HTTP_USER_AGENT} ^UtilMind [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
    RewriteCond %{HTTP_USER_AGENT} ^vagabondo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^vayala [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^VCI [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VCI\ WebViewer\ VCI\ WebViewer\ Win32 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^visibilitygap [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^vspider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^w3mir [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^web\.by\.mail [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Data\ Extractor [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebBandit/3.50 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webclipping [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webcollector [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier\ v3.3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webcopy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webcraft@bea [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^webdevil [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^webdownloader [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webdup [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEnhancer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web.Image.Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webinator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WEBMASTERS [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebmasterWorldForumBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMiner [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^webmirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webmole [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^website\ extractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website.Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSnake [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster.Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper/2.02 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^websucker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webvac [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^webwalk [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webweasel [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZip/4.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget/1.5.3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget/1.6 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^whizbang [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WhosTalking [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WISEbot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WUMPUS [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wweb [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WinHTTrack [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWW-Collector-E [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^XGET [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xenu's [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xenu's\ Link\ Sleuth\ 1.1c [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus\ 32297\ Webster\ Pro\ V2.9\ Win32 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Yandex [OR]
    # Block browsers hiding behind random letters and digits
    RewriteCond %{HTTP_USER_AGENT} [0-9A-Za-z]{15,} [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[0-9A-Za-z]+$ [OR]
    # A host trying to hide behind a reverse DNS lookup
    RewriteCond %{REMOTE_HOST} ^private$ [NC,OR]
    #
    RewriteCond %{REMOTE_HOST} \.laurion\.net [NC,OR]
    #
    # A frequently used fake referrer
    RewriteCond %{HTTP_REFERER} ^[^?]*iaea\.org [NC,OR]
    #
    # The "addresses.com" referrer is used by an email address extractor
    RewriteCond %{HTTP_REFERER} ^[^?]*addresses\.com [NC,OR]
    #
    #
    # A fake referrer used in conjunction with formmail exploits
    # (the [OR] flag keeps the host tests below in the same OR chain)
    RewriteCond %{HTTP_REFERER} ^[^?]*\.ideography\.co\.uk [NC,OR]
    #
    #
    RewriteCond %{REMOTE_HOST} \.cn$ [OR]
    RewriteCond %{REMOTE_HOST} \.kr$ [OR]
    RewriteCond %{REMOTE_HOST} \.ro$
    # Do not do unto others what you would not have done unto you :-p
    RewriteRule .*$ http://www.turnitin.com [R,L]
    ###########################
    # End of block list
    ###########################
    
    This is a compilation of everything I could find on the net on this subject, but it is certainly not exhaustive.
    It works on my site without errors.

    If you spot a mistake, or a bot that shouldn't be on this list, please let me know.
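
For anyone adapting a long chain like this one, it helps to remember how mod_rewrite combines conditions: consecutive RewriteCond lines are ANDed by default, and `[OR]` links a condition to the next one. A single line without `[OR]` in the middle therefore splits the list into two groups that must both match. The Python sketch below models that grouping under a simplifying assumption (each condition is reduced to a pair "did it match, does it carry [OR]"); `rule_fires` is a hypothetical helper, not an Apache API.

```python
def rule_fires(conditions):
    """conditions: list of (matched, has_or_flag) pairs, in file order.

    Conditions linked by [OR] form a group; the rule fires only if
    every group contains at least one matching condition.
    """
    groups, current = [], []
    for matched, has_or_flag in conditions:
        current.append(matched)
        if not has_or_flag:      # a line without [OR] closes the group
            groups.append(current)
            current = []
    if current:                  # trailing line that still carries [OR]
        groups.append(current)
    return all(any(group) for group in groups)

if __name__ == "__main__":
    # A bad user agent matches the first test; the client is not on a blocked host.
    complete_chain = [(True, True), (False, True), (False, False)]
    # Same request, but the second line lost its [OR]: two groups are now ANDed.
    broken_chain = [(True, True), (False, False), (False, True), (False, False)]
    print(rule_fires(complete_chain))
    print(rule_fires(broken_chain))
```

With the complete chain one match anywhere is enough. With the broken chain the bad user agent is only blocked if one of the later host tests matches as well.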

    See you!
     