| Author |
Message |
| |
|
esf Occasional WRInaute

Joined: 13 Jun 2005 Posts: 102 Location: www
|
Posted: Sun Nov 18, 2007 17:05 Post subject: 403 with a robot simulator, but Google indexes... |
|
|
Hello,
I tested a site with the robot simulator at http://www.spider-simulator.com, using the .htaccess below. This .htaccess, whose code is not mine, is supposed to block malicious robots.
With the .htaccess in place, I also get a 403 error. Ref: http://www.webrankinfo.com/forums/viewpost_804080.htm#804080
Without the .htaccess, everything is fine (200).
| Code: |
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} acoi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} anon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} asptear [NC,OR]
RewriteCond %{HTTP_USER_AGENT} bandit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} cache [NC,OR]
RewriteCond %{HTTP_USER_AGENT} cj.spider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} collect [NC,OR]
RewriteCond %{HTTP_USER_AGENT} combine [NC,OR]
RewriteCond %{HTTP_USER_AGENT} control [NC,OR]
RewriteCond %{HTTP_USER_AGENT} contrpl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} contype [NC,OR]
RewriteCond %{HTTP_USER_AGENT} copier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} copy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} dnload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} download [NC,OR]
RewriteCond %{HTTP_USER_AGENT} dsns [NC,OR]
RewriteCond %{HTTP_USER_AGENT} dts.agent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ecatch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
RewriteCond %{HTTP_USER_AGENT} fetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} filehound [NC,OR]
RewriteCond %{HTTP_USER_AGENT} flashget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} frontpage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ftp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} fuck [NC,OR]
RewriteCond %{HTTP_USER_AGENT} getright [NC,OR]
RewriteCond %{HTTP_USER_AGENT} getter [NC,OR]
RewriteCond %{HTTP_USER_AGENT} go.zilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} go.ahead.got.it [NC,OR]
RewriteCond %{HTTP_USER_AGENT} grab [NC,OR]
RewriteCond %{HTTP_USER_AGENT} grub.client [NC,OR]
RewriteCond %{HTTP_USER_AGENT} httpget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} hyperspin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} installshield.digitalwizard [NC,OR]
RewriteCond %{HTTP_USER_AGENT} internetseer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} jobo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} konqueror [NC,OR]
RewriteCond %{HTTP_USER_AGENT} leech [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lwp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} mailto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} mister.pix [NC,OR]
RewriteCond %{HTTP_USER_AGENT} moozilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} netants [NC,OR]
RewriteCond %{HTTP_USER_AGENT} newt [NC,OR]
RewriteCond %{HTTP_USER_AGENT} offline [NC,OR]
RewriteCond %{HTTP_USER_AGENT} oliverperry [NC,OR]
RewriteCond %{HTTP_USER_AGENT} pavuk [NC,OR]
RewriteCond %{HTTP_USER_AGENT} picture [NC,OR]
RewriteCond %{HTTP_USER_AGENT} pingalink [NC,OR]
RewriteCond %{HTTP_USER_AGENT} publish [NC,OR]
RewriteCond %{HTTP_USER_AGENT} python.urllib [NC,OR]
RewriteCond %{HTTP_USER_AGENT} registry.verify [NC,OR]
RewriteCond %{HTTP_USER_AGENT} scan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} snag [NC,OR]
RewriteCond %{HTTP_USER_AGENT} softwing [NC,OR]
RewriteCond %{HTTP_USER_AGENT} strip [NC,OR]
RewriteCond %{HTTP_USER_AGENT} stamina [NC,OR]
RewriteCond %{HTTP_USER_AGENT} surveybot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} teleport [NC,OR]
RewriteCond %{HTTP_USER_AGENT} t.h.u.n.d.e.r.s.t.o.n.e [NC,OR]
RewriteCond %{HTTP_USER_AGENT} turnitinbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} udmsearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webcollage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webfilter.robot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webinator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webreaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webwasher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} wget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} wildsoft [NC,OR]
RewriteCond %{HTTP_USER_AGENT} wwwoffle [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zip [NC]
RewriteRule ^.* - [F]
<Files 403.shtml>
order allow,deny
allow from all
</Files> |
I keep digging through the .htaccess but I can't find the cause. Can someone help me find the error?
Regards
See you
Last edited by esf on Sun Nov 18, 2007 18:11; edited 1 time |
|
| |
|
 |
jeanluc WRInaute addict

Joined: 03 May 2004 Posts: 2312 Location: Bruxelles
|
Posted: Sun Nov 18, 2007 17:55 Post subject: 403 with a robot simulator, but Google indexes... |
|
|
Simple: the user agent of the spider-simulator is "libwww-perl/5.800".
Jean-Luc |
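To check this in isolation, here is a minimal sketch that keeps only the one condition from the list in the first post that matches this user agent. It assumes mod_rewrite is enabled and that the rest of the list is temporarily removed for the test. RewriteCond patterns are unanchored regular expressions and [NC] makes them case-insensitive, so "libwww-perl/5.800" matches `libwww-perl` (the separate `lwp` condition does not match this string).

```apache
RewriteEngine on
# Unanchored, case-insensitive ([NC]) match against the User-Agent header.
# "libwww-perl/5.800" contains "libwww-perl", so this condition is true.
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC]
# "-" means no substitution; [F] answers with 403 Forbidden.
RewriteRule ^.* - [F]
```

With only this block in place, the simulator should still receive the 403, while a client whose user agent does not contain that string should not be affected by it.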
|
| |
|
 |
esf Occasional WRInaute

Joined: 13 Jun 2005 Posts: 102 Location: www
|
Posted: Sun Nov 18, 2007 18:06 Post subject: 403 with a robot simulator, but Google indexes... |
|
|
Thanks, Jean-Luc.
I tested without it and you are right. That is what was blocking.
That user agent is the one that certain crooks from Eastern Europe use... I have to leave it in.
On this simulator, everything gets through: http://tools.summitmedia.co.uk/spider/
See you |
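If the goal is to keep blocking that user agent while still running your own tests with it, one possible approach is to exempt a single trusted address. This is only a sketch under assumptions: the address 192.0.2.10 is a hypothetical placeholder for your own test machine (it comes from a range reserved for documentation), and only the `libwww-perl` condition is shown rather than the full list.

```apache
RewriteEngine on
# Hypothetical placeholder: replace with the address of your own test machine.
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
# Conditions without [OR] are ANDed: the request is blocked only if it does
# NOT come from the exempted address AND its user agent contains "libwww-perl".
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC]
RewriteRule ^.* - [F]
```

Note that a hosted simulator sends its requests from its own server, so this exemption would only help there if you knew that server's address; it is mainly useful for tests you run yourself.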
|
| |
|
 |