[question] Using sitemap generator?

Nouveau WRInaute
I'm trying to set up my sitemap and nothing happens...
Is my config file correct?

Code:
<?xml version="1.0" encoding="UTF-8"?>
<!--
  sitemap_gen.py example configuration script

  This file specifies a set of sample input parameters for the
  sitemap_gen.py client.

  You should copy this file into "config.xml" and modify it for
  your server.


  ********************************************************* -->


<!-- ** MODIFY **
  The "site" node describes your basic web site.

  Required attributes:
    base_url   - the top-level URL of the site being mapped
    store_into - the webserver path to the desired output file.
                 This should end in '.xml' or '.xml.gz'
                 (the script will create this file)

  Optional attributes:
    verbose    - an integer from 0 (quiet) to 3 (noisy) for
                 how much diagnostic output the script gives
    suppress_search_engine_notify="1"
               - disables notifying search engines about the new map
                 (same as the "testing" command-line argument.)
    default_encoding
               - names a character encoding to use for URLs and
                 file paths.  (Example: "UTF-8")
-->
<site
  base_url="http://www.webynux.net/"
  store_into="/homepages/1/d139468600/htdocs/webynux/sitemap.xml.gz"
  verbose="1"
  >

  <!-- ********************************************************
          INPUTS

  All the various nodes in this section control where the script
  looks to find URLs.

  MODIFY or DELETE these entries as appropriate for your server.
  ********************************************************* -->

  <!-- ** MODIFY or DELETE **
    "url" nodes specify individual URLs to include in the map.

    Required attributes:
      href       - the URL

  <url
     href="http://www.example.com/stats?q=age"
     lastmod="2004-11-14T01:00:00-07:00"
     changefreq="yearly"
     priority="0.3"
  />


  <!-- ** MODIFY or DELETE **
    "urllist" nodes name text files with lists of URLs.
    An example file "example_urllist.txt" is provided.

    Required attributes:
      path       - path to the file

  -->
  <urllist  path="example_urllist.txt"  encoding="UTF-8"  />





  <!-- ** MODIFY or DELETE **
    "accesslog" nodes tell the script to scan webserver log files to
    extract URLs on your site.  Both Common Logfile Format (Apache's default
    logfile) and Extended Logfile Format (IIS's default logfile) can be read.

    Required attributes:
      path       - path to the file

    Optional attributes:
      encoding   - encoding of the file if not US-ASCII
  -->
  <accesslog  path="/homepages/1/d139468600/htdocs/logs/access.log"       encoding="UTF-8"  />



  <!-- ** MODIFY or DELETE **
    "sitemap" nodes tell the script to scan other Sitemap files.  This can
    be useful to aggregate the results of multiple runs of this script into
    a single Sitemap.

    Required attributes:
      path       - path to the file
  -->
  <sitemap    path="/homepages/1/d139468600/htdocs/webynux/sitemap.xml" />


  <!-- ********************************************************
          FILTERS

  Filters specify wild-card patterns that the script compares
  against all URLs it finds.  Filters can be used to exclude
  certain URLs from your Sitemap, for instance if you have
  hidden content that you hope the search engines don't find.

  Filters can be either type="wildcard", which means standard
  path wildcards (* and ?) are used to compare against URLs,
  or type="regexp", which means regular expressions are used
  to compare.

  Filters are applied in the order specified in this file.

  An action="drop" filter causes exclusion of matching URLs.
  An action="pass" filter causes inclusion of matching URLs,
  shortcutting any other later filters that might also match.
  If no filter at all matches a URL, the URL will be included.
  Together you can build up fairly complex rules.

  The default action is "drop".
  The default type is "wildcard".

  You can MODIFY or DELETE these entries as appropriate for
  your site.  However, unlike above, the example entries in
  this section are not contrived and may be useful to you as
  they are.
  ********************************************************* -->

  <!-- Exclude URLs that end with a '~'   (IE: emacs backup files)      -->
  <filter  action="drop"  type="wildcard"  pattern="*~"           />

  <!-- Exclude URLs within UNIX-style hidden files or directories       -->
  <filter  action="drop"  type="regexp"    pattern="/\.[^/]*"     />

</site>

Thanks in advance to anyone who can help ;-)
 
WRInaute discret
Good evening,

Here is a template that works.
Adapt it to your site:
Code:
<site
  base_url="http://www.MONSITE.com/"
  store_into="/home/MONSITE.com/rep/www/sitemap.xml"
  verbose="1"
>

  <!-- *************************** -->
  <url
     href="http://www.MONSITE.com/"
     lastmod="2008-02-25T01:00:00-07:00"
     changefreq="weekly"
     priority="0.5"
  />

  <!-- *************************** -->
  <directory  path="/home/MONSITE.com/ftp/www"    url="http://www.MONSITE.com/" />
  <directory
     path="/home/MONSITE.com/rep/www"
     url="http://www.MONSITE.com/"
     default_file="index.html"
  />

  <!-- *************************** -->
  <!-- Exclude URLs that end with a '~'   (IE: emacs backup files)      -->
  <filter  action="drop"  type="wildcard"  pattern="*~"           />

  <!-- Exclude URLs within UNIX-style hidden files or directories       -->
  <filter  action="drop"  type="regexp"    pattern="/\.[^/]*"     />

  <!-- Example exclusion: -->
  <filter action="drop" type="wildcard" pattern="http://www.MONSITE.com/repertoire/config/*" />

</site>
Then, on the command line, from the right directory:
Code:
python sitemap_gen.py --config=config.xml
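When "nothing happens", one thing worth ruling out before running the command is a config file that is not well-formed XML. In the config posted in the question, the comment that introduces the "url" node does not appear to be closed before `<url`, so the next `<!--` falls inside an open comment. The sketch below is only an illustration using Python's standard `xml.etree.ElementTree`; the helper name `xml_error` and the two sample strings are hypothetical, and this is not how sitemap_gen.py itself reports errors.

```python
import xml.etree.ElementTree as ET


def xml_error(text):
    """Return None if text is well-formed XML, else a short error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# Hypothetical sample: every comment is closed before the next node.
GOOD = '''<site base_url="http://www.MONSITE.com/" store_into="sitemap.xml">
  <!-- "url" nodes specify individual URLs -->
  <url href="http://www.MONSITE.com/" />
</site>'''

# Hypothetical sample: the first comment is never closed, so the second
# "<!--" appears inside a comment, which XML does not allow.
BAD = '''<site base_url="http://www.MONSITE.com/" store_into="sitemap.xml">
  <!-- "url" nodes specify individual URLs
  <url href="http://www.MONSITE.com/" />
  <!-- "urllist" nodes name text files -->
</site>'''

if __name__ == "__main__":
    print("GOOD:", xml_error(GOOD))
    print("BAD: ", xml_error(BAD))
```

Passing the contents of the real config.xml to the same function should point at the first line where parsing fails, if any.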
Good luck!
casa
 
WRInaute discret
Everything in the config directory will be excluded:
Code:
<filter action="drop" type="wildcard" pattern="http://www.MONSITE.com/repertoire/config/*" />
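To make the filter rules concrete, here is a rough Python sketch of how the comments in the example config describe them: filters are tried in file order, the first one that matches decides ("drop" excludes, "pass" includes), and a URL that matches no filter is included. It assumes that type="wildcard" behaves like Python's `fnmatch` and that type="regexp" behaves like `re.search`; the function name `is_included` and the `FILTERS` list are hypothetical, and the real script may differ in details.

```python
import fnmatch
import re

# (action, type, pattern) triples, in the same order as in the config file.
FILTERS = [
    ("drop", "wildcard", "*~"),
    ("drop", "regexp", r"/\.[^/]*"),
    ("drop", "wildcard", "http://www.MONSITE.com/repertoire/config/*"),
]


def is_included(url, filters=FILTERS):
    """Apply filters in order; the first matching filter decides."""
    for action, kind, pattern in filters:
        if kind == "wildcard":
            matched = fnmatch.fnmatchcase(url, pattern)
        else:
            matched = re.search(pattern, url) is not None
        if matched:
            return action == "pass"
    # No filter matched: the URL is kept.
    return True


if __name__ == "__main__":
    for url in (
        "http://www.MONSITE.com/index.html",
        "http://www.MONSITE.com/repertoire/config/db.php",
        "http://www.MONSITE.com/page.html~",
        "http://www.MONSITE.com/.htaccess",
    ):
        print(url, "->", "kept" if is_included(url) else "dropped")
```

Because the first match wins, a more specific action="pass" filter would have to be listed before a broader "drop" filter to take effect.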
Cheers,
casa
 
Nouveau WRInaute
Thanks, I understand the principle now.

I'm going to do some more research, because this is a Joomla site and my articles' URLs (neither the SEF ones nor the original ones) are not picked up in the sitemap...

Does anyone use sitemap_generator on a Joomla site?
 