AWStats 6.0 log analyzer documentation

 


FAQ (Frequently Asked Questions) / Troubleshooting (work in progress)


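GENERAL QUESTIONS:

SETUP / USAGE QUESTIONS:
Here are the most frequently asked questions about setting up and using AWStats.


ERRORS / TROUBLESHOOTING QUESTIONS:
Here are questions and answers about errors and problems that occur when using AWStats.

SECURITY QUESTIONS:
Here are the most frequently asked questions about security issues when using AWStats.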





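FAQ-ABO100 : WHICH SERVER LOG FILES OR OPERATING SYSTEMS ARE SUPPORTED ?
AWStats works with:
- Any web server that can write its log file in a combined log format (XLF/ELF) like Apache, a common log format (CLF) like Apache or Squid, or a W3C log format like IIS 5.0 or higher, and any other log format that contains all the information AWStats needs.
- Most other Web/Wap/Proxy servers.
- Some FTP, syslog and mail log files.
Because AWStats is written in Perl, it runs on all operating systems.
Examples of platforms in use are listed below (bold means 'tested by the author'; the others were reported by AWStats users to work correctly):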
OS:
Windows 2000, Windows NT 4.0, Windows Me, Linux (RedHat, Mandrake, Debian, Suse...), Macintosh, Solaris, Aix, BeOS, ...
Web/Wap/Proxy/Streaming servers:
Apache 1.3.x and 2.x, IIS 5.0 and 6.0, WebStar, WebLogic, WebSite, Windows Media Server, Tomcat, Squid, Sambar, Roxen, Resin, RealMedia server, Oracle9iAS, Lotus Notes/Domino, Darwin, IPlanet, IceCast, ZeroBrand, Zeus, Zope, Abyss...
FTP servers:
ProFTP, ...
Mail servers:
Postfix, Sendmail, QMail, Mdaemon, www4mail, ...
Perl interpreters (Perl >= 5.005):
ActivePerl 5.6, ActivePerl 5.8, Perl 5.8, Perl 5.6, Perl 5.005, mod_perl, mod_perl2 for Apache, ...


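FAQ-ABO150 : WHICH LOG FORMATS CAN AWSTATS ANALYZE ?
Using one of the log formats predefined in AWStats makes setup easier, but you can also use your own log format. This is why AWStats can analyze the logs of nearly all Web/Wap/Proxy servers, as well as some FTP server, syslog and mail server logs.
The only requirement is that "the log file contains the information AWStats needs".
Here are examples of log formats that can be analyzed:

Apache common log format (see Note*),
Apache combined log format (also known as NCSA combined log format, XLF or ELF format),
Any other customized Apache log format,
Any IIS log format (known as W3C format),
Webstar native log format,
Realmedia server, Windows Media server, Darwin streaming server,
ProFTP server,
Postfix, Sendmail, QMail, Mdaemon,
Log formats of many other web/wap/proxy servers

Note*: AWStats can now analyze the Apache common log format, but this format does not contain all the information AWStats needs (the problem is the content, not the format). Analyzing common log files is therefore of limited value: too much information is missing to filter robots or to detect search engines, keywords, operating systems and browsers. Because many users asked for it, AWStats supports the common log format anyway, but many of its advanced features, such as robot filtering and search engine, keyword, OS and browser detection, do not work with it.

See also F.A.Q.: LOG FORMAT SETUP OR ERRORS.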


FAQ-ABO200 : WHICH LANGUAGES ARE AVAILABLE ?
AWStats can build reports in 39 languages. Here is the latest list, in alphabetical order (these are the codes you can use for the Lang and ShowFlagLinks parameters):
  • Albanian=al, Bosnian=ba, Bulgarian=bg, Catalan=ca, Chinese (Taiwan)=tw, Chinese (Simplified)=cn, Czech=cz, Danish=dk, Dutch=nl, English=en, Estonian=et, Basque=eu, Finnish=fi, French=fr, Galician=gl, German=de, Greek=gr, Hebrew=he, Hungarian=hu, Icelandic=is, Indonesian=id, Italian=it, Japanese=jp, Korean=kr, Latvian=lv, Norwegian (Nynorsk)=nn, Norwegian (Bokmal)=nb, Polish=pl, Portuguese=pt, Portuguese (Brazilian)=br, Romanian=ro, Russian=ru, Serbian=sr, Slovak=sk, Spanish=es, Swedish=se, Turkish=tr, Ukrainian=ua, Welsh=wlk.
    However, the AWStats documentation is provided in English only.
    Some contributors have written documentation in other languages.
    See the Documentation Contrib page.


    FAQ-ABO250 : CAN AWSTATS BE INTEGRATED WITH PHP NUKE ?
    At the moment, the author has not heard of any plan to build AWStats as an add-on for PHPNuke, but such a plan may appear. Ask the PHPNuke authors or the PHPNuke forums whether such an add-on exists.




    FAQ-COM025 : HOW TO USE AWSTATS WITH NO SERVER LOG
    PROBLEM:
    I want to use AWStats but I have no access to my server log files.
    SOLUTION:
    AWStats is a log analyzer, so if it cannot read a log file there is nothing to analyze and you cannot use it. There is a workaround, however: add to every web page a tag that calls a CGI script such as pslogger. This produces an artificial log file that AWStats can analyze.
    A Perl version of this CGI, enhanced by the AWStats developer, is included in the distribution as /files/pslogger.pl. A PHP version by Florent CHANTRET is also included as /files/pslogger.phps.

    FAQ-COM050 : WHAT IS THE LOG SIZE LIMIT AWSTATS CAN ANALYZE ?
    PROBLEM:
    I understand that the AWStats update process must be run frequently on new log files, which keeps each log file small. But the first update has to process a huge log file holding all past history. Is there a limit on the log file size AWStats can analyze ?
    SOLUTION:
    No. AWStats itself has no limit, so you can use it on huge log files (it has been tested on a 10GB log file).
    However, your system (OS or Perl version) may have one. For example, you may see errors reporting a 2GB or 4GB limit. If the limit comes from Perl, try a Perl build compiled with the large file option.
    If you cannot find such a build or compile it yourself, try LogFile="cat /yourlogfilepath/yourlogfile |" instead of LogFile="/yourlogfilepath/yourlogfile".


    FAQ-COM090 : SETUP FOR FTP SERVER LOG FILES
    PROBLEM:
    How do I use AWStats to analyze my FTP server log files ?
    SOLUTION:
    AWStats can analyze the log files of some FTP servers.

    1- Set up the FTP log file format:

    With ProFTP, edit the proftpd.conf file and add the following two lines:
    LogFormat awstats "%t %h %u %m %f %s %b"     # WARNING: the separator between % tags must be a tab, not a space!
    ExtendedLog /var/log/xferlog read,write awstats
        # WARNING: if you use virtual hosts, the ExtendedLog directive must be placed inside the virtual host context.
    Then turn off the old log format:
    TransferLog none     # WARNING: if you use virtual hosts, the TransferLog directive must be placed inside the virtual host context.

    To apply the changes, stop the FTP server, remove the old log file /var/log/xferlog, and start the server again.
    Download a file by FTP and check that the new log file contains records like this:
    [01/Jan/2001:21:49:57 +0200] ftp.server.com user RETR /home/fileiget.txt 226 1499

    2- Set up AWStats to analyze the FTP log file:

    Copy the awstats.model.conf file to "awstats.proftp.conf".
    Edit the new config file as follows:
    LogFile="/var/log/xferlog"
    LogType=F
    LogFormat="%time1 %host %logname %method %url %code %bytesd"
    LogSeparator="\t"
    NotPageList=""
    LevelForBrowsersDetection=0
    LevelForOSDetection=0
    LevelForRefererAnalyze=0
    LevelForRobotsDetection=0
    LevelForWormsDetection=0
    LevelForSearchEnginesDetection=0
    ShowLinksOnUrl=0
    ShowMenu=1
    ShowMonthStats=UVHB
    ShowDaysOfMonthStats=HB
    ShowDaysOfWeekStats=HB
    ShowHoursStats=HB
    ShowDomainsStats=HB
    ShowHostsStats=HBL
    ShowAuthenticatedUsers=HBL
    ShowRobotsStats=0
    ShowEMailSenders=0
    ShowEMailReceivers=0
    ShowSessionsStats=1
    ShowPagesStats=PBEX
    ShowFileTypesStats=HB
    ShowFileSizesStats=0
    ShowBrowsersStats=0
    ShowOSStats=0
    ShowOriginStats=0
    ShowKeyphrasesStats=0
    ShowKeywordsStats=0
    ShowMiscStats=0
    ShowHTTPErrorsStats=0
    ShowSMTPErrorsStats=0

    Then just use AWStats as usual (run the update process and read the statistics).
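As a rough illustration of how the tab-separated ProFTP record lines up with the AWStats LogFormat string, here is a minimal Python sketch. It is not how AWStats parses the file internally; the function name parse_ftp_line and the FIELDS list are hypothetical names chosen for this example, and the sample record is the one from this FAQ rebuilt with tabs between the fields.

```python
# Field names taken from LogFormat="%time1 %host %logname %method %url %code %bytesd".
FIELDS = ["time1", "host", "logname", "method", "url", "code", "bytesd"]


def parse_ftp_line(line):
    """Split one tab-separated record into a dict keyed by LogFormat field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d tab-separated fields, got %d"
                         % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))


# The sample record from this FAQ, rebuilt with tab separators.
sample = "\t".join([
    "[01/Jan/2001:21:49:57 +0200]", "ftp.server.com", "user",
    "RETR", "/home/fileiget.txt", "226", "1499",
])
record = parse_ftp_line(sample)
print(record["method"], record["url"], record["bytesd"])
```

A record written with spaces instead of tabs makes this sketch raise ValueError, which mirrors the mistake the LogFormat warning is about.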


    FAQ-COM100 : SETUP FOR MAIL SERVER LOG FILES (Postfix, Sendmail, QMail, MDaemon, Exchange...)
    PROBLEM:
    How do I use AWStats to analyze my mail server log files ?
    SOLUTION:

    The following steps apply to AWStats 5.5 or higher.

    For Postfix, Sendmail, QMail or MDaemon log files

    You must set up AWStats to use a mail log file preprocessor (maillogconvert.pl is provided in the AWStats tools directory, but you can use another one of your choice):
    To do so, copy "awstats.model.conf" to "awstats.mail.conf".
    Edit this new config file as follows:
    For standard Postfix, Sendmail, MDaemon and QMail log files, use:
    LogFile="perl /path/to/maillogconvert.pl standard < /pathtomaillog/maillog |"
    If the log files are compressed, use:
    LogFile="gzip -cd /var/log/maillog.0.gz | /path/to/maillogconvert.pl standard |"
    For VAdmin QMail log files (a mail server with multiple hosts/virtual hosts managed through VAdmin), use:
    LogFile="perl /path/to/maillogconvert.pl vadmin < /pathtomaillog/maillog |"
    Then, whatever your mail server, always make the following changes:
    LogType=M
    LogFormat="%time2 %email %email_r %host %host_r %method %url %code %bytesd"
    LevelForBrowsersDetection=0
    LevelForOSDetection=0
    LevelForRefererAnalyze=0
    LevelForRobotsDetection=0
    LevelForWormsDetection=0
    LevelForSearchEnginesDetection=0
    LevelForFileTypesDetection=0
    ShowMenu=1
    ShowMonthStats=HB
    ShowDaysOfMonthStats=HB
    ShowDaysOfWeekStats=HB
    ShowHoursStats=HB
    ShowDomainsStats=0
    ShowHostsStats=HBL
    ShowAuthenticatedUsers=0
    ShowRobotsStats=0
    ShowEMailSenders=HBL
    ShowEMailReceivers=HBL
    ShowSessionsStats=0
    ShowPagesStats=0
    ShowFileTypesStats=0
    ShowFileSizesStats=0
    ShowBrowsersStats=0
    ShowOSStats=0
    ShowOriginStats=0
    ShowKeyphrasesStats=0
    ShowKeywordsStats=0
    ShowMiscStats=0
    ShowHTTPErrorsStats=0
    ShowSMTPErrorsStats=1
    If you use the MDaemon mail server, you must use the new MDaemon log files whose names end with "-Statistics.log".

    Then just use AWStats as usual (run the update process and read the statistics).

    For Exchange log files

    Exchange offers several log formats, but none of them contains enough information to produce a useful analysis (missing information, unneeded data, no tag to link related records, and so on). As a result, getting an "interesting" analysis from Exchange log files is practically impossible and not worth the effort. Forget AWStats here and use a more serious mail server (sendmail, postfix, qmail, mdaemon, ...). Sorry.


    FAQ-COM110 : SETUP FOR MEDIA SERVER LOG FILES (Realmedia, Windows media, Darwin streaming server)
    PROBLEM:
    How do I use AWStats to analyze my media server log files ?
    SOLUTION:

    For Realmedia

    Your log file should contain records like this:
    216.125.146.50 - - [16/Sep/2002:14:57:21 -0500] "GET cme/rhythmcity/rcitycaddy.rm?cloakport=8080,554,7070 RTSP/1.0" 200 6672 [Win95_4.0_6.0.9.374_play32_NS80_en-US_586] [80d280e1-c9ae-11d6-fa53-d52aaed98681] [UNKNOWN] 281712 141 3 0 0 494

    Copy awstats.model.conf to "awstats.mediaserver.conf", then edit that file as follows:
    LogFile="/pathtomediaserverlog/mediaserverlog"
    LogType=S
    LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %uabracket %other %other %other %other %other %other %other %other"
    LogSeparator="\s+"
    ShowMenu=1
    ShowMonthStats=UHB
    ShowDaysOfMonthStats=HB
    ShowDaysOfWeekStats=HB
    ShowHoursStats=HB
    ShowDomainsStats=HB
    ShowHostsStats=HBL
    ShowAuthenticatedUsers=0
    ShowRobotsStats=0
    ShowEMailSenders=0
    ShowEMailReceivers=0
    ShowSessionsStats=0
    ShowPagesStats=PB
    ShowFileTypesStats=HB
    ShowFileSizesStats=0
    ShowBrowsersStats=1
    ShowOSStats=1
    ShowOriginStats=PH
    ShowKeyphrasesStats=0
    ShowKeywordsStats=0
    ShowMiscStats=0
    ShowHTTPErrorsStats=1
    ShowSMTPErrorsStats=0

    Then just use AWStats as usual (run the update process and read the statistics).
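To make the Realmedia LogFormat more concrete, the following Python sketch pulls the leading fields out of the sample record shown in this FAQ. It is only an approximation written for illustration: the regular expression and the names LINE_RE and parse_realmedia_line are assumptions of this example, not AWStats code, and the trailing bracketed and numeric fields (the %other tags) are simply ignored.

```python
import re

# Leading fields of the Realmedia record: host, two more tokens, a bracketed
# timestamp, the quoted "method url protocol" request, status code, byte
# count, and the bracketed user agent.
LINE_RE = re.compile(
    r'(?P<host>\S+)\s+(?P<other>\S+)\s+(?P<logname>\S+)\s+'
    r'\[(?P<time1>[^\]]+)\]\s+'
    r'"(?P<method>\S+)\s+(?P<url>\S+)\s+(?P<protocol>[^"]+)"\s+'
    r'(?P<code>\d+)\s+(?P<bytesd>\d+)\s+'
    r'\[(?P<ua>[^\]]*)\]'
)


def parse_realmedia_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


sample = ('216.125.146.50 - - [16/Sep/2002:14:57:21 -0500] '
          '"GET cme/rhythmcity/rcitycaddy.rm?cloakport=8080,554,7070 RTSP/1.0" '
          '200 6672 [Win95_4.0_6.0.9.374_play32_NS80_en-US_586] '
          '[80d280e1-c9ae-11d6-fa53-d52aaed98681] [UNKNOWN] 281712 141 3 0 0 494')
fields = parse_realmedia_line(sample)
print(fields["host"], fields["code"], fields["bytesd"])
```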


    For Windows Media server / Darwin streaming server

    1- If your version of Windows Media / Darwin streaming server lets you change the log format, change it so that the following fields are recorded:
    c-ip
    date
    time
    cs-uri-stem
    c-starttime
    x-duration
    c-rate
    c-status
    c-playerid
    c-playerversion
    c-playerlanguage
    cs(User-Agent)
    cs(Referer)
    c-hostexe
    c-hostexever
    c-os
    c-osversion
    c-cpu
    filelength
    filesize
    avgbandwidth
    protocol
    transport
    audiocodec
    videocodec
    channelURL
    sc-bytes

    To apply the changes, stop the server, remove the old log files, and restart the server.
    Play a streaming file, then check that the new log file contains records like this:
    80.223.91.37 2002-10-08 14:18:58 mmst://mydomain.com/mystream 0 106 1 200 {F4A826EE-FA46-480F-A49B-76786320FC6B} 8.0.0.4477 fi-FI - - wmplayer.exe 8.0.0.4477 Windows_2000 5.1.0.2600 Pentium 0 0 20702 mms TCP Windows_Media_Audio_9 - - 277721

    If your version of Windows Media / Darwin streaming server does not let you change the log format:
    Follow step 2 as is, but for the AWStats LogFormat parameter use the log format string found on the first line of your log file (just after "#Fields"). For example:
    LogFormat="c-ip date time c-dns cs-uri-stem c-starttime x-duration c-rate c-status c-playerid c-playerversion c-playerlanguage cs(User-Agent) cs(Referer) c-hostexe c-hostexever c-os c-osversion c-cpu filelength filesize avgbandwidth protocol transport audiocodec videocodec channelURL sc-bytes c-bytes s-pkts-sent c-pkts-received c-pkts-lost-client c-pkts-lost-net c-pkts-lost-cont-net c-resendreqs c-pkts-recovered-ECC c-pkts-recovered-resent c-buffercount c-totalbuffertime c-quality s-ip s-dns s-totalclients s-cpu-util"
    In other words, AWStats can sometimes understand IIS or Windows Media server tags even when no AWStats tags are used.

    2- Set up AWStats to analyze the media server log file:
    Copy the awstats.model.conf file to "awstats.mediaserver.conf".
    Edit the new config file as follows:
    LogFile="/pathtomediaserver/mediaserverlog"
    LogType=S
    LogFormat="c-ip date time cs-uri-stem c-starttime x-duration c-rate c-status c-playerid c-playerversion c-playerlanguage cs(User-Agent) cs(Referer) c-hostexe c-hostexever c-os c-osversion c-cpu filelength filesize avgbandwidth protocol transport audiocodec videocodec channelURL sc-bytes"
    DecodeUA=1
    ShowMenu=1
    ShowMonthStats=UHB
    ShowDaysOfMonthStats=HB
    ShowDaysOfWeekStats=HB
    ShowHoursStats=HB
    ShowDomainsStats=HB
    ShowHostsStats=HBL
    ShowAuthenticatedUsers=0
    ShowRobotsStats=0
    ShowEMailSenders=0
    ShowEMailReceivers=0
    ShowSessionsStats=0
    ShowPagesStats=PB
    ShowFileTypesStats=HB
    ShowFileSizesStats=0
    ShowBrowsersStats=1
    ShowOSStats=1
    ShowOriginStats=H
    ShowKeyphrasesStats=0
    ShowKeywordsStats=0
    ShowMiscStats=0
    ShowHTTPErrorsStats=1
    ShowSMTPErrorsStats=0

    Then just use AWStats as usual (run the update process and read the statistics).
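The relation between the "#Fields" line and each record can be sketched in a few lines of Python. This is an illustrative example only, not AWStats code: the helper names fields_from_header and parse_record are hypothetical, and the header below is an assumed "#Fields:" line built from the 27 field names listed in this FAQ. The record is the sample shown in step 1.

```python
def fields_from_header(header):
    """Return the field names that follow '#Fields:' on a W3C-style header line."""
    if not header.startswith("#Fields"):
        raise ValueError("not a #Fields header line")
    return header.split(":", 1)[1].split()


def parse_record(fields, line):
    """Pair each whitespace-separated value of a record with its field name."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError("record has %d values, header has %d fields"
                         % (len(values), len(fields)))
    return dict(zip(fields, values))


header = ("#Fields: c-ip date time cs-uri-stem c-starttime x-duration c-rate "
          "c-status c-playerid c-playerversion c-playerlanguage cs(User-Agent) "
          "cs(Referer) c-hostexe c-hostexever c-os c-osversion c-cpu filelength "
          "filesize avgbandwidth protocol transport audiocodec videocodec "
          "channelURL sc-bytes")
sample = ("80.223.91.37 2002-10-08 14:18:58 mmst://mydomain.com/mystream 0 106 1 "
          "200 {F4A826EE-FA46-480F-A49B-76786320FC6B} 8.0.0.4477 fi-FI - - "
          "wmplayer.exe 8.0.0.4477 Windows_2000 5.1.0.2600 Pentium 0 0 20702 mms "
          "TCP Windows_Media_Audio_9 - - 277721")

fields = fields_from_header(header)
record = parse_record(fields, sample)
print(record["c-ip"], record["c-status"], record["sc-bytes"])
```

If the number of values in a record does not match the number of field names, the LogFormat string and the server's actual log format disagree, which is worth checking before running an update.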


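    FAQ-COM120 : HOW TO ROTATE MY LOGS WITHOUT LOSING DATA
    PROBLEM:
    I want to archive/rotate my log files using my web server's built-in features or an external tool (rotatelog, cronolog), but I don't want to lose a single record during the rotation.
    SOLUTION: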
  • If your config file has a LogFile parameter that points to the log file currently being written (required if you want to use the AllowToUpdateStatsFromBrowser option to get "real-time" statistics), you must run the AWStats update JUST BEFORE the rotation to avoid losing too many records during the rotate process.
    The best way to do this on Linux-like systems is to use the built-in logrotate feature. Edit the logrotate config file used for your web server log file (usually stored in the /etc/logrotate.d directory) and add the AWStats update process as a prerotate command, as in this example (the lines to add are the prerotate block, from prerotate to the first endscript):
    /usr/local/apache/logs/*log
    {
    notifempty
    daily
    rotate 7
    compress
    sharedscripts
    prerotate
    /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=mydomainconfig
    endscript
    postrotate
    /usr/bin/killall -HUP httpd
    endscript
    }

    With such a setup, the steps happen in this order:
    Step | Description | Step name | Date/Time example
    A | logrotate is started (by cron). | Start of logrotate | 04:02:00
    B | awstats -update is launched by logrotate. | Start of awstats | 04:02:01
    C | awstats starts to read the log file file.log. | | 04:02:02
    D | awstats has reached the end of the log file, so it starts to save its database to disk. | | 04:05:00
    E | awstats has finished saving its new database, so it stops. | End of awstats | 04:06:00
    F | logrotate renames the old log file file.log to file.log.sav. Apache keeps logging to file.log.sav because its log file handle has not changed (only the file name has). | Log move | 04:06:01
    G | logrotate sends the -HUP or -USR1 signal to Apache. With -HUP, Apache immediately kills all its child processes/threads, closes file.log.sav and reopens file.log, so from now on ALL hits are written to the new file. With -USR1, Apache asks its child processes/threads to stop only once their HTTP request has been completely served, but it immediately closes file.log.sav and reopens file.log, so only NEW hits are written to the new log file; HTTP requests that are still running write to the old one. | Apache restart | 04:06:02
    H | logrotate starts to compress the old log file file.log.sav into file.log.gz. | Start compress | 04:06:03
    I | If some Apache threads/processes are still running (because the signal sent was -USR1, so child processes wait for the end of their request before stopping), they are still writing to file.log.sav. If kill -HUP was used, all processes have already restarted, so everything is written to the new file.log. | |
    J | logrotate has finished compressing the log file into file.log.gz. File file.log.sav is deleted. | End of compress, End of logrotate | 04:07:03
    K | If the signal was -USR1, some old child processes can still be running (when serving a very long request, for example). Their log writes, still done through the same file handle, go to a file that has been removed, so those log entries are lost (only if -USR1 was used and the request was very long). | |

    The advantage of this solution is that it is a very common way of working, used by a lot of products, and easy to set up. Note that you can still "lose" some hits:
    If you use the -HUP signal, you only lose the hits written during D and E. Note that you also break all requests still running at G. In the example, that is 1 minute lost (for small or medium web sites it is less than a few seconds), which gives an error below 0.07% (less for small web sites). This is not significant, above all for a "statistics" program.
    If you use the -USR1 signal, you do not kill any request. But you lose all hits written during D and E (as with -HUP), plus all hits still running after H (very long requests that take several minutes to serve). If a hit ends during I, it is written to a log file that has already been analyzed; if it ends at K, it is written nowhere. In the example, the error is again 0.07%, plus the error for those invisible hits that finished during I or K. Their number should be very low, since only hits that started before G and were still running after H are concerned. In most cases a hit takes only a few milliseconds to serve, so the lost hits can be ignored.

    Note also that if you have x logrotate config files, each with a postrotate that does a kill -HUP, you send the kill x times to your server process. So try to put several log files in the same logrotate config file. You can have several awstats update commands in the same prerotate section and send the -HUP only once, after all updates are finished. However, this makes the time lapse between D and F (during which some hits are lost) longer.
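The 0.07% figure can be checked with a little arithmetic. The sketch below assumes a daily rotation and traffic spread evenly over the day (both simplifications made for this example), and uses the D and E times from the example timeline; lost_fraction is a hypothetical helper name.

```python
from datetime import datetime


def lost_fraction(start, end, period_seconds=24 * 60 * 60):
    """Share of one rotation period that falls between two HH:MM:SS times."""
    fmt = "%H:%M:%S"
    window = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return window.total_seconds() / period_seconds


# Hits written while awstats saves its database (step D to step E) are missed.
fraction = lost_fraction("04:05:00", "04:06:00")
print(f"{fraction * 100:.4f}% of a day's hits")
```

With a one-minute window this gives 60 / 86400, just under 0.07% of the day. A shorter save phase, as on a small site, shrinks the figure proportionally.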

  • Another common way of working is to run the AWStats update process only once the log file has been archived.
    This is required, for example, if you use the cronolog or rotatelog tools to rotate your log files. For example, Apache users can set up their httpd config file to write the log through a pipe to cronolog or rotatelog using the Apache CustomLog directive:
    CustomLog "|/usr/sbin/cronolog [cronolog_options] /var/logs/access.%Y%m%d.log" combined
    With such a setup, you cannot trigger the AWStats update process just BEFORE the rotation, so you must run it AFTER the rotation, on the archived log file.
    To make awstats always point to the last archived log file, you can use the 'tags' available for LogFile.
    The drawback is that your data are refreshed only after a rotation. However, you miss absolutely nothing (no hits) and your server processes are never killed.
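To give an idea of what pointing at "the last archived log file" means, here is a small Python sketch that expands time-shifted tags of the form %YYYY-n, %MM-n, %DD-n and %HH-n, where n is a number of hours to go back. The tag syntax is written here as I understand the LogFile tags, so treat it as an assumption and check the LogFile parameter documentation; expand_logfile_tags is a hypothetical helper, not part of AWStats, and only these four tags are handled.

```python
import re
from datetime import datetime, timedelta

# Tag name -> strftime format. Only a subset of tags is covered in this sketch.
FORMATS = {"YYYY": "%Y", "MM": "%m", "DD": "%d", "HH": "%H"}
TAG_RE = re.compile(r"%(YYYY|MM|DD|HH)-(\d+)")


def expand_logfile_tags(template, now):
    """Replace each %TAG-n with the matching date part as it was n hours before now."""
    def replace(match):
        tag, hours = match.group(1), int(match.group(2))
        return (now - timedelta(hours=hours)).strftime(FORMATS[tag])
    return TAG_RE.sub(replace, template)


# An update run on 1 January 2001 at 04:10 should read the file archived for
# the previous day, named as in the cronolog example (access.%Y%m%d.log).
name = expand_logfile_tags("/var/logs/access.%YYYY-24%MM-24%DD-24.log",
                           datetime(2001, 1, 1, 4, 10))
print(name)
```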

  • So, if you really want to lose absolutely no hits and want updates more often than the rotation frequency, the best way is still a hybrid solution (I am not sure it is worth the pain, and remember that statistics are only statistics):
    Run the awstats update process frequently from your crontab, every hour for example, with a run half an hour before the rotation. See the next FAQ to learn how to set up a scheduled job.
    Then, once the rotation is done (by logrotate or by a piped cronolog log file), and before the next scheduled awstats update starts, run another update on the archived log file, using the -logfile option to force the update to use the archived log file instead of the current log file defined in the awstats config file. This picks up the missing half hour before the rotation (AWStats will find the new lines). However, don't forget that this particular update MUST finish before the next scheduled (cron) update.


    FAQ-COM130 : HOW TO RUN AWSTATS UPDATE PROCESS FREQUENTLY
    PROBLEM:
    AWStats must be run frequently to update statistics. How can I do this ?
    SOLUTION:
    A good way of working is to run the AWStats update process as a preprocessor of your log rotate process. See previous FAQ (FAQ-COM120) for this.
    But you can also run AWStats update process regularly by a scheduler:

    With Windows, you can use the internal task scheduler. Using this tool is not an AWStats-related topic, so please refer to your Windows manual. Warning: if you use "awstats.pl -config=mysite -update" in your scheduled task, the task may fail. Try this instead:
    "C:\WINNT\system32\CMD.EXE /C C:\[awstats_path]\awstats.pl -config=mysite -update"
    or
    "C:\[perl_path]\perl.exe C:\[awstats_path]\awstats.pl -config=mysite -update"
    A lot of other schedulers (shareware/freeware) are also very good.

    With unix-like operating systems, you can use the "crontab".
    Here are examples of lines you can add to the cron file (see your unix reference manual for cron) :
    To run update every day at 03:50, use :
    50 3 * * * /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -config=mysite -update >/dev/null
    To run update every hour, use :
    0 * * * * /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -config=mysite -update >/dev/null


    FAQ-COM140 : HOW CAN I EXCLUDE MY IP ADDRESS (OR WHOLE SUBNET MASK) FROM STATS ?
    PROBLEM:
    I don't want to see my own IP address in the stats or I want to exclude counting visits from a whole subnet.
    SOLUTION:
    You must edit the config file to change the SkipHosts parameter.
    For example, to exclude:
  • your own IP address 123.123.123.123, use SkipHosts="123.123.123.123"
  • the whole subnet 123.123.123.xxx, use SkipHosts="REGEX[^123\.123\.123\.]"
  • all sub hosts xxx.myintranet.com, use SkipHosts="REGEX[\.myintranet\.com$]" (This one works only if DNS lookup is already done in your log file).
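To see which hosts the two REGEX patterns above would exclude, you can try them outside AWStats. The sketch below uses Python's re module, which treats these simple patterns the same way Perl does; SKIP_PATTERNS and is_skipped are hypothetical names for this example, and the test host names are made up.

```python
import re

# The patterns from the SkipHosts examples, without the REGEX[...] wrapper.
SKIP_PATTERNS = [
    re.compile(r"^123\.123\.123\."),   # whole subnet 123.123.123.xxx
    re.compile(r"\.myintranet\.com$"),  # all sub hosts xxx.myintranet.com
]


def is_skipped(host):
    """Return True if the host matches at least one exclusion pattern."""
    return any(pattern.search(host) for pattern in SKIP_PATTERNS)


for host in ["123.123.123.45", "124.123.123.123",
             "pc1.myintranet.com", "myintranet.com.example.org"]:
    print(host, is_skipped(host))
```

The escaped dots matter: without the backslashes, "." would match any character and the subnet pattern would also exclude addresses such as 123.123.1234.5.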


    FAQ-COM145 : HOW TO USE THE EXTRA SECTIONS FEATURES ?
    PROBLEM:
    I want to build personalized reports not provided in default AWStats reports. How can I setup the Extra Sections parameters in my AWStats config file to do so ?
    SOLUTION:
    See the "Using AWStats Extra Sections features" page.


    FAQ-COM150 : BENCHMARK / FREQUENCY TO LAUNCH AWSTATS TO UPDATE STATISTICS
    PROBLEM:
    What is AWStats speed ?
    What is the frequency to launch AWStats process to update my statistics ?
    SOLUTION:
    All benchmark information and advice on update frequency are given in the Benchmark page.


    FAQ-COM200 : HOW REVERSE DNS LOOKUP WORKS, UNRESOLVED IP ADDRESSES
    PROBLEM:
    The report page AWStats shows me has no hostnames, only IP addresses, and the countries reported are all "unknown".
    SOLUTION:
    When AWStats finds an IP address in your log file, it tries a reverse DNS lookup to find the hostname and domain if the DNSLookup parameter in your AWStats config file is set to DNSLookup=1 (the default value). So first check that this value is set correctly. DNSLookup=0 must be used only if your log file already contains resolved IP addresses, for example when Apache is set up with the HostnameLookups On directive. When you ask your web server to do the reverse DNS lookup itself and log hostnames instead of IP addresses, you will still find some IP addresses in your log file, because a reverse DNS lookup is not always possible. But if your web server fails to resolve an address, AWStats will fail too (all reverse DNS lookups use the same system API). So, to keep AWStats from repeating a lookup that was already done (successfully or not), you can set DNSLookup=0 in the AWStats config file. If you prefer, you can do the reverse DNS lookup on a log file before running your log analyzer (if you only need to convert a log file with IP addresses into a log file with resolved hostnames). You can use the logresolvemerge tool provided with the AWStats distribution for this (it is an improved version of the logresolve tool provided with Apache).
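The idea of a reverse lookup that falls back to the raw IP address can be sketched with Python's standard socket module. This only illustrates the principle and is not how AWStats or logresolvemerge is implemented; resolve_host and the cache dictionary are hypothetical names, and the host name in the example is made up.

```python
import socket


def resolve_host(ip, cache):
    """Return the hostname for ip, or ip itself when the reverse lookup fails.

    Results, including failures, are cached so that each address is looked
    up at most once.
    """
    if ip not in cache:
        try:
            cache[ip] = socket.gethostbyaddr(ip)[0]
        except OSError:  # covers socket.herror and socket.gaierror
            cache[ip] = ip
    return cache[ip]


# A pre-filled cache entry is returned without any DNS query.
cache = {"80.8.55.1": "host1.example.com"}
print(resolve_host("80.8.55.1", cache))
```

An address that cannot be resolved stays in the output as an IP address, which is why a report can still show some IP addresses even when lookups are enabled.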


    FAQ-COM250 : DIFFERENT RESULTS THAN OTHER ANALYZER
    PROBLEM:
    I also use Webalizer, Analog (or another log analyzer) and it doesn't report the same results as AWStats. Why ?
    SOLUTION:
    If you compare AWStats results with those of another log file analyzer, you will find differences, sometimes very large ones. In fact, all analyzers (even AWStats) "over-report" because of proxy servers and robots. However, AWStats is one of the most accurate and its over-reporting is very low, whereas all other analyzers, even the most famous, have a VERY HIGH error rate (10% to 200% more than reality !).
    Here are the most important reasons why you can find large differences:
  • Some dynamic pages generated by CGI programs are not counted as a "Page" (only as a "Hit") by some analyzers (e.g. Webalizer) if the CGI program name does not end with a known extension (.cgi, ...), so they are not included correctly in the statistics. AWStats uses the opposite policy: it assumes a file is a page unless its type is in a list (see the NotPageList parameter). The error rate with such a policy is lower.
  • AWStats is able to detect robot visits. Most analyzers treat robot visits as human visitors, which makes them report more visits and visitors than there really were. When AWStats reports "1 visitor", it means "1 human visitor" (it is not possible to detect all robots, but most of them are detected). Robot visitors are reported separately in the "Robots/Spiders visitors" chart.
  • Some log analyzers use "Hits" to count visitors. This is a very bad way of working: some visitors surf through many proxy servers (e.g. AOL users), so several hosts (with several IP addresses) may be used to reach your site for a single visitor (e.g. one proxy server downloads the page and 2 other servers download the images). If unique visitors are counted on "hits", 3 users are reported, which is wrong. So AWStats considers only HTML "Pages" when counting unique visitors. This reduces the error, though not completely, because a proxy server can still download one HTML frame while another downloads another frame, but it makes the over-reporting of unique visitors smaller.
  • Another important source of differences is that a log file is not always completely sorted but only "nearly" sorted, because of the cache and log writing engines used by the server. Nearly all log analyzers (commercial or not) assume that the log file is "exactly" sorted by hit date when calculating visits, entry and exit pages. But nothing guarantees this, and some log files are only "nearly" sorted, above all on highly loaded servers. AWStats has an advanced parsing algorithm that counts visits, entry and exit pages correctly even if the log file is only "nearly" sorted.
  • Finally, there are internal bugs in log analyzers that make reports wrong. For example, a lot of users have reported that Webalizer "doubles" the number of visits or visitors in some circumstances.
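The proxy example in the list above (one visitor seen through three hosts) can be reproduced with a few lines of Python. This is a simplified sketch, not the AWStats algorithm: NOT_PAGE_EXTENSIONS is an assumed stand-in for a NotPageList value, and the host names and is_page helper are hypothetical.

```python
# Assumed stand-in for NotPageList: file types that are hits but not pages.
NOT_PAGE_EXTENSIONS = {"css", "js", "class", "gif", "jpg", "jpeg", "png", "bmp", "ico"}


def is_page(url):
    """Treat a URL as a page unless its file extension is in the exclusion list."""
    path = url.split("?", 1)[0]
    name = path.rsplit("/", 1)[-1]
    extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    return extension not in NOT_PAGE_EXTENSIONS


# One visitor behind three proxy hosts: one fetches the page, two fetch images.
hits = [
    ("proxy1.example", "/index.html"),
    ("proxy2.example", "/image.gif"),
    ("proxy3.example", "/image2.png"),
]

visitors_by_hits = len({host for host, _ in hits})
visitors_by_pages = len({host for host, url in hits if is_page(url)})
print(visitors_by_hits, visitors_by_pages)
```

Counting on hits reports three visitors for this single person, while counting on pages reports one. URLs without an extension, such as /page3 or /, count as pages under this policy.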
    There are also other reasons, but these explain only small differences:
  • To tell apart successive visits of the same visitor, log analyzers use a visit time-out. If its value differs, the results differ (visit count, entry and exit pages). Such a time-out is a fixed value (for example 60 minutes): if a visitor makes a hit 59 minutes after downloading the previous page, it is the same visit; if he makes it 61 minutes after, it is a new visit. Of course, there is no real difference between 59 and 61, but counting visits without a time-out is not possible. Because what matters is having a time-out (not its exact value), the AWStats time-out is not an "exact" value but is "around" 60 minutes. This lets AWStats process logs faster, so you might also see small differences in visit count between AWStats and another log analyzer even if both time-outs are set to the same value.
  • There are also differences in the databases and algorithms of log analyzers that make the details of the results more or less accurate:
    AWStats has larger browser, OS, search engine and robot databases, so the reports on these are more accurate.
    AWStats has url syntax rules to find the keywords or keyphrases used to find your site, and it also has an algorithm to detect keywords from unknown search engines with unknown url syntax rules.
    AWStats does not count twice (by default) redirects made by rewrite rules, which produce two hits in the log file for only one page "viewed".
    Etc...
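The effect of a visit time-out can be shown with a short Python sketch. It uses a strict 60-minute limit and sorts the timestamps first, so it tolerates a "nearly" sorted log; the real AWStats time-out is only approximately 60 minutes, and count_visits is a hypothetical helper written for this example.

```python
from datetime import datetime, timedelta


def count_visits(timestamps, timeout=timedelta(minutes=60)):
    """Count visits for one visitor: a gap longer than timeout starts a new visit."""
    visits = 0
    previous = None
    for current in sorted(timestamps):  # sorting absorbs out-of-order records
        if previous is None or current - previous > timeout:
            visits += 1
        previous = current
    return visits


# Page hits of one host: a burst just after midnight (logged slightly out of
# order) and a second burst at noon.
page_hits = [
    datetime(2001, 1, 1, 0, 0, 10),
    datetime(2001, 1, 1, 0, 0, 0),
    datetime(2001, 1, 1, 0, 0, 20),
    datetime(2001, 1, 1, 12, 0, 0),
    datetime(2001, 1, 1, 12, 0, 10),
]
print(count_visits(page_hits))
```

With the same function, two hits 59 minutes apart count as one visit and two hits 61 minutes apart count as two, which is the boundary effect described in the list above.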

    If you want to check how reliable your log analyzer is, try parsing the following log file:

    # This is a sample of log file that contains a lot of various data we can find
    # in a log file. Great sample to test reliability and accuracy of any log
    # analyzer.
    # ----------------------------------------------------------------------------
    # This sample log file contains 10 differents IPs that are :
    # 6 human visits done, by 5 different true visitors
    # 1 proxy visit done, by one of the 5 true visitors
    # 1 try of a 6th human visit failed because of wrong url
    # 1 bot visit
    # 1 worm attack
    # 1 add to favourites (two hits but first is non root hit with error)
    # ----------------------------------------------------------------------------
    # 80.8.55.1 2 visits (start at 00:00:00 and at 12:00:00 with both entry page on /)
    # 80.8.55.2 this is not a visit, only an image included into a page of an other site
    # 80.8.55.3 1 visit (and add home page to favourites)
    # 80.8.55.4 same visitor than 80.8.55.3 using aol proxy
    # 80.8.55.5 not a visit (but a bot indexing)
    # 80.8.55.6 1 visit (authenticated visitor)
    # 80.8.55.7 1 visit (authenticated visitor with space in name)
    # 80.8.55.8 not a visit (try but failed twice with 404 and 405 error)
    # 80.8.55.9 not a visit (but a worm attack)
    # 80.8.55.10 1 visit that come from web page not search engines
    # TOTAL:

    80.8.55.1 - - [01/Jan/2001:00:00:10 +0100] "GET /page1.html HTTP/1.0" 200 7009 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:00 +0100] "GET / HTTP/1.0" 200 7009 "http://www.sitereferer/cgi-bin/search.pl?q=a" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:20 +0100] "GET /page2.cgi HTTP/1.0" 200 7009 "http://localhost/page1.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:25 +0100] "GET /page3 HTTP/1.0" 200 7009 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:30 +0100] "GET /image.gif HTTP/1.0" 200 7009 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:35 +0100] "GET /image2.png HTTP/1.0" 200 7009 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:40 +0100] "GET /dir/favicon.ico HTTP/1.0" 404 299 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:00:00:40 +0100] "GET /favicon.ico HTTP/1.0" 200 299 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.1 - - [01/Jan/2001:12:00:00 +0100] "GET / HTTP/1.0" 200 7009 "http://WWW.SiteRefereR:80/cgi-bin/azerty.pl?q=a" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:10 +0100] "GET /page1.html HTTP/1.0" 200 7009 "-" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:20 +0100] "GET /page2.cgi HTTP/1.0" 200 7009 "-" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:25 +0100] "GET /page3 HTTP/1.0" 200 7009 "-" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:30 +0100] "GET /image.gif HTTP/1.0" 200 7009 "-" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:35 +0100] "GET /image2.png HTTP/1.0" 200 7009 "-" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:40 +0100] "GET /js/awstats_misc_tracker.js HTTP/1.1" 200 4998 "-" "Mozilla/4.7 [fr] (Win95; I)"
    80.8.55.1 - - [01/Jan/2001:12:00:45 +0100] "GET /js/awstats_misc_tracker.js?SCREEN=1024x768&CDI=32&JAVA=true&UC=UserCode1056710428488r6832&SC=SessionCode1056710428488r6832&SHK=N&FLA=Y&RP=N&MOV=N&WMA=Y&PDF=Y HTTP/1.1" 200 4998 "-" "Mozilla/4.7 [fr] (Win95; I)"

    80.8.55.2 - - [01/Jan/2001:12:01:00 +0100] "GET /hitfromothersitetoimage.gif HTTP/1.0" 200 7009 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3) Gecko/20030312"

    80.8.55.3 - - [01/Jan/2001:12:01:10 +0100] "GET / HTTP/1.0" 200 7009 "http://www.sitereferer:81/cgi-bin/azerty.pl" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030728 Mozilla Firebird/0.6.1"
    80.8.55.3 - - [01/Jan/2001:12:01:15 +0100] "GET /page1.html HTTP/1.0" 200 7009 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030728 Mozilla Firebird/0.6.1"
    80.8.55.3 - - [01/Jan/2001:12:01:20 +0100] "GET /page2.cgi?x=a&family=a&y=b&familx=x HTTP/1.0" 200 7009 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030728 Mozilla Firebird/0.6.1"
    80.8.55.3 - - [01/Jan/2001:12:01:25 +0100] "GET /page3 HTTP/1.0" 200 7009 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030728 Mozilla Firebird/0.6.1"
    80.8.55.3 - - [01/Jan/2001:12:01:30 +0100] "GET /image.gif HTTP/1.0" 200 7009 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030728 Mozilla Firebird/0.6.1"
    80.8.55.3 - - [01/Jan/2001:12:01:35 +0100] "GET /image2.png HTTP/1.0" 200 7009 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030728 Mozilla Firebird/0.6.1"

    80.8.55.4 - - [01/Jan/2001:12:01:45 +0100] "GET /samevisitorthan80.8.55.3usingaolproxy.gif HTTP/1.0" 200 7009 "-" "Mozilla/3.0 (Windows 98; U) Opera 6.03"

    80.8.55.5 - - [01/Jan/2001:12:02:00 +0200] "GET /robots.txt HTTP/1.0" 200 299 "-" "GoogleBot"
    80.8.55.5 - - [01/Jan/2001:12:02:05 +0200] "GET / HTTP/1.0" 200 7009 "-" "GoogleBot"

    80.8.55.6 - john [01/Jan/2001:13:00:00 +0100] "GET /cgi-bin/order.cgi?x=a&family=a&productId=998&titi=i&y=b&y=b HTTP/1.0" 200 7009 "http://www.google.com/search?sourceid=navclient&ie=utf-8&oe=utf-8&q=ma%C3%AEtre" "SAGEM-myX-5m/1.0_UP.Browser/6.1.0.6.1.103_(GUI)_MMP/1.0_(Google_WAP_Proxy/1.0)"
    80.8.55.6 - john [01/Jan/2001:13:00:00 +0100] "GET /cgi-bin/order.cgi?x=a&family=a&productId=998&titi=i&y=b&y=b HTTP/1.0" 200 7009 "http://www.google.com/search?sourceid=navclient&ie=utf-8&oe=utf-8&q=駘钁e" "SAGEM-myX-5m/1.0_UP.Browser/6.1.0.6.1.103_(GUI)_MMP/1.0_(Google_WAP_Proxy/1.0)"

    80.8.55.7 - John Begood [01/Jan/2001:13:01:00 +0100] "GET /cgi-bin/order.cgi;family=f&type=t&productId=999&titi=i#BIS HTTP/1.0" 200 7009 "-" "Mozilla/3.01 (compatible;)"

    80.8.55.8 - - [01/Jan/2001:14:01:20 +0100] "GET /404notfoundpage.html?paramnotpagefound=valparamnotpagefound HTTP/1.0" 404 0 "http://refererto404nofoundpage/pageswithbadlink.html?paramrefnotpagefound=valparamrefnotpagefound" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    80.8.55.8 - - [01/Jan/2001:14:01:20 +0100] "GET /405error.html HTTP/1.0" 405 0 "http://refererto405error/pagesfrom405.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

    80.8.55.9 - - [01/Jan/2001:15:00:00 +0200] "GET /default.ida?XXXXXXXXXXXXXXXXXX%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0" 404 299 "-" "-"

    80.8.55.10 - - [01/Jan/2001:16:00:00 -0300] "GET / HTTP/1.1" 200 70476 "http://us.f109.mail.yahoo.com/ym/ShowLetter?box=Inbox&MoreYahooParams..." "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90; Hotbar 4.2.8.0)"
    80.8.55.10 - - [01/Jan/2001:17:00:00 -0300] "GET /page1.html HTTP/1.1" 200 70476 "http://www.freeweb.hu/icecat/filmek/film04.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90; Hotbar 4.2.8.0)"
    80.8.55.10 - - [01/Jan/2001:18:00:00 -0300] "GET /cgi-bin/awredir.pl?url=http://xxx.com/aa.html HTTP/1.1" 302 70476 "http://us.f109.mail.yahoo.com/ym/ShowLetter?box=Inbox&MoreYahooParams..." "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90; Hotbar 4.2.8.0)"

    This is what you should find:
    6 true human visits
    5 different true visitors
    1 bot visit
    1 worm attack
    The entry pages for true visits should be "/" (even for 80.8.55.1) or "/cgi-bin/order.cgi" but nothing else.


    FAQ-COM300 : DIFFERENCE BETWEEN LOCAL HOURS AND AWSTATS REPORTED HOURS
    PROBLEM:
    I use IIS and there is a difference between my local time and the time AWStats reports. For example, I hit a page at 4:00 and AWStats reports the hit at 2:00.
    SOLUTION:
    This is not a problem with the clock on your local client host. AWStats uses only the times written to the log by your server, so all times are relative to the server's clock. The problem is that some foreign versions of IIS write GMT time (and not local time) to the log file, so your statistics are also in GMT.
    You can wait for Microsoft to change this in a future IIS version. However, Microsoft article Q271196 "IIS Log File Entries Have the Incorrect Date and Time Stamp" says:
    The selected log file format is the W3C Extended Log File Format. The extended log file format is defined in the W3C Working Draft WD-logfile-960323 specification by Phillip M. Hallam-Baker and Brian Behlendorf. This document defines the Date and Time fields to always be in GMT. This behavior is by design.
    So this behavior might never change. The alternative is to use the AWStats 'timezone' plugin. Warning: this plugin needs the Perl module Time::Local and seriously reduces AWStats speed.
    To enable the plugin, uncomment the following line in your config file:
    LoadPlugin="timezone TZ"
    where TZ is your signed timezone offset (+2 for Paris, -8 for ...)
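    As a rough illustration of what the signed offset means (this is not the plugin's code), the sketch below shifts an hour read from a GMT log by a whole-hour TZ value. The function name shift_hour is hypothetical, and whole-hour offsets are an assumption.

```shell
#!/bin/sh
# shift_hour HOUR TZ
# Hypothetical helper, not part of AWStats. Prints the hour of day (0-23)
# obtained by applying a signed whole-hour offset TZ to HOUR.
# HOUR must be a plain integer without a leading zero (2, not 02).
shift_hour() {
    # Adding 24 before the modulo keeps the result non-negative for
    # negative offsets down to -24.
    echo $(( ($1 + $2 + 24) % 24 ))
}

# The case from the question: the log says 2:00 (GMT), the server is at +2.
shift_hour 2 +2
```

    With the values from the question, this prints 4, the local hour at which the page was actually hit.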


    FAQ-COM350 : HOW CAN I PROCESS OLD LOG FILE ?
    PROBLEM:
    I want to process an old log file to include its data in my AWStats reports.
    SOLUTION:
    Change your LogFile parameter to point to the old log file and run the update (or use the -logfile option on the command line to override the LogFile parameter). The update process only accepts files in chronological order for a particular month. So if you have already processed a recent file and forgot to run the update on a log file that contains older data, you must first reset all your statistics (see FAQ-COM500) and then rerun the update process for all past log files, in chronological order.
    However, there is a "tip" that allows you to rebuild only the month where data is missing:
    Imagine it is the 5th of July 2003 and all your statistics are up to date except for the 10th of April 2003 (you forgot to run the update process for that day, so there are no visits for it). You can :
    - Reset the statistics for April only (this means removing the file awstats042003.[config.]txt as explained in FAQ-COM500).
    - Move the history files for the months after April (awstats052003.[config.]txt, awstats062003.[config.]txt, ...) into a temporary directory, so that they are no longer in the DirData directory, as if they had been deleted.
    - Run the update process on all log files for April, in chronological order. AWStats does not complain about a "too old record" because no history file in the DirData directory contains compiled data more recent than the records in the log you are processing.
    - Move the history files you saved back into your DirData directory.
    Your statistics are now up to date and the missing day is filled in.
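    The "move the later months aside" step of the tip above can be sketched as a small shell function. This is only an illustration under assumptions: the function name stash_newer_history, the stash directory and the config name myconfig are all hypothetical, and the history files are assumed to follow the awstatsMMYYYY.config.txt naming described in FAQ-COM500.

```shell
#!/bin/sh
# stash_newer_history DIRDATA STASHDIR CONFIG YYYY MM
# Hypothetical helper, not part of AWStats. Moves every history file
# awstatsMMYYYY.CONFIG.txt that is more recent than the given month
# from DIRDATA into STASHDIR.
stash_newer_history() (
    dirdata=$1 stash=$2 config=$3 year=$4 month=$5
    mkdir -p "$stash" || exit 1
    for f in "$dirdata"/awstats??????."$config".txt; do
        [ -f "$f" ] || continue
        name=${f##*/}                 # awstatsMMYYYY.CONFIG.txt
        stamp=${name#awstats}         # MMYYYY.CONFIG.txt
        mm=$(printf '%s' "$stamp" | cut -c1-2)
        yyyy=$(printf '%s' "$stamp" | cut -c3-6)
        # Skip names whose date part is not purely numeric.
        case $yyyy$mm in *[!0-9]*) continue ;; esac
        # Compare as YYYYMM so that later years sort after earlier ones.
        if [ "$yyyy$mm" -gt "$year$month" ]; then
            mv "$f" "$stash"/
        fi
    done
)

# Possible sequence for the April 2003 case (all paths are placeholders):
#   rm /path/to/DirData/awstats042003.myconfig.txt
#   stash_newer_history /path/to/DirData /tmp/awstats-stash myconfig 2003 04
#   awstats.pl -config=myconfig -update -logfile=/path/to/april.log   # once per April log, oldest first
#   mv /tmp/awstats-stash/* /path/to/DirData/
```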


    FAQ-COM400 : HOW CAN I UPDATE MY STATISTICS WHEN I USE A LOAD BALANCING SYSTEM THAT SPLITS MY LOGS ?
    PROBLEM:
    How can I update my statistics when I use a load balancing system that splits my logs ?
    SOLUTION:
    The best solution is to merge the split log files from all your load-balanced servers into one. For this, you can use the logresolvemerge tool provided with AWStats :
    logresolvemerge.pl file1.log file2.log ... filen.log > newfiletoprocess.log
    Then set the LogFile parameter in your config file to process newfiletoprocess.log, or use the -logfile command line option to override the LogFile value.


    FAQ-COM500 : HOW CAN I RESET ALL MY STATISTICS ?
    PROBLEM:
    I want to reset all my statistics to restart the update process from the beginning.
    SOLUTION:
    AWStats stores all analyzed data in history files called awstatsMMYYYY.[config.]txt (one file per month). You will find these files in the directory defined by the DirData parameter (by default, the same directory as awstats.pl).
    To reset all your statistics, just delete all awstatsMMYYYY.txt files.
    To reset the statistics built for a particular config file, just delete all awstatsMMYYYY.myconfig.txt files.
    Warning: if you delete these data files, you cannot recover your statistics unless you kept the old log files somewhere. You will have to process all past log files (in chronological order) to rebuild your statistics.
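    Because deleting history files cannot be undone, it can help to list exactly which files a reset would touch before removing anything. The sketch below does only the listing. The function name list_history_files is hypothetical, and the file naming is taken from the description above.

```shell
#!/bin/sh
# list_history_files DIRDATA [CONFIG]
# Hypothetical helper, not part of AWStats. Prints the history files found
# directly in DIRDATA: awstatsMMYYYY.txt when no CONFIG is given, otherwise
# awstatsMMYYYY.CONFIG.txt. It only lists files and deletes nothing.
list_history_files() (
    dirdata=$1
    config=$2
    digits='[0-9][0-9][0-9][0-9][0-9][0-9]'
    if [ -n "$config" ]; then
        suffix=".$config.txt"
    else
        suffix=".txt"
    fi
    # $digits is left unquoted on purpose so the shell expands it as a pattern.
    for f in "$dirdata"/awstats$digits"$suffix"; do
        # When nothing matches, the unexpanded pattern is skipped here.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
)
```

    For example, list_history_files /path/to/DirData myconfig (placeholder path and config name) shows the files for one config. Review the list, then delete those files by hand.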


    FAQ-COM600 : HOW CAN I COMPILE AND BUILD STATISTICS ON A DAILY BASIS ONLY ?
    PROBLEM:
    How can I compile and build statistics on a daily basis ? I want a full report, with all charts, containing data for one particular day only, and I want one such report for each day of the month.
    SOLUTION:
    This is an undocumented and unsupported trick, as it is not the standard way of working:
    First, run the update process at midnight, or on a log file that was rotated at midnight so that it contains only data for that particular day (you can choose another hour of the night if you want your days to "start" at a different hour).
    Once the update process has run, MOVE (do not copy) the history file built by AWStats. For example on Unix-like systems:
    mv   mydirdata/awstatsMMYYYY.mydomain.txt   mydirdata/awstatsDDMMYYYY.mydomain.txt
    Note that the name has been changed by adding the day. Repeat this each day after the update process.
    This gives you one history file per day. You can then see full statistics for a particular day by adding the undocumented parameter -day=DD on the command line (along with others like -month=MM and -year=YYYY). If AWStats is run from a browser, you can also add &day=DD to the URL.
    However, if you keep full day-by-day statistics, you no longer have statistics for the full month, unless you create a second config file whose history files are not moved.
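    The daily rename described above could be wrapped in a small function so that it can be called from a scheduled job right after the update. This is a sketch only: the function name archive_daily_history is hypothetical, and the day, month and year are passed explicitly because a job that runs just after midnight must name the previous day, not the current one.

```shell
#!/bin/sh
# archive_daily_history DIRDATA CONFIG DD MM YYYY
# Hypothetical helper, not part of AWStats. Renames the monthly history file
# awstatsMMYYYY.CONFIG.txt to awstatsDDMMYYYY.CONFIG.txt inside DIRDATA.
archive_daily_history() (
    dirdata=$1 config=$2 dd=$3 mm=$4 yyyy=$5
    src="$dirdata/awstats$mm$yyyy.$config.txt"
    dst="$dirdata/awstats$dd$mm$yyyy.$config.txt"
    if [ ! -f "$src" ]; then
        echo "no history file to move: $src" >&2
        exit 1
    fi
    # Move, do not copy, so the next update starts a fresh file.
    mv "$src" "$dst"
)

# Example (placeholder path and config name):
#   archive_daily_history /path/to/DirData mydomain 10 04 2003
```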


    FAQ-COM700 : CAN I SAFELY REMOVE A LINE IN HISTORY FILES (awstatsMMYYYY*.txt) ?
    PROBLEM:
    After processing a log file, I want to change my statistics without running the AWStats update process, by directly editing the data in the AWStats history files.
    SOLUTION:
    If you remove a line starting with "BEGIN_" or "END_", AWStats will consider your file "corrupted", so you must not change those two kinds of lines.
    You can change, add or remove any line inside any section, but if you do, you must also update the MAP section (the lines between BEGIN_MAP and END_MAP), because this section contains the file offset of every other section for direct I/O access. If the history file is the most recent one, you can easily do that by removing the MAP section completely and running an update process. AWStats will then rewrite the history file, including the MAP section (the update process does not read the MAP section, it only writes it). You do this at your own risk. The main risk is that some charts will report wrong values or be unavailable.
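    Removing the MAP section as described above can be done with sed rather than by hand. The sketch below assumes only what the text states, namely that the section is delimited by lines starting with BEGIN_MAP and END_MAP. The function name strip_map_section is hypothetical, and the result is written to standard output so that the original file is left untouched.

```shell
#!/bin/sh
# strip_map_section HISTORYFILE
# Hypothetical helper, not part of AWStats. Prints HISTORYFILE without the
# lines from the one starting with BEGIN_MAP through the one starting
# with END_MAP (both included).
strip_map_section() {
    sed '/^BEGIN_MAP/,/^END_MAP/d' "$1"
}

# Example (placeholder file names); keep a backup before replacing anything:
#   cp awstats072003.myconfig.txt awstats072003.myconfig.txt.bak
#   strip_map_section awstats072003.myconfig.txt.bak > awstats072003.myconfig.txt
```

    After this, run an update process so that AWStats rewrites the MAP section.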




    FAQ-SET050 : ERROR "MISSING $ ON LOOP VARIABLE ..."
    PROBLEM:
    When I run awstats.pl from command line, I get:
    "Missing $ on loop variable at awstats.pl line xxx"
    SOLUTION:
    The problem is in your Perl interpreter. Try installing or reinstalling a more recent/stable Perl interpreter.
    You can get a new Perl version from ActivePerl (Win32) or Perl.com (Unix/Linux/Other).


    FAQ-SET100 : I SEE PERL SCRIPT'S SOURCE INSTEAD OF ITS EXECUTION
    PROBLEM:
    When I try to execute the Perl script through the web server, I see the Perl script's source instead of the HTML result page of its execution !
    SOLUTION:
    This is not a problem with AWStats but with your web server setup. The awstats.pl file must be in a directory that your web server treats as a "cgi" directory, that is, a directory configured to contain "executable" files rather than document files. Read your web server manual to learn how to set up a directory as an "executable cgi" directory (with IIS, there are checkboxes to tick in the directory properties; with Apache, you have to use the "ExecCGI" option in the "Directory" directive).


    FAQ-SET150 : INTERNAL ERROR 500 IN MY BROWSER
    FAQ-SET200 : ERROR "... COULDN'T CREATE/SPAWN CHILD PROCESS..."
    PROBLEM:
    AWStats seems to run fine at the command prompt, but when run as a CGI from a browser, I get an "Internal Error 500".
    I might also have the following message in my Apache error log file (or in the browser with Apache 2.0+):
    ...couldn't create/spawn child process: c:/mywebroot/cgi-bin/awstats.pl
    SOLUTION:
    First, try to run awstats.pl from the command line to see if the file is correct. If you get syntax errors and use a Unix-like OS, check that your file is a Unix-like text file (this means each line ends with a LF char and not with CR+LF).
    If awstats.pl runs correctly from the command line, the cause is probably that your web server does not know how to run Perl scripts. This problem can occur with Apache web servers that have no internal Perl interpreter (mod_perl not active). To solve this, you must tell Apache where your external Perl interpreter is.
    For this, you have 2 solutions:
    1) Add the following directive to your Apache httpd.conf config file (or remove the # to uncomment it if the line is already there):
    ScriptInterpreterSource registry
    Then restart Apache. This tells Apache to look in the registry to find the program associated with the .pl extension.
    2) Other solution (not necessary if the first solution works): change the first line of the awstats.pl file to the full path of your Perl interpreter.
    For example, with Windows and the ActivePerl interpreter (installed in C:\Program Files\ActiveState\ActivePerl), you must change the first line of the awstats.pl file to:
    #!c:/program files/activestate/activeperl/bin/perl
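    For the line-ending check mentioned at the start of this solution, a Unix-like shell can report whether a file contains carriage returns and produce a copy without them. This is a minimal sketch. The function names has_cr and to_unix_eol are hypothetical.

```shell
#!/bin/sh
# has_cr FILE
# Hypothetical helper. Succeeds (exit status 0) if FILE contains at least
# one carriage return, which is what CR+LF line endings leave behind.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# to_unix_eol FILE
# Hypothetical helper. Prints FILE with every carriage return removed
# (all of them, not only those directly before a LF).
to_unix_eol() {
    tr -d '\r' < "$1"
}

# Example (placeholder file names):
#   if has_cr awstats.pl; then to_unix_eol awstats.pl > awstats.unix.pl; fi
```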


    FAQ-SET220 : CRASH WHILE RUNNING AWSTATS.PL OR PAGE CONTENT ONLY PARTIALLY LOADED ON WINDOWS XP
    PROBLEM:
    Sometimes my browser (most often IE6) crashes while running awstats.pl with some AWStats configurations. With some other versions or browsers, the page content is only partially loaded.
    SOLUTION:
    The problem was with WinXP and WinXPpro, as documented at the MS site under Q317949:
    "Socket Sharing Creates Data Loss When Listen and Accept Occur on Different Processes"
    The result was that MSIE would crash or display nothing. Netscape and Opera handled the socket better but displayed the pages only partially.
    The effect of the bug was more pronounced as the page size increased (above 30k).
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;q317949
    It is also documented at Apache.org:
    http://www.apache.org/dist/httpd/binaries/win32/
    MS produced a hotfix, which is now included in SP1.


    FAQ-SET250 : LOG FORMAT SETUP OR ERRORS
    PROBLEM:
    Which value do I have to put in the LogFormat parameter to make AWStats work with my log file format ?
    SOLUTION:
    The AWStats config file lists all possible values for the LogFormat parameter. To help you, here are some common log file formats and the corresponding LogFormat value you must use in your AWStats config file:
    If your log records are EXACTLY like this (NCSA combined/XLF/ELF log format):
    62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
    You must use : LogFormat=1
    This is the same as: LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot"
    If your log records are EXACTLY like this (NCSA combined with several virtual host names sharing the same log file):
    virtualserver1 62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
    You must use : LogFormat="%virtualname %host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot"
    If your log records are EXACTLY like this (NCSA combined and mod_gzip format 1 with Apache 1.x):
    62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 3904 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" mod_gzip: 66pct.
    You must use : LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %other %gzipratio"
    If your log records are EXACTLY like this (NCSA combined and mod_gzip format 2 with Apache 1.x):
    62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 3904 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" mod_gzip: DECHUNK:OK In:11393 Out:3904:66pct.
    You must use : LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %other %other %gzipin %gzipout"
    If your log records are EXACTLY like this (NCSA combined and mod_deflate with Apache 2 ):
    62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 3904 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" (45)
    You must use : LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %deflateratio"
    If your log records are EXACTLY like this (NCSA combined with 2 spaces between some fields with Zope):
    62.161.78.73  - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 3904 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" (45)
    You must use :
    LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot"
    LogSeparator=" *"
    If your log records are EXACTLY like this (NCSA common CLF log format):
    62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234
    You must use : LogFormat=4
    Note: Browsers, OS's, Keywords and Referers features are not available with such a format.
    If your log records are EXACTLY like this (With some Squid versions, after setting emulate_http_log to on):
    200.135.30.181 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET http://www.mydomain.com/page.html HTTP/1.0" 200 456 TCP_CLIENT_REFRESH_MISS:DIRECT
    You must use : LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %other"
    If your log records are EXACTLY like this (Some old IIS W3C log format):
    yyyy-mm-dd hh:mm:ss 62.161.78.73 - GET /page.html 200 1234 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) http://www.from.com/from.html
    You must use : LogFormat=2
    If your log records are EXACTLY like this (Some IIS W3C log format with some .net servers):
    yyyy-mm-dd hh:mm:ss GET /page.html - 62.161.78.73 - Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) http://www.from.com/from.html 200 1234 HTTP/1.1
    You must use : LogFormat="%time2 %method %url %logname %host %other %ua %referer %code %bytesd %other"
    If your log records are EXACTLY like this (Some IIS 6+ W3C log format):
    yyyy-mm-dd hh:mm:ss GET /page.html - 62.161.78.73 - Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) http://www.from.com/from.html 200 1234
    You must use : LogFormat="date time cs-method cs-uri-stem cs-username c-ip cs-version cs(User-Agent) cs(Referer) sc-status sc-bytes"
    If your log records are EXACTLY like this (With some WebSite versions):
    yyyy-mm-dd hh:mm:ss 62.161.78.73 - 192.168.1.1 80 GET /page.html - 200 11205 0 0 HTTP/1.1 mydomain.com Mozilla/4.0+(compatible;+MSIE+5.5;+Windows+98) - http://www.from.com/from.html
    You must use : LogFormat="%time2 %host %logname %other %other %method %url %other %code %bytesd %other %other %other %other %ua %other %referer"
    If your log records are EXACTLY like this (Webstar native log format):
    05/21/00 00:17:31 OK 200 212.242.30.6 Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt) http://www.cover.dk/ "www.cover.dk" :Documentation:graphics:starninelogo.white.gif 1133
    You must use : LogFormat=3
    If your log records are EXACTLY like this (With some Lotus Notes/Domino versions):
    62.161.78.73 - Name Surname Service [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
    You must use : LogFormat=6
    If your log records are EXACTLY like this (Lotus Notes/Domino 6.x log format):
    62.161.78.73 - "Name Surname" Service [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
    You must use : LogFormat="%host %other %lognamequot %time1 %methodurl %code %bytesd %refererquot %uaquot"
    If your log records are EXACTLY like this (With Oracle9iAS):
    62.161.78.73 - [dd/mmm/yyyy:hh:mm:ss +0x00] GET /page.html HTTP/1.1 200 1234 - "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
    Where the separators may be several spaces or other whitespace characters, you must use : LogFormat="%host %logname %time1 %method %url %other %code %bytesd %referer %uaquot" and LogSeparator="\s+"
    If you use a FTP server like ProFTP:
    See FAQ-COM090.
    If you want to analyze a mail log file (Postfix, Sendmail, QMail, MDaemon, Exchange):
    See FAQ-COM100.
    If you use a Media Server (Realmedia, Windows Media Server):
    See FAQ-COM110.
    If your log records are EXACTLY like this (With some providers):
    62.161.78.73 - - [dd/Month/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" "-" 200 1234
    You must use : LogFormat="%host %other %logname %time1 %methodurl %other %code %bytesd"
    Note: Browsers, OS's, Keywords and Referers features are not available with such a format.
    There are many other possible log formats.
    To support them, you must use a personalized log format LogFormat="..." as described in the config file.
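    Before choosing LogFormat=1, it can be useful to test a few lines of the log against the NCSA combined layout shown in the first case above. The sketch below does that with grep. The regular expression is only an approximation written for this illustration (it is not the pattern AWStats uses), and the function name looks_combined is hypothetical.

```shell
#!/bin/sh
# looks_combined LINE
# Hypothetical helper, not part of AWStats. Succeeds if LINE has the shape
#   host ident user [time] "request" status bytes "referer" "user agent"
# with single spaces between fields.
looks_combined() {
    printf '%s\n' "$1" |
        grep -Eq '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'
}

# Example: check the first line of a log file (placeholder path).
#   if looks_combined "$(head -n 1 /path/to/access.log)"; then echo "combined"; fi
```

    A line in the common (CLF) format fails this test because it has no referer and user agent fields, which matches the note above about features that are unavailable with that format.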


    FAQ-SET270 : ONLY CORRUPTED OR DROPPED RECORDS
    PROBLEM:
    After running an AWStats update process, all my records are reported as corrupted or dropped.
    SOLUTION:
    First, if only a small percentage of your records are corrupted or dropped, don't worry. This is normal behaviour. A few corrupted or dropped records can appear in a log file because of an internal web server bug, a virus attack, a write error, a log purge or rotation during a write, etc.
    However, if ALL your records are reported as corrupted or dropped, check the following things:
    If they are all dropped, run the update process from the command line, adding the option -showdropped
    -> You will then see why each dropped record is discarded. In most cases, this is because a filter parameter (SkipFiles, SkipHosts, OnlyFiles ...) is too broad or wrong.
    If they are all corrupted, run the update process from the command line, adding the option -showcorrupted
    -> You will then see why each corrupted record is discarded.
    If this is because of the log format, check FAQ-SET350 about log format errors.
    If this is because the date of a record is said to be lower than the date of the previous one, this means that you ran update processes on different log files without respecting the chronological order of the log files.
    If this is because the date is invalid, the date may not be computed correctly, which happens with some Pentium4/Xeon4 processors:
    On a few Intel Pentium4 (and Xeon4) based host systems, the log file time cannot be computed correctly. This is not an issue in AWStats itself. The error usually occurs on source-based Linux distributions (Gentoo, Slackware, etc.), where all system libraries are compiled with CPU optimizations. AWStats is a highly developed Perl application, and Perl itself relies on some system libraries, for example GLIBC. In this case it is usually the GLIBC library that is buggy. There is an easy way to find out whether the problem described here is responsible for the AWStats problems on your system:
    If you have shell access to your machine, simply type the following command:
    perl -e "print int ('541234567891011165415658')"
    (NOTE: any number of about 25 digits works, there is no need to type this exact number)
    If everything is fine, you should see a floating point number as output:
    5.41234567891011e+23
    In this case, investigate your log file format further. Your host system is not responsible for the error.
    But if the command returns just "0" or some other error, this indicates that your glibc is corrupt.
    ATTENTION: The only solution in this case might be to recompile your GLIBC. This can be quite a tricky task. Please consult the documentation and FAQs of your Linux distribution first!! (Experts: first check your global compile flags, e.g. march=Pentium4. Trying other compile flags can solve the problem quickly in some cases.)
    NOTE: In some cases, this error might occur "suddenly", even though AWStats was running perfectly before.


    FAQ-SET280 : ERROR "NOT SAME NUMBER OF RECORDS OF..."
    PROBLEM:
    When I run AWStats from command line (or as a cgi from a browser), I get a message "Not same number of records of ...".
    SOLUTION:
    This means your AWStats reference database files (operating systems, browsers, robots...) are not correct. First try updating to the latest version. Then check on your disk that you have only ONE copy of each of these files. They should be in the 'lib' directory ('db' with 4.0) under the directory where awstats.pl is installed:
    browsers.pm
    domains.pm
    operating_systems.pm
    robots.pm
    search_engines.pm
    worms.pm
    status_http.pm
    status_smtp.pm


    FAQ-SET300 : ERROR "COULDN'T OPEN FILE ..."
    PROBLEM:
    I have the following error:
    "Couldn't open file /workingpath/awstatsmmyyyy.tmp.9999: Permission denied."
    SOLUTION:
    This error means that the web server could not write the temporary working file (a file ending in .tmp.9999, where 9999 is a number) because of a permissions problem.
    First check that the directory /workingpath has "Write" permission for
    user nobody (default user used by Apache on Linux systems)
    or user IUSR_SERVERNAME (default user used by IIS on NT).
    With Unix, try a path with no links.
    With NT, you must check NTFS permissions ("Read/Write/Modify") if your directory is on an NTFS partition.
    With IIS, there is also a "Write" permission attribute, defined in the directory properties in your IIS setup, that you must check.
    With IIS, if a default cgi-bin directory was created during the IIS install, try putting AWStats directly into this directory.
    If this still fails, you can change the DirData parameter to tell AWStats to use another directory (a directory you are sure the web server process's default user can write to).


    FAQ-SET320 : ERROR "MALFORMED UTF-8 CHARACTER (UNEXPECTED ..."
    PROBLEM:
    When running AWStats from command line, I get one or several lines like this on my output:
    Malformed UTF-8 character (unexpected non-continuation byte 0x6d, immediately after start byte 0xe4) at /www/cgi-bin/lib/xxx.pm line 999.
    SOLUTION:
    This problem appeared with RedHat 8 and Perl 5.8.
    I don't know if RedHat provides a fix for this, but some users have reported that you can remove those harmless messages by changing your LANG environment variable to drop the ".UTF-8" at the end. For example, set LANG="en_US" instead of LANG="en_US.UTF-8".
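    The LANG change described above can be applied to a single AWStats run instead of the whole session. The sketch below derives the shortened value from the current one. The function name strip_utf8_suffix is hypothetical, and the list of suffix spellings it recognizes is an assumption.

```shell
#!/bin/sh
# strip_utf8_suffix LOCALE
# Hypothetical helper. Prints LOCALE without a trailing UTF-8 codeset
# suffix (for example en_US.UTF-8 becomes en_US). Other values are
# printed unchanged.
strip_utf8_suffix() {
    case $1 in
        *.UTF-8|*.utf-8|*.UTF8|*.utf8) printf '%s\n' "${1%.*}" ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Example: run one update with the adjusted locale (placeholder config name).
#   LANG=$(strip_utf8_suffix "$LANG") perl awstats.pl -config=myconfig -update
```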


    FAQ-SET350 : EMPTY OR NULL STATISTICS REPORTED
    PROBLEM:
    AWStats seems to work but I'm not getting any results. I get a statistics page that looks like I have no hits.
    SOLUTION:
    This is one of the most common problems, and there are 3 possible reasons :

    1) Your log file format setup might be wrong.
    If you use Apache web server
    The best approach is to use the "combined" log format (see the Setup and Use page to learn how to change your Apache server log from the "common" log format to "combined"). Don't forget to stop Apache, reset your log file and restart Apache so that the change to combined takes effect. Then you must set LogFormat=1 in your AWStats config file.
    If you want to use another format, see FAQ-SET250 for examples of LogFormat values for various log file formats.
    If you use IIS server or Windows built-in web server
    The Internet Information Server default W3C Extended Log Format will not work correctly with AWStats. To make it work correctly, start the IIS Snap-in, select the web site and look at its Properties. Choose W3C Extended Log Format, then Properties, then the Extended Properties tab, and uncheck everything under Extended Properties. Once everything is unchecked, check only the fields listed in the Setup and Use page ("With IIS Server" chapter).
    You can also see FAQ-SET250 for examples of LogFormat values for various log file formats.

    2) You are viewing stats for a year or month in which no hits were made on your server.
    When you run AWStats, the report is by default for the current month/year.
    If you want to see data for another month/year you must:
    Add -year=YYYY -month=MM on the command line when building the html report page from the command line.
    Use a URL like http://myserver/cgi-bin/awstats.pl?config=xxx&year=YYYY&month=MM if viewing stats with AWStats used as a CGI.

    3) When you read your statistics, AWStats does not use the same config file as the one used for the update process. Scan your disk for files that match awstats.*conf and remove all files that are not the config file(s) you need (awstats.conf files, if found, can be deleted. It is better to use a config file called awstats.mydomain.conf).
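    The disk scan suggested in reason 3 can be done with find on Unix-like systems. This is a minimal sketch. The function name find_awstats_configs is hypothetical and the directories to search are up to you.

```shell
#!/bin/sh
# find_awstats_configs DIR [DIR...]
# Hypothetical helper. Lists every regular file under the given directories
# whose name matches awstats.*conf (this includes awstats.conf itself).
# Errors such as unreadable directories are silenced.
find_awstats_configs() {
    find "$@" -type f -name 'awstats.*conf' 2>/dev/null | sort
}

# Example (the directories are placeholders):
#   find_awstats_configs /etc /usr/local /var/www
```

    Any file in the output that is not a config file you actually use is a candidate for removal.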


    FAQ-SET400 : PIPE REDIRECTION TO A FILE GIVES ME AN EMPTY FILE
    PROBLEM:
    I want to redirect awstats.pl output to a file with the following command :
    > awstats.pl -config=... [other_options] > myfile.html
    But myfile.html is empty (size is 0). If I remove the redirection, everything works correctly.
    SOLUTION:
    This is not an AWStats bug but a problem between Perl and Windows.
    You can easily solve this by running the following command instead:
    > perl awstats.pl -config=... [other_options] > myfile.html


    FAQ-SET450 : NO PICTURES/GRAPHICS SHOWN
    PROBLEM:
    AWStats seems to work (all data and counters seem to be good) but I have no image shown.
    SOLUTION:
    With the Apache web server, you might have trouble (no pictures shown on the stats page) if you use a directory called "icons" (because of Apache's pre-defined "icons" alias directory). Use instead, for example, a directory called "icon" with no s at the end (rename your directory physically and change the DirIcons parameter in the config file to reflect this change).


    FAQ-SET700 : MY VISITS ARE DOUBLED FOR OLD MONTH I MIGRATED FROM 3.2 TO 5.X
    PROBLEM:
    After migrating an old history file for a month, the number of visits for this month is doubled. So the number of "visits per visitor" is also doubled, and "pages per visit" and "hits per visit" are divided by 2. All other data like "pages", "hits" and bandwidth are correct.
    SOLUTION:
    This problem occurs when migrating history files from 3.2 to 5.x.
    To fix this you can use the following tip (warning: do this only after migrating from 3.2 to 5.x and only if your visit value is doubled). The goal is to remove the line in the history file that looks like this
    YYYYMM00 999 999 999 999
    where YYYY and MM are year and month of config file and 999 are numerical values.

    So if your OS is Unix/Linux
    grep -vE '^[0-9]{6}00' oldhistoryfile > newhistoryfile
    mv newhistoryfile oldhistoryfile
    And then run the migrate process again on the file.

    If your OS is Windows and you have cygwin
    Follow the same instructions as for Unix/Linux, BUT do this from a cygwin 'sh' shell and not from the DOS prompt (because DOS does not understand the ^ character).
    And then run the migrate process again on the file.

    In any other case (in fact works for every OS)
    Manually remove the line YYYYMM00 999 999 999 999 (you must find one and only one such line) and then run the migrate process again on the file.
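
    For systems without grep, the same removal can be scripted. The following is a minimal Python sketch, not part of AWStats: the function names and the ".bak" backup suffix are hypothetical, and the pattern simply mirrors the grep expression shown above. It refuses to touch the file unless exactly one matching line is found.

```python
import re
import shutil
from pathlib import Path

# Same pattern as the grep command: a line starting with YYYYMM followed by "00".
STALE_LINE = re.compile(rb"^[0-9]{6}00")


def drop_stale_lines(data: bytes):
    """Return (cleaned_data, removed_count) for raw history-file bytes."""
    lines = data.splitlines(keepends=True)
    kept = [line for line in lines if not STALE_LINE.match(line)]
    return b"".join(kept), len(lines) - len(kept)


def clean_history_file(path):
    """Rewrite `path` in place, keeping a backup copy next to it."""
    path = Path(path)
    cleaned, removed = drop_stale_lines(path.read_bytes())
    if removed != 1:
        # The FAQ expects one and only one such line; anything else needs a manual look.
        raise ValueError(f"expected exactly 1 matching line, found {removed}")
    shutil.copy2(path, path.with_name(path.name + ".bak"))  # hypothetical backup name
    path.write_bytes(cleaned)
```

    Calling clean_history_file("oldhistoryfile") would stand in for the grep and mv steps; the migrate process still has to be run again on the file afterwards.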


    FAQ-SET800 : AWSTATS SPEED/TIMEOUT PROBLEMS ?
    PROBLEM:
    When I analyze large log files, processing times are very long (example: the update process from a browser returns a timeout/internal error after a long wait). Is there any setup change or other step I can take to avoid this and increase speed ?
    SOLUTION:
    You really need to understand how a log analyzer works to get good speed. There are also major setup changes you can make to decrease your processing time.
    See the important advice on the benchmark page.




    FAQ-SEC100 : CAN AWSTATS BE USED TO MAKE CROSS SITE SCRIPTING ATTACKS ?
    PROBLEM:
    If a malicious user uses a browser to make a hit on a URL that includes a < SCRIPT > ... < /SCRIPT > section in its parameters, will the script be executed when AWStats shows the links on the report page ?
    SOLUTION:
    No. AWStats uses a filter to remove any script code that was included in a URL in order to mount a Cross Site Scripting attack through a log analyzer report page.
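
    To illustrate why filtering matters, here is a minimal Python sketch of one common defence. It is not AWStats's actual filter, which removes script code; this sketch instead escapes HTML metacharacters so that a logged URL is displayed as inert text. The function name render_url_cell is hypothetical.

```python
import html


def render_url_cell(url):
    """Return an HTML table cell that shows the logged URL as plain text."""
    # html.escape turns <, >, &, and quotes into entities, so the browser
    # displays the characters instead of interpreting them as markup.
    return "<td>" + html.escape(url, quote=True) + "</td>"
```

    With this approach, a URL carrying a SCRIPT section reaches the report page as visible characters, and the browser never parses it as a script element.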


    FAQ-SEC150 : HOW CAN I PREVENT SOME USERS FROM SEEING STATISTICS OF OTHER USERS ?
    PROBLEM:
    I don't want a user xxx (having a site www.xxx.com) to see statistics of user yyy (having a site www.yyy.com). How can I set up AWStats for this ?
    SOLUTION:
    Take a look at the security page.


    FAQ-SEC200 : HOW TO MANAGE LOG FILES (AND STATISTICS) CORRUPTED BY 'WORMS' ATTACKS ?
    PROBLEM:
    My site is attacked by worm viruses (like Nimda, Code Red...). This corrupts my log file and fills it with 404 errors, so my statistics are also full of 404 errors. This makes AWStats slower and my history files very large. Can I do something to avoid this ?
    SOLUTION:
    Yes.
    'Worm' attacks come from infected browsers, robots, or servers turned into web clients that make hits on your site using a very long unknown URL like this one:
    /default.ida?XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX%40%50...%40%50
    The URL is generated by the infected robot and its purpose is to exploit a vulnerability of the web server (in most cases, only IIS is vulnerable). With such attacks, you will always find a 'common string' in those URLs. For example, with the Code Red worm, there is always default.ida in the URL string. Some other worms send URLs with cmd.exe in them.
    With version 6.0 and higher, you can set the LevelForWormsDetection parameter to "2" and ShowWormsStats to "HBL" in the config file to enable worm filtering and reporting.
    However, this feature seriously reduces AWStats speed, and the worms database (lib/worms.pm file) can't contain all worm signatures. So if you still have rubbish hits, you can modify the worms.pm file yourself or edit your config file to add values to the SkipFiles parameter that discard the unwanted records, using a regex syntax like this example:
    SkipFiles="REGEX[^\/default\.ida] REGEX[\/winnt\/system32\/cmd\.exe]"
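
    Before adding patterns like these, it can help to check what they would match. The following Python sketch is only an approximation: it applies the two expressions from the example above, without the REGEX[...] wrapper, to a few sample URLs. AWStats evaluates them in Perl, and details such as case sensitivity may differ; the names SKIP_PATTERNS and is_skipped are hypothetical.

```python
import re

# The two expressions from the SkipFiles example, without the REGEX[...] wrapper.
SKIP_PATTERNS = [
    re.compile(r"^\/default\.ida"),              # anchored: URL must start with /default.ida
    re.compile(r"\/winnt\/system32\/cmd\.exe"),  # unanchored: may appear anywhere in the URL
]


def is_skipped(url):
    """Return True if any pattern matches somewhere in the URL."""
    return any(pattern.search(url) for pattern in SKIP_PATTERNS)
```

    Note that the first pattern is anchored with ^, so it only discards URLs that begin with /default.ida, while the second matches cmd.exe requests at any depth in the path.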