Compare commits

3 Commits

Author SHA1 Message Date
Grégory Soutadé c9500e2e99 Update Changelog 2024-03-16 09:08:24 +01:00
Grégory Soutadé ca3c0eefdf Update documentation 2024-03-16 09:02:06 +01:00
Grégory Soutadé 1e09852d18 Update locales 2024-03-16 08:53:44 +01:00
6 changed files with 180 additions and 88 deletions

View File

@ -1,3 +1,30 @@
v0.7 (17/03/2024)
** User **
Awstats data updated (7.9)
Improve page/hit detection
--display-only switch now takes an argument (month/year); running the analysis first is no longer necessary
Add --disable-display option
Geo IP plugin updated (use of [ip-api.com](https://ip-api.com/))
Add _subdomains_ plugin
New way to display global statistics : with links in months names instead of "Details" button
Add excluded domain option
** Dev **
Remove detection from awstats dataset for browser
Don't analyze referer for non viewed hits/pages
Remove all trailing slashes from URLs before starting the analysis
Main key for visits is now "remote\_ip" and not "remote\_addr"
Add IP type plugin to support IPv4 and IPv6
Update robot detection
Display visitor IP is now a filter
Generate HTML part in dry run mode (but don't write it to disk)
Set lang value in generated HTML page
Add no\_referrer\_domains list to default_conf for websites that define this policy
Set count\_hit\_only\_visitors to False by default
** Bugs **
Flags management for feeds display
v0.6 (20/11/2022)
** User **
Replace track_users by filter_users plugins which can interpret conditional filters from configuration

View File

@ -6,7 +6,7 @@ Introduction
iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it's a very monolithic project with everything in one big Perl file. By contrast, iwla has been designed to be very modular : a small core analysis and a lot of filters. It can be viewed as UNIX pipes. The philosophy of iwla is : add, update, delete ! That's the job of each filter : modify statistics until the final result. It's written in Python.
Nevertheless, iwla is only focused on HTTP logs. It uses data (robots definitions, search engines definitions) and design from awstats. Moreover, it's not dynamic, but only generates static HTML page (with gzip compression option).
Nevertheless, iwla is only focused on HTTP logs. It uses data (search engines definitions) and design from awstats. Moreover, it's not dynamic, but only generates static HTML page (with gzip compression option).
Demo
----
@ -16,8 +16,7 @@ A demonstration instance is available [here](https://iwla-demo.soutade.fr)
Usage
-----
./iwla [-c|--config-file file] [-C|--clean-output] [-i|--stdin] [-f FILE|--file FILE] [-d LOGLEVEL|--log-level LOGLEVEL] [-r|--reset year/month] [-z|--dont-compress] [-p] [-D|--dry-run]
./iwla [-c|--config-file file] [-C|--clean-output] [-i|--stdin] [-f FILE|--file FILE] [-d LOGLEVEL|--log-level LOGLEVEL] [-r|--reset year/month] [-z|--dont-compress] [-p] [-P|--disable-display] [-D|--dry-run]
-c : Configuration file to use (default conf.py)
-C : Clean output (database and HTML) before starting
-i : Read data from stdin instead of conf.analyzed_filename
@ -26,6 +25,7 @@ Usage
-r : Reset analysis to a specific date (month/year)
-z : Don't compress databases (bigger but faster, not compatible with compressed databases)
-p : Only generate display
-P : Don't generate display
-D : Dry run (don't write/update files to disk)
Basic usage
@ -48,6 +48,7 @@ You can also append an element to an existing default configuration list by usin
multimedia_files_append = ['xml']
or
multimedia_files_append = 'xml'
Will append 'xml' to current multimedia_files list
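
The `_append` convention described above can be sketched as follows. This is a minimal illustration and not iwla's actual implementation: the helper name `apply_appends` and the default list contents are assumptions made for the example.

```python
# Hypothetical sketch of merging "<name>_append" user settings into defaults.
# apply_appends and the sample default values are NOT taken from iwla.
def apply_appends(defaults, user_conf):
    # Copy lists so the default configuration is left untouched
    merged = {k: list(v) if isinstance(v, list) else v
              for k, v in defaults.items()}
    for key, value in user_conf.items():
        if not key.endswith('_append'):
            # Plain keys simply override the default value
            merged[key] = value
            continue
        target = key[:-len('_append')]
        # Accept both a list (['xml']) and a single element ('xml')
        items = value if isinstance(value, list) else [value]
        merged.setdefault(target, []).extend(items)
    return merged


defaults = {'multimedia_files': ['png', 'jpg']}
as_list = apply_appends(defaults, {'multimedia_files_append': ['xml']})
as_item = apply_appends(defaults, {'multimedia_files_append': 'xml'})
print(as_list['multimedia_files'])
```

Both forms yield the same merged list, which matches the two spellings shown in the documentation.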
Then, you can launch iwla. Output HTML files are created in _output_ directory by default. To quickly see it, go into _output_ and type
@ -87,7 +88,7 @@ To use plugins, just insert their file name (without _.py_ extension) in _pre_an
Statistics are stored in dictionaries :
* **month_stats** : Statistics of current analysed month
* **valid_visitor** : A subset of month_stats without robots
* **valid_visitors** : A subset of month_stats without robots
* **days_stats** : Statistics of current analysed day
* **visits** : All visitors with all of their requests (only if 'keep_requests' is true or filtered)
* **meta** : Final result of month statistics (by year)
@ -103,6 +104,7 @@ The two functions to overload are _load(self)_ that must returns True or False i
For display plugins, a lot of code has been written in _display.py_ that simplifies the creation of HTML blocks, tables and bar graphs.
Plugins
=======
@ -129,6 +131,7 @@ Optional configuration values ends with *.
* plugins/display/top_pages_diff.py
* plugins/display/top_pages.py
* plugins/display/top_visitors.py
* plugins/display/visitor_ip.py
* plugins/post_analysis/anonymize_ip.py
* plugins/post_analysis/browsers.py
* plugins/post_analysis/feeds.py
@ -163,6 +166,7 @@ iwla
locales_path
compress_output_files
excluded_ip
excluded_domain_name
Output files :
DB_ROOT/meta.db
@ -203,7 +207,7 @@ iwla
nb_visitors
visits :
remote_addr =>
remote_ip =>
remote_addr
remote_ip
viewed_pages{0..31} # 0 contains total
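
As a rough illustration of the key change, the sketch below builds a `visits` dictionary keyed by `remote_ip` and reads the monthly total from index 0 of `viewed_pages`. Only the field names come from the documentation; the sample addresses, the counts, the per-day dict layout and the helper `total_viewed_pages` are assumptions.

```python
# Illustrative data only: field names follow the documentation above,
# values and the dict-per-day layout of viewed_pages are assumptions.
visits = {
    '203.0.113.7': {
        'remote_addr': '203.0.113.7',
        'remote_ip': '203.0.113.7',
        'viewed_pages': {0: 5, 16: 5},   # index 0 holds the total
    },
    '2001:db8::1': {
        'remote_addr': '2001:db8::1',
        'remote_ip': '2001:db8::1',
        'viewed_pages': {0: 3, 2: 1, 16: 2},
    },
}


def total_viewed_pages(visits):
    # Hypothetical helper: sum the monthly totals stored at index 0
    return sum(v['viewed_pages'].get(0, 0) for v in visits.values())


print(total_viewed_pages(visits))
```

Code that previously looked up a visitor with its `remote_addr` value would, under this layout, use the `remote_ip` value as the dictionary key instead.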
@ -573,7 +577,6 @@ plugins.display.robot_bandwidth
None
Conf values needed :
display_visitor_ip*
create_all_robot_bandwidth_page*
Output files :
@ -763,7 +766,33 @@ plugins.display.top_visitors
None
Conf values needed :
display_visitor_ip*
None
Output files :
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
Statistics update :
None
Statistics deletion :
None
plugins.display.visitor_ip
--------------------------
Display hook
Display IP below visitor name
Plugin requirements :
None
Conf values needed :
compact_ip*
Output files :
OUTPUT_ROOT/year/month/index.html
@ -823,7 +852,7 @@ plugins.post_analysis.browsers
Statistics creation :
visits :
remote_addr =>
remote_ip =>
browser
month_stats :
@ -861,7 +890,7 @@ plugins.post_analysis.feeds
None
Statistics creation :
remote_addr =>
remote_ip =>
feed_parser
feed_name_analysed
feed_parser_last_access (for merged parser)
@ -916,13 +945,13 @@ plugins.post_analysis.filter_users
Statistics creation :
visits :
remote_addr =>
remote_ip =>
filtered
geo_location
Statistics update :
visits :
remote_addr =>
remote_ip =>
keep_requests
Statistics deletion :
@ -1014,7 +1043,7 @@ plugins.post_analysis.ip_type
Statistics creation :
visits :
remote_addr =>
remote_ip =>
ip_type
month_stats :
@ -1045,7 +1074,7 @@ plugins.post_analysis.operating_systems
Statistics creation :
visits :
remote_addr =>
remote_ip =>
operating_system
month_stats :
@ -1279,7 +1308,8 @@ plugins.pre_analysis.robots
None
Conf values needed :
None
count_hit_only_visitors
no_referrer_domains
Output files :
None

View File

@ -6,7 +6,7 @@ Introduction
iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it's a very monolithic project with everything in one big Perl file. By contrast, iwla has been designed to be very modular : a small core analysis and a lot of filters. It can be viewed as UNIX pipes. The philosophy of iwla is : add, update, delete ! That's the job of each filter : modify statistics until the final result. It's written in Python.
Nevertheless, iwla is only focused on HTTP logs. It uses data (robots definitions, search engines definitions) and design from awstats. Moreover, it's not dynamic, but only generates static HTML page (with gzip compression option).
Nevertheless, iwla is only focused on HTTP logs. It uses data (search engines definitions) and design from awstats. Moreover, it's not dynamic, but only generates static HTML page (with gzip compression option).
Demo
----
@ -16,8 +16,7 @@ A demonstration instance is available [here](https://iwla-demo.soutade.fr)
Usage
-----
./iwla [-c|--config-file file] [-C|--clean-output] [-i|--stdin] [-f FILE|--file FILE] [-d LOGLEVEL|--log-level LOGLEVEL] [-r|--reset year/month] [-z|--dont-compress] [-p] [-D|--dry-run]
./iwla [-c|--config-file file] [-C|--clean-output] [-i|--stdin] [-f FILE|--file FILE] [-d LOGLEVEL|--log-level LOGLEVEL] [-r|--reset year/month] [-z|--dont-compress] [-p] [-P|--disable-display] [-D|--dry-run]
-c : Configuration file to use (default conf.py)
-C : Clean output (database and HTML) before starting
-i : Read data from stdin instead of conf.analyzed_filename
@ -26,6 +25,7 @@ Usage
-r : Reset analysis to a specific date (month/year)
-z : Don't compress databases (bigger but faster, not compatible with compressed databases)
-p : Only generate display
-P : Don't generate display
-D : Dry run (don't write/update files to disk)
Basic usage
@ -48,6 +48,7 @@ You can also append an element to an existing default configuration list by usin
multimedia_files_append = ['xml']
or
multimedia_files_append = 'xml'
Will append 'xml' to current multimedia_files list
Then, you can launch iwla. Output HTML files are created in _output_ directory by default. To quickly see it, go into _output_ and type
@ -87,7 +88,7 @@ To use plugins, just insert their file name (without _.py_ extension) in _pre_an
Statistics are stored in dictionaries :
* **month_stats** : Statistics of current analysed month
* **valid_visitor** : A subset of month_stats without robots
* **valid_visitors** : A subset of month_stats without robots
* **days_stats** : Statistics of current analysed day
* **visits** : All visitors with all of their requests (only if 'keep_requests' is true or filtered)
* **meta** : Final result of month statistics (by year)
@ -103,6 +104,7 @@ The two functions to overload are _load(self)_ that must returns True or False i
For display plugins, a lot of code has been written in _display.py_ that simplifies the creation of HTML blocks, tables and bar graphs.
Plugins
=======

View File

@ -19,6 +19,7 @@
* plugins/display/top_pages_diff.py
* plugins/display/top_pages.py
* plugins/display/top_visitors.py
* plugins/display/visitor_ip.py
* plugins/post_analysis/anonymize_ip.py
* plugins/post_analysis/browsers.py
* plugins/post_analysis/feeds.py
@ -53,6 +54,7 @@ iwla
locales_path
compress_output_files
excluded_ip
excluded_domain_name
Output files :
DB_ROOT/meta.db
@ -93,7 +95,7 @@ iwla
nb_visitors
visits :
remote_addr =>
remote_ip =>
remote_addr
remote_ip
viewed_pages{0..31} # 0 contains total
@ -463,7 +465,6 @@ plugins.display.robot_bandwidth
None
Conf values needed :
display_visitor_ip*
create_all_robot_bandwidth_page*
Output files :
@ -653,7 +654,33 @@ plugins.display.top_visitors
None
Conf values needed :
display_visitor_ip*
None
Output files :
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
Statistics update :
None
Statistics deletion :
None
plugins.display.visitor_ip
--------------------------
Display hook
Display IP below visitor name
Plugin requirements :
None
Conf values needed :
compact_ip*
Output files :
OUTPUT_ROOT/year/month/index.html
@ -713,7 +740,7 @@ plugins.post_analysis.browsers
Statistics creation :
visits :
remote_addr =>
remote_ip =>
browser
month_stats :
@ -751,7 +778,7 @@ plugins.post_analysis.feeds
None
Statistics creation :
remote_addr =>
remote_ip =>
feed_parser
feed_name_analysed
feed_parser_last_access (for merged parser)
@ -806,13 +833,13 @@ plugins.post_analysis.filter_users
Statistics creation :
visits :
remote_addr =>
remote_ip =>
filtered
geo_location
Statistics update :
visits :
remote_addr =>
remote_ip =>
keep_requests
Statistics deletion :
@ -904,7 +931,7 @@ plugins.post_analysis.ip_type
Statistics creation :
visits :
remote_addr =>
remote_ip =>
ip_type
month_stats :
@ -935,7 +962,7 @@ plugins.post_analysis.operating_systems
Statistics creation :
visits :
remote_addr =>
remote_ip =>
operating_system
month_stats :
@ -1169,7 +1196,8 @@ plugins.pre_analysis.robots
None
Conf values needed :
None
count_hit_only_visitors
no_referrer_domains
Output files :
None

Binary file not shown.

View File

@ -5,8 +5,8 @@
msgid ""
msgstr ""
"Project-Id-Version: iwla\n"
"POT-Creation-Date: 2023-04-18 20:35+0200\n"
"PO-Revision-Date: 2023-04-18 20:36+0200\n"
"POT-Creation-Date: 2024-03-16 08:52+0100\n"
"PO-Revision-Date: 2024-03-16 08:53+0100\n"
"Last-Translator: Soutadé <soutade@gmail.com>\n"
"Language-Team: iwla\n"
"Language: fr\n"
@ -15,7 +15,7 @@ msgstr ""
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"
"Generated-By: pygettext.py 1.5\n"
"X-Generator: Poedit 3.2.2\n"
"X-Generator: Poedit 3.4.2\n"
"X-Poedit-SourceCharset: UTF-8\n"
#: display.py:32
@ -38,11 +38,11 @@ msgstr "Juillet"
msgid "March"
msgstr "Mars"
#: display.py:32 iwla.py:463 iwla.py:515
#: display.py:32 iwla.py:474 iwla.py:526
msgid "June"
msgstr "Juin"
#: display.py:32 iwla.py:463 iwla.py:515
#: display.py:32 iwla.py:474 iwla.py:526
msgid "May"
msgstr "Mai"
@ -66,154 +66,155 @@ msgstr "Octobre"
msgid "September"
msgstr "Septembre"
#: display.py:199
#: display.py:207
msgid "Ratio"
msgstr "Pourcentage"
#: iwla.py:456
#: iwla.py:467
msgid "Statistics"
msgstr "Statistiques"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Apr"
msgstr "Avr"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Aug"
msgstr "Août"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Dec"
msgstr "Déc"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Feb"
msgstr "Fév"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Jan"
msgstr "Jan"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Jul"
msgstr "Jui"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Mar"
msgstr "Mars"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Nov"
msgstr "Nov"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Oct"
msgstr "Oct"
#: iwla.py:463 iwla.py:515
#: iwla.py:474 iwla.py:526
msgid "Sep"
msgstr "Sep"
#: iwla.py:465 iwla.py:517
#: iwla.py:476 iwla.py:528
msgid "Not viewed Bandwidth"
msgstr "Traffic non vu"
#: iwla.py:465 iwla.py:517
#: iwla.py:476 iwla.py:528
msgid "Visits"
msgstr "Visites"
#: iwla.py:465 iwla.py:517 plugins/display/all_visits.py:70
#: plugins/display/feeds.py:76 plugins/display/filter_users.py:77
#: iwla.py:476 iwla.py:528 plugins/display/all_visits.py:70
#: plugins/display/feeds.py:75 plugins/display/filter_users.py:77
#: plugins/display/filter_users.py:123 plugins/display/hours_stats.py:73
#: plugins/display/hours_stats.py:83 plugins/display/referers.py:95
#: plugins/display/referers.py:153 plugins/display/top_visitors.py:72
msgid "Pages"
msgstr "Pages"
#: iwla.py:465 iwla.py:517 plugins/display/all_visits.py:70
#: plugins/display/feeds.py:76 plugins/display/filter_users.py:123
#: iwla.py:476 iwla.py:528 plugins/display/all_visits.py:70
#: plugins/display/feeds.py:75 plugins/display/filter_users.py:123
#: plugins/display/hours_stats.py:73 plugins/display/hours_stats.py:83
#: plugins/display/referers.py:95 plugins/display/referers.py:153
#: plugins/display/top_downloads.py:97 plugins/display/top_visitors.py:72
msgid "Hits"
msgstr "Hits"
#: iwla.py:465 iwla.py:517 plugins/display/all_visits.py:70
#: iwla.py:476 iwla.py:528 plugins/display/all_visits.py:70
#: plugins/display/hours_stats.py:73 plugins/display/hours_stats.py:83
#: plugins/display/robot_bandwidth.py:92 plugins/display/robot_bandwidth.py:118
#: plugins/display/robot_bandwidth.py:90 plugins/display/robot_bandwidth.py:112
#: plugins/display/top_visitors.py:72
msgid "Bandwidth"
msgstr "Bande passante"
#: iwla.py:465 plugins/display/hours_stats.py:71
#: iwla.py:476 plugins/display/hours_stats.py:71
msgid "By day"
msgstr "Par jour"
#: iwla.py:465 plugins/display/hours_stats.py:73
#: iwla.py:476 plugins/display/hours_stats.py:73
msgid "Day"
msgstr "Jour"
#: iwla.py:505
#: iwla.py:516
msgid "Average"
msgstr "Moyenne"
#: iwla.py:508 iwla.py:542
#: iwla.py:519 iwla.py:553
msgid "Total"
msgstr "Total"
#: iwla.py:516
#: iwla.py:527
msgid "Summary"
msgstr "Résumé"
#: iwla.py:517
#: iwla.py:528
msgid "Month"
msgstr "Mois"
#: iwla.py:517 plugins/display/ip_to_geo.py:94 plugins/display/ip_to_geo.py:112
#: iwla.py:528 plugins/display/ip_to_geo.py:89 plugins/display/ip_to_geo.py:107
msgid "Visitors"
msgstr "Visiteurs"
#: iwla.py:553
#: iwla.py:564
msgid "Statistics for"
msgstr "Statistiques pour"
#: iwla.py:560
#: iwla.py:571
msgid "Last update"
msgstr "Dernière mise à jour"
#: iwla.py:564
#: iwla.py:575
msgid "Time analysis"
msgstr "Durée de l'analyse"
#: iwla.py:566
#: iwla.py:577
msgid "hours"
msgstr "heures"
#: iwla.py:567
#: iwla.py:578
msgid "minutes"
msgstr "minutes"
#: iwla.py:567
#: iwla.py:578
msgid "seconds"
msgstr "secondes"
#: plugins/display/all_visits.py:70 plugins/display/all_visits.py:92
#: plugins/display/all_visits.py:70 plugins/display/all_visits.py:87
#: plugins/display/all_visits_enlight.py:67
msgid "All visits"
msgstr "Toutes les visites"
#: plugins/display/all_visits.py:70 plugins/display/feeds.py:76
#: plugins/display/all_visits.py:70 plugins/display/feeds.py:75
#: plugins/display/filter_users.py:123 plugins/display/ip_to_geo.py:62
#: plugins/display/robot_bandwidth.py:92 plugins/display/top_visitors.py:72
#: plugins/display/robot_bandwidth.py:90 plugins/display/top_visitors.py:72
#: plugins/display/visitor_ip.py:54
msgid "Host"
msgstr "Hôte"
#: plugins/display/all_visits.py:70 plugins/display/robot_bandwidth.py:92
#: plugins/display/robot_bandwidth.py:118 plugins/display/top_visitors.py:72
#: plugins/display/all_visits.py:70 plugins/display/robot_bandwidth.py:90
#: plugins/display/robot_bandwidth.py:112 plugins/display/top_visitors.py:72
msgid "Last seen"
msgstr "Dernière visite"
#: plugins/display/all_visits.py:93 plugins/display/top_visitors.py:72
#: plugins/display/all_visits.py:88 plugins/display/top_visitors.py:72
msgid "Top visitors"
msgstr "Top visiteurs"
@ -234,14 +235,14 @@ msgid "Entrance"
msgstr "Entrées"
#: plugins/display/browsers.py:109 plugins/display/browsers.py:137
#: plugins/display/filter_users.py:133 plugins/display/referers.py:110
#: plugins/display/filter_users.py:128 plugins/display/referers.py:110
#: plugins/display/referers.py:125 plugins/display/referers.py:140
#: plugins/display/referers.py:163 plugins/display/referers.py:174
#: plugins/display/referers.py:185 plugins/display/referers.py:222
#: plugins/display/top_downloads.py:83 plugins/display/top_downloads.py:103
#: plugins/display/top_hits.py:82 plugins/display/top_hits.py:103
#: plugins/display/top_pages.py:82 plugins/display/top_pages.py:102
#: plugins/display/top_visitors.py:92
#: plugins/display/top_visitors.py:87
msgid "Others"
msgstr "Autres"
@ -257,32 +258,36 @@ msgstr "Tous les navigateurs"
msgid "All Feeds parsers"
msgstr "Tous les agrégateurs"
#: plugins/display/feeds.py:76
#: plugins/display/feeds.py:75
msgid "All feeds parsers"
msgstr "Tous les agrégateurs"
#: plugins/display/feeds.py:76 plugins/display/filter_users.py:77
#: plugins/display/feeds.py:75 plugins/display/filter_users.py:77
#: plugins/display/filter_users.py:123
msgid "Last Access"
msgstr "Dernière visite"
#: plugins/display/feeds.py:96
#: plugins/display/feeds.py:93
msgid "Merged feeds parsers"
msgstr "Agrégateurs fusionnés"
#: plugins/display/feeds.py:101
#: plugins/display/feeds.py:98
msgid "Feeds parsers"
msgstr "Agrégateurs"
#: plugins/display/feeds.py:103 plugins/display/filter_users.py:118
#: plugins/display/feeds.py:100 plugins/display/filter_users.py:118
#: plugins/display/operating_systems.py:90
msgid "Details"
msgstr "Détails"
#: plugins/display/feeds.py:108
#: plugins/display/feeds.py:105
msgid "Found"
msgstr "Trouvé"
#: plugins/display/filter_users.py:77
msgid "Location"
msgstr "Position"
#: plugins/display/filter_users.py:77
msgid "Referer"
msgstr "Origine"
@ -335,16 +340,16 @@ msgstr "Par heures"
msgid "Hours"
msgstr "Heures"
#: plugins/display/ip_to_geo.py:94
#: plugins/display/ip_to_geo.py:89
msgid "Country"
msgstr "Pays"
#: plugins/display/ip_to_geo.py:94 plugins/display/ip_to_geo.py:105
#: plugins/display/ip_to_geo.py:112
#: plugins/display/ip_to_geo.py:89 plugins/display/ip_to_geo.py:100
#: plugins/display/ip_to_geo.py:107
msgid "Countries"
msgstr "Pays"
#: plugins/display/ip_to_geo.py:107
#: plugins/display/ip_to_geo.py:102
msgid "All countries"
msgstr "Tous les pays"
@ -418,20 +423,20 @@ msgstr "Top phrases clé"
msgid "All key phrases"
msgstr "Toutes les phrases clé"
#: plugins/display/robot_bandwidth.py:92
#: plugins/display/robot_bandwidth.py:90
#, fuzzy
msgid "Name"
msgstr "Nom"
#: plugins/display/robot_bandwidth.py:111
#: plugins/display/robot_bandwidth.py:105
msgid "Robots bandwidth"
msgstr "Bande passante robots"
#: plugins/display/robot_bandwidth.py:113
#: plugins/display/robot_bandwidth.py:107
msgid "All robots bandwidth"
msgstr "Bande passante tous les robots"
#: plugins/display/robot_bandwidth.py:118
#: plugins/display/robot_bandwidth.py:112
msgid "Robot"
msgstr "Robot"