Gregory Soutade
bde91ca936
Move reverse DNS core management into iwla.py + Add robot_domains configuration
2024-10-27 09:16:01 +01:00
Gregory Soutade
70de0d3aca
Add no_merge_feeds_parsers_list conf value
2024-10-27 09:15:39 +01:00
Gregory Soutade
9939922c31
Move feeds and reverse_dns plugins from post_analysis to pre_analysis
2024-10-02 08:27:53 +02:00
Gregory Soutade
6d46ac4461
Robots: Improve compatible keyword detection for robots
2024-07-28 09:25:40 +02:00
Gregory Soutade
974d355dd4
Add no_referrer_domains list to default_conf for websites that define this policy
2024-01-30 11:24:52 +01:00
Gregory Soutade
16cd817fec
Increase not modified page threshold for robot detection
2023-07-05 09:15:48 +02:00
Gregory Soutade
71d8ee2113
Forgot Firefox icon
2023-03-25 08:11:57 +01:00
Gregory Soutade
440f51ddd1
Remove robot rule 1 page for phones
2023-03-23 21:17:52 +01:00
Gregory Soutade
a0a1f42df4
Update robot detection plugin :
...
* Run the analysis only once per month
* Reactivate rule : no page view if count_hit_only_visitors is False
* Add exception for "Less than 1 hit per page" rule if a phone is used
* Check for all error codes in 400..499, not only 403 and 404
* Referer '-' now counted as null
2023-03-11 20:48:17 +01:00
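Two of the rule changes listed above (any status in 400..499 counts as an error, and a referer of '-' is treated as null) can be sketched as follows. This is only an illustration: the function names are hypothetical and are not taken from the plugin, and treating an empty string like '-' is an added assumption.

```python
from typing import Optional


def is_client_error(status: int) -> bool:
    # Any status in 400..499 counts, not only 403 and 404.
    return 400 <= status <= 499


def normalize_referer(referer: Optional[str]) -> Optional[str]:
    # A referer of '-' is the access-log placeholder for "no referer".
    # Assumption: an empty string is treated the same way.
    if referer is None or referer in ("", "-"):
        return None
    return referer
```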
Gregory Soutade
c8dfdd17f7
Add "compatible" as a criterion for robot detection
2023-02-18 08:49:14 +01:00
Gregory Soutade
a5bef4ece6
Search for "compatible" in all requests, not only the first one
2023-02-18 08:48:57 +01:00
Gregory Soutade
21a21cd68f
Add a new rule for robots : 1 page and 1 hit, but not from the same source
2023-02-04 08:40:04 +01:00
Gregory Soutade
6a4fd4e9c8
New rule for robot : more than 10 not modified pages in a row
2023-01-28 09:40:26 +01:00
Gregory Soutade
ac246eabe2
Find robot name in 'compatible' string and group them
2023-01-28 09:38:59 +01:00
Gregory Soutade
975cc66bd5
Don't launch robot analysis rules for feed parsers
2022-11-16 21:10:11 +01:00
Gregory Soutade
4d3c2107f0
Don't save all visitors' requests into the database (saves space and computing time). Can be changed in default_conf.py with the keep_requests value
2022-06-23 21:16:30 +02:00
5130b1f6d8
Bad 2to3 Python conversion: the result of map() needs to be wrapped in list(). Otherwise the items can only be iterated once
2021-08-06 08:45:04 +02:00
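The behaviour behind this fix can be reproduced in a few lines: in Python 3, map() returns a one-shot iterator, so a second pass over it yields nothing. The keyword list below is illustrative only and is not the plugin's real data.

```python
# Illustrative values only; these are not the plugin's real patterns.
keywords = ["bot", "crawl"]

lazy = map(str.upper, keywords)   # Python 3: a one-shot iterator
first_pass = list(lazy)           # consumes the iterator
second_pass = list(lazy)          # nothing left to iterate

eager = list(map(str.upper, keywords))  # materialized once, reusable
third_pass = [k for k in eager]
fourth_pass = [k for k in eager]
```

Wrapping the call in list() restores the Python 2 behaviour, where map() returned a list that could be traversed any number of times.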
Gregory Soutade
0c2ac431d1
Be more strict with robots : requires at least 1 hit per viewed page
2021-06-03 08:52:04 +02:00
f457f4e390
Update code for Python3
2020-10-30 14:42:56 +01:00
Gregory Soutade
bb268114b2
Make backup before compressing (low memory servers)
...
Fix error : Call post hook plugins even in display only mode
Don't compute unordered hits (drop past hits if they are found after the current one)
Remove tags in stats diff
Don't do geolocation if visitor is not valid
Don't try to find search engine on robots
Update robot check rules
Add top_pages_diff plugin
2019-08-30 07:50:54 +02:00
Gregory Soutade
007be71ad6
New format for (not_)viewed pages/hits and bandwidth, which are now recorded by day (in a dictionary where only element 0 is initialized). Element 0 is the total. WARNING: not backward compatible with previous databases.
2017-08-24 07:55:53 +02:00
Gregory Soutade
68a67adecc
Add one more rule to robot detection : more than ten 404 pages viewed
2017-05-25 21:04:18 +02:00
Gregory Soutade
12cc80208d
Do merge
2016-02-06 14:45:09 +01:00
Gregory Soutade
4cb3b21ca5
Add reset feature
...
Allow to open .gz file transparently
Import debug in robots.py
2015-05-22 07:51:11 +02:00
Gregory Soutade
62be78845a
Add debug traces in robots plugin
2015-05-13 18:13:18 +02:00
Gregory Soutade
df78a3f4cb
[pre_analysis/robots] Don't check for an exact /robots.txt request; match any request ending with /robots.txt for robot detection
2015-04-06 17:52:31 +02:00
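A minimal sketch of the suffix check described in this commit, assuming the requested URI is available as a plain string. The function name is hypothetical and does not come from the plugin.

```python
def requests_robots_txt(uri: str) -> bool:
    # Match any path that ends with /robots.txt, not only the root one.
    return uri.endswith("/robots.txt")
```

With a suffix match, a request for /blog/robots.txt is also counted, which an exact comparison against /robots.txt would miss.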
Gregory Soutade
1d9bf71b4b
Make arguments of page_to_hit optional
2015-01-13 18:54:57 +01:00
Gregory Soutade
4c74a14037
Filter robot with *bot* and *crawl* re
2015-01-11 18:06:44 +01:00
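One way such a filter could look, as a rough sketch: a case-insensitive regular expression searched against the user agent string. The exact pattern, the constant name and the function name are assumptions and may differ from what the plugin actually uses.

```python
import re

# Assumed pattern: any user agent containing "bot" or "crawl".
ROBOT_RE = re.compile(r"bot|crawl", re.IGNORECASE)


def looks_like_robot(user_agent: str) -> bool:
    # search() matches anywhere in the string, like a *bot* / *crawl* glob.
    return ROBOT_RE.search(user_agent) is not None
```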
Grégory Soutadé
a35d462cb7
Replace # for module description by """ (help auto extraction)
2014-12-19 11:34:25 +01:00
e740bf1e45
Add licence information
2014-12-18 19:54:31 +01:00
Gregory Soutade
3a246d5cd6
Optimize analysis using reverse loop
2014-12-14 15:10:13 +01:00
4f1c09867d
WIP
2014-12-10 07:09:05 +01:00
Grégory Soutadé
751a9b3fae
Start big comments (post analysis / referers)
2014-12-09 16:54:02 +01:00
Grégory Soutadé
c87ddfb1aa
Add hit_to_page_conf in addition to page_to_hit_conf
2014-11-27 13:46:58 +01:00
Grégory Soutadé
5ccc63c7ae
Add hasBeenViewed() function
2014-11-27 13:07:14 +01:00
Grégory Soutadé
9fbc5448bc
Add conf_requires.
...
Load plugins in order
2014-11-27 12:34:42 +01:00
Grégory Soutadé
dd8349ab08
Add option count_hit_only_visitors and function isValidForCurrentAnalysis()
2014-11-27 09:01:51 +01:00
6b0ed18f35
Remove viewed limitation in page_to_hit : skip good requests
2014-11-26 22:06:58 +01:00
fec5e375e4
Remove iwla parameter in hook functions
2014-11-26 20:31:13 +01:00
9571bf09b6
Work with time
2014-11-26 19:53:00 +01:00
Grégory Soutadé
e6b31fbf8a
WIP
2014-11-26 16:56:33 +01:00
Grégory Soutadé
81b3eee552
Do a lot of things
2014-11-26 16:17:16 +01:00
Grégory Soutadé
7405cf237a
Do a more generic plugin : page_to_hit
2014-11-25 16:22:07 +01:00
d5db763b48
Rework conf in plugins
2014-11-24 21:42:57 +01:00
549c0e5d97
Update conf management
2014-11-24 21:37:37 +01:00
Gregory Soutade
21a95cc2fa
Rework plugins with classes
2014-11-24 17:13:59 +01:00
Gregory Soutade
670f024905
Add bytesToStr()
...
Automatically convert list into strings in appendRow()
Add package information
2014-11-24 13:44:04 +01:00
Gregory Soutade
e51e07f65e
Very nice result
2014-11-21 16:56:58 +01:00
Gregory Soutade
7dada493ab
Plugins OK
2014-11-21 10:41:29 +01:00
Gregory Soutade
f3cb04b16c
Externalize plugins
2014-11-20 16:15:57 +01:00