Commit Graph

35 Commits

Author SHA1 Message Date
Gregory Soutade
6d46ac4461 Robots: Improve compatible keyword detection for robots 2024-07-28 09:25:40 +02:00
Gregory Soutade
974d355dd4 Add no_referrer_domains list to default_conf for websites that define this policy 2024-01-30 11:24:52 +01:00
Gregory Soutade
16cd817fec Increase not modified page threshold for robot detection 2023-07-05 09:15:48 +02:00
Gregory Soutade
71d8ee2113 Forgot Firefox icon 2023-03-25 08:11:57 +01:00
Gregory Soutade
440f51ddd1 Remove robot rule 1 page for phones 2023-03-23 21:17:52 +01:00
Gregory Soutade
a0a1f42df4 Update robot detection plugin:
  * Run the analysis only once per month
  * Reactivate rule: no page view if count_hit_only_visitors is False
  * Add exception for "Less than 1 hit per page" rule if a phone is used
  * Check for all error codes in 400..499, not only 403 and 404
  * Referer '-' now counted as null
2023-03-11 20:48:17 +01:00
Gregory Soutade
c8dfdd17f7 Add "compatible" as a criterion for robots 2023-02-18 08:49:14 +01:00
Gregory Soutade
a5bef4ece6 Search for "compatible" in all requests, not only the first one 2023-02-18 08:48:57 +01:00
Gregory Soutade
21a21cd68f Add a new rule for robots : 1 page and 1 hit, but not from the same source 2023-02-04 08:40:04 +01:00
Gregory Soutade
6a4fd4e9c8 New rule for robot : more than 10 not modified pages in a row 2023-01-28 09:40:26 +01:00
Gregory Soutade
ac246eabe2 Find robot name in 'compatible' string and group them 2023-01-28 09:38:59 +01:00
Gregory Soutade
975cc66bd5 Don't launch robot analysis rules for feed parsers 2022-11-16 21:10:11 +01:00
Gregory Soutade
4d3c2107f0 Don't save all visitors' requests into the database (saves space and computation). Can be changed in default_conf.py with the keep_requests value 2022-06-23 21:16:30 +02:00
5130b1f6d8 Bad 2to3 Python conversion: the result of map() needs to be wrapped in list(). Otherwise, the items are only analyzed once 2021-08-06 08:45:04 +02:00
Gregory Soutade
0c2ac431d1 Be more strict with robots : requires at least 1 hit per viewed page 2021-06-03 08:52:04 +02:00
f457f4e390 Update code for Python3 2020-10-30 14:42:56 +01:00
Gregory Soutade
bb268114b2 Make backup before compressing (low memory servers)
Fix error : Call post hook plugins even in display only mode
Don't compute out-of-order hits (drop past hits if they are found after the current one)
Remove tags in stats diff
Don't do geolocation if visitor is not valid
Don't try to find search engine on robots
Update robot check rules
Add top_pages_diff plugin
2019-08-30 07:50:54 +02:00
Gregory Soutade
007be71ad6 New format for (not_)viewed pages/hits and bandwidth, which are now recorded by day (in a dictionary where only element 0 is initialized). Element 0 is the total. WARNING: not backward compatible with previous databases. 2017-08-24 07:55:53 +02:00
Gregory Soutade
68a67adecc Add one more rule to robot detection : more than ten 404 pages viewed 2017-05-25 21:04:18 +02:00
Gregory Soutade
12cc80208d Do merge 2016-02-06 14:45:09 +01:00
Gregory Soutade
4cb3b21ca5 Add reset feature
Allow to open .gz file transparently
Import debug in robots.py
2015-05-22 07:51:11 +02:00
Gregory Soutade
62be78845a Add debug traces in robots plugin 2015-05-13 18:13:18 +02:00
Gregory Soutade
df78a3f4cb [pre_analysis/robots] For robot detection, don't check for an exact /robots.txt request, but for requests ending with /robots.txt 2015-04-06 17:52:31 +02:00
Gregory Soutade
4c74a14037 Filter robots with *bot* and *crawl* regular expressions 2015-01-11 18:06:44 +01:00
Grégory Soutadé
a35d462cb7 Replace # with """ for module descriptions (helps automatic extraction) 2014-12-19 11:34:25 +01:00
e740bf1e45 Add licence information 2014-12-18 19:54:31 +01:00
4f1c09867d WIP 2014-12-10 07:09:05 +01:00
Grégory Soutadé
751a9b3fae Start big comments (post analysis / referers) 2014-12-09 16:54:02 +01:00
Grégory Soutadé
dd8349ab08 Add option count_hit_only_visitors and function isValidForCurrentAnalysis() 2014-11-27 09:01:51 +01:00
fec5e375e4 Remove iwla parameter in hook functions 2014-11-26 20:31:13 +01:00
Grégory Soutadé
e6b31fbf8a WIP 2014-11-26 16:56:33 +01:00
Grégory Soutadé
81b3eee552 Do a lot of things 2014-11-26 16:17:16 +01:00
d5db763b48 Rework conf in plugins 2014-11-24 21:42:57 +01:00
549c0e5d97 Update conf management 2014-11-24 21:37:37 +01:00
Gregory Soutade
21a95cc2fa Rework plugins with classes 2014-11-24 17:13:59 +01:00
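
Note on commit 5130b1f6d8: in Python 3, map() returns a one-shot iterator rather than a list, so code converted by 2to3 that loops over the result more than once silently sees nothing after the first pass. The sketch below reproduces that behavior in isolation. The pattern and user-agent values are hypothetical and are not taken from the plugin's actual code.

```python
import re

# Hypothetical inputs, for illustration only (not the plugin's real data).
patterns = [r"bot", r"crawl"]
user_agents = ["Examplebot/1.0", "Mozilla/5.0", "my-crawler"]

# Python 3: map() yields a one-shot iterator. Each any() call consumes part
# of it, so later user agents are checked against fewer (or no) patterns.
lazy = map(re.compile, patterns)
first_pass = [any(p.search(ua) for p in lazy) for ua in user_agents]

# Wrapping the result in list() materializes the compiled patterns,
# so every user agent is checked against every pattern.
compiled = list(map(re.compile, patterns))
second_pass = [any(p.search(ua) for p in compiled) for ua in user_agents]

print(first_pass)   # "my-crawler" is missed: the iterator is already exhausted
print(second_pass)  # "my-crawler" is detected
```

With the lazy iterator, the third user agent is never tested against the crawl pattern, which matches the symptom described in the commit message.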