162 Commits

Author SHA1 Message Date
renaud gaudin
4b7e504d99 Updated test and stats to new crawl.json format 2023-01-31 11:12:36 +00:00
renaud gaudin
554fff5c87 Using browsertrix-crawler 0.8.0-beta.1 2023-01-31 10:34:32 +00:00
renaud gaudin
8fd9462e25 triggering a rebuild with updated (still main) warc2zim 2023-01-16 11:39:05 +00:00
renaud gaudin
0172c53c50 warc2zim is now at main branch, not master 2023-01-13 10:02:29 +00:00
renaud gaudin
3756c6612f Using browsertrix-crawler 0.8.0-beta.0 2023-01-13 09:59:07 +00:00
Kelson
511fccdc56 "main" is the new default branch 2022-12-21 11:07:37 +01:00
Kelson
859e79c165 "main" is the new default branch 2022-12-21 11:06:50 +01:00
renaud gaudin
cf26f8c33a Using browsertrix-crawler 0.7.1 2022-11-16 11:20:39 +00:00
renaud gaudin
0624c50121 Using browsertrix-crawler 0.7.0 (release) 2022-10-12 14:57:01 +00:00
renaud gaudin
fab4ff6bf5 using crawler 0.7.0-beta.5 2022-09-21 08:29:59 +00:00
renaud gaudin
a9cf1cd9c3 using crawler 0.7.0-beta.4 2022-09-09 07:26:03 +00:00
renaud gaudin
2d4375fd0a use crawler 0.7.0-beta.3 2022-09-03 18:44:48 +00:00
renaud gaudin
472c4cf41a trigger build for warc2zim update 2022-08-30 10:53:03 +00:00
renaud gaudin
ce68493087 increased check_url timeouts 2022-07-25 08:41:08 +00:00
renaud gaudin
857e044c84 Fixed --allowHashUrls incorrectly requiring a value 2022-07-18 10:23:16 +00:00
renaud gaudin
8c6d2bfb45 using browsertrix-crawler 0.7 beta 2022-07-04 15:08:49 +00:00
renaud gaudin
b79ad1b138 use master warc2zim in-between releases 2022-06-30 09:42:50 +00:00
renaud gaudin
142970bc0a Fixed #137: normalizes homepage redirects to standard ports 2022-06-22 09:57:01 +00:00
renaud gaudin
b29aeb08e6 back to dev 2022-06-21 17:20:30 +00:00
renaud gaudin
0eeb2ad9e3 Releasing 1.2.0 v1.2.0 2022-06-21 17:08:38 +00:00
renaud gaudin
dffc81860e updated docker publish action 2022-06-21 17:06:40 +00:00
renaud gaudin
e32aac3ec0 code styling 2022-06-21 17:05:08 +00:00
rgaudin
b2bb77cd65
Merge pull request #108 from openzim/crawler-with-video
update to latest browsertrix-crawler + warc2zim
2022-06-21 16:59:15 +00:00
renaud gaudin
932f97c999 updated tests for crawler and warc2zim 2022-06-21 16:55:32 +00:00
renaud gaudin
1f490ace8f Updated to browsertrix-crawler 0.6 and warc2zim 1.4 2022-06-21 12:04:56 +00:00
renaud gaudin
8b5eeb31c7 using crawler 0.6beta1 2022-06-14 14:58:33 +00:00
Ilya Kreymer
acf0aaf552 update to latest browsertrix-crawler
test with dev build of warc2zim 1.4.0 release
2022-06-14 14:58:33 +00:00
rgaudin
823e6bbb01
Merge pull request #132 from openzim/ci
updated CI test website URL
2022-06-13 10:05:25 +00:00
renaud gaudin
e29b6f3ad6 CI on push is sufficient 2022-06-13 10:02:35 +00:00
renaud gaudin
885e1763a1 updated CI test website URL 2022-06-13 09:57:37 +00:00
Kelson
80f3d3293f
Merge pull request #129 from openzim/release-badge
Release badge
2022-06-11 20:06:20 +02:00
Emmanuel Engelhart
0025901959 Replace Docker Hub build badge with CI badge 2022-06-11 11:56:18 +02:00
Emmanuel Engelhart
99f8fbafe1 Movebot does not exist anymore 2022-06-11 11:53:35 +02:00
Emmanuel Engelhart
3d3f4fb121 Add release tag 2022-06-11 11:52:48 +02:00
rgaudin
8bcd692462
Merge pull request #125 from JensKorte/patch-1
Update README.md
2022-05-30 22:07:10 +02:00
JensKorte
1f31d6c1a5
Update README.md
relative link didn't work and replaced by https://github.com/openzim/warc2zim
2022-05-30 21:45:18 +02:00
renaud gaudin
98587045b4 Updated readme: warc2zim params can be passed 2022-05-03 10:31:34 +00:00
renaud gaudin
efd8ca53b4 updating crawler and warc2zim v1.1.5 2021-06-10 14:14:11 +00:00
renaud gaudin
14ced5c481 fixed tests for new folder structure 2021-05-12 17:15:19 +00:00
renaud gaudin
2e9c129523 new crawler folder structure v1.1.4 v.1.14 2021-05-12 17:03:48 +00:00
renaud gaudin
03abf6050a updated warc2zim and browsertrix-crawler 2021-05-12 16:28:34 +00:00
renaud gaudin
f746f7b020 use same waitUntil defaults as current crawler 2021-03-04 10:40:12 +00:00
renaud gaudin
14fc8ffe0f released v1.1.3 v1.1.3 2021-03-01 09:59:34 +00:00
rgaudin
ae820472de
Merge pull request #85 from openzim/limit-hit
capture and incorporates limit info from crawl
2021-02-15 17:23:42 +00:00
renaud gaudin
cfa4b0e7f8 capture and incorporates limit info from crawl 2021-02-15 17:20:43 +00:00
renaud gaudin
964746481f using crawler 0.2.0 2021-02-15 17:15:54 +00:00
rgaudin
69892a215f
Merge pull request #84 from myt00seven/master
Update README.md with a --exclude example
2021-01-26 08:12:09 +00:00
lakesidethinks
6da4714cff Update README.md 2021-01-25 12:31:09 -06:00
renaud gaudin
d0d51539fe updated CHANGELOG 2021-01-15 12:59:00 +00:00
rgaudin
c3a7a02121
Merge pull request #80 from openzim/issue76
more flexible url redirects acceptance
2021-01-15 12:55:14 +00:00