360 Commits

Author SHA1 Message Date
dependabot[bot]
1c30abd39d
Bump the production-dependencies group across 1 directory with 2 updates
Bumps the production-dependencies group with 2 updates in the / directory: [pyright](https://github.com/RobertCraigie/pyright-python) and [pytest](https://github.com/pytest-dev/pytest).


Updates `pyright` from 1.1.379 to 1.1.380
- [Release notes](https://github.com/RobertCraigie/pyright-python/releases)
- [Commits](https://github.com/RobertCraigie/pyright-python/compare/v1.1.379...v1.1.380)

Updates `pytest` from 8.3.2 to 8.3.3
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.3.2...8.3.3)

---
updated-dependencies:
- dependency-name: pyright
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-dependencies
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-11 22:22:13 +00:00
benoit74
3e2ddd1708
Prepare for 2.1.3 2024-09-10 07:59:50 +00:00
benoit74
6d5fc0bed0
Release 2.1.2 v2.1.2 2024-09-09 14:38:21 +00:00
benoit74
ca86c8c7cd
Merge pull request #386 from openzim/dependabot/pip/production-dependencies-8a1364bdbe
Bump ruff from 0.6.3 to 0.6.4 in the production-dependencies group
2024-09-09 16:28:57 +02:00
dependabot[bot]
1c9d927438
Bump ruff from 0.6.3 to 0.6.4 in the production-dependencies group
Bumps the production-dependencies group with 1 update: [ruff](https://github.com/astral-sh/ruff).


Updates `ruff` from 0.6.3 to 0.6.4
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.6.3...0.6.4)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-09 09:20:37 +02:00
benoit74
5e3c731fb2
Merge pull request #387 from openzim/update
Browsertrix crawler 1.3.0-beta.1
2024-09-09 09:19:22 +02:00
benoit74
113eeebf9c
Browsertrix crawler 1.3.0-beta.1 2024-09-09 07:14:41 +00:00
benoit74
6a804e9a8e
Prepare for 2.1.2 2024-09-05 08:25:44 +00:00
benoit74
501520d07f
Release 2.1.1 v2.1.1 2024-09-05 07:45:42 +00:00
benoit74
e0d1adf676
Merge pull request #384 from openzim/dependabot/pip/production-dependencies-e14423a93f
Bump pyright from 1.1.378 to 1.1.379 in the production-dependencies group across 1 directory
2024-09-05 09:11:08 +02:00
dependabot[bot]
7873667434
Bump pyright in the production-dependencies group across 1 directory
Bumps the production-dependencies group with 1 update in the / directory: [pyright](https://github.com/RobertCraigie/pyright-python).


Updates `pyright` from 1.1.378 to 1.1.379
- [Release notes](https://github.com/RobertCraigie/pyright-python/releases)
- [Commits](https://github.com/RobertCraigie/pyright-python/compare/v1.1.378...v1.1.379)

---
updated-dependencies:
- dependency-name: pyright
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-05 07:08:03 +00:00
benoit74
a1329974a1
Merge pull request #382 from openzim/upgrade
Upgrade to crawler 1.3.0-beta.0, Ubuntu Noble, and fix documentation
2024-09-05 09:06:48 +02:00
benoit74
6b3c725eeb
More precise usage on diskUtilization setting 2024-09-03 18:06:07 +00:00
benoit74
7f76415710
Upgrade to browsertrix crawler 1.3.0-beta.0
Among other changes, it includes the upgrade to Ubuntu Noble, so we no
longer need the additional deadsnakes ppa in Dockerfile.
2024-09-03 18:06:06 +00:00
dependabot[bot]
37c4beda6a
Bump the production-dependencies group with 3 updates
Bumps the production-dependencies group with 3 updates: [ruff](https://github.com/astral-sh/ruff), [pyright](https://github.com/RobertCraigie/pyright-python) and [selenium](https://github.com/SeleniumHQ/Selenium).


Updates `ruff` from 0.5.7 to 0.6.3
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.7...0.6.3)

Updates `pyright` from 1.1.375 to 1.1.378
- [Release notes](https://github.com/RobertCraigie/pyright-python/releases)
- [Commits](https://github.com/RobertCraigie/pyright-python/compare/v1.1.375...v1.1.378)

Updates `selenium` from 4.23.0 to 4.24.0
- [Release notes](https://github.com/SeleniumHQ/Selenium/releases)
- [Commits](https://github.com/SeleniumHQ/Selenium/compare/selenium-4.23.0...selenium-4.24.0)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production-dependencies
- dependency-name: pyright
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-dependencies
- dependency-name: selenium
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-03 13:31:28 +00:00
benoit74
ef12d01958
Configure dependabot.yml
Signed-off-by: benoit74 <benoit74@users.noreply.github.com>
2024-09-03 15:30:16 +02:00
benoit74
d814c23178
Merge pull request #373 from openzim/stream_dl
Stream files downloads to not exhaust memory
2024-08-12 22:23:17 +02:00
benoit74
efdf7804c0
Stream files downloads to not exhaust memory 2024-08-12 19:56:05 +00:00
benoit74
d0d0c6e6e6
Merge pull request #370 from openzim/add_warc_tar
Add support for tar files in --warcs
2024-08-12 14:35:23 +02:00
benoit74
be1e2d6745
Better wording for capabilities
Signed-off-by: benoit74 <benoit74@users.noreply.github.com>
2024-08-11 20:42:02 +02:00
benoit74
f7df467eab
Document capabilities and known limitations
Signed-off-by: benoit74 <benoit74@users.noreply.github.com>
2024-08-11 20:40:59 +02:00
benoit74
af48be8f82
Add support for tar files in --warcs 2024-08-09 09:27:57 +00:00
benoit74
7e69d8ab75
Prepare for 2.1.1 2024-08-09 08:14:10 +00:00
benoit74
2e082c41a9
Release 2.1.0 v2.1.0 2024-08-09 08:02:16 +00:00
benoit74
ad5adcd096
Merge pull request #368 from openzim/release
Upgrade dependencies
2024-08-09 09:56:40 +02:00
benoit74
bc06e85ced
Upgrade dependencies 2024-08-09 07:53:11 +00:00
benoit74
a0f802099a
Merge pull request #367 from openzim/sort_folder_mtime
Sort WARC directories passed to zimit by modification time
2024-08-09 09:47:56 +02:00
benoit74
eb32adfea7
Sort WARC directories passed to zimit by modification time 2024-08-07 12:16:08 +00:00
benoit74
0d5a08c912
Merge pull request #356 from openzim/only_warc2zim
Process WARC files directly and do not pass browsertrix version to warc2zim
2024-08-07 14:15:13 +02:00
benoit74
8cd1db6eef
Add option to directly process WARC files 2024-08-07 12:06:44 +00:00
benoit74
459a30a226
Do not log number of WARC files found 2024-08-07 12:06:43 +00:00
benoit74
861751a7ed
Stop fetching and passing browsertrix crawler version as scraperSuffix to warc2zim 2024-08-07 12:06:43 +00:00
benoit74
1ea533c75f
Merge pull request #351 from openzim/automated_daily_tests
Automate daily tests of ZIM behavior - Youtube only for now
2024-08-07 12:37:59 +02:00
benoit74
6d078c4dcf
Automate daily tests of ZIM behavior - Youtube only for now 2024-08-07 10:34:19 +00:00
benoit74
751e10473a
Merge pull request #348 from openzim/assert_zim_entries
Add test checking that expected entries are present
2024-08-07 12:31:29 +02:00
benoit74
f756c2c652
Fix CHANGELOG 2024-08-07 09:38:15 +00:00
benoit74
097613de29
Add test checking that expected entries are present 2024-08-07 09:38:08 +00:00
benoit74
4c35836395
Merge pull request #347 from openzim/fix_readme
Fix README and Dockerfile for imprecisions
2024-08-07 11:35:42 +02:00
benoit74
6e3951dfa7
Fix README and Dockerfile for imprecisions (#314) 2024-08-07 09:32:37 +00:00
benoit74
ea7653ef37
Merge pull request #346 from openzim/custom_behaviors
Add support for custom behaviors configuration
2024-08-07 11:31:57 +02:00
benoit74
80b6b26782
Add support for custom behaviors configuration 2024-08-07 09:28:07 +00:00
benoit74
6ab3401fa2
Merge pull request #345 from openzim/profile_is_url_doc
Make it clear that --profile argument can be an HTTP(S) URL
2024-08-07 11:26:55 +02:00
benoit74
a1efe8dccf
Make it clear that --profile argument can be an HTTP(S) URL (and not only a path) 2024-08-07 09:16:19 +00:00
benoit74
526019e095
Prepare for 2.0.7 2024-08-02 08:46:59 +00:00
benoit74
2452e60d9d
Release 2.0.6 v2.0.6 2024-08-02 08:17:58 +00:00
benoit74
dee57a8dd8
Merge pull request #363 from openzim/browsertrix_1_2_6
Upgrade to Browsertrix Crawler 1.2.6
2024-08-02 10:15:47 +02:00
benoit74
c92782bea0
Upgrade to Browsertrix Crawler 1.2.6 2024-08-02 08:07:46 +00:00
benoit74
7305f70300
Prepare for 2.0.6 2024-07-24 06:39:21 +00:00
benoit74
021654e6b3
Release 2.0.5 v2.0.5 2024-07-24 06:37:27 +00:00
benoit74
7357b1f2ce
Merge pull request #358 from openzim/prepare_release
Upgrade to Browsertrix Crawler 1.2.5 and warc2zim 2.0.3
2024-07-24 07:41:17 +02:00