Merge pull request #249 from openzim/fix_readme

Enhance README by removing Chrome and headless reference
rgaudin 2023-11-16 12:56:46 +00:00 committed by GitHub
commit 5512e814c7
GPG Key ID: 4AEE18F83AFDEB23

@@ -13,10 +13,11 @@ Zimit is a scraper allowing to create ZIM file from any Web site.
 Technical background
 --------------------
-This version of Zimit runs a single-site headless-Chrome based crawl in a Docker container and produces a ZIM of the crawled content.
+Zimit runs a fully automated browser-based crawl of a website property and produces a ZIM of the crawled content. Zimit runs in a Docker container.
-The system extends the crawling system in [Browsertrix Crawler](https://github.com/webrecorder/browsertrix-crawler) and converts
-the crawled WARC files to ZIM using [warc2zim](https://github.com/openzim/warc2zim)
+The system:
+- runs a website crawl with [Browsertrix Crawler](https://github.com/webrecorder/browsertrix-crawler), which produces WARC files
+- converts the crawled WARC files to a single ZIM using [warc2zim](https://github.com/openzim/warc2zim)
 The `zimit.py` is the entrypoint for the system.
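
For reviewers who want to see how the described flow (crawl in a Docker container, then WARC to ZIM conversion behind the `zimit` entrypoint) might be invoked, here is a minimal sketch. It only builds a command line and does not run anything. The image name `ghcr.io/openzim/zimit`, the `--url` and `--name` flags, and the `/output` mount are assumptions not stated in this diff, and `build_zimit_command` is a hypothetical helper; check them against the README of the Zimit version in use.

```python
import shlex


def build_zimit_command(url, name, output_dir="/output",
                        image="ghcr.io/openzim/zimit"):
    """Return an argv list for a containerized Zimit run.

    Hypothetical helper: the image name, flags, and mount point are
    assumptions, not taken from this commit. Nothing is executed here.
    """
    return [
        "docker", "run",
        "-v", f"{output_dir}:/output",  # host directory assumed to receive the ZIM
        image,
        "zimit",                         # entrypoint named in the README text
        "--url", url,                    # site to crawl (assumed flag)
        "--name", name,                  # base name of the ZIM file (assumed flag)
    ]


cmd = build_zimit_command("https://example.com", "example")
# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.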