mirror of
https://github.com/openzim/zimit.git
synced 2025-09-23 03:52:16 -04:00
Added domains blocklist (#77)
All domains from the 3 [anudeepND](https://github.com/anudeepND/blacklist) lists are now blocked at local resolver level by updating /etc/hosts in entrypoint.
- this saves network and CPU resources by failing early.
- this is wanted in almost all cases
- can be bypassed by setting a blank entrypoint
parent f4c11dc948
commit e91cd7921e
13
Dockerfile
@@ -11,5 +11,16 @@ ADD zimit.py /app/
RUN ln -s /app/zimit.py /usr/bin/zimit
CMD ["zimit"]
# download list of bad domains to filter out. intentionally run post-install
# so it's not cached in earlier layers (URL stays the same but content is updated)
RUN mkdir -p /tmp/ads && cd /tmp/ads && \
    curl -L -O https://hosts.anudeep.me/mirror/adservers.txt && \
    curl -L -O https://hosts.anudeep.me/mirror/CoinMiner.txt && \
    curl -L -O https://hosts.anudeep.me/mirror/facebook.txt && \
    cat ./*.txt > /etc/blocklist.txt \
    && rm ./*.txt
RUN printf '#!/bin/sh\ncat /etc/blocklist.txt >> /etc/hosts\nexec "$@"' > /usr/local/bin/entrypoint.sh && \
    chmod +x /usr/local/bin/entrypoint.sh
ENTRYPOINT ["entrypoint.sh"]
CMD ["zimit"]
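
The `printf` one-liner above packs the whole entrypoint into a single string, which is hard to read. Below is a minimal sketch of the same logic written out as a script. It assumes the downloaded lists are already in hosts format (lines such as `0.0.0.0 domain`). The `BLOCKLIST` and `HOSTS_FILE` variables and the readability check are illustrative additions that are not part of this commit. The append presumably happens at container start rather than at build time because Docker generates `/etc/hosts` itself when a container is created.

```shell
#!/bin/sh
# Sketch of the generated entrypoint.sh, expanded for readability.
# BLOCKLIST and HOSTS_FILE are hypothetical overrides (not in the commit);
# their defaults match the paths used in the Dockerfile.
BLOCKLIST="${BLOCKLIST:-/etc/blocklist.txt}"
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"

append_blocklist() {
    # Append the blocked domains to the hosts file so lookups resolve locally.
    # The readability check is an addition: the commit appends unconditionally.
    if [ -r "$BLOCKLIST" ]; then
        cat "$BLOCKLIST" >> "$HOSTS_FILE"
    fi
}

append_blocklist

# Hand control to the requested command (CMD or the docker run arguments).
exec "$@"
```

Because the script ends with `exec "$@"`, the default `CMD ["zimit"]` or any command passed to `docker run` replaces the shell process and receives signals directly.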
@@ -60,6 +60,8 @@ docker run -v /output:/output --cap-add=SYS_ADMIN --cap-add=NET_ADMIN \
The puppeteer-cluster provides monitoring output which is enabled by
default and prints the crawl status to the Docker log.
**Note**: The image automatically filters out a large number of ads using the 3 blocklists from [anudeepND](https://github.com/anudeepND/blacklist). If you don't want this filtering, disable the image's entrypoint in your container (`docker run --entrypoint="" openzim/zimit ...`).
Nota bene
---------
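
To check whether the filtering described in the note is active in a given container, one option is to look for a domain in the hosts file. The helper below is a rough sketch: `is_blocked` is a hypothetical name, and it assumes hosts-format entries (an address followed by one or more hostnames).

```shell
# is_blocked DOMAIN [HOSTS_FILE]
# Succeeds (exit 0) if DOMAIN is listed as a hostname in a hosts-format file.
# Hypothetical helper for illustration; not part of the zimit image.
is_blocked() {
    awk -v d="$1" '
        /^[[:space:]]*#/ { next }          # skip full-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # stop at a trailing comment
                if ($i == d) found = 1     # field 1 is the address, the rest are names
            }
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}
```

For example, `is_blocked ads.example.com && echo blocked` (with a placeholder domain) prints `blocked` only when the entrypoint has appended an entry for that name. With `--entrypoint=""` the blocklist is never appended, so the check fails.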