Combining Blocklists
lifenjoiner edited this page 2023-08-31 23:12:31 +08:00

Combining blocklists

dnscrypt-proxy includes a tool to build block lists from local and remote lists in common formats.

That tool:

  • Converts 3rd party lists in common formats (such as HOSTS-file lists) into entries suitable for dnscrypt-proxy
  • Removes junk/typos/entries that don't represent host names
  • Removes duplicates and merges overlapping entries in order to make lists drastically smaller and faster
  • Measures how 3rd party lists overlap
  • Can prevent time-restricted entries from being included
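The conversion, junk removal, and merging steps listed above can be illustrated with a short sketch. This is not the code of generate-domains-blocklist.py; the function names and regular expressions are hypothetical, and it assumes that a blocklist entry for a domain also covers its subdomains, so that `ads.example.com` becomes redundant once `example.com` is listed.

```python
import re

# Hypothetical pattern for HOSTS-style lines such as "0.0.0.0 ads.example.com"
HOSTS_LINE = re.compile(r"^(?:0\.0\.0\.0|127\.0\.0\.1)\s+([a-z0-9._-]+)$")
# A plausible host name: at least two labels, last label starting with a letter
HOST_NAME = re.compile(r"^[a-z0-9_-]+(?:\.[a-z0-9_-]+)*\.[a-z][a-z0-9-]*$")


def parse_line(line):
    """Return a normalized host name, or None for comments and junk."""
    line = line.split("#", 1)[0].strip().lower()
    if not line:
        return None
    match = HOSTS_LINE.match(line)
    name = match.group(1) if match else line
    name = name.rstrip(".")
    return name if HOST_NAME.match(name) else None


def has_blocked_parent(name, names):
    """True if a parent domain of `name` is already present in `names`."""
    parts = name.split(".")
    return any(".".join(parts[i:]) in names for i in range(1, len(parts)))


def combine(lines):
    """Deduplicate entries and drop those already covered by a parent domain."""
    names = {n for n in map(parse_line, lines) if n is not None}
    return sorted(n for n in names if not has_blocked_parent(n, names))
```

Calling `combine()` on a mix of HOSTS-style lines, plain names, comments, and malformed entries returns a sorted list in which each remaining name appears once and no name is a subdomain of another.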

Named generate-domains-blocklist.py, the tool requires a Python interpreter. Python is available for virtually all operating systems. Any Python 3.x version should work. The script may also work with Python 2.x but this is not a supported configuration since Python 2.x has reached end of life.

The script is included in the dnscrypt-proxy source code in the utils directory. It can also be downloaded directly here:

It can be run with the python3 generate-domains-blocklist.py command, followed by relevant parameters.

usage: generate-domains-blocklist.py [-h] [-c CONFIG]
                                     [-a ALLOWLIST] [-r TIME_RESTRICTED] [-i]
                                     [-o OUTPUT_FILE] [-t TIMEOUT]

Create a unified blocklist from a set of local and remote files

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        file containing blocklist sources
  -a ALLOWLIST, --allowlist ALLOWLIST
                        file containing a set of names to exclude from the
                        blocklist
  -r TIME_RESTRICTED, --time-restricted TIME_RESTRICTED
                        file containing a set of names to be time restricted
  -i, --ignore-retrieval-failure
                        generate list even if some urls couldn't be retrieved
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        save generated blocklist to a text file with the
                        provided file name

The most common usage pattern is to run the script from the directory where it is located:

python3 generate-domains-blocklist.py -o blocklist.txt

which will load its configuration from a domains-blocklist.conf configuration file.

Edit that configuration file to include a combination of local (starting with file:) files and remote (starting with https://) lists.
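As a rough illustration of the two kinds of sources, the sketch below splits the lines of a configuration into local and remote entries. It is a simplified assumption about the file layout (one source per line, `#` for comments), not the script's actual parser. The `example.com` URL is a placeholder and `read_sources` is a hypothetical helper.

```python
# Hypothetical configuration content: one source per line, "#" starts a comment
EXAMPLE_CONFIG = """\
# Local additions
file:domains-blocklist-local-additions.txt

# A remote list (placeholder URL)
https://example.com/hosts.txt
# https://example.com/disabled-list.txt
"""


def read_sources(config_text):
    """Return (local, remote) source lists, skipping blanks and comments."""
    local, remote = [], []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("file:"):
            local.append(line)
        elif line.startswith("https://"):
            remote.append(line)
        else:
            raise ValueError("unsupported source: " + line)
    return local, remote
```

Commenting a line out with `#` removes that source from the result, which mirrors how sources are enabled or disabled in the configuration file.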

The example configuration already includes some popular sources, which you can comment out or uncomment according to your needs.

Be aware that all 3rd party lists include false positives and obsolete entries. Using a few high-quality, well-maintained lists is always preferable to trying to create the biggest possible list.

If an external source is unreachable or returns a temporary error, an existing output list is not overwritten, so the previous version can still be used. That behavior can be changed with the --ignore-retrieval-failure option.
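The idea of keeping the previous list when a source fails can be sketched as follows. This is an illustration of the behavior described above, not the script's own implementation. `write_blocklist` and its parameters are hypothetical, and a failed download is represented here by `None`.

```python
import os
import tempfile


def write_blocklist(output_path, results, ignore_retrieval_failure=False):
    """Write merged content to output_path unless a source failed.

    `results` is a list of (source, content) pairs, where content is None
    when the source could not be retrieved. Returns True if the output
    file was written, False if the existing file was left untouched.
    """
    failed = [source for source, content in results if content is None]
    if failed and not ignore_retrieval_failure:
        # Leave any existing output file as it is.
        return False

    merged = "".join(content for _, content in results if content is not None)

    # Write to a temporary file first, then replace the target in one step,
    # so that an interrupted run does not leave a half-written list behind.
    directory = os.path.dirname(os.path.abspath(output_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(merged)
    os.replace(tmp_path, output_path)
    return True
```

Writing to a temporary file in the same directory and then calling `os.replace` is one way to make sure a reader of the output file sees either the old list or the new one, never a partial write.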

For automated background updates, the script can be run as a cron job.