PDQ Hash Enricher#
Module type
PDQ Hash Enricher for generating perceptual hashes of media files.
Features#
Calculates perceptual hashes for image files using the PDQ hashing algorithm.
Enables detection of duplicate or near-duplicate visual content.
Processes images stored in Metadata objects, adding computed hashes to the corresponding Media entries.
Skips non-image media or files unsuitable for hashing (e.g., corrupted or unsupported formats).
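As a rough illustration of how stored hashes can support near-duplicate detection, two hexadecimal hash strings can be compared by counting the bits in which they differ (the Hamming distance). This sketch is not part of the enricher: the function names, the sample hashes, and the 31-bit threshold are assumptions chosen for illustration, and a suitable threshold depends on your data.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the bits that differ between two hexadecimal hash strings."""
    # XOR leaves a 1 wherever the two hashes disagree; count those bits.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_near_duplicate(hash_a: str, hash_b: str, threshold: int = 31) -> bool:
    """Treat two hashes as near-duplicates if few bits differ.

    The default threshold is an assumption for this sketch, not a value
    taken from the enricher.
    """
    return hamming_distance(hash_a, hash_b) <= threshold


# Hypothetical 256-bit hashes written as 64 hexadecimal characters.
original = "f" * 64
altered = "f" * 62 + "f0"   # differs from `original` in the lowest 4 bits
unrelated = "0" * 64        # differs from `original` in every bit

print(hamming_distance(original, altered))
print(is_near_duplicate(original, altered))
print(is_near_duplicate(original, unrelated))
```

Because the strings are parsed as integers before comparison, hashes written with or without leading zeros compare the same way.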
Notes#
Best used after enrichers like thumbnail_enricher or antibot_extractor_enricher (takes screenshots) to ensure images are available.
Uses the pdqhash library to compute 256-bit perceptual hashes, which are stored as hexadecimal strings.
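To make the storage format concrete, the sketch below shows one way a 256-bit hash could be turned into a hexadecimal string. It is an assumption-laden illustration rather than the enricher's actual code: `bits_to_hex` is a hypothetical helper, the zero-padding to 64 characters is a choice made here, and a hand-built list of bits stands in for the library's output so the example runs without pdqhash installed.

```python
def bits_to_hex(bits) -> str:
    """Pack a sequence of 256 bits (0/1 values) into a hexadecimal string."""
    bits = list(bits)
    if len(bits) != 256:
        raise ValueError(f"expected 256 bits, got {len(bits)}")
    # Join the bits into a binary string, then parse it as one large integer.
    bit_string = "".join(str(int(b)) for b in bits)
    # 256 bits = 64 hex digits; pad so every hash has the same length.
    # (Padding is an assumption of this sketch.)
    return format(int(bit_string, 2), "064x")


# Hypothetical bit vector standing in for a computed perceptual hash.
sample_bits = [1, 0] * 128

hex_hash = bits_to_hex(sample_bits)
print(hex_hash)
print(len(hex_hash))
```

Each group of four bits becomes one hexadecimal digit, so the repeating pattern `1010` in the sample yields a string of 64 `a` characters.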
# steps configuration
steps:
...
  enrichers:
  - pdq_hash_enricher
...
# module configuration
...
# No configuration options for pdq_hash_enricher.*