Module Documentation
These pages describe the core modules that ship with Auto Archiver and provide its main functionality for archiving web pages. There are five core module types:
Feeders - these ‘feed’ information (the URLs) from various sources to the Auto Archiver for processing
Extractors - these ‘extract’ the page data for a given URL that is fed in by a feeder
Enrichers - these ‘enrich’ the data extracted in the previous step with additional information
Storage - these ‘store’ the data in a persistent location (on disk, Google Drive, etc.)
Databases - these ‘store’ the status of the entire archiving process in a log file or database
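
To illustrate how the five module types relate to one another, here is a minimal Python sketch of the flow described above. It does not use the Auto Archiver API: every name in it (`Item`, `list_feeder`, `fake_extractor`, `length_enricher`, `MemoryStorage`, `MemoryDatabase`, `run_pipeline`) is hypothetical and exists only for illustration. The extractor fakes the page content rather than downloading anything, and the order in which the stages run is an assumption based on the list above.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class Item:
    """Hypothetical record passed between stages (not an Auto Archiver class)."""
    url: str
    content: str = ""
    metadata: dict = field(default_factory=dict)
    stored_at: str = ""


def list_feeder(urls: Iterable[str]) -> Iterator[Item]:
    # Feeder: turns a source of URLs into items to process.
    for url in urls:
        yield Item(url=url)


def fake_extractor(item: Item) -> Item:
    # Extractor: a real one would fetch the page; this one fakes the content.
    item.content = f"<html>content of {item.url}</html>"
    return item


def length_enricher(item: Item) -> Item:
    # Enricher: adds information derived from the extracted data.
    item.metadata["content_length"] = len(item.content)
    return item


class MemoryStorage:
    # Storage: a dict stands in for a persistent location such as a disk.
    def __init__(self) -> None:
        self.files: dict = {}

    def store(self, item: Item) -> Item:
        key = f"archive/{len(self.files)}.html"
        self.files[key] = item.content
        item.stored_at = key
        return item


class MemoryDatabase:
    # Database: records the status of each archived URL.
    def __init__(self) -> None:
        self.rows: list = []

    def record(self, item: Item, status: str) -> None:
        self.rows.append(
            {"url": item.url, "status": status, "stored_at": item.stored_at}
        )


def run_pipeline(urls: Iterable[str], storage: MemoryStorage,
                 database: MemoryDatabase) -> list:
    # Assumed order: feed -> extract -> enrich -> store -> record status.
    for item in list_feeder(urls):
        item = fake_extractor(item)
        item = length_enricher(item)
        item = storage.store(item)
        database.record(item, "success")
    return database.rows
```

Calling `run_pipeline(["https://example.com/a"], MemoryStorage(), MemoryDatabase())` returns one status row for the URL. In the real tool, each stage is provided by a configurable module rather than a hard-coded function.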