ghostarchive_enricher
Submodules
Package Contents
- class ghostarchive_enricher.GhostarchiveEnricher
Bases: auto_archiver.core.Enricher
Submits the current URL to Ghost Archive (ghostarchive.org) for archiving and stores the archived page URL as enrichment metadata.
Ghost Archive has no official API — this module interacts with the web form and parses HTML responses. The submission endpoint is protected by Cloudflare, so a headless browser (SeleniumBase) is used for archival submissions, while plain HTTP requests are used for searching existing archives.
Note: this module only confirms that Ghost Archive accepted the submission and returned an archive URL. It does not verify that the archived page content is complete or correctly rendered.
- GHOSTARCHIVE_BASE = 'https://ghostarchive.org'
- ARCHIVE_ENDPOINT = 'https://ghostarchive.org/archive2'
- SEARCH_ENDPOINT = 'https://ghostarchive.org/search'
- ARCHIVE_URL_PATTERN
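As a rough illustration of how the search side could work with plain HTTP and HTML parsing, the sketch below builds a search URL and pulls archive links out of a response body. It makes no network calls. Several things here are assumptions and not taken from the module: the value of ARCHIVE_URL_PATTERN is not shown on this page, so `ASSUMED_ARCHIVE_URL_PATTERN` (archive pages under `/archive/<id>`) is a guess; the `term` query parameter name is also a guess; and `build_search_url`, `ArchiveLinkParser`, and `find_archive_urls` are hypothetical helper names.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode

GHOSTARCHIVE_BASE = "https://ghostarchive.org"
SEARCH_ENDPOINT = f"{GHOSTARCHIVE_BASE}/search"

# Assumption: archived pages live under /archive/<id>. The module's real
# ARCHIVE_URL_PATTERN is not shown in this page and may differ.
ASSUMED_ARCHIVE_URL_PATTERN = re.compile(
    r"^(?:https?://ghostarchive\.org)?/archive/([A-Za-z0-9]+)$"
)


def build_search_url(url: str, param: str = "term") -> str:
    """Build a search URL. The query parameter name is an assumption."""
    return f"{SEARCH_ENDPOINT}?{urlencode({param: url})}"


class ArchiveLinkParser(HTMLParser):
    """Collect absolute archive URLs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.archive_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # Attribute values can be None (e.g. <a href>), so fall back to "".
        href = dict(attrs).get("href") or ""
        match = ASSUMED_ARCHIVE_URL_PATTERN.match(href)
        if match:
            self.archive_urls.append(
                f"{GHOSTARCHIVE_BASE}/archive/{match.group(1)}"
            )


def find_archive_urls(html: str) -> list[str]:
    """Return archive URLs found in an HTML document, in document order."""
    parser = ArchiveLinkParser()
    parser.feed(html)
    return parser.archive_urls
```

Parsing with the standard-library `HTMLParser` keeps the sketch dependency-free; the actual module may use a different parser or match against the full response text instead.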
- enrich(to_enrich: auto_archiver.core.Metadata) → bool
Enriches a Metadata object with additional information or context.
Takes the metadata object to enrich as an argument and modifies it in place, returning a bool as declared in the signature.
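The in-place contract of enrich, with the bool return declared in its signature, might look roughly like the sketch below. This is not the module's implementation: `FakeMetadata` is a hypothetical stand-in for auto_archiver.core.Metadata, the `"ghostarchive"` key is an invented name, the `submit` callable replaces the browser-based submission, and treating True as "an archive URL was stored" is an assumption about what the bool means.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeMetadata:
    """Hypothetical stand-in for auto_archiver.core.Metadata."""
    url: str
    data: dict = field(default_factory=dict)


def enrich_sketch(
    to_enrich: FakeMetadata,
    submit: Callable[[str], Optional[str]],
) -> bool:
    """Store the archive URL on the object in place.

    `submit` stands in for the real submission step and returns the
    archive URL, or None when the submission did not yield one.
    Assumption: True means an archive URL was stored.
    """
    archive_url = submit(to_enrich.url)
    if not archive_url:
        return False
    to_enrich.data["ghostarchive"] = archive_url  # hypothetical key name
    return True
```

Injecting `submit` keeps the sketch testable without a headless browser or network access.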