ghostarchive_enricher

Package Contents

class ghostarchive_enricher.GhostarchiveEnricher

Bases: auto_archiver.core.Enricher

Submits the current URL to Ghost Archive (ghostarchive.org) for archiving and stores the archived page URL as enrichment metadata.

Ghost Archive has no official API, so this module interacts with the web form and parses the HTML responses. Because the submission endpoint is protected by Cloudflare, archival submissions go through a headless browser (SeleniumBase), while searches for existing archives use plain HTTP requests.

Note: this module only confirms that Ghost Archive accepted the submission and returned an archive URL. It does not verify that the archived page content is complete or correctly rendered.
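The search path described above can be sketched without any network access. This is a minimal illustration, not the module's actual code: the `term` query parameter, the `/archive/<id>` and `/varchive/<id>` URL shapes, and the names `build_search_url`, `extract_archive_urls`, and `ARCHIVE_PATH_PATTERN` are all assumptions made for the example. The real `ARCHIVE_URL_PATTERN` may differ.

```python
import re
from urllib.parse import urlencode

GHOSTARCHIVE_BASE = "https://ghostarchive.org"
SEARCH_ENDPOINT = f"{GHOSTARCHIVE_BASE}/search"

# Assumption: archived pages are linked as /archive/<id> or /varchive/<id>.
# This is a hypothetical pattern, not the module's ARCHIVE_URL_PATTERN.
ARCHIVE_PATH_PATTERN = re.compile(r'href="(/v?archive/[A-Za-z0-9_-]+)"')


def build_search_url(url: str) -> str:
    """Build a search URL for `url`; the `term` parameter name is an assumption."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'term': url})}"


def extract_archive_urls(html: str) -> list[str]:
    """Return absolute archive URLs found in `html`, in order, without duplicates."""
    seen: dict[str, None] = {}
    for path in ARCHIVE_PATH_PATTERN.findall(html):
        seen.setdefault(GHOSTARCHIVE_BASE + path, None)
    return list(seen)
```

In a real run, the HTML passed to `extract_archive_urls` would be the body of a plain HTTP GET to the URL returned by `build_search_url`. Finding a link only shows that an archive entry exists, not that its content is complete.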

GHOSTARCHIVE_BASE = 'https://ghostarchive.org'
ARCHIVE_ENDPOINT = 'https://ghostarchive.org/archive2'
SEARCH_ENDPOINT = 'https://ghostarchive.org/search'
ARCHIVE_URL_PATTERN
enrich(to_enrich: auto_archiver.core.Metadata) → bool

Enriches a Metadata object with additional information or context.

Takes the Metadata object to enrich as an argument, modifies it in place, and returns a bool, as declared in the signature.
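The in-place contract can be illustrated with stand-in types. This is a sketch only: `StubMetadata`, `enrich_stub`, and the `"ghostarchive"` key are hypothetical and are not part of `auto_archiver`. Treating the bool as "an archive URL was stored" is also an assumption, since the signature does not say what the value means.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubMetadata:
    """Hypothetical stand-in for auto_archiver.core.Metadata (not the real class)."""
    url: str
    data: dict = field(default_factory=dict)


def enrich_stub(to_enrich: StubMetadata, archive_url: Optional[str]) -> bool:
    """Store `archive_url` on the metadata in place and report whether anything was stored.

    `archive_url` stands in for the result of a Ghost Archive submission;
    None models a submission that returned no archive URL.
    """
    if not archive_url:
        return False  # nothing to record; the metadata is left untouched
    to_enrich.data["ghostarchive"] = archive_url  # hypothetical key name
    return True
```

Because the object is mutated in place, the caller reads the result from the same instance it passed in and uses the return value only as a success flag.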