
# Ghost Archive Enricher
```{admonition} Module type

<span style='color: #0000FF'>[enricher](/core_modules.md#enricher-modules)</span>
```

Submits the current URL to [Ghost Archive](https://ghostarchive.org/) for archiving and returns the archived page URL.

Used as an **enricher** to add a Ghost Archive URL to items already extracted by other modules.

### Features
- Archives any public URL using the Ghost Archive service.
- Optionally checks for existing archives before submitting a new one.
- Supports HTTP and HTTPS proxies for requests.
- Parses HTML responses to extract archive URLs (Ghost Archive has no JSON API).
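
Because there is no JSON API, the archive URL has to be pulled out of the returned HTML. The following is a minimal sketch of that idea using only the standard library, not the module's actual parser. It assumes that archived pages are linked as `/archive/<id>` on `ghostarchive.org`; `ArchiveLinkParser` and `extract_archive_urls` are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: archived snapshots are linked under /archive/<id> on this host.
BASE_URL = "https://ghostarchive.org/"


class ArchiveLinkParser(HTMLParser):
    """Collects anchor hrefs that look like Ghost Archive snapshot links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # An <a> tag may have no href, or an href with no value.
        href = dict(attrs).get("href") or ""
        if href.startswith("/archive/") or href.startswith(BASE_URL + "archive/"):
            # Resolve relative links against the base URL.
            self.links.append(urljoin(BASE_URL, href))


def extract_archive_urls(html):
    """Return all snapshot-like links found in an HTML string, in page order."""
    parser = ArchiveLinkParser()
    parser.feed(html)
    return parser.links
```

If the page layout differs from the assumption above, only the `startswith` check needs to change.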

### Important
- This module confirms that Ghost Archive accepted the URL submission and returned an archive link.
  It does **not** verify the contents or completeness of the archived page.

### Notes
- Ghost Archive is a free service with no authentication required.
- Archived pages must be smaller than 50 MB (including CSS, fonts, images, etc.).
- Videos are archived up to 360p and must be under 100 MB and shorter than 30 minutes.
- Archival may take up to 5 minutes depending on the queue and page complexity.
- Archived content is stored indefinitely.
- Ghost Archive does not archive pages that require authentication or form submission.
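
Since archival can take several minutes, a caller typically has to wait for confirmation up to some deadline. The sketch below shows one way to poll with a timeout; it is an illustration under stated assumptions, not the module's implementation. `wait_for_archive`, the `check` callable, and the `interval` parameter are hypothetical, and `timeout` mirrors the `timeout` option only loosely.

```python
import time


def wait_for_archive(check, timeout=120, interval=5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns a truthy value or the timeout elapses.

    check is a hypothetical callable that returns the archive URL once it
    is available and a falsy value otherwise. Returns None on timeout.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        # Stop if the next attempt would start after the deadline.
        if clock() + interval > deadline:
            return None
        sleep(interval)
```

The `clock` and `sleep` parameters are injectable so the loop can be exercised in tests without real waiting.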

### Limitations
- No official API exists, so this module interacts with the Ghost Archive web interface.
- The submission endpoint is protected by Cloudflare, so a headless browser (SeleniumBase) is used for new submissions.
- Searching for existing archives uses plain HTTP requests and does not require a browser.
- Rate limiting may apply; consider using a delay between requests if archiving many URLs.
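
When archiving many URLs, the delay mentioned above can be as simple as pausing between consecutive submissions. This is a minimal sketch under stated assumptions: `archive_with_delay` and the `archive_one` callable are hypothetical, and the 5-second default is an arbitrary illustration, not a documented Ghost Archive limit.

```python
import time


def archive_with_delay(urls, archive_one, delay_seconds=5.0, sleep=time.sleep):
    """Call archive_one(url) for each URL, pausing between consecutive calls.

    archive_one is a hypothetical callable that submits one URL and
    returns its result. No pause happens before the first call.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)
        results[url] = archive_one(url)
    return results
```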


## Configuration Options

### YAML
```{code} yaml

# steps configuration
steps:
...
  enrichers:
  - ghostarchive_enricher
...

# module configuration
...

ghostarchive_enricher:
  timeout: 120
  check_existing: true
  proxy_http:
  proxy_https:
```
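
The two proxy options are left empty by default. As a rough illustration of how such values are commonly passed to an HTTP client, the sketch below builds the scheme-keyed mapping accepted by the `proxies` argument of the `requests` library. Whether the module does exactly this is an assumption, and `build_proxies` is a hypothetical helper.

```python
def build_proxies(proxy_http=None, proxy_https=None):
    """Return a scheme-keyed proxies mapping, omitting unset entries."""
    proxies = {}
    if proxy_http:
        proxies["http"] = proxy_http
    if proxy_https:
        proxies["https"] = proxy_https
    return proxies
```

An empty mapping means no proxy is configured, which matches leaving both options blank in the YAML above.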

### Command Line
| Option | Description | Default | Type |
| --- | --- | --- | --- |
| `ghostarchive_enricher.timeout` | Optional. Seconds to wait for Ghost Archive to confirm a successful archive. | 120 | int |
| `ghostarchive_enricher.check_existing` | Optional. Whether to search for an existing archive before submitting a new one. | True | bool |
| `ghostarchive_enricher.proxy_http` | Optional. HTTP proxy to use for requests, e.g. `http://proxy-user:password@proxy-ip:port`. | None | string |
| `ghostarchive_enricher.proxy_https` | Optional. HTTPS proxy to use for requests, e.g. `https://proxy-user:password@proxy-ip:port`. | None | string |

[API Reference](../../../autoapi/ghostarchive_enricher/index)
