
# Wayback Machine Enricher (and Extractor)
```{admonition} Module type

<span style='color: #0000FF'>[enricher](/core_modules.md#enricher-modules)</span>, <span style='color: #00FF00'>[extractor](/core_modules.md#extractor-modules)</span>
```

Submits the current URL to the Wayback Machine for archiving and returns either a job ID or the completed archive URL.

### Features
- Archives URLs using the Internet Archive's Wayback Machine API.
- Supports conditional archiving based on the existence of prior archives within a specified time range.
- Supports optional HTTP and HTTPS proxies for requests to the Wayback Machine.
- Fetches and confirms the archive URL or provides a job ID for later status checks.
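
The conditional-archiving and confirmation features above can be illustrated with a minimal sketch. It is not this module's actual implementation: the helper names are hypothetical, and the endpoint and response fields (`status`, `timestamp`, `original_url`) are assumptions about the Wayback Machine's Save Page Now API that should be checked against its documentation.

```python
from typing import Optional

# Assumed Save Page Now endpoint; verify against the Internet Archive docs.
SAVE_ENDPOINT = "https://web.archive.org/save"


def build_save_payload(url: str, if_not_archived_within: Optional[str] = None) -> dict:
    """Build the form data for an archive request (hypothetical helper)."""
    payload = {"url": url}
    # Only send the condition when it is configured; None/empty means "always archive".
    if if_not_archived_within:
        payload["if_not_archived_within"] = str(if_not_archived_within)
    return payload


def archive_url_from_status(status: dict) -> Optional[str]:
    """Return the archive URL for a finished job, or None if it is not done yet.

    The field names used here are assumed, not taken from this module.
    """
    if status.get("status") != "success":
        return None
    return f"https://web.archive.org/web/{status['timestamp']}/{status['original_url']}"
```

When `archive_url_from_status` returns `None`, a caller would fall back to reporting the job ID so the status can be checked later.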

### Notes
- Requires a valid Wayback Machine API key and secret.
- Handles rate limiting by the Wayback Machine and retries status checks with exponential backoff.
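
As a rough illustration of retrying status checks with exponential backoff under a timeout, consider the sketch below. The function name, the initial delay, and the growth factor are all assumptions chosen for the example; the module's real retry schedule may differ.

```python
import time
from typing import Callable, Optional


def poll_with_backoff(
    check: Callable[[], Optional[str]],
    timeout: float,
    initial_delay: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Call `check` until it returns a value or `timeout` seconds elapse.

    The wait between attempts grows by `factor` each time and is capped
    so that the total wait never exceeds the timeout.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = check()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None  # caller can fall back to returning the job ID
        sleep(min(delay, remaining))
        delay *= factor
```

Here `check` would wrap a single status request and return the archive URL once the job has completed. Injecting `sleep` and `clock` keeps the loop testable without real waiting.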

### Steps to Get a Wayback API Key:
- Sign up for an account at [Internet Archive](https://archive.org/account/signup).
- Log in to your account.
- Navigate to your [account settings](https://archive.org/account).
- Alternatively, follow the Internet Archive's credentials tutorial: https://archive.org/developers/tutorial-get-ia-credentials.html
- Under Wayback Machine API Keys, generate a new key.
- Note down your API key and secret, as they will be required for authentication.
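
Once you have the key and secret, they are typically sent with each request as an authorization header. The sketch below assumes the Internet Archive's S3-style `LOW key:secret` scheme; the helper name is hypothetical and the header format should be confirmed against the Internet Archive documentation before relying on it.

```python
def build_auth_headers(key: str, secret: str) -> dict:
    """Build request headers from Wayback credentials (hypothetical helper).

    Assumes the Internet Archive's "LOW <key>:<secret>" authorization scheme.
    """
    if not key or not secret:
        raise ValueError("both the Wayback API key and secret are required")
    return {
        "Accept": "application/json",
        "Authorization": f"LOW {key}:{secret}",
    }
```

Failing early on an empty key or secret mirrors the fact that both options are required by this module.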


## Configuration Options

### YAML
```{code} yaml

# steps configuration
steps:
...
  enrichers:
  - wayback_extractor_enricher
  extractors:
  - wayback_extractor_enricher
...

# module configuration
...

wayback_extractor_enricher:
  timeout: 15
  if_not_archived_within:
  key: ''
  secret: ''
  proxy_http:
  proxy_https:
```

### Command Line:
| Option | Description | Default | Type |
| --- | --- | --- | --- |
| `wayback_extractor_enricher.timeout` | Optional. Seconds to wait for the Wayback Machine to confirm a successful archive. If the timeout elapses first, the result contains the `job_id` so the status can be checked manually later. | 15 | int |
| `wayback_extractor_enricher.if_not_archived_within` | Optional. Only ask the Wayback Machine to create a new archive if no existing archive was captured within the specified number of seconds; use None to ignore this option. For more information: https://docs.google.com/document/d/1Nsv52MvSjbLb2PCpHlat0gkzw0EvtSgpKHu4mk0MnrA | None | string |
| `wayback_extractor_enricher.key` | **Required**. Wayback API key. To get credentials, visit https://archive.org/account/s3.php |  | string |
| `wayback_extractor_enricher.secret` | **Required**. Wayback API secret. To get credentials, visit https://archive.org/account/s3.php |  | string |
| `wayback_extractor_enricher.proxy_http` | Optional. HTTP proxy to use for Wayback requests, e.g. http://proxy-user:password@proxy-ip:port | None | string |
| `wayback_extractor_enricher.proxy_https` | Optional. HTTPS proxy to use for Wayback requests, e.g. https://proxy-user:password@proxy-ip:port | None | string |
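
To show how the `proxy_http` and `proxy_https` options might be combined, here is a small sketch that turns them into the scheme-keyed mapping accepted by common Python HTTP clients (for example, the `proxies` argument of the `requests` library). The helper name is hypothetical and this is not necessarily how the module wires its proxies internally.

```python
from typing import Optional


def build_proxies(proxy_http: Optional[str] = None, proxy_https: Optional[str] = None) -> dict:
    """Map the two proxy options to a {scheme: proxy_url} dict, skipping unset values."""
    candidates = {"http": proxy_http, "https": proxy_https}
    return {scheme: url for scheme, url in candidates.items() if url}
```

With both options left at their default of None, the result is an empty dict, which means no proxy is used.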

[API Reference](../../../autoapi/wayback_extractor_enricher/index)
