CSV Feeder

Module type: feeder

Reads URLs from CSV files and feeds them into the archiving process.

Features

  • Supports reading URLs from multiple input files, specified as a comma-separated list.

  • Allows specifying the column number or name to extract URLs from.

  • Skips header rows if the first value is not a valid URL.

Setup

  • Input files should be formatted with one URL per line, with or without a header row.

  • If you have a header row, you can specify the column number or name to read URLs from using the ‘column’ config option.
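The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the module's actual implementation: the names `looks_like_url` and `read_urls` are hypothetical, the URL check is a rough assumption, and treating a missing `column` as the first column is also an assumption.

```python
import csv
import io
from urllib.parse import urlparse


def looks_like_url(value):
    """Rough validity check for this sketch only (assumption, not the module's rule)."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def read_urls(handle, column=None):
    """Yield URLs from an open CSV file object (hypothetical helper).

    `column` may be None (assumed to mean the first column),
    a 0-indexed column number, or a header name.
    """
    reader = csv.reader(handle)
    first_row = next(reader, None)
    if first_row is None:
        return  # empty file

    # Resolve the column option to a 0-based index.
    if column is None:
        index = 0
    elif str(column).isdigit():
        index = int(column)
    else:
        # Look the name up in the header row; raises ValueError if absent.
        index = first_row.index(str(column))

    # Treat the first row as a header unless its value is a valid URL.
    if index < len(first_row) and looks_like_url(first_row[index]):
        yield first_row[index].strip()

    # Every remaining row contributes the value in the chosen column.
    for row in reader:
        if index < len(row) and row[index].strip():
            yield row[index].strip()


# Usage with an in-memory file that has a header row.
sample = io.StringIO("id,url\n1,https://example.com/a\n2,https://example.com/b\n")
print(list(read_urls(sample, column="url")))
# prints ['https://example.com/a', 'https://example.com/b']
```

Resolving a column name against the first row is why a header row is needed when `column` is given as a name; a numeric `column` works with or without one.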

Configuration Options

YAML

csv_feeder:
  files:
  column:

Command Line:

| Option | Description | Default | Type |
|---|---|---|---|
| csv_feeder.files | Required. Path to the input file(s) to read the URLs from, comma separated. Input files should be formatted with one URL per line. | None | valid_file |
| csv_feeder.column | Optional. Column number (0-indexed) or name to read the URLs from. | None | string |

API Reference