Getting Started#
To get started with Auto Archiver, there are 3 main steps you need to complete:
1. Install Auto Archiver (Docker install or local install).
2. Set up your configuration (if you are OK with the default settings, you can skip this step).
3. Run Auto Archiver.
The way you run Auto Archiver depends on how you installed it (Docker install or local install).
Running a Docker Install#
If you installed Auto Archiver using Docker, open your terminal and copy-paste or type the following command:
docker run -it --rm -v $PWD/secrets:/app/secrets -v $PWD/local_archive:/app/local_archive bellingcat/auto-archiver -- "https://example.com/1/"
Breaking this command down:
- `docker run` tells Docker to start a new container (an instance of the image).
- `-it` tells Docker to run in 'interactive mode' so that we get nice colour logs.
- `--rm` makes sure this container is removed after execution (less garbage locally).
- `-v $PWD/secrets:/app/secrets` mounts your secrets folder with settings.
  - `-v` is a volume flag, which means a folder that you have on your computer will be connected to a folder inside the Docker container.
  - `$PWD/secrets` points to a `secrets/` folder in your current working directory (where your console points to). We use this folder as a best practice to hold all the secrets/tokens/passwords/… you use.
  - `/app/secrets` points to the path inside the Docker container where this folder can be found.
- `-v $PWD/local_archive:/app/local_archive` (optional) is needed if you use local_storage.
  - `-v` is the same as above: a volume instruction.
  - `$PWD/local_archive` is a `local_archive/` folder, in case you want to archive locally and have the files accessible outside Docker.
  - `/app/local_archive` is a folder inside Docker that you can reference in your orchestration.yaml file.
- `-- "https://example.com/1/"` passes the URL to archive to the default command line feeder.
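Both `-v` flags point at folders in your current working directory, so it can help to create them before the first run. Docker usually creates a missing host folder for a `-v` mount by itself, and on Linux that folder may end up owned by root. The lines below are a minimal sketch, assuming a POSIX shell. They only create the two folders and print how each one maps into the container. They do not start Docker.

```shell
# Create the two host folders that the -v flags mount into the container.
mkdir -p "$PWD/secrets" "$PWD/local_archive"

# Show how each host folder maps to its path inside the container.
for name in secrets local_archive; do
    echo "$PWD/$name -> /app/$name"
done
```

Run this from the directory where you intend to run the `docker run` command, since `$PWD` expands to whatever directory your terminal is currently in.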
Example invocations#
The invocations below will run the auto-archiver Docker image using a configuration file that you have specified
# Have auto-archiver run with the default settings, generating a settings file in ./secrets/orchestration.yaml
docker run -it --rm -v $PWD/secrets:/app/secrets -v $PWD/local_archive:/app/local_archive bellingcat/auto-archiver
# uses the same configuration, but with the `gsheet_feeder`, a header on row 2 and with some different column names
# Note this expects you to have followed the [Google Sheets setup](how_to/google_sheets.md) and added your service_account.json to the `secrets/` folder
# notice that columns is a dictionary so you need to pass it as JSON and it will override only the values provided
docker run -it --rm -v $PWD/secrets:/app/secrets -v $PWD/local_archive:/app/local_archive bellingcat/auto-archiver --feeders=gsheet_feeder --gsheet_feeder.sheet="use it on another sheets doc" --gsheet_feeder.header=2 --gsheet_feeder.columns='{"url": "link"}'
# Runs auto-archiver for the first time, but in 'full' mode, enabling all modules to get a full settings file
docker run -it --rm -v $PWD/secrets:/app/secrets -v $PWD/local_archive:/app/local_archive bellingcat/auto-archiver --mode full
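The single-URL Docker command shown earlier can be repeated for a list of URLs. The sketch below is one way to do that, assuming a POSIX shell and a text file with one URL per line. The function name `archive_urls_from_file` and the `DOCKER` override variable are hypothetical helpers for this example and are not part of Auto Archiver. The `-it` flags are left out here because the loop reads the URL file on standard input, so no interactive terminal is attached (at the cost of the colour logs).

```shell
# Hypothetical helper, not part of Auto Archiver: archives every URL listed
# in a text file (one URL per line) by repeating the single-URL command.
# Set DOCKER=echo to preview the commands instead of running them.
archive_urls_from_file() {
    url_file="$1"
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # </dev/null keeps the container from consuming the URL list.
        "${DOCKER:-docker}" run --rm \
            -v "$PWD/secrets:/app/secrets" \
            -v "$PWD/local_archive:/app/local_archive" \
            bellingcat/auto-archiver -- "$url" </dev/null
    done < "$url_file"
}
```

For example, `DOCKER=echo archive_urls_from_file urls.txt` prints one `run ...` line per URL without starting any container, which is a cheap way to check the file before a real run.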
Running a Local Install#
Example invocations#
Once all your local requirements are correctly installed, you can run Auto Archiver with the `auto-archiver` command:
# all the configurations come from ./secrets/orchestration.yaml
auto-archiver --config secrets/orchestration.yaml
# uses the same configurations but for another Google Sheets document
# with a header on row 2 and with some different column names
# notice that columns is a dictionary so you need to pass it as JSON and it will override only the values provided
auto-archiver --config secrets/orchestration.yaml --gsheet_feeder.sheet="use it on another sheets doc" --gsheet_feeder.header=2 --gsheet_feeder.columns='{"url": "link"}'
# all the configurations come from orchestration.yaml, but this specifies that s3 files should be private
auto-archiver --config secrets/orchestration.yaml --s3_storage.private=1
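The `--gsheet_feeder.columns` examples pass a dictionary as JSON, which depends on shell quoting: the single quotes around the value keep the inner double quotes intact. The sketch below, assuming a POSIX shell, only prints the argument exactly as the program would receive it. The variable name `columns` is chosen for this example.

```shell
# Single quotes keep the inner double quotes, so the value stays valid JSON.
columns='{"url": "link"}'

# Print the argument exactly as it would be passed on the command line.
printf '%s\n' "--gsheet_feeder.columns=$columns"
```

If the value were wrapped in double quotes instead, the inner quotes would need escaping (`"{\"url\": \"link\"}"`), which is easier to get wrong.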