core.orchestrator
=================

.. py:module:: core.orchestrator

.. autoapi-nested-parse::

   Orchestrates all archiving steps, including feeding items,
   archiving them with specific archivers, enrichment, storage,
   formatting, database operations, and cleanup.





Module Contents
---------------

.. py:class:: ArchivingOrchestrator

   .. py:attribute:: module_factory
      :type:  core.module.ModuleFactory


   .. py:attribute:: setup_finished
      :type:  bool


   .. py:attribute:: logger_id
      :type:  int


   .. py:attribute:: feeders
      :type:  List[Type[core.Feeder]]


   .. py:attribute:: extractors
      :type:  List[Type[core.Extractor]]


   .. py:attribute:: enrichers
      :type:  List[Type[core.Enricher]]


   .. py:attribute:: databases
      :type:  List[Type[core.Database]]


   .. py:attribute:: storages
      :type:  List[Type[core.Storage]]


   .. py:attribute:: formatters
      :type:  List[Type[core.Formatter]]


   .. py:method:: setup_basic_parser()


   .. py:method:: check_steps(config)


   .. py:method:: setup_complete_parser(basic_config: dict, yaml_config: dict, unused_args: list[str]) -> None


   .. py:method:: add_modules_args(parser: argparse.ArgumentParser = None)


   .. py:method:: add_additional_args(parser: argparse.ArgumentParser = None)


   .. py:method:: add_individual_module_args(modules: list[core.module.LazyBaseModule] = None, parser: argparse.ArgumentParser = None) -> None


   .. py:method:: show_help(basic_config: dict)


   .. py:method:: setup_logging(config)


   .. py:method:: install_modules(modules_by_type)

      Traverses all modules in ``steps`` and loads them into the orchestrator, storing them in the
      orchestrator's attributes (``self.feeders``, ``self.extractors``, etc.). If no modules of a
      certain type are loaded, the program exits with an error message.
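
      The behaviour described above can be illustrated with a minimal sketch. It is not the
      actual implementation: ``MODULE_TYPES``, ``install_modules_sketch``, the ``load_module``
      callable and the error text are all hypothetical stand-ins, and ``modules_by_type`` is
      assumed to map a type name to a list of module names.

      .. code-block:: python

         import sys

         # Hypothetical step types; the real orchestrator defines its own list.
         MODULE_TYPES = ("feeder", "extractor", "enricher", "database", "storage", "formatter")


         def install_modules_sketch(modules_by_type, load_module=lambda name: name):
             """Load every configured module and group the results by type."""
             installed = {}
             for module_type in MODULE_TYPES:
                 names = modules_by_type.get(module_type, [])
                 loaded = [load_module(name) for name in names]
                 if not loaded:
                     # Documented behaviour: exit with an error when a type is empty.
                     sys.exit(f"No {module_type} modules were loaded; check 'steps'.")
                 # Attribute-style key, e.g. "feeders", "storages".
                 installed[f"{module_type}s"] = loaded
             return installed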



   .. py:method:: load_config(config_file: str) -> dict


   .. py:method:: setup_config(args: list) -> dict

      Sets up the configuration file, merging the default config with the user's config.

      This method should only ever be run once.
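
      One plausible way to merge a default config with a user config is a recursive dictionary
      merge in which user values win. The sketch below shows that idea only; ``merge_config`` is
      a hypothetical helper, and whether the orchestrator merges deeply or shallowly is an
      assumption not confirmed by this page.

      .. code-block:: python

         from copy import deepcopy


         def merge_config(defaults: dict, user: dict) -> dict:
             """Return a new dict: defaults overridden by user values (hypothetical helper)."""
             merged = deepcopy(defaults)
             for key, value in user.items():
                 if isinstance(value, dict) and isinstance(merged.get(key), dict):
                     # Both sides are mappings: merge them key by key.
                     merged[key] = merge_config(merged[key], value)
                 else:
                     # Scalars, lists and new keys: the user's value replaces the default.
                     merged[key] = deepcopy(value)
             return merged

      Because the inputs are copied, neither the defaults nor the user's dictionary is mutated
      by the merge.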



   .. py:method:: check_for_updates()


   .. py:method:: setup(args: list)

      Performs all orchestrator setup: sets up the configuration and loads the modules.

      This method should only ever be called once.



   .. py:method:: cleanup() -> None


   .. py:method:: feed() -> Generator[core.metadata.Metadata]


   .. py:method:: feed_item(item: core.metadata.Metadata) -> core.metadata.Metadata

      Takes one item (URL) to archive and calls ``self.archive``. Additionally, it:

      - catches keyboard interruptions to do a clean exit
      - catches any unexpected error, logs it, and does a clean exit
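
      The error handling described above might look roughly like the following sketch. The
      function name, the ``archive`` and ``cleanup`` callables, the logger name and the exit
      codes are all assumptions made for illustration, as is the reading of "clean exit" as
      "run cleanup, then exit".

      .. code-block:: python

         import logging
         import sys

         logger = logging.getLogger("orchestrator_sketch")  # hypothetical logger name


         def feed_item_sketch(item, archive, cleanup):
             """Archive one item, exiting cleanly on interruption or unexpected errors."""
             try:
                 return archive(item)
             except KeyboardInterrupt:
                 # The user pressed Ctrl+C: tidy up before leaving.
                 logger.warning("Interrupted while archiving %r, exiting", item)
                 cleanup()
                 sys.exit(0)  # exit code is an assumption
             except Exception:
                 # Log the full traceback, then tidy up and leave.
                 logger.exception("Unexpected error while archiving %r", item)
                 cleanup()
                 sys.exit(1)  # exit code is an assumption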



   .. py:method:: archive(result: core.metadata.Metadata) -> Union[core.metadata.Metadata, None]

      Runs the archiving process for a single URL:

      1. Each archiver can sanitize its own URLs
      2. Check for cached results in Databases, and signal start to the databases
      3. Call Archivers until one succeeds
      4. Call Enrichers
      5. Store all downloaded/generated media
      6. Call selected Formatter and store formatted if needed
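
      The six steps can be sketched with plain dictionaries and callables. Everything here is a
      hypothetical stand-in rather than the real module interfaces: the ``sanitize``,
      ``download``, ``fetch`` and ``started`` keys, the shape of ``result``, and the choice to
      return early on a cached result are all assumptions.

      .. code-block:: python

         def archive_sketch(url, extractors, databases, enrichers, storages, formatter):
             """Walk one URL through the six documented steps (illustrative only)."""
             # 1. Each archiver may sanitize the URL.
             for extractor in extractors:
                 url = extractor["sanitize"](url)
             result = {"url": url, "media": [], "status": None}

             # 2. Look for a cached result, then signal the start to every database.
             for database in databases:
                 cached = database["fetch"](url)
                 if cached is not None:
                     return cached  # assumption: a cached result short-circuits the run
             for database in databases:
                 database["started"](url)

             # 3. Try archivers in order until one reports success.
             for extractor in extractors:
                 if extractor["download"](result):
                     result["status"] = extractor["name"]
                     break

             # 4. Enrichers add to the result in place.
             for enricher in enrichers:
                 enricher(result)

             # 5. Store all downloaded or generated media.
             for storage in storages:
                 for media in result["media"]:
                     storage(media)

             # 6. Format the result and store the formatted output if there is one.
             final = formatter(result)
             if final is not None:
                 for storage in storages:
                     storage(final)
                 result["final"] = final
             return result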



   .. py:method:: setup_authentication(config: dict) -> dict

      Sets up authentication for all modules that require it.

      Comma-separated site strings are split into multiple sites.
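
      The comma splitting can be sketched as below. This assumes, without confirmation from
      this page, that the authentication config maps a site string to a credentials object;
      ``split_authentication_sites`` is a hypothetical helper name.

      .. code-block:: python

         def split_authentication_sites(authentication: dict) -> dict:
             """Expand keys such as "a.com,b.com" into one entry per site."""
             expanded = {}
             for sites, credentials in authentication.items():
                 for site in sites.split(","):
                     site = site.strip()
                     if site:  # skip empty fragments left by stray commas
                         expanded[site] = credentials
             return expanded

      In this sketch every site from one comma-separated key shares the same credentials
      object.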



   .. py:property:: all_modules
      :type: List[Type[core.base_module.BaseModule]]



