core.orchestrator
=================

.. py:module:: core.orchestrator

.. autoapi-nested-parse::

   Orchestrates all archiving steps, including feeding items,
   archiving them with specific archivers, enrichment, storage,
   formatting, database operations and clean up.







Module Contents
---------------

.. py:data:: DEFAULT_CONFIG_FILE
   :value: 'orchestration.yaml'


.. py:class:: JsonParseAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

   Bases: :py:obj:`argparse.Action`


   Information about how to convert command line strings to Python objects.

   Action objects are used by an ArgumentParser to represent the information
   needed to parse a single argument from one or more strings from the
   command line. The keyword arguments to the Action constructor are also
   all attributes of Action instances.

   :keyword option_strings: A list of command-line option strings which should be
      associated with this action.
   :keyword dest: The name of the attribute to hold the created object(s).
   :keyword nargs: The number of command-line arguments that should be consumed.
      By default, one argument will be consumed and a single value will be
      produced. Other values include:

      - N (an integer) consumes N arguments (and produces a list)
      - ``'?'`` consumes zero or one arguments
      - ``'*'`` consumes zero or more arguments (and produces a list)
      - ``'+'`` consumes one or more arguments (and produces a list)

      Note that the difference between the default and ``nargs=1`` is that with
      the default, a single value will be produced, while with ``nargs=1``, a
      list containing a single value will be produced.
   :keyword const: The value to be produced if the option is specified and the
      option uses an action that takes no values.
   :keyword default: The value to be produced if the option is not specified.
   :keyword type: A callable that accepts a single string argument, and returns
      the converted value. The standard Python types str, int, float, and
      complex are useful examples of such callables. If None, str is used.
   :keyword choices: A container of values that should be allowed. If not None,
      after a command-line argument has been converted to the appropriate type,
      an exception will be raised if it is not a member of this collection.
   :keyword required: True if the action must always be specified at the command
      line. This is only meaningful for optional command-line arguments.
   :keyword help: The help string describing the argument.
   :keyword metavar: The name to be used for the option's argument with the help
      string. If None, the 'dest' value will be used as the name.
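
   The class name suggests an action that decodes an option's value from JSON.
   The following is only a minimal sketch of that idea, not this module's
   implementation: ``JsonArgSketch`` and the ``--settings`` option are
   hypothetical names introduced for illustration.

   .. code-block:: python

      import argparse
      import json


      class JsonArgSketch(argparse.Action):
          """Hypothetical action: parse the option's string value as JSON."""

          def __call__(self, parser, namespace, values, option_string=None):
              try:
                  parsed = json.loads(values)
              except json.JSONDecodeError as exc:
                  # parser.error() prints the usage message and exits with status 2.
                  parser.error(f"{option_string}: invalid JSON ({exc})")
              setattr(namespace, self.dest, parsed)


      parser = argparse.ArgumentParser()
      parser.add_argument("--settings", action=JsonArgSketch, default={})
      args = parser.parse_args(["--settings", '{"retries": 3}'])

   After parsing, ``args.settings`` holds a Python ``dict`` rather than the raw
   string.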


.. py:class:: AuthenticationJsonParseAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

   Bases: :py:obj:`JsonParseAction`


   Information about how to convert command line strings to Python objects.

   Action objects are used by an ArgumentParser to represent the information
   needed to parse a single argument from one or more strings from the
   command line. The keyword arguments to the Action constructor are also
   all attributes of Action instances.

   :keyword option_strings: A list of command-line option strings which should be
      associated with this action.
   :keyword dest: The name of the attribute to hold the created object(s).
   :keyword nargs: The number of command-line arguments that should be consumed.
      By default, one argument will be consumed and a single value will be
      produced. Other values include:

      - N (an integer) consumes N arguments (and produces a list)
      - ``'?'`` consumes zero or one arguments
      - ``'*'`` consumes zero or more arguments (and produces a list)
      - ``'+'`` consumes one or more arguments (and produces a list)

      Note that the difference between the default and ``nargs=1`` is that with
      the default, a single value will be produced, while with ``nargs=1``, a
      list containing a single value will be produced.
   :keyword const: The value to be produced if the option is specified and the
      option uses an action that takes no values.
   :keyword default: The value to be produced if the option is not specified.
   :keyword type: A callable that accepts a single string argument, and returns
      the converted value. The standard Python types str, int, float, and
      complex are useful examples of such callables. If None, str is used.
   :keyword choices: A container of values that should be allowed. If not None,
      after a command-line argument has been converted to the appropriate type,
      an exception will be raised if it is not a member of this collection.
   :keyword required: True if the action must always be specified at the command
      line. This is only meaningful for optional command-line arguments.
   :keyword help: The help string describing the argument.
   :keyword metavar: The name to be used for the option's argument with the help
      string. If None, the 'dest' value will be used as the name.


.. py:class:: UniqueAppendAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

   Bases: :py:obj:`argparse.Action`


   Information about how to convert command line strings to Python objects.

   Action objects are used by an ArgumentParser to represent the information
   needed to parse a single argument from one or more strings from the
   command line. The keyword arguments to the Action constructor are also
   all attributes of Action instances.

   :keyword option_strings: A list of command-line option strings which should be
      associated with this action.
   :keyword dest: The name of the attribute to hold the created object(s).
   :keyword nargs: The number of command-line arguments that should be consumed.
      By default, one argument will be consumed and a single value will be
      produced. Other values include:

      - N (an integer) consumes N arguments (and produces a list)
      - ``'?'`` consumes zero or one arguments
      - ``'*'`` consumes zero or more arguments (and produces a list)
      - ``'+'`` consumes one or more arguments (and produces a list)

      Note that the difference between the default and ``nargs=1`` is that with
      the default, a single value will be produced, while with ``nargs=1``, a
      list containing a single value will be produced.
   :keyword const: The value to be produced if the option is specified and the
      option uses an action that takes no values.
   :keyword default: The value to be produced if the option is not specified.
   :keyword type: A callable that accepts a single string argument, and returns
      the converted value. The standard Python types str, int, float, and
      complex are useful examples of such callables. If None, str is used.
   :keyword choices: A container of values that should be allowed. If not None,
      after a command-line argument has been converted to the appropriate type,
      an exception will be raised if it is not a member of this collection.
   :keyword required: True if the action must always be specified at the command
      line. This is only meaningful for optional command-line arguments.
   :keyword help: The help string describing the argument.
   :keyword metavar: The name to be used for the option's argument with the help
      string. If None, the 'dest' value will be used as the name.
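
   Judging by its name, this action appends values while skipping duplicates.
   The sketch below shows one way such an action could be written with the
   standard ``argparse`` API. It is an assumption about the behaviour, not the
   module's code, and ``UniqueAppendSketch`` and ``--tag`` are hypothetical
   names.

   .. code-block:: python

      import argparse


      class UniqueAppendSketch(argparse.Action):
          """Hypothetical action: append values, keeping first occurrences only."""

          def __call__(self, parser, namespace, values, option_string=None):
              # Copy the current list so the parser's default is never mutated.
              current = list(getattr(namespace, self.dest, None) or [])
              # values is a list when nargs is set, a single string otherwise.
              incoming = values if isinstance(values, list) else [values]
              for value in incoming:
                  if value not in current:
                      current.append(value)
              setattr(namespace, self.dest, current)


      parser = argparse.ArgumentParser()
      parser.add_argument("--tag", action=UniqueAppendSketch, nargs="+", default=[])
      args = parser.parse_args(["--tag", "a", "b", "--tag", "b", "c"])

   Repeating ``--tag`` accumulates values across occurrences, and the second
   ``b`` is dropped while insertion order is preserved.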


.. py:class:: ArchivingOrchestrator

   .. py:attribute:: feeders
      :type:  List[Type[core.Feeder]]


   .. py:attribute:: extractors
      :type:  List[Type[core.Extractor]]


   .. py:attribute:: enrichers
      :type:  List[Type[core.Enricher]]


   .. py:attribute:: databases
      :type:  List[Type[core.Database]]


   .. py:attribute:: storages
      :type:  List[Type[core.Storage]]


   .. py:attribute:: formatters
      :type:  List[Type[core.Formatter]]


   .. py:method:: setup_basic_parser()


   .. py:method:: setup_complete_parser(basic_config: dict, yaml_config: dict, unused_args: list[str]) -> None


   .. py:method:: add_additional_args(parser: argparse.ArgumentParser = None)


   .. py:method:: add_module_args(modules: list[core.module.LazyBaseModule] = None, parser: argparse.ArgumentParser = None) -> None


   .. py:method:: show_help(basic_config: dict)


   .. py:method:: setup_logging()


   .. py:method:: install_modules(modules_by_type)

      Traverses all modules in 'steps' and loads them into the orchestrator, storing them in the
      orchestrator's attributes (self.feeders, self.extractors etc.). If no modules of a certain type
      are loaded, the program will exit with an error message.



   .. py:method:: load_config(config_file: str) -> dict


   .. py:method:: run(args: list) -> None


   .. py:method:: cleanup() -> None


   .. py:method:: feed() -> Generator[core.metadata.Metadata]


   .. py:method:: feed_item(item: core.metadata.Metadata) -> core.metadata.Metadata

      Takes one item (URL) to archive and calls ``self.archive``. Additionally, it:

      - catches keyboard interruptions to do a clean exit
      - catches any unexpected error, logs it, and does a clean exit
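
      The error handling described above can be sketched roughly as follows.
      This is an illustration under stated assumptions, not the method's
      source: ``feed_item_sketch`` is a hypothetical free function, ``archive``
      and ``cleanup`` are passed in as plain callables, and "clean exit" is
      assumed to mean running cleanup and then raising ``SystemExit``.

      .. code-block:: python

         import logging

         logger = logging.getLogger("orchestrator_sketch")


         def feed_item_sketch(item, archive, cleanup):
             """Archive one item, exiting cleanly on interrupt or unexpected error."""
             try:
                 return archive(item)
             except KeyboardInterrupt:
                 # Assumption: Ctrl+C triggers cleanup and a non-zero exit.
                 logger.warning("interrupted while archiving %r; cleaning up", item)
                 cleanup()
                 raise SystemExit(1)
             except Exception:
                 # logger.exception records the message together with the traceback.
                 logger.exception("unexpected error while archiving %r", item)
                 cleanup()
                 raise SystemExit(1)

      Because ``KeyboardInterrupt`` does not derive from ``Exception``, it needs
      its own ``except`` clause.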



   .. py:method:: archive(result: core.metadata.Metadata) -> Union[core.metadata.Metadata, None]

      Runs the archiving process for a single URL:

      1. Each archiver can sanitize its own URLs.
      2. Check for cached results in Databases, and signal start to the databases.
      3. Call Archivers until one succeeds.
      4. Call Enrichers.
      5. Store all downloaded/generated media.
      6. Call the selected Formatter and store the formatted output if needed.
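
      The six steps above can be sketched as a small pipeline. Everything in
      this example is hypothetical: ``archive_sketch`` and its parameters are
      illustrative names, the result is a plain ``dict`` rather than a
      ``Metadata`` object, and the choice to keep going when no archiver
      succeeds is an assumption.

      .. code-block:: python

         def archive_sketch(url, sanitizers, cache_lookup, extractors,
                            enrichers, store, formatter):
             """Run a simplified version of the six archiving steps for one URL."""
             # 1. Let each archiver-specific sanitizer rewrite the URL.
             for sanitize in sanitizers:
                 url = sanitize(url)
             # 2. Return early when a database already holds a result.
             cached = cache_lookup(url)
             if cached is not None:
                 return cached
             result = {"url": url, "status": "nothing archived"}
             # 3. Try archivers in order and stop at the first success.
             for extract in extractors:
                 if extract(result):
                     break
             # 4. Enrichers add to the result in place.
             for enrich in enrichers:
                 enrich(result)
             # 5. Persist downloaded or generated media.
             store(result)
             # 6. Produce the formatted output and keep it with the result.
             result["formatted"] = formatter(result)
             return result


         result = archive_sketch(
             "https://example.com/?utm=1",
             sanitizers=[lambda u: u.split("?")[0]],
             cache_lookup=lambda u: None,
             extractors=[lambda r: False, lambda r: r.update(status="ok") or True],
             enrichers=[lambda r: r.update(title="Example")],
             store=lambda r: None,
             formatter=lambda r: f"{r['url']} [{r['status']}]",
         )

      Here the first stand-in archiver reports failure, so the second one runs
      and marks the result as successful before enrichment and formatting.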



   .. py:method:: assert_valid_url(url: str) -> bool

      Blocks localhost, private, reserved, and link-local IPs and all non-http/https schemes.
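
      A check of this kind can be approximated with the standard library. The
      sketch below is not the method's implementation: ``is_public_http_url``
      is a hypothetical helper, it only inspects IP literals and the name
      ``localhost``, and it does not resolve hostnames through DNS.

      .. code-block:: python

         import ipaddress
         from urllib.parse import urlparse


         def is_public_http_url(url: str) -> bool:
             """Return True for http(s) URLs that do not target a blocked address."""
             parsed = urlparse(url)
             if parsed.scheme not in ("http", "https"):
                 return False
             host = parsed.hostname
             if not host or host == "localhost":
                 return False
             try:
                 ip = ipaddress.ip_address(host)
             except ValueError:
                 # Not an IP literal. This sketch accepts hostnames without
                 # resolving them, which a stricter check might not.
                 return True
             return not (ip.is_loopback or ip.is_private
                         or ip.is_reserved or ip.is_link_local)

      For example, ``is_public_http_url("http://127.0.0.1/")`` and
      ``is_public_http_url("ftp://example.com")`` both return ``False``, while
      ``is_public_http_url("https://example.com/page")`` returns ``True``.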



   .. py:property:: all_modules
      :type: List[Type[core.module.BaseModule]]



