core.base_module
================

.. py:module:: core.base_module




Module Contents
---------------

.. py:class:: BaseModule

   Bases: :py:obj:`abc.ABC`


   Base class for all modules. Every module should inherit from this class.

   The methods a subclass implements depend on its module type. Any module can
   also define a ``setup()`` method to run setup code
   (e.g. logging in to a site or spinning up a browser).

   See ``consts.MODULE_TYPES`` for the types of module you can create. A subclass
   can have multiple types: for example, a module that extracts data from
   a website and stores it in a database is both an 'extractor' and a 'database' module.

   Each module is a Python package and should have a ``__manifest__.py`` file in the
   same directory as the module file. The ``__manifest__.py`` specifies module information
   such as name, author, version and dependencies. See ``DEFAULT_MANIFEST`` for the
   default manifest structure.



   .. py:attribute:: MODULE_TYPES
      :value: ['feeder', 'extractor', 'enricher', 'database', 'storage', 'formatter']



   .. py:attribute:: config
      :type:  Mapping[str, Any]


   .. py:attribute:: authentication
      :type:  Mapping[str, Mapping[str, str]]


   .. py:attribute:: name
      :type:  str


   .. py:attribute:: module_factory
      :type:  core.module.ModuleFactory


   .. py:attribute:: tmp_dir
      :type:  tempfile.TemporaryDirectory
      :value: None



   .. py:property:: storages
      :type: list



   .. py:method:: config_setup(config: dict)


   .. py:method:: setup()


   .. py:method:: auth_for_site(site: str, extract_cookies=True) -> Mapping[str, Any]

      Returns the authentication information for a given site, which is used to authenticate
      with the site before extracting data. The site should be given as its domain, e.g. 'twitter.com'.

      :param site: the domain of the site to get authentication information for
      :param extract_cookies: whether to extract cookies from the given browser or file and return the cookie jar (disabling this can speed up processing if you don't need the cookie jar).

      :returns: authdict, a dict of login information for the given site

      **Global options:**

      * cookies_from_browser: str - the name of the browser to extract cookies from (e.g. 'chrome', 'firefox'); uses yt-dlp under the hood to extract them

      * cookies_file: str - the path to a cookies file to use for login


      **Currently, the dict for each site can have the following keys:**

      * username: str - the username to use for login

      * password: str - the password to use for login

      * api_key: str - the API key to use for login

      * api_secret: str - the API secret to use for login

      * cookie: str - a cookie string to use for login (specific to this site)

      * cookies_file: str - the path to a cookies file to use for login (specific to this site)

      * cookies_from_browser: str - the name of the browser to extract cookies from (specific to this site)




   .. py:method:: repr()
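
The subclassing pattern described for ``BaseModule`` above can be illustrated with a minimal, self-contained sketch. It does not import the real class: ``StandInBaseModule``, ``ExampleSiteExtractor``, ``EXAMPLE_MANIFEST`` and ``unknown_types`` are hypothetical names, the manifest keys are assumptions (``DEFAULT_MANIFEST`` defines the actual structure), and the idea that ``config_setup()`` stores the configuration and then calls ``setup()`` is an assumption made for illustration.

```python
from abc import ABC
from typing import Any, Mapping

# Module types copied from the MODULE_TYPES attribute documented above.
MODULE_TYPES = ["feeder", "extractor", "enricher", "database", "storage", "formatter"]

# Hypothetical __manifest__.py content. The key names are assumptions;
# DEFAULT_MANIFEST defines the real structure.
EXAMPLE_MANIFEST = {
    "name": "Example Site Extractor",
    "author": "Jane Doe",
    "version": "0.1.0",
    "type": ["extractor", "database"],
    "dependencies": ["requests"],
}


class StandInBaseModule(ABC):
    """Simplified stand-in for BaseModule so the sketch runs on its own."""

    name: str = "stand_in"
    config: Mapping[str, Any]

    def config_setup(self, config: dict) -> None:
        # Assumed behaviour: keep the configuration, then run setup().
        self.config = config
        self.setup()

    def setup(self) -> None:
        # Optional hook; subclasses override it when they need setup code.
        pass


class ExampleSiteExtractor(StandInBaseModule):
    """Hypothetical module that is both an 'extractor' and a 'database'."""

    name = "example_site_extractor"

    def setup(self) -> None:
        # One-off preparation, e.g. logging in to a site or starting a browser.
        self.session_ready = True


def unknown_types(manifest: Mapping[str, Any]) -> list:
    """Return the manifest types that are not listed in MODULE_TYPES."""
    return [t for t in manifest.get("type", []) if t not in MODULE_TYPES]


module = ExampleSiteExtractor()
module.config_setup({"example_site_extractor": {"timeout": 10}})
```

A check such as ``unknown_types`` is one way to catch a misspelled module type early. Whether the real loader validates manifests this way is not stated here.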


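The options listed for ``auth_for_site`` suggest an authentication mapping keyed by site domain, with the global cookie options at the top level. The sketch below shows one way such a lookup could behave. ``AUTHENTICATION``, ``GLOBAL_OPTIONS`` and ``lookup_auth`` are hypothetical names, the credentials are placeholders, and the rule that site-specific values override global ones is an assumption rather than documented behaviour. Cookie extraction itself is left out.

```python
from typing import Any, Mapping

# Hypothetical authentication mapping: one entry per site domain, plus the
# global cookie options at the top level. All values are placeholders.
AUTHENTICATION = {
    "cookies_file": "secrets/global_cookies.txt",
    "twitter.com": {"username": "archiver", "password": "not-a-real-password"},
    "example.org": {"api_key": "KEY", "cookies_file": "secrets/example_cookies.txt"},
}

# Names of the global options documented above.
GLOBAL_OPTIONS = ("cookies_from_browser", "cookies_file")


def lookup_auth(authentication: Mapping[str, Any], site: str) -> dict:
    """Return the login information for ``site``.

    Assumption: global cookie options apply to every site, and a
    site-specific value overrides the global one with the same key.
    """
    # Start from whichever global options are present.
    authdict = {key: authentication[key] for key in GLOBAL_OPTIONS if key in authentication}
    # Layer the site-specific entries on top (empty if the site is unknown).
    authdict.update(authentication.get(site, {}))
    return authdict


twitter_auth = lookup_auth(AUTHENTICATION, "twitter.com")
example_auth = lookup_auth(AUTHENTICATION, "example.org")
```

Here ``twitter_auth`` carries the global ``cookies_file`` alongside the username and password, while ``example_auth`` uses its own site-specific ``cookies_file``.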
