antibot_extractor_enricher.dropins.reddit

Module Contents

class antibot_extractor_enricher.dropins.reddit.RedditDropin(sb: seleniumbase.SB, extractor: auto_archiver.core.Extractor)

Bases: auto_archiver.modules.antibot_extractor_enricher.dropin.Dropin

A class to handle Reddit drop-in functionality for the antibot extractor enricher module.

documentation() → Mapping[str, str]: Each Dropin should auto-document itself with this method. The returned dictionary can include:
- 'name': A string representing the name of the dropin.
- 'description': A string describing the functionality of the dropin.
- 'site': A string representing the site this dropin is for.
- 'authentication': A dictionary with an authentication example for the site.
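As an illustration, a dropin's documentation() could return a mapping along the following lines. Every value here is a placeholder rather than what RedditDropin actually returns, and the shape of the 'authentication' entry is an assumption, since the description only says it is a dictionary with an authentication example.

```python
from typing import Any, Mapping


def documentation() -> Mapping[str, Any]:
    """Hypothetical self-documentation for a dropin (placeholder values)."""
    return {
        "name": "Example Dropin",
        "description": "Opens pages on example.com and collects their media.",
        "site": "example.com",
        # Assumed shape: the source only says this is a dictionary holding
        # an authentication example for the site.
        "authentication": {
            "example.com": {
                "username": "your-username",
                "password": "your-password",
            },
        },
    }
```

The sketch annotates the return value as Mapping[str, Any] because the 'authentication' entry is itself a dictionary.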

static suitable(url: str) → bool: Check if the URL is suitable for processing with this dropin.
:param url: The URL to check.
:return: True if the URL is suitable for processing, False otherwise.
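A suitability check of this kind can be sketched as a hostname test. This is not the actual RedditDropin implementation; the set of accepted hosts below is an assumption made for the example.

```python
from urllib.parse import urlparse


def suitable(url: str) -> bool:
    """Return True if the URL's host looks like a Reddit host (assumed list)."""
    host = (urlparse(url).hostname or "").lower()
    # Assumed hosts: reddit.com, any of its subdomains, and the redd.it shortener.
    return host == "reddit.com" or host.endswith(".reddit.com") or host == "redd.it"
```

Comparing the parsed hostname, rather than searching the raw URL string, avoids false positives such as https://notreddit.com/ or a Reddit link embedded in another site's query string.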

static images_selectors() → str: CSS selector to find images in the HTML page.

static video_selectors() → str: CSS selector to find videos in the HTML page.
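Both methods return a single string, so several selectors can be combined as a comma-separated CSS selector list, which matches an element if any one of its entries matches. The selectors below are generic placeholders, not the ones RedditDropin uses.

```python
def images_selectors() -> str:
    # Hypothetical selector: every <img> element on the page.
    return "img"


def video_selectors() -> str:
    # Hypothetical selector list: <video> elements and their <source> children.
    return "video, video source"
```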

open_page(url) → bool: Make sure the page is opened, even if it requires authentication, captcha solving, etc.
:param url: The URL to open.
:return: True if success, False otherwise.

add_extra_media(to_enrich: auto_archiver.core.metadata.Metadata) → tuple[int, int]: Extract image and/or video data from the currently open post with SeleniumBase. Media is added to the to_enrich Metadata object.
:return: A tuple (number of images added, number of videos added).
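The return contract, a pair of counts alongside media appended to the metadata object, can be sketched without a browser. In this simplified version the media URLs are passed in directly, whereas the real method takes only to_enrich and reads from the open page. StubMetadata, the urls parameter, and the extension lists are all stand-ins invented for the example, not part of auto_archiver.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Assumed extension lists, for illustration only.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".webp")
VIDEO_EXTS = (".mp4", ".webm", ".mov")


@dataclass
class StubMetadata:
    """Stand-in for auto_archiver.core.metadata.Metadata (not the real class)."""
    media: list = field(default_factory=list)


def add_extra_media(urls, to_enrich: StubMetadata) -> tuple:
    """Append recognised media URLs to to_enrich; return (images, videos)."""
    images = videos = 0
    for url in urls:
        # Classify by the path's extension, ignoring any query string.
        path = urlparse(url).path.lower()
        if path.endswith(IMAGE_EXTS):
            images += 1
        elif path.endswith(VIDEO_EXTS):
            videos += 1
        else:
            continue  # neither image nor video: skip it
        to_enrich.media.append(url)
    return images, videos
```

A caller can unpack the result directly, for example `images, videos = add_extra_media(urls, metadata)`, and log both counts.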