generic_extractor#

Submodules#

Package Contents#

class generic_extractor.GenericExtractor#

Bases: auto_archiver.core.extractor.Extractor

Base class for implementing extractors in the media archiving framework. Subclasses must implement the download method to define platform-specific behavior.

setup()#
update_ytdlp()#
suitable_extractors(url: str) → Generator[str, None, None]#

Yields the valid extractors for the given URL. The result is a generator, not a list.

suitable(url: str) → bool#

Checks whether any yt-dlp extractor is valid for the URL. Returns False for GenericIE, which yt-dlp itself labels ‘Generic downloader that works on some sites’.
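The relationship between suitable_extractors and suitable can be pictured with a minimal sketch. It does not import yt-dlp: FakeVideoSiteIE, FakeGenericIE and ALL_EXTRACTORS are hypothetical stand-ins, and the functions below are free functions rather than methods. It assumes only that each extractor exposes a suitable(url) classmethod and an IE_NAME attribute, and that the generic extractor is skipped while iterating. The real implementation may place that check elsewhere.

```python
import re
from typing import Generator, Iterable


class FakeVideoSiteIE:
    """Hypothetical stand-in for a site-specific yt-dlp extractor."""

    IE_NAME = "fakevideosite"
    _VALID_URL = r"https?://video\.example\.com/watch/\w+"

    @classmethod
    def suitable(cls, url: str) -> bool:
        return re.match(cls._VALID_URL, url) is not None


class FakeGenericIE:
    """Hypothetical stand-in for the catch-all generic extractor."""

    IE_NAME = "generic"

    @classmethod
    def suitable(cls, url: str) -> bool:
        # Accepts almost any web URL, which is why it is not a useful signal.
        return url.startswith(("http://", "https://"))


ALL_EXTRACTORS = (FakeVideoSiteIE, FakeGenericIE)


def suitable_extractors(
    url: str, extractors: Iterable[type] = ALL_EXTRACTORS
) -> Generator[str, None, None]:
    """Lazily yield the name of each non-generic extractor that accepts the URL."""
    for extractor in extractors:
        if extractor.IE_NAME == "generic":
            continue  # assumption: the generic extractor never counts as a match
        if extractor.suitable(url):
            yield extractor.IE_NAME


def suitable(url: str) -> bool:
    """Return True if at least one non-generic extractor accepts the URL."""
    return next(suitable_extractors(url), None) is not None
```

Because suitable_extractors is a generator, suitable can stop at the first match instead of testing every extractor.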

download_additional_media(video_data: dict, info_extractor: yt_dlp.extractor.common.InfoExtractor, metadata: auto_archiver.core.Metadata) → auto_archiver.core.Metadata#

Downloads additional media like images, comments, subtitles, etc.

Creates a ‘media’ object and attaches it to the metadata object.

keys_to_clean(info_extractor: yt_dlp.extractor.common.InfoExtractor, video_data: dict) → dict#

Cleans up the generic yt-dlp video data to make it more readable by removing the unnecessary keys that yt-dlp adds.
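The kind of cleanup described above might look like the following sketch. The function name clean_video_data and the NOISY_KEYS set are hypothetical. The keys listed are ones yt-dlp commonly includes in its info dict, but which keys the extractor actually drops is an assumption here.

```python
# Hypothetical selection of bulky keys; the real list may differ.
NOISY_KEYS = frozenset({"formats", "thumbnails", "requested_formats", "http_headers"})


def clean_video_data(video_data: dict, noisy_keys: frozenset = NOISY_KEYS) -> dict:
    """Return a copy of video_data without the bulky keys; the input is left untouched."""
    return {key: value for key, value in video_data.items() if key not in noisy_keys}


raw = {
    "id": "abc123",
    "title": "Example clip",
    "formats": [{"format_id": "18"}, {"format_id": "22"}],
    "http_headers": {"User-Agent": "example"},
}
cleaned = clean_video_data(raw)
```

Returning a new dict rather than deleting keys in place keeps the raw data available to any later step that still needs it.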

add_metadata(video_data: dict, info_extractor: yt_dlp.extractor.common.InfoExtractor, url: str, result: auto_archiver.core.Metadata) → auto_archiver.core.Metadata#

Creates a Metadata object from the given video_data

get_metadata_for_post(info_extractor: Type[yt_dlp.extractor.common.InfoExtractor], url: str, ydl: yt_dlp.YoutubeDL) → auto_archiver.core.Metadata#

Calls into the yt-dlp InfoExtractor subclass and uses its private _extract_post method to get the post metadata.

get_metadata_for_video(data: dict, info_extractor: Type[yt_dlp.extractor.common.InfoExtractor], url: str, ydl: yt_dlp.YoutubeDL) → auto_archiver.core.Metadata#
dropin_for_name(dropin_name: str, additional_paths=[], package=__package__) → generic_extractor.dropin.GenericDropin#
download_for_extractor(info_extractor: yt_dlp.extractor.common.InfoExtractor, url: str, ydl: yt_dlp.YoutubeDL) → auto_archiver.core.Metadata#

Tries to download the given url using the specified extractor

It first tries to use ytdlp directly to download the video. If the post is not a video, it will then try to use the extractor’s _extract_post method to get the post metadata if possible.
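The video-first, post-second order described above can be sketched as a small control-flow example. Everything here is a stand-in: NotAVideoError, download_with_fallback and the two callables are hypothetical and do not mirror yt-dlp's real exception types or the extractor's actual signatures.

```python
from typing import Callable, Optional, Union


class NotAVideoError(Exception):
    """Hypothetical signal that the URL points at a post without a video."""


def download_with_fallback(
    url: str,
    download_video: Callable[[str], dict],
    extract_post: Optional[Callable[[str], dict]] = None,
) -> Union[dict, bool]:
    """Try the video path first; fall back to post metadata when one is available."""
    try:
        return {"source": "video", "data": download_video(url)}
    except NotAVideoError:
        if extract_post is None:
            # No post extraction exists for this extractor, so give up.
            return False
        return {"source": "post", "data": extract_post(url)}
```

Passing the two steps in as callables keeps the sketch independent of any real extractor and makes each branch easy to exercise separately.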

download(item: auto_archiver.core.Metadata) → auto_archiver.core.Metadata#

Downloads the media from the given URL and returns a Metadata object with the downloaded media.

If the URL is not supported or the download fails, this method should return False instead of a Metadata object.
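Since download signals failure with False rather than by raising, a caller has to check the result explicitly. The sketch below shows one way to do that; first_successful and the downloader callables are hypothetical, and plain dicts stand in for Metadata objects.

```python
from typing import Callable, Iterable, Union


def first_successful(
    item: dict, downloaders: Iterable[Callable[[dict], Union[dict, bool]]]
) -> Union[dict, bool]:
    """Return the first result that is not False, or False if every downloader fails."""
    for download in downloaders:
        result = download(item)
        # Compare against False explicitly so that an empty result still counts.
        if result is not False:
            return result
    return False
```

The identity check against False is deliberate: a truthiness test would also reject a valid but empty result.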