Telegram Extractor
Module type
The TelegramExtractor retrieves publicly available media content from Telegram message links without requiring login credentials.
It processes URLs to fetch the images and videos embedded in Telegram messages and returns structured output as Metadata
and Media objects. It is recommended for scenarios where login-based archiving is not viable. For more comprehensive
functionality and higher-quality media extraction, telethon_archiver is advised instead.
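As a rough illustration of the URL-processing step, the sketch below recognizes public message links on t.me and rewrites them to an embeddable form. It is not the extractor's actual code: the function name `to_embed_url`, the path pattern, and the use of the `embed=1` query parameter to request a login-free message view are all assumptions made for this example.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumed shape of a public message link: t.me/<channel>/<message_id>,
# optionally with the "/s/" preview prefix. Private "t.me/c/..." links do not match.
_MESSAGE_PATH = re.compile(r"^/(?:s/)?(?P<channel>[A-Za-z0-9_]+)/(?P<post>\d+)/?$")


def to_embed_url(url: str) -> Optional[str]:
    """Return an embeddable URL for a public t.me message link, or None if it is not one."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return None
    if parts.netloc.lower() not in ("t.me", "www.t.me"):
        return None
    match = _MESSAGE_PATH.match(parts.path)
    if match is None:
        return None
    path = f"/{match['channel']}/{match['post']}"
    # Assumption: "embed=1" asks for the standalone message widget.
    return urlunsplit(("https", "t.me", path, "embed=1", ""))
```

Returning `None` for anything that is not a public message link lets a caller skip the URL and hand it to another extractor, rather than raising an error.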
Features
Extracts images and videos from public Telegram message links (t.me).
Processes HTML content of messages to retrieve embedded media.
Sets structured metadata, including timestamps, content, and media details.
Does not require user authentication for Telegram.
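The features above can be pictured with a small parsing sketch that uses only the Python standard library. It is a sketch under stated assumptions, not the module's implementation. It assumes that videos appear as `<video src="...">` tags, that images are carried in inline `background-image: url(...)` styles, and that the timestamp sits in a `<time datetime="...">` tag. The names `MessageMediaParser` and `extract_media` and the sample HTML are made up for illustration. The real extractor wraps its results in Metadata and Media objects, whereas this example returns a plain dictionary.

```python
import re
from html.parser import HTMLParser

# Matches the URL inside an inline "background-image: url(...)" style declaration.
_BG_URL = re.compile(r"background-image:\s*url\(['\"]?(?P<url>[^'\")]+)['\"]?\)")


class MessageMediaParser(HTMLParser):
    """Collects image URLs, video URLs, and the first timestamp from message HTML."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.videos = []
        self.timestamp = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "video" and attributes.get("src"):
            self.videos.append(attributes["src"])
        elif tag == "time" and attributes.get("datetime") and self.timestamp is None:
            self.timestamp = attributes["datetime"]
        else:
            # Attribute values can be None, so fall back to an empty string.
            match = _BG_URL.search(attributes.get("style") or "")
            if match:
                self.images.append(match["url"])


def extract_media(html):
    """Parse message HTML and return the media URLs and timestamp found in it."""
    parser = MessageMediaParser()
    parser.feed(html)
    parser.close()
    return {"images": parser.images, "videos": parser.videos, "timestamp": parser.timestamp}


# Made-up sample markup; real message pages may be structured differently.
SAMPLE_HTML = """
<div>
  <a style="width:400px;background-image:url('https://example.org/photo.jpg')"></a>
  <video src="https://example.org/clip.mp4"></video>
  <time datetime="2024-01-02T03:04:05+00:00">03:04</time>
</div>
"""

result = extract_media(SAMPLE_HTML)
```

Keeping the parser independent of specific CSS class names makes the sketch less brittle if the page markup changes, at the cost of possibly picking up unrelated background images.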
# steps configuration
steps:
  ...
  extractors:
  - telegram_extractor
  ...

# module configuration
...

# No configuration options for telegram_extractor.*