core.metadata

Acts as a container for metadata and media objects associated with an archived item.

Key Functionalities:

- Store and retrieve metadata and associated media.
- Merge metadata objects with conflict resolution.
- Validate properties like URLs and timestamps.
- Manage and deduplicate media objects.
- Support flexible metadata querying and appending.
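The functionalities listed above can be illustrated with a small stand-in. This is a minimal sketch, not the real `core.metadata.Metadata`: `MiniMetadata` and `MiniMedia` are hypothetical names, and the deduplication rule (keep the first item seen for each non-empty hash) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class MiniMedia:
    """Hypothetical stand-in for core.media.Media: a file name plus a content hash."""
    filename: str
    hash: str = ""


@dataclass
class MiniMetadata:
    """Hypothetical, simplified container; not the real core.metadata.Metadata."""
    status: str = "no archiver"
    metadata: Dict[str, Any] = field(default_factory=dict)
    media: List[MiniMedia] = field(default_factory=list)

    def set(self, key: str, val: Any) -> "MiniMetadata":
        # Returning self allows calls to be chained.
        self.metadata[key] = val
        return self

    def append(self, key: str, val: Any) -> "MiniMetadata":
        # Collect repeated values for a key in a list.
        self.metadata.setdefault(key, []).append(val)
        return self

    def get(self, key: str, default: Any = None) -> Any:
        return self.metadata.get(key, default)

    def remove_duplicate_media_by_hash(self) -> None:
        # Assumed rule: keep the first media item seen for each non-empty hash.
        seen = set()
        unique = []
        for m in self.media:
            if m.hash and m.hash in seen:
                continue
            seen.add(m.hash)
            unique.append(m)
        self.media = unique


item = MiniMetadata().set("url", "https://example.com/post/1").append("tags", "a").append("tags", "b")
item.media = [MiniMedia("a.jpg", "h1"), MiniMedia("copy.jpg", "h1"), MiniMedia("b.jpg", "h2")]
item.remove_duplicate_media_by_hash()
```

Because the setters return the instance, calls can be chained, which matches the `Metadata` return type documented for `set` and `append` in this module.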

Module Contents

class core.metadata.Metadata
status: str = 'no archiver'
metadata: Dict[str, Any]
media: List[core.media.Media] = []
merge(right: Metadata, overwrite_left=True) -> Metadata

Merges another Metadata instance into this one.

Conflicts are resolved based on the overwrite_left flag:

- If True, this instance’s values are overwritten by those from right.
- If False, the inverse applies: the values from right are overwritten by this instance’s.
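The effect of the flag can be sketched on plain dictionaries. This is an illustration only: `merge_metadata` is a hypothetical helper, not the library's `merge` method, and it assumes that "the inverse" means the left-hand values win on conflict. How the real method treats nested values, status, or media is not covered here.

```python
from typing import Any, Dict


def merge_metadata(left: Dict[str, Any], right: Dict[str, Any],
                   overwrite_left: bool = True) -> Dict[str, Any]:
    """Hypothetical helper: merge two plain dicts, resolving key conflicts by flag."""
    if overwrite_left:
        # Values from `right` replace conflicting values in `left`.
        return {**left, **right}
    # Assumed reading of "the inverse": values from `left` win on conflict.
    return {**right, **left}


left = {"title": "draft", "url": "https://example.com/a"}
right = {"title": "final", "author": "someone"}

merged_right_wins = merge_metadata(left, right, overwrite_left=True)
merged_left_wins = merge_metadata(left, right, overwrite_left=False)
```

In both cases the non-conflicting keys (`url` and `author`) survive. Only the conflicting `title` key depends on the flag.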

store(storages=[])
set(key: str, val: Any) -> Metadata
append(key: str, val: Any) -> Metadata
get(key: str, default: Any = None, create_if_missing=False) -> Metadata | str
success(context: str = None) -> Metadata
is_success() -> bool
is_empty() -> bool
property netloc: str
set_url(url: str) -> Metadata
get_url() -> str
set_content(content: str) -> Metadata
set_title(title: str) -> Metadata
get_title() -> str
set_timestamp(timestamp: datetime.datetime) -> Metadata
get_timestamp(utc=True, iso=True) -> datetime.datetime
add_media(media: core.media.Media, id: str = None) -> Metadata
get_media_by_id(id: str, default=None) -> core.media.Media
remove_duplicate_media_by_hash() -> None
get_first_image(default=None) -> core.media.Media
set_final_media(final: core.media.Media) -> Metadata

Final media is a special type of media: if only one media item can be shown, this is the one. It is useful for some DBs, such as GsheetDb.
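One way to picture the "final media" idea is as a media item stored under a reserved id. The sketch below is an assumption-laden illustration: `MediaBag`, the `FINAL_ID` value, and the fallback to the first media item are all hypothetical and are not confirmed by this page.

```python
from typing import Dict, List, Optional

FINAL_ID = "_final_media"  # hypothetical reserved id, an assumption of this sketch


class MediaBag:
    """Hypothetical stand-in that tracks media as dicts with an optional id."""

    def __init__(self) -> None:
        self.media: List[Dict[str, str]] = []

    def add_media(self, filename: str, id: Optional[str] = None) -> "MediaBag":
        entry = {"filename": filename}
        if id is not None:
            entry["id"] = id
        self.media.append(entry)
        return self

    def get_media_by_id(self, id: str, default=None):
        # Return the first media entry with a matching id, else the default.
        for m in self.media:
            if m.get("id") == id:
                return m
        return default

    def set_final_media(self, filename: str) -> "MediaBag":
        return self.add_media(filename, FINAL_ID)

    def get_final_media(self):
        # Assumed fallback: the first media item when no final media was set.
        fallback = self.media[0] if self.media else None
        return self.get_media_by_id(FINAL_ID, fallback)


bag = MediaBag().add_media("frame1.png").add_media("frame2.png")
first_choice = bag.get_final_media()
bag.set_final_media("screenshot.png")
final_choice = bag.get_final_media()
```

A consumer that can display a single item, such as a spreadsheet row, would then read only the result of `get_final_media()`.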

get_final_media() -> core.media.Media
get_all_media() -> List[core.media.Media]
static choose_most_complete(results: List[Metadata]) -> Metadata
set_context(key: str, val: Any) -> Metadata
get_context(key: str, default: Any = None) -> Any