Identifiers

Overview

Identifiers are components that inspect a resource and add the resource tags that apply to it.

The following is an example of the MagicMimeIdentifier, which uses libmagic file type identification to tag resources:

from typing import Dict, Iterable, Union

# OFRAK imports (Identifier, File, Magic, Resource, ResourceTag, AlreadyExistError) omitted.


class MagicMimeIdentifier(Identifier[None]):
    id = b"MagicMimeIdentifier"
    targets = (File,)
    # Class-level registry mapping MIME types to resource tags.
    _tags_by_mime: Dict[str, ResourceTag] = dict()

    async def identify(self, resource: Resource, config=None):
        # Look up the MIME type reported by libmagic and add the registered tag, if any.
        _magic = await resource.analyze(Magic)
        magic_mime = _magic.mime
        tag = MagicMimeIdentifier._tags_by_mime.get(magic_mime)
        if tag is not None:
            resource.add_tag(tag)

    @classmethod
    def register(cls, resource: ResourceTag, mime_types: Union[Iterable[str], str]):
        # Accept either a single MIME type or an iterable of MIME types.
        if isinstance(mime_types, str):
            mime_types = [mime_types]
        for mime_type in mime_types:
            if mime_type in cls._tags_by_mime:
                raise AlreadyExistError(f"Registering already-registered mime type: {mime_type}")
            cls._tags_by_mime[mime_type] = resource


...

MagicMimeIdentifier.register(GenericText, "text/plain")

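The last line of the example, MagicMimeIdentifier.register(GenericText, "text/plain"), registers the "text/plain" MIME type so that any resource libmagic reports as text/plain receives the GenericText resource tag. Registering a MIME type that is already registered raises AlreadyExistError.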
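The registration mechanism in the example above can be tried in isolation. The following is a minimal sketch that runs without OFRAK: MimeTagRegistry, AlreadyRegisteredError, and lookup are hypothetical names introduced for illustration, and plain strings stand in for resource tags.

```python
from typing import Dict, Iterable, Optional, Union


class AlreadyRegisteredError(Exception):
    """Hypothetical stand-in for the AlreadyExistError raised in the example above."""


class MimeTagRegistry:
    # Class-level mapping shared by every caller, mirroring _tags_by_mime above.
    # Tags are plain strings here; this is an assumption made for illustration.
    _tags_by_mime: Dict[str, str] = {}

    @classmethod
    def register(cls, tag: str, mime_types: Union[Iterable[str], str]) -> None:
        # Accept either a single MIME type or an iterable of MIME types.
        if isinstance(mime_types, str):
            mime_types = [mime_types]
        for mime_type in mime_types:
            if mime_type in cls._tags_by_mime:
                raise AlreadyRegisteredError(f"already registered: {mime_type}")
            cls._tags_by_mime[mime_type] = tag

    @classmethod
    def lookup(cls, mime_type: str) -> Optional[str]:
        # Return the registered tag, or None for an unknown MIME type.
        return cls._tags_by_mime.get(mime_type)


MimeTagRegistry.register("GenericText", "text/plain")
MimeTagRegistry.register("Archive", ["application/zip", "application/x-tar"])
```

Because the mapping lives on the class rather than on an instance, every registration is visible to all later lookups, which is why a second registration of the same MIME type is treated as an error rather than silently overwriting the first.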

Usage

Identifiers can be explicitly run using the Resource.identify method:

await resource.identify()

Note that running identifiers explicitly is often not needed, as Resource.unpack runs all registered identifiers before running unpackers.
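The ordering described above can be pictured with a toy model. This is only a sketch under stated assumptions: ToyResource and its log attribute are hypothetical and are not OFRAK's Resource class; the point is simply that unpacking performs identification first, so an explicit identify call is usually redundant.

```python
import asyncio
from typing import List


class ToyResource:
    """Hypothetical stand-in for a resource; not OFRAK's Resource class."""

    def __init__(self) -> None:
        self.tags: List[str] = []
        self.log: List[str] = []

    async def identify(self) -> None:
        # Explicit identification: record the call and add a tag once.
        self.log.append("identify")
        if "GenericText" not in self.tags:
            self.tags.append("GenericText")

    async def unpack(self) -> None:
        # Identification runs first so that unpacking can rely on the tags.
        await self.identify()
        self.log.append("unpack")


async def main() -> ToyResource:
    resource = ToyResource()
    await resource.unpack()  # no explicit identify() call is needed
    return resource


resource = asyncio.run(main())
```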