Strategies Module

anonipy.anonymize.strategies

The strategies module provides the strategies used to anonymize the entities identified as sensitive in a text.

Classes:

| Name | Description |
| --- | --- |
| RedactionStrategy | The class representing the redaction strategy. |
| MaskingStrategy | The class representing the masking strategy. |
| PseudonymizationStrategy | The class representing the pseudonymization strategy. |

anonipy.anonymize.strategies.RedactionStrategy

Bases: StrategyInterface

The class representing the redaction strategy.

Examples:

>>> from anonipy.anonymize.strategies import RedactionStrategy
>>> strategy = RedactionStrategy()
>>> strategy.anonymize(text, entities)

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| substitute_label | str | The label to substitute in the anonymized text. |

Methods:

| Name | Description |
| --- | --- |
| anonymize | Anonymize the text based on the entities. |

Source code in anonipy/anonymize/strategies/redaction.py
class RedactionStrategy(StrategyInterface):
    """The class representing the redaction strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import RedactionStrategy
        >>> strategy = RedactionStrategy()
        >>> strategy.anonymize(text, entities)

    Attributes:
        substitute_label (str): The label to substitute in the anonymized text.

    Methods:
        anonymize(text, entities):
            Anonymize the text based on the entities.

    """

    def __init__(self, substitute_label: str = "[REDACTED]", *args, **kwargs) -> None:
        """Initializes the redaction strategy.

        Examples:
            >>> from anonipy.anonymize.strategies import RedactionStrategy
            >>> strategy = RedactionStrategy()

        Args:
            substitute_label: The label to substitute in the anonymized text.

        """

        super().__init__(*args, **kwargs)
        self.substitute_label = substitute_label or "[REDACTED]"

    def anonymize(
        self, text: str, entities: List[Entity], *args, **kwargs
    ) -> Tuple[str, List[Replacement]]:
        """Anonymize the text using the redaction strategy.

        Examples:
            >>> from anonipy.anonymize.strategies import RedactionStrategy
            >>> strategy = RedactionStrategy()
            >>> strategy.anonymize(text, entities)

        Args:
            text: The text to anonymize.
            entities: The list of entities to anonymize.

        Returns:
            The anonymized text.
            The list of applied replacements.

        """

        replacements = [self._create_replacement(ent) for ent in entities]
        anonymized_text, replacements = anonymize(text, replacements)
        return anonymized_text, replacements

    # ===========================================
    # Private methods
    # ===========================================

    def _create_replacement(self, entity: Entity) -> Replacement:
        """Creates a replacement for the entity.

        Args:
            entity: The entity to create the replacement for.

        Returns:
            The created replacement.

        """

        return {
            "original_text": entity.text,
            "label": entity.label,
            "start_index": entity.start_index,
            "end_index": entity.end_index,
            "anonymized_text": self.substitute_label,
        }
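
The docstring examples above leave `text` and `entities` undefined. The following is a minimal, self-contained sketch of what the redaction strategy does, not anonipy's own implementation. `FakeEntity` is a hypothetical stand-in for anonipy's `Entity` that carries only the four fields read by `_create_replacement`, and `apply_replacements` is an assumed stand-in for the module-level `anonymize(text, replacements)` helper, whose source is not shown on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FakeEntity:
    """Hypothetical stand-in for anonipy's Entity (only the fields used here)."""

    text: str
    label: str
    start_index: int
    end_index: int


def create_replacement(entity: FakeEntity, substitute_label: str = "[REDACTED]") -> Dict:
    # Mirrors the dictionary built by RedactionStrategy._create_replacement.
    return {
        "original_text": entity.text,
        "label": entity.label,
        "start_index": entity.start_index,
        "end_index": entity.end_index,
        "anonymized_text": substitute_label,
    }


def apply_replacements(text: str, replacements: List[Dict]) -> Tuple[str, List[Dict]]:
    # Assumed behavior: substitute each span, working right to left so that
    # the indices of earlier spans stay valid while the text changes length.
    ordered = sorted(replacements, key=lambda r: r["start_index"], reverse=True)
    for rep in ordered:
        text = text[: rep["start_index"]] + rep["anonymized_text"] + text[rep["end_index"] :]
    return text, ordered[::-1]


text = "John Doe was born on 15-01-1985."
entities = [
    FakeEntity("John Doe", "name", 0, 8),
    FakeEntity("15-01-1985", "date", 21, 31),
]
redacted, replacements = apply_replacements(
    text, [create_replacement(ent) for ent in entities]
)
print(redacted)  # [REDACTED] was born on [REDACTED].
```

Every entity receives the same `substitute_label`, so the output keeps no information about the type or length of the original values.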

__init__(substitute_label='[REDACTED]', *args, **kwargs)

Initializes the redaction strategy.

Examples:

>>> from anonipy.anonymize.strategies import RedactionStrategy
>>> strategy = RedactionStrategy()

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| substitute_label | str | The label to substitute in the anonymized text. | '[REDACTED]' |

Source code in anonipy/anonymize/strategies/redaction.py
def __init__(self, substitute_label: str = "[REDACTED]", *args, **kwargs) -> None:
    """Initializes the redaction strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import RedactionStrategy
        >>> strategy = RedactionStrategy()

    Args:
        substitute_label: The label to substitute in the anonymized text.

    """

    super().__init__(*args, **kwargs)
    self.substitute_label = substitute_label or "[REDACTED]"

anonymize(text, entities, *args, **kwargs)

Anonymize the text using the redaction strategy.

Examples:

>>> from anonipy.anonymize.strategies import RedactionStrategy
>>> strategy = RedactionStrategy()
>>> strategy.anonymize(text, entities)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| text | str | The text to anonymize. | required |
| entities | List[Entity] | The list of entities to anonymize. | required |

Returns:

| Type | Description |
| --- | --- |
| str | The anonymized text. |
| List[Replacement] | The list of applied replacements. |

Source code in anonipy/anonymize/strategies/redaction.py
def anonymize(
    self, text: str, entities: List[Entity], *args, **kwargs
) -> Tuple[str, List[Replacement]]:
    """Anonymize the text using the redaction strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import RedactionStrategy
        >>> strategy = RedactionStrategy()
        >>> strategy.anonymize(text, entities)

    Args:
        text: The text to anonymize.
        entities: The list of entities to anonymize.

    Returns:
        The anonymized text.
        The list of applied replacements.

    """

    replacements = [self._create_replacement(ent) for ent in entities]
    anonymized_text, replacements = anonymize(text, replacements)
    return anonymized_text, replacements

anonipy.anonymize.strategies.MaskingStrategy

Bases: StrategyInterface

The class representing the masking strategy.

Examples:

>>> from anonipy.anonymize.strategies import MaskingStrategy
>>> strategy = MaskingStrategy()
>>> strategy.anonymize(text, entities)

Attributes:

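| Name | Type | Description |
| --- | --- | --- |
| substitute_label | str | The string repeated once for every character of the entity text; whitespace between words is kept as a single space. |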

Methods:

| Name | Description |
| --- | --- |
| anonymize | Anonymize the text based on the entities. |

Source code in anonipy/anonymize/strategies/masking.py
class MaskingStrategy(StrategyInterface):
    """The class representing the masking strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import MaskingStrategy
        >>> strategy = MaskingStrategy()
        >>> strategy.anonymize(text, entities)

    Attributes:
        substitute_label (str): The label to substitute in the anonymized text.

    Methods:
        anonymize(text, entities):
            Anonymize the text based on the entities.

    """

    def __init__(self, substitute_label: str = "*", *args, **kwargs):
        """Initializes the masking strategy.

        Examples:
            >>> from anonipy.anonymize.strategies import MaskingStrategy
            >>> strategy = MaskingStrategy()

        Args:
            substitute_label: The label to substitute in the anonymized text.

        """

        super().__init__(*args, **kwargs)
        self.substitute_label = substitute_label or "*"

    def anonymize(
        self, text: str, entities: List[Entity], *args, **kwargs
    ) -> Tuple[str, List[Replacement]]:
        """Anonymize the text using the masking strategy.

        Examples:
            >>> from anonipy.anonymize.strategies import MaskingStrategy
            >>> strategy = MaskingStrategy()
            >>> strategy.anonymize(text, entities)

        Args:
            text: The text to anonymize.
            entities: The list of entities to anonymize.

        Returns:
            The anonymized text.
            The list of applied replacements.

        """

        replacements = [self._create_replacement(ent) for ent in entities]
        anonymized_text, replacements = anonymize(text, replacements)
        return anonymized_text, replacements

    # ===========================================
    # Private methods
    # ===========================================

    def _create_replacement(self, entity: Entity) -> Replacement:
        """Creates a replacement for the entity.

        Args:
            entity: The entity to create the replacement for.

        Returns:
            The created replacement.

        """

        mask = self._create_mask(entity)
        return {
            "original_text": entity.text,
            "label": entity.label,
            "start_index": entity.start_index,
            "end_index": entity.end_index,
            "anonymized_text": mask,
        }

    def _create_mask(self, entity: Entity) -> str:
        """Creates a mask for the entity.

        Args:
            entity: The entity to create the mask for.

        Returns:
            The created mask.

        """

        # TODO: add random length substitution
        return " ".join(
            [
                self.substitute_label * len(chunk)
                for chunk in re.split(r"\s+", entity.text)
            ]
        )
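
As a quick illustration of the mask construction in `_create_mask`, the sketch below applies the same expression to a plain string. The function name `create_mask` and its string argument are assumptions made for this example; the library method takes an `Entity` and reads `entity.text`.

```python
import re


def create_mask(entity_text: str, substitute_label: str = "*") -> str:
    # Same expression as MaskingStrategy._create_mask, applied to a plain string:
    # split on whitespace runs, mask each chunk, rejoin with single spaces.
    return " ".join(
        substitute_label * len(chunk) for chunk in re.split(r"\s+", entity_text)
    )


print(create_mask("John Doe"))       # **** ***
print(create_mask("John Doe", "#"))  # #### ###
print(create_mask("New  York"))      # *** ****
```

The last call shows that a run of several whitespace characters collapses to one space, so the mask can be shorter than the original text. A `substitute_label` longer than one character makes the mask proportionally longer.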

__init__(substitute_label='*', *args, **kwargs)

Initializes the masking strategy.

Examples:

>>> from anonipy.anonymize.strategies import MaskingStrategy
>>> strategy = MaskingStrategy()

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| substitute_label | str | The string repeated once for every character of the entity text. | '*' |

Source code in anonipy/anonymize/strategies/masking.py
def __init__(self, substitute_label: str = "*", *args, **kwargs):
    """Initializes the masking strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import MaskingStrategy
        >>> strategy = MaskingStrategy()

    Args:
        substitute_label: The label to substitute in the anonymized text.

    """

    super().__init__(*args, **kwargs)
    self.substitute_label = substitute_label or "*"

anonymize(text, entities, *args, **kwargs)

Anonymize the text using the masking strategy.

Examples:

>>> from anonipy.anonymize.strategies import MaskingStrategy
>>> strategy = MaskingStrategy()
>>> strategy.anonymize(text, entities)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| text | str | The text to anonymize. | required |
| entities | List[Entity] | The list of entities to anonymize. | required |

Returns:

| Type | Description |
| --- | --- |
| str | The anonymized text. |
| List[Replacement] | The list of applied replacements. |

Source code in anonipy/anonymize/strategies/masking.py
def anonymize(
    self, text: str, entities: List[Entity], *args, **kwargs
) -> Tuple[str, List[Replacement]]:
    """Anonymize the text using the masking strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import MaskingStrategy
        >>> strategy = MaskingStrategy()
        >>> strategy.anonymize(text, entities)

    Args:
        text: The text to anonymize.
        entities: The list of entities to anonymize.

    Returns:
        The anonymized text.
        The list of applied replacements.

    """

    replacements = [self._create_replacement(ent) for ent in entities]
    anonymized_text, replacements = anonymize(text, replacements)
    return anonymized_text, replacements

anonipy.anonymize.strategies.PseudonymizationStrategy

Bases: StrategyInterface

The class representing the pseudonymization strategy.

Examples:

>>> from anonipy.anonymize.strategies import PseudonymizationStrategy
>>> strategy = PseudonymizationStrategy(mapping)
>>> strategy.anonymize(text, entities)

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| mapping | Callable | The function, called as `mapping(text, entity)`, that returns the pseudonym for an entity. |

Methods:

| Name | Description |
| --- | --- |
| anonymize | Anonymize the text based on the entities. |

Source code in anonipy/anonymize/strategies/pseudonymization.py
class PseudonymizationStrategy(StrategyInterface):
    """The class representing the pseudonymization strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import PseudonymizationStrategy
        >>> strategy = PseudonymizationStrategy(mapping)
        >>> strategy.anonymize(text, entities)

    Attributes:
        mapping: The mapping of entities to pseudonyms.

    Methods:
        anonymize(text, entities):
            Anonymize the text based on the entities.

    """

    def __init__(self, mapping: Callable, *args, **kwargs):
        """Initializes the pseudonymization strategy.

        Examples:
            >>> from anonipy.anonymize.strategies import PseudonymizationStrategy
            >>> strategy = PseudonymizationStrategy(mapping)

        Args:
            mapping: The mapping function on how to handle each entity type.

        """

        super().__init__(*args, **kwargs)
        self.mapping = mapping

    def anonymize(
        self, text: str, entities: List[Entity], *args, **kwargs
    ) -> Tuple[str, List[Replacement]]:
        """Anonymize the text using the pseudonymization strategy.

        Examples:
            >>> from anonipy.anonymize.strategies import PseudonymizationStrategy
            >>> strategy = PseudonymizationStrategy(mapping)
            >>> strategy.anonymize(text, entities)

        Args:
            text: The text to anonymize.
            entities: The list of entities to anonymize.

        Returns:
            The anonymized text.
            The list of applied replacements.

        """

        replacements = []
        for ent in entities:
            replacement = self._create_replacement(ent, text, replacements)
            replacements.append(replacement)
        anonymized_text, replacements = anonymize(text, replacements)
        return anonymized_text, replacements

    # ===========================================
    # Private methods
    # ===========================================

    def _create_replacement(
        self, entity: Entity, text: str, replacements: List[dict]
    ) -> Replacement:
        """Creates a replacement for the entity.

        Args:
            entity: The entity to create the replacement for.
            text: The text to anonymize.
            replacements: The list of existing replacements.

        Returns:
            The created replacement.

        """

        # check if the replacement already exists
        anonymized_text = self._check_replacement(entity, replacements)
        # create a new replacement if it doesn't exist
        anonymized_text = (
            self.mapping(text, entity) if not anonymized_text else anonymized_text
        )
        return {
            "original_text": entity.text,
            "label": entity.label,
            "start_index": entity.start_index,
            "end_index": entity.end_index,
            "anonymized_text": anonymized_text,
        }

    def _check_replacement(
        self, entity: Entity, replacements: List[Replacement]
    ) -> str:
        """Checks if a suitable replacement already exists.

        Args:
            entity: The entity to check.
            replacements: The list of replacements.

        Returns:
            The anonymized text if the replacement already exists, None otherwise.

        """
        existing_replacement = list(
            filter(lambda x: x["original_text"] == entity.text, replacements)
        )
        return (
            existing_replacement[0]["anonymized_text"]
            if len(existing_replacement) > 0
            else None
        )
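
The `mapping` argument is never defined in the examples above. The source shows only that it is called as `mapping(text, entity)` and that an earlier pseudonym is reused whenever the same `original_text` appears again. The sketch below reproduces that lookup-then-map logic without anonipy. `FakeEntity`, `make_counter_mapping`, and `build_replacements` are hypothetical names introduced for illustration, and the counter-based pseudonyms are just one possible mapping.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List, Optional


@dataclass
class FakeEntity:
    """Hypothetical stand-in for anonipy's Entity (only the fields used here)."""

    text: str
    label: str
    start_index: int
    end_index: int


def make_counter_mapping() -> Callable[[str, FakeEntity], str]:
    # One counter per label, so pseudonyms look like NAME_1, NAME_2, DATE_1, ...
    counters: Dict[str, count] = {}

    def mapping(text: str, entity: FakeEntity) -> str:
        counter = counters.setdefault(entity.label, count(1))
        return f"{entity.label.upper()}_{next(counter)}"

    return mapping


def check_replacement(entity: FakeEntity, replacements: List[Dict]) -> Optional[str]:
    # Same idea as _check_replacement: reuse the pseudonym of an identical text.
    for rep in replacements:
        if rep["original_text"] == entity.text:
            return rep["anonymized_text"]
    return None


def build_replacements(text: str, entities: List[FakeEntity], mapping) -> List[Dict]:
    replacements: List[Dict] = []
    for ent in entities:
        # Call the mapping only when no earlier replacement covers this text.
        pseudonym = check_replacement(ent, replacements) or mapping(text, ent)
        replacements.append(
            {
                "original_text": ent.text,
                "label": ent.label,
                "start_index": ent.start_index,
                "end_index": ent.end_index,
                "anonymized_text": pseudonym,
            }
        )
    return replacements


text = "Ana met Bob. Later Ana left."
entities = [
    FakeEntity("Ana", "name", 0, 3),
    FakeEntity("Bob", "name", 8, 11),
    FakeEntity("Ana", "name", 19, 22),
]
replacements = build_replacements(text, entities, make_counter_mapping())
print([rep["anonymized_text"] for rep in replacements])  # ['NAME_1', 'NAME_2', 'NAME_1']
```

Both occurrences of "Ana" receive the same pseudonym because the lookup matches on `original_text` only. As in the source, two entities with identical text but different labels would also share one pseudonym.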

__init__(mapping, *args, **kwargs)

Initializes the pseudonymization strategy.

Examples:

>>> from anonipy.anonymize.strategies import PseudonymizationStrategy
>>> strategy = PseudonymizationStrategy(mapping)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mapping | Callable | The mapping function on how to handle each entity type. | required |

Source code in anonipy/anonymize/strategies/pseudonymization.py
def __init__(self, mapping: Callable, *args, **kwargs):
    """Initializes the pseudonymization strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import PseudonymizationStrategy
        >>> strategy = PseudonymizationStrategy(mapping)

    Args:
        mapping: The mapping function on how to handle each entity type.

    """

    super().__init__(*args, **kwargs)
    self.mapping = mapping

anonymize(text, entities, *args, **kwargs)

Anonymize the text using the pseudonymization strategy.

Examples:

>>> from anonipy.anonymize.strategies import PseudonymizationStrategy
>>> strategy = PseudonymizationStrategy(mapping)
>>> strategy.anonymize(text, entities)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| text | str | The text to anonymize. | required |
| entities | List[Entity] | The list of entities to anonymize. | required |

Returns:

| Type | Description |
| --- | --- |
| str | The anonymized text. |
| List[Replacement] | The list of applied replacements. |

Source code in anonipy/anonymize/strategies/pseudonymization.py
def anonymize(
    self, text: str, entities: List[Entity], *args, **kwargs
) -> Tuple[str, List[Replacement]]:
    """Anonymize the text using the pseudonymization strategy.

    Examples:
        >>> from anonipy.anonymize.strategies import PseudonymizationStrategy
        >>> strategy = PseudonymizationStrategy(mapping)
        >>> strategy.anonymize(text, entities)

    Args:
        text: The text to anonymize.
        entities: The list of entities to anonymize.

    Returns:
        The anonymized text.
        The list of applied replacements.

    """

    replacements = []
    for ent in entities:
        replacement = self._create_replacement(ent, text, replacements)
        replacements.append(replacement)
    anonymized_text, replacements = anonymize(text, replacements)
    return anonymized_text, replacements

anonipy.anonymize.strategies.StrategyInterface

The class representing the strategy interface.

All strategies should inherit from this class.

Methods:

| Name | Description |
| --- | --- |
| anonymize | Anonymize the text based on the entities. |

Source code in anonipy/anonymize/strategies/interface.py
class StrategyInterface:
    """The class representing the strategy interface.

    All strategies should inherit from this class.

    Methods:
        anonymize(text, entities):
            Anonymize the text based on the entities.

    """

    def __init__(self, *args, **kwargs):
        pass

    def anonymize(self, text: str, entities: List[Entity], *args, **kwargs):
        pass
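
Because the base `anonymize` only contains `pass`, it returns `None`, so a subclass has to override it to be useful. The sketch below shows one way a custom strategy might look. `LabelStrategy` and `FakeEntity` are hypothetical, `StrategyInterface` is copied from the listing above so the example runs without anonipy installed, and the right-to-left substitution is an assumption standing in for the library's `anonymize(text, replacements)` helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class StrategyInterface:
    """Copy of the interface shown above, so the sketch runs without anonipy."""

    def __init__(self, *args, **kwargs):
        pass

    def anonymize(self, text, entities, *args, **kwargs):
        pass


@dataclass
class FakeEntity:
    """Hypothetical stand-in for anonipy's Entity (only the fields used here)."""

    text: str
    label: str
    start_index: int
    end_index: int


class LabelStrategy(StrategyInterface):
    """Hypothetical strategy: replaces each entity with its bracketed label."""

    def anonymize(
        self, text: str, entities: List[FakeEntity], *args, **kwargs
    ) -> Tuple[str, List[Dict]]:
        replacements = [
            {
                "original_text": ent.text,
                "label": ent.label,
                "start_index": ent.start_index,
                "end_index": ent.end_index,
                "anonymized_text": f"[{ent.label.upper()}]",
            }
            for ent in entities
        ]
        # Assumed substitution step: go right to left so indices stay valid.
        for rep in sorted(replacements, key=lambda r: r["start_index"], reverse=True):
            text = text[: rep["start_index"]] + rep["anonymized_text"] + text[rep["end_index"] :]
        return text, replacements


strategy = LabelStrategy()
anonymized_text, replacements = strategy.anonymize(
    "John Doe was born on 15-01-1985.",
    [FakeEntity("John Doe", "name", 0, 8), FakeEntity("15-01-1985", "date", 21, 31)],
)
print(anonymized_text)  # [NAME] was born on [DATE].
```

Returning the same `(text, replacements)` tuple as the built-in strategies keeps a custom strategy interchangeable with them.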