Pipeline Module
anonipy.anonymize.pipeline
Module containing the pipeline.
The pipeline module provides a class for anonymizing files using a pipeline of extractors and strategies.
Classes:
Name | Description |
---|---|
Pipeline | The class representing the anonymization pipeline. |
anonipy.anonymize.pipeline.Pipeline
A class for anonymizing files using a pipeline of extractors and strategies.
Examples:
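>>> from anonipy.constants import LANGUAGES
>>> from anonipy.anonymize.extractors import NERExtractor
>>> from anonipy.anonymize.strategies import RedactionStrategy
>>> from anonipy.anonymize.pipeline import Pipeline
>>> # `labels` is the list of entity labels to extract, defined beforehand
>>> extractor = NERExtractor(labels, lang=LANGUAGES.ENGLISH)
>>> strategy = RedactionStrategy()
>>> pipeline = Pipeline(extractor, strategy)
>>> pipeline.anonymize("/path/to/input_dir", "/path/to/output_dir", flatten=True)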
Attributes:
Name | Type | Description |
---|---|---|
extractor | (ExtractorInterface, MultiExtractor, List[ExtractorInterface]) | The extractor to use for entity extraction. |
strategy | StrategyInterface | The strategy to use for anonymization. |
Methods:
Name | Description |
---|---|
anonymize | Anonymize files in the input directory and save the anonymized files to the output directory. |
Source code in anonipy/anonymize/pipeline.py
__init__(extractor, strategy)
Initialize the pipeline.
Examples:
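>>> from anonipy.constants import LANGUAGES
>>> from anonipy.anonymize.extractors import NERExtractor
>>> from anonipy.anonymize.strategies import RedactionStrategy
>>> from anonipy.anonymize.pipeline import Pipeline
>>> extractor = NERExtractor(labels, lang=LANGUAGES.ENGLISH)
>>> strategy = RedactionStrategy()
>>> pipeline = Pipeline(extractor, strategy)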
Parameters:
Name | Type | Description | Default |
---|---|---|---|
extractor | Union[ExtractorInterface, MultiExtractor, List[ExtractorInterface]] | The extractor to use for entity extraction. | required |
strategy | StrategyInterface | The strategy to use for anonymization. | required |
anonymize(input_dir, output_dir, flatten=False)
Anonymize files in the input directory and save the anonymized files to the output directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dir | str | The path to the input directory containing files to be anonymized. | required |
output_dir | str | The path to the output directory where anonymized files will be saved. | required |
flatten | bool | Whether to flatten the output directory structure. Defaults to False. | False |
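The effect of `flatten` on output paths can be illustrated with a small standalone sketch. It does not use anonipy, and the helper `sketch_output_path` is hypothetical. The naming scheme it applies when flattening (joining sub-directory names with underscores) is an assumption made for illustration; this page does not state how the library names flattened files.

```python
from pathlib import PurePosixPath


def sketch_output_path(
    input_dir: str, output_dir: str, file_path: str, flatten: bool = False
) -> str:
    """Hypothetical helper: where one anonymized file could be written."""
    # Path of the file relative to the input directory, e.g. "notes/2024/a.txt".
    relative = PurePosixPath(file_path).relative_to(input_dir)
    if flatten:
        # Assumed scheme (not confirmed by the documentation): fold the
        # sub-directory names into the file name so that files sharing a
        # name in different folders do not overwrite each other.
        relative = PurePosixPath("_".join(relative.parts))
    return str(PurePosixPath(output_dir) / relative)


# Nested structure preserved (flatten=False) versus collapsed (flatten=True).
print(sketch_output_path("/in", "/out", "/in/notes/2024/a.txt"))
print(sketch_output_path("/in", "/out", "/in/notes/2024/a.txt", flatten=True))
```

With `flatten=False` the sketch mirrors the input tree under the output directory; with `flatten=True` every file lands directly in the output directory.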
Raises:
Type | Description |
---|---|
ValueError | If the input directory does not exist or if the input and output directories are the same. |
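The two documented error conditions can be checked up front, before any file is processed. The sketch below is a standalone approximation, not the library's implementation: `check_directories` is a hypothetical helper, and comparing absolute paths is an assumed way of deciding that two directories are "the same".

```python
import os


def check_directories(input_dir: str, output_dir: str) -> None:
    """Hypothetical pre-check mirroring the documented ValueError cases."""
    # Condition 1: the input directory must exist (and be a directory).
    if not os.path.isdir(input_dir):
        raise ValueError(f"Input directory does not exist: {input_dir}")
    # Condition 2: input and output must differ. Comparing absolute paths
    # is an assumption; the library may compare them differently.
    if os.path.abspath(input_dir) == os.path.abspath(output_dir):
        raise ValueError("Input and output directories must be different")


# Example: the current directory exists, so only the "same directory" check fails.
try:
    check_directories(".", ".")
except ValueError as error:
    print(f"Rejected: {error}")
```

Callers of `anonymize` can handle the same conditions by wrapping the call in `try` / `except ValueError`.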
Returns:
Type | Description |
---|---|
dict | A dictionary mapping the original file paths to the anonymized file paths. |
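The returned mapping lends itself to simple post-processing, such as logging which file went where. The sketch below uses a hand-written dictionary as a stand-in for the return value, so the paths are made up, and it assumes both keys and values are plain strings. `format_report` is a hypothetical helper, not part of anonipy.

```python
from typing import Dict, List

# Stand-in for the value returned by pipeline.anonymize(...).
# These paths are illustrative only.
file_mapping: Dict[str, str] = {
    "/path/to/input_dir/notes/a.txt": "/path/to/output_dir/notes/a.txt",
    "/path/to/input_dir/b.txt": "/path/to/output_dir/b.txt",
}


def format_report(mapping: Dict[str, str]) -> List[str]:
    """Hypothetical helper: one 'original -> anonymized' line per file."""
    # Sort by original path so the report order is stable between runs.
    return [f"{original} -> {mapping[original]}" for original in sorted(mapping)]


for line in format_report(file_mapping):
    print(line)
```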