Generators Module
anonipy.anonymize.generators
Module containing the generators.
The generators module provides a set of generators used to generate data substitutes.
Classes:
Name | Description |
---|---|
LLMLabelGenerator | The class representing the label generator utilizing LLMs. |
MaskLabelGenerator | The class representing the label generator utilizing token masking. |
NumberGenerator | The class representing the number generator. |
DateGenerator | The class representing the date generator. |
anonipy.anonymize.generators.LLMLabelGenerator
Bases: GeneratorInterface
The class representing the LLM label generator.
Examples:
>>> from anonipy.anonymize.generators import LLMLabelGenerator
>>> generator = LLMLabelGenerator()
>>> generator.generate(entity)
Attributes:
Name | Type | Description |
---|---|---|
model | Transformers | The model used to generate the label substitutes. |
Methods:
Name | Description |
---|---|
generate | Generate the label based on the entity. |
Source code in anonipy/anonymize/generators/llm_label_generator.py
__init__(*args, model_name='HuggingFaceTB/SmolLM2-1.7B-Instruct', use_gpu=False, **kwargs)
Initializes the LLM label generator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_name | str | The name of the model to use. | 'HuggingFaceTB/SmolLM2-1.7B-Instruct' |
use_gpu | bool | Whether to use GPU or not. | False |
Examples:
>>> from anonipy.anonymize.generators import LLMLabelGenerator
>>> generator = LLMLabelGenerator()
LLMLabelGenerator()
generate(entity, *args, add_entity_attrs='', temperature=1.0, top_p=0.95, **kwargs)
Generate the substitute for the entity based on its attributes.
Examples:
>>> from anonipy.anonymize.generators import LLMLabelGenerator
>>> generator = LLMLabelGenerator()
>>> generator.generate(entity)
label
Parameters:
Name | Type | Description | Default |
---|---|---|---|
entity | Entity | The entity to generate the label from. | required |
add_entity_attrs | str | Additional entity attribute description to add to the generation. | '' |
temperature | float | The temperature to use for the generation. | 1.0 |
top_p | float | The top p to use for the generation. | 0.95 |
Returns:
Type | Description |
---|---|
str | The generated entity label substitute. |
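To make the `temperature` and `top_p` parameters more concrete, here is a minimal sketch of temperature scaling followed by nucleus (top-p) filtering over a toy distribution. It illustrates the general sampling technique only and is not taken from the anonipy source; the helper name `top_p_filter` and the toy logits are assumptions made for illustration.

```python
import math


def top_p_filter(logits, temperature=1.0, top_p=0.95):
    """Return the renormalized nucleus distribution as {token_index: probability}.

    Hypothetical helper: it sketches the sampling technique, not anonipy's code.
    """
    # Temperature scaling: values below 1.0 sharpen the distribution,
    # values above 1.0 flatten it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most probable tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.1, -1.0]  # hypothetical scores for four tokens
nucleus = top_p_filter(toy_logits, temperature=1.0, top_p=0.95)
greedy_like = top_p_filter(toy_logits, temperature=0.1, top_p=0.95)
```

With these toy values, the default settings keep the three most probable tokens, while a low temperature concentrates nearly all of the probability on the top token.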
anonipy.anonymize.generators.MaskLabelGenerator
Bases: GeneratorInterface
The class representing the mask label generator.
Examples:
>>> from anonipy.anonymize.generators import MaskLabelGenerator
>>> generator = MaskLabelGenerator(model_name="FacebookAI/xlm-roberta-large", context_window=100, use_gpu=False)
>>> generator.generate(entity)
Attributes:
Name | Type | Description |
---|---|---|
pipeline | Pipeline | The transformers pipeline used to generate the label substitutes. |
context_window | int | The context window size to use to generate the label substitutes. |
mask_token | str | The mask token to use to replace the masked words. |
Methods:
Name | Description |
---|---|
generate | Generate the substitute for the entity based on its location in the text. |
Source code in anonipy/anonymize/generators/mask_label_generator.py
__init__(*args, model_name='FacebookAI/xlm-roberta-large', use_gpu=False, context_window=100, **kwargs)
Initializes the mask label generator.
Examples:
>>> from anonipy.anonymize.generators import MaskLabelGenerator
>>> generator = MaskLabelGenerator(context_window=120, use_gpu=True)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_name | str | The name of the masking model to use. | 'FacebookAI/xlm-roberta-large' |
use_gpu | bool | Whether to use GPU/CUDA, if available. | False |
context_window | int | The context window size. | 100 |
generate(entity, text, *args, **kwargs)
Generate the substitute for the entity using the masking model.
Examples:
>>> from anonipy.anonymize.generators import MaskLabelGenerator
>>> generator = MaskLabelGenerator(context_window=120, use_gpu=True)
>>> generator.generate(entity, text)
label
Parameters:
Name | Type | Description | Default |
---|---|---|---|
entity | Entity | The entity used to generate the substitute. | required |
text | str | The original text in which the entity is located; used to get the entity's context. | required |
Returns:
Type | Description |
---|---|
str | The generated substitute text. |
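As a rough illustration of how the entity's context might be prepared for a masking model, the sketch below replaces the entity span with a mask token and crops the surrounding text. It assumes the context window is counted in characters on each side of the entity and that the entity exposes start and end offsets; neither detail is confirmed by this page, and `build_masked_context` is a hypothetical helper, not part of anonipy.

```python
def build_masked_context(text, start_index, end_index,
                         mask_token="<mask>", context_window=100):
    """Replace text[start_index:end_index] with mask_token and crop the context.

    Assumption: context_window is the number of characters kept on each side.
    """
    left = max(0, start_index - context_window)
    right = min(len(text), end_index + context_window)
    return text[left:start_index] + mask_token + text[end_index:right]


text = "Patient John Doe was admitted on Monday."
start = text.index("John Doe")
end = start + len("John Doe")
masked = build_masked_context(text, start, end, context_window=8)
# masked == "Patient <mask> was adm"
```

A fill-mask model would then propose candidates for the masked position, from which a substitute could be chosen.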
anonipy.anonymize.generators.NumberGenerator
Bases: GeneratorInterface
The class representing the number generator.
Examples:
>>> from anonipy.anonymize.generators import NumberGenerator
>>> generator = NumberGenerator()
>>> generator.generate(entity)
Methods:
Name | Description |
---|---|
generate | Generates a substitute for the numeric entity. |
Source code in anonipy/anonymize/generators/number_generator.py
__init__(*args, **kwargs)
Initializes the number generator.
generate(entity, *args, **kwargs)
Generates the substitute for the numeric entity.
Examples:
>>> from anonipy.anonymize.generators import NumberGenerator
>>> generator = NumberGenerator()
>>> generator.generate(entity)
"1234567890"
Parameters:
Name | Type | Description | Default |
---|---|---|---|
entity | Entity | The numeric entity to generate the numeric substitute. | required |
Returns:
Type | Description |
---|---|
str | The generated numeric substitute. |
Raises:
Type | Description |
---|---|
ValueError | If the entity type is not |
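One plausible way to produce a numeric substitute that keeps the original formatting is to replace each digit with a random digit and leave every other character untouched. The sketch below shows that idea; it is an assumption about the approach rather than the library's implementation, the helper name `substitute_digits` is hypothetical, and the entity type check mentioned under Raises is omitted.

```python
import random
import re


def substitute_digits(value, rng=None):
    """Replace every digit in value with a random digit, keeping other characters.

    Hypothetical helper; pass a seeded random.Random for reproducible output.
    """
    rng = rng or random
    return re.sub(r"\d", lambda _match: str(rng.randint(0, 9)), value)


original = "+1 (555) 010-9999"
substitute = substitute_digits(original, rng=random.Random(0))
```

The substitute has the same length, separators, and digit positions as the input, so downstream formatting checks still pass.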
anonipy.anonymize.generators.DateGenerator
Bases: GeneratorInterface
The class representing the date generator.
Examples:
>>> from anonipy.anonymize.generators import DateGenerator
>>> generator = DateGenerator(lang="de")
>>> generator.generate(entity)
Attributes:
Name | Type | Description |
---|---|---|
lang | (str, LANGUAGES) | The language of the text. |
date_format | str | The date format in which the date should be generated. |
day_sigma | int | The range of the random date in days. |
Methods:
Name | Description |
---|---|
generate | Generate the date substitute based on the input parameters. |
Source code in anonipy/anonymize/generators/date_generator.py
__init__(*args, lang='en', date_format='auto', day_sigma=30, **kwargs)
Initializes the date generator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lang | Union[str, LANGUAGES] | The language of the text. | 'en' |
date_format | str | The date format in which the date should be generated. For more on date formats, see here. | 'auto' |
day_sigma | int | The range of the random date in days. | 30 |
generate(entity, *args, sub_variant=DATE_TRANSFORM_VARIANTS.RANDOM, **kwargs)
Generate the entity substitute based on the input parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
entity | Entity | The entity to generate the date substitute from. | required |
sub_variant | DATE_TRANSFORM_VARIANTS | The substitute function variant to use. | RANDOM |
Returns:
Type | Description |
---|---|
str | The generated date substitute. |
Raises:
Type | Description |
---|---|
ValueError | If the entity type is not |
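To illustrate what a random date substitute bounded by `day_sigma` could look like, the sketch below shifts a parsed date by a uniformly chosen number of days in the range `[-day_sigma, day_sigma]` and formats it back. This is an assumed reading of "the range of the random date in days", not the library's code; `random_date_substitute` and the explicit `strptime` format are hypothetical stand-ins.

```python
import random
from datetime import datetime, timedelta


def random_date_substitute(date_text, date_format="%d-%m-%Y", day_sigma=30, rng=None):
    """Shift date_text by a random offset of at most day_sigma days either way.

    Hypothetical helper; assumes the offset is drawn uniformly.
    """
    rng = rng or random
    original = datetime.strptime(date_text, date_format)
    offset = rng.randint(-day_sigma, day_sigma)
    return (original + timedelta(days=offset)).strftime(date_format)


shifted = random_date_substitute("20-05-2024", day_sigma=30, rng=random.Random(42))
```

Keeping the output in the input's format means the substitute can be dropped back into the text without further changes.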
anonipy.anonymize.generators.GeneratorInterface
The class representing the generator interface.
All generators should inherit from this class.
Methods:
Name | Description |
---|---|
generate | Generate a substitute for the entity. |
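The inheritance pattern described above can be sketched with stand-in classes. The example below does not import anonipy: `GeneratorSketch` stands in for `GeneratorInterface`, `SimpleEntity` for `Entity`, and `PlaceholderGenerator` is a hypothetical custom generator. Whether the real interface is an abstract base class is an assumption made here for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SimpleEntity:
    """Hypothetical stand-in for the library's Entity type."""
    text: str
    label: str


class GeneratorSketch(ABC):
    """Stand-in for GeneratorInterface: subclasses must implement generate()."""

    @abstractmethod
    def generate(self, entity, *args, **kwargs):
        """Return a substitute string for the given entity."""


class PlaceholderGenerator(GeneratorSketch):
    """Hypothetical generator that substitutes the entity with its label tag."""

    def generate(self, entity, *args, **kwargs):
        return f"[{entity.label.upper()}]"


substitute = PlaceholderGenerator().generate(SimpleEntity(text="John Doe", label="name"))
# substitute == "[NAME]"
```

Accepting `*args` and `**kwargs` mirrors the `generate` signatures listed on this page, so generators with different extra parameters can be called through the same interface.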