deduplicationdict
Package Contents
Classes
DeDuplicationDict | A dictionary that de-duplicates values.
Attributes
- class deduplicationdict.DeDuplicationDict(*args, _value_dict: dict = None, **kwargs)
Bases: collections.abc.MutableMapping
A dictionary that de-duplicates values.
A dictionary-like class that deduplicates values by storing them in a separate dictionary and replacing them with their corresponding hash values. This class is particularly useful for large dictionaries with repetitive entries, as it can save memory by storing values only once and substituting recurring values with their hash representations.
This class supports nested structures by automatically converting nested dictionaries into DeDuplicationDict instances. It also provides various conversion methods to convert between regular dictionaries and DeDuplicationDict instances.
- Variables
hash_length (int) – The length of the hash value used for deduplication.
auto_clean_up (bool) – Whether to automatically clean up unused hash values when deleting items.
skip_update_on_setitem (bool) – Whether to skip updating the value dictionary when setting an item.
key_dict (dict) – A dictionary that maps each key to the hash value of its value.
value_dict (dict) – A dictionary that maps each hash value to the value it stands for.
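The two-dictionary layout described above can be illustrated with a minimal, self-contained sketch. It does not use the library: the helper names `value_hash`, `set_item` and `get_item`, the choice of SHA-256 over a JSON dump, and the hash length of 8 are all assumptions made for illustration only.

```python
import hashlib
import json


def value_hash(value, hash_length=8):
    """Return a short hex digest of a JSON-serialisable value (hypothetical helper)."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:hash_length]


key_dict = {}    # key  -> hash of its value
value_dict = {}  # hash -> the value itself, stored only once


def set_item(key, value):
    """Store the value under its hash and point the key at that hash."""
    h = value_hash(value)
    value_dict.setdefault(h, value)  # keep the first copy, ignore later duplicates
    key_dict[key] = h


def get_item(key):
    """Resolve a key to its value through the hash."""
    return value_dict[key_dict[key]]


set_item("a", [1, 2, 3])
set_item("b", [1, 2, 3])  # same content, so the stored value is reused
set_item("c", "other")

print(len(key_dict), len(value_dict))  # 3 2
```

Three keys are stored, but only two values, because "a" and "b" resolve to the same stored object.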
- _set_value_dict(value_dict: dict, skip_update: bool = False) → DeDuplicationDict
Update the value dictionary and propagate the changes to nested DeDuplicationDict instances.
- Parameters
value_dict (dict) – The value dictionary to set.
skip_update (bool) – Defaults to False.
- Returns
self
- Return type
DeDuplicationDict
- __setitem__(key: KT, value: VT) → None
Set the value for the given key, deduplicating the value if necessary.
- Parameters
key (KT) – The key to set the value for.
value (VT) – The value to set for the given key.
- all_hashes_in_use() → set
Get all hash values currently in use.
- Returns
A set of all hash values in use.
- Return type
set
- clean_up() → DeDuplicationDict
Remove unused hash values from the value dictionary.
- Returns
self
- Return type
DeDuplicationDict
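As a rough illustration of what collecting the hashes in use and removing the unused ones involves, here is a minimal sketch on plain dictionaries. It handles only a flat key dictionary, and the function names mirror the methods above purely for readability; this is not the library's implementation.

```python
def all_hashes_in_use(key_dict):
    """Collect every hash that some key still refers to (flat case only)."""
    return set(key_dict.values())


def clean_up(key_dict, value_dict):
    """Delete entries of value_dict whose hash no key refers to any more."""
    in_use = all_hashes_in_use(key_dict)
    for h in list(value_dict):  # copy the keys so we can delete while looping
        if h not in in_use:
            del value_dict[h]
    return value_dict


key_dict = {"a": "h1", "b": "h1"}
value_dict = {"h1": [1, 2, 3], "h2": "stale"}

clean_up(key_dict, value_dict)
print(value_dict)  # {'h1': [1, 2, 3]}
```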
- detach() → DeDuplicationDict
Detach the DeDuplicationDict instance from its value dictionary, creating a standalone instance.
- Returns
A new DeDuplicationDict instance with its own value dictionary.
- Return type
DeDuplicationDict
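The idea of giving an instance its own value dictionary can be sketched on plain dictionaries as follows. This is an assumption-laden illustration, not the library's code: the standalone `detach` function and the variable names are hypothetical, only a flat key dictionary is handled, and the copy is shallow.

```python
def detach(key_dict, shared_values):
    """Return a private value dict holding only the hashes key_dict refers to."""
    return {h: shared_values[h] for h in set(key_dict.values())}


shared_values = {"h1": "alpha", "h2": "beta", "h3": "gamma"}
child_keys = {"x": "h1", "y": "h3"}

own_values = detach(child_keys, shared_values)

# Rebinding an entry in the shared dictionary leaves the detached copy untouched.
shared_values["h1"] = "changed elsewhere"

print(sorted(own_values))  # ['h1', 'h3']
print(own_values["h1"])    # alpha
```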
- __deepcopy__(memo: dict) → DeDuplicationDict
Create a deep copy of the DeDuplicationDict instance.
- Parameters
memo (dict) – A dictionary of memoized values.
- Returns
A new DeDuplicationDict instance with its own value dictionary.
- Return type
DeDuplicationDict
- _del_detach() → DeDuplicationDict
Detach the DeDuplicationDict instance from its value dictionary and clean up unused hash values.
- Returns
self
- Return type
DeDuplicationDict
- __delitem__(key: KT) → None
Delete the item with the given key.
- Parameters
key (KT) – The key of the item to delete.
- Raises
KeyError – If the key is not found in the dictionary.
- __len__() → int
Get the number of items in the dictionary.
- Returns
The number of items in the dictionary.
- Return type
int
- __iter__() → Iterator[T_co]
Get an iterator over the keys in the dictionary.
- Returns
An iterator over the keys in the dictionary.
- Return type
Iterator[T_co]
- __repr__() → str
Get a string representation of the DeDuplicationDict instance.
- Returns
A string representation of the DeDuplicationDict instance.
- Return type
str
- to_dict() → dict
Convert the DeDuplicationDict instance to a regular dictionary.
- Returns
A regular dictionary with the same key-value pairs as the DeDuplicationDict instance.
- Return type
dict
- classmethod from_dict(d: dict) → DeDuplicationDict
Create a DeDuplicationDict instance from a regular dictionary.
- Parameters
d (dict) – The dictionary to create the DeDuplicationDict instance from.
- Returns
A new DeDuplicationDict instance with the same key-value pairs as the given dictionary.
- Return type
DeDuplicationDict
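The round trip between a regular nested dictionary and a deduplicated form can be sketched with two small recursive functions. This is a hypothetical illustration on plain dictionaries, not the library's implementation; the names `from_plain` and `to_plain`, the hashing scheme, and the hash length of 8 are assumptions.

```python
import hashlib
import json


def _hash(value):
    """Short hex digest of a JSON-serialisable value (assumed hashing scheme)."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode("utf-8")).hexdigest()[:8]


def from_plain(d, store):
    """Replace every leaf value with its hash, recursing into nested dicts."""
    out = {}
    for key, value in d.items():
        if isinstance(value, dict):
            out[key] = from_plain(value, store)  # nested dicts share the same store
        else:
            h = _hash(value)
            store.setdefault(h, value)
            out[key] = h
    return out


def to_plain(keys, store):
    """Rebuild the regular dictionary by resolving each hash through the store."""
    return {
        key: to_plain(value, store) if isinstance(value, dict) else store[value]
        for key, value in keys.items()
    }


original = {"x": "long text", "nested": {"y": "long text", "z": 42}}
store = {}
keys = from_plain(original, store)

print(len(store))                        # 2
print(to_plain(keys, store) == original)  # True
```

The repeated string is stored once even though it appears at two nesting levels, because both levels write into the same store.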
- _get_key_dict() → dict
Get the key dictionary of the DeDuplicationDict instance in a normal dictionary format.
- Returns
The key dictionary of the DeDuplicationDict instance.
- Return type
dict
- to_json_save_dict() → dict
Convert the DeDuplicationDict instance to a dictionary that can be saved to a JSON file.
- Returns
A dictionary that can be saved to a JSON file.
- Return type
dict
- _set_key_dict(key_dict: dict) → DeDuplicationDict
Set the key dictionary of the DeDuplicationDict instance from a normal dictionary format.
- Parameters
key_dict (dict) – The key dictionary to set.
- Returns
self
- Return type
DeDuplicationDict
- classmethod from_json_save_dict(d: dict, _v: dict = None) → DeDuplicationDict
Create a DeDuplicationDict instance from a dictionary that was saved to a JSON file.
- Parameters
d (dict) – The dictionary that was saved to a JSON file.
_v (dict) – Defaults to None.
- Returns
A new DeDuplicationDict instance with the same key-value pairs as the given dictionary.
- Return type
DeDuplicationDict
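A JSON round trip of a deduplicated structure might look like the sketch below. The save layout used here, one object with a "key_dict" entry and a "value_dict" entry, is an assumption for illustration and is not confirmed by this page; the `resolve` helper is likewise hypothetical.

```python
import json

# Deduplicated form: keys point at hashes, values are stored once.
key_dict = {"a": "h1", "nested": {"b": "h1", "c": "h2"}}
value_dict = {"h1": [1, 2, 3], "h2": "text"}

# Assumed save layout: both dictionaries side by side in one JSON object.
save_dict = {"key_dict": key_dict, "value_dict": value_dict}
text = json.dumps(save_dict)

loaded = json.loads(text)


def resolve(keys, values):
    """Rebuild the regular dictionary from the loaded key and value dictionaries."""
    return {
        key: resolve(value, values) if isinstance(value, dict) else values[value]
        for key, value in keys.items()
    }


restored = resolve(loaded["key_dict"], loaded["value_dict"])
print(restored)  # {'a': [1, 2, 3], 'nested': {'b': [1, 2, 3], 'c': 'text'}}
```

The serialized text contains the list `[1, 2, 3]` only once, although two keys refer to it.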