deduplicationdict
Package Contents
Classes
DeDuplicationDict | A dictionary that de-duplicates values.
Attributes
- class deduplicationdict.DeDuplicationDict(*args, _value_dict: dict = None, **kwargs)
Bases: collections.abc.MutableMapping
A dictionary that de-duplicates values.
A dictionary-like class that deduplicates values by storing them in a separate dictionary and replacing them with their corresponding hash values. This class is particularly useful for large dictionaries with repetitive entries, as it can save memory by storing values only once and substituting recurring values with their hash representations.
This class supports nested structures by automatically converting nested dictionaries into DeDuplicationDict instances. It also provides various conversion methods to convert between regular dictionaries and DeDuplicationDict instances.
- Variables:
hash_length (int) – The length of the hash value used for deduplication.
auto_clean_up (bool) – Whether to automatically clean up unused hash values when deleting items.
skip_update_on_setitem (bool) – Whether to skip updating the value dictionary when setting an item.
key_dict (dict) – A dictionary that maps keys to the hash values that stand in for their values.
value_dict (dict) – A dictionary that maps hash values to the values they represent.
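The idea described above (store each distinct value once and keep only a short hash per key) can be illustrated with a minimal sketch. `TinyDedupMap`, its attribute names, the SHA-256-over-JSON hashing, and the default of 8 hex characters are all assumptions made for illustration; this is not the library's implementation, and it handles neither nested dictionaries nor hash collisions.

```python
import hashlib
import json
from collections.abc import MutableMapping


class TinyDedupMap(MutableMapping):
    """Toy mapping: each key points to a short hash; each distinct value is stored once."""

    hash_length = 8  # hex characters kept from the digest (assumed default)

    def __init__(self):
        self._hash_by_key = {}    # key -> hash id
        self._value_by_hash = {}  # hash id -> value, stored once

    def _hash(self, value):
        # Serialize deterministically, then hash; assumes JSON-serializable values.
        payload = json.dumps(value, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[: self.hash_length]

    def __setitem__(self, key, value):
        hash_id = self._hash(value)
        # Keep the first copy of a value; later duplicates only reuse its hash.
        self._value_by_hash.setdefault(hash_id, value)
        self._hash_by_key[key] = hash_id

    def __getitem__(self, key):
        return self._value_by_hash[self._hash_by_key[key]]

    def __delitem__(self, key):
        del self._hash_by_key[key]  # raises KeyError for a missing key

    def __iter__(self):
        return iter(self._hash_by_key)

    def __len__(self):
        return len(self._hash_by_key)


m = TinyDedupMap()
m["a"] = [1, 2, 3]
m["b"] = [1, 2, 3]  # duplicate of the value under "a"
m["c"] = "other"
```

In this sketch the mapping reports three keys while the internal store holds only two values, which is where the memory saving for repetitive data comes from.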
- _set_value_dict(value_dict: dict, skip_update: bool = False) → DeDuplicationDict
Update the value dictionary and propagate the changes to nested DeDuplicationDict instances.
- Parameters:
value_dict (dict) – The value dictionary to set.
skip_update (bool) – Defaults to False.
- Returns:
self
- Return type:
DeDuplicationDict
- __setitem__(key: KT, value: VT) → None
Set the value for the given key, deduplicating the value if necessary.
- Parameters:
key (KT) – The key to set the value for.
value (VT) – The value to set for the given key.
- all_hashes_in_use() → set
Get all hash values currently in use.
- Returns:
A set of all hash values in use.
- Return type:
set
- clean_up() → DeDuplicationDict
Remove unused hash values from the value dictionary.
- Returns:
self
- Return type:
DeDuplicationDict
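How all_hashes_in_use() and clean_up() relate can be sketched on plain dictionaries. The layout below (a nested tree of key to hash id, plus one shared store of hash id to value) and the helper names `hashes_in_use` and `clean_up` are assumptions for illustration, not the library's code.

```python
def hashes_in_use(keys):
    """Collect every hash id referenced anywhere in a (possibly nested) key tree."""
    used = set()
    for ref in keys.values():
        if isinstance(ref, dict):
            used |= hashes_in_use(ref)  # descend into nested levels
        else:
            used.add(ref)
    return used


def clean_up(keys, store):
    """Drop store entries that no key references any more; returns the store."""
    used = hashes_in_use(keys)
    for hash_id in list(store):  # copy the keys so we can delete while looping
        if hash_id not in used:
            del store[hash_id]
    return store


keys = {"a": "h1", "nested": {"b": "h2"}}
store = {"h1": "alpha", "h2": "beta", "h3": "orphan"}
clean_up(keys, store)
```

After the call, the entry under "h3" is gone because no key at any nesting level refers to it.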
- detach() → DeDuplicationDict
Detach the DeDuplicationDict instance from its value dictionary, creating a standalone instance.
- Returns:
A new DeDuplicationDict instance with its own value dictionary.
- Return type:
DeDuplicationDict
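One way to picture detaching is to copy out only the values a given key tree actually uses, so the result no longer depends on the shared store. The function below is a hypothetical sketch on plain dictionaries (key tree of key to hash id, store of hash id to value); the real method may work differently.

```python
import copy


def detach(keys, shared_store):
    """Return a copy of the key tree plus a private store holding only the values it uses."""
    own_store = {}

    def walk(node):
        out = {}
        for key, ref in node.items():
            if isinstance(ref, dict):
                out[key] = walk(ref)  # nested level: copy recursively
            else:
                # Deep-copy so later changes to the shared value do not leak in.
                own_store[ref] = copy.deepcopy(shared_store[ref])
                out[key] = ref
        return out

    return walk(keys), own_store


shared = {"h1": [1], "h2": [2], "h3": [3]}
keys = {"a": "h1", "sub": {"b": "h2"}}
new_keys, own = detach(keys, shared)
```

Here `own` contains only the entries for "h1" and "h2", and the shared store is left untouched.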
- __deepcopy__(memo: dict) → DeDuplicationDict
Create a deep copy of the DeDuplicationDict instance.
- Parameters:
memo (dict) – A dictionary of memoized values.
- Returns:
A new DeDuplicationDict instance with its own value dictionary.
- Return type:
DeDuplicationDict
- _del_detach() → DeDuplicationDict
Detach the DeDuplicationDict instance from its value dictionary and clean up unused hash values.
- Returns:
self
- Return type:
DeDuplicationDict
- __delitem__(key: KT) → None
Delete the item with the given key.
- Parameters:
key (KT) – The key of the item to delete.
- Raises:
KeyError – If the key is not found in the dictionary.
- __len__() → int
Get the number of items in the dictionary.
- Returns:
The number of items in the dictionary.
- Return type:
int
- __iter__() → Iterator[T_co]
Get an iterator over the keys in the dictionary.
- Returns:
An iterator over the keys in the dictionary.
- Return type:
Iterator[T_co]
- __repr__() → str
Get a string representation of the DeDuplicationDict instance.
- Returns:
A string representation of the DeDuplicationDict instance.
- Return type:
str
- to_dict() → dict
Convert the DeDuplicationDict instance to a regular dictionary.
- Returns:
A regular dictionary with the same key-value pairs as the DeDuplicationDict instance.
- Return type:
dict
- classmethod from_dict(d: dict) → DeDuplicationDict
Create a DeDuplicationDict instance from a regular dictionary.
- Parameters:
d (dict) – The dictionary to create the DeDuplicationDict instance from.
- Returns:
A new DeDuplicationDict instance with the same key-value pairs as the given dictionary.
- Return type:
DeDuplicationDict
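The round trip between a regular dictionary and a deduplicated form, including nested dictionaries that share one store, can be sketched as follows. `encode` and `decode` are hypothetical helpers standing in for the from_dict() and to_dict() conversions; the hashing scheme (SHA-256 over sorted JSON, truncated to 8 characters) is an assumption.

```python
import hashlib
import json


def encode(d, store):
    """Replace leaf values with hash ids, recording each distinct value once in `store`."""
    keys = {}
    for key, value in d.items():
        if isinstance(value, dict):
            keys[key] = encode(value, store)  # nested dicts share the same store
        else:
            payload = json.dumps(value, sort_keys=True).encode("utf-8")
            hash_id = hashlib.sha256(payload).hexdigest()[:8]
            store.setdefault(hash_id, value)
            keys[key] = hash_id
    return keys


def decode(keys, store):
    """Rebuild a regular nested dictionary by looking each hash id up in `store`."""
    out = {}
    for key, ref in keys.items():
        out[key] = decode(ref, store) if isinstance(ref, dict) else store[ref]
    return out


original = {"x": [1, 2], "inner": {"y": [1, 2], "z": "text"}}
store = {}
keys = encode(original, store)
restored = decode(keys, store)
```

The value `[1, 2]` appears at two nesting levels but is stored once, and decoding reproduces the original dictionary.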
- _get_key_dict() → dict
Get the key dictionary of the DeDuplicationDict instance in a normal dictionary format.
- Returns:
The key dictionary of the DeDuplicationDict instance.
- Return type:
dict
- to_json_save_dict() → dict
Convert the DeDuplicationDict instance to a dictionary that can be saved to a JSON file.
- Returns:
A dictionary that can be saved to a JSON file.
- Return type:
dict
- _set_key_dict(key_dict: dict) → DeDuplicationDict
Set the key dictionary of the DeDuplicationDict instance from a normal dictionary format.
- Parameters:
key_dict (dict) – The key dictionary to set.
- Returns:
self
- Return type:
DeDuplicationDict
- classmethod from_json_save_dict(d: dict, _v: dict = None) → DeDuplicationDict
Create a DeDuplicationDict instance from a dictionary that was saved to a JSON file.
- Parameters:
d (dict) – The dictionary that was saved to a JSON file.
_v (dict) – Defaults to None.
- Returns:
A new DeDuplicationDict instance with the same key-value pairs as the given dictionary.
- Return type:
DeDuplicationDict
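Saving in deduplicated form keeps the file small because repeated values are written once. The sketch below shows the general idea with the standard `json` module; the two-entry layout (`"keys"` and `"store"`) and the helper names are assumptions for illustration and may not match the structure that to_json_save_dict() actually produces.

```python
import json


def to_json_text(keys, store):
    """Serialize a key tree and its value store together (layout is assumed)."""
    return json.dumps({"keys": keys, "store": store}, sort_keys=True)


def from_json_text(text):
    """Load the key tree and value store back from JSON text."""
    payload = json.loads(text)
    return payload["keys"], payload["store"]


keys = {"a": "h1", "b": "h1", "sub": {"c": "h2"}}
store = {"h1": [1, 2, 3], "h2": "text"}

text = to_json_text(keys, store)
loaded_keys, loaded_store = from_json_text(text)
```

The list `[1, 2, 3]` is referenced by two keys but appears only once in the serialized text.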