Addendum

Performance

bidict strives to be as performant as possible while being faithful to its purpose. The need for speed is balanced with the responsibility to protect users from shooting themselves in the foot.

In general, accomplishing some task using bidict should have about the same performance as keeping two inverse dicts in sync manually. The test suite includes benchmarks for common workloads to catch any performance regressions.

If you spot a case where bidict’s performance could be improved, please don’t hesitate to file an issue or submit a pull request.

Terminology

  • It’s intentional that the term “inverse” is used rather than “reverse”.

    Consider a collection of (k, v) pairs. Taking the reverse of the collection can only be done if it is ordered, and (as you’d expect) reverses the order of the pairs in the collection. But each original (k, v) pair remains in the resulting collection.

    By contrast, taking the inverse of such a collection neither requires the collection to be ordered nor guarantees any ordering in the result, but rather just replaces every (k, v) pair with the inverse pair (v, k).

  • “keys” and “values” could perhaps more properly be called “primary keys” and “secondary keys” (as in a database), or even “forward keys” and “inverse keys”, respectively. bidict sticks with the terms “keys” and “values” for the sake of familiarity and to avoid potential confusion, but technically values are also keys themselves.

    Concretely, this allows bidict to return a set-like (dict_keys) object for bidict.values (Python 3) / bidict.viewvalues() (Python 2.7), rather than a non-set-like dict_values object.

Missing bidicts in Stdlib!

The Python standard library actually contains some examples where bidicts could be used for fun and profit (depending on your ideas of fun and profit):

  • The logging module contains a private _levelToName dict which maps integer levels like 10 to their string names like DEBUG. If I had a nickel for every time I wanted that exposed in a bidirectional map (and as a public attribute, no less), I bet I could afford some better turns of phrase.
  • The dis module maintains a mapping from opnames to opcodes dis.opmap and a separate list of opnames indexed by opcode dis.opnames. These could be combined into a single bidict.
  • Python 3’s html.entities module / Python 2’s htmlentitydefs module maintains separate html.entities.name2codepoint and html.entities.codepoint2name dicts. These could be combined into a single bidict.

Caveats

Non-atomic Mutation

As with built-in dicts, mutating operations on a bidict are not atomic. If you need to mutate the same bidict from different threads, use a synchronization primitive to coordinate access. [1]

[1]See also: [2]

Bidicts Create Reference Cycles

As we’ve seen, a bidict b keeps a reference to its inverse b.inv, and its inverse bidict keeps a reference to it (b.inv.inv is b). So even when you no longer have any references to b, its refcount will not drop to zero because its inverse still has a reference to it. Reference cycles are also created in the doubly-linked list that backs an orderedbidict instance.

As long as your Python implementation’s garbage collector has not been disabled, it will detect and break these reference cycles so the memory allocated for a bidict can be reclaimed when you no longer have any references to it.

NOTE: Prior to CPython 3.4, __del__ methods prevented reference cycles from being garbage collected. No bidicts implement __del__, so this is only an issue if you implement __del__ in a bidict subclass.

Equivalence vs. Identity

Consider the following:

>>> d = dict([(1, 'int'), (1.0, 'float')])

How many items do you expect d to contain? If you expected d to be {1: ‘int’, 1.0: ‘float’}, the actual result might surprise you:

>>> d
{1: 'float'}

And similarly,

>>> dict([(1, 'int'), (1.0, 'float'), (1+0j, 'complex'), (True, 'bool')])
{1: 'bool'}
>>> set([True, 1+0j, 1.0, 1])  
{True}
>>> 1.0 in {True}
True

(Note that 1 == 1.0 == 1+0j == True.)

This illustrates that a mapping cannot contain two items with equivalent but distinct keys (and likewise a set cannot contain two equivalent but distinct elements). If an object being looked up in a set or mapping is equal to a contained object, the contained object will be found, even if it is distinct.

With bidict, since values function as keys in the inverse mapping, this behavior occurs on both sides:

>>> from bidict import bidict
>>> b = bidict({1: 1})
>>> b.inv[True]
1
>>> b[2] = True
Traceback (most recent call last):
   ...
ValueDuplicationError: 1
>>> b.put(True, 'true')
Traceback (most recent call last):
   ...
KeyDuplicationError: 1

nan As Key

In CPython, nan is especially tricky when used as a dictionary key:

>>> d = {float('nan'): 'nan'}
>>> d
{nan: 'nan'}
>>> d[float('nan')]  
Traceback (most recent call last):
    ...
KeyError: nan
>>> d[float('nan')] = 'not overwritten'
>>> d  
{nan: 'nan', nan: 'not overwritten'}

In other Python implementations such as PyPy, nan behaves just like any other dictionary key. But in CPython, beware of this unexpected behavior, which applies to bidicts too. bidict contains no special-case logic for dealing with nan as a key, so the behavior will match dict’s in the host environment.

Thanks

  • Thanks to Gregory Ewing for the name.
  • Thanks to Terry Reedy for suggesting the slice syntax (it was fun while it lasted).
  • Thanks to Raymond Hettinger for suggesting namedbidict and pointing out various caveats.
  • Thanks to Francis Carr for the idea of storing the inverse bidict.
  • Thanks to Brianna Laugher and Adopt Pytest Month for choosing bidict as one of Adopt Pytest Month 2015’s selected projects.
  • Thanks to Tom Viner for all the help as bidict’s mentor during Adopt Pytest Month 2015.
  • Thanks to the Pytest team and to David MacIver for the great testing tools.
  • Thanks to Daniel Pope for various suggestions enhancing bidict’s Python zen.
  • Thanks to David Turner for the generous code review.
  • Thanks to Michael Arntzenius for the valuable design discussion and feedback.
  • Thanks to Jozef Knaperek for the bugfix.