Python: Fastest way to check for a key in a dictionary

The problem

When you access a value by its key in a Python dictionary, your program fails with a KeyError if the key is not in the dictionary.
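
A minimal illustration of that failure, using a hypothetical prices dictionary that is not part of the benchmark in this post:

```python
prices = {"apple": 3, "pear": 5}  # hypothetical example data

print(prices["apple"])  # key present, prints 3

try:
    prices["plum"]  # "plum" is not a key, so this raises KeyError
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError for key:", missing_key)  # prints: KeyError for key: plum
```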

So it is necessary to check that a key is present before accessing it.

Here is the most obvious way to check it:

def k_in_d(d, k):
  return k in d.keys()

Is this the most efficient way?

It can be done better, without calling keys() at all:

def k_in_d(d, k):
  return k in d
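
Part of the difference between these two versions comes from what d.keys() returns. In Python 2 it builds a new list of keys, so the membership test scans that list; in Python 3 it returns a view backed by the dictionary itself. The sketch below assumes Python 3, and the variable names are illustrative only:

```python
d = {i: i for i in range(5)}

keys_view = d.keys()
print(type(keys_view).__name__)  # prints dict_keys on Python 3

# The view is live: it reflects later changes to the dictionary.
d[99] = 99
print(99 in keys_view)  # prints True
print(99 in d)          # prints True, same answer without creating the view
```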

But the following one can be faster, as the measurements below suggest:

def k_in_d(d, k):
  try:
    v = d[k]
  except KeyError:
    return False
  return True
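
Before comparing speed, it may be worth confirming that the three variants return the same answers. The sketch below gives them distinct names (borrowed from the benchmark further down, but taking d and k as parameters) and runs them on a hypothetical sample dictionary, including a key whose value is None:

```python
def k_in_d_keys(d, k):
    return k in d.keys()

def k_in_d_object(d, k):
    return k in d

def k_in_d_except(d, k):
    try:
        d[k]
    except KeyError:
        return False
    return True

sample = {"a": 1, "b": None}  # hypothetical data; "b" maps to None but is still a key

for key in ("a", "b", "zzz"):
    results = [f(sample, key) for f in (k_in_d_keys, k_in_d_object, k_in_d_except)]
    assert len(set(results)) == 1  # all three variants agree
    print(key, results[0])
```

All three test for the key itself, so a stored value of None does not affect the result.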

Performance checking

The following code measures the performance of each approach so that a conclusion can be drawn:

import datetime as dt
import random

# *********************************************************
# dict_size = 100
# num_iterations = 1000000
dict_size = 1000000
num_iterations = 100

d = {i: i for i in range(dict_size)}

# ---------------------------------------------------------
def k_in_d_object():
    k = random.randint(0,dict_size-1)
    return k in d
# ---------------------------------------------------------
def k_in_d_keys():
    k = random.randint(0,dict_size-1)
    return k in d.keys()
# ---------------------------------------------------------
def k_in_d_except():
    k = random.randint(0,dict_size-1)
    try:
        v = d[k]
    except KeyError:
        return False
    return True
# ---------------------------------------------------------
def test(func):
    t = dt.datetime.utcnow()
    for i in range(num_iterations):
        func()
    print ("%s --> %1.6f sec" % 
           (func, (dt.datetime.utcnow()-t).total_seconds()))

# *********************************************************
test(k_in_d_keys)
test(k_in_d_object)
test(k_in_d_except)
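
The loop above also times the call to random.randint and the function call itself, which may hide part of the difference between the lookups. As an alternative, here is a minimal sketch using the standard timeit module on the bare statements. It assumes Python 3.5 or later (for the globals argument); the dictionary size, the key 4321, and the repeat counts are arbitrary choices, not the settings used for the results in this post:

```python
import timeit

d = {i: i for i in range(100000)}
k = 4321  # a key that is present in d

statements = {
    "k in d.keys()": "k in d.keys()",
    "k in d": "k in d",
    "try/except": "try:\n    d[k]\nexcept KeyError:\n    pass",
}

timings = {}
for label, stmt in statements.items():
    # repeat() returns one total time per run; the minimum is the least noisy.
    runs = timeit.repeat(stmt, globals={"d": d, "k": k}, number=100000, repeat=3)
    timings[label] = min(runs)
    print("%-14s %.6f sec" % (label, timings[label]))
```

Absolute numbers depend on the machine and the Python version, so only the relative order within one run is meaningful.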

Statistics on different versions of Python

The code was tested on a 2015 computer with an i5 CPU.

Python 2.7.15

dict_size = 1000000, num_iterations = 100

<function k_in_d_keys at 0x10d7f50c8> --> 2.251373 sec
<function k_in_d_object at 0x10d7f5050> --> 0.000535 sec
<function k_in_d_except at 0x10d7f5140> --> 0.000242 sec

dict_size = 100, num_iterations = 1000000

<function k_in_d_keys at 0x1014fe0c8> --> 2.959747 sec
<function k_in_d_object at 0x1014fe050> --> 2.479454 sec
<function k_in_d_except at 0x1014fe140> --> 1.402201 sec

Python 3.7.2

dict_size = 1000000, num_iterations = 100

<function k_in_d_keys at 0x10ec81598> --> 0.000792 sec
<function k_in_d_object at 0x10ec81620> --> 0.001852 sec
<function k_in_d_except at 0x10ec81510> --> 0.000638 sec

dict_size = 100, num_iterations = 1000000

<function k_in_d_keys at 0x105209598> --> 1.761158 sec
<function k_in_d_object at 0x105209620> --> 2.526102 sec
<function k_in_d_except at 0x105209510> --> 1.677804 sec

Python 3.7.4

dict_size = 1000000, num_iterations = 100

<function k_in_d_keys at 0x102a94290> --> 0.002660 sec
<function k_in_d_object at 0x102a13e60> --> 0.002838 sec 
<function k_in_d_except at 0x102a94200> --> 0.000991 sec

dict_size = 100, num_iterations = 1000000

<function k_in_d_keys at 0x103cdb290> --> 1.739541 sec 
<function k_in_d_object at 0x103c5ae60> --> 1.649423 sec 
<function k_in_d_except at 0x103cdb200> --> 1.644620 sec

Conclusion

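Performance improved across Python versions: in Python 3, k in d.keys() is no longer far slower than the other two approaches on a large dictionary, as it was in Python 2.7.

In these measurements, the try/except version was the fastest of the three in every run, although in Python 3.7.4 the margin is small. Note that every key the benchmark looks up is present in the dictionary, so the except branch never executes and the cost of a missing key is not measured.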
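
The benchmark in this post only looks up keys that exist. Raising and catching an exception generally costs more than a successful lookup, so the picture may change when keys are often missing. The sketch below is one way to measure that case yourself; it assumes Python 3.5 or later, and the dictionary size, the absent key -1, and the repeat counts are arbitrary choices:

```python
import timeit

d = {i: i for i in range(1000)}
try_stmt = "try:\n    d[k]\nexcept KeyError:\n    pass"

def best(stmt, key):
    """Return the best of three timed runs of stmt with k bound to key."""
    return min(timeit.repeat(stmt, globals={"d": d, "k": key},
                             number=100000, repeat=3))

results = {
    "try/except, key present": best(try_stmt, 10),
    "try/except, key missing": best(try_stmt, -1),  # -1 is never a key here
    "k in d, key missing": best("k in d", -1),
}

for label, seconds in results.items():
    print("%-26s %.6f sec" % (label, seconds))
```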

References

Stack Overflow: Python is ‘key in dict’ different/faster than ‘key in dict.keys()’
