The problem
When you access a value by its key in a Python dictionary, your program fails with a KeyError if the key is not in the dictionary.
So it is necessary to check that the key is present before accessing it.
Here is the most obvious way to check:
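To make the failure concrete, here is a minimal sketch of an unguarded lookup. The `prices` dictionary and the `unguarded_lookup` helper are hypothetical names used only for illustration.

```python
# Hypothetical sample data, used only for illustration.
prices = {"apple": 1.2, "pear": 0.8}

def unguarded_lookup(d, k):
    """Access d[k] directly, with no check that k is present."""
    return d[k]

try:
    unguarded_lookup(prices, "banana")
    raised = False
except KeyError as exc:
    # Without this handler the program would stop here with a traceback.
    raised = True
    print("KeyError raised for key:", exc)
```

Without the surrounding try block, the lookup of the missing key would stop the program.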
def k_in_d(d, k): return k in d.keys()
Is this the most efficient way?
It can be done better:
def k_in_d(d, k): return k in d
But the following one is faster:
def k_in_d(d, k):
    try:
        v = d[k]
    except KeyError:
        return False
    return True
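Before comparing speed, it may help to confirm that the three variants return the same answer. The sketch below gives them distinct, hypothetical names (`in_keys`, `in_dict`, `in_except`) so they can coexist, and runs them on a small made-up dictionary.

```python
def in_keys(d, k):
    # Membership test on the keys object.
    return k in d.keys()

def in_dict(d, k):
    # Membership test on the dictionary itself.
    return k in d

def in_except(d, k):
    # Attempt the access and treat KeyError as "not present".
    try:
        d[k]
    except KeyError:
        return False
    return True

# Hypothetical sample: "b" maps to None but is still a present key.
sample = {"a": 1, "b": None}
for key in ("a", "b", "z"):
    results = {f.__name__: f(sample, key) for f in (in_keys, in_dict, in_except)}
    print(key, results)
```

All three agree on both present and absent keys, so the choice between them comes down to speed and readability.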
Performance checking
The following code measures the performance of each approach so that a conclusion can be drawn:
import datetime as dt
import random

# *********************************************************
# dict_size = 100
# num_iterations = 1000000
dict_size = 1000000
num_iterations = 100

d = {i: i for i in range(dict_size)}

# ---------------------------------------------------------
def k_in_d_object():
    k = random.randint(0, dict_size - 1)
    return k in d

# ---------------------------------------------------------
def k_in_d_keys():
    k = random.randint(0, dict_size - 1)
    return k in d.keys()

# ---------------------------------------------------------
def k_in_d_except():
    k = random.randint(0, dict_size - 1)
    try:
        v = d[k]
    except KeyError:
        return False
    return True

# ---------------------------------------------------------
def test(func):
    t = dt.datetime.utcnow()
    for i in range(num_iterations):
        func()
    print("%s --> %1.6f sec" % (func, (dt.datetime.utcnow() - t).total_seconds()))

# *********************************************************
test(k_in_d_keys)
test(k_in_d_object)
test(k_in_d_except)
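As an alternative, the same comparison can be sketched with the standard library timeit module. This is not the script that produced the numbers in this post. The sizes, the pre-generated `keys` list and the `check_*` function names are assumptions chosen for illustration. Generating the random keys up front keeps the cost of random.randint out of the timed loops.

```python
import random
import timeit

# Assumed sizes for a quick run; adjust as needed.
dict_size = 1000
num_iterations = 10000

d = {i: i for i in range(dict_size)}
# Keys are generated once, outside the timed code. All of them exist in d.
keys = [random.randint(0, dict_size - 1) for _ in range(num_iterations)]

def check_in_keys():
    for k in keys:
        k in d.keys()

def check_in_dict():
    for k in keys:
        k in d

def check_except():
    for k in keys:
        try:
            d[k]
        except KeyError:
            pass

timings = {}
for func in (check_in_keys, check_in_dict, check_except):
    # Take the best of 3 repeats, each running the function 10 times.
    timings[func.__name__] = min(timeit.repeat(func, number=10, repeat=3))
    print("%s --> %1.6f sec" % (func.__name__, timings[func.__name__]))
```

Taking the minimum of several repeats reduces the influence of other processes running on the machine.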
Statistics on different versions of Python
The code was tested on a 2015 computer with an i5 CPU.
Python 2.7.15
dict_size = 1000000 num_iterations = 100
<function k_in_d_keys at 0x10d7f50c8> --> 2.251373 sec
<function k_in_d_object at 0x10d7f5050> --> 0.000535 sec
<function k_in_d_except at 0x10d7f5140> --> 0.000242 sec
dict_size = 100 num_iterations = 1000000
<function k_in_d_keys at 0x1014fe0c8> --> 2.959747 sec
<function k_in_d_object at 0x1014fe050> --> 2.479454 sec
<function k_in_d_except at 0x1014fe140> --> 1.402201 sec
Python 3.7.2
dict_size = 1000000 num_iterations = 100
<function k_in_d_keys at 0x10ec81598> --> 0.000792 sec
<function k_in_d_object at 0x10ec81620> --> 0.001852 sec
<function k_in_d_except at 0x10ec81510> --> 0.000638 sec
dict_size = 100 num_iterations = 1000000
<function k_in_d_keys at 0x105209598> --> 1.761158 sec
<function k_in_d_object at 0x105209620> --> 2.526102 sec
<function k_in_d_except at 0x105209510> --> 1.677804 sec
Python 3.7.4
dict_size = 1000000 num_iterations = 100
<function k_in_d_keys at 0x102a94290> --> 0.002660 sec
<function k_in_d_object at 0x102a13e60> --> 0.002838 sec
<function k_in_d_except at 0x102a94200> --> 0.000991 sec
dict_size = 100 num_iterations = 1000000
<function k_in_d_keys at 0x103cdb290> --> 1.739541 sec
<function k_in_d_object at 0x103c5ae60> --> 1.649423 sec
<function k_in_d_except at 0x103cdb200> --> 1.644620 sec
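A likely explanation for the large k_in_d_keys time under Python 2.7 is that d.keys() there builds a new list of all keys on every call, so the membership test scans that list. In Python 3, d.keys() returns a view object backed by the dictionary itself. The short sketch below is for Python 3 only and uses a small made-up dictionary to show that the view reflects later changes to the dictionary.

```python
from collections.abc import KeysView

d = {i: i for i in range(5)}
view = d.keys()

print(type(view).__name__)         # name of the view type
print(isinstance(view, KeysView))  # the view is a KeysView, not a list

# The view is live: a key added afterwards is visible through it.
d[99] = 99
print(99 in view)
```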
Conclusion
Over time, the performance of Python 3 has improved.
In these measurements, where every looked-up key is present in the dictionary, using exceptions was the fastest way to check key existence.
References
Stack Overflow: Python is ‘key in dict’ different/faster than ‘key in dict.keys()’