In Python, the operators in and not in test membership in lists, tuples, dictionaries, and so on.
This article describes the following contents.
- How to use the in operator
  - Basic usage
  - With if statement
  - in for the dictionary (dict)
  - in for the string (str)
- not in (negation of in)
- in for multiple elements
- Time complexity of in
  - Slow for the list: O(n)
  - Fast for the set: O(1)
  - For the dictionary
- in in for statements and list comprehensions
How to use the in operator in Python
Basic usage
x in y returns True if x is included in y, and False if it is not.

print(1 in [0, 1, 2])
# True

print(100 in [0, 1, 2])
# False
The in operator works not only with list but also with tuple, set, range, and other iterable objects.

print(1 in (0, 1, 2))
# True

print(1 in {0, 1, 2})
# True

print(1 in range(3))
# True
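For the built-in containers, "is included" means that some element is identical or equal (==) to the value, regardless of type. The following is a minimal sketch of that behavior; the variable names are only for illustration.

```python
# Membership in built-in containers is based on identity or equality (==),
# so an int matches an equal float, but not a string that looks like it.
int_in_floats = 1 in [1.0, 2.0]    # 1 == 1.0
float_in_set = 1.0 in {0, 1, 2}    # 1.0 == 1 and both hash the same
str_in_ints = '1' in [0, 1, 2]     # '1' != 1

print(int_in_floats)
# True

print(float_in_set)
# True

print(str_in_ints)
# False
```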
The dictionary (dict) and the string (str) are described later.
With if statement
in returns a bool value (True or False) and can be used directly in an if statement.

l = [0, 1, 2]
i = 0

if i in l:
    print('{} is a member of {}.'.format(i, l))
else:
    print('{} is not a member of {}.'.format(i, l))
# 0 is a member of [0, 1, 2].

l = [0, 1, 2]
i = 100

if i in l:
    print('{} is a member of {}.'.format(i, l))
else:
    print('{} is not a member of {}.'.format(i, l))
# 100 is not a member of [0, 1, 2].
Note that lists, tuples, strings, etc. are evaluated as False if they are empty, and as True if they are not. To check whether an object is empty, you can use the object itself as the condition.

l = [0, 1, 2]

if l:
    print('not empty')
else:
    print('empty')
# not empty

l = []

if l:
    print('not empty')
else:
    print('empty')
# empty
"in" for the dictionary (dict)
The in operation for the dictionary (dict) tests on the key.

d = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}

print('key1' in d)
# True

print('value1' in d)
# False

Use values() or items() if you want to test on values or key-value pairs.

print('value1' in d.values())
# True

print(('key1', 'value1') in d.items())
# True

print(('key1', 'value2') in d.items())
# False
"in" for the string (str)
The in operation for the string (str) tests the existence of a substring.

print('a' in 'abc')
# True

print('x' in 'abc')
# False

print('ab' in 'abc')
# True

print('ac' in 'abc')
# False
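Two details of the substring test are easy to overlook: the empty string is a substring of every string, and the test is case-sensitive. The sketch below illustrates both; lowercasing both sides is just one simple way to ignore case, not the only one.

```python
text = 'abc'

# The empty string is considered a substring of any string.
empty_found = '' in text
print(empty_found)
# True

# The test is case-sensitive.
upper_found = 'A' in text
print(upper_found)
# False

# One simple way to ignore case: lowercase both sides before testing.
ignore_case_found = 'A'.lower() in text.lower()
print(ignore_case_found)
# True
```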
not in (negation of "in")
x not in y returns the negation of x in y.

print(10 in [1, 2, 3])
# False

print(10 not in [1, 2, 3])
# True

The same result is returned by adding not to the entire in operation.

print(not 10 in [1, 2, 3])
# True
However, without parentheses, an expression with a leading not can be read in two ways, as shown below, so it is recommended to use the more explicit not in.

print(not (10 in [1, 2, 3]))
# True

print((not 10) in [1, 2, 3])
# False

Since in has a higher precedence than not, the expression is treated as the former if there are no parentheses.
The latter case is evaluated as follows.

print(not 10)
# False

print(False in [1, 2, 3])
# False
"in" for multiple elements
If you want to check whether multiple elements are included, using a list of those elements as follows will not work. It tests whether the list itself is included as a single element.

print([0, 1] in [0, 1, 2])
# False

print([0, 1] in [[0, 1], [1, 0]])
# True
Use and and or, or use sets.
Use and, or
Combine multiple in operations using and and or. This tests whether both or either are included.

l = [0, 1, 2]
v1 = 0
v2 = 100

print(v1 in l and v2 in l)
# False

print(v1 in l or v2 in l)
# True

print((v1 in l) or (v2 in l))
# True
Since in and not in have higher precedence than and and or, parentheses are not necessary. Of course, if it is difficult to read, you can enclose each test in parentheses as in the last example.
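When the values to check are themselves in a list, the built-in functions all() and any() with a generator expression give the same results as chaining and and or. This is a minimal sketch; the name targets is an assumption made for illustration.

```python
l = [0, 1, 2]
targets = [0, 100]  # hypothetical list of values to look for

# Equivalent to: 0 in l and 100 in l
contains_all = all(v in l for v in targets)
print(contains_all)
# False

# Equivalent to: 0 in l or 100 in l
contains_any = any(v in l for v in targets)
print(contains_any)
# True
```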
Use sets
If you have a lot of elements you want to check, it is easier to use sets than to chain and and or.
For example, whether list A contains all the elements of list B is equivalent to whether list B is a subset of list A.

l1 = [0, 1, 2, 3, 4]
l2 = [0, 1, 2]
l3 = [0, 1, 5]
l4 = [5, 6, 7]

print(set(l2) <= set(l1))
# True

print(set(l3) <= set(l1))
# False
Whether list A contains none of the elements of list B is equivalent to whether list A and list B are disjoint, that is, have no elements in common.

print(set(l1).isdisjoint(set(l4)))
# True

If list A and list B are not disjoint, it means that list A contains at least one element of list B.

print(not set(l1).isdisjoint(set(l3)))
# True
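The two set-based checks can be wrapped in small helper functions. The names contains_all() and contains_any() are hypothetical, and the sketch assumes that every element is hashable, since otherwise the conversion to a set fails.

```python
def contains_all(container, candidates):
    """Return True if every element of candidates is in container."""
    return set(candidates) <= set(container)


def contains_any(container, candidates):
    """Return True if at least one element of candidates is in container."""
    # isdisjoint() accepts any iterable, so candidates need not be a set.
    return not set(container).isdisjoint(candidates)


l1 = [0, 1, 2, 3, 4]
l2 = [0, 1, 2]
l3 = [0, 1, 5]
l4 = [5, 6, 7]

print(contains_all(l1, l2))
# True

print(contains_all(l1, l3))
# False

print(contains_any(l1, l3))
# True

print(contains_any(l1, l4))
# False
```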
Time complexity of "in"
The execution speed of the in operator depends on the type of the target object.
The measurement results of the execution time of in for lists, sets, and dictionaries are shown below.
Note that the code below uses the Jupyter Notebook magic command %%timeit and does not work when run as a Python script.
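If you want to take a similar measurement in a plain Python script, the timeit module of the standard library can be used instead. The following is a rough sketch; the number of repetitions (1000) is an arbitrary choice, and the measured time depends on the environment.

```python
import timeit

l_large = list(range(10000))
number = 1000  # arbitrary number of repetitions

# The statement is given as a string; the names it uses are passed via globals.
elapsed = timeit.timeit('-1 in l_large', globals={'l_large': l_large}, number=number)

# timeit.timeit() returns the total time in seconds for all repetitions.
per_loop = elapsed / number
print('{:.2f} µs per loop'.format(per_loop * 1e6))
```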
Take lists of 10 and 10000 elements as examples.

n_small = 10
n_large = 10000

l_small = list(range(n_small))
l_large = list(range(n_large))

The sample code below was executed in CPython 3.7.4, and of course, the results may vary depending on the environment.
Slow for the list: O(n)
The average time complexity of the in operator for lists is O(n). It becomes slower when there are many elements.

%%timeit
-1 in l_small
# 178 ns ± 4.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%%timeit
-1 in l_large
# 128 µs ± 11.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The execution time varies greatly depending on the position of the value to look for. It takes the longest time when the value is at the end or when it does not exist.

%%timeit
0 in l_large
# 33.4 ns ± 0.397 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
5000 in l_large
# 66.1 µs ± 4.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
9999 in l_large
# 127 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
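The dependence on position follows from the fact that a list is scanned from the beginning until a match is found. The function linear_contains() below is a hypothetical pure-Python model of that scan that also counts comparisons; it is not how CPython implements in, only an illustration of the same idea.

```python
def linear_contains(items, target):
    """Scan items from the front and return (found, number of comparisons)."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            return True, comparisons
    return False, comparisons


l_large = list(range(10000))

print(linear_contains(l_large, 0))
# (True, 1)

print(linear_contains(l_large, 5000))
# (True, 5001)

print(linear_contains(l_large, 9999))
# (True, 10000)

print(linear_contains(l_large, -1))
# (False, 10000)
```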
Fast for the set: O(1)
The average time complexity of the in operator for sets is O(1). It does not depend on the number of elements.

s_small = set(l_small)
s_large = set(l_large)

%%timeit
-1 in s_small
# 40.4 ns ± 0.572 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
-1 in s_large
# 39.4 ns ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

The execution time changes very little depending on the value to look for.

%%timeit
0 in s_large
# 39.7 ns ± 1.27 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
5000 in s_large
# 53.1 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
9999 in s_large
# 52.4 ns ± 0.403 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
If you want to repeat the in operation for a list with many elements, it is faster to convert it to a set in advance.

%%timeit
for i in range(n_large):
    i in l_large
# 643 ms ± 29.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
s_large_ = set(l_large)
for i in range(n_large):
    i in s_large_
# 746 µs ± 6.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Note that it takes time to convert a list to a set, so it may be faster to keep it as a list if the number of in operations is small.
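In practice, the conversion is typically done once, and the resulting set is reused for every membership test. The function filter_known() below is a hypothetical example of this pattern, assuming that all elements are hashable.

```python
def filter_known(values, allowed):
    """Return the elements of values that also appear in allowed."""
    # Convert once (cost proportional to len(allowed)), then reuse the set
    # so that each membership test takes O(1) time on average.
    allowed_set = set(allowed)
    return [v for v in values if v in allowed_set]


result = filter_known([3, 10, 7, 42], range(10))
print(result)
# [3, 7]
```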
For the dictionary
Take the following dictionary as an example.

d = dict(zip(l_large, l_large))
print(len(d))
# 10000

print(d[0])
# 0

print(d[9999])
# 9999

As mentioned above, the in operation for the dictionary tests on keys.
Dictionary keys are unique, like the elements of a set, and the execution time is about the same as for sets.

%%timeit
for i in range(n_large):
    i in d
# 756 µs ± 24.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
On the other hand, dictionary values are allowed to be duplicated like a list. The execution time of in for values() is about the same as for lists.

dv = d.values()

%%timeit
for i in range(n_large):
    i in dv
# 990 ms ± 28.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
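If you need to test many values, one option is to build a set from values() in advance, in the same way as for lists. The sketch below assumes that all values are hashable; note also that the set is a snapshot and, unlike the values() view, does not follow later changes to the dictionary.

```python
d = {'key1': 'value1', 'key2': 'value2', 'key3': 'value1'}

# Build the set once; duplicated values collapse into a single element.
value_set = set(d.values())

print('value1' in value_set)
# True

print('value9' in value_set)
# False

print(len(value_set))
# 2
```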
Key-value pairs are unique. The execution time of in for items() is about the same as for sets, plus a small overhead.

di = d.items()

%%timeit
for i in range(n_large):
    (i, i) in di
# 1.18 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
"in" in for statements and list comprehensions
The word in is also used in for statements and list comprehensions.

l = [0, 1, 2]

for i in l:
    print(i)
# 0
# 1
# 2

print([i * 10 for i in l])
# [0, 10, 20]
Note that the in operator may also be used as a conditional expression in list comprehensions, which can be confusing.

l = ['oneXXXaaa', 'twoXXXbbb', 'three999aaa', '000111222']

l_in = [s for s in l if 'XXX' in s]
print(l_in)
# ['oneXXXaaa', 'twoXXXbbb']

The first in is part of the list comprehension syntax, and the second in is the in operator.
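The not in operator can be used as a condition in the same way. The sketch below selects the strings that do not contain 'XXX' and also writes the same logic as an ordinary for loop, which makes the two roles of in easier to tell apart. The variable names are only for illustration.

```python
l = ['oneXXXaaa', 'twoXXXbbb', 'three999aaa', '000111222']

# "for s in l" is the loop syntax; "'XXX' not in s" is the membership test.
l_not_in = [s for s in l if 'XXX' not in s]
print(l_not_in)
# ['three999aaa', '000111222']

# The same logic written as a regular for loop.
l_not_in_loop = []
for s in l:
    if 'XXX' not in s:
        l_not_in_loop.append(s)

print(l_not_in == l_not_in_loop)
# True
```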