About this article
In this article, I’ll explore the set, one of the four built-in types used to store collections in Python.
This post is the third article in a miniseries exploring the built-in collection types in Python. I based the series on the notes I took while studying for a Python technical interview.
For quick access, here is the list of posts in this series:
- Python Lists
- Python Tuples
- Python Sets (this article)
- Python Dictionaries
Introduction to sets
In Python, a set is a data type used for storing a collection of elements. Sets have the following characteristics:
Sets are unordered
Elements in a set don’t have a defined order. As a result, you shouldn’t rely on the order in which the members appear when you iterate over or print a set, and you can’t refer to them by index or key.
Set elements must be unique
Duplicate elements are not allowed within a set.
Sets are mutable
The set itself is mutable. Therefore, you can add and remove elements from a set.
Set elements must be immutable
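Set members, on the other hand, must be hashable, which in practice means immutable. Therefore, allowed data types for set members include booleans, integers, floats, strings, frozensets, and tuples whose items are themselves hashable. Mutable types such as lists, dicts, and sets can’t be set members.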
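The characteristics above can be sketched in a few lines. This is a minimal illustration; the variable names are made up for the example.

```python
# Duplicates collapse into a single element.
numbers = {1, 2, 2, 3, 3, 3}
print(len(numbers))  # 3

# The set itself is mutable: elements can be added and removed.
numbers.add(4)
numbers.remove(1)
print(sorted(numbers))  # [2, 3, 4]

# Mutable (unhashable) objects such as lists are rejected as members.
try:
    invalid = {[1, 2], 3}
except TypeError as error:
    rejected = True
    print(error)
else:
    rejected = False
```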
Creating a set
You can create a set using a literal by wrapping its elements in curly braces:
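For instance, a set literal might look like the sketch below (the element values are arbitrary):

```python
fruits = {"apple", "banana", "cherry"}
print(type(fruits))  # <class 'set'>

# Values repeated in the literal are stored only once.
letters = {"a", "b", "a"}
print(len(letters))  # 2
```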
You can also use the set() constructor by passing any iterable:
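As a small sketch, here is the constructor applied to a list, a string, and a range (all illustrative inputs):

```python
from_list = set([1, 2, 2, 3])   # duplicates are dropped
from_string = set("hello")      # one element per distinct character
from_range = set(range(3))

print(sorted(from_list))    # [1, 2, 3]
print(sorted(from_string))  # ['e', 'h', 'l', 'o']
print(sorted(from_range))   # [0, 1, 2]
```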
You can create an empty set only by using the set() constructor. Empty curly braces, {}, create an empty dict instead.
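A quick check of the two spellings shows the difference:

```python
empty_set = set()
empty_braces = {}

print(type(empty_set))     # <class 'set'>
print(type(empty_braces))  # <class 'dict'>
```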
Accessing elements from a set
You can access the set elements by iterating over them. This operation has a linear O(n) time complexity:
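A minimal sketch of iterating over a set. Because the iteration order is not guaranteed, the example sorts the elements when a stable order is needed:

```python
colors = {"red", "green", "blue"}

# The order in which elements are printed may vary.
for color in colors:
    print(color)

# sorted() returns a list with a predictable order.
ordered = sorted(colors)
print(ordered)  # ['blue', 'green', 'red']
```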
You can also test membership using the in / not in operators. This operation has a constant O(1) time complexity on average:
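For example, with an illustrative set of integers:

```python
primes = {2, 3, 5, 7}

has_five = 5 in primes
missing_four = 4 not in primes

print(has_five)      # True
print(missing_four)  # True
```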
Operators vs. Methods
You can perform many set operations in Python using either an operator or a method. Let’s look at an example to understand the subtle difference between them.
Performing a union using the | operator or the .union() method works the same way if both objects used in the operation are sets:
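A short sketch of both spellings with two sets:

```python
set_1 = {1, 2, 3}
set_2 = {3, 4, 5}

with_operator = set_1 | set_2
with_method = set_1.union(set_2)

print(sorted(with_operator))         # [1, 2, 3, 4, 5]
print(with_operator == with_method)  # True
```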
However, if you perform the union between a set and another iterable, such as a list, only the .union() method will work. The | operator requires both operands to be sets and raises a TypeError otherwise.
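The sketch below combines a set with a list. The method accepts the list, while the operator raises a TypeError:

```python
set_1 = {1, 2, 3}
list_1 = [3, 4, 5]

# The method accepts any iterable.
merged = set_1.union(list_1)
print(sorted(merged))  # [1, 2, 3, 4, 5]

# The operator only works between sets.
try:
    set_1 | list_1
except TypeError as error:
    operator_failed = True
    print(error)
else:
    operator_failed = False
```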
Set built-in methods
set_1.add(x)
This method adds x to set_1. This method has a constant time complexity of O(1).
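A minimal example; adding an element that is already present leaves the set unchanged:

```python
set_1 = {1, 2}
set_1.add(3)
set_1.add(2)  # already a member, so nothing changes

print(sorted(set_1))  # [1, 2, 3]
```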
set_1.clear()
This method removes all elements from set_1. This method has a linear time complexity of O(n), where n is the length of set_1.
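For example:

```python
set_1 = {1, 2, 3}
set_1.clear()

print(set_1)       # set()
print(len(set_1))  # 0
```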
set_1.copy()
This method returns a shallow copy of set_1. This method has a linear time complexity of O(n), where n is the length of set_1.
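A small sketch showing that changes to the copy don't affect the original set:

```python
set_1 = {1, 2, 3}
set_copy = set_1.copy()
set_copy.add(4)

print(sorted(set_1))     # [1, 2, 3]
print(sorted(set_copy))  # [1, 2, 3, 4]
```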
set_1.difference(set_2, …)
This method returns a new set containing the elements of set_1 that are not present in any of the sets in the arguments. You can also use the difference - operator. This method has a linear time complexity of O(n), where n is the length of set_1.
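As an illustration with three small sets:

```python
set_1 = {1, 2, 3, 4, 5}
set_2 = {2, 3}
set_3 = {5}

result_method = set_1.difference(set_2, set_3)
result_operator = set_1 - set_2 - set_3

print(sorted(result_method))             # [1, 4]
print(result_method == result_operator)  # True
print(sorted(set_1))                     # [1, 2, 3, 4, 5] (unchanged)
```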
set_1.difference_update(set_2, …)
This method removes the items in set_1 that are members of the sets in the arguments. This method has a linear time complexity of O(n), where n is the combined length of the sets in the arguments.
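A minimal sketch; note that the method changes set_1 in place and returns None:

```python
set_1 = {1, 2, 3, 4, 5}
return_value = set_1.difference_update({2, 3}, {5})

print(sorted(set_1))  # [1, 4]
print(return_value)   # None
```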
set_1.discard(x)
This method removes x from set_1 if it is present. If x is not a member, the method does nothing and raises no error. This method has a constant time complexity of O(1).
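For example, discarding a missing element leaves the set unchanged:

```python
set_1 = {1, 2, 3}
set_1.discard(2)
set_1.discard(99)  # 99 is not a member; no exception is raised

print(sorted(set_1))  # [1, 3]
```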
set_1.intersection(set_2, …)
This method returns a new set that is the intersection of set_1 and all the sets in the arguments. You can also use the intersection & operator. For two sets, this operation has an average time complexity of O(min(len(set_1), len(set_2))). For several sets, the time complexity is (n-1) * O(l), where n is the number of sets involved and l is the max length of all the sets.
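A short sketch with three illustrative sets:

```python
set_1 = {1, 2, 3, 4}
set_2 = {2, 3, 4, 5}
set_3 = {3, 4, 6}

result_method = set_1.intersection(set_2, set_3)
result_operator = set_1 & set_2 & set_3

print(sorted(result_method))             # [3, 4]
print(result_method == result_operator)  # True
```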
set_1.intersection_update(set_2, …)
This method removes the members of set_1 that are not present in every set passed in the arguments. This operation has a time complexity of (n-1) * O(l), where n is the number of sets involved and l is the max length of all the sets.
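For instance, in this sketch set_1 keeps only the elements shared with both arguments:

```python
set_1 = {1, 2, 3, 4}
set_1.intersection_update({2, 3, 4, 5}, {3, 4, 6})

print(sorted(set_1))  # [3, 4]
```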
set_1.isdisjoint(set_2)
This method returns True if the sets have no elements in common. This operation has a linear time complexity of O(n), where n is the length of the smaller set in the comparison.
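A minimal example with one disjoint pair and one overlapping pair:

```python
evens = {2, 4, 6}
odds = {1, 3, 5}
small = {1, 2}

no_overlap = evens.isdisjoint(odds)
has_overlap = evens.isdisjoint(small)

print(no_overlap)   # True
print(has_overlap)  # False (2 is in both sets)
```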
set_1.issubset(set_2)
This method returns True if all the members of set_1 are members of set_2. You can also use the <= operator. This operation has a linear time complexity of O(n), where n is the length of set_1.
set_1.issuperset(set_2)
This method returns True if all the members of set_2 are members of set_1. You can also use the >= operator. This operation has a linear time complexity of O(n), where n is the length of set_2.
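The subset and superset checks mirror each other, as this small sketch shows:

```python
set_1 = {1, 2}
set_2 = {1, 2, 3}

subset_method = set_1.issubset(set_2)
subset_operator = set_1 <= set_2
superset_method = set_2.issuperset(set_1)
superset_operator = set_2 >= set_1
reverse_check = set_2.issubset(set_1)

print(subset_method, subset_operator)      # True True
print(superset_method, superset_operator)  # True True
print(reverse_check)                       # False
```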
set_1.pop()
This method removes an arbitrary member from set_1 and returns it. The choice is not random; you simply can’t control which element is removed. You get a KeyError when popping from an empty set. This method has a constant time complexity of O(1).
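Since you can't predict which element pop() removes, the sketch below only checks that the popped value came from the set:

```python
original = {"a", "b", "c"}
set_1 = original.copy()

popped = set_1.pop()
print(popped in original)  # True
print(len(set_1))          # 2

# Popping from an empty set raises KeyError.
empty = set()
try:
    empty.pop()
except KeyError:
    pop_failed = True
else:
    pop_failed = False
print(pop_failed)  # True
```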
set_1.remove(x)
This method removes x from set_1. Unlike the discard method, remove raises a KeyError if x is not in set_1. This method has a constant time complexity of O(1).
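A minimal example of both outcomes:

```python
set_1 = {1, 2, 3}
set_1.remove(2)
print(sorted(set_1))  # [1, 3]

# Removing a missing element raises KeyError.
try:
    set_1.remove(99)
except KeyError:
    remove_failed = True
else:
    remove_failed = False
print(remove_failed)  # True
```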
set_1.symmetric_difference(set_2)
This method returns a new set with the symmetric difference of set_1 and set_2, that is, the elements that belong to exactly one of the two sets. Unlike union or intersection, it accepts a single argument. You can also use the symmetric difference ^ operator. On average, this operation has a linear time complexity of O(len(set_1) + len(set_2)).
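A short sketch with two sets that share one element:

```python
set_1 = {1, 2, 3}
set_2 = {3, 4, 5}

result_method = set_1.symmetric_difference(set_2)
result_operator = set_1 ^ set_2

print(sorted(result_method))             # [1, 2, 4, 5]
print(result_method == result_operator)  # True
```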
set_1.symmetric_difference_update(set_2)
This method updates set_1 in place with the symmetric difference of set_1 and set_2. It accepts a single argument. On average, this operation has a linear time complexity of O(n), where n is the length of set_2.
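For example, with two sets that share one element:

```python
set_1 = {1, 2, 3}
set_2 = {3, 4, 5}

set_1.symmetric_difference_update(set_2)

print(sorted(set_1))  # [1, 2, 4, 5]
print(sorted(set_2))  # [3, 4, 5] (unchanged)
```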
set_1.union(set_2, …)
This method returns a new set containing the union of set_1 and all the sets in the arguments. You can also use the union | operator. This operation has a linear time complexity of O(n), where n is the combined length of set_1 and all the sets in the arguments.
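A small sketch passing several arguments at once:

```python
set_1 = {1, 2}
set_2 = {2, 3}
set_3 = {4}

result_method = set_1.union(set_2, set_3)
result_operator = set_1 | set_2 | set_3

print(sorted(result_method))             # [1, 2, 3, 4]
print(result_method == result_operator)  # True
print(sorted(set_1))                     # [1, 2] (unchanged)
```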
set_1.update(set_2, …)
This method updates set_1 in place with the union of set_1 and all the sets in the arguments. This operation has a linear time complexity of O(n), where n is the combined length of all the sets in the arguments.
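A minimal example; set_1 is modified in place and the method returns None:

```python
set_1 = {1, 2}
return_value = set_1.update({2, 3}, {4})

print(sorted(set_1))  # [1, 2, 3, 4]
print(return_value)   # None
```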
The frozenset
Python includes a frozenset type. Frozensets are immutable sets. Therefore, frozensets share all methods and operators from sets, except those that add/remove elements.
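The sketch below shows a frozenset supporting a read-only operation, rejecting a mutation, and being stored inside a regular set. The variable names are illustrative.

```python
frozen = frozenset([1, 2, 3])

# Operations that build a new set still work.
combined = frozen.union({4})
print(sorted(combined))  # [1, 2, 3, 4]

# Mutating methods such as add() don't exist on frozensets.
try:
    frozen.add(4)
except AttributeError:
    add_failed = True
else:
    add_failed = False
print(add_failed)  # True

# Frozensets are hashable, so they can be members of another set.
nested = {frozen, frozenset([4])}
print(len(nested))  # 2
```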
When to use sets?
Sets in Python share the attributes and behaviors of sets in the mathematics domain. Therefore, it makes sense to use sets instead of lists, tuples, or dicts, when trying to model mathematical sets.
Conclusion
This article covered Python sets, a built-in data type that models sets in the mathematics domain. We reviewed sets’ main characteristics, built-in methods and operators, and time complexity.
You should have enough familiarity with sets to have them in your coding tool belt and use them in future projects.
The next article explores Python dictionaries. See you there!