Blog Post #78: More Set Operations: Difference and Symmetric Difference

In Post #77, we learned how to combine sets with union (|) and find commonalities with intersection (&). These are powerful tools, but what if we want to find the elements that are not shared between sets?

In this post, we’ll cover the two remaining key set operations: difference, to find elements unique to one set, and symmetric difference, to find elements that are in one set or the other, but not both.

Setting the Stage (Again)

Let’s continue with our example artist playlists from the previous post to keep things consistent.

local_artists = {"Taylor Swift", "Ed Sheeran", "Dua Lipa", "Harry Styles"}
global_artists = {"Ed Sheeran", "The Weeknd", "Harry Styles", "Adele"}

Difference: What’s in One Set but Not the Other?

The difference between set A and set B gives you a new set containing only the items that are in set A but not in set B.

Using the - Operator

The Python operator for set difference is the minus sign (-). Order matters here: A - B asks which items are in A but not in B, so swapping the operands generally gives a different result.

Let’s find the artists that are only on the local playlist, but not the global one.

local_only_artists = local_artists - global_artists
print(f"Artists only on the local playlist: {local_only_artists}")

The output will be: Artists only on the local playlist: {'Dua Lipa', 'Taylor Swift'}. Because sets are unordered, the names may print in a different order on your machine.

Now let’s flip the order to find the artists that are only on the global playlist.

global_only_artists = global_artists - local_artists
print(f"Artists only on the global playlist: {global_only_artists}")

The output is completely different: Artists only on the global playlist: {'Adele', 'The Weeknd'}.

Using the .difference() Method

As with the other operations, there is an equivalent method: local_artists.difference(global_artists). Given two sets, it returns the same result as the - operator.
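One practical distinction is worth knowing: the - operator requires both operands to be sets, while the .difference() method accepts any iterable, and more than one at a time. Here is a minimal sketch; the skipped_artists list is a hypothetical addition for illustration only.

```python
local_artists = {"Taylor Swift", "Ed Sheeran", "Dua Lipa", "Harry Styles"}
global_artists = {"Ed Sheeran", "The Weeknd", "Harry Styles", "Adele"}

# Hypothetical extra data, stored as a list rather than a set
skipped_artists = ["Dua Lipa"]

# The method and the operator agree when both operands are sets
method_result = local_artists.difference(global_artists)
operator_result = local_artists - global_artists
print(method_result == operator_result)  # True

# .difference() accepts several iterables at once, including a list
remaining = local_artists.difference(global_artists, skipped_artists)
print(remaining)  # {'Taylor Swift'}

# The operator, by contrast, rejects a list operand
try:
    local_artists - skipped_artists
except TypeError as error:
    print(f"TypeError: {error}")
```

Neither form modifies local_artists; both return a new set.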

Symmetric Difference: Items in One Set or the Other, but Not Both

The symmetric difference gives you a new set with all the items that are in either the first set or the second set, but not in both. Put another way, it is the union of the two sets with their intersection removed: everything except the common elements.
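To make that relationship concrete, here is a short sketch that builds the same result by hand from the union and intersection operators covered in Post #77, then compares it with Python's built-in symmetric difference. The variable names everything, shared, and not_shared are illustrative.

```python
local_artists = {"Taylor Swift", "Ed Sheeran", "Dua Lipa", "Harry Styles"}
global_artists = {"Ed Sheeran", "The Weeknd", "Harry Styles", "Adele"}

everything = local_artists | global_artists  # union: every artist on either list
shared = local_artists & global_artists      # intersection: artists on both lists

# Remove the shared artists from the union
not_shared = everything - shared

# This matches the built-in symmetric difference
print(not_shared == local_artists.symmetric_difference(global_artists))  # True
```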

Using the ^ (Caret) Operator

The operator for symmetric difference is the caret (^). Unlike with regular difference, the order of the sets does not matter for this operation (A ^ B is the same as B ^ A).

# Gets the artists that appear on exactly one of the two playlists
unique_to_each_list = local_artists ^ global_artists
print(f"Artists unique to one playlist or the other: {unique_to_each_list}")

The output is a combination of the two difference results from the previous section:

Artists unique to one playlist or the other: {'Adele', 'The Weeknd', 'Taylor Swift', 'Dua Lipa'}
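If you want to confirm this for yourself, the sketch below rebuilds the symmetric difference from the two one-way differences and also checks that swapping the operands changes nothing. The name combined is illustrative.

```python
local_artists = {"Taylor Swift", "Ed Sheeran", "Dua Lipa", "Harry Styles"}
global_artists = {"Ed Sheeran", "The Weeknd", "Harry Styles", "Adele"}

local_only = local_artists - global_artists
global_only = global_artists - local_artists

# Joining the two one-way differences gives the symmetric difference
combined = local_only | global_only
print(combined == local_artists ^ global_artists)  # True

# Unlike -, the ^ operator gives the same set in either order
print(local_artists ^ global_artists == global_artists ^ local_artists)  # True
```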

Using the .symmetric_difference() Method

The corresponding method for this operation is .symmetric_difference(). It can be called on either set and will produce the same result: local_artists.symmetric_difference(global_artists).
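As with .difference(), the method form is a little more flexible than the operator: it accepts any iterable, although only one at a time. Here is a small sketch, assuming a hypothetical festival_lineup list that is not part of the original playlists.

```python
local_artists = {"Taylor Swift", "Ed Sheeran", "Dua Lipa", "Harry Styles"}

# Hypothetical data for illustration, stored as a list rather than a set
festival_lineup = ["Adele", "Dua Lipa", "Coldplay"]

# The method converts the list for us; local_artists ^ festival_lineup
# would raise a TypeError because ^ requires two sets
on_one_only = local_artists.symmetric_difference(festival_lineup)

# sorted() gives a stable order for printing, since sets are unordered
print(sorted(on_one_only))
# ['Adele', 'Coldplay', 'Ed Sheeran', 'Harry Styles', 'Taylor Swift']
```

"Dua Lipa" is the only name on both lists, so it is the only one left out of the result.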

What’s Next?

You have now seen all four fundamental set operations: union (|), intersection (&), difference (-), and symmetric difference (^). Together, they let you compare and combine collections of unique data concisely and efficiently.

These concepts can sometimes be a bit abstract. A great way to solidify your understanding of how these four operations work is to visualize them. In Post #79, we will use Venn diagrams to create a simple, intuitive mental model for set operations.

Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge on Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com
