Mastering Sets In Terraform

In today’s blog post, we will learn about sets in Terraform: how to define a set, how to compute the union, intersection, and difference of two sets, how to check set membership, how to use for expressions with sets, and best practices for using sets. So without any further delay, let us get started.

Why Set Manipulation is Required

In many cloud infrastructure scenarios, you will encounter requirements where the order of elements does not matter, but their uniqueness does. Here are common situations where sets are used in Terraform:

  • Managing Unique Identifiers: Ensuring a list of IDs, names, or keys contains no duplicates.
  • Defining Permissions or Policies: Specifying a unique collection of actions, resources, or principals in IAM policies where duplicate entries are redundant or problematic.
  • Security Group Rules: Handling a collection of source CIDR blocks or target ports where the order of rules does not matter, but each entry must be unique.
  • Tagging: Aggregating unique tag values across multiple resources.
  • Dependency Management: Creating distinct sets of module outputs or resource attributes for downstream consumption.

Using sets ensures that your configuration reflects the desired state without duplicate or conflicting definitions, which can lead to clearer plans and more reliable deployments.

Defining Sets

Terraform sets are typically created either implicitly through functions or expressions that enforce uniqueness and discard order, or explicitly by declaring a set type for a variable.

Implicit Set Creation

The most common way to get a set is by using functions that inherently produce unique, unordered collections or by transforming lists.

# main.tf

# Define a list with duplicate elements
variable "raw_tags" {
  description = "A list of tags, potentially with duplicates."
  type        = list(string)
  default     = ["environment:dev", "owner:teamA", "environment:dev", "cost_center:X"]
}

# Use distinct to get unique elements (result is a list, but conceptually a set)
output "unique_tags_list" {
  description = "A list of unique tags, in original order."
  value       = distinct(var.raw_tags)
  # Outcome: ["environment:dev", "owner:teamA", "cost_center:X"]
}

# Convert a list to a true set type using toset()
output "unique_tags_set" {
  description = "A set of unique tags."
  value       = toset(var.raw_tags)
  # Outcome: toset(["environment:dev", "owner:teamA", "cost_center:X"]) (order not guaranteed)
}

Explanation of the code:

  • The distinct function removes duplicates from a list, returning a new list. While the output is a list, the concept of uniqueness is applied.
  • The toset function explicitly converts a list (or tuple) into a set type, which enforces uniqueness and disregards order. All elements are converted to a single common type in the process. This is the canonical way to ensure a set.
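
Because toset() discards duplicates, comparing lengths before and after the conversion is one way to detect repeated entries. The following is a minimal sketch; the variable name raw_tags_checked is hypothetical and not part of the original configuration.

```hcl
# Hypothetical variable: rejects input that contains duplicate tags
# instead of silently de-duplicating it.
variable "raw_tags_checked" {
  description = "A list of tags that must not contain duplicates."
  type        = list(string)
  default     = ["environment:dev", "owner:teamA", "cost_center:X"]

  validation {
    # toset() drops duplicates, so the two lengths differ only when a value repeats.
    condition     = length(var.raw_tags_checked) == length(toset(var.raw_tags_checked))
    error_message = "The tag list must not contain duplicate entries."
  }
}
```

This pattern suits cases where a duplicate probably signals a mistake in the caller's input and should fail the plan rather than be ignored.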

Explicit Set Definition

You can define variables with the set type, which will automatically enforce uniqueness and disregard order for any values assigned to them.

# main.tf

# Define a variable explicitly as a set of strings
variable "allowed_ip_cidrs" {
  description = "A set of allowed IP CIDR blocks for a security group."
  type        = set(string)
  default     = ["192.168.1.0/24", "10.0.0.0/8", "192.168.1.0/24"] # Duplicate is ignored
}

output "final_allowed_cidrs" {
  description = "The resulting set of unique CIDRs."
  value       = var.allowed_ip_cidrs
  # Outcome: toset(["192.168.1.0/24", "10.0.0.0/8"]) (order not guaranteed)
}

Explanation of the code:

  • The allowed_ip_cidrs variable is declared as a set(string). When a default value or any assigned value contains duplicates, Terraform automatically de-duplicates them.
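
A set-typed variable can also carry a validation block that checks every element. The sketch below assumes a hypothetical variable named allowed_ip_cidrs_validated and uses the built-in can(), cidrhost(), and alltrue() functions to reject malformed CIDR blocks.

```hcl
# Hypothetical variable: a set of CIDR blocks with per-element validation.
variable "allowed_ip_cidrs_validated" {
  description = "A set of allowed IP CIDR blocks; each entry must parse as a CIDR."
  type        = set(string)
  default     = ["192.168.1.0/24", "10.0.0.0/8"]

  validation {
    # cidrhost() fails on an invalid prefix, and can() turns that failure into false.
    condition = alltrue([
      for cidr in var.allowed_ip_cidrs_validated : can(cidrhost(cidr, 0))
    ])
    error_message = "Every element must be a valid CIDR block, such as 10.0.0.0/8."
  }
}
```

A caller may still pass a plain list for this variable; Terraform converts it to set(string) and removes duplicates before the validation runs.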

Core Set Operations

Terraform provides specific functions for common set operations: union, intersection, and difference. These functions work by treating their inputs as mathematical sets.

Union of Sets

The setunion function returns a set containing all unique elements that are present in any of the input sets.

# main.tf

variable "dev_env_features" {
  type    = set(string)
  default = ["featureA", "featureB"]
}

variable "prod_env_features" {
  type    = set(string)
  default = ["featureB", "featureC"]
}

output "all_deployed_features" {
  description = "Union of features from dev and prod environments."
  value       = setunion(var.dev_env_features, var.prod_env_features)
  # Outcome: toset(["featureA", "featureB", "featureC"]) (order not guaranteed)
}

Explanation of the code:

  • setunion is useful for combining distinct sets of features, permissions, or any unique identifiers from multiple sources.
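
setunion is not limited to two inputs, and it accepts list values as well, converting them to sets first. The sketch below merges three hypothetical CIDR sources (the local names are illustrative assumptions).

```hcl
locals {
  # Hypothetical CIDR sources; names and values are for illustration only.
  base_cidrs   = toset(["10.0.0.0/8"])
  office_cidrs = toset(["192.168.1.0/24", "10.0.0.0/8"])
  vpn_cidrs    = ["172.16.0.0/12"] # a plain list is converted to a set automatically
}

output "all_ingress_cidrs" {
  description = "Union of every CIDR source, with duplicates removed."
  value       = setunion(local.base_cidrs, local.office_cidrs, local.vpn_cidrs)
  # Contains "10.0.0.0/8", "172.16.0.0/12", and "192.168.1.0/24" (order not guaranteed)
}
```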

Intersection of Sets

The setintersection function returns a set containing only the unique elements that are common to all input sets.

# main.tf

output "common_features" {
  description = "Features present in both dev and prod environments."
  value       = setintersection(var.dev_env_features, var.prod_env_features)
  # Outcome: toset(["featureB"]) (order not guaranteed)
}

Explanation of the code:

  • setintersection helps identify shared elements across different sets, such as features that are common to all environments or users who belong to multiple groups.
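
A common follow-up is to ask whether two sets overlap at all, which can be done by checking the length of the intersection. The sketch below uses hypothetical user sets to flag anyone assigned to two roles that are meant to be mutually exclusive.

```hcl
locals {
  # Hypothetical role assignments, for illustration only.
  admin_users    = toset(["alice", "bob"])
  readonly_users = toset(["carol", "bob"])

  # Users that appear in both sets.
  conflicting_users = setintersection(local.admin_users, local.readonly_users)
}

output "has_role_conflict" {
  description = "True when at least one user holds both roles."
  value       = length(local.conflicting_users) > 0
  # Outcome: true, because "bob" is in both sets
}
```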

Difference Between Sets

Terraform has no function named setdifference; the built-in function for set difference is setsubtract. It takes exactly two arguments and returns a set containing the elements of the first set that are not present in the second.

# main.tf

output "features_only_in_dev" {
  description = "Features present in dev but not in prod."
  value       = setsubtract(var.dev_env_features, var.prod_env_features)
  # Outcome: toset(["featureA"]) (order not guaranteed)
}

output "features_only_in_prod" {
  description = "Features present in prod but not in dev."
  value       = setsubtract(var.prod_env_features, var.dev_env_features)
  # Outcome: toset(["featureC"]) (order not guaranteed)
}

Explanation of the code:

  • setsubtract is useful for finding elements that are unique to one specific set, for example, identifying permissions specific to a test environment or resources that need to be removed from a target state. The argument order matters: swapping the two sets gives a different result.
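
Terraform has no dedicated function for the symmetric difference (elements that appear in exactly one of two sets), but it can be sketched by combining the built-in setsubtract and setunion functions. The local names below are hypothetical.

```hcl
locals {
  # Hypothetical feature sets, mirroring the dev/prod example.
  dev_features  = toset(["featureA", "featureB"])
  prod_features = toset(["featureB", "featureC"])

  # Elements in exactly one of the two sets: (dev - prod) union (prod - dev).
  feature_drift = setunion(
    setsubtract(local.dev_features, local.prod_features),
    setsubtract(local.prod_features, local.dev_features),
  )
}

output "environment_drift" {
  description = "Features that exist in only one environment."
  value       = local.feature_drift
  # Contains "featureA" and "featureC" (order not guaranteed)
}
```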

Checking Set Membership

The contains function can be used to check if a set (or list) includes a specific element.

# main.tf

variable "admin_roles" {
  type    = set(string)
  default = ["sysadmin", "networkadmin"]
}

output "is_sysadmin_admin" {
  description = "Checks if 'sysadmin' is an admin role."
  value       = contains(var.admin_roles, "sysadmin")
  # Outcome: true
}

output "is_dev_admin" {
  description = "Checks if 'developer' is an admin role."
  value       = contains(var.admin_roles, "developer")
  # Outcome: false
}

Explanation of the code:

  • contains provides a straightforward boolean check to see if a particular element is part of a given set, useful for conditional logic.
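
As one possible illustration of that conditional logic, the sketch below picks a shorter session length when the requested role is in a protected set. The variable, local names, and hour values are hypothetical.

```hcl
# Hypothetical input: the role a user is asking for.
variable "requested_role" {
  type    = string
  default = "sysadmin"
}

locals {
  # Hypothetical set of roles that warrant stricter limits.
  protected_roles = toset(["sysadmin", "networkadmin"])

  # contains() returns a bool, so it can drive a conditional expression directly.
  session_hours = contains(local.protected_roles, var.requested_role) ? 1 : 8
}

output "session_hours" {
  description = "Maximum session length for the requested role."
  value       = local.session_hours
  # Outcome: 1 with the default value, because "sysadmin" is a protected role
}
```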

Transforming with For Expressions

While sets are unordered, you can iterate over them using for expressions to produce new lists or maps, or even to build new sets with specific transformations or filtering.

# main.tf

variable "policy_actions" {
  type    = set(string)
  default = ["s3:GetObject", "s3:PutObject", "ec2:RunInstances"]
}

output "formatted_iam_actions" {
  description = "List of actions with an illustrative ARN-style prefix (not a real IAM ARN format)."
  value       = [for action in var.policy_actions : "arn:aws:iam:::action/${action}"]
  # Outcome: ["arn:aws:iam:::action/ec2:RunInstances", "arn:aws:iam:::action/s3:GetObject", "arn:aws:iam:::action/s3:PutObject"]
  # (a for expression visits a set of strings in lexical order; do not depend on any other ordering)
}

output "s3_actions_set" {
  description = "Set of S3-related actions."
  value       = toset([for action in var.policy_actions : action if startswith(action, "s3:")])
  # Outcome: toset(["s3:GetObject", "s3:PutObject"]) (order not guaranteed)
}

Explanation of the code:

  • The first for expression iterates over the set of actions and formats each into an ARN-like string, producing a list.
  • The second for expression filters the set, keeping only actions that start with “s3:”, and then converts the result back into a new set using toset(). Note that the startswith function requires Terraform 1.3 or later.
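
A for expression over a set can also produce a map. The sketch below, which uses hypothetical local names, groups actions by their service prefix with the grouping operator (...) and derives the set of distinct services.

```hcl
locals {
  # Hypothetical action set, mirroring the policy_actions variable.
  actions = toset(["s3:GetObject", "s3:PutObject", "ec2:RunInstances"])

  # Group actions by service prefix; the trailing "..." collects values that share a key.
  actions_by_service = {
    for action in local.actions : split(":", action)[0] => action...
  }

  # The distinct service prefixes, as a set.
  services = toset([for action in local.actions : split(":", action)[0]])
}

output "actions_by_service" {
  value = local.actions_by_service
  # Maps "ec2" to its one action and "s3" to its two actions
}

output "services" {
  value = local.services
  # Contains "ec2" and "s3" (order not guaranteed)
}
```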

Practical Use Cases

  • Security Group Ingress/Egress: If you have multiple sources of IP CIDRs for a security group, you can combine them using setunion to ensure all unique sources are allowed, preventing redundant rules.
  • IAM Policy Statements: Dynamically generate IAM policy statements by creating a set of unique actions or resources from various inputs, then use for_each on the set to define policy statements.
  • Unique Tag Management: When defining tags across different modules, you might want to ensure a base set of tags is always present. You can use setunion to merge default tags with resource-specific ones, letting Terraform handle uniqueness.
  • Resource for_each with Dynamic Inputs: If you need to create a resource for each unique string from a variable that might contain duplicates (e.g., a list of usernames that might have accidental repeats), converting it to a set first ensures for_each operates on unique keys.
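
The last use case can be sketched as follows. To keep the example provider-free, it uses the built-in terraform_data resource, which assumes Terraform 1.4 or later; the variable name and values are hypothetical, and in practice the resource would be something like a user or bucket.

```hcl
# Hypothetical input that contains an accidental repeat.
variable "usernames" {
  type    = list(string)
  default = ["alice", "bob", "alice"]
}

# for_each requires a map or a set of strings, so the list is converted first.
# Two instances are created, keyed "alice" and "bob".
resource "terraform_data" "user" {
  for_each = toset(var.usernames)

  # For a set, each.key and each.value are the same string.
  input = each.key
}

output "managed_users" {
  description = "Instance keys created by for_each."
  value       = [for name, instance in terraform_data.user : name]
}
```

Because each instance is addressed by its key (for example terraform_data.user["alice"]), removing one name from the list affects only that instance.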

Best Practices for Set Manipulation

  • When to Use Sets: Opt for sets when the order of elements is irrelevant, and uniqueness is a strict requirement. If order matters or duplicates are intended, use lists.
  • Understanding Immutability: Terraform values are immutable. Set operations always return a new set; they do not modify the original set in place.
  • Order is Not Guaranteed: Remember that sets are unordered. Do not rely on elements appearing in a specific sequence when iterating over a set or performing set operations.
  • Combine with for_each: For provisioning multiple resources based on unique strings, converting a list to a set and then using for_each on that set is often the most robust pattern, as it cleanly handles additions and removals without affecting unrelated resources.
  • Type Constraints: Explicitly define set(type) for variables to leverage Terraform’s type checking and automatic de-duplication at an early stage.
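
Because sets are unordered, they cannot be indexed. When a stable sequence is needed, one option is to convert the set to a list and sort it, as in this minimal sketch (the zone names are hypothetical).

```hcl
locals {
  # Hypothetical set of zone names.
  zones = toset(["us-east-1b", "us-east-1a", "us-east-1c"])

  # local.zones[0] would be an error, because set elements have no index.
  # Converting to a list and sorting gives a deterministic order.
  sorted_zones = sort(tolist(local.zones))
  first_zone   = local.sorted_zones[0]
}

output "first_zone" {
  value = local.first_zone
  # Outcome: "us-east-1a"
}
```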

Conclusion

Mastering set manipulation in Terraform helps you manage unique, unordered collections of data within your infrastructure definitions. Understanding the core characteristics of sets and using functions like setunion, setintersection, setsubtract, and toset can help in your day-to-day Terraform work. Sets are particularly valuable for scenarios demanding uniqueness, such as security policies, IAM permissions, and resource tagging, ultimately leading to cleaner code and more predictable infrastructure deployments.

Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge on Azure, Kubernetes & Jenkins.
In my free time, I write blogs on ckdbtech.com