Engineering, Product management, User experience

Feature Flag

A technique used to enable or disable a certain feature or functionality in a software application without having to deploy code.

Also called: Feature Toggle, Feature Switch, Feature Gate, Feature Flagging, and Feature Control

Relevant metrics: Number of feature flags deployed, Number of feature flags enabled, Number of feature flags disabled, Number of feature flags tested, and Number of feature flags rolled back

What is a Feature Flag?

A Feature Flag is a software development technique that allows developers to enable or disable certain features of a product without having to deploy a new version of the product. This technique is used to control the release of new features, allowing developers to test and deploy them in a controlled manner.

Feature Flags are also used to enable or disable certain features for specific users or groups of users, allowing a gradual rollout of features, or even a segmented user experience across user groups.

Where did the Feature Flag come from?

The term originated in the early 2000s when software engineers began using feature flags to control the release of new features in software applications. This allowed them to test and deploy new features without having to completely rewrite the code. Feature flags are now used in a variety of software development processes, including continuous integration, continuous delivery, and A/B testing.

They are also used to control the release of features to different user groups, such as beta testers or customers. Feature flags are an important tool for software developers, as they allow them to quickly and easily control the release of new features without having to change and redeploy code.

Decoupling deployment from release with Feature Flags

Traditionally, deployment and release were synonymous: deploying code to the production environment meant it was immediately live for users. This created significant risks because any new deploy could potentially disrupt the existing application. To mitigate these risks, teams would bundle multiple features into a single release, often scheduled weekly, monthly, or even quarterly. However, this approach increased the stakes of each deploy. If any single feature in the bundle was broken, the entire release had to be rolled back, slowing down the development process.

Feature flags revolutionize the traditional deployment workflow by decoupling deployment from release. With feature flags, new code can be deployed to production but remain dormant (a practice known as a dark launch). The code is not executed until the feature flag is turned on, which allows for safer and more frequent deployments.
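As a minimal sketch of a dark launch, the example below keeps a new code path deployed but dormant until its flag is turned on. The in-memory FLAGS dictionary and the function names (is_enabled, checkout, new_checkout, legacy_checkout) are assumptions made for illustration; a real system would typically read flag state from a configuration service or a flag-management tool.

```python
from typing import List

# Hypothetical in-memory flag store, used here only for illustration.
FLAGS = {"new_checkout_process": False}


def is_enabled(flag_name: str) -> bool:
    """Return the flag state, treating unknown flags as off."""
    return FLAGS.get(flag_name, False)


def legacy_checkout(cart: List[float]) -> str:
    return f"legacy checkout: {sum(cart):.2f}"


def new_checkout(cart: List[float]) -> str:
    # Deployed to production, but not executed until the flag is on.
    return f"new checkout: {sum(cart):.2f}"


def checkout(cart: List[float]) -> str:
    """Route to the new code path only when the flag is enabled."""
    if is_enabled("new_checkout_process"):
        return new_checkout(cart)
    return legacy_checkout(cart)
```

Flipping FLAGS["new_checkout_process"] to True releases the feature without a new deploy; flipping it back acts as a rollback.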

Benefits of Decoupling Deploy and Release

Reduced risk. Since new features can be deployed without being activated, the risk associated with deploying code is significantly lowered. If a new feature has issues, it can remain inactive until it is fixed, preventing it from impacting the production environment.
Increased deployment frequency. Teams can deploy code to production more frequently because each deploy carries less risk. This leads to more iterative and incremental development practices, enhancing overall agility.
Enhanced stability. By separating deployment from release, teams can ensure that only well-tested and stable features are activated in production. This enhances the stability and reliability of the application.
Faster development velocity. With feature flags, the need for bundling multiple features into a single release is eliminated. Teams can deploy and release features independently, allowing for faster iterations and quicker feedback loops.

The ability to decouple deploy and release is a hallmark of high-performing engineering teams. Feature flags enable more frequent deployments while maintaining stability, allowing teams to deploy code continuously without the pressure of immediate release, thus increasing both speed and stability.

Feature flags fundamentally change the dynamics of deployment and release by decoupling them. This allows new code to be safely deployed to production without being immediately active. As a result, teams can deploy more frequently, reduce risks, and increase their development velocity. High-performing engineering teams leverage this approach to achieve greater agility and stability, ultimately leading to more robust and reliable software delivery.

Types of Feature Flags

While it might be tempting to manage all feature toggles similarly, this approach can lead to complications. Categories of toggles differ along two primary dimensions:

The longevity of the toggle.
The dynamism of the toggling decision.

Understanding these categories helps in managing toggles more effectively. Toggles can be seen as static or dynamic:

Static toggles. These toggles have simplified routing with straightforward on/off configurations.
Dynamic toggles. They require sophisticated toggle routers and complex configurations, often using algorithms for cohorting and dynamic decision-making.
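The difference between static and dynamic toggles can be sketched with a small toggle router. This is an illustrative assumption, not a reference design: the ToggleRouter class, the rule format, and the flag names are hypothetical. A static rule is a plain boolean, while a dynamic rule is a function evaluated against the context of each request.

```python
from typing import Callable, Dict, Optional, Union

# A rule is either a fixed boolean (static toggle) or a function of the
# request context (dynamic toggle).
Rule = Union[bool, Callable[[dict], bool]]


class ToggleRouter:
    """Hypothetical router that evaluates static and dynamic rules."""

    def __init__(self, rules: Dict[str, Rule]) -> None:
        self._rules = dict(rules)

    def is_enabled(self, name: str, context: Optional[dict] = None) -> bool:
        rule = self._rules.get(name, False)  # unknown toggles default to off
        if callable(rule):
            # Dynamic: the decision depends on who is asking.
            return bool(rule(context or {}))
        # Static: the same answer for every request.
        return bool(rule)


router = ToggleRouter({
    "new_search_ui": True,
    "beta_dashboard": lambda ctx: ctx.get("group") == "beta",
})
```

Here router.is_enabled("new_search_ui") gives the same answer for everyone, whereas router.is_enabled("beta_dashboard", {"group": "beta"}) depends on the caller.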

Let’s examine the most typical types of feature flags in more detail.

Release Toggles

Release toggles are used to enable trunk-based development and continuous delivery by allowing in-progress features to be included in the main branch but hidden from end users. These toggles prevent incomplete or untested code from being exposed in production. For example, a product manager might use them to hide a new feature until it is fully ready. This approach helps in aligning feature releases with marketing campaigns.

Release toggles are typically short-lived, usually not longer than a week or two. However, product-centric toggles may last longer if needed.

The toggling decision for release toggles is generally static. Once a decision is made for a release version, it remains the same until the next release.

Experiment Toggles

Experiment toggles facilitate A/B or multivariate testing to compare the effects of different code paths on user behavior. Each user is assigned to a cohort, and the toggle routes them based on their cohort. This method is used to gather data-driven insights, such as optimizing an ecommerce purchase flow or testing different call-to-action buttons.

These toggles need to remain in place long enough to gather statistically significant results, usually ranging from hours to weeks.

Experiment toggles are highly dynamic. Each incoming request might be routed differently based on the user’s cohort.
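One common way to assign users to cohorts is to hash a stable user identifier, so the same user always lands in the same variant. The sketch below assumes a hypothetical assign_cohort function and variant names; it uses SHA-256 from the Python standard library only to get a stable, evenly spread number, not for security.

```python
import hashlib
from typing import Sequence


def assign_cohort(
    experiment: str,
    user_id: str,
    variants: Sequence[str] = ("control", "treatment"),
) -> str:
    """Deterministically map a user to one variant of an experiment."""
    # Including the experiment name means a user can fall into different
    # cohorts in different experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def call_to_action_label(user_id: str) -> str:
    """Route the request to a code path based on the user's cohort."""
    cohort = assign_cohort("cta_button_text", user_id)
    return "Buy now" if cohort == "treatment" else "Add to cart"
```

Because assignment is a pure function of the experiment name and user ID, no per-user state needs to be stored to keep the experience consistent across requests.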

Ops Toggles

Ops toggles control operational aspects of the system, allowing quick disablement or degradation of features in production. They are introduced to manage the performance impacts of new features or to disable non-critical functionality during high load periods. Some systems use long-term “kill switches” to maintain stability during peak demands.

While most ops toggles are short-lived, some critical operational controls may remain in place for longer durations.

These toggles need to be reconfigured quickly, as operational issues require rapid responses. Rolling out a new release to change these toggles is typically undesirable.
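A kill switch of this kind can be sketched as flag state that is changed while the application is running. The OpsToggles class and the "recommendations" feature below are hypothetical; in practice the state would more likely live in a shared configuration store than in process memory.

```python
import threading
from typing import Dict, List


class OpsToggles:
    """Thread-safe in-memory toggle store that can be changed at runtime."""

    def __init__(self, defaults: Dict[str, bool]) -> None:
        self._lock = threading.Lock()
        self._state = dict(defaults)

    def set(self, name: str, enabled: bool) -> None:
        with self._lock:
            self._state[name] = enabled

    def is_enabled(self, name: str) -> bool:
        with self._lock:
            return self._state.get(name, False)


ops = OpsToggles({"recommendations": True})


def render_product_page(product: str) -> List[str]:
    """Build the page, skipping the non-critical section when it is switched off."""
    sections = [f"details:{product}"]
    if ops.is_enabled("recommendations"):
        sections.append("recommendations")
    return sections
```

During a high-load period an operator could call ops.set("recommendations", False) and the next request would already be served without the expensive section, with no new release involved.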

Permissioning Toggles

Permissioning toggles alter feature sets or product experiences for specific user groups. These toggles manage features for different user types, such as premium, internal, or beta users. They ensure controlled exposure to new features, allowing only selected users to access them.

Permissioning toggles can be very long-lived, potentially spanning multiple years. The toggling decision is user-specific and made on a per-request basis, making these toggles highly dynamic.
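A per-request permissioning decision could look roughly like the sketch below. The User fields, the FEATURE_AUDIENCES mapping, and the feature names are assumptions made for the example rather than part of any particular product.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class User:
    user_id: str
    plan: str = "free"  # e.g. "free" or "premium"
    is_internal: bool = False
    is_beta_tester: bool = False


# Hypothetical mapping from feature name to the groups allowed to see it.
FEATURE_AUDIENCES: Dict[str, FrozenSet[str]] = {
    "advanced_reports": frozenset({"premium", "internal"}),
    "redesigned_editor": frozenset({"beta", "internal"}),
}


def groups_for(user: User) -> FrozenSet[str]:
    """Collect every group the user belongs to."""
    groups = {user.plan}
    if user.is_internal:
        groups.add("internal")
    if user.is_beta_tester:
        groups.add("beta")
    return frozenset(groups)


def can_access(feature: str, user: User) -> bool:
    """A feature is visible if the user shares at least one group with its audience."""
    allowed = FEATURE_AUDIENCES.get(feature, frozenset())
    return not allowed.isdisjoint(groups_for(user))
```

Features missing from the mapping are treated as unavailable, which keeps the default closed rather than open.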

Feature Flags introduce complexity

Feature toggles are a powerful tool for enabling Continuous Delivery and allowing new features to be tested and deployed without exposing them to all users. However, they introduce significant complexity to the validation process. Understanding and managing this complexity is crucial for maintaining the integrity and reliability of your deployments.

With feature-flagged systems, the Continuous Delivery (CD) process must account for multiple code paths within the same artifact. For instance, if a system can either use a new optimized tax calculation algorithm or continue with the existing one depending on the toggle state, we must test both scenarios. This is because, during the CD pipeline, it is uncertain whether the toggle will be turned on or off in production. To ensure all possible live code paths are validated, we need to test the artifact in both states: with the toggle flipped on and flipped off.
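To illustrate validating both states, the sketch below runs the same inputs through the tax calculation with the toggle off and on. The 20% rate, the function names, and the "optimized" implementation are stand-ins invented for the example.

```python
from typing import Dict, List

TAX_RATE_PERCENT = 20  # assumed rate, for illustration only


def legacy_tax(amount_cents: int) -> int:
    return amount_cents * TAX_RATE_PERCENT // 100


def optimized_tax(amount_cents: int) -> int:
    # Stand-in for the "new optimized" algorithm; it must agree with legacy.
    return amount_cents // 5


def calculate_tax(amount_cents: int, use_optimized: bool) -> int:
    """The toggle state selects which code path runs."""
    if use_optimized:
        return optimized_tax(amount_cents)
    return legacy_tax(amount_cents)


def run_in_both_states(amounts: List[int]) -> Dict[bool, List[int]]:
    """Exercise both code paths of the same artifact with identical inputs."""
    return {
        state: [calculate_tax(amount, state) for amount in amounts]
        for state in (False, True)
    }
```

A pipeline step could call run_in_both_states and fail the build if the two result lists differ or if either path raises an error.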

When a single toggle is in play, testing requirements double for that feature. However, with multiple toggles, the number of possible toggle states increases exponentially, leading to a combinatorial explosion of testing scenarios. Validating each possible state would be an overwhelming task, causing some skepticism towards feature flags from a testing perspective.
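The growth is easy to see by enumerating the states: n independent on/off toggles give 2 to the power of n combinations. The helper below is a small illustration using itertools.product from the standard library; the function name and flag names are arbitrary.

```python
from itertools import product
from typing import Dict, List


def all_toggle_states(flag_names: List[str]) -> List[Dict[str, bool]]:
    """Return every on/off combination for the given flags."""
    return [
        dict(zip(flag_names, values))
        for values in product([False, True], repeat=len(flag_names))
    ]


# Number of combinations for 1, 3, and 10 independent toggles.
counts = {n: len(all_toggle_states([f"flag_{i}" for i in range(n)])) for n in (1, 3, 10)}
```

With these inputs, counts holds 2 combinations for one toggle, 8 for three, and 1024 for ten, which is why exhaustively testing every state quickly becomes impractical.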

When to use a feature flag

Deciding whether to implement a feature flag for a particular feature or scenario involves careful consideration of various factors. Implementing feature flags can bring significant benefits, but they also introduce complexity. This section outlines the key criteria to help you decide when a feature flag is appropriate.

Understand its purpose

Understanding the purpose of the feature flag is the first step in making an informed decision. Common purposes include:

Gradual rollout. Is the feature being gradually rolled out to users to minimize risk and gather feedback?
A/B testing. Will the feature be used to run experiments to determine the best option based on user behavior?
Operational control. Is the feature related to system operations, requiring the ability to quickly enable or disable it in response to issues?
Permission management. Does the feature need to be enabled for specific user groups, such as beta testers or premium users?

If the feature aligns with one of these purposes, a feature flag is likely justified.

Understand its impact

Consider the impact the new feature will have on users.

Features with a significant impact on user experience, performance, or functionality are good candidates for feature flags. This allows for controlled exposure and monitoring. Minor features or cosmetic changes may not warrant the added complexity of a feature flag unless they serve a specific testing or operational purpose.

It depends on your release strategy

Evaluating how a feature fits into your development and release strategy is crucial when deciding whether to implement a feature flag. Different development methodologies and release strategies can benefit significantly from the use of feature flags.

Trunk-based development

Trunk-based development is a version control strategy where developers integrate small, frequent changes into the main branch (often called “trunk”). Instead of working on long-lived branches, which can lead to complex merge conflicts and integration issues, developers continuously commit their changes to the main branch.

In trunk-based development, developers integrate their work into the main branch multiple times a day. Feature branches, if used, are short-lived and merged back into the trunk quickly. Automated tests and continuous integration (CI) systems provide immediate feedback on the impact of changes.

In trunk-based development, feature flags allow in-progress features to be merged into the main branch without affecting the production environment. Since the main branch is always kept in a deployable state, feature flags enable developers to commit incomplete or experimental features without exposing them to end users. This maintains the stability of the main branch while allowing for ongoing development and integration.

Continuous delivery

Continuous delivery (CD) is a software engineering approach where teams aim to produce software in short cycles, ensuring that the software can be reliably released at any time. CD involves automatically building, testing, and preparing code changes for a release to production, making it possible to deploy changes more frequently and with greater confidence.

With continuous delivery, the build and test processes are automated, ensuring that code changes are validated before deployment. Code is released to production frequently, often multiple times a day. By releasing small, incremental changes, the risk associated with each release is minimized.

Feature flags decouple deployment from release, allowing teams to deploy code to production without making it immediately available to users. This means that new features can be integrated, tested, and deployed continuously, but only activated when they are fully ready. This approach supports the core principles of continuous delivery by enabling frequent, low-risk deployments while maintaining control over when new features are actually released to users.

Alignment with marketing or external events

In some cases, the release of a feature needs to be coordinated with external factors such as marketing campaigns, product launches, or special events. Timing the release of features to align with these activities can maximize impact and ensure a cohesive user experience.

Feature flags allow precise control over the release timing of new features. By deploying the feature code in advance and using a feature flag to control its activation, teams can ensure that the feature is enabled at exactly the right moment, synchronizing with marketing campaigns, events, or other strategic initiatives. This capability is particularly valuable for product launches or promotions that require all elements to go live simultaneously.

What are your testing needs?

Features that require extensive testing often involve multiple user paths, complex integrations, or significant changes to core functionality. Thorough testing in such scenarios is essential to ensure the feature works as intended and does not negatively impact other parts of the application.

Feature flags allow new features to be tested in production environments without exposing them to all users. This enables real-world testing conditions while limiting potential impact. With feature flags, both the new feature and the existing functionality can be tested in parallel. This helps in identifying any issues that arise from the interaction between the new and existing code.

Feature flags enable selective user access, allowing the feature to be incrementally tested by specific user groups. This can be based on criteria such as user type, geography, or other demographics. By gradually increasing the percentage of users who have access to the new feature, teams can monitor the feature’s performance and user feedback in stages, making it easier to manage and resolve issues.
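A percentage rollout is often implemented by mapping each user to a stable bucket between 0 and 99 and enabling the feature for buckets below the current percentage. The sketch below assumes hypothetical function names and uses SHA-256 only as a convenient stable hash.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for a given flag."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled_for(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Enable the feature for users whose bucket falls below the percentage."""
    return rollout_bucket(flag_name, user_id) < rollout_percent
```

Because a user's bucket never changes, raising the percentage from 10 to 50 only adds users; everyone who already had the feature keeps it, which avoids a flickering experience during the rollout.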

Feature flags can also enable a product discovery strategy where teams roll out new experiences to a small, targeted user group for continuous feedback. This allows for real-time testing and refinement based on actual user interactions. With feature flags, designers can run multiple experiments simultaneously, testing different variations of a feature or interface element. This helps in quickly comparing and contrasting potential solutions to identify the most effective design.

In this way, feature flags provide the flexibility to collect data from different user segments, enabling teams to make informed decisions based on comprehensive user feedback and behavior analysis. They also make it easy to manage a rollout to beta testers: the feature can be selectively enabled for small groups while staying hidden from the general user base.

Lifecycle Management of Feature Flags

Managing feature flags effectively over their lifecycle is essential for maintaining a clean, efficient, and reliable codebase. Let us examine best practices for managing feature flags from creation to removal, ensuring they serve their purpose without becoming a source of technical debt.

Feature flag creation and roll out

The lifecycle of a feature flag begins with its creation. When introducing a new feature, developers need to consider whether a feature flag is necessary. Key considerations include:

Clearly define the purpose and scope of the feature flag. Is it for a gradual rollout, A/B testing, operational control, or permission management? Will it affect a small part of the application, or does it have broader implications?

Use a consistent and descriptive naming convention. Names should reflect the feature or functionality being controlled and its purpose, e.g., new_checkout_process_experiment. Once these considerations are addressed, the feature flag can be implemented in the codebase.
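A naming convention is easier to keep consistent when it is checked automatically. The sketch below assumes one possible convention, modeled on the example name above: lowercase snake_case ending in a purpose suffix. The list of suffixes and the function names are hypothetical.

```python
import re
from typing import Dict

# Assumed convention: snake_case name followed by a purpose suffix.
NAME_PATTERN = re.compile(
    r"[a-z][a-z0-9]*(_[a-z0-9]+)*_(release|experiment|ops|permission)"
)

REGISTRY: Dict[str, str] = {}  # flag name -> short description of its scope


def is_valid_flag_name(name: str) -> bool:
    return NAME_PATTERN.fullmatch(name) is not None


def register_flag(name: str, description: str) -> None:
    """Reject flags that break the convention or have no stated purpose."""
    if not is_valid_flag_name(name):
        raise ValueError(f"flag name does not follow the convention: {name!r}")
    if not description.strip():
        raise ValueError("a flag needs a description of its purpose and scope")
    REGISTRY[name] = description
```

For instance, register_flag("new_checkout_process_experiment", "A/B test of the redesigned checkout") is accepted, while a name such as "NewCheckout" is rejected.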

When ready, the feature flag can be gradually rolled out to users. If the feature is not ready for release to everyone, consider enabling it only for a few users who are part of a beta-testing program. Once these initial users experience value from the feature and it is usable for a broader group, use the feature flag to perform a gradual rollout. Start with a small percentage of users and incrementally increase the rollout based on performance and user feedback.

Continuously monitor the impact of the feature. Use metrics, logs, and user feedback to assess the performance and behavior of the feature in both enabled and disabled states. Be prepared to adjust the feature flag’s settings based on monitoring data. This might involve rolling back the feature, adjusting the rollout speed, or fixing any issues that arise.

Feature flag maintenance and removal

While the feature flag is in use, it requires ongoing maintenance. Regularly review the status and usage of feature flags. Ensure they are still serving their intended purpose and are not causing unnecessary complexity.

If the flag isn’t being toggled on and off anymore, consider folding the behavior it controls into the regular production code. Likewise, if it is used to give a permanently differentiated user experience across user segments, it probably shouldn’t be a feature flag, but part of the actual code.

Once the feature controlled by the flag is stable and widely adopted, or if the feature is no longer needed, the feature flag should be deactivated and removed. Gradually deactivate the feature flag by setting it to a state where the feature is permanently enabled or disabled for all users. Remove the feature flag from the codebase. This involves deleting the flag definition and any conditional logic associated with it.

Perform thorough testing after removing the feature flag to ensure that the application behaves correctly without it. Then update the documentation to reflect the removal, including details on why and when the flag was removed.

Make the developer experience friction-free

Use automation tools to manage feature flags, including their creation, configuration, monitoring, and removal. Aim to keep feature flags as short-lived as possible. The longer a feature flag exists, the more potential it has to become a source of technical debt.

Integrate feature flag management with continuous integration and continuous delivery pipelines to streamline the rollout and monitoring processes. Utilize dedicated feature flag management tools to handle the complexity of managing multiple feature flags across different environments and user groups.

Clear ownership of each feature flag can help keep order in a seemingly confusing mess. The owner is responsible for the flag’s lifecycle, from creation to removal.
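Ownership and lifetime can be recorded next to each flag so that overdue flags are easy to find. The sketch below is one possible shape for such a record; the FlagRecord fields, the owner names, and the age limits are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str          # team or person responsible for the flag's lifecycle
    created: date
    max_age_days: int   # agreed lifetime before the flag should be removed


def stale_flags(flags: List[FlagRecord], today: date) -> Dict[str, List[str]]:
    """Group flags that have outlived their agreed lifetime by owner."""
    overdue: Dict[str, List[str]] = {}
    for flag in flags:
        if (today - flag.created).days > flag.max_age_days:
            overdue.setdefault(flag.owner, []).append(flag.name)
    return overdue
```

A scheduled job could run a check like this and notify each owner about the flags that are due for cleanup.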

Frequently asked questions about feature flags

Should I do a gradual roll-out or big bang release?

A gradual roll-out is often preferable to a big bang release because it allows for incremental exposure of new features to users, which can help in identifying and resolving issues early, reducing the risk of widespread disruption. Feature flags enable gradual roll-outs by allowing you to control the availability of a feature for different segments of users. With feature flags, you can start by enabling a feature for a small percentage of users and gradually increase this percentage as confidence in the feature’s stability grows. This controlled approach provides the opportunity to gather feedback and make necessary adjustments before the feature is fully rolled out to all users.

What strategies can be used to roll out features gradually using feature flags?

Several strategies can be used to roll out features gradually using feature flags:

Percentage rollout. Gradually increase the percentage of users who have access to the new feature. Start with a small percentage and incrementally increase it based on performance and feedback.
User segmentation. Enable the feature for specific user segments, such as internal users, beta testers, or premium users, before rolling it out to the general user base.
Geographic rollout. Roll out the feature in specific geographic regions first to limit exposure and address region-specific issues.
Behavior-based rollout. Enable the feature for users exhibiting certain behaviors or meeting specific criteria that align with the feature’s use case.
Time-based rollout. Gradually enable the feature over a specified period, allowing for continuous monitoring and adjustment.
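Two of the strategies above can be sketched briefly: targeting by segment and region, and a time-based ramp that derives the rollout percentage from a schedule. The function names, the user dictionary keys, and the linear ramp are assumptions chosen for the example.

```python
from datetime import datetime, timedelta
from typing import Collection


def matches_targeting(
    user: dict, segments: Collection[str], regions: Collection[str]
) -> bool:
    """Segment and geographic targeting; an empty collection means no restriction."""
    segment_ok = not segments or user.get("segment") in segments
    region_ok = not regions or user.get("region") in regions
    return segment_ok and region_ok


def scheduled_percent(now: datetime, start: datetime, ramp_days: int) -> int:
    """Time-based rollout: ramp linearly from 0% at start to 100% after ramp_days."""
    if now <= start:
        return 0
    fraction = (now - start) / timedelta(days=ramp_days)
    return min(100, int(fraction * 100))
```

The percentage returned by scheduled_percent can then drive a percentage rollout, so the audience grows automatically over the ramp period unless someone pauses it.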

What is the difference between a feature gate and a feature flag?

The terms “feature gate” and “feature flag” are often used interchangeably, but they can have slightly different connotations:

Feature Flag. A mechanism used to enable or disable features at runtime. It allows for dynamic control over which features are active, often used for gradual rollouts, A/B testing, and operational control.
Feature Gate. Typically refers to a higher-level control mechanism that determines whether a feature is accessible or not. While it can function similarly to a feature flag, it may also include additional criteria or rules that need to be met for the feature to be activated.

In essence, both terms refer to controlling the availability of features, but “feature flag” is more commonly used in the context of dynamic and granular control.

What are the disadvantages of feature flags?

While feature flags offer many advantages, they also come with some disadvantages:

Increased complexity. Managing multiple feature flags can add significant complexity to the codebase, making it harder to maintain and understand.
Technical debt. Long-lived feature flags that are not properly managed can accumulate technical debt, especially if they are not cleaned up after their intended use.
Performance overhead. Checking the status of feature flags can introduce performance overhead, particularly if there are many flags or if the checks are done frequently.
Testing challenges. The presence of multiple feature flags can lead to a combinatorial explosion of possible states that need to be tested, making it difficult to ensure comprehensive coverage.
Security risks. Improperly managed feature flags can lead to security risks, such as inadvertently exposing sensitive features or data to unauthorized users.

Relevant questions to ask

What is the purpose of the feature flag?
What are the expected outcomes of using the feature flag?
Hint: The expected outcome of using a feature flag is the ability to quickly and easily test, validate, and roll out changes without impacting the user experience.
What are the risks associated with using the feature flag?
Hint: The risks associated with using a feature flag include the potential for bugs or errors to be introduced, as well as the potential for user confusion or dissatisfaction if the feature flag is not managed properly.
How will the feature flag be managed and monitored?
Hint: The feature flag should be managed and monitored by a team of developers and testers who are responsible for ensuring that the feature flag is properly implemented and that any changes are tested and validated before being rolled out.
How will the feature flag be tested and validated?
Hint: The feature flag should be tested and validated by a team of developers and testers who are responsible for ensuring that the feature flag is properly implemented and that any changes are tested and validated before being rolled out.
How will the feature flag be rolled out and removed?
Hint: The feature flag should be rolled out and removed in a controlled manner, with the team of developers and testers responsible for ensuring that any changes are tested and validated before being rolled out.
What are the security implications of using the feature flag?
Hint: The security implications of using a feature flag could include the potential for unauthorized access to the feature flag, as well as the potential for malicious actors to exploit the feature flag to gain access to sensitive data or systems.
How will the feature flag affect performance?
Hint: The performance implications of using a feature flag depend on the complexity of the feature flag and the number of features that are enabled or disabled.
How will the feature flag affect user experience?
Hint: The user experience implications of using a feature flag depend on how the feature flag is managed and monitored, as well as how the feature flag is tested and validated. If the feature flag is not managed and monitored properly, users may experience confusion or dissatisfaction.