3 Design Antipatterns That Sabotage K8s App Scalability – Container Journal

Software design patterns were popularized in the 1990s by the authors of the influential computer science book Design Patterns: Elements of Reusable Object-Oriented Software. Although the book focuses on software development, design patterns can be used to address many IT engineering challenges, including designing Kubernetes infrastructures.
So what are design patterns? A design pattern is an established, general solution to a common problem. Design patterns evolve from the collective wisdom of experienced practitioners in a given field and provide a template for best practices.
Antipatterns can be thought of as the opposite of design patterns: They are common pitfalls that initially appear to be good solutions but prove to be ineffective and are often counterproductive. Antipatterns may seem like attractive solutions, especially when time or resources are constrained. However, they also introduce or exacerbate problems and have a negative effect overall.
This article describes three common antipatterns that can hinder effective scalability in Kubernetes environments. To explore more common antipatterns, consider reading Optimizing Java: Practical Techniques for Improving JVM Application Performance by Benjamin J. Evans, James Gough, and Chris Newland.
The first antipattern, being distracted by the simple, manifests when we target only the simplest or easiest-to-change parts of a system rather than analyzing and diagnosing the whole system. Plucking the low-hanging fruit can be deceptive because it often seems like we’re making real progress. The reality is that we’re choosing not to optimize the parts of the system we aren’t comfortable with, and even if we effect real change, it usually reaches only a local optimum.
For example, let’s say we decide to configure our cluster to autoscale so that our application remains highly available during a half-hour morning spike in traffic. There’s a chance that more efficiently provisioned pods and some network traffic analysis would provide a comparable performance boost, but that requires a deeper analysis of our resource profiles. Instead, we end up paying our cloud provider for a two-hour block of cluster resources which sits underutilized for the majority of that time. The price-performance tradeoff is poor. Alternatively, we may assign a high priority to pods we deem critical but find that our aggregate service performance suffers because other pods are evicted at disproportionate rates.
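The price-performance tradeoff in this example can be made concrete with a little arithmetic. The sketch below is a minimal illustration, not a billing model: the two-hour block and half-hour spike come from the example above, while the node count and hourly rate are hypothetical placeholders.

```python
def block_utilization(spike_minutes: float, block_minutes: float) -> float:
    """Fraction of the provisioned block during which the extra capacity is needed."""
    if block_minutes <= 0:
        raise ValueError("block_minutes must be positive")
    return min(spike_minutes, block_minutes) / block_minutes


def idle_cost(extra_nodes: int, hourly_rate: float,
              spike_minutes: float, block_minutes: float) -> float:
    """Cost of the extra nodes for the part of the block that falls outside the spike."""
    idle_minutes = max(block_minutes - spike_minutes, 0)
    return extra_nodes * hourly_rate * idle_minutes / 60


if __name__ == "__main__":
    # Half-hour spike inside a two-hour block (figures from the example above).
    print(block_utilization(30, 120))   # 0.25
    # Hypothetical: 4 extra nodes at 0.50 per node-hour.
    print(idle_cost(4, 0.50, 30, 120))  # 3.0
```

With these assumed figures, three quarters of the provisioned block pays for capacity that nothing is using.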
Being distracted by the simple is often a defensive response to scaling challenges that stretch beyond a team’s comfort zone or that are thought to be tedious and difficult to solve. We can address this antipattern by ensuring that the team gains the understanding necessary to scale each part of an application and is comfortable iterating through the application’s tunables to learn how each one affects performance.
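One way to build that comfort is to sweep tunables systematically and record what each combination does. The following is a minimal sketch under loud assumptions: the tunable names (`replicas`, `cpu_millicores`) and the `toy_latency_ms` model are hypothetical stand-ins for a real load test against a real cluster.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Tuple

Config = Dict[str, float]


def sweep(tunables: Dict[str, Iterable[float]],
          measure: Callable[[Config], float]) -> List[Tuple[Config, float]]:
    """Measure every combination of tunable values and return (config, result) pairs."""
    names = list(tunables)
    results = []
    for values in product(*(tunables[name] for name in names)):
        config = dict(zip(names, values))
        results.append((config, measure(config)))
    return results


def toy_latency_ms(config: Config) -> float:
    """Hypothetical model: latency falls as total CPU rises. Replace with a real load test."""
    total_cpu_units = config["replicas"] * config["cpu_millicores"] / 250
    return 1000 / total_cpu_units


if __name__ == "__main__":
    grid = {"replicas": [1, 2, 4], "cpu_millicores": [250, 500]}
    for config, latency in sweep(grid, toy_latency_ms):
        print(config, latency)
```

An exhaustive grid grows quickly as tunables are added, which is part of why this kind of analysis gets skipped in favor of the easy change.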
The availability of information on the internet makes it easy to fall prey to the second antipattern, tuning by folklore. Leading responses on Stack Overflow and similar sites are often popular because they offer easily digestible recommendations that deliver immediate benefits with minimal effort. As more users discover the fix, their enthusiasm can create a legend. As with the ‘distracted by the simple’ antipattern, tuning by folklore often feels productive at first. The solutions can seem deceptively simple, and they work!
Unfortunately, as with any legend, much of the truth is masked by a lack of context, and today misinformation is magnified by search engine rankings. Even when a folklore fix works for the specific component versions we use and the specific environment to which we deploy, it is rarely robust or efficient.
Missing the bigger picture is one of the most pervasive antipatterns in siloed or small teams. Developers tend to focus on individual settings or components, often relying on benchmarks to inform their configurations without examining the system more holistically.
This antipattern is a product of specialization and of the human tendency to see patterns where there may be none. A single person is unlikely to have the knowledge to examine an entire system, so they focus on what they know and are more likely to attribute differences in effects to the variables they can control.
Even when optimizing small parts of a system produces measurable results, it’s almost always more efficient to consider the whole system instead. In a system as potentially complex as a Kubernetes cluster, it’s unlikely that maximizing the performance of any one component will increase service quality to the degree we’re looking for. In fact, unless we have decisively pinpointed a performance bottleneck, it’s likely that focusing on individual components will actually produce diminishing returns. So unless we’re looking for a bottleneck, it’s best to examine our system holistically to catch interactions and emergent effects that aren’t evident at smaller scales.
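A standard way to see why tuning a single component rarely moves the whole service is Amdahl's law. The article does not name it, so treat the sketch below as an outside illustration of the argument rather than the author's method. The 20% share and 10x speedup are hypothetical figures.

```python
def overall_speedup(fraction: float, component_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when a component that accounts for
    `fraction` of total time is made `component_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)


if __name__ == "__main__":
    # Hypothetical: a component that takes 20% of request time is made 10x faster.
    print(round(overall_speedup(0.2, 10.0), 2))  # 1.22
```

Under these assumptions, a tenfold improvement in one component yields roughly a 22% end-to-end gain, which is why pinpointing the actual bottleneck matters before tuning.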
Why do so many of us end up engaging in these common design antipatterns for Kubernetes application scalability? The major drivers include:
Automation solutions are available today to offload tedious and time-intensive tasks, including gathering the data necessary to make informed decisions. However, this type of automation alone is not powerful enough to overcome our difficulty in analyzing data with many variables.
Machine learning becomes crucial here because it augments our abilities, analyzing many-variable data in ways that humans cannot. When combined with automation, ML performs this difficult analysis on an ongoing basis. It addresses the dynamic nature of our applications by continually adjusting settings and making recommendations based on patterns that we would otherwise miss.
To learn about successful design patterns for Kubernetes application scalability, read this StormForge white paper on this topic. And make sure to visit StormForge.io and request a demo to see how ML can help you implement these design patterns and avoid common antipatterns when scaling your Kubernetes applications.
Patrick is a senior solution architect at StormForge. He is a seasoned system architect who earned his spurs in several adventures at Red Hat, Verizon, and Deloitte over the past two decades. For the past six years he has deepened his understanding of Kubernetes with different K8s engines. When he is not helping others build their cloud-native apps, you will find him building furniture in his woodshop, roasting coffee beans from different parts of the world, or training with other taekwondo black belts at his local dojang.