As Kubernetes becomes the de facto standard for container orchestration, organizations face increasing complexity in managing Cloud Native infrastructure. We spoke with Dmitry Shurupov, co-founder of Palark, a DevOps agency that is one of the Top 100 contributors to the Kubernetes project, about the evolving Cloud Native ecosystem, operational best practices, and career advice for aspiring DevOps professionals.

The State of Kubernetes: Complexity as a Feature, Not a Bug
TechTimes: Kubernetes has a reputation for being complex. How did it become so popular anyway?
Yes, Kubernetes is complicated—but that's because it addresses serious challenges. This complexity shouldn't be unexpected. With 20 years of experience in IT operations and Open Source, I view Kubernetes as a natural evolution of global engineering efforts to address real-world operational challenges. Kubernetes offers sophisticated solutions for orchestrating containerized workloads at scale, and with that comes inherent complexity.
That said, it's perfectly acceptable to start simple. If you're just beginning to build your infrastructure, running your software on regular virtual machines without Kubernetes is often a valid approach. However, you should expect to transition to Kubernetes at some point, as your product grows and your development team expands.
Remember that K8s is a standard not just for modern IT infrastructure itself, but also for the engineers operating it and the developers who create the software that runs on it. Hence, relying on Kubernetes pays off even when it comes to finding new engineers on the job market and onboarding them.
TechTimes: What would you recommend to businesses adopting Kubernetes?
First, be rational and realistic when using Kubernetes. It's not just another minor sysadmin tool; it's a powerful and flexible framework, a cornerstone for building a platform tailored to your needs. Its efficiency depends very much on how you use it.
Second, when you are just starting to use Kubernetes, remember that it comes with numerous built-in features that enable best practices for running software efficiently. Learn about them and utilize them fully to see the actual benefits. Choose the right default update strategy for your application's replicas, leverage node affinity and anti-affinity for your pods, assign different priorities to different workloads, configure reasonable requests and limits for your pods, use all kinds of probes, set a Pod Disruption Budget to ensure the application's availability, and benefit from both horizontal and vertical autoscalers. These aren't theoretical features. They're battle-tested mechanisms that, when properly configured, dramatically improve application reliability and resource efficiency.
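To make this concrete, here is a minimal sketch of how several of these built-in mechanisms fit together in a single Deployment plus a Pod Disruption Budget. All names (`web`, the image, the `/healthz` path, port 8080) are hypothetical placeholders; the resource values are illustrative, not recommendations:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                       # hypothetical application name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1           # keep most replicas serving during updates
      maxSurge: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example.com/web:1.0.0   # hypothetical image
          resources:
            requests:             # what the scheduler reserves for the pod
              cpu: 100m
              memory: 128Mi
            limits:               # hard ceiling enforced at runtime
              cpu: 500m
              memory: 256Mi
          readinessProbe:         # gate traffic until the app is ready
            httpGet:
              path: /healthz
              port: 8080
          livenessProbe:          # restart the container if it hangs
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2                 # voluntary evictions must keep 2 pods up
  selector:
    matchLabels:
      app: web
```

Even this small manifest already encodes an update strategy, resource expectations, health checks, and an availability guarantee during node maintenance.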
Exploring a broader Cloud Native ecosystem is also essential. There are 200+ CNCF projects and a much bigger CNCF Landscape featuring tools for almost any task you might think of. Evaluate what already exists before creating your own scripts and solutions to address your business needs. All these projects are Open Source, meaning that you can see in detail how they actually function and even improve them when needed.
Staying Current in a Rapidly Evolving Ecosystem
TechTimes: Speaking of a larger Cloud Native ecosystem, it evolves rapidly...
That's true: the Cloud Native ecosystem is vast and constantly evolving. Here's a simple illustration: not long ago, upgrading the Kubernetes version was a significant challenge and a widely discussed pain point. It involved dealing with critical incompatibilities and other issues. Today, it's much easier thanks to improved tooling, processes, and K8s becoming a more mature project in general.
At the same time, some things you might get used to can be deprecated soon, and you'll have to switch to newer solutions. A perfect example is what just happened to the Ingress NGINX controller, a vital component of most Kubernetes clusters, which will be retired in March 2026 and thus requires prompt migration. There are many alternatives in place, but migrating some setups will take substantial effort for operators.
Many CNCF projects are frequently updated with new features and improvements, while others stagnate and are eventually archived. You need to be prepared to learn a great deal, remain open-minded, and stay flexible.
TechTimes: How can teams keep up with these changes?
The Cloud Native community is vast and very friendly. You can find a wealth of information regarding the projects you use and related issues online. Don't hesitate to ask your questions and join discussions on GitHub and in Slack channels. The community is genuinely invested in helping each other succeed. The more you rely on certain projects in your infrastructure, the more essential it is to stay informed about their news and updates by following relevant mailing lists, blogs, and social network accounts.
I'd also recommend attending KubeCons, Kubernetes Community Days, and similar events—even virtually—and participating in smaller local Kubernetes meetups. The connections you make there, including follow-up conversations on LinkedIn, help you stay informed about what's coming and what's being deprecated. Many event organizers also upload talk recordings to YouTube, which is helpful as well.
IaC and GitOps: The Foundation of Modern Operations
TechTimes: Besides using Kubernetes, what operational practices should organizations prioritize?
Managing infrastructure today requires a thoughtful Infrastructure as Code (IaC) or GitOps approach. Following these practices makes infrastructure predictable and reliable, and easier to evolve and collaborate on. Version control for infrastructure provides the same benefits as it does for application code: a history of changes and the reasoning behind them, auditability, peer review, and rollback capabilities. These aren't nice-to-haves—they're essential for operating at scale. Documenting non-trivial configurations also saves an enormous amount of time as the infrastructure grows and evolves.
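As one possible illustration of the GitOps pattern, here is a sketch of an Argo CD `Application` that keeps a cluster in sync with manifests stored in Git. The repository URL, path, and namespace are hypothetical, and Argo CD is just one of several GitOps tools (Flux is another common choice):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web                            # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/infra.git   # hypothetical Git repo
    targetRevision: main
    path: apps/web                     # manifests live under version control
  destination:
    server: https://kubernetes.default.svc
    namespace: web
  syncPolicy:
    automated:
      prune: true                      # delete resources removed from Git
      selfHeal: true                   # revert manual drift back to Git state
```

With `selfHeal` enabled, Git becomes the single source of truth: any ad hoc change made directly in the cluster is automatically reverted, which is exactly the auditability and predictability described above.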
Security: A First-Class Citizen from Day One
TechTimes: What other things are often overlooked?
No matter how obvious it may sound to some, security is not a joke. It should be considered a first-class citizen and implemented from the very beginning. For Kubernetes-based infrastructure, I strongly recommend following the so-called 4C security approach to ensure security on each layer: Code, Container, Cluster, and Cloud.
There are numerous ready-to-use tools and mechanisms that can assist at each of these levels. Again, Kubernetes offers many built-in features, including Role-Based Access Control (RBAC), Pod Security Admission, and API auditing. There are the CIS Kubernetes Benchmarks, as well as tools such as Kubescape and kube-bench that leverage them and other security frameworks. There are plenty of tools for container image scanning and signing, network security, runtime security, and so on. And if you rely on a cloud provider, it offers a whole set of security capabilities as well.
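As a small taste of the built-in mechanisms mentioned above, the sketch below enables the `restricted` Pod Security Standard on a namespace and defines a minimal read-only RBAC role. The namespace name and role subject are hypothetical:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: web                                        # hypothetical namespace
  labels:
    # Pod Security Admission: reject pods that violate
    # the "restricted" Pod Security Standard
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: web
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]   # read-only: no create/update/delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: web
subjects:
  - kind: User
    name: jane@example.com            # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Following the principle of least privilege in RBAC, as here, addresses the Cluster layer of the 4C model; image scanning and runtime security tools then cover the Container and Code layers.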
Generally speaking, you need to be aware of existing security risks and understand the capabilities required to address them. Then, you can weigh the costs for implementing and maintaining security and the potential consequences of not doing so, and find your own balanced approach.
Disaster Recovery: Beyond Just Taking Snapshots
TechTimes: Perhaps security is the operational risk we see most frequently in the news. Anything else of comparable magnitude?
I can't stress proper backups often enough. A backup is more than a mere snapshot of your data: what matters is the actual ability to restore all that data so your business-critical applications can return to regular operation. It's essential to have a DRP—a disaster recovery plan—that is regularly tested to ensure it remains operational.
I cannot overstate this: an untested backup is not a backup. Schedule regular disaster recovery drills. Practice restoring from backups. Time your recovery processes. Document every step. When a real disaster strikes, you won't have time to figure things out on the fly.
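One way to schedule such backups in Kubernetes—purely as an illustration, assuming the CNCF project Velero is installed in the cluster—is a `Schedule` resource like this one; the namespace and retention period are hypothetical choices:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-web-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"          # cron syntax: every night at 02:00
  template:
    includedNamespaces:
      - web                      # hypothetical application namespace
    ttl: 720h                    # keep each backup for 30 days
```

The schedule alone is only half of the story, though: the drills described above mean regularly restoring one of these backups into a scratch cluster and timing how long it takes until the application actually serves traffic again.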
Observability: Comprehensive Without Being Overwhelming
TechTimes: What's your perspective on observability in complex Cloud Native environments?
Good observability does not mean you simply have tons of metrics for everything. It means you have enough information to understand the current state of your system, troubleshoot issues, and prevent problems before they impact users.
While observability should be comprehensive, covering different levels of your applications and related infrastructure, it also should not become overwhelming, leading to alert fatigue. I've seen teams with hundreds of unnecessary alerts where genuine incidents get lost in the noise.
Focus on meaningful data—whether it's metrics, logs, or traces—that actually provide value.
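For instance, instead of alerting on every raw metric, a single symptom-based Prometheus alerting rule can cover an entire class of failures. This is a sketch only; the metric name `http_requests_total` and the 5% threshold are hypothetical and must match your own instrumentation and service-level objectives:

```yaml
groups:
  - name: app-availability
    rules:
      - alert: HighErrorRate
        # Fraction of requests answered with a 5xx status over 5 minutes
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m                  # must persist before firing, reducing noise
        labels:
          severity: critical
        annotations:
          summary: "More than 5% of requests are failing"
```

One user-facing alert like this replaces dozens of per-component ones, which is precisely how teams avoid the alert fatigue mentioned above.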
Cloud Cost Management: Right-Sizing for Efficiency
TechTimes: Monitoring should also help control your cloud costs. How relevant is that for the Kubernetes-based infrastructure?
More than ever! Right-sizing your cloud resources is a must. According to Cast AI's 2025 Kubernetes Cost Benchmark Report, the average CPU utilization of Kubernetes clusters is a shocking 10%. Keep this in mind when deploying new services, and set up at least basic monitoring to regularly analyze your costs and ensure they align with your business's current needs.
With modern cloud providers and Kubernetes, it's incredibly easy to spin up lots of huge machines you don't really need on a permanent basis. For example, you don't need permanent instances for CI/CD previews. These can be dynamic—created when needed and destroyed when the testing or demo is complete. Similarly, cluster autoscaling can reduce costs during off-peak hours. Kubernetes offers powerful tools like Horizontal Pod Autoscaler and Vertical Pod Autoscaler, not to mention other tools, such as Karpenter, that can help optimize resource usage.
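A minimal Horizontal Pod Autoscaler illustrating this idea might look as follows; the Deployment name and the 70% utilization target are hypothetical values to be tuned per workload:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:                 # the workload being scaled
    apiVersion: apps/v1
    kind: Deployment
    name: web                     # hypothetical Deployment
  minReplicas: 2                  # floor during off-peak hours
  maxReplicas: 10                 # ceiling during traffic spikes
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # add pods when average CPU exceeds 70%
```

Combined with a cluster autoscaler (or Karpenter provisioning nodes on demand), this lets the infrastructure shrink when traffic does, instead of paying for idle capacity around the clock.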
You can also benefit from cost management and optimization solutions from cloud providers, and third-party Kubernetes-specific tools, including a CNCF project called OpenCost.
Career Advice: Building a Strong Foundation
TechTimes: What advice do you have for aspiring DevOps engineers?
Don't make the mistake of becoming a DevOps engineer by skipping system administration fundamentals: operating system internals, networking, storage, and so on. It's a significant trend now that young professionals want to jump into the high-paying, highly sought-after world of DevOps immediately. But you can't become a good DevOps engineer without knowing how everything "under the hood" works.
While you primarily operate high-level software and abstractions, when something breaks—and it will—you need to understand what could possibly lead to these issues. You need to know how TCP works, how DNS resolution occurs, how Linux manages memory, and how I/O operations are performed. These fundamentals never go away.
TechTimes: How is AI affecting the DevOps career landscape?
Generative AI increases the gap between senior and junior engineers. Senior engineers are still in very high demand, but the tasks that junior engineers typically perform are getting automated with AI. It leaves junior engineers without the essential practical experience they need to grow. I'm actually afraid that if we lose junior engineers to AI, at some point, we will be left with no new senior engineers, because there's no reasonable shortcut here.
Thus, I keep saying that my advice is to go deeper than what AI can automate. Focus on understanding systems, not just configuring them. Learn to troubleshoot complex issues. Develop debugging skills. These are areas where human expertise remains irreplaceable.
The Value of Certifications and Open Source
TechTimes: Should engineers pursue certifications?
Certifications don't guarantee you a job, but they can be helpful if you're a newcomer without real-world experience. If you've just finished learning a new technology—for example, Kubernetes or Prometheus—and don't have a real task to apply this knowledge to, getting certified with the CKA (Certified Kubernetes Administrator) or PCA (Prometheus Certified Associate) will help you demonstrate at least some practical skills with it.
However, certification is never on par with actual experience using these technologies in production for your employer. View certifications as a starting point, not a destination.
One more thing I recommend is setting up a homelab where you can experiment with new technologies. Publishing those experiments on GitHub later can also help you land your first DevOps or SRE position.
TechTimes: What about Open Source contributions on GitHub?
Being active in Open Source communities is beneficial in so many ways. It confirms your technical expertise—especially if you contribute meaningful features or bug fixes to well-known software projects. It develops your soft skills through collaboration and communication. It connects you with like-minded people and contributes to better software available to everyone. And honestly, it's just fun!
CNCF offers an excellent portal for new contributors, called "CNCF Contributors," that can serve as a starting point for your Open Source journey. Contributing to projects you use professionally also gives you deeper insight into how they work, which makes you more effective at your job.
Final Thoughts
TechTimes: What's your final advice for organizations navigating their Cloud Native journey?
Start small, and remember that complexity should serve a purpose. Every tool you add, every pattern you adopt should solve a real problem. If you can't articulate why something is necessary, you probably don't need it yet.
Never stop thinking about a strong foundation that involves infrastructure as code, security, observability, and backups. Stay informed about recent developments in a constantly evolving Cloud Native ecosystem, but don't rush to adopt every new technology that appears.
Invest in your team's education. Kubernetes is complex, and having people who understand it and related tools deeply is your greatest asset. If that's not your core competency, working with a reliable outsourcing partner can be highly beneficial.
About Palark
Palark is a Germany-based B2B agency that provides DevOps as a Service and SRE as a Service to enterprises globally. Created by a team with 15+ years of experience operating high-load production systems, Palark offers 24/7 Kubernetes infrastructure support, comprehensive DevOps consulting, and round-the-clock SRE support with guaranteed SLAs. Learn more at palark.com.
© 2025 TECHTIMES.com All rights reserved. Do not reproduce without permission.