GitOps Best Practices: Balancing Autonomy and Stability

A guide to finding the right balance between team autonomy and system stability

In modern cloud-native environments, implementing GitOps effectively means balancing team autonomy against system stability. This article explores key patterns and practices that apply across GitOps tooling (such as ArgoCD or Flux, often combined with Helm for packaging) to strike that balance while maintaining system integrity.

The Core Challenge

When implementing GitOps, organizations typically have to reconcile several competing needs:

  1. Security and Stability: Critical infrastructure components require careful management and protection
  2. Team Autonomy: Development teams need the ability to make changes without bottlenecks
  3. System Reliability: Core services must remain stable and protected from accidental changes
  4. Operational Efficiency: Teams should work independently while maintaining system integrity

Repository Structure and Access Control

1. Infrastructure Repository

This repository contains critical infrastructure components that should be managed by the DevOps/SRE team:

[Read More]

How k8s CPU limits work

What happens when a pod hits 100% CPU?

Everyone knows that once a container exceeds its memory limit, it is killed by the OOM killer. But do you know what happens when it hits its CPU limit?

Let’s say you have a pod limited to 100m CPU (100 millicores, or one tenth of a core). What actually happens under the hood?

Understanding CPU Limits in Kubernetes

According to the official Kubernetes documentation:

“cpu limits are enforced by CPU throttling. When a container approaches its cpu limit, the kernel will restrict access to the CPU corresponding to the container’s limit. Thus, a cpu limit is a hard limit the kernel enforces. Containers may not use more CPU than is specified in their cpu limit.”
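To make the throttling concrete, here is a minimal sketch of how a CPU limit maps onto the Linux CFS bandwidth settings (a quota of CPU time per scheduling period). It assumes the default period of 100 ms and a single CPU-bound thread. The function names are hypothetical helpers written for illustration, not part of any Kubernetes API, and the real kubelet applies additional rules (such as a minimum quota) that are left out here.

```python
# Illustrative sketch only: these helpers are hypothetical, not a Kubernetes API.
# Assumption: the default CFS period of 100 ms (100,000 microseconds).
CFS_PERIOD_US = 100_000


def parse_cpu_millicores(limit: str) -> int:
    """Convert a CPU quantity such as "100m", "0.5" or "2" to millicores."""
    if limit.endswith("m"):
        return int(limit[:-1])
    return round(float(limit) * 1000)


def cfs_quota_us(limit: str, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time, in microseconds, the container may consume in each period."""
    return parse_cpu_millicores(limit) * period_us // 1000


def throttled_us(limit: str, demand_us: int, period_us: int = CFS_PERIOD_US) -> int:
    """Time per period that a container wanting `demand_us` of CPU sits throttled.

    Simplified model: one thread, demand capped at the period length.
    """
    demand_us = min(demand_us, period_us)
    return max(0, demand_us - cfs_quota_us(limit, period_us))


if __name__ == "__main__":
    quota = cfs_quota_us("100m")
    print(quota)                          # 10000
    print(throttled_us("100m", 100_000))  # 90000
```

Under these assumptions, a pod limited to 100m gets 10 ms of CPU time in every 100 ms period. A busy thread uses that up at the start of the period and is then paused for the remaining 90 ms, which shows up as added latency rather than as a kill.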

[Read More]