Practical articles on Kubernetes, LLMs in production, and DevOps infrastructure.
A real case study: a RAG system took 28 seconds to respond. What we did to bring it down to 1.1 seconds — without changing the model.
Practical optimization techniques: proper resource requests/limits, Cluster Autoscaler, Spot instances, and VPA. Real numbers from production.
Breaking down key differences, common mistakes when choosing a tool, and real production scenarios. Spoiler: the right answer is to use both.
A detailed comparison of ArgoCD and Flux: architecture, secrets security, image automation, and real production scenarios. What we use at InfoScale and why.