Using Kubernetes Sidecar Pattern for Efficient Data Sync

by: Jerrish Varghese

March 09, 2023

Introduction:

In Kubernetes, the sidecar container pattern is a widely used way to extend an application's functionality, flexibility, and resilience. The pattern runs a second container, the sidecar, alongside the primary application container within the same Kubernetes pod. This lets you add capabilities such as logging, monitoring, configuration management, or data synchronization without modifying the core logic of the main application.

A sidecar container acts as a helper that extends the primary application's capabilities, making Kubernetes deployments more modular and maintainable. In this blog post, we'll explore a real-world example of the sidecar pattern, demonstrating how it can be used to synchronize data from an AWS S3 bucket to a shared volume. We will also discuss logging with sidecars, watchdog sidecars, and running commands after the main container has started.

What is a Sidecar Container?

A sidecar container is a companion container that runs alongside the primary application within the same Kubernetes pod. Unlike standalone Docker containers, the containers in a pod share the pod's network namespace, can mount the same volumes, and are scheduled and removed together. This architecture enables sidecars to provide supportive functionalities such as:

Logging & Monitoring (e.g., using Fluent Bit sidecar for log aggregation)
Security Enhancements (e.g., managing authentication or encryption)
Configuration Management (e.g., fetching environment configurations from external sources)
Data Synchronization (e.g., syncing files from an external storage system like AWS S3)
Process Execution (e.g., running a custom sidecar script to trigger actions after a primary container starts)

Now, let's dive into a real-world scenario demonstrating the Kubernetes sidecar container pattern in action.

The Deployment YAML:

Below is an example deployment.yaml file, written as a Helm template, that defines the main application container and a sidecar container responsible for synchronizing data from an AWS S3 bucket:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.app }}
  namespace: {{ .Values.namespace }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ .Values.app }}
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ .Values.app }}
    spec:
      containers:
      - name: {{ .Values.app }}
        image: {{ .Values.image.name }}:{{ .Values.image.tag }}
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        env:
        # ... (existing environment variables)
        volumeMounts:
        - name: shared-data
          mountPath: /mnt/els-palms-data
      - name: s3-sync-sidecar
        image: amazon/aws-cli:latest
        command: ["/bin/sh"]
        args:
        - "-c"
        - |
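          # Start a sync roughly once every 60 seconds, counting the time the sync itself takes.
          while true; do
            echo 'Syncing S3 bucket...'
            start_time=$(date +%s)
            aws s3 sync "s3://${BUCKET_NAME}" /mnt/els-palms-data --delete --exact-timestamps
            end_time=$(date +%s)
            duration=$((end_time - start_time))
            sleep_time=$((60 - duration))
            echo "Duration: ${duration}s"
            # Sleep only for the rest of the interval; skip the sleep if the sync overran it.
            if [ "$sleep_time" -gt 0 ]; then sleep "$sleep_time"; fi
          done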
        env:
        - name: BUCKET_NAME
          value: {{ .Values.palms.bucketName | quote }}
        volumeMounts:
        - name: shared-data
          mountPath: /mnt/els-palms-data
      volumes:
      - name: shared-data
        emptyDir: {}
      serviceAccountName: {{ .Values.serviceAccount.name }}

Understanding the Sidecar Container:

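The sidecar container, named s3-sync-sidecar, syncs data from an AWS S3 bucket to a shared volume mounted at /mnt/els-palms-data. It runs a loop that starts an aws s3 sync roughly every 60 seconds: the script subtracts the time the sync took from the 60-second interval and sleeps for the remainder, so a sync that takes longer than 60 seconds is followed immediately by the next one. The --delete flag removes local files that no longer exist in the bucket, and --exact-timestamps makes the sync re-download files whose size is unchanged but whose timestamp differs, so the local volume mirrors the latest state of the S3 bucket.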
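
The interval arithmetic in that loop can be pulled out and checked on its own. The following is a minimal sketch, not part of the deployment; the function name remaining_sleep and the sample timestamps are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not in the deployment): given an interval and the
# start/end times of a sync in epoch seconds, print how long to sleep
# before the next sync. Prints 0 when the sync overran the interval.
remaining_sleep() {
  interval=$1
  start_time=$2
  end_time=$3
  remaining=$((interval - (end_time - start_time)))
  if [ "$remaining" -gt 0 ]; then
    echo "$remaining"
  else
    echo 0
  fi
}

remaining_sleep 60 1000 1012   # sync took 12 seconds, prints 48
remaining_sleep 60 1000 1075   # sync took 75 seconds, prints 0
```

Keeping the arithmetic in one small function makes it easy to confirm that a slow sync never produces a negative sleep value.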

This sidecar application is independent of the main application and can be reused across different deployments that require S3 synchronization.

The Importance of Shared Volumes:

The deployment defines a shared volume (shared-data) using emptyDir and mounts it at the same path in both containers, so the primary container can read the files the sidecar writes. An emptyDir volume starts empty and lives only as long as the pod, so each new pod downloads the bucket contents again. Sidecar containers commonly use shared volumes like this for data exchange, log collection (e.g., a Fluent Bit sidecar), and inter-container communication.
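
Because an emptyDir volume starts out empty, the main application may come up before the first sync has finished. One possible way to handle this, sketched below under that assumption, is for the application's start script to wait until the shared directory contains something. The function name wait_for_data and the 120-second timeout in the usage comment are hypothetical; the original deployment does not include this step.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original deployment):
# block until DIR contains at least one entry, or give up after TIMEOUT
# seconds. Returns 0 when data is present, 1 on timeout.
wait_for_data() {
  dir=$1
  timeout=$2
  waited=0
  while [ -z "$(ls -A "$dir" 2>/dev/null)" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Possible usage in the main container's start script:
#   wait_for_data /mnt/els-palms-data 120 || echo "no synced data yet" >&2
```

This only checks that some data has arrived, not that the sync is complete; a marker file written by the sidecar after its first successful sync would be a stricter signal.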

Kubernetes Logging Best Practices with Sidecars:

Using a Logging Sidecar:
- Instead of configuring log shipping within the primary application, a separate container (e.g., a Fluent Bit sidecar) can collect logs and forward them to a centralized logging system.
Using a Container Watchdog Sidecar:
- A watchdog sidecar container can monitor the primary container's health and trigger a restart when it becomes unhealthy.
Running a Command After the Main Container Has Started:
- If you need to execute commands after the main container starts, a custom sidecar can watch its logs (for example, a log file on a shared volume) and trigger actions.
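
As a rough sketch of the last item, a sidecar script could poll a log file on a shared volume until a startup message appears and then run its follow-up command. Everything named here is an assumption for illustration: the function wait_for_log_line, the log path, and the "Server started" message are not taken from the deployment above.

```shell
#!/bin/sh
# Hypothetical helper: poll LOG_FILE until a line matching PATTERN appears,
# or give up after TIMEOUT seconds. Returns 0 on a match, 1 on timeout.
wait_for_log_line() {
  log_file=$1
  pattern=$2
  timeout=$3
  waited=0
  until grep -q -- "$pattern" "$log_file" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Possible usage inside a sidecar (path and message are assumptions):
#   wait_for_log_line /var/log/app/app.log "Server started" 300 && echo "running post-start task"
```

The main container would need to write its log to a volume that the sidecar also mounts for this approach to work.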

ECS Sidecar Containers:

In Amazon ECS (Elastic Container Service), sidecar containers are also widely used. They are defined as additional containers in the same task definition and function similarly to Kubernetes sidecars, enabling log forwarding, proxying, and data synchronization.

Why Use the Sidecar Pattern?

By implementing the sidecar pattern in your Kubernetes deployments, you can achieve the following advantages:

Separation of Concerns:
- The main application container focuses on its primary function while the sidecar container handles supporting tasks like logging, data synchronization, or security.
Reusability:
- A well-designed sidecar container (e.g., a logging sidecar image) can be reused across multiple projects, reducing development time.
Flexibility:
- The sidecar container can be updated, replaced, or modified independently of the main application, offering better maintainability.
Scalability:
- Sidecars run in the same pod as the application, so every replica added when the deployment scales out gets its own copy of the supporting functionality, with no change to the main application.
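
To make the reusability point concrete, a sync script can read its settings from environment variables with defaults, so the same sidecar image and script serve several deployments. This is only a sketch: BUCKET_NAME is the variable used in the deployment above, while SYNC_DEST, SYNC_INTERVAL_SECONDS, and the function print_sync_config are hypothetical names introduced here.

```shell
#!/bin/sh
# Resolve sync settings from the environment. BUCKET_NAME is required;
# SYNC_DEST and SYNC_INTERVAL_SECONDS are hypothetical optional overrides
# that fall back to the values hard-coded in the original loop.
print_sync_config() {
  bucket=${BUCKET_NAME:?BUCKET_NAME must be set}
  dest=${SYNC_DEST:-/mnt/els-palms-data}
  interval=${SYNC_INTERVAL_SECONDS:-60}
  echo "bucket=${bucket} dest=${dest} interval=${interval}"
}

# Possible usage at the top of the sync script:
#   print_sync_config
```

Each deployment can then override only the values it needs through its own env section.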

Conclusion:

In this blog post, we've explored the sidecar container pattern and its real-world application in Kubernetes. From data synchronization with AWS S3 to logging best practices using Fluent Bit sidecars, we've seen how this architecture enhances flexibility, maintainability, and scalability in containerized applications.

Whether you are using Kubernetes, Docker, or ECS, understanding and leveraging sidecar containers can significantly improve your application’s resilience and extensibility. If you're considering implementing sidecar containers, start with a simple use case like data synchronization or log aggregation, and expand from there!