Backup System for Apache Kafka (Part 1)

"BOOTSTRAP_SERVERS": "kafka01:9092,kafka02:9092,kafka03:9092",
"TOPIC_NAMES": ["davinder.test"], # Only 1 Topic is supported
"GROUP_ID": "Kafka-BackUp-Consumer-Group",
$ git clone
$ cd apache-kafka-backup-and-restore
$ python3 <path_to_backup.json>
$ python3 backup.json
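Before running the commands above, the configuration file can be sanity-checked. The following is a minimal sketch, not part of the tool itself: it assumes the configuration is plain JSON (in which case the inline `#` comment shown earlier would have to be dropped) and that the three keys listed are required. `validate_config` and `REQUIRED_KEYS` are hypothetical names introduced for illustration.

```python
import json

# Keys taken from the configuration fragment shown earlier.
REQUIRED_KEYS = ("BOOTSTRAP_SERVERS", "TOPIC_NAMES", "GROUP_ID")


def validate_config(config: dict) -> dict:
    """Hypothetical helper: check the keys shown above, raise ValueError on problems."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")

    # The article states that only one topic is supported.
    topics = config["TOPIC_NAMES"]
    if not isinstance(topics, list) or len(topics) != 1:
        raise ValueError("TOPIC_NAMES must be a list with exactly one topic")

    # BOOTSTRAP_SERVERS is a comma-separated list of host:port pairs.
    for server in config["BOOTSTRAP_SERVERS"].split(","):
        host, sep, port = server.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"invalid broker address: {server!r}")
    return config


example = json.loads("""
{
  "BOOTSTRAP_SERVERS": "kafka01:9092,kafka02:9092,kafka03:9092",
  "TOPIC_NAMES": ["davinder.test"],
  "GROUP_ID": "Kafka-BackUp-Consumer-Group"
}
""")
validate_config(example)  # raises ValueError if something is off
```

Failing early on a malformed broker list or a second topic is cheaper than discovering the problem after the consumer threads have started.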
{ "@timestamp": "2020-06-10 12:49:43,871","level": "INFO","thread": "S3 Upload","name": "botocore.credentials","message": "Found credentials in environment variables." }
{ "@timestamp": "2020-06-10 12:49:43,912","level": "INFO","thread": "Kafka Consumer 1","name": "root","message": "started polling on davinder.test" }
{ "@timestamp": "2020-06-10 12:49:43,915","level": "INFO","thread": "Kafka Consumer 0","name": "root","message": "started polling on davinder.test" }
{ "@timestamp": "2020-06-10 12:49:43,916","level": "INFO","thread": "Kafka Consumer 2","name": "root","message": "started polling on davinder.test" }
{ "@timestamp": "2020-06-10 12:49:44,307","level": "INFO","thread": "S3 Upload","name": "root","message": "upload successful at s3://davinder-test-kafka-backup/davinder.test/0/20200608-102909.tar.gz" }
{ "@timestamp": "2020-06-10 12:49:45,996","level": "INFO","thread": "S3 Upload","name": "root","message": "waiting for new files to be generated" }
{ "@timestamp": "2020-06-10 12:52:33,130","level": "INFO","thread": "Kafka Consumer 0","name": "root","message": "Created Successful Backupfile /tmp/davinder.test/0/20200610-125233.tar.gz" }
{ "@timestamp": "2020-06-10 12:52:33,155","level": "INFO","thread": "Kafka Consumer 0","name": "root","message": "Created Successful Backup sha256 file of /tmp/davinder.test/0/20200610-125233.tar.gz.sha256" }
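Because each log line above is a self-contained JSON object, the output is easy to post-process. As a rough sketch, the function below collects the S3 locations reported by the "S3 Upload" thread; it assumes the message prefix stays exactly as it appears in the sample output, and `uploaded_objects` is a hypothetical helper name.

```python
import json


def uploaded_objects(log_lines):
    """Return the S3 URIs that the 'S3 Upload' thread reported as uploaded."""
    prefix = "upload successful at "  # taken from the sample log output
    uris = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        message = record.get("message", "")
        if record.get("thread") == "S3 Upload" and message.startswith(prefix):
            uris.append(message[len(prefix):])
    return uris


sample = [
    '{ "@timestamp": "2020-06-10 12:49:43,912","level": "INFO","thread": "Kafka Consumer 1","name": "root","message": "started polling on davinder.test" }',
    '{ "@timestamp": "2020-06-10 12:49:44,307","level": "INFO","thread": "S3 Upload","name": "root","message": "upload successful at s3://davinder-test-kafka-backup/davinder.test/0/20200608-102909.tar.gz" }',
]
print(uploaded_objects(sample))
# ['s3://davinder-test-kafka-backup/davinder.test/0/20200608-102909.tar.gz']
```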
  1. This application can run in a container (Docker).
  2. This application can run as a systemd service.
  3. Multiple instances of this application can be started to speed up the backup if required.
  4. This application supports UTF-8 encoded messages only.
  5. This application compresses backup files to tar.gz by default.
  6. This application creates SHA-256 hash files so the integrity of backup files can be verified.
  7. The timing of uploads to S3 is controlled by RETRY_UPLOAD_SECONDS.
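The compression and integrity steps from the list above can be sketched with the Python standard library. This is an illustration only, not the tool's actual code: it assumes the `.sha256` file holds the hex digest of the archive (the article does not show its format), and `sha256_of`, `create_backup`, and `verify_backup` are hypothetical names.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_backup(source: Path, archive: Path) -> Path:
    """Compress `source` into `archive` (tar.gz) and write `<archive>.sha256`."""
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(source), arcname=source.name)
    checksum_file = archive.with_name(archive.name + ".sha256")
    # Assumed format: the hex digest followed by a newline.
    checksum_file.write_text(sha256_of(archive) + "\n")
    return checksum_file


def verify_backup(archive: Path) -> bool:
    """Recompute the archive hash and compare it with the stored digest."""
    checksum_file = archive.with_name(archive.name + ".sha256")
    expected = checksum_file.read_text().split()[0]
    return sha256_of(archive) == expected
```

Running `verify_backup` on a downloaded archive before a restore would catch files that were truncated or corrupted in transit.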




Davinder Pal

Senior Software Engineer III ( R&D )
