A small demo that combines Spring Batch's job processing with Apache Kafka's stream processing. A simple CSV file is read by a batch job, which publishes the records through a Kafka producer for further processing. A Kafka consumer can then verify the output by reading the messages from the corresponding topic.
The stack below is used throughout the codebase. The implementation is straightforward once the responsibilities of each class are modularized.
- Spring Boot + Batch + JPA
- Apache Kafka
- Apache Zookeeper
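The modularized responsibilities mentioned above follow Spring Batch's chunk-oriented reader → processor → writer pattern. The sketch below illustrates that pattern in plain Java; the interfaces and names are illustrative stand-ins, not the real Spring Batch types.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Plain-Java sketch of the chunk-oriented step that Spring Batch runs in this
// demo: read items one at a time, transform each, write them out in chunks.
public class PipelineSketch {
    interface ItemReader<T> { T read(); }                    // e.g. a flat-file reader
    interface ItemProcessor<I, O> { O process(I item); }     // per-item transformation
    interface ItemWriter<T> { void write(List<T> chunk); }   // e.g. a Kafka-backed writer

    static <I, O> void runStep(ItemReader<I> reader, ItemProcessor<I, O> processor,
                               ItemWriter<O> writer, int chunkSize) {
        List<O> chunk = new ArrayList<>();
        for (I item = reader.read(); item != null; item = reader.read()) {
            chunk.add(processor.process(item));
            if (chunk.size() == chunkSize) {
                writer.write(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) writer.write(chunk);           // flush the final partial chunk
    }

    // Runs the step over an in-memory "CSV" and collects what the writer receives.
    public static List<String> demo() {
        Iterator<String> lines = List.of("1,alice", "2,bob", "3,carol").iterator();
        List<String> published = new ArrayList<>();
        runStep(() -> lines.hasNext() ? lines.next() : null,
                String::toUpperCase,                         // stand-in for real mapping logic
                published::addAll,
                2);
        return published;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the actual codebase the reader is backed by the CSV file and the writer by a Kafka producer (or, on the `batch-db-upload` branch, by the database).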
Batch systems offer significant advantages over interactive systems:
- Repetitive jobs run quickly without any user interaction.
- No special hardware or system support is needed to feed data into a batch job.
- They suit large organizations best, but small organizations can benefit from them as well.
The goal is to convert the following flat file into something meaningful when run as a batch process:
- into a Kafka stream like this *
- or into a datastore like this *
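The conversion step boils down to mapping each CSV line to a typed record and then to a message payload for the topic. A minimal sketch follows; the column layout (`id,firstName,lastName`) and the JSON payload shape are assumptions for illustration, since they depend on the actual sample file.

```java
// Sketch of mapping one CSV line to a Kafka message payload.
// The (id,firstName,lastName) column layout is assumed; adjust to the real file.
public class CsvToMessage {
    public record Person(long id, String firstName, String lastName) {}

    // Split a raw CSV line into a typed record, trimming stray whitespace.
    public static Person parse(String csvLine) {
        String[] f = csvLine.split(",");
        return new Person(Long.parseLong(f[0].trim()), f[1].trim(), f[2].trim());
    }

    // Serialize the record into the string payload published to the topic.
    public static String toJson(Person p) {
        return String.format("{\"id\":%d,\"firstName\":\"%s\",\"lastName\":\"%s\"}",
                p.id(), p.firstName(), p.lastName());
    }

    public static void main(String[] args) {
        System.out.println(toJson(parse("1, John , Doe")));
    }
}
```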
```shell
# Start a Zookeeper instance
$ zookeeper-server-start.bat ..\..\config\zookeeper.properties

# Start the Kafka server
$ kafka-server-start.bat ..\..\config\server.properties

# Create a topic
$ kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic CSV_TOPIC_K
```
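To verify the end-to-end flow described above, a console consumer can tail the topic after the batch job has run. This assumes the same Windows `bin\windows` working directory as the commands above and a broker reachable on `localhost:9092`:

```shell
# Consume the messages written by the batch job (Ctrl+C to stop)
$ kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic CSV_TOPIC_K --from-beginning
```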
Make sure the following is appended to `config\server.properties`:

```properties
port = 9092
advertised.host.name = localhost
```
| Branch | Description |
|---|---|
| master | Base branch; reads the CSV and publishes the records to a Kafka topic via a producer |
| batch-db-upload | Same as master, except the CSV records are written to an H2 database instead of Kafka |