Components: Kafka
Description: Kafka is one of the most important transport layers in the project: almost every interconnection between services is organized through Kafka.
Provision script (Puppet manifest): kafka.pp
Additional info: Kafka site
To create the Kafka cluster I used 3 machines and a Zookeeper ensemble. All Kafka broker states are monitored (see Monitoring Links). Producers and consumers need to be monitored as well, but I will add that later. The following links cover monitoring Kafka and collecting metrics with jmxtrans:
- Monitoring Apache Kafka with Grafana / InfluxDB via JMX
- How to Monitor Kafka
- Ready-made dashboards for Kafka and settings for jmxtrans
Hot commands:
To create the necessary topics manually, run the following commands:
bin/kafka-topics.sh --create --replication-factor 2 --partitions 2 --zookeeper localhost:2181 --topic crawler-events
bin/kafka-topics.sh --create --replication-factor 2 --partitions 2 --zookeeper localhost:2181 --topic crawler-commands
You can always list the existing topics:
bin/kafka-topics.sh --list --zookeeper localhost:2181
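The topic-creation commands above fail if a topic already exists, which gets in the way when provisioning is repeated. The sketch below is one possible wrapper, not part of the project's scripts: it adds the `--if-not-exists` option of `kafka-topics.sh` and loops over both topics. The function name `create_topic` and the variables `KAFKA_TOPICS`, `ZOOKEEPER` and `DRY_RUN` are hypothetical names chosen for illustration. As an assumption made for safety, it only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper, not part of kafka.pp: create the project topics
# only when they are missing, so the script can be re-run safely.

KAFKA_TOPICS="${KAFKA_TOPICS:-bin/kafka-topics.sh}"   # path to the Kafka CLI script
ZOOKEEPER="${ZOOKEEPER:-localhost:2181}"              # Zookeeper connection string
DRY_RUN="${DRY_RUN:-1}"                               # assumption: 1 = print only, 0 = execute

create_topic() {
    # $1 = topic name; rebuild the argument list as the full command line
    set -- "$KAFKA_TOPICS" --create --if-not-exists \
        --replication-factor 2 --partitions 2 \
        --zookeeper "$ZOOKEEPER" --topic "$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"       # show what would be executed
    else
        "$@"            # run the command against the cluster
    fi
}

for topic in crawler-events crawler-commands; do
    create_topic "$topic"
done
```

Run from the Kafka installation directory, the script prints both commands for review. Setting `DRY_RUN=0` executes them against the cluster.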
Table of contents:
- Home
- Prerequisites
- Development stand provisioning
- Components
- Crawler (Spring Boot, Java)
- Message Broker (Kafka)
- Distributed file storage (Hadoop’s HDFS)
- Data warehouse (Apache Hive)
- Distributed business logic cluster (Akka, Scala)
- Distributed data processing cluster (Spark, Scala)
- Indexing/Search engine (Elasticsearch)
- REST server (Lagom, Scala)
- Service coordination (Zookeeper)
- Time-series database (monitoring data) (InfluxDB)
- Metrics collector (Telegraf)
- Monitoring visualization service (Grafana)
- Reverse proxy/load-balancer (nginx)
- Monitoring Links
- Development