Components: Akka
Description: Actor-based solution for distributed processing of data crawled from sites. It reads data from a Kafka stream, analyses it with Groovy scripts, and stores the results in HDFS and an Elasticsearch cluster. For better performance, it is implemented with Akka Streams.
Technologies: Implemented as an Akka app. Also used: Scala, Java 8, Groovy, Kafka, Akka Streams, Akka Cluster, HDFS, Elasticsearch, FP, TDD.
Git-repo: story_line2_server_akka
Provision script (Puppet manifest): server_akka.pp
Additional info:
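As a rough illustration of the read, analyse, store flow described above, here is a minimal Akka Streams sketch. It is not taken from the story_line2_server_akka repository. The names `PipelineSketch`, `CrawledPage`, `Analysed` and `analyse` are hypothetical, the Kafka source is replaced by an in-memory list, the Groovy analysis by a plain word count, and the HDFS and Elasticsearch writers by simple sinks. It assumes Akka 2.6 or later, where the stream materializer is derived from the implicit `ActorSystem`.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object PipelineSketch {
  // Hypothetical record types; the real message schema is not described on this page.
  final case class CrawledPage(url: String, body: String)
  final case class Analysed(url: String, wordCount: Int)

  // Stand-in for the Groovy analysis step: counts whitespace-separated words.
  def analyse(page: CrawledPage): Analysed =
    Analysed(page.url, page.body.split("\\s+").count(_.nonEmpty))

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("pipeline-sketch")

    // In-memory stand-in for the Kafka stream of crawled pages.
    val pages = List(
      CrawledPage("http://example.com/a", "one two three"),
      CrawledPage("http://example.com/b", "four five")
    )

    // Analysis stage expressed as a reusable Flow.
    val analysis = Flow[CrawledPage].map(analyse)

    // Stand-in for the HDFS writer: prints each record.
    val hdfsSink = Sink.foreach[Analysed](r => println(s"HDFS <- $r"))
    // Stand-in for the Elasticsearch indexer: collects records into a Seq.
    val searchSink = Sink.seq[Analysed]

    // Every analysed record goes to both sinks; alsoTo attaches the second one.
    val indexed = Source(pages).via(analysis).alsoTo(hdfsSink).runWith(searchSink)

    val result = Await.result(indexed, 5.seconds)
    println(s"indexed ${result.size} records")
    Await.result(system.terminate(), 5.seconds)
  }
}
```

In a real deployment the in-memory source would presumably be replaced by a Kafka consumer source and the two sinks by actual HDFS and Elasticsearch writers. The shape of the graph (one source, one analysis flow, two sinks) would stay the same.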
Table of contents:
- Home
- Prerequisites
- Development stand provisioning
- Components
- Crawler (Spring Boot, Java)
- Message Broker (Kafka)
- Distributed file storage (Hadoop’s HDFS)
- Data warehouse (Apache Hive)
- Distributed business logic cluster (Akka, Scala)
- Distributed data processing cluster (Spark, Scala)
- Indexing/Search engine (Elasticsearch)
- REST server (Lagom, Scala)
- Service coordination (Zookeeper)
- Time-series database (monitoring data) (InfluxDB)
- Metrics collector (Telegraf)
- Monitoring visualization service (Grafana)
- Reverse proxy/load-balancer (nginx)
- Monitoring Links
- Development