Used ports
Base services
- 222 - ssh
- 3000 - monit
Monitoring
- 8086 - InfluxDB port
- 9086 - InfluxDB port (RPC)
- 8081 - Grafana
- 8080 - Jenkins
Infrastructure
- 2181 - zookeeper
- 2888 - zookeeper (follower connections to the leader)
- 3888 - zookeeper (election)
- 8000 - nginx
- 9200 - elasticsearch
- 9092 - kafka (plaintext)
- 27017 - MongoDB
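
A quick way to confirm that the infrastructure services listed above are actually listening is a plain TCP connect check. The following is a minimal sketch, not part of the project's tooling: the port numbers are copied from the list above, while the host name `localhost` and the helper names `is_port_open` and `check_ports` are assumptions made for illustration.

```python
import socket
from typing import Dict

# Ports copied from the "Infrastructure" list; labels are informational only.
INFRASTRUCTURE_PORTS: Dict[int, str] = {
    2181: "zookeeper",
    8000: "nginx",
    9200: "elasticsearch",
    9092: "kafka (plaintext)",
    27017: "MongoDB",
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def check_ports(host: str, ports: Dict[int, str]) -> Dict[int, bool]:
    """Map every port in `ports` to whether it accepts TCP connections."""
    return {port: is_port_open(host, port) for port in ports}


if __name__ == "__main__":
    # "localhost" is an assumption; use the address of the stand being checked.
    for port, is_up in check_ports("localhost", INFRASTRUCTURE_PORTS).items():
        status = "open" if is_up else "closed"
        print(f"{port:>5} {INFRASTRUCTURE_PORTS[port]:<20} {status}")
```

A successful connect only shows that something is listening on the port; it does not verify that the expected service is the one answering.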
Project’s services
- 8001 - server_web application port
- 8002 - server_web admin port
- 8088 - crawler application port
- 8089 - crawler admin port
- 8011 - server_web application port
Spark cluster
- 7077 - Spark Master port
- 7078 - Spark Worker port
- 8090 - Spark Master UI port
- 8091 - Spark Worker UI port
Hadoop: HDFS
- 50010 (http) - The datanode server address and port for data transfer. (Internal, dfs.datanode.address)
- 50075 (http) - The datanode http server address and port. (Internal, dfs.datanode.http.address)
- 50475 (https) - The datanode secure http server address and port. (Internal, dfs.datanode.https.address)
- 50020 (IPC) - The datanode ipc server address and port. (Internal, dfs.datanode.ipc.address)
- 50070 (http) - The address and the base port where the dfs namenode web ui will listen on. (External, dfs.namenode.http-address)
- 50470 (https) - The namenode secure http server address and port. (Internal, dfs.namenode.https-address)
- 9000 (IPC) - File system metadata operations. (Internal, fs.defaultFS)
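
Each port above is tied to a Hadoop configuration property, so the effective ports can be read back from the site XML files. The sketch below assumes the standard Hadoop `<configuration>/<property>/<name>/<value>` layout; the sample document, its host names and the helper `ports_from_site_xml` are hypothetical, and the sample merges properties that normally live in separate files (`fs.defaultFS` belongs to `core-site.xml`).

```python
import xml.etree.ElementTree as ET
from typing import Dict
from urllib.parse import urlsplit

# Hypothetical sample; values mirror the ports listed above.
SAMPLE_SITE_XML = """<configuration>
  <property><name>dfs.namenode.http-address</name><value>0.0.0.0:50070</value></property>
  <property><name>dfs.datanode.address</name><value>0.0.0.0:50010</value></property>
  <property><name>fs.defaultFS</name><value>hdfs://namenode:9000</value></property>
</configuration>"""


def ports_from_site_xml(xml_text: str) -> Dict[str, int]:
    """Return {property name: port} for every property whose value has a port."""
    ports: Dict[str, int] = {}
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if not name or not value:
            continue
        # urlsplit only recognises host:port after a scheme or a leading "//".
        target = value if "://" in value else "//" + value
        try:
            port = urlsplit(target).port
        except ValueError:
            # Value looked like host:port but the port part was not numeric.
            continue
        if port is not None:
            ports[name] = port
    return ports


if __name__ == "__main__":
    for name, port in sorted(ports_from_site_xml(SAMPLE_SITE_XML).items()):
        print(f"{port:>5}  {name}")
```

Properties that are absent from the site files fall back to Hadoop's built-in defaults, which this sketch does not look up.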
Hadoop: YARN
- 8032 (IPC) - The address of the applications manager interface in the RM. (Internal, yarn.resourcemanager.address)
- 8030 (http) - The address of the scheduler interface. (Internal, yarn.resourcemanager.scheduler.address)
- 8031 (http) - The secure http server address and port. (Internal, yarn.resourcemanager.resource-tracker.address)
- 8033 (http) - The address of the RM admin interface. (Internal, yarn.resourcemanager.admin.address)
- 8088 (http) - The http address of the RM web application. If only a host is provided as the value, the webapp will be served on a random port. (Internal, yarn.resourcemanager.webapp.address; internal because of https://community.hortonworks.com/questions/191898/hdp-261-virus-crytalminer-drwho.html)
- 8090 (https) - The https address of the RM web application. If only a host is provided as the value, the webapp will be served on a random port. (Internal, yarn.resourcemanager.webapp.https.address)
- 8040 - Address where the localizer IPC is. (Internal, yarn.nodemanager.localizer.address)
- 8048 (IPC) - Address where the collector service IPC is. (Internal, yarn.nodemanager.collector-service.address)
- 8041 ???? (http) - The address of the container manager in the NM. (Internal, yarn.nodemanager.address)
- 8042 (http) - NM Webapp address. (External, yarn.nodemanager.webapp.address)
- 8044 (https) - The https address of the NM web application. (Internal, yarn.nodemanager.webapp.https.address)
- 13562 (http) - Default port that the ShuffleHandler will run on. ShuffleHandler is a service run at the NodeManager to facilitate transfers of intermediate Map outputs to requesting Reducers. (Internal, mapreduce.shuffle.port)
Hive
- 9083 - Hive metastore Thrift protocol port (Internal)
- 10001 - HiveServer2 Thrift RPC messages over HTTP (Internal, hive.server2.thrift.http.port, hive.server2.thrift.http.path)
- 10002 - Web UI for HiveServer2 (Internal, hive.server2.webui.host, hive.server2.webui.port)
MySQL
- 3306 - default MySQL connection port (Internal)
Debugging
- 60000 - java app debug port
- 50000 - jmx java app monitoring port
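
The source does not show how these two ports are enabled, so the following is only a sketch of one common way to do it: passing the standard JDWP and JMX options to the JVM. The port numbers come from the list above; the helper name `jvm_debug_flags`, the `app.jar` file name and the choice to disable JMX authentication and SSL are assumptions that suit a development stand only.

```python
from typing import List

DEBUG_PORT = 60000  # "java app debug port" from the list above
JMX_PORT = 50000    # "jmx java app monitoring port" from the list above


def jvm_debug_flags(debug_port: int = DEBUG_PORT,
                    jmx_port: int = JMX_PORT,
                    suspend: bool = False) -> List[str]:
    """Build JVM options that open a JDWP debug port and a JMX port."""
    suspend_flag = "y" if suspend else "n"
    return [
        # Remote debugger attaches here; suspend=y pauses the app until it does.
        f"-agentlib:jdwp=transport=dt_socket,server=y,"
        f"suspend={suspend_flag},address={debug_port}",
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={jmx_port}",
        # Assumption: no auth and no SSL, acceptable only on an internal stand.
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
    ]


if __name__ == "__main__":
    # "app.jar" is a placeholder for the real application artifact.
    print(" ".join(["java", *jvm_debug_flags(), "-jar", "app.jar"]))
```

Both ports give full control over the running JVM, so they should stay unreachable from outside the internal network.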
JMX monitoring
- 40001 - kafka (localhost-only) port
Deprecated
- 8082 - Storm UI port (deprecated)
- 6627 - Storm Nimbus port (deprecated)
- 8083 - Storm logviewer port (deprecated)
- 6700, 6701, 6702, 6703 - Storm supervisor ports (deprecated)
- 3772 - Storm DRPC drpc.port (External DRPC Clients) (deprecated)
- 3773 - Storm DRPC drpc.invocations.port (Worker Processes) (deprecated)
- 3774 - Storm DRPC drpc.http.port (External HTTP DRPC Clients) (deprecated)
- 9000 - nginx-based topology storage port
Table of contents:
- Home
- Prerequisites
- Development stand provisioning
- Components
- Crawler (Spring Boot, Java)
- Message Broker (Kafka)
- Distributed file storage (Hadoop’s HDFS)
- Data warehouse (Apache Hive)
- Distributed business logic cluster (Akka, Scala)
- Distributed data processing cluster (Spark, Scala)
- Indexing/Search engine (Elasticsearch)
- REST server (Lagom, Scala)
- Service coordination (Zookeeper)
- Time-series database (monitoring data) (InfluxDB)
- Metrics collector (Telegraf)
- Monitoring visualization service (Grafana)
- Reverse proxy/load-balancer (nginx)
- Monitoring Links
- Development