Continuous Integration, Continuous Deployment
Continuous Integration
Jenkins is used as the CI server. A link to it can be found in Monitoring Links.
CI pipeline configuration: each submodule contains a special file (Jenkinsfile) that describes how this submodule is checked out and tested.
Access to the repositories: GitHub deploy keys are used. Each key follows a naming scheme: the repository story_line2_build is reached through the host alias story_line2_build.github.com with the key id_rsa.story_line2_build.
The path to a repository looks like: story_line2_crawler.github.com:fedor-malyshkin/story_line2_crawler.git
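The naming scheme above can be sketched as a few shell helpers. This is only an illustration: the function names host_alias, key_file and repo_path are hypothetical and do not come from the project; the owner name is taken from the example path.

```shell
#!/bin/sh
# Hypothetical helpers that derive the per-repository names
# used by the deploy-key naming scheme.
OWNER="fedor-malyshkin"   # GitHub account from the example path

# SSH host alias for a repository, e.g. story_line2_build.github.com
host_alias() { printf '%s.github.com\n' "$1"; }

# Private key file name for a repository, e.g. id_rsa.story_line2_build
key_file() { printf 'id_rsa.%s\n' "$1"; }

# Repository path as used by git over SSH with the host alias
repo_path() { printf '%s.github.com:%s/%s.git\n' "$1" "$OWNER" "$1"; }
```

For example, `repo_path story_line2_crawler` prints the path shown above.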
A useful script for generating the keys:
#!/bin/sh
for var in story_line2_crawler story_line2_build story_line2_deployment \
story_line2_client-android story_line2_server_web story_line2_server_storm story_line2_morph \
story_line2_config story_line2_glr_parser story_line2_analyser story_line2_glr_parser_debugger \
story_line2_glr_parser_testing story_line2_token story_line2_geo story_line2_server_akka
do
ssh-keygen -b 4096 -t rsa -N "" -C "your_github_email_account@example.com" -f id_rsa.$var
done
Server CI configuration
For the CI server to work correctly and fetch the necessary code from GitHub, several steps are required:
- Install git:
apt-get update && apt-get -y install git
- Generate the necessary keys (see the script above)
- Import the public part of each key into GitHub (in the repository under Settings->Deploy Keys, and give it a meaningful name)
- Add to /home/jenkins-user-name/.ssh/config:
Host REPO_NAME.github.com
    Hostname github.com
    User git
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_rsa.REPO_NAME
chmod 600 ~/.ssh/config
chmod 700 ~/.ssh
- Add GitHub as a known host:
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
- Check:
ssh -T REPO_NAME.github.com
- Note: configure the repository as a plain git repo (not GitHub), using a URL of the form
ssh://REPO_NAME.github.com/fedor-malyshkin/REPO_NAME
- Read link 1
- Read link 2
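The ~/.ssh/config entry from the steps above has to be repeated for every repository, so it may be convenient to generate it. A minimal sketch follows; the helper name ssh_config_stanza is hypothetical, and the output mirrors the entry shown in the steps.

```shell
#!/bin/sh
# Hypothetical helper: print the ~/.ssh/config stanza for one repository.
# The tilde is left unexpanded on purpose; ssh resolves it itself.
ssh_config_stanza() {
    repo="$1"
    cat <<EOF
Host ${repo}.github.com
    Hostname github.com
    User git
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_rsa.${repo}
EOF
}
```

It could be used as `ssh_config_stanza story_line2_crawler >> ~/.ssh/config`, run as the Jenkins user and followed by the chmod commands from the steps.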
Continuous Deployment. Production server configuration
On the production server, a task run by cron does the following:
- compares the SHA1 of the tag latest in the remote repo story_line2_deployment with the currently stored value;
- if they differ, calls git pull (and gets the new version of scripts, configs and so on…). After that it calls puppet apply (through the batch file “provision_production.sh”).
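The check described above might look roughly like the following sketch. It is not the project's actual script: the checkout location, the state file path, the function names and the assumption that provision_production.sh sits in the repository root are all illustrative.

```shell
#!/bin/sh
# Sketch of the cron-driven deployment check (all paths are assumptions).
REPO_DIR="${REPO_DIR:-/opt/story_line2_deployment}"               # assumed checkout location
STATE_FILE="${STATE_FILE:-/var/tmp/story_line2_deployment.sha1}"  # assumed file with the stored SHA1

# Print the SHA1 that the remote tag "latest" currently points to.
remote_sha() {
    git -C "$REPO_DIR" ls-remote origin refs/tags/latest | cut -f1
}

# Succeed when the given remote SHA1 is non-empty and differs from the stored one.
needs_update() {
    remote="$1"
    stored=""
    [ -f "$STATE_FILE" ] && stored="$(cat "$STATE_FILE")"
    [ -n "$remote" ] && [ "$remote" != "$stored" ]
}

# Pull and provision only when the tag has moved; store the new SHA1 on success.
deploy_if_changed() {
    sha="$(remote_sha)" || return 1
    if needs_update "$sha"; then
        git -C "$REPO_DIR" pull &&
        (cd "$REPO_DIR" && sh ./provision_production.sh) &&
        printf '%s\n' "$sha" > "$STATE_FILE"
    fi
}
```

A cron entry would then source this file and call deploy_if_changed. The stored SHA1 is written only after provisioning succeeds, so a failed run is retried on the next cron tick.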