
ing-bank / flink-deployer

License: MIT license
A tool that helps automate deployments to an Apache Flink cluster

Programming Languages

go
31211 projects - #10 most used programming language
scala
5932 projects

Projects that are alternatives to or similar to flink-deployer

apache-flink-jdbc-streaming
Sample project for Apache Flink with Streaming Engine and JDBC Sink
Stars: ✭ 22 (-84.62%)
Mutual labels:  apache-flink, flink
Lidea
A real-time monitoring platform for large-scale distributed systems.
Stars: ✭ 28 (-80.42%)
Mutual labels:  flink
Flink Commodity Recommendation System
🐳 A real-time product recommendation system based on Flink. Redis is used to cache hot data. When a user produces a rating event, the data is sent via Kafka to Flink, which generates real-time and offline recommendations from the user's rating history. Real-time recommendations include behavior-based and real-time popularity; offline recommendations include historical popularity, historically high-quality products, and ItemCF.
Stars: ✭ 167 (+16.78%)
Mutual labels:  flink
Flink Doc Zh
Apache Flink 中文文档
Stars: ✭ 242 (+69.23%)
Mutual labels:  flink
Nussknacker
Process authoring tool for Apache Flink
Stars: ✭ 182 (+27.27%)
Mutual labels:  flink
dagger
Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.
Stars: ✭ 238 (+66.43%)
Mutual labels:  apache-flink
Flink Clickhouse Sink
Flink sink for Clickhouse
Stars: ✭ 165 (+15.38%)
Mutual labels:  flink
FlinkExperiments
Experiments with Apache Flink.
Stars: ✭ 3 (-97.9%)
Mutual labels:  flink
flink-k8s
Example Apache Flink cluster on Kubernetes
Stars: ✭ 24 (-83.22%)
Mutual labels:  apache-flink
Flink Boot
The Lazy Squirrel (懒松鼠) Flink-Boot scaffold lets Flink fully embrace the Spring ecosystem, so developers can write distributed stream-processing programs in the familiar Java web development style, making the crossover simple. It aims to let developers implement business code quickly at a lower entry cost (no need to understand distributed-computing theory or the internals of the Flink framework). To further improve agility when building large projects on the scaffold, it integrates the Spring framework for bean management by default and bundles frameworks commonly used in microservice and web development, such as the MyBatis ORM framework, the Hibernate Validator validation framework, and the Spring Retry framework; see the scaffold features below for details.
Stars: ✭ 209 (+46.15%)
Mutual labels:  flink
Flink Recommandsystem Demo
🚁🚀 A real-time product recommendation system built on Flink. Flink computes product popularity and caches it in Redis, analyzes log data, and stores profile tags and real-time records in HBase. When a user requests recommendations, the popularity ranking is re-sorted according to the user's profile, collaborative-filtering and tag-based modules add related products to every item in the newly generated ranking, and the new list is returned to the user.
Stars: ✭ 3,115 (+2078.32%)
Mutual labels:  flink
Registry
Schema Registry
Stars: ✭ 184 (+28.67%)
Mutual labels:  flink
Flink Sql Cookbook
The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.
Stars: ✭ 189 (+32.17%)
Mutual labels:  flink
Sparkstreaming
💥 🚀 Wraps Spark Streaming with dynamic batch-time adjustment (computation runs only when data arrives); 🚀 supports adding and removing topics at runtime; 🚀 wraps Spark Streaming 1.6 with Kafka 0.10 to support SSL.
Stars: ✭ 179 (+25.17%)
Mutual labels:  flink
fdp-modelserver
An umbrella project for multiple implementations of model serving
Stars: ✭ 47 (-67.13%)
Mutual labels:  flink
Flinkx
Based on Apache Flink; supports data synchronization/integration and streaming SQL computation.
Stars: ✭ 2,651 (+1753.85%)
Mutual labels:  flink
proxima-platform
The Proxima platform.
Stars: ✭ 17 (-88.11%)
Mutual labels:  apache-flink
flink-client
Java library for managing Apache Flink via the Monitoring REST API
Stars: ✭ 48 (-66.43%)
Mutual labels:  flink
flink-spark-submiter
Submit Flink/Spark jobs from a local IDEA to a Yarn/Kubernetes cluster.
Stars: ✭ 157 (+9.79%)
Mutual labels:  flink
amazon-kinesis-data-analytics-flink-starter-kit
Amazon Kinesis Data Analytics Flink Starter Kit helps you develop a Flink application with Kinesis Stream as a source and Amazon S3 as a sink. It demonstrates the use of Session Window with AggregateFunction.
Stars: ✭ 35 (-75.52%)
Mutual labels:  apache-flink


Flink-deployer

A Go command-line utility to facilitate deployments to Apache Flink.

Currently, it supports several features:

  1. Listing jobs
  2. Deploying a new job
  3. Updating an existing job
  4. Terminating an existing job
  5. Querying Flink queryable state

For a full overview of the commands and flags, run flink-job-deployer help
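
For instance, here are a few illustrative invocations. The deploy flags below mirror the Kubernetes example later in this README; the list subcommand name is an assumption, so verify the exact commands with flink-job-deployer help:

flink-job-deployer list
flink-job-deployer deploy --file-name /tmp/flink-stateful-wordcount-assembly-0.jar --entry-class WordCountStateful --parallelism 2 --program-args "--intervalMs 1000"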

How to run locally

To be able to test the deployer locally, follow these steps:

  1. Build the CLI tool docker image: docker-compose build deployer
  2. optional: cd flink-sample-job; sbt clean assembly; cd .. (builds a jar with a small stateful test job)
  3. docker-compose up -d jobmanager taskmanager (starts a Flink jobmanager and taskmanager)
  4. docker-compose run deployer help (runs the Flink deployer with the argument help)

Repeat step 4 with any commands you'd like to try.

Run a sample job

Provided you ran step 2 of the above guide, a jar with a sample Flink job is available in the deployer. It will be mounted in the deployer container at the following path:

/tmp/flink-sample-job/flink-stateful-wordcount-assembly-0.jar

To deploy it, you can simply run the following (it's the default command specified in docker-compose.yml):

docker-compose run deployer

This will print a simple word count to the console; you can view it by checking the logs of the taskmanager as follows:

docker-compose logs -f taskmanager

If all went well, you should see the word counter continue from where it left off.

A list of some example commands to run can be found here.

Authentication

Apache Flink doesn't support any Web UI authentication out of the box. One custom approach is to put NGINX in front of Flink to protect the user interface. Even with NGINX, there are many different ways to add that authentication layer. To support the most basic one, we've added support for Basic Authentication.

You can inject the FLINK_BASIC_AUTH_USERNAME and FLINK_BASIC_AUTH_PASSWORD environment variables to configure basic authentication.
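
For example, when running through docker-compose (hypothetical credentials):

docker-compose run -e FLINK_BASIC_AUTH_USERNAME=flink -e FLINK_BASIC_AUTH_PASSWORD=s3cret deployer help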

Supported environment variables

  • FLINK_BASE_URL: base URL of Flink's API (required, e.g. http://jobmanageraddress:8081/)
  • FLINK_BASIC_AUTH_USERNAME: Basic authentication username used for authenticating to Flink
  • FLINK_BASIC_AUTH_PASSWORD: Basic authentication password used for authenticating to Flink
  • FLINK_API_TIMEOUT_SECONDS: Number of seconds until requests to the Flink API time out (e.g. 10)

Development

Managing dependencies

This project uses dep to manage all project dependencies residing in the vendor folder.

Run dep status to review the status of the included dependencies and the most recent available versions.
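
For example, the standard dep workflow looks like this:

dep ensure    # populate the vendor folder from Gopkg.toml/Gopkg.lock
dep status    # compare locked versions against the latest available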

Build

Build from source for your current machine:

go build ./cmd/cli

Build from source for a specific machine architecture:

env GOOS=linux GOARCH=amd64 go build ./cmd/cli

Build the Docker container locally to test the CLI tool:

docker-compose build deployer

Test

go test ./cmd/cli ./cmd/cli/flink ./cmd/cli/operations

Or with coverage:

sh test-with-coverage.sh

Docker

A Docker image for this repo is available from Docker Hub: nielsdenissen/flink-deployer

The image expects the following env vars:

FLINK_BASE_URL=http://localhost:8080
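
A minimal sketch of running the image directly (the list command here is an assumption; substitute whichever command you need):

docker run --rm -e FLINK_BASE_URL=http://localhost:8080 nielsdenissen/flink-deployer list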

Kubernetes

When running in Kubernetes (or OpenShift), you'll have to deploy the container to the cluster. One reason for this is that Flink will try to reroute you to the internal Kubernetes address of the cluster, which doesn't resolve from outside. Besides that, running inside the cluster gives you the necessary access to the stored savepoints when you're using persistent volumes to store them.

This section is aimed at providing you with a quick getting-started guide for deploying our container to Kubernetes. There are a few steps we'll need to take, which we describe below:

0. Run a Kubernetes cluster

If you don't have a Kubernetes cluster readily available, you can get started quickly by setting up a Minikube cluster.

minikube start

1. Set up a Flink cluster in Kubernetes

Flink has a guide on how to run a cluster in Kubernetes; you can find it here.

If you're using Minikube, be sure to first pull the images that Flink uses in its deployment configurations to your host; otherwise, Minikube will not be able to find them. To do so, perform a docker pull flink:latest on your host.
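
Assuming you follow Flink's guide, starting the cluster boils down to creating its example manifests (the file names below come from that guide and may differ between Flink versions):

kubectl create -f jobmanager-deployment.yaml
kubectl create -f taskmanager-deployment.yaml
kubectl create -f jobmanager-service.yaml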

2. Add the test jar (or your own job you want to run) to the deployer image

We now need to package the jar into the container so we can deploy it in Kubernetes. There are other ways around this, like storing the jar on a Persistent Volume or downloading it at runtime inside the container, but this is the easiest way to get started and still the technique we use.

To build the container with the jar packaged, you can use the Dockerfile-including-sample-job. Be sure to have created the jar for the test job in case you want to use it; see step 2 in the How to run locally section. Run this from the root of this repository:

docker build -t flinkdeployerstatefulwordcount:test -f Dockerfile-including-sample-job .

3. Run the deployer in Kubernetes

In this example, we'll show how to do a simple deploy of the sample job in this project to the cluster. For this we need a YAML file that tells Kubernetes what to do. Here's an example of what such a Kubernetes YAML could look like:

apiVersion: v1
kind: Pod
metadata:
    generateName: "flink-stateful-wordcount-deployer-"
spec:
    dnsPolicy: ClusterFirst
    restartPolicy: OnFailure
    containers:
    -   name: "flink-stateful-wordcount-deployer"
        image: "flinkdeployerstatefulwordcount:test"
        args:
        - "deploy"
        - "--file-name"
        - "/tmp/flink-stateful-wordcount-assembly-0.jar"
        - "--entry-class"
        - "WordCountStateful"
        - "--parallelism"
        - "2"
        - "--program-args"
        - "--intervalMs 1000"
        imagePullPolicy: Never
        env:
        -   name: FLINK_BASE_URL
            value: "http://flink-jobmanager:8081"

Go to the Kubernetes dashboard, click the Create + button, and copy-paste the above YAML. This should trigger a pod to be deployed that runs once and stops after deploying the sample job to the Flink cluster running in Kubernetes.
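
Alternatively, if you prefer kubectl over the dashboard, here's a sketch assuming you saved the YAML above as deployer-pod.yaml:

kubectl create -f deployer-pod.yaml
kubectl get pods    # look up the generated pod name (flink-stateful-wordcount-deployer-xxxxx)
kubectl logs flink-stateful-wordcount-deployer-xxxxx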

MINIKUBE USERS: In order to use local images with Minikube (i.e. images from your local Docker installation instead of Docker Hub), you need to perform the following steps:

  • Point minikube to your local docker: eval $(minikube docker-env) (See this guide for more info)
  • Rebuild the image as done in step 2 of this guide.
  • The imagePullPolicy in the yaml above must be set to Never.

4. Attach Persistent Volumes to all Flink containers

We won't outline this step completely, as it's a bit more involved for a getting-started guide. In order to recover jobs from savepoints, you'll need a Persistent Volume shared among all Flink nodes and the deployer. You'll need this in any case if you want to persist data and thus not lose anything in your Flink cluster running in Kubernetes. After creating a Persistent Volume and hooking it up to the existing Flink containers, you'll need to add something like the following to the YAML of the deployer (besides, of course, changing the command to, for instance, update):

        volumeMounts:
        -   name: flink-data
            mountPath: "/data/flink"
    volumes:
    -   name: flink-data
        persistentVolumeClaim:
            claimName: "PVC_FLINK"

The directory you put in your Persistent Volume should be the directory to which Flink stores its savepoints.
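
For completeness, here's a minimal sketch of a claim matching the PVC_FLINK name used above; the access mode and size are assumptions, so adjust them to your cluster:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
    name: "PVC_FLINK"    # placeholder name from the snippet above; real claim names must be lowercase
spec:
    accessModes:
    - ReadWriteMany    # shared among all Flink nodes and the deployer
    resources:
        requests:
            storage: 1Gi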

Copyright

All copyright of the project flink-job-deployer is held by Marc Rooding and Niels Denissen, 2017-2018.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].