pbelathur / spring-boot-performance-analysis

Licence: GPL-3.0 license
How to tune Spring Boot + HikariCP for the cloud - avoiding the common mistakes

Prashanth PB Belathur
Staff Solutions Architect, VMWare Pivotal

TL;DR

A Spring Boot app that is well behaved on a single-tenant host can degrade the performance of neighboring applications when it is deployed on a multi-tenant host (cloud). Be wary of the Hikari connection pool configuration in your Spring Boot app: in most cases the pool is oversized, and a few of its properties default to their upper limit. Downsize these properties for deployment in a multi-tenant cloud environment to minimize the noisy neighbor impact.

Recommendations

Applicable to any Spring Boot application that uses HikariCP and is deployed in the cloud (PCF, AWS, Azure, etc.)

  • keep the pool size at 10 connections or fewer per application instance.

  • always specify the idle timeout; the 10-minute default (600000 ms) is usually too high for quick-response apps.

  • scale application instances sensibly, so that the total number of database connections across all application instances stays under 1000 and the noisy neighbour impact is minimized.

  • research the connection time limits imposed by your infrastructure, then set the maximum lifetime and connection timeout accordingly.

Overview

A critical spring boot application for a major shipping company uses Oracle as the datastore. The database server also hosts databases for other unrelated applications. The critical application would behave and perform well when running in a VM, but when deployed in Pivotal Cloud Foundry (PCF), all the other applications in the same PCF org/space encountered severe performance degradation.

I outline the technique used to identify and analyze the cause of the performance degradation, and to fine-tune the Spring Boot application so that its noisy neighbor impact on other applications is minimized.

Tools

  • Micrometer to expose the metrics from the spring boot application
  • Prometheus to store the metric data and aggregate it as time series
  • Grafana to visualize the aggregated metric data from Prometheus
  • Docker to run Prometheus and Grafana in containers
  • JMeter for load tests

Figure: analysis toolchain

Setup

  1. Configure Actuator and Prometheus Registry in your Spring Boot 2.x app

    build.gradle

    dependencies {
        ...    
        implementation 'org.springframework.boot:spring-boot-starter-web'
        implementation 'org.springframework.boot:spring-boot-starter-actuator'
        implementation 'io.micrometer:micrometer-registry-prometheus'
        ...
    }
    
  2. Clone this repo: https://github.com/pbelathur/spring-boot-performance-analysis.git

  3. In spring-boot-performance-analysis/docker/prometheus.yml, replace LOCAL_MACHINE_IP with the actual IP address of the machine running Docker

    scrape_configs:
    - job_name: 'performance-troubleshooter'
      scrape_interval: 5s
      metrics_path: '/actuator/prometheus'
      static_configs:
        - targets: ['LOCAL_MACHINE_IP:PORT']
    
    • LOCAL_MACHINE_IP must NOT be localhost or 127.0.0.1, because Prometheus and Grafana run as Docker containers.
    • PORT is the Spring Boot application port, usually 8080.
  4. Start Prometheus and Grafana on your computer: docker-compose up

  5. Verify Prometheus can communicate with your spring boot application

    • open http://localhost:9090/targets in a web browser and check that the target's state is UP
  6. Verify Grafana can communicate with Prometheus

    • open http://localhost:3000 in a web browser
    • under the Recently viewed dashboards look for the entry Spring Boot 2.1 Statistics
    • click on Spring Boot 2.1 Statistics and look for Instance = LOCAL_MACHINE_IP:PORT specified in prometheus.yml
  7. Set up a JMeter load test against a REST API endpoint of your Spring Boot app with number-of-threads=240, ramp-up-period=30s and loop-count=25

Execution

  1. Start your Spring Boot application

  2. Start JMeter load test on your computer

  3. While the load test is running, open http://localhost:3000 and go to the HikariCP Statistics section of the Spring Boot 2.1 Statistics dashboard

    Figure: basic HikariCP statistics in Grafana

    • Connection Size is the total number of connections in the DB connection pool (active + idle + pending).
    • Connections is the count of active + idle + pending connections over a rolling time window.
    • Connection Usage Time is approximately equal to the DB query execution time.
    • Connection Acquire Time is the time a request waits to obtain a connection from the pool.
    • Connection Creation Time is the time taken to create a new database connection.

Observations

| Situation | active | idle | pending | Notes |
|---|---|---|---|---|
| noisy neighbor | 0 | > maximumPoolSize / 2 and > minimumIdle | 0 | If this condition is observed when no requests are arriving and a considerable time has passed since the last request, the Spring Boot app is a potential noisy neighbor: its idle connections are not being retired, so they keep consuming system resources on the database server, which increases connection times and decreases throughput for other applications using the same database server. |
| sweet spot | maximumPoolSize | <= minimumIdle | < 2 x maximumPoolSize | Best possible state in terms of database performance and utilization, with the lowest chance of the app being a noisy neighbor. |
| inadequate connections | maximumPoolSize | <= minimumIdle | > 3 x maximumPoolSize | If a consistent spike is noticed in Connection Usage Time, increase the connection pool size in steps of 2 until you see a performance improvement. |

  • If Connections < active + idle + pending, there is a potential memory leak in the Spring Boot application that needs further investigation using memory/thread dump analysis.
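
The situations listed above can be turned into a small check, for example inside an alerting job that reads the HikariCP gauges. The following is a minimal sketch only: the class, enum and method names are hypothetical, the thresholds are copied from the Observations rows above, and any combination the rows do not cover is reported as UNCLASSIFIED.

```java
/** Hypothetical helper that maps pool gauges to the situations described in Observations. */
final class PoolHealth {

    enum Situation { NOISY_NEIGHBOR, SWEET_SPOT, INADEQUATE_CONNECTIONS, UNCLASSIFIED }

    /**
     * active, idle and pending are the current gauge values;
     * maximumPoolSize and minimumIdle are the configured HikariCP settings.
     */
    static Situation classify(int active, int idle, int pending,
                              int maximumPoolSize, int minimumIdle) {
        // No traffic, yet more than half the pool (and more than minimumIdle) sits idle.
        if (active == 0 && pending == 0
                && idle > maximumPoolSize / 2.0 && idle > minimumIdle) {
            return Situation.NOISY_NEIGHBOR;
        }
        // Pool fully used and idle connections at or below the configured minimum.
        if (active == maximumPoolSize && idle <= minimumIdle) {
            if (pending < 2 * maximumPoolSize) {
                return Situation.SWEET_SPOT;
            }
            if (pending > 3 * maximumPoolSize) {
                return Situation.INADEQUATE_CONNECTIONS;
            }
        }
        // Anything else is outside the rows of the table.
        return Situation.UNCLASSIFIED;
    }

    public static void main(String[] args) {
        // Example values are made up: pool of 10, minimumIdle of 3, no traffic, 8 idle connections.
        System.out.println(classify(0, 8, 0, 10, 3));
    }
}
```

Such a check is only useful when it is evaluated over a period of time, since the noisy neighbor row applies only after a considerable time without requests.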

Analysis

The analysis uses a Spring Boot application whose service takes 50ms to complete a database query on a single connection. This example is used to show how to calculate the connection pool size, idle pool size and timeouts.

Connection Pool Size

  • spring.datasource.hikari.maximum-pool-size

  • 50ms/database query => 1000ms / 50ms = 20 database queries/sec per connection

  • If pool size = 10 connections on a single app instance, then we can handle 20 x 10 = 200 queries/sec per instance.

  • If we scale the app to 20 instances, we can handle 200 x 20 = 4,000 queries/sec across the 20 instances, by using 10 x 20 = 200 connections

    Keeping the pool size at or below 10 connections per app instance, and scaling app instances sensibly so that the total number of DB connections stays below 1000 across all app instances (especially for Oracle), minimizes the noisy neighbour impact in PCF.
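
The sizing arithmetic can be written down as a few helper methods. This is a back-of-the-envelope sketch, not a capacity model: it assumes every query holds exactly one connection for a fixed time, and the class and method names are hypothetical.

```java
/** Hypothetical helper for rough connection pool sizing. */
final class PoolSizing {

    /** Queries one connection can serve per second (rounded down) if each query takes queryMillis. */
    static long queriesPerSecondPerConnection(long queryMillis) {
        if (queryMillis <= 0) {
            throw new IllegalArgumentException("queryMillis must be positive");
        }
        return 1000 / queryMillis;
    }

    /** Queries one app instance can serve per second with the given pool size. */
    static long queriesPerSecondPerInstance(long queryMillis, int poolSize) {
        return queriesPerSecondPerConnection(queryMillis) * poolSize;
    }

    /** Database connections used by all app instances together. */
    static int totalConnections(int poolSize, int instances) {
        return poolSize * instances;
    }

    /** Largest instance count that keeps total connections strictly below the budget. */
    static int maxInstancesWithinBudget(int poolSize, int connectionBudget) {
        return (connectionBudget - 1) / poolSize;
    }

    public static void main(String[] args) {
        long queryMillis = 50;   // from the example: 50ms per database query
        int poolSize = 10;       // connections per app instance
        int instances = 20;      // app instances

        System.out.println("queries/sec per connection: " + queriesPerSecondPerConnection(queryMillis));
        System.out.println("queries/sec per instance:   " + queriesPerSecondPerInstance(queryMillis, poolSize));
        System.out.println("queries/sec in total:       " + queriesPerSecondPerInstance(queryMillis, poolSize) * instances);
        System.out.println("connections in total:       " + totalConnections(poolSize, instances));
        System.out.println("max instances under 1000:   " + maxInstancesWithinBudget(poolSize, 1000));
    }
}
```

With a pool of 10 and a budget of fewer than 1000 connections, the last helper gives 99 instances, which is the ceiling to keep in mind when configuring autoscaling.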

Idle Timeout

  • spring.datasource.hikari.idle-timeout

  • the 10-minute default (600000 ms) is high for most applications; lower it so that idle connections are reclaimed sooner and do not accumulate in the pool. HikariCP does not accept values below 10000 ms (10 s), so with an average database query time of 50ms, idle-timeout = 10000 is the tightest usable setting. The setting only takes effect when minimum-idle is less than maximum-pool-size.

Maximum lifetime

  • spring.datasource.hikari.max-lifetime

  • set this several seconds shorter than any connection time limit imposed by the database or infrastructure. The main idea is that the pool retires a connection before the infrastructure-imposed limit cuts it off.

Connection Timeout

  • spring.datasource.hikari.connection-timeout

  • the 30s default might be high for time-critical apps, so set the value based on the time criticality of the app, for example 5s-10s for time-critical applications. Making this value too small will result in SQLExceptions flooding the logs.
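
The three timeout settings can be sanity-checked together before they go into the configuration. The sketch below is illustrative only: the class and method names are hypothetical, the 5-minute infrastructure limit in main is a made-up example, and the lower bounds (10 s for idle-timeout, 30 s for max-lifetime, 250 ms for connection-timeout) are the minimums stated in the HikariCP README, which should be verified against the HikariCP version in use.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-deployment check for HikariCP timeout settings. */
final class HikariTimeoutCheck {

    // Lower bounds as documented in the HikariCP README (assumption: unchanged in your version).
    static final Duration MIN_IDLE_TIMEOUT = Duration.ofSeconds(10);
    static final Duration MIN_MAX_LIFETIME = Duration.ofSeconds(30);
    static final Duration MIN_CONNECTION_TIMEOUT = Duration.ofMillis(250);

    /** max-lifetime candidate: the infrastructure limit minus a safety margin. */
    static Duration maxLifetimeFor(Duration infrastructureLimit, Duration safetyMargin) {
        return infrastructureLimit.minus(safetyMargin);
    }

    /** Returns a human-readable list of problems; an empty list means the values look consistent. */
    static List<String> problems(Duration idleTimeout, Duration maxLifetime,
                                 Duration connectionTimeout, Duration infrastructureLimit) {
        List<String> problems = new ArrayList<>();
        // An idle-timeout of 0 means idle connections are never removed, so it is not range-checked.
        if (!idleTimeout.isZero() && idleTimeout.compareTo(MIN_IDLE_TIMEOUT) < 0) {
            problems.add("idle-timeout is below 10s");
        }
        if (maxLifetime.compareTo(MIN_MAX_LIFETIME) < 0) {
            problems.add("max-lifetime is below 30s");
        }
        if (maxLifetime.compareTo(infrastructureLimit) >= 0) {
            problems.add("max-lifetime is not shorter than the infrastructure limit");
        }
        if (connectionTimeout.compareTo(MIN_CONNECTION_TIMEOUT) < 0) {
            problems.add("connection-timeout is below 250ms");
        }
        return problems;
    }

    public static void main(String[] args) {
        Duration infrastructureLimit = Duration.ofMinutes(5); // made-up example limit
        Duration maxLifetime = maxLifetimeFor(infrastructureLimit, Duration.ofSeconds(30));
        System.out.println("max-lifetime in ms: " + maxLifetime.toMillis());
        System.out.println(problems(Duration.ofSeconds(10), maxLifetime,
                Duration.ofSeconds(10), infrastructureLimit));
    }
}
```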

Sample HikariCP Configuration

# maximum db connections in pool
spring.datasource.hikari.maximum-pool-size=10

# minimum number of idle connections maintained by HikariCP in the pool
spring.datasource.hikari.minimum-idle=3

# maximum time in milliseconds a connection may sit idle in the pool (10s, the lowest value HikariCP accepts)
spring.datasource.hikari.idle-timeout=10000

# maximum number of milliseconds that a client will wait for a connection from the pool (10s)
spring.datasource.hikari.connection-timeout=10000

# maximum lifetime in milliseconds of a connection in the pool (2m); an in-use connection is retired only after it is closed
spring.datasource.hikari.max-lifetime=120000