Azure-Samples / azure-sql-db-change-stream-debezium

License: MIT
SQL Server Change Stream sample using Debezium

Projects that are alternatives of or similar to azure-sql-db-change-stream-debezium

Real-time-Data-Warehouse
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Stars: ✭ 52 (-29.73%)
Mutual labels:  cdc, change-data-capture, debezium
debezium-incubator
Previously used repository for new Debezium modules and connectors in incubation phase (archived)
Stars: ✭ 89 (+20.27%)
Mutual labels:  cdc, change-data-capture, debezium
southpaw
⚾ Streaming left joins in Kafka for change data capture
Stars: ✭ 48 (-35.14%)
Mutual labels:  cdc, change-data-capture, debezium
scylla-cdc-source-connector
A Kafka source connector capturing Scylla CDC changes
Stars: ✭ 19 (-74.32%)
Mutual labels:  cdc, change-data-capture, debezium
Debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Stars: ✭ 5,937 (+7922.97%)
Mutual labels:  cdc, change-data-capture, debezium
kafka-connect-http
Kafka Connect connector that enables Change Data Capture from JSON/HTTP APIs into Kafka.
Stars: ✭ 81 (+9.46%)
Mutual labels:  cdc, change-data-capture
redis-microservices-demo
Microservice application with various Redis use-cases with RediSearch, RedisGraph and Streams. The data is synchronized between MySQL and Redis using Debezium as a CDC engine
Stars: ✭ 48 (-35.14%)
Mutual labels:  cdc, debezium
OpenLogReplicator
Open Source Oracle database CDC written purely in C++. Reads transactions directly from database redo log files and streams in JSON or Protobuf format to: Kafka, RocketMQ, flat file, network stream (plain TCP/IP or ZeroMQ)
Stars: ✭ 112 (+51.35%)
Mutual labels:  cdc, change-data-capture
MySqlCdc
MySQL/MariaDB binlog replication client for .NET
Stars: ✭ 71 (-4.05%)
Mutual labels:  cdc, change-data-capture
oracdc
Oracle database CDC (Change Data Capture)
Stars: ✭ 51 (-31.08%)
Mutual labels:  cdc, change-data-capture
azure-sql-db-prisma
Full-Stack End-To-End implementation - both with REST and GraphQL support - with Azure SQL and Prisma.io of the well-known To-do list sample.
Stars: ✭ 35 (-52.7%)
Mutual labels:  azure-functions, azure-sql
pgcapture
A scalable Netflix DBLog implementation for PostgreSQL
Stars: ✭ 94 (+27.03%)
Mutual labels:  cdc, change-data-capture
Realtime
Listen to your PostgreSQL database in realtime via websockets. Built with Elixir.
Stars: ✭ 4,278 (+5681.08%)
Mutual labels:  cdc, change-data-capture
serverless-full-stack-apps-azure-sql
Full stack solution using Javascript, Azure Static Web Apps, Azure Function, Azure SQL Database and a microservice architecture to monitor in real-time public transportation data, create a geofence and send notification when geofence is activated
Stars: ✭ 72 (-2.7%)
Mutual labels:  azure-functions, azure-sql
debezium.github.io
Source for the Debezium website; Please log issues in our tracker at https://issues.redhat.com/projects/DBZ/.
Stars: ✭ 34 (-54.05%)
Mutual labels:  change-data-capture, debezium
durable-functions-producer-consumer
Uses Durable Functions' fan out pattern to load N messages across M sessions into various Azure Storage/Messaging services. Includes the ability to consume the messages with another Azure Function & load timing data in to Event Hubs for ingestion in to analytics services like Azure Data Explorer
Stars: ✭ 31 (-58.11%)
Mutual labels:  azure-functions, event-hubs
DacFx
SQL Server database schema validation, deployment, and upgrade runtime. Enables declarative database development and database portability across SQL Server versions and environments.
Stars: ✭ 152 (+105.41%)
Mutual labels:  sql-server, azure-sql
TiBigData
TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+159.46%)
Mutual labels:  cdc
hashflags-function
⚡ A collection of Azure functions related to Twitter hashflags
Stars: ✭ 15 (-79.73%)
Mutual labels:  azure-functions
azure-functions-python-library
Azure Functions Python SDK
Stars: ✭ 95 (+28.38%)
Mutual labels:  azure-functions
page_type: sample
languages: tsql, sql, csharp
products: azure-sql-database, sql-server, azure-sql-managed-instance, azure-sqlserver-vm, azure, dotnet, dotnet-core, azure-functions, azure-event-hubs
description: Create a Change Data Feed from Azure SQL or SQL Server using the Open Source tool Debezium
urlFragment: azure-sql-db-change-stream-debezium

Azure SQL / SQL Server Change Stream with Debezium

Azure SQL Database and SQL Server Change Stream sample using Debezium. A change feed or change stream allows applications to access real-time data changes, using standard technologies and well-known APIs, to create modern applications that take advantage of the full power of a database like SQL Server.

Debezium makes use of Change Data Capture, so it can be used with SQL Server on-premises and with the whole Azure SQL family (https://azure.microsoft.com/en-us/products/azure-sql): Azure SQL MI, Azure SQL DB, and SQL Server on a VM.

With Debezium and Azure SQL / SQL Server you can not only create more modern, reactive applications that handle data changes in near real time with minimal impact on the database, but you can also use it to implement a hybrid IT strategy: keep using on-premises SQL Server while relying on Azure for your computing needs, taking advantage of PaaS offerings like Event Hubs and Azure Functions. This sample shows how to do that.

SQL Server Change Stream

For those who want more details: Debezium uses Apache Kafka, which is not natively available on Azure as a PaaS offering. Luckily, Event Hubs exposes a largely Kafka-compatible endpoint, so we can use it in place of Kafka and make maintenance and scaling easier.
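Concretely, Kafka clients talk to an Event Hubs namespace through the namespace host on port 9093. A minimal sketch of deriving that bootstrap address, assuming the namespace name debezium used later in this guide:

```shell
# Sketch: the Kafka-compatible endpoint of an Event Hubs namespace is the
# namespace host on port 9093. EH_NAME matches the namespace created later.
EH_NAME=debezium
KAFKA_BOOTSTRAP_SERVERS="${EH_NAME}.servicebus.windows.net:9093"
echo "${KAFKA_BOOTSTRAP_SERVERS}"
```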

Event Hubs for Kafka

Step by Step Guide

This step by step guide uses Wide World Importers sample database from here:

https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0

Make sure you download the OLTP database if you want to follow this guide without having to change a thing. To restore the database on Azure SQL, you can use the scripts provided here: Restore Database in Azure SQL.

Create Debezium User

Debezium needs to query the database, so a dedicated user is used throughout the sample. The user has db_owner access to keep the scripts simple; in the real world you may want to tighten security a bit more.

To create the login and user run the script 00-setup-database-user.sql from the sql folder (pertinent to the service or product you are using, Azure SQL or SQL Server / Azure SQL MI) on the database where you have restored Wide World Importers database.
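The authoritative statements are in 00-setup-database-user.sql; as a rough sketch of what such a script typically contains (login name and password are the placeholders reused in the connector configuration later in this guide):

```sql
-- Sketch only: see the script in the sql folder for the real statements.
-- Create the server-level login (run on master for Azure SQL DB).
CREATE LOGIN [debezium-wwi] WITH PASSWORD = 'Abcd1234!';

-- Create the database user and grant db_owner, as the sample does.
CREATE USER [debezium-wwi] FOR LOGIN [debezium-wwi];
ALTER ROLE db_owner ADD MEMBER [debezium-wwi];
```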

Enable Change Data Capture

Debezium uses Change Data Capture to capture all the changes done to selected tables.

In this sample only two tables are monitored:

  • Sales.Orders
  • Warehouse.StockItems

The script 01-enable-cdc.sql enables Change Data Capture on the aforementioned tables.
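For reference, enabling CDC boils down to two system stored procedures: one at the database level and one per table. A minimal sketch (the real statements are in 01-enable-cdc.sql):

```sql
-- Enable Change Data Capture at the database level first...
EXEC sys.sp_cdc_enable_db;

-- ...then on each monitored table.
EXEC sys.sp_cdc_enable_table
    @source_schema = N'Sales',
    @source_name   = N'Orders',
    @role_name     = NULL;  -- NULL = no gating role (sketch assumption)

EXEC sys.sp_cdc_enable_table
    @source_schema = N'Warehouse',
    @source_name   = N'StockItems',
    @role_name     = NULL;
```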

Create an Azure Event Hubs

All data gathered by Change Data Capture will be sent to Event Hubs, so create an Azure Event Hubs namespace in your Azure subscription. Using the Azure Cloud Shell (Bash):

# create group
az group create -n debezium -l eastus

# create eventhubs namespace with kafka enabled
az eventhubs namespace create -n debezium -g debezium -l eastus --enable-kafka

Later in the configuration process you'll need the EventHubs connection string, so grab it and store it somewhere:

az eventhubs namespace authorization-rule keys list -g debezium --namespace-name debezium -n RootManageSharedAccessKey --query "primaryConnectionString" -o tsv

Run Debezium

Pre-Requisites

In order to run Debezium you have to install and configure Apache Kafka, Apache ZooKeeper, and Kafka Connect. If you already know how to do that, or you already have a testing or development environment, perfect: go and install the Debezium SQL Server Connector: Installing a Debezium Connector.

If you prefer a leaner, quicker way to start using Debezium, you can use the Debezium Docker image, which provides everything you need to run a test instance of Debezium. Just make sure you have Docker and Docker Compose installed. In the debezium/on-prem folder you can find all the scripts needed to run Debezium using Docker.

Configure Environment

Docker Compose will use .env to get the environment variable values used in the .yaml configuration file. The provided .env.template file looks like the following:

DEBEZIUM_VERSION=1.6
EH_NAME=debezium
EH_CONNECTION_STRING=

Copy it to create a new .env file. Leave the version set to 1.6, change EH_NAME to the Event Hubs namespace name you created before, and set EH_CONNECTION_STRING to the Event Hubs connection string you retrieved earlier. Make sure not to add any extra single or double quotes.

The .yaml file

If you are just interested in testing Debezium you can safely skip this section and move to the next one to start Debezium. If you want to understand how to make Debezium work with Event Hubs, read on.

Debezium needs Apache Kafka to run, NOT Azure Event Hubs. Luckily for us, Azure Event Hubs exposes a Kafka-compatible endpoint, so we can still enjoy Kafka with all the comfort of a PaaS offering. There are a few tweaks needed in order to make Debezium work with Azure Event Hubs.

First of all, Event Hubs requires authentication. This part is taken care of by the configuration settings that look like the following:

- *_SECURITY_PROTOCOL=SASL_SSL
- *_SASL_MECHANISM=PLAIN
- *_SASL_JAAS_CONFIG=[...]

Documentation on EventHubs Kafka Authentication and Kafka Connect is available here:

Integrate Apache Kafka Connect support on Azure Event Hubs

Since we're running a Docker Image, we cannot really change the configuration file, but Debezium allows pass-through configurations:

Debezium Connect-Base

There is an additional caveat to keep in mind. Event Hubs authentication uses the string $ConnectionString as the username. To prevent Docker Compose from treating it as a variable, a double dollar sign ($$) needs to be used:

Docker Compose Config File Variable Substitution
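Putting these pieces together, a hedged sketch of the relevant environment section of the Compose file (the variable names follow the Kafka Connect pass-through convention; the authoritative file ships in debezium/on-prem):

```yaml
# Sketch only: producer/consumer/worker prefixes follow the same pattern.
environment:
  - BOOTSTRAP_SERVERS=${EH_NAME}.servicebus.windows.net:9093
  - CONNECT_SECURITY_PROTOCOL=SASL_SSL
  - CONNECT_SASL_MECHANISM=PLAIN
  # $$ConnectionString keeps Docker Compose from expanding it as a variable
  - CONNECT_SASL_JAAS_CONFIG=org.apache.kafka.common.security.plain.PlainLoginModule required username="$$ConnectionString" password="${EH_CONNECTION_STRING}";
```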

Two other options useful for running Debezium on Azure Event Hubs are the following:

- CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
- CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=true

They control whether the schema is sent along with the data. Since Azure Event Hubs only supports values, as opposed to Apache Kafka, where everything is a key-value pair, schema generation for the key section can be safely turned off. While this could also be done for the value part, it is not recommended, as some data types are serialized in a Kafka-specific way and you need to know their "semantic" type in order to recreate the correct value.

Debezium SQL Server Connector Data Types

Here's a sample of a schema for a "create" (INSERT) event:

Debezium SQL Server Connector Create Event Sample
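For orientation, here is an abbreviated, hand-trimmed sketch of the payload section of a create event (field values are illustrative; when schema generation is enabled, a schema section precedes the payload):

```json
{
  "payload": {
    "before": null,
    "after": { "OrderID": 73625, "CustomerID": 941 },
    "source": {
      "connector": "sqlserver",
      "db": "WideWorldImporters",
      "schema": "Sales",
      "table": "Orders"
    },
    "op": "c",
    "ts_ms": 1564958159010
  }
}
```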

Start Debezium

Debezium can now be started. If you're using the Docker images you can do this by running debezium/on-prem/start-debezium.ps1 (or the .sh file if you're on Linux/WSL).

Once the startup has finished, you'll see something like

[Worker clientId=connect-1, groupId=1] Finished starting connectors and tasks   [org.apache.kafka.connect.runtime.distributed.DistributedHerder]

You will also see three topics (or Event Hubs, to use the Azure nomenclature), which you can list with:

az eventhubs eventhub list -g debezium --namespace debezium -o table

and the result will show:

  • debezium_configs
  • debezium_offsets
  • debezium_statuses

To explore Azure Event Hubs, it is strongly suggested to download and use Service Bus Explorer.

Register SQL Server Connector

Now that Debezium is running, the SQL Server Connector (which is used for connecting to both Azure SQL and SQL Server) can be registered. Before doing that, make sure to specify the correct connection details for your SQL Server instance in the debezium/register-sqlserver-eh.json file. You can create one using the provided .template file.

If you are using the Wide World Importers database, the only values you have to change are:

"database.hostname": "<server>.database.windows.net",
"database.dbname": "<database-name>",

If you are following the step-by-step guide using a database of yours, make sure to also correctly set values for

"database.user" : "debezium-wwi",
"database.password" : "Abcd1234!",

All the other values used are explained in detail here:

SQL Server Connector Configuration Values
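As a hedged sketch (names and values are illustrative; the authoritative file is debezium/register-sqlserver-eh.json), a Debezium SQL Server connector registration typically has this shape:

```json
{
  "name": "wwi-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "<server>.database.windows.net",
    "database.port": "1433",
    "database.user": "debezium-wwi",
    "database.password": "Abcd1234!",
    "database.dbname": "<database-name>",
    "database.server.name": "wwi",
    "table.include.list": "Sales.Orders,Warehouse.StockItems",
    "database.history.kafka.bootstrap.servers": "<namespace>.servicebus.windows.net:9093",
    "database.history.kafka.topic": "dbhistory.wwi"
  }
}
```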

Once the configuration file is set, just register that using debezium/on-prem/register-connector.ps1.

Depending on how big your tables are, it may take a while (more on this later). Once you see the following message:

Snapshot step 8 - Finalizing   [io.debezium.relational.HistorizedRelationalSnapshotChangeEventSource]

and no other errors or exceptions before it, you'll know that the SQL Server Connector is running correctly.

Make sample changes

Now that Debezium is running and fully configured, you can generate a new Sales Order and insert, update and delete some data in the StockItems table. You can use the following scripts:

./sql/.../02-create-new-sales-order.sql
./sql/.../03-modify-warehouse-stock.sql

After running the script you can use Service Bus Explorer or VS Code Event Hub Explorer to consume the stream of changes sent to Azure Event Hubs. You'll notice a new topic named wwi. That's where we instructed Debezium to send all the changes detected to the monitored tables.

Consume Change Stream using an Azure Functions

One way to quickly react to the Change Stream data coming from Debezium is to use Azure Functions. A sample is available in the azure-function folder. The easiest way to run the sample is to open it in VS Code, or to use the Azure Functions Core Tools via func start. VS Code will automatically recognize it as an Azure Function and download everything needed to run it.

Make sure you have a local.settings.json file that looks like the provided template. Copy the Azure Event Hubs connection string you retrieved at the beginning into the Debezium configuration option.
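A minimal sketch of such a local.settings.json (the keys other than Debezium are standard Azure Functions settings; treat the provided template as authoritative):

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "Debezium": "<event-hubs-connection-string>"
  }
}
```

The AzureWebJobsStorage value above points the runtime at the local storage emulator, which matches the troubleshooting note at the end of this guide.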

Start the function. As soon as the Azure Function runtime is running, the code will start to process the changes already available in EventHubs and you'll see something like this:

Event from Change Feed received:
- Object: Sales.Orders
- Operation: Insert
- Captured At: 2019-08-04T22:35:59.0100000Z
> OrderID = 73625
> CustomerID = 941
> SalespersonPersonID = 3
> PickedByPersonID =
> ContactPersonID = 3141
> BackorderOrderID =
> OrderDate = 8/4/2019 12:00:00 AM
> ExpectedDeliveryDate = 8/5/2019 12:00:00 AM
> CustomerPurchaseOrderNumber = 4923
> IsUndersupplyBackordered = False
> Comments = Auto-generated
> DeliveryInstructions = Unit 17, 1466 Deilami Road
> InternalComments =
> PickingCompletedWhen =
> LastEditedBy = 3
> LastEditedWhen = 8/4/2019 10:35:58 PM
Executed 'ProcessDebeziumPayload' (Succeeded, Id=ee9d1080-64ff-4039-83af-69c4b12fa85f)

Done

Congratulations, you now have a working Change Stream from SQL Server. This opens up a whole new set of possibilities! Have fun!

Notes

Running Debezium on Azure

If you're using Debezium with Azure SQL MI or Azure SQL DB, you may want to run Debezium on Azure too. Sample scripts to run the Debezium container on Azure Container Instances are available in the debezium/azure folder.

Connector Configuration

More details on SQL Server and Event Hubs specific configuration here:

SQL Server Connector Configuration Values

Troubleshooting

Local Azure Function not working

If, after starting the Azure Function via func start you get the following error:

The listener for function 'ProcessDebeziumPayload' was unable to start. Microsoft.WindowsAzure.Storage: No connection could be made because the target machine actively refused it. System.Net.Http: No connection could be made because the target machine actively refused it. System.Private.CoreLib: No connection could be made because the target machine actively refused it.

you need to make sure that Azure Storage Emulator is installed and started.
