
pacman82 / odbc2parquet

Licence: MIT license
A command line tool to query an ODBC data source and write the result into a parquet file.

ODBC to Parquet

A command line tool to query an ODBC data source and write the result into a parquet file.

  • Small memory footprint. Only holds one batch at a time in memory.
  • Fast. Makes efficient use of ODBC bulk reads, to lower IO overhead.
  • Flexible. Query any ODBC data source you have a driver for. MySQL, MS SQL, Excel, ...

Mapping of types in queries

The tool queries the ODBC data source for type information and maps each column to a Parquet type as follows:

ODBC SQL Type              Parquet Type
Decimal(p < 39, s)         Decimal(p,s)
Numeric(p < 39, s)         Decimal(p,s)
Bit                        Boolean
Double                     Double
Real                       Float
Float(p: 0..24)            Float
Float(p >= 25)             Double
Tiny Integer               Int8
Small Integer              Int16
Integer                    Int32
Big Int                    Int64
Date                       Date
Timestamp(p: 0..3)         Timestamp Milliseconds
Timestamp(p >= 4)          Timestamp Microseconds
Datetimeoffset(p: 0..3)    Timestamp Milliseconds (UTC)
Datetimeoffset(p >= 4)     Timestamp Microseconds (UTC)
Varbinary                  Byte Array
Long Varbinary             Byte Array
Binary                     Fixed Length Byte Array
All others                 Utf8 Byte Array

p is short for precision. s is short for scale. Intervals are inclusive.
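
To make the precision thresholds in the table concrete, here is a minimal shell sketch of the same lookup. It is not the tool's own code: the function name `parquet_type` and the lowercase type labels are hypothetical, and sending Decimal/Numeric with a precision of 39 or more to the catch-all row is an assumption read off the table.

```shell
# Illustrative lookup that mirrors the mapping table; NOT odbc2parquet's own code.
# The function name and the lowercase type labels are hypothetical.
# Usage: parquet_type <sql_type> [precision] [scale]
parquet_type() {
    pt_type=$1
    pt_p=${2:-0}
    pt_s=${3:-0}
    case "$pt_type" in
        decimal|numeric)
            # Assumption: precisions of 39 and above fall through to the catch-all row.
            if [ "$pt_p" -lt 39 ]; then
                echo "Decimal($pt_p,$pt_s)"
            else
                echo "Utf8 Byte Array"
            fi ;;
        bit)      echo "Boolean" ;;
        double)   echo "Double" ;;
        real)     echo "Float" ;;
        float)
            if [ "$pt_p" -le 24 ]; then echo "Float"; else echo "Double"; fi ;;
        tinyint)  echo "Int8" ;;
        smallint) echo "Int16" ;;
        integer)  echo "Int32" ;;
        bigint)   echo "Int64" ;;
        date)     echo "Date" ;;
        timestamp)
            if [ "$pt_p" -le 3 ]; then
                echo "Timestamp Milliseconds"
            else
                echo "Timestamp Microseconds"
            fi ;;
        datetimeoffset)
            if [ "$pt_p" -le 3 ]; then
                echo "Timestamp Milliseconds (UTC)"
            else
                echo "Timestamp Microseconds (UTC)"
            fi ;;
        varbinary|longvarbinary) echo "Byte Array" ;;
        binary)   echo "Fixed Length Byte Array" ;;
        *)        echo "Utf8 Byte Array" ;;
    esac
}

parquet_type decimal 18 2    # prints: Decimal(18,2)
parquet_type timestamp 7     # prints: Timestamp Microseconds
parquet_type varchar         # prints: Utf8 Byte Array
```

Because the intervals are inclusive, `timestamp 3` still yields milliseconds and `float 24` still yields Float.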

Installation

Prerequisites

To work with this tool you need an ODBC driver manager and an ODBC driver for the data source you want to access.

Windows

An ODBC driver manager is preinstalled on Windows. So are the ODBC Data Sources (64-bit) and ODBC Data Sources (32-bit) apps, which you can use to discover which drivers are already available on your system.

Linux

This tool links against libodbc.so, both at build time and at runtime. To get it, install unixODBC using your system's package manager. On Ubuntu, run:

sudo apt install unixodbc-dev
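
After installing, it can be useful to check that the driver manager is visible before going further. The sketch below assumes that unixODBC's `odbcinst` helper was installed along with the library, which depends on how your distribution packages it. The function name `check_unixodbc` is hypothetical.

```shell
# Hypothetical helper: report whether unixODBC's odbcinst tool is usable.
check_unixodbc() {
    if command -v odbcinst >/dev/null 2>&1; then
        # Print the unixODBC version and the locations of odbcinst.ini / odbc.ini.
        odbcinst -j || return 1
        # List the driver names registered in odbcinst.ini.
        odbcinst -q -d || echo "No drivers registered yet."
    else
        echo "odbcinst not found; unixODBC may not be installed."
        return 1
    fi
}

check_unixodbc || echo "Prerequisite check did not pass."
```

An empty driver list only means that no ODBC driver has been registered yet. The driver for your data source is installed separately from the driver manager.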

OS-X

This tool links against the unixODBC library libodbc (libodbc.dylib on macOS), both at build time and at runtime. To get it, install unixODBC. I recommend the Homebrew package manager, which lets you install it using:

brew install unixodbc

Download binary from GitHub

https://github.com/pacman82/odbc2parquet/releases/latest

Note: Download the 32-bit version if you want to connect to data sources using 32-bit drivers, and the 64-bit version if you want to connect via 64-bit drivers. A 32-bit build cannot load 64-bit drivers, and vice versa.

Via Cargo

If you have a Rust toolchain installed, you can install this tool via Cargo:

cargo install odbc2parquet

You can install Cargo via rustup: https://rustup.rs/.

Usage

Query using connection string

odbc2parquet query \
--connection-string "Driver={ODBC Driver 17 for SQL Server};Server=localhost;UID=SA;PWD=<YourStrong@Passw0rd>;" \
out.par  \
"SELECT * FROM Birthdays"

Query using data source name

odbc2parquet query \
--dsn my_db \
--password "<YourStrong@Passw0rd>" \
--user "SA" \
out.par \
"SELECT * FROM Birthdays"

List available ODBC drivers

odbc2parquet list-drivers

List available ODBC data sources

odbc2parquet list-data-sources

Use parameters in query

odbc2parquet query \
--connection-string "Driver={ODBC Driver 17 for SQL Server};Server=localhost;UID=SA;PWD=<YourStrong@Passw0rd>;" \
out.par  \
"SELECT * FROM Birthdays WHERE year > ? and year < ?" \
1990 2010
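
The trailing arguments are bound to the `?` placeholders in order. As a minimal sketch of how this could be scripted, the helper below writes one file per range of years. The function name `export_years` and the output file names are hypothetical. The script defaults to a dry run that only prints each command (without shell quoting), so nothing is executed until you set `ODBC2PARQUET=odbc2parquet`.

```shell
# Dry run by default: "echo" prints the command instead of running it.
# Set ODBC2PARQUET=odbc2parquet to execute for real.
ODBC2PARQUET=${ODBC2PARQUET:-echo}
CONNECTION_STRING='Driver={ODBC Driver 17 for SQL Server};Server=localhost;UID=SA;PWD=<YourStrong@Passw0rd>;'

# Hypothetical helper: export the rows whose year lies in [$1, $2], both inclusive.
export_years() {
    # The query uses strict comparisons, so widen the bounds by one on each side.
    ey_lower=$(( $1 - 1 ))
    ey_upper=$(( $2 + 1 ))
    "$ODBC2PARQUET" query \
        --connection-string "$CONNECTION_STRING" \
        "birthdays_${1}_${2}.par" \
        "SELECT * FROM Birthdays WHERE year > ? and year < ?" \
        "$ey_lower" "$ey_upper"
}

# One output file per decade.
for start in 1980 1990 2000; do
    export_years "$start" "$(( start + 9 ))"
done
```

Binding the years as parameters, rather than pasting them into the SQL text, keeps the query string identical across runs and avoids quoting problems.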

Inserting data into a database

odbc2parquet insert \
--connection-string "Driver={ODBC Driver 17 for SQL Server};Server=localhost;UID=SA;PWD=<YourStrong@Passw0rd>;" \
input.par \
MyTable

Use odbc2parquet --help to see all options.

Links

Thanks to @samaguire, there is a script for PowerShell users that helps you download a bunch of tables to a folder: https://github.com/samaguire/odbc2parquet-PSscripts
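
For other shells, a similar effect can be approximated with a short loop. This is only a sketch and is not taken from the linked scripts: the function name `export_table`, the output directory, and the table names `Customers` and `Orders` are hypothetical, and it assumes a data source name that needs no extra credentials. It runs as a dry run unless `ODBC2PARQUET=odbc2parquet` is set.

```shell
# Dry run by default: "echo" prints the command instead of running it.
# Set ODBC2PARQUET=odbc2parquet to execute for real.
ODBC2PARQUET=${ODBC2PARQUET:-echo}
DSN=${DSN:-my_db}
OUT_DIR=${OUT_DIR:-export}

# Hypothetical helper: write one table to $OUT_DIR/<table>.par.
export_table() {
    # Table names cannot be bound as "?" parameters, so the name is spliced
    # into the query text. Only pass names you trust.
    "$ODBC2PARQUET" query --dsn "$DSN" "$OUT_DIR/$1.par" "SELECT * FROM $1"
}

mkdir -p "$OUT_DIR"
for table in Birthdays Customers Orders; do
    export_table "$table"
done
```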
