All Projects → simonw → Csvs To Sqlite

simonw / Csvs To Sqlite

Licence: apache-2.0
Convert CSV files into a SQLite database

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Csvs To Sqlite

Pytablewriter
pytablewriter is a Python library to write a table in various formats: CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
Stars: ✭ 422 (-25.7%)
Mutual labels:  csv, pandas, sqlite
Visidata
A terminal spreadsheet multitool for discovering and arranging data
Stars: ✭ 4,606 (+710.92%)
Mutual labels:  csv, pandas, sqlite
pdpcli
PdpCLI is a pandas DataFrame processing CLI tool which enables you to build a pandas pipeline from a configuration file.
Stars: ✭ 15 (-97.36%)
Mutual labels:  csv, pandas
A-Detector
⭐ An anomaly-based intrusion detection system.
Stars: ✭ 69 (-87.85%)
Mutual labels:  csv, pandas
dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-94.72%)
Mutual labels:  csv, sqlite
California Coronavirus Data
The Los Angeles Times' independent tally of coronavirus cases in California.
Stars: ✭ 188 (-66.9%)
Mutual labels:  csv, pandas
spot price machine learning
Machine Learning for Spot Prices
Stars: ✭ 25 (-95.6%)
Mutual labels:  pandas, click
AlphaVantageAPI
An Opinionated AlphaVantage API Wrapper in Python 3.9. Compatible with Pandas TA (pip install pandas_ta). Get your FREE API Key at https://www.alphavantage.co/support/
Stars: ✭ 77 (-86.44%)
Mutual labels:  csv, pandas
Csv Diff
Python CLI tool and library for diffing CSV and JSON files
Stars: ✭ 118 (-79.23%)
Mutual labels:  csv, click
Sqlite Utils
Python CLI utility and library for manipulating SQLite databases
Stars: ✭ 368 (-35.21%)
Mutual labels:  sqlite, click
Pgloader
Migrate to PostgreSQL in a single command!
Stars: ✭ 3,754 (+560.92%)
Mutual labels:  csv, sqlite
Datasette
An open source multi-tool for exploring and publishing data
Stars: ✭ 5,640 (+892.96%)
Mutual labels:  csv, sqlite
Rightmove webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (-77.99%)
Mutual labels:  csv, pandas
30 Days Of Python
Learn Python for the next 30 (or so) Days.
Stars: ✭ 1,748 (+207.75%)
Mutual labels:  csv, pandas
DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
Stars: ✭ 843 (+48.42%)
Mutual labels:  csv, pandas
Rbql
🦜RBQL - Rainbow Query Language: SQL-like language for (not only) CSV file processing. Supports SQL queries with Python and JavaScript expressions
Stars: ✭ 118 (-79.23%)
Mutual labels:  csv, sqlite
astro
Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (-86.09%)
Mutual labels:  sqlite, pandas
Csvpack
csvpack library / gem - tools 'n' scripts for working with tabular data packages using comma-separated values (CSV) datafiles in text with meta info (that is, schema, datatypes, ..) in datapackage.json; download, read into and query CSV datafiles with your SQL database (e.g. SQLite, PostgreSQL, ...) of choice and much more
Stars: ✭ 71 (-87.5%)
Mutual labels:  csv, sqlite
Pygraphistry
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Stars: ✭ 1,365 (+140.32%)
Mutual labels:  csv, pandas
csv-to-sqlite
A desktop app to convert CSV files to SQLite databases!
Stars: ✭ 68 (-88.03%)
Mutual labels:  csv, sqlite

csvs-to-sqlite

PyPI Changelog Tests License

Convert CSV files into a SQLite database. Browse and publish that SQLite database with Datasette.

Basic usage:

csvs-to-sqlite myfile.csv mydatabase.db

This will create a new SQLite database called mydatabase.db containing a single table, myfile, containing the CSV content.

You can provide multiple CSV files:

csvs-to-sqlite one.csv two.csv bundle.db

The bundle.db database will contain two tables, one and two.

This means you can use wildcards:

csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

If you pass a path to one or more directories, the script will recursively search those directories for CSV files and create tables for each one.

csvs-to-sqlite ~/path/to/directory all-my-csvs.db

Handling TSV (tab-separated values)

You can use the -s option to specify a different delimiter. If you want to use a tab character you'll need to apply shell escaping like so:

csvs-to-sqlite my-file.tsv my-file.db -s $'\t'

Refactoring columns into separate lookup tables

Let's say you have a CSV file that looks like this:

county,precinct,office,district,party,candidate,votes
Clark,1,President,,REP,John R. Kasich,5
Clark,2,President,,REP,John R. Kasich,0
Clark,3,President,,REP,John R. Kasich,7

(Real example taken from the Open Elections project)

You can now convert selected columns into separate lookup tables using the new --extract-column option (shortname: -c) - for example:

csvs-to-sqlite openelections-data-*/*.csv \
    -c county:County:name \
    -c precinct:Precinct:name \
    -c office -c district -c party -c candidate \
    openelections.db

The format is as follows:

column_name:optional_table_name:optional_table_value_column_name

If you just specify the column name e.g. -c office, the following table will be created:

CREATE TABLE "office" (
    "id" INTEGER PRIMARY KEY,
    "value" TEXT
);

If you specify all three options, e.g. -c precinct:Precinct:name the table will look like this:

CREATE TABLE "Precinct" (
    "id" INTEGER PRIMARY KEY,
    "name" TEXT
);

The original tables will be created like this:

CREATE TABLE "ca__primary__san_francisco__precinct" (
    "county" INTEGER,
    "precinct" INTEGER,
    "office" INTEGER,
    "district" INTEGER,
    "party" INTEGER,
    "candidate" INTEGER,
    "votes" INTEGER,
    FOREIGN KEY (county) REFERENCES County(id),
    FOREIGN KEY (party) REFERENCES party(id),
    FOREIGN KEY (precinct) REFERENCES Precinct(id),
    FOREIGN KEY (office) REFERENCES office(id),
    FOREIGN KEY (candidate) REFERENCES candidate(id)
);

They will be populated with IDs that reference the new derived tables.

Installation

$ pip install csvs-to-sqlite

csvs-to-sqlite now requires Python 3. If you are running Python 2 you can install the last version to support Python 2:

$ pip install csvs-to-sqlite==0.9.2

csvs-to-sqlite --help

Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT         Field separator in input .csv
  -q, --quoting INTEGER        Control field quoting behavior per csv.QUOTE_*
                               constants. Use one of QUOTE_MINIMAL (0),
                               QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                               QUOTE_NONE (3).
  --skip-errors                Skip lines with too many fields instead of
                               stopping the import
  --replace-tables             Replace tables if they already exist
  -t, --table TEXT             Table to use (instead of using CSV filename)
  -c, --extract-column TEXT    One or more columns to 'extract' into a
                               separate lookup table. If you pass a simple
                               column name that column will be replaced with
                               integer foreign key references to a new table
                               of that name. You can customize the name of the
                               table like so:
                                   state:States:state_name
                               This will pull unique values from the 'state'
                               column and use them to populate a new 'States'
                               table, with an id column primary key and a
                               state_name column containing the strings from
                               the original column.
  -d, --date TEXT              One or more columns to parse into ISO formatted
                               dates
  -dt, --datetime TEXT         One or more columns to parse into ISO formatted
                               datetimes
  -df, --datetime-format TEXT  One or more custom date format strings to try
                               when parsing dates/datetimes
  -pk, --primary-key TEXT      One or more columns to use as the primary key
  -f, --fts TEXT               One or more columns to use to populate a full-
                               text index
  -i, --index TEXT             Add index on this column (or a compound index
                               with -i col1,col2)
  --shape TEXT                 Custom shape for the DB table - format is
                               csvcol:dbcol(TYPE),...
  --filename-column TEXT       Add a column with this name and populate with
                               CSV file name
  --no-index-fks               Skip adding index to foreign key columns
                               created using --extract-column (default is to
                               add them)
  --no-fulltext-fks            Skip adding full-text index on values extracted
                               using --extract-column (default is to add them)
  --just-strings               Import all columns as text strings by default
                               (and, if specified, still obey --shape,
                               --date/datetime, and --datetime-format)

  --version                    Show the version and exit.
  --help                       Show this message and exit.
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].