
rasbt / smilite

Licence: GPL-3.0 license
A Python module to retrieve and compare SMILES strings of chemical compounds from the free ZINC online database

smilite

smilite is a Python module to download and analyze SMILES strings (Simplified Molecular-Input Line-entry System) of chemical compounds from ZINC (a free database of commercially-available compounds for virtual screening, http://zinc.docking.org).
Now supports both Python 3.x and Python 2.x.

Sections

Installation
Simple command line online query scripts
      - lookup_zincid.py
      - lookup_smile_str.py
CSV file command line scripts
      - gen_zincid_smile_csv.py (downloading SMILES)
      - comp_smile_strings.py (checking for duplicates within 1 file)
      - comp_2_smile_files.py (checking for duplicates across 2 files)
SQLite file command line scripts
      - lookup_single_id.py
      - lookup_smile.py
      - add_to_sqlite.py
      - sqlite_to_csv.py
Changelog

Installation

You can use the following command to install smilite:
pip install smilite
or
easy_install smilite

Alternatively, you can download the package manually from the Python Package Index https://pypi.python.org/pypi/smilite, unzip it, navigate into the package, and use the command:

python3 setup.py install

Simple command line online query scripts

If you downloaded the smilite package from https://pypi.python.org/pypi/smilite or https://github.com/rasbt/smilite, you can use the command line scripts I provide in the scripts/cmd_line_online_query_scripts dir.

lookup_zincid.py

Retrieves the SMILES string and simplified SMILES string for a given ZINC ID
from the online ZINC database. ZINC12 is the default backend; passing the additional command-line argument zinc15 selects the ZINC15 database instead.

Usage:
[shell]>> python3 lookup_zincid.py ZINC_ID [zinc12/zinc15]

Example (retrieve data from ZINC):
[shell]>> python3 lookup_zincid.py ZINC01234567 zinc15

Output example:

ZINC01234567
C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O
CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O

Where

  • 1st row: ZINC ID
  • 2nd row: SMILES string
  • 3rd row: simplified SMILES string
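
The relationship between the 2nd and 3rd rows can be illustrated with a small sketch. Judging only from the output example above, simplification appears to drop stereo, charge, explicit-hydrogen and bond markers and to write aromatic atoms in uppercase. The function name simplify_smiles_sketch and the exact set of removed characters are assumptions made for illustration; smilite's own implementation may differ.

```python
# Hypothetical helper: the removed character set is inferred from the
# output example in this README, not taken from smilite's source code.
REMOVED_CHARS = set("[]@+-#=/\\H")


def simplify_smiles_sketch(smiles):
    """Drop marker characters and uppercase the remaining atom symbols."""
    return "".join(ch.upper() for ch in smiles if ch not in REMOVED_CHARS)


print(simplify_smiles_sketch("C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O"))
# prints CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O
```

Because stereo markers are discarded, stereoisomers such as ZINC01234567 and ZINC01234568 end up with the same simplified string, which is what makes the simplified form useful for duplicate checks.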

lookup_smile_str.py

Retrieves the corresponding ZINC_IDs for a given SMILES string
from the online ZINC database.

Usage:
[shell]>> python3 lookup_smile_str.py SMILE_str

Example (retrieve data from ZINC):
[shell]>> python3 lookup_smile_str.py "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O"

Output example:

ZINC01234567
ZINC01234568
ZINC01242053
ZINC01242055

CSV file command line scripts

If you downloaded the smilite package from https://pypi.python.org/pypi/smilite or https://github.com/rasbt/smilite, you can use the command line scripts I provide in the scripts/csv_scripts dir.

gen_zincid_smile_csv.py (downloading SMILES)

Generates a ZINC_ID,SMILE_STR CSV file from an input file of ZINC IDs. The input file should consist of 1 column with 1 ZINC ID per row. ZINC12 is the default backend; passing the additional command-line argument zinc15 selects the ZINC15 database instead.

Usage:
[shell]>> python3 gen_zincid_smile_csv.py in.csv out.csv [zinc12/zinc15]

Example:
[shell]>> python3 gen_zincid_smile_csv.py ../examples/zinc_ids.csv ../examples/zid_smiles.csv zinc15

Screen Output:

Downloading SMILES
0%                          100%
[##########                    ] | ETA[sec]: 106.525 

Input example file format:

zinc_ids.csv

Output example file format:

zid_smiles.csv
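
The file transformation described in this section can be sketched as follows. This is not the script itself: write_zincid_smiles_csv and the lookup callable are hypothetical names, and the online query is replaced by a small dictionary (FAKE_ZINC) filled with SMILES strings from the examples in this README, so the sketch runs offline.

```python
import csv
import io


def write_zincid_smiles_csv(id_lines, out_file, lookup):
    """Write one ZINC_ID,SMILE_STR row per ZINC ID read from id_lines.

    lookup is any callable that maps a ZINC ID to a SMILES string. It is a
    placeholder for the online query, which this sketch does not perform.
    """
    writer = csv.writer(out_file, lineterminator="\n")
    for row in csv.reader(id_lines):
        if not row or not row[0].strip():
            continue  # skip blank lines
        zinc_id = row[0].strip()
        writer.writerow([zinc_id, lookup(zinc_id)])


# Stand-in for the ZINC database (values copied from this README).
FAKE_ZINC = {
    "ZINC01234567": "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
    "ZINC01234568": "C[C@@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
}

ids = io.StringIO("ZINC01234567\nZINC01234568\n")
out = io.StringIO()
write_zincid_smiles_csv(ids, out, lambda zid: FAKE_ZINC.get(zid, ""))
print(out.getvalue())
```

Passing the lookup as a callable keeps the file handling separate from the network access, so the same function could be pointed at either backend.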

comp_smile_strings.py (checking for duplicates within 1 file)

Compares SMILES strings within a 2-column CSV file (ZINC_ID,SMILE_string) to identify duplicates. Generates a new CSV file with the number of duplicates in a 3rd column and the ZINC IDs of the identified duplicates in the 4th to nth columns.

Usage:
[shell]>> python3 comp_smile_strings.py in.csv out.csv [simplify]

Example 1:
[shell]>> python3 comp_smile_strings.py ../examples/zid_smiles.csv ../examples/comp_smiles.csv

Input example file format:

zid_smiles.csv

Output example file format 1:

comp_smiles.csv

Where

  • 1st column: ZINC ID
  • 2nd column: SMILES string
  • 3rd column: number of duplicates
  • 4th-nth column: ZINC IDs of duplicates

Example 2:
[shell]>> python3 comp_smile_strings.py ../examples/zid_smiles.csv ../examples/comp_simple_smiles.csv simplify

Output example file format 2:
comp_simple_smiles.csv
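
A minimal sketch of the duplicate check within one file, producing rows in the column layout listed above (ZINC ID, SMILES string, number of duplicates, ZINC IDs of duplicates). The function name find_duplicates is hypothetical, and the normalizer used in the second call (removing only the "@" stereo markers) is a toy stand-in for the simplify option, not what the script does.

```python
from collections import defaultdict


def find_duplicates(rows, simplify=None):
    """Return [zinc_id, smiles, n_duplicates, dup_id, ...] for each input row.

    rows is an iterable of (zinc_id, smiles) pairs. If simplify is given,
    SMILES strings are compared after passing through that callable.
    """
    rows = list(rows)
    key = simplify if simplify is not None else (lambda s: s)
    groups = defaultdict(list)
    for zinc_id, smiles in rows:
        groups[key(smiles)].append(zinc_id)
    result = []
    for zinc_id, smiles in rows:
        others = [z for z in groups[key(smiles)] if z != zinc_id]
        result.append([zinc_id, smiles, len(others)] + others)
    return result


rows = [
    ("ZINC01234567", "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O"),
    ("ZINC01234568", "C[C@@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O"),
]
exact = find_duplicates(rows)  # stereoisomers differ as full strings
loose = find_duplicates(rows, simplify=lambda s: s.replace("@", ""))
print(exact)
print(loose)
```

With exact comparison the two entries report 0 duplicates each; with the toy normalizer each one lists the other as its single duplicate.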

comp_2_smile_files.py (checking for duplicates across 2 files)

Compares SMILES strings between 2 input CSV files, where each file consists of rows with 2 columns (ZINC_ID,SMILE_string), to identify duplicate SMILES strings across both files.
Generates a new CSV file that lists the name of the origin file, the ZINC ID, and the SMILES string, followed by the ZINC IDs of identified duplicates in the 4th to nth columns.

Usage:
[shell]>> python3 comp_2_smile_files.py in1.csv in2.csv out.csv [simplify]

Example:
[shell]>> python3 comp_2_smile_files.py ../examples/zid_smiles2.csv ../examples/zid_smiles3.csv ../examples/comp_2_files.csv

Input example file 1:

zid_smiles2.csv

Input example file 2:

zid_smiles3.csv

Output example file format:

comp_2_files.csv

Where:

  • 1st column: name of the origin file
  • 2nd column: ZINC ID
  • 3rd column: SMILES string
  • 4th-nth column: ZINC IDs of duplicates
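
The cross-file comparison can be sketched along the same lines. The function name compare_two_files, the toy IDs (A1, B1, ...) and the toy SMILES strings are made up for illustration. The sketch also assumes that only entries with a match in the other file are reported; the actual script may handle unmatched rows differently.

```python
def compare_two_files(name1, rows1, name2, rows2):
    """Return [origin_file, zinc_id, smiles, dup_id, ...] rows.

    rows1 and rows2 are iterables of (zinc_id, smiles) pairs. A row is
    reported only if its SMILES string also occurs in the other file
    (an assumption of this sketch).
    """
    def index(rows):
        by_smiles = {}
        for zinc_id, smiles in rows:
            by_smiles.setdefault(smiles, []).append(zinc_id)
        return by_smiles

    rows1, rows2 = list(rows1), list(rows2)
    idx1, idx2 = index(rows1), index(rows2)
    out = []
    for name, rows, other in ((name1, rows1, idx2), (name2, rows2, idx1)):
        for zinc_id, smiles in rows:
            matches = other.get(smiles, [])
            if matches:
                out.append([name, zinc_id, smiles] + matches)
    return out


# Toy data, not real ZINC entries.
file1 = [("A1", "CCO"), ("A2", "CCN")]
file2 = [("B1", "CCO"), ("B2", "c1ccccc1")]
result = compare_two_files("in1.csv", file1, "in2.csv", file2)
print(result)
```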

SQLite file command line scripts

If you downloaded the smilite package from https://pypi.python.org/pypi/smilite or https://github.com/rasbt/smilite, you can use the command line scripts I provide in the scripts/sqlite_scripts dir.

lookup_single_id.py

Retrieves the SMILES string and simplified SMILES string for a given ZINC ID
from a previously built smilite SQLite database or from the online ZINC database.

Usage:
[shell]>> python3 lookup_single_id.py ZINC_ID [sqlite_file]

Example1 (retrieve data from a smilite SQLite database):
[shell]>> python3 lookup_single_id.py ZINC01234567 ~/Desktop/smilite_db.sqlite

Example2 (retrieve data from the ZINC online database):
[shell]>> python3 lookup_single_id.py ZINC01234567

Output example:

ZINC01234567
C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O
CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O

Where

  • 1st row: ZINC ID
  • 2nd row: SMILES string
  • 3rd row: simplified SMILES string

lookup_smile.py

Retrieves the ZINC ID(s) for a given SMILES string or simplified SMILES string from a previously built smilite SQLite database.

Usage:
[shell]>> python3 lookup_smile.py sqlite_file SMILE_STRING [simplify]

Example1 (search for SMILES string):
[shell]>> python3 lookup_smile.py ~/Desktop/smilite.sqlite "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O"

Example2 (search for simplified SMILES string):
[shell]>> python3 lookup_smile.py ~/Desktop/smilite.sqlite "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O" simple

Output example:

ZINC01234567
C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O
CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O

Where

  • 1st row: ZINC ID
  • 2nd row: SMILES string
  • 3rd row: simplified SMILES string
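
A lookup like this one can be sketched with Python's built-in sqlite3 module. The table name smilite and the column names zinc_id, smile and simple_smile are assumptions (the column names loosely follow the CSV header shown for sqlite_to_csv.py); the real database schema may differ. The sketch uses an in-memory database filled with the two example entries from this README.

```python
import sqlite3


def lookup_by_smiles(conn, smiles, simplified=False):
    """Return (zinc_id, smile, simple_smile) rows matching a SMILES string."""
    # The column name comes from a fixed pair, never from user input.
    column = "simple_smile" if simplified else "smile"
    query = (
        "SELECT zinc_id, smile, simple_smile FROM smilite "
        f"WHERE {column} = ? ORDER BY zinc_id"
    )
    return conn.execute(query, (smiles,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE smilite (zinc_id TEXT PRIMARY KEY, smile TEXT, simple_smile TEXT)"
)
conn.executemany(
    "INSERT INTO smilite VALUES (?, ?, ?)",
    [
        ("ZINC01234567", "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
         "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O"),
        ("ZINC01234568", "C[C@@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
         "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O"),
    ],
)

exact_hits = lookup_by_smiles(conn, "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O")
simple_hits = lookup_by_smiles(
    conn, "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O", simplified=True
)
print([row[0] for row in exact_hits])
print([row[0] for row in simple_hits])
```

Searching the simplified column returns both stereoisomers, whereas the exact SMILES string matches only one entry.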

add_to_sqlite.py

Reads ZINC IDs from a CSV file and looks up SMILES strings and simplified SMILES strings from the ZINC online database. Writes those SMILES strings to a smilite SQLite database. A new database will be created if it doesn't exist yet.

Usage:
[shell]>> python3 add_to_sqlite.py sqlite_file csv_file

Example:
[shell]>> python3 add_to_sqlite.py ~/Desktop/smilite.sqlite ~/Desktop/zinc_ids.csv

Input CSV file example format:

ZINC01234567
ZINC01234568
...

An example of the smilite SQLite database contents after successful insertion is shown in the image below. https://raw.github.com/rasbt/smilite/master/images/add_to_sqlite_1.png
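
The create-if-missing and insert steps can be sketched with the sqlite3 standard library module. Everything named here is an assumption for illustration: the function add_ids_to_db, the table and column names, and the choice to overwrite an existing entry with INSERT OR REPLACE. The lookup callable stands in for the online query and returns a (SMILES, simplified SMILES) pair.

```python
import sqlite3


def add_ids_to_db(conn, zinc_ids, lookup):
    """Insert one row per ZINC ID, creating the table on first use."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS smilite "
        "(zinc_id TEXT PRIMARY KEY, smile TEXT, simple_smile TEXT)"
    )
    for zinc_id in zinc_ids:
        smiles, simple = lookup(zinc_id)
        # Assumed behavior: an ID that is already present is overwritten.
        conn.execute(
            "INSERT OR REPLACE INTO smilite VALUES (?, ?, ?)",
            (zinc_id, smiles, simple),
        )
    conn.commit()


# Offline stand-in for the ZINC query (values copied from this README).
FAKE_ZINC = {
    "ZINC01234567": ("C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
                     "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O"),
    "ZINC01234568": ("C[C@@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
                     "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O"),
}

conn = sqlite3.connect(":memory:")  # a file path would persist the data
add_ids_to_db(conn, ["ZINC01234567", "ZINC01234568"], FAKE_ZINC.__getitem__)
add_ids_to_db(conn, ["ZINC01234567"], FAKE_ZINC.__getitem__)  # no duplicate row
n_rows = conn.execute("SELECT COUNT(*) FROM smilite").fetchone()[0]
print(n_rows)
```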

sqlite_to_csv.py

Writes contents of an SQLite smilite database to a CSV file.

Usage:
[shell]>> python3 sqlite_to_csv.py sqlite_file csv_file

Example:
[shell]>> python3 sqlite_to_csv.py ~/Desktop/smilite.sqlite ~/Desktop/zinc_smiles.csv

Output CSV file example format:

ZINC_ID,SMILE,SIMPLE_SMILE
ZINC01234568,C[C@@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O,CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O
ZINC01234567,C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O,CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O

An example of the CSV file contents opened in a spreadsheet program is shown in the image below. https://raw.github.com/rasbt/smilite/master/images/sqlite_to_csv_2.png
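
The export step can be sketched with the csv and sqlite3 standard library modules. The header row matches the CSV example above; the function name export_to_csv, the table name and the column names are assumptions, and the ORDER BY clause is added only to make the sketch's output deterministic.

```python
import csv
import io
import sqlite3


def export_to_csv(conn, out_file):
    """Write all rows of the (assumed) smilite table to out_file as CSV."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["ZINC_ID", "SMILE", "SIMPLE_SMILE"])
    writer.writerows(
        conn.execute(
            "SELECT zinc_id, smile, simple_smile FROM smilite ORDER BY zinc_id"
        )
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE smilite (zinc_id TEXT PRIMARY KEY, smile TEXT, simple_smile TEXT)"
)
conn.execute(
    "INSERT INTO smilite VALUES (?, ?, ?)",
    ("ZINC01234567", "C[C@H]1CCCC[NH+]1CC#CC(c2ccccc2)(c3ccccc3)O",
     "CC1CCCCN1CCCC(C2CCCCC2)(C3CCCCC3)O"),
)

buffer = io.StringIO()  # an opened file object works the same way
export_to_csv(conn, buffer)
print(buffer.getvalue())
```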

Changelog

VERSION 2.3.1 (07/25/2020)

  • Fix bug to allow zinc15 option in gen_zincid_smile_csv.py script

VERSION 2.3.0 (06/10/2020)

  • Fixes ZINC URL in lookup_smile_str.py
  • Adds an optional command line parameter (with arguments zinc15 or zinc12) for lookup_smile_str.py

VERSION 2.2.0

  • Provides an optional command line argument (zinc15) to use ZINC15 as a backend for downloading SMILES

VERSION 2.1.0

  • Functions and scripts to fetch ZINC IDs corresponding to a SMILES string query

VERSION 2.0.1

  • Progress bar for add_to_sqlite.py

VERSION 2.0.0

  • added SQLite features

VERSION 1.3.0

  • added script and module function to compare SMILES strings across 2 files.

VERSION 1.2.0

  • added Python 2.x support

VERSION 1.1.1

  • PyPrind dependency fix

VERSION 1.1.0

  • added a progress bar (PyPrind) to generate_zincid_smile_csv() function