MaxValue / Terpene Profile Parser For Cannabis Strains

Parser and database to index the terpene profile of different strains of Cannabis from online databases

Programming Languages

python

Projects that are alternatives of or similar to Terpene Profile Parser For Cannabis Strains

Crawlab
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Stars: ✭ 8,392 (+13220.63%)
Mutual labels:  crawler, scrapy, web-crawler
Crawlab Lite
Lite version of Crawlab, a lightweight crawler management platform.
Stars: ✭ 122 (+93.65%)
Mutual labels:  crawler, scrapy, web-crawler
Taxadb
🐣 locally query the ncbi taxonomy
Stars: ✭ 26 (-58.73%)
Mutual labels:  bioinformatics, database
Scanpy
Single-Cell Analysis in Python. Scales to >1M cells.
Stars: ✭ 858 (+1261.9%)
Mutual labels:  data-science, bioinformatics
Workshop
Weekly research group seminar
Stars: ✭ 28 (-55.56%)
Mutual labels:  data-science, bioinformatics
Py3 scripts
Life is short, *****.
Stars: ✭ 5 (-92.06%)
Mutual labels:  crawler, scrapy
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-58.73%)
Mutual labels:  data-science, bioinformatics
Ethereumdb
Stars: ✭ 21 (-66.67%)
Mutual labels:  data-science, database
Icrawler
A multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (+898.41%)
Mutual labels:  crawler, scrapy
Maman
Rust Web Crawler saving pages on Redis
Stars: ✭ 39 (-38.1%)
Mutual labels:  crawler, web-crawler
Avbook
AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV film library; AV magnet link database; Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Stars: ✭ 8,133 (+12809.52%)
Mutual labels:  crawler, database
Multiqc
Aggregate results from bioinformatics analyses across many samples into a single report.
Stars: ✭ 708 (+1023.81%)
Mutual labels:  analysis, bioinformatics
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+941.27%)
Mutual labels:  crawler, web-crawler
Pretzel
Javascript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-58.73%)
Mutual labels:  data-science, bioinformatics
Scrapyrt
HTTP API for Scrapy spiders
Stars: ✭ 637 (+911.11%)
Mutual labels:  crawler, scrapy
Scrapy Azuresearch Crawler Samples
Scrapy as a Web Crawler for Azure Search Samples
Stars: ✭ 20 (-68.25%)
Mutual labels:  crawler, scrapy
Mri Analysis Pytorch
MRI analysis using PyTorch and MedicalTorch
Stars: ✭ 55 (-12.7%)
Mutual labels:  health, data-science
Easy Scraping Tutorial
Simple but useful Python web scraping tutorial code.
Stars: ✭ 583 (+825.4%)
Mutual labels:  crawler, scrapy
Getting Started With Genomics Tools And Resources
Unix, R and python tools for genomics and data science
Stars: ✭ 587 (+831.75%)
Mutual labels:  data-science, bioinformatics
Universityrecruitment Ssurvey
Using rigorous data to answer the question: what kinds of companies recruit at what kinds of universities?
Stars: ✭ 30 (-52.38%)
Mutual labels:  analysis, crawler

Terpene Profile Parser for Cannabis Strains

Parser and database to index the terpene profiles of different cannabis strains from online databases

Description

This repository contains:

  • A folder for each online database that publishes lab test results on the terpene profiles of cannabis strains (found in labs/). Each folder usually contains:
    • A web crawler that downloads the lab test results of different cannabis strains from the database
    • A parser that extracts the actual terpene profile from each of those HTML pages as a CSV list
    • The CSV list of extracted terpene profiles
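
The parser step can be sketched roughly as follows. This is only a minimal illustration, not the code used in labs/: it assumes a hypothetical page layout in which each result is a table row with the compound name in the first cell and a percentage in the second. The names TableCellCollector, parse_terpene_profile, profiles_to_csv and SAMPLE_HTML are made up for this example, and the sample values are invented. Real lab pages will differ, so see the per-folder README.md for the actual tools.

```python
import csv
import io
from html.parser import HTMLParser

# The nine terpenes the project tracks (names taken from the FAQ).
TERPENES = [
    "Linalool", "Caryophyllene oxide", "Myrcene", "beta-Pinene", "Limonene",
    "Terpinolene", "alpha-Pinene", "Humulene", "Caryophyllene",
]


class TableCellCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_terpene_profile(html):
    """Return {terpene: percent} for the tracked terpenes found in the page.

    Assumes the hypothetical layout <tr><td>name</td><td>value</td></tr>.
    """
    collector = TableCellCollector()
    collector.feed(html)
    known = {name.lower(): name for name in TERPENES}
    profile = {}
    for row in collector.rows:
        if len(row) < 2:
            continue
        name = known.get(row[0].lower())
        if name is None:
            continue  # not one of the tracked terpenes
        try:
            profile[name] = float(row[1].rstrip("%").strip())
        except ValueError:
            continue  # skip non-numeric cells such as "ND"
    return profile


def profiles_to_csv(profiles):
    """Serialise {strain: profile} to CSV text; missing values stay empty."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["strain"] + TERPENES)
    for strain, profile in profiles.items():
        writer.writerow([strain] + [profile.get(t, "") for t in TERPENES])
    return buffer.getvalue()


# Invented sample page, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Compound</th><th>%</th></tr>
  <tr><td>Linalool</td><td>0.12%</td></tr>
  <tr><td>Myrcene</td><td>0.40</td></tr>
  <tr><td>THC</td><td>18.0</td></tr>
  <tr><td>Limonene</td><td>ND</td></tr>
</table>
"""

profile = parse_terpene_profile(SAMPLE_HTML)
csv_text = profiles_to_csv({"Example Strain": profile})
```

Leaving untested or unreadable values empty in the CSV, rather than writing 0, keeps "not measured" distinguishable from "measured as zero" in later analysis.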

FAQ

What are Terpenes? What is a Terpene?

A terpene is a chemical compound that can have physiological effects on the human body. It can make you sleepy, alert, more focused, relaxed, or less anxious. Read more on Wikipedia. This page and this lab page also have some useful information.

What is a Terpene Profile?

A terpene profile is a list of the terpenes present in a biological sample. This project only tracks specific terpenes: Linalool, Caryophyllene oxide, Myrcene, beta-Pinene, Limonene, Terpinolene, alpha-Pinene, Humulene, and Caryophyllene. Of these, Linalool, beta-Pinene, Limonene, and alpha-Pinene are the most important. We are also only interested in the terpene profiles of strains of the species Cannabis sativa.

What is a cannabis strain? What is a strain?

A strain is like a dog breed. All dogs belong to the same species but can look very different from one another, so we distinguish them by breed. The same is true for Cannabis: there are several "breeds", and each one has different effects on human physiology and the psyche. Like dog breeds, strains were shaped by human selection, not by nature.

So what is this all about? What is the point of all this?

Research suggests that Cannabis sativa can, under the right circumstances, have positive effects on the human body, such as reducing or curing depression and anxiety, improving concentration, and helping with sleep problems. The catch is that each strain acts differently on the body, and we do not know which strain acts in what way, because the plant is currently illegal in most countries. As a result, a lot of incorrect information circulates about how each strain acts and even about which plant actually belongs to which strain. This causes several problems: many samples are labelled incorrectly, many samples were not grown under controlled lab conditions, which makes results vary widely, and the devices for testing samples are very expensive.

The good news is that in some countries it is legal, or at least legal enough, to conduct scientific research on cannabis. Some of these research institutions (or labs) publish the chemical analysis results of their samples online. This data is not published in a machine-readable form suitable for further analysis, but it can be extracted with web crawling. By building statistical models, we can filter out the incorrect data caused by the differing growing conditions of the samples. In the future, a kind of search engine is planned that lets you search by terpene profile and returns a sorted list of matching strains.
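
To make the planned search idea more concrete, here is a rough sketch. It is an assumption about how such a search could work, not the project's actual model: a per-terpene median stands in for the "statistical models" (it simply dampens the influence of outlier or mislabelled samples), and strains are ranked by Euclidean distance to the requested profile. The function names and all sample values are invented for illustration, and a terpene missing from a sample is treated as 0.0, which is a simplification.

```python
import math
from statistics import median

# The four terpenes the FAQ calls the most important ones.
TERPENES = ["Linalool", "beta-Pinene", "Limonene", "alpha-Pinene"]


def median_profile(samples):
    """Combine several lab samples of one strain into a per-terpene median."""
    return {t: median(s.get(t, 0.0) for s in samples) for t in TERPENES}


def distance(a, b):
    """Euclidean distance between two profiles over TERPENES."""
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in TERPENES))


def rank_strains(target, samples_by_strain):
    """Return strain names sorted from best to worst match for `target`."""
    scored = [
        (distance(target, median_profile(samples)), strain)
        for strain, samples in samples_by_strain.items()
    ]
    return [strain for _, strain in sorted(scored)]


# Invented values (percent by weight), for illustration only.
SAMPLES = {
    "Strain A": [
        {"Linalool": 0.10, "Limonene": 0.50},
        {"Linalool": 0.12, "Limonene": 0.48},
        {"Linalool": 0.90, "Limonene": 0.05},  # outlier, e.g. a mislabelled sample
    ],
    "Strain B": [
        {"Linalool": 0.40, "Limonene": 0.10},
        {"Linalool": 0.42, "Limonene": 0.12},
    ],
}

ranking = rank_strains({"Linalool": 0.10, "Limonene": 0.50}, SAMPLES)
```

With the median, the single outlier sample of "Strain A" does not pull its aggregated profile away from the other two samples, so "Strain A" still ranks first for a target close to them. A plain mean would be more sensitive to that outlier.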

How to use

See the README.md file in the respective folder for instructions.

How to contribute

Let me know if:

  • I missed some data
  • There is an online database I haven't noticed
  • The scientific information to explain the project is wrong

For all of these points, please provide sources or evidence that your information (query, link, scientific sources) is more valid than what is currently here.

Project history

The idea for this project comes from Paul Fuxjäger, who wants to find high-quality medical cannabis for new health treatment options. The code for extracting and cleaning the data was written by Max Fuxjäger.

Copyright

Have fun. We hope you can use this data to do good for humanity.
