fuyb1992 / es_pandas

Licence: MIT license
Read, write and update large scale pandas DataFrame with Elasticsearch

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to es pandas

Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+9170.59%)
Mutual labels:  pandas
PLSC
Paddle Large Scale Classification Tools, supports ArcFace, CosFace, PartialFC, Data Parallel + Model Parallel. Model includes ResNet, ViT, DeiT, FaceViT.
Stars: ✭ 113 (+232.35%)
Mutual labels:  large-scale
anesthetic
Nested Sampling post-processing and plotting
Stars: ✭ 34 (+0%)
Mutual labels:  pandas
Missingno
Missing data visualization module for Python.
Stars: ✭ 3,019 (+8779.41%)
Mutual labels:  pandas
libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Stars: ✭ 284 (+735.29%)
Mutual labels:  large-scale
django-model-values
Taking the O out of ORM.
Stars: ✭ 57 (+67.65%)
Mutual labels:  pandas
Pandas Gbq
Pandas Google BigQuery
Stars: ✭ 243 (+614.71%)
Mutual labels:  pandas
read-protobuf
Small library to read serialized protobuf(s) directly into Pandas Dataframe
Stars: ✭ 28 (-17.65%)
Mutual labels:  pandas
vue-large-scale-folder-structure
Vue Js, 2 vue-cli large scale folder structure with vuex, vue-router, axios
Stars: ✭ 29 (-14.71%)
Mutual labels:  large-scale
fer
Facial Expression Recognition
Stars: ✭ 32 (-5.88%)
Mutual labels:  pandas
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+8852.94%)
Mutual labels:  pandas
Artificial Intelligence Deep Learning Machine Learning Tutorials
A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.
Stars: ✭ 2,966 (+8623.53%)
Mutual labels:  pandas
Chatistics
A WhatsApp Chat analyzer and statistics.
Stars: ✭ 32 (-5.88%)
Mutual labels:  pandas
Datacamp Python Data Science Track
All the slides, accompanying code and exercises all stored in this repo. 🎈
Stars: ✭ 250 (+635.29%)
Mutual labels:  pandas
BigCLAM-ApacheSpark
Overlapping community detection in Large-Scale Networks using BigCLAM model build on Apache Spark
Stars: ✭ 40 (+17.65%)
Mutual labels:  large-scale
Jupyter Tips And Tricks
Using Project Jupyter for data science.
Stars: ✭ 245 (+620.59%)
Mutual labels:  pandas
skutil
NOTE: skutil is now deprecated. See its sister project: https://github.com/tgsmith61591/skoot. Original description: A set of scikit-learn and h2o extension classes (as well as caret classes for python). See more here: https://tgsmith61591.github.io/skutil
Stars: ✭ 29 (-14.71%)
Mutual labels:  pandas
python-eodhistoricaldata
Download data from EOD historical data https://eodhistoricaldata.com/ using Python, Requests and Pandas.
Stars: ✭ 67 (+97.06%)
Mutual labels:  pandas
hh research
Automating the search for and analysis of job vacancies from hh.ru (HeadHunter) using Python. Data classification and computation of statistical parameters.
Stars: ✭ 36 (+5.88%)
Mutual labels:  pandas
Algorithmic-Trading
Algorithmic trading using machine learning.
Stars: ✭ 102 (+200%)
Mutual labels:  pandas

es_pandas

Read, write and update large-scale pandas DataFrames with Elasticsearch.

Requirements

This package should work on Python 3 (>= 3.4) with Elasticsearch 5.x, 6.x, or 7.x.
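
The requirements above can be checked before connecting. The following is a minimal sketch, not part of es_pandas: the helper names `es_major` and `is_supported` are hypothetical, and the version string is assumed to have the usual `major.minor.patch` form.

```python
import sys

# Versions named in the Requirements section.
SUPPORTED_ES_MAJORS = {5, 6, 7}
MIN_PYTHON = (3, 4)


def es_major(version):
    """Return the major component of a version string such as '7.10.2'."""
    return int(version.split(".")[0])


def is_supported(es_version, python_version=sys.version_info):
    """True if both Python and Elasticsearch meet the stated requirements.

    Hypothetical helper for illustration; es_pandas does not ship it.
    """
    python_ok = tuple(python_version[:2]) >= MIN_PYTHON
    return python_ok and es_major(es_version) in SUPPORTED_ES_MAJORS
```

With the official `elasticsearch` client, the server version string is typically available as `client.info()["version"]["number"]` and could be passed as `es_version`.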

Installation

The package is hosted on PyPI and can be installed with pip:

pip install es_pandas

Deprecation Notice

Support for Elasticsearch 5.x will be deprecated in a future version.

Usage

import time

import pandas as pd

from es_pandas import es_pandas


# Connection details for the ES cluster
es_host = 'localhost:9200'
index = 'demo'

# Create an es_pandas instance
ep = es_pandas(es_host)

# Example DataFrame
df = pd.DataFrame({'Num': list(range(100000))})
df['Alpha'] = 'Hello'
# pd.datetime was removed from recent pandas releases; pd.Timestamp.now() is the equivalent
df['Date'] = pd.Timestamp.now()

# Optionally initialize an ES template from the DataFrame
doc_type = 'demo'
ep.init_es_tmpl(df, doc_type)

# Write data to ES, using the template created above
ep.to_es(df, index, doc_type=doc_type, thread_count=2, chunk_size=10000)

# Set use_index=True to use the DataFrame index as each record's _id
ep.to_es(df, index, doc_type=doc_type, use_index=True, thread_count=2, chunk_size=10000)

# Delete records from ES
ep.to_es(df.iloc[5000:], index, doc_type=doc_type, _op_type='delete', thread_count=2, chunk_size=10000)

# Update documents by _id
df.iloc[:1000, 1] = 'Bye'
df.iloc[:1000, 2] = pd.Timestamp.now()
ep.to_es(df.iloc[:1000, 1:], index, doc_type=doc_type, _op_type='update')

# Read data from ES
df = ep.to_pandas(index)
print(df.head())

# Return only certain fields from ES
heads = ['Num', 'Date']
df = ep.to_pandas(index, heads=heads)
print(df.head())

# Set the dtype of certain columns
dtype = {'Num': 'float', 'Alpha': object}
df = ep.to_pandas(index, dtype=dtype)
print(df.dtypes)

# Infer dtypes from the ES template
df = ep.to_pandas(index, infer_dtype=True)
print(df.dtypes)

# Use the query_sql parameter if you want to query with SQL

# Write data to ES using pandas.io.json for serialization
ep.to_es(df, index, doc_type=doc_type, use_pandas_json=True, thread_count=2, chunk_size=10000)
print('Finished writing ES docs with pandas.io.json')
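
To make the `_op_type`, `use_index`, and `chunk_size` arguments above more concrete, here is a rough sketch of how rows might be turned into bulk-style action dicts and split into chunks. It is not es_pandas's actual implementation: the helper names `to_actions` and `chunked` are hypothetical, and the records are assumed to be plain dicts such as those returned by `df.to_dict(orient="records")`. The action layout follows the dict format accepted by the bulk helpers in the official `elasticsearch` client.

```python
from itertools import islice


def to_actions(records, index, ids=None, op_type="index"):
    """Yield one bulk-style action dict per record (hypothetical helper).

    records: iterable of dicts, e.g. df.to_dict(orient="records")
    ids:     optional iterable of document ids, e.g. the DataFrame index
    """
    records = list(records)
    ids = [None] * len(records) if ids is None else list(ids)
    for doc_id, record in zip(ids, records):
        action = {"_op_type": op_type, "_index": index}
        if doc_id is not None:
            action["_id"] = doc_id
        if op_type == "update":
            action["doc"] = record        # partial document for updates
        elif op_type != "delete":
            action["_source"] = record    # full document for index/create
        yield action


def chunked(iterable, chunk_size):
    """Yield successive lists of at most chunk_size items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


# Small illustration with five records and chunks of two.
records = [{"Num": i, "Alpha": "Hello"} for i in range(5)]
actions = list(to_actions(records, "demo", ids=range(5)))
chunk_sizes = [len(chunk) for chunk in chunked(actions, 2)]
```

Delete actions carry only the `_id`, which is why deleting or updating presupposes that documents were written with known ids.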