non-api-fb-scraper: Scrape public Facebook posts from any group or user into a .csv file without needing to register for any API access
Stars: ✭ 40 (-85.51%)
spark-netflow: NetFlow data source for Spark SQL and DataFrames
Stars: ✭ 16 (-94.2%)
RemoteShuffleService: Celeborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (-5.07%)
r4dswebsite: Public repository for the R4DS community website.
Stars: ✭ 19 (-93.12%)
IoT-system-PLC-data-to-InfluxDB: This project aims to provide free software that fetches data from PLCs (Siemens S7-300/400/1200/1500) and stores it. The stack is completely open source. I used InfluxDB as the data store, so the application follows the Big Data paradigm.
Stars: ✭ 26 (-90.58%)
interpretable-ml: Techniques & resources for training interpretable ML models, explaining ML models, and debugging ML models.
Stars: ✭ 17 (-93.84%)
Open-korean-corpora: Open Korean NLP Dataset Curation for the Users All Around the Globe
Stars: ✭ 82 (-70.29%)
climateR: An R 📦 for getting point and gridded climate data by AOI
Stars: ✭ 93 (-66.3%)
Naive-Resume-Matching: Text similarity applied to resumes, to compare resumes with job descriptions and create a score to rank them. Similar to an ATS.
Stars: ✭ 27 (-90.22%)
spark-root: Apache Spark Data Source for ROOT File Format
Stars: ✭ 28 (-89.86%)
dxram: A distributed in-memory key-value storage for billions of small objects.
Stars: ✭ 25 (-90.94%)
lightdash: An open source alternative to Looker built using dbt. Made for analysts ❤️
Stars: ✭ 1,082 (+292.03%)
PySPOD: A Python package for spectral proper orthogonal decomposition (SPOD).
Stars: ✭ 50 (-81.88%)
nebula: A distributed, fast open-source graph database featuring horizontal scalability and high availability
Stars: ✭ 8,196 (+2869.57%)
Product-Categorization-NLP: Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, DistilBERT).
Stars: ✭ 30 (-89.13%)
img2dataset: Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Stars: ✭ 1,173 (+325%)
ipython-notebooks: A collection of Jupyter notebooks exploring different datasets.
Stars: ✭ 43 (-84.42%)
data-mining: Resources for the Data Mining for Business and Governance course.
Stars: ✭ 52 (-81.16%)
GDLibrary: MATLAB library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (-81.88%)
Data-Analysis: Different types of data analytics projects: EDA, PDA, DDA, TSA, and much more.
Stars: ✭ 22 (-92.03%)
algorithms: Basic algorithms and solutions
Stars: ✭ 22 (-92.03%)
pyglotaran: A Python library for Global and Target Analysis of time-resolved spectroscopy data
Stars: ✭ 33 (-88.04%)
lcbo-api: A crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (-44.93%)
PythonTipsDS: Python Tips for Data Scientists
Stars: ✭ 23 (-91.67%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-85.87%)
growthbook: Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+748.55%)
Instagram-Comments-Scraper: Instagram comment scraper using Python and Selenium. Saves the comments to Excel.
Stars: ✭ 73 (-73.55%)
gan deeplearning4j: Automatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-93.12%)
FlameStream: Distributed stream processing model and its implementation
Stars: ✭ 14 (-94.93%)
vinum: Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics.
Stars: ✭ 57 (-79.35%)
lubeck: High level linear algebra library for Dlang
Stars: ✭ 57 (-79.35%)
tracing-vs-freehand: Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021)
Stars: ✭ 21 (-92.39%)
bigdata-fun: A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-94.93%)
ddal: DDAL (Distributed Data Access Layer) is a simple solution for accessing database shards.
Stars: ✭ 33 (-88.04%)
uetai: Custom ML experiment tracking and debugging tools.
Stars: ✭ 17 (-93.84%)
datart: Datart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+277.54%)
social-data: Code and data for eviction and housing analysis in the US
Stars: ✭ 17 (-93.84%)
stats: 📈 Useful notes and personal collections on statistics.
Stars: ✭ 16 (-94.2%)
bsu: 🎓 Repository for university labs on FAMCS, BSU
Stars: ✭ 91 (-67.03%)
simon-frontend: 💹 SIMON is a powerful, flexible, open-source, and easy-to-use machine learning knowledge discovery platform 💻
Stars: ✭ 114 (-58.7%)
Data-Analyst-Nanodegree: This repo contains the projects that I completed as part of Udacity's Data Analyst Nanodegree curriculum.
Stars: ✭ 13 (-95.29%)
PyDREAM: Python Implementation of Decay Replay Mining (DREAM)
Stars: ✭ 22 (-92.03%)
seed: seed, a self-service report display system
Stars: ✭ 63 (-77.17%)
ngm: swissgeol.ch gives you insight into geoscientific data, above and below the surface.
Stars: ✭ 23 (-91.67%)