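ExposureExposure is a library for exposure-tracking requirements. It makes it easy to instrument exposure events and needs only minimal changes to existing code. It supports RecyclerView (RV) linear, grid, and staggered (waterfall) layouts, horizontally scrolling RVs, ScrollView, and other scrolling layouts. The effective exposure area of an item is configurable.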
UnROOT.jlNative Julia I/O package to work with CERN ROOT files
SparkTwitterAnalysisAn Apache Spark standalone application written in Scala against the Spark API. The project is built with Simple Build Tool (SBT).
cdsData syncing in golang for ClickHouse.
awesome-bigdataA curated list of awesome big data frameworks, resources and other awesomeness.
meetups-archivosSlides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …
flink-learnLearning Flink: Flink CEP, Flink Core, Flink SQL
hadoopofficeHadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
columnifyConvert record-oriented data to a columnar format.
datacatalog-tag-managerPython package to manage Google Cloud Data Catalog tags, loading metadata from external sources -- currently supports the CSV file format
NotesLearning notes | Java basics, JVM, source code, big data, interview experiences
gan deeplearning4jAutomatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.
anovosAnovos - An open-source library for scalable feature engineering using Apache Spark
dockerfilesMultiple Docker container images for the main big data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...).
StreamBenchMeasuring the performance of popular streaming engines with Yahoo's Streaming Benchmark
the-apache-ignite-bookAll code samples, scripts and more in-depth examples for The Apache Ignite Book. Covers Apache Ignite 2.6 and above.
jhdfA pure Java HDF5 library
amasAmas is a recursive acronym for “Amas, monitor alert system”.
greycatGreyCat - Data Analytics, Temporal data, What-if, Live machine learning
TiBigDataTiDB connectors for Flink/Hive/Presto
Clustering4EverC4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
bigquery-data-lineageReference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
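intersectThoughts on an interview question: computing the intersection of 60 million data packets and 3 million data packets within a 50 MB memory limit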
hayabusaHayabusa: Simple and Fast Full-Text Search Engine for Massive System Log Data
optimus🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
workflUXAn open-source, cloud-ready web application for simplified deployment of big data workflows.