mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (+150%)
FunFolDesData
Rosetta FunFolDes – a general framework for the computational design of functional proteins.
Stars: ✭ 15 (-25%)
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+455%)
hPDB
PDB parser in Haskell
Stars: ✭ 20 (+0%)
mmtf
The specification of the MMTF format for biological structures
Stars: ✭ 40 (+100%)
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+95%)
contact map
Contact map analysis for biomolecules; based on MDTraj
Stars: ✭ 27 (+35%)
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+14395%)
plmc
Inference of couplings in proteins and RNAs from sequence variation
Stars: ✭ 85 (+325%)
Morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (+1415%)
gan deeplearning4j
Automatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-5%)
deepblast
Neural Networks for Protein Sequence Alignment
Stars: ✭ 29 (+45%)
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+475%)
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (+15%)
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (+540%)
lightdock
Protein-protein, protein-peptide and protein-DNA docking framework based on the GSO algorithm
Stars: ✭ 110 (+450%)
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+16675%)
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+975%)
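leaflet heatmap
A simple visualization of call data from Huzhou. Assuming the data set is too large to render a heat map directly in the browser, the heat-map rendering step is moved to offline computation. The data is first processed in parallel with Apache Spark, the heat map is then rendered with Apache Spark as well, and finally leafletjs loads the OpenStreetMap layer and the heat-map layer to give good interactivity. The rendering currently runs on Apache Spark, but the parallel computation is slower than a single-machine run, either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well. The Apache Spark heat-map rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .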
Stars: ✭ 13 (-35%)
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+620%)
Parquetviewer
Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+625%)
parapred
Paratope Prediction using Deep Learning
Stars: ✭ 49 (+145%)
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+785%)
tape-neurips2019
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (DEPRECATED)
Stars: ✭ 117 (+485%)
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy-to-use experience to help with creation, editing and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+1135%)
spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (+235%)
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, using the UCX communication layer
Stars: ✭ 32 (+60%)
awesome-tools
Curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+55%)
Mist
Serverless proxy for Spark cluster
Stars: ✭ 309 (+1445%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+650%)
pytorch-rgn
Recurrent Geometric Network in PyTorch
Stars: ✭ 28 (+40%)
gis4wrf
QGIS toolkit 🧰 for pre- and post-processing 🔨, visualizing 🔍, and running simulations 💻 in the Weather Research and Forecasting (WRF) model 🌀
Stars: ✭ 137 (+585%)
Uni-Fold
An open-source platform for developing protein models beyond AlphaFold.
Stars: ✭ 227 (+1035%)
seamless
Seamless is a framework to set up reproducible computations (and visualizations) that respond to changes in cells. Cells contain the input data as well as the source code of the computations, and all cells can be edited interactively.
Stars: ✭ 19 (-5%)
fink-broker
Astronomy Broker based on Apache Spark
Stars: ✭ 18 (-10%)
incubator-tez
Mirror of Apache Tez (Incubating)
Stars: ✭ 60 (+200%)
spinmob
Rapid and flexible acquisition, analysis, fitting, and plotting in Python. Designed for scientific laboratories.
Stars: ✭ 34 (+70%)
Clickhouse
ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+105345%)
PyCannyEdge
Educational Python implementation of the Canny Edge Detector
Stars: ✭ 31 (+55%)
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+15120%)
Vue Virtual Scroll List
⚡️A Vue component that supports large data lists with high, efficient rendering performance.
Stars: ✭ 3,201 (+15905%)
paccmann kinase binding residues
Comparison of active site and full kinase sequences for drug-target affinity prediction and molecular generation. Full paper: https://pubs.acs.org/doi/10.1021/acs.jcim.1c00889
Stars: ✭ 29 (+45%)
sequencework
Programs and scripts, mainly Python, for analyses related to nucleic acid or protein sequences
Stars: ✭ 22 (+10%)
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (+40%)
Cboard
An easy-to-use, self-service open BI reporting and BI dashboard platform.
Stars: ✭ 2,795 (+13875%)
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+1130%)
Aws Etl Orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+1125%)
bagri
XML/Document DB on top of distributed cache
Stars: ✭ 40 (+100%)
Trafodion
Apache Trafodion
Stars: ✭ 242 (+1110%)
Kafka Ui
Open-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (+1050%)
Selinon
Advanced distributed task flow management on top of Celery
Stars: ✭ 237 (+1085%)
Eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+1075%)