2. Dist-Keras: Distributed deep learning, with a focus on distributed training, using Keras and Apache Spark.
3. spark-dashboard: Tooling to deploy an Apache Spark performance dashboard. Run it as a standalone Docker container or install the Helm chart on Kubernetes.
4. hdfs-metadata: Tool for gathering block and replica metadata from HDFS. It also builds a heat map showing how replicas are distributed across disks and nodes.
5. SparkPlugins: Code and examples showing how to write and deploy Apache Spark plugins with Spark 3.x. Spark plugins allow running custom code on the executors as they are initialized, which also makes it possible to extend the Spark metrics system with user-provided monitoring probes.
6. Hadoop-Profiler: Hadoop Profiler, or hprofiler, is a tool for analyzing on- and off-CPU workloads in distributed computing environments.