
trishume / dayder

Licence: MIT license
Search lots of data sets for spurious correlations



Dayder

Dayder is a web app for finding spurious correlations in thousands of data sets. It was originally created by Tristan Hume and Marc Mailhot for TerribleHack III, where it only worked with 3000+ different causes of death. Since the hackathon we've kept improving it, expanding and optimizing it to work with over 390,000 data sets at a time.

Because it was originally created for a "Stupid shit no-one needs and terrible ideas" hackathon, we don't even pretend that Dayder is useful. It is mostly a tech demo for making a super fast web app despite dealing with large quantities of data. Dayder uses a custom binary format, custom JS Canvas and DOM rendering, heavily optimized server-side code in Rust, caching, and tuned data layouts for excellent performance. It can filter through 390,000+ data sets as you type, with sub-50ms server response times and instantaneous rendering of thousands of graphs in the browser. It can also find correlations among all 390,000+ data sets in less than 2 seconds.
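
To make "finding correlations" concrete, here is a minimal sketch of ranking data sets by correlation with a target series. It assumes the Pearson correlation coefficient and assumes the series are already aligned to the same time points; this README does not say which measure or alignment Dayder actually uses. The function names `pearson` and `rank_by_correlation` are hypothetical and not taken from the Dayder source.

```rust
/// Pearson correlation coefficient of two equally long series.
/// Returns None if the lengths differ, there are fewer than two points,
/// or either series has zero variance (the coefficient is undefined then).
/// Assumption: both series are already aligned to the same time points.
fn pearson(xs: &[f32], ys: &[f32]) -> Option<f32> {
    if xs.len() != ys.len() || xs.len() < 2 {
        return None;
    }
    let n = xs.len() as f64;
    let mean_x = xs.iter().map(|&x| x as f64).sum::<f64>() / n;
    let mean_y = ys.iter().map(|&y| y as f64).sum::<f64>() / n;

    let (mut cov, mut var_x, mut var_y) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in xs.iter().zip(ys) {
        let dx = x as f64 - mean_x;
        let dy = y as f64 - mean_y;
        cov += dx * dy;
        var_x += dx * dx;
        var_y += dy * dy;
    }
    if var_x == 0.0 || var_y == 0.0 {
        return None;
    }
    Some((cov / (var_x * var_y).sqrt()) as f32)
}

/// Scores every candidate against `target` and returns (name, r) pairs,
/// strongest absolute correlation first. Candidates whose correlation is
/// undefined are skipped.
fn rank_by_correlation<'a>(
    target: &[f32],
    candidates: &'a [(String, Vec<f32>)],
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = candidates
        .iter()
        .filter_map(|(name, values)| pearson(target, values).map(|r| (name.as_str(), r)))
        .collect();
    scored.sort_by(|a, b| b.1.abs().total_cmp(&a.1.abs()));
    scored
}
```

Sorting by absolute value puts strong negative correlations next to strong positive ones, which suits a tool whose point is surprising relationships in either direction.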


Origins

Dayder was inspired by the Spurious Correlations book and website. We wanted to make a modern, fast, reactive site for finding spurious correlations in lots of time series data sets. We started with a data set of various causes of death over time, because it was the funniest in a morbid sort of way.

After the hackathon we added Canadian GDP from various sectors and the full set of over 300,000 time series from FRED. This required many more improvements to speed and usability.

How we built it

We designed a custom binary time series data format (btsf!) that allows bandwidth- and memory-efficient processing of large amounts of time series data.
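
As an illustration of what a compact binary time series format can look like, here is a minimal sketch in Rust. The layout is hypothetical and is not the real btsf layout, which this README does not describe: a little-endian `u32` series count, then for each series a `u16` name length, the UTF-8 name, a `u32` point count, and the points as `(i32 time, f32 value)` pairs. The names `Series`, `encode`, and `decode` are likewise assumptions made for the example.

```rust
/// One named time series. Hypothetical in-memory shape for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Series {
    name: String,
    points: Vec<(i32, f32)>, // (time, value)
}

/// Serializes series into the hypothetical layout described above.
/// Assumes names are shorter than 65,536 bytes and counts fit in a u32.
fn encode(series: &[Series]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(series.len() as u32).to_le_bytes());
    for s in series {
        out.extend_from_slice(&(s.name.len() as u16).to_le_bytes());
        out.extend_from_slice(s.name.as_bytes());
        out.extend_from_slice(&(s.points.len() as u32).to_le_bytes());
        for &(t, v) in &s.points {
            out.extend_from_slice(&t.to_le_bytes());
            out.extend_from_slice(&v.to_le_bytes());
        }
    }
    out
}

/// Splits `n` bytes off the front of `buf`, or returns None if too short.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let data: &'a [u8] = *buf;
    if data.len() < n {
        return None;
    }
    let (head, tail) = data.split_at(n);
    *buf = tail;
    Some(head)
}

/// Reads exactly four bytes from the front of `buf`.
fn take4(buf: &mut &[u8]) -> Option<[u8; 4]> {
    let b = take(buf, 4)?;
    Some([b[0], b[1], b[2], b[3]])
}

/// Parses the hypothetical layout. Returns None on truncated input
/// or a name that is not valid UTF-8.
fn decode(mut buf: &[u8]) -> Option<Vec<Series>> {
    let count = u32::from_le_bytes(take4(&mut buf)?);
    let mut series = Vec::new();
    for _ in 0..count {
        let len_bytes = take(&mut buf, 2)?;
        let name_len = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
        let name = String::from_utf8(take(&mut buf, name_len)?.to_vec()).ok()?;
        let n_points = u32::from_le_bytes(take4(&mut buf)?);
        let mut points = Vec::new();
        for _ in 0..n_points {
            let t = i32::from_le_bytes(take4(&mut buf)?);
            let v = f32::from_le_bytes(take4(&mut buf)?);
            points.push((t, v));
        }
        series.push(Series { name, points });
    }
    Some(series)
}
```

In this layout each point costs 8 bytes and there is no per-field text overhead, which is the general reason a binary format saves bandwidth and memory compared with JSON or CSV.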

The format is processed using typed arrays in JS on the client and Rust on the server. All the DOM and canvas rendering is efficient custom code, so it can render thousands of graphs in milliseconds.

Later we optimized it further. It now does as-you-type filtering on the server side so it doesn't have to send 400MB of data to the client. To make as-you-type filtering fast enough, I optimized the filtering method using efficient data layout, lazy processing, caching, and incremental sorting to maintain sub-50ms response times on every request.
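
One way caching can help as-you-type filtering is sketched below. It is not Dayder's actual implementation and covers only the caching idea, not the data layout, lazy processing, or incremental sorting mentioned above. The sketch assumes substring matching on pre-lowercased names and relies on one observation: if a new query extends a previous one, every match for the new query is also a match for the old one, so only the old result set needs to be rescanned. `FilterCache` and its methods are hypothetical names.

```rust
use std::collections::HashMap;

/// Caches the matching indices per query string so that a longer query
/// can be answered by narrowing the results of a shorter cached prefix.
struct FilterCache<'a> {
    names: &'a [String], // assumed already lowercased
    results: HashMap<String, Vec<usize>>,
}

impl<'a> FilterCache<'a> {
    fn new(names: &'a [String]) -> Self {
        FilterCache { names, results: HashMap::new() }
    }

    /// Returns the indices of names containing `query` as a substring.
    fn filter(&mut self, query: &str) -> &[usize] {
        if !self.results.contains_key(query) {
            // Look for the longest proper prefix of `query` that is cached.
            let base: Option<&Vec<usize>> = (0..query.len())
                .rev()
                .filter(|&i| query.is_char_boundary(i))
                .find_map(|i| self.results.get(&query[..i]));

            let matches: Vec<usize> = match base {
                // Narrow the previous result set instead of scanning everything.
                Some(prev) => prev
                    .iter()
                    .copied()
                    .filter(|&i| self.names[i].contains(query))
                    .collect(),
                // Nothing reusable: fall back to a full scan.
                None => (0..self.names.len())
                    .filter(|&i| self.names[i].contains(query))
                    .collect(),
            };
            self.results.insert(query.to_string(), matches);
        }
        &self.results[query]
    }
}
```

With this sketch, typing "de" and then "deaths by d" scans the full list once and afterwards only the names that matched "de". A production version would also need to bound the cache size, which is omitted here.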

License

This project is released under the MIT license, see the LICENSE file for details.
