TheDataRideAlongs / ProjectDomino

Licence: Apache-2.0 License
Scaling COVID public behavior change and anti-misinformation

Programming Languages

python
shell

Project Domino

Scaling COVID public behavior change and anti-misinformation

One of the most important steps in stopping the COVID-19 pandemic is influencing mass behavior change so that citizens take appropriate, swift action to mitigate infection and reduce human-to-human contact. Government officials at all levels have advocated misinformed practices, such as dining out or participating in outdoor gatherings, that have contributed to amplifying the curve rather than flattening it. At the time of writing, poor crisis and emergency risk communication has led to over 32.9M US citizens testing positive, with 2-20X more likely going untested, and over 584K deaths. The need to influence appropriate behavior and mitigation actions is extreme: the US has shot up from untouched to become the 6th most infected nation.

Project Domino accelerates research on developing capabilities for information hygiene at the mass scale necessary for the current national disaster and for future ones. We develop and enable the use of 3 key data capabilities for modern social discourse:

  • Detecting misinformation campaigns
  • Identifying at-risk behavior groups and viable local behavior change influencers
  • Automating high-precision interventions

Data

We are collecting, analyzing, and sharing data around:

  • COVID Twitter: Who is saying what and when, including around URLs
  • Feeds: Correlations against labeled data such as bots, fact checks, and indicators of digital crime
  • Scores: Bots, misinformation, crime, likely location, and more

While we cannot publish the raw data due to compliance restrictions from our data providers, we are happy to support individual projects, such as analyzing and predicting real-world compliance with health policies, and identifying bad actors. Please jump into the Slack or contact a project leader on LinkedIn and we'll get you going.

The interventions

We are working with ethics groups to identify safe interventions along the following lines:

  • Targeting of specific underserved issues: Primary COVID public health issues such as unsafe social behavior, unsafe medicine, unsafe science, dangerous government policy influence, and adjacent issues such as fake charities, phishing, malware, and hate group propaganda

  • Help top social platforms harden themselves: Trust and safety teams at top social networks need to be able to warn users about misinformation, de-trend it, and potentially take it down before it has served its purpose. The status quo is handling incidents months after the fact. We will provide real-time alert feeds and scoring APIs to help take action during the critical minutes before misinformation gains significant reach.

  • Enable top analysts to investigate coordinated activity: A minority of groups cause the bulk of the misinformation that gets shared. We are building a high-scale analyst environment featuring technology such as GPU-accelerated visual graph analytics and high-memory notebook servers.

  • Help leaders clean up their community: Identify and invite community leaders of at-risk groups to use our tools to detect trending misinformation and sift it out from their regular community content.

  • Alert individuals as they are being manipulated: For manipulated conversations where we have clear intelligence, we are exploring an alert bot that will post the misinformation report directly on the thread, or enable community participants or project partners to do so.

  • Enable other platforms: We expect a growing number of initiatives to benefit from our intelligence and automation capabilities.

The technologies

  • Twitter firehose monitor: 100K+ topical tweets/day
  • Data integration pipeline for sources of known scams, fraud, lies, bots, propaganda, extremism, and other misinformation sources
  • Misinformation knowledge graph connecting accounts, posts, reports, and models (1B+ nodes/edges)
  • Automated GPU / graph / machine learning pipeline: general classification (bot, community, ...) and targeted (clinical disinformation, ...): Nvidia RAPIDS, BERT, ...
  • Automated alerting & reporting pipeline: Prefect, Streamlit
  • Interactive visual analytics environment for data scientists and analysts: GPU, graph, Jupyter, Streamlit, ML, ...
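
As a rough illustration of the knowledge-graph idea above (accounts, posts, and reports connected by edges), here is a minimal in-memory sketch. It is not the project's schema: the `Node` and `MisinfoGraph` classes and the `POSTED` and `LINKS_TO` relation names are assumptions made for this example, and a real deployment at the 1B+ node/edge scale would sit on a graph database such as Neo4j rather than Python dictionaries.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A graph node; `kind` is a hypothetical label such as "account", "post", or "url"."""
    kind: str
    key: str


class MisinfoGraph:
    """Toy directed graph: each source node maps to a set of (relation, destination) pairs."""

    def __init__(self):
        self.out_edges = defaultdict(set)

    def add_edge(self, src, relation, dst):
        self.out_edges[src].add((relation, dst))

    def accounts_sharing(self, url):
        """Return the sorted keys of accounts that posted something linking to `url`."""
        target = Node("url", url)
        # Posts (or any nodes) that link directly to the target URL.
        posts = {
            src for src, outs in self.out_edges.items()
            if ("LINKS_TO", target) in outs
        }
        # Accounts with a POSTED edge into one of those posts.
        return sorted(
            src.key
            for src, outs in self.out_edges.items()
            if src.kind == "account"
            and any(rel == "POSTED" and dst in posts for rel, dst in outs)
        )


graph = MisinfoGraph()
graph.add_edge(Node("account", "alice"), "POSTED", Node("post", "p1"))
graph.add_edge(Node("post", "p1"), "LINKS_TO", Node("url", "https://example.com/cure"))
print(graph.accounts_sharing("https://example.com/cure"))
```

The two-hop query (account to post to URL) is the kind of question a coordinated-activity investigation might start from; in Neo4j the same lookup would be a short Cypher pattern match.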

How to help

We are actively seeking several forms of support:

  • Volunteers: Most immediate priority is on data engineering, data science, and advisors on marketing/funding/public health

    • Data engineers: Orchestration (Airflow, Prefect.io, Nifi, ...), streaming (Kafka, ...), graph (Neo4j, cuGraph), GPU (RAPIDS), ML (NLP libs), and databases
    • Analysts: OSINT, threat intel, campaign tracking, ...
    • Data scientists: especially around graph, misinformation, neural networks, GNNs, NLP, with backgrounds such as security, fraud, misinformation, marketing
    • Developers & designers: intelligence integrations, website for search & reports, automations, intervention bots, API
    • Marketing: Strategy & implementation
    • Public health and communications: Especially around safe and effective intervention design
    • Legal: Risk & legal analysis for various interventions
  • APIs and Data:

    • Feeds & enriching APIs: Lists and intel on URLs, domains, keywords, emails, topics, blockchain accounts, social media accounts & content, clinical trials, esp. if tunable on topic
    • Scoring: Libraries and APIs around social networks (initially Twitter) structure & content, websites, and news: bot scores, fingerprinting, topic extraction & classification
    • Crawling tech: Social networks and web
  • Software Licenses:

    • SaaS, OSS, Docker preferred
    • Project management
    • Analytics
    • Automation
  • Hardware: Anything you can provide along the lines of:

    • 1 x Database server (CPU): 32+ cores, Ubuntu 18, 64GB+ RAM, ideally backups
    • 1 x Primary analytics server (GPU) - 32+ CPU cores, 128GB+ CPU RAM, 2-8 GPUs, 64GB+ disk, 2-10TB attached SSD
      • GPUs: Nvidia Pascal or later, 12GB minimum, with 32GB strongly preferred (Ex: 32GB Volta)
    • 2 x Secondary / developer servers (GPU) - 8+ CPU cores, 64GB+ CPU RAM, 2-4 GPUs, 64GB+ disk, 1TB attached SSD
      • GPUs: Nvidia Pascal or later, 12GB minimum
    • 3 x Analyst stations (GPU) - 8+ CPU cores, 64GB+ CPU RAM, 1 TB attached SSD, 1-2 GPUs
      • GPUs: Nvidia Pascal or later, 12GB minimum, with 32GB strongly preferred (Ex: 32GB Volta)
    • For each, we can work together to set up Ubuntu with remote SSH admin, Nvidia drivers, & (Nvidia-enabled) Docker
  • Sponsors: Near-term funding until the project finds a more sustainable path is welcome!

    • Federal and private grants
    • Mission-aligned contracts: We are happy to help organizations tackle challenges around social intelligence, such as disaster analytics & anti-misinformation
  • Subprojects: We are focusing near-term on core data pipeline and simple analyses while building up to the discourse-graph-level ones

    • Firehose pipeline:
      • Cleaning up Twitter -> Neo4j/Parquet data conversions
      • Prefect.io/Airflow Firehose tasks: usertimeline, topic search, ... => Parquet/Neo4j
      • Switching from Twarc to Twint for higher-volume ingest
      • Fast large graph (100M row) export of cypher->neo4j->parquet
    • Orchestration tasks:
      • Python entity extractors: tweet -> text, URLs, bitcoin address, topics, ...
      • Python enrichment tasks: external APIs (factcheck, crypto, ...) and lightweight NLP algs (sentiment) and feed back to neo4j
      • Python data feed collectors: scripts that download feeds (factcheck, phishing, ...) and feed back to neo4j
    • Intervention campaigns:
      • Priority: Untrialed drug misinformation - modeling, detection, analysis, report, & alert
      • Scams: Check URLs & blockchain addresses for known badness
      • Mapping bots & misinformers
    • UI tools & automation: Prototypes of
      • Alert feed + scoring API: Misinfo intel for trust & safety teams that is powered by Neo4j and/or Prefect.io/Airflow
      • Community leader tools: Personalized alerting for community leaders to detect & respond to misinfo in their close social networks
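
To give volunteers a feel for the "Python entity extractors" task listed under the orchestration tasks above (tweet -> text, URLs, bitcoin address, topics), here is a minimal standard-library sketch. It is an illustration under stated assumptions, not the project's extractor: the `TweetEntities` class and `extract_entities` function are hypothetical names, the tweet is assumed to be a dictionary with a `full_text` or `text` field, hashtags stand in for "topics", and the Bitcoin pattern only covers legacy base58 addresses without validating checksums.

```python
import re
from dataclasses import dataclass, field

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
# Legacy base58 addresses only: start with 1 or 3, 26-35 characters, no 0/O/I/l.
BTC_RE = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


@dataclass
class TweetEntities:
    text: str
    urls: list = field(default_factory=list)
    hashtags: list = field(default_factory=list)
    mentions: list = field(default_factory=list)
    bitcoin_addresses: list = field(default_factory=list)


def _unique(items):
    """Drop duplicates while preserving first-seen order."""
    return list(dict.fromkeys(items))


def extract_entities(tweet: dict) -> TweetEntities:
    """Pull URLs, hashtags, mentions, and candidate Bitcoin addresses out of a tweet dict."""
    text = tweet.get("full_text") or tweet.get("text") or ""
    # Trim trailing punctuation that commonly follows a URL in running text.
    urls = _unique(u.rstrip(".,;:!?)") for u in URL_RE.findall(text))
    # Blank out URLs first so path fragments are not mistaken for other entities.
    without_urls = URL_RE.sub(" ", text)
    return TweetEntities(
        text=text,
        urls=urls,
        hashtags=_unique(HASHTAG_RE.findall(without_urls)),
        mentions=_unique(MENTION_RE.findall(without_urls)),
        bitcoin_addresses=_unique(BTC_RE.findall(without_urls)),
    )


sample = {"text": "Miracle cure at https://example.com/cure! #COVID19 @someone"}
print(extract_entities(sample))
```

Returning a plain dataclass keeps the extractor easy to wrap in a Prefect or Airflow task and to serialize toward Neo4j or Parquet later; matched addresses and URLs would still need checking against known-bad feeds before any alerting.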

Contact

Please contact Leo Meyerovich, CEO @ Graphistry, and Sean Griffin, CEO @ DisasterTech, for support and information.

Community Slack channel: #COVID via an open invite link