
kuwala-io / kuwala

License: Apache-2.0
Kuwala is the no-code data platform for BI analysts and engineers that lets you build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…

Programming Languages

python
139335 projects - #7 most used programming language
javascript
184084 projects - #8 most used programming language
Jupyter Notebook
11667 projects
r
7636 projects
CSS
56736 projects
HTML
75241 projects

Projects that are alternatives of or similar to kuwala

mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-89.45%)
Mutual labels:  jupyter, pyspark
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+101.27%)
Mutual labels:  jupyter, pyspark
Geopython
Notebooks and libraries for spatial/geo Python explorations
Stars: ✭ 268 (-43.46%)
Mutual labels:  jupyter, spatial-analysis
Pyspark Setup Demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-94.94%)
Mutual labels:  jupyter, pyspark
Beyond Jupyter
🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-71.52%)
Mutual labels:  postgres, jupyter
Urbansprawl
Open framework for calculating spatial urban sprawl indices and performing disaggregated population estimates using open data
Stars: ✭ 48 (-89.87%)
Mutual labels:  open-data, spatial-analysis
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-95.78%)
Mutual labels:  jupyter, spatial-analysis
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+937.76%)
Mutual labels:  data-integration, elt
astro
Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (-83.33%)
Mutual labels:  postgres, elt
Kamu Cli
Next generation tool for decentralized exchange and transformation of semi-structured data
Stars: ✭ 69 (-85.44%)
Mutual labels:  jupyter, open-data
Ppd599
USC urban data science course series with Python and Jupyter
Stars: ✭ 1,062 (+124.05%)
Mutual labels:  jupyter, spatial-analysis
Data Science Stack Cookiecutter
🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, AirFlow & API Star)
Stars: ✭ 153 (-67.72%)
Mutual labels:  postgres, jupyter
Sqlcell
SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Python values as parameters and assign output data to Python variables while concurrently running Python code. And *much* more.
Stars: ✭ 145 (-69.41%)
Mutual labels:  postgres, jupyter
jupyterlab-sparkmonitor
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (-83.54%)
Mutual labels:  jupyter, pyspark
top-github-scraper
Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-91.56%)
Mutual labels:  scraping
mpl-interactions
Sliders to control matplotlib and other interactive goodies. Works in any interactive backend and even uses ipywidgets when in a Jupyter notebook
Stars: ✭ 62 (-86.92%)
Mutual labels:  jupyter
astetik
Astetik takes away the pain from telling visual stories with data on Python
Stars: ✭ 15 (-96.84%)
Mutual labels:  jupyter
subscene scraper
Library to download subtitles from subscene.com
Stars: ✭ 14 (-97.05%)
Mutual labels:  scraping
lightdash
An open source alternative to Looker built using dbt. Made for analysts ❤️
Stars: ✭ 1,082 (+128.27%)
Mutual labels:  dbt
us-house
117th United States House of Representatives - Contact Information, including: Phone Number, Mailing Address, Official Website, Twitter & Facebook Accounts.
Stars: ✭ 31 (-93.46%)
Mutual labels:  open-data


Kuwala is the no-code data platform for BI analysts and engineers that lets you build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow.

Do you want to discuss your first contribution, learn more in general, or talk through your specific use case for Kuwala? Just book a digital coffee session with the core team here.

Collaboration between BI analysts and engineers

Kuwala stands for extendability, reproducibility, and enablement. Small data teams build data products quickly and collaboratively. Analysts and engineers each play to their strengths. Kuwala is the tool that makes it possible to keep a data project within scope while having fun again.

  • Kuwala Canvas runs directly on a data warehouse, which gives maximum flexibility and avoids lock-in
  • Engineers enable their analysts by adding transformations and models via dbt or new data sources through Airbyte
  • The node-based editor enables analysts to build advanced data workflows with many data sources and transformations through simple drag-and-drop
  • With models-as-a-block the BI analyst can launch advanced Marketing Mix Models and attributions without knowing R or Python

Extract and Load with Airbyte

For connecting and loading all your tooling data into a data warehouse, we are integrating with Airbyte connectors. For everything related to third-party data, such as POI and demographics data, we are building separate data pipelines.

Transform with dbt

To apply transformations to your data, we are integrating dbt, which runs on top of your data warehouse. Engineers can easily create dbt models and make them reusable in the frontend.

Run a Data Science Model

We are going to include open-source data science and AI models (e.g., Meta's Robyn Marketing Mix Modeling).

Report

We make the results exportable to Google Sheets and, in the future, also available in a Medium-style markdown editor.


How can I use Kuwala?

Canvas

The canvas environment is currently a work in progress. You can already get an idea of what it is going to look like with our prototype, and check out our roadmap for updates.

Third-party data connectors

We currently have five pipelines for different third-party data sources which can easily be imported into a Postgres database. The following pipelines are integrated:

Jupyter environment & CLI

Until the canvas is built, we provide a Jupyter environment with convenience functions for working with the third-party data pipelines. To easily run the data pipelines, you can use the CLI.

Quickstart & Demo

Demo correlating Uber traversals with Google popularities

[Badge: launch the demo on Binder]

Jupyter Notebook Popularity Correlation

We have a notebook with which you can correlate any value associated with a geo-reference with the Google popularity score. In the demo, we have preprocessed popularity data and a test dataset with Uber rides in Lisbon, Portugal.
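
The idea behind the notebook can be sketched in a few lines. This is a minimal, standard-library illustration and not the notebook's actual code: it assumes that both datasets are already aggregated per spatial cell, and the names `pearson`, `correlate_by_cell`, and the `cell_*` identifiers are hypothetical. The numbers are made up for illustration and do not come from the demo dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlate_by_cell(values, popularity):
    """Join two {cell_id: number} mappings on shared cells and correlate them."""
    # Only cells present in both datasets can be compared.
    shared = sorted(values.keys() & popularity.keys())
    return pearson([values[c] for c in shared], [popularity[c] for c in shared])


# Hypothetical cell identifiers and made-up values, for illustration only.
traversals = {"cell_a": 10, "cell_b": 20, "cell_c": 30, "cell_d": 5}
popularity = {"cell_a": 1.0, "cell_b": 2.0, "cell_c": 3.0, "cell_e": 9.0}
r = correlate_by_cell(traversals, popularity)
print(f"Pearson r over shared cells: {r:.2f}")
```

Joining on a shared spatial key first keeps the comparison restricted to locations that appear in both datasets; cells present in only one of them are ignored.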

Run the demo

You can either use the deployed example on Binder via the badge above or run everything locally. The Binder example simply uses Pandas dataframes and does not connect to a data warehouse.

Setting up and running the CLI

Prerequisites

  1. Installed version of Docker and docker-compose v2.
  2. Installed version of Python 3 and the latest versions of pip, setuptools, and wheel.
    • We recommend using version 3.9.5 or higher.
    • To check your current version run python3 --version.
  3. Installed version of libpq.
    • For Mac, you can use brew: brew install libpq
  4. Installed version of postgresql.
    • For Mac, you can use brew: brew install postgresql
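
The prerequisites above can be checked with a short script before starting. This is a sketch and not part of Kuwala: the executable names used to detect each tool (`docker`, `pg_config`, `psql`) are assumptions about how the prerequisites appear on your PATH, and the helper names are hypothetical. Whether Compose v2 is installed is not covered here; running `docker compose version` should tell you.

```python
import shutil
import sys

MIN_PYTHON = (3, 9, 5)  # version recommended in the list above

# Assumption: each prerequisite exposes one of these executables on PATH.
REQUIRED_TOOLS = {
    "Docker": "docker",
    "libpq": "pg_config",
    "PostgreSQL": "psql",
}


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the interpreter version meets the recommended minimum."""
    current = tuple(version) if version is not None else tuple(sys.version_info[:3])
    return current >= minimum


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of prerequisites whose executable is not on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python %d.%d.%d or higher is recommended" % MIN_PYTHON)
    for name in missing_tools():
        print(f"Missing prerequisite: {name}")
```

A tool installed outside PATH (for example, a keg-only Homebrew formula) will be reported as missing even though it is installed.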

Setup

  1. Change your directory to kuwala/core/cli.
  2. Create a virtual environment.
    • For instructions on how to set up a venv on different systems, see here.
  3. Install dependencies by running pip3 install --no-cache-dir -r requirements.txt

Run

To start the CLI, run the following command from inside the kuwala/core/cli/src directory and follow the instructions:

python3 main.py
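
For reference, the Setup and Run steps can be scripted as follows. This is a sketch under stated assumptions and not a script shipped with Kuwala: `repo_root` is taken to be the path of your local `kuwala` directory, the `.venv` name is arbitrary, the helper names are hypothetical, and the interpreter path follows the POSIX venv layout (on Windows it is `Scripts\python.exe`).

```python
import subprocess
import sys
from pathlib import Path


def setup_commands(repo_root, venv_name=".venv"):
    """Build the Setup and Run steps as (argv, working directory) pairs."""
    cli_dir = Path(repo_root) / "core" / "cli"
    venv_dir = cli_dir / venv_name
    # POSIX layout (assumption); Windows uses Scripts\python.exe instead.
    venv_python = venv_dir / "bin" / "python"
    return [
        # Setup step 2: create a virtual environment.
        ([sys.executable, "-m", "venv", str(venv_dir)], cli_dir),
        # Setup step 3: install dependencies into that environment.
        ([str(venv_python), "-m", "pip", "install", "--no-cache-dir",
          "-r", "requirements.txt"], cli_dir),
        # Run: start the CLI from inside the src directory.
        ([str(venv_python), "main.py"], cli_dir / "src"),
    ]


def run_setup(repo_root):
    """Execute each step in order and stop at the first failure."""
    for argv, cwd in setup_commands(repo_root):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_setup("path/to/kuwala")` performs the three steps in sequence. Invoking pip through the virtual environment's interpreter avoids having to activate the environment in the calling shell.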

Using Kuwala components individually

To use Kuwala's components, such as the data pipelines or the Jupyter environment, individually, please refer to the instructions under /kuwala.


Use cases


How can I contribute?

Every new issue, question, or comment is a contribution and very welcome! This project thrives on your feedback and involvement!

Be part of our community

The best first step to get involved is to join the Kuwala Community on Slack. There we discuss everything related to our roadmap, development, and support.

Contribute to the project

Please refer to our contribution guidelines for further information on how to get involved.


Get more content about Kuwala

| Link | Description |
| --- | --- |
| Blog | Read all our blog articles related to the stuff we are doing here. |
| Join Slack | Our Slack channel with over 170 data engineers and many discussions. |
| Jupyter notebook - Popularity correlation | Open a Jupyter notebook on Binder and merge external popularity data with Uber traversals by making use of convenient dbt functions. |
| Podcast | Listen to our community podcast and maybe join us on the next show. |
| Digital coffee break | Are you looking for new inspiring tech talks? Book a digital coffee chit-chat with one member of the core team. |
| Our roadmap | See our upcoming milestones and sprint planning. |
| Contribution guidelines | Further information on how to get involved. |