prakharrathi25 / data-storyteller

Licence: MIT license
Automated tool for data story telling

data-storyteller

📱 Data Storyteller 📉

ONE STOP SOLUTION FOR ALL YOUR DATA NEEDS

Introduction

As per Gartner [2], the analytics and business intelligence platform market has transitioned from the visual data discovery era to the augmented era. Data and analytics leaders/administrators should begin piloting capabilities and competencies that enable the “augmented consumer”.

Advances in technology let organisations make data-driven decisions and base their planning and forecasts on them. Many business users, however, do not have the time to analyse the data and extract noteworthy insights. There are also gaps between how a tool produces its output and how a business user can interpret and act on it. In addition, building business insights from data requires considerable domain knowledge, and not every user is a business expert. Given a snapshot of data, we would like to build a system that can tell a story from the data. The story is automated in the sense that it is driven by the data, the context and personal preferences. This addresses both problems: it simplifies tool usage and guides the user with data-driven intelligence towards business decisions. The whole process is driven by outcome and effectiveness.

Tool Description

Data Storyteller is an AI-based tool that takes a data set, identifies patterns in the data, interprets the results, and produces an output story that a business user can understand in context. It proactively analyses data on behalf of users and generates smart feeds using natural language generation techniques, which business users can then consume with very little effort. The application has been built with a non-expert user in mind, so it is easy to use and understand. It also uses a multipage implementation of the Streamlit library with class-based pages.
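
The class-based multipage idea can be sketched roughly as follows. This is a minimal illustration and not the project's actual multipage.py: the names `Page`, `MultiPage`, `add_page` and `run` are assumptions, and the page selector is passed in as a callable so that the sketch does not depend on Streamlit being installed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    """A page is a title plus a function that draws it (hypothetical type)."""
    title: str
    render: Callable[[], None]


class MultiPage:
    """Registry of pages; exactly one page is rendered per run."""

    def __init__(self) -> None:
        self.pages: List[Page] = []

    def add_page(self, title: str, render: Callable[[], None]) -> None:
        self.pages.append(Page(title, render))

    def run(self, choose: Callable[..., Page]) -> None:
        # `choose` is any selectbox-like callable. In a Streamlit app it
        # could be st.sidebar.selectbox, which accepts a label, a list of
        # options and a format_func used to display each option.
        page = choose("App Navigation", self.pages, format_func=lambda p: p.title)
        page.render()


# Assumed wiring in app.py (module and function names are hypothetical):
#   import streamlit as st
#   app = MultiPage()
#   app.add_page("Upload Data", data_upload.app)
#   app.add_page("Machine Learning", machine_learning.app)
#   app.run(st.sidebar.selectbox)
```

Keeping each page as a plain function means a new module can be added with a single `add_page` call.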

Features

Given data/analytics output, the tool can:

  • turn the data into interactive data stories
  • generate deep insights, infer patterns and help in business decisions
  • provide personalization profiles; these could be represented as metadata describing what would be of interest to a given user
  • generate reports that a business user can understand, with an interactive and intuitive interface

📝 Module-Wise Description

The application uses Streamlit for a multipage, class-based implementation, which can be viewed in the multipage.py file. The UI of the application can be seen here. The application is divided into multiple modules, each of which is described below.

UI of the application

📌 Data Upload

This module handles data upload. It accepts CSV and Excel files. As soon as the data is uploaded, it creates a copy of the data so that the file does not have to be read multiple times. It also saves the columns and their data types and displays them to the user. The saved data and column types are needed at later stages.
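
A rough sketch of the upload step, assuming pandas is used for reading the file. The function names, the `column_name`/`type` layout and the 10-unique-values threshold for categorical columns are illustrative assumptions, not the project's actual code.

```python
import io

import pandas as pd


def load_table(buffer, filename: str) -> pd.DataFrame:
    """Read an uploaded CSV or Excel file into a DataFrame."""
    if filename.lower().endswith(".csv"):
        return pd.read_csv(buffer)
    # Reading Excel needs an engine such as openpyxl to be installed.
    return pd.read_excel(buffer)


def infer_column_types(df: pd.DataFrame) -> pd.DataFrame:
    """Tag each column as numerical, categorical or object."""
    rows = []
    for name in df.columns:
        if pd.api.types.is_numeric_dtype(df[name]):
            kind = "numerical"
        elif df[name].nunique() <= 10:  # assumed threshold
            kind = "categorical"
        else:
            kind = "object"
        rows.append({"column_name": name, "type": kind})
    return pd.DataFrame(rows)


# Example with an in-memory CSV standing in for the uploaded file.
sample = io.StringIO("age,city,flag\n31,Delhi,0\n45,Pune,1\n28,Delhi,1\n")
data = load_table(sample, "sample.csv")
metadata = infer_column_types(data)
print(metadata)
```

In this sketch the binary `flag` column is tagged as numerical, which is the kind of automatic tagging a user may want to override afterwards.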

📌 Change Metadata

Once the column types are saved in the metadata, the user needs the option to change them, so that the automatic column tagging can be overridden. For example, a binary column of 0s and 1s may be tagged as numerical, and the user might have to correct it. The three data types available are:

  • Numerical
  • Categorical
  • Object

The correction happens immediately and is saved at that moment.
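
The override-and-save behaviour could look something like the sketch below. The metadata is modelled as a plain dictionary and persisted as JSON purely for illustration; the storage format, file name and function names are assumptions.

```python
import json
import os
import tempfile

ALLOWED_TYPES = ("numerical", "categorical", "object")


def change_column_type(metadata: dict, column: str, new_type: str) -> dict:
    """Return a copy of the metadata with one column re-tagged."""
    if column not in metadata:
        raise KeyError(f"unknown column: {column}")
    if new_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {ALLOWED_TYPES}")
    updated = dict(metadata)
    updated[column] = new_type
    return updated


def save_metadata(metadata: dict, path: str) -> None:
    """Write the metadata to disk so the change takes effect immediately."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(metadata, fh, indent=2)


# A binary column that was auto-tagged as numerical is corrected by the user.
metadata = {"age": "numerical", "city": "categorical", "flag": "numerical"}
metadata = change_column_type(metadata, "flag", "categorical")
metadata_path = os.path.join(tempfile.gettempdir(), "metadata_example.json")
save_metadata(metadata, metadata_path)
```

Validating against the three allowed types keeps an invalid tag from reaching the later modules.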

📌 Machine Learning

This section automates the machine learning process: the user selects the X and y variables and the application does everything else. The user specifies which columns to use and then selects the type of task, regression or classification. The application trains multiple models and saves the best one as a binary .sav file to be used later for inference. The accuracy or R2 score is displayed immediately, while the model runs in the background.
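
One way to compare several models and keep the best one is sketched below with scikit-learn. The candidate models, the synthetic data, the use of cross-validation and the file name are all assumptions made for illustration; the application's own selection logic may differ.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def pick_best_model(X, y, candidates, cv=3):
    """Return (name, fitted_model, mean_cv_score) for the best candidate."""
    # The default scorer is accuracy for classifiers and R2 for regressors.
    scores = {
        name: cross_val_score(model, X, y, cv=cv).mean()
        for name, model in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    best_model = candidates[best_name].fit(X, y)
    return best_name, best_model, scores[best_name]


# Synthetic data standing in for the user's selected X and y columns.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
best_name, best_model, best_score = pick_best_model(X, y, candidates)

# Persist the winner as a binary .sav file (the file name is an assumption).
model_path = os.path.join(tempfile.gettempdir(), "best_model_example.sav")
with open(model_path, "wb") as fh:
    pickle.dump(best_model, fh)
```

For a regression task the same function works with regressors as candidates, in which case the reported score is R2.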

📌 Data Visualization

📌 Y-Parameter Optimization

Technology Stack

  1. Python
  2. Streamlit
  3. Pandas
  4. Scikit-Learn
  5. Seaborn

How to Run

  • Clone the repository
  • Set up a virtual environment
$ python3 -m venv env
  • Activate the virtual environment
$ source env/bin/activate
  • Install dependencies using
$ pip install -r requirements.txt
  • Run Streamlit
$ streamlit run app.py

Other Content

Video Walkthrough

Presentation

🤝 How to Contribute? [3]

  • Take a look at the Existing Issues or create your own Issues!
  • Wait for the Issue to be assigned to you after which you can start working on it.
  • Fork the Repo and create a Branch for any Issue that you are working upon.
  • Create a Pull Request, which will be promptly reviewed, with suggestions added to improve it.
  • Add Screenshots to help us know what this Script is all about.

👨‍💻 Contributors


Prakhar Rathi


Manav Prabhakar


Salil Sxena

References

[1] SAP Hackathon: https://sap-code.hackerearth.com/challenges/hackathon/sap-code/custom-tab/data-4-storytelling/#Data%204%20Storytelling (used for the README.md introduction)

[2] Gartner: https://www.gartner.com/en/documents/3982132

[3] Soumyajit Behera: https://github.com/soumyajit4419/MedHub_360

Contact

For any feedback or queries, please reach out to [email protected].

Note: The project is only for education purposes, no plagiarism is intended.
