cc-archive / cccatalog-dataviz

Licence: MIT license
Data visualizations of CC-licensed works across the internet.

Project Discontinued

For additional context see:


Visualize CC Catalog Data

About

The landscape of openly licensed content is wide and varied. Millions of web pages host and share CC-licensed works—in fact, we estimate that there are over 1.6 billion across the web! With this growth of CC-licensed works, Creative Commons (CC) is increasingly interested in learning how hosts and users of CC-licensed materials are connected, as well as the types of content published under a CC license and how this content is shared. Each month, CC uses Common Crawl data to find all domains that contain CC-licensed content. This dataset contains information about the URL of the websites and the licenses used.

In order to draw conclusions and insights from this dataset, we created the Linked Commons: a visualization that shows how the Commons is digitally connected.

A live demo of the project can be found here.

Getting Started

Directory Structure

src
│   README.md
│   docker-compose.yml # Development docker compose
│
└───GSoC2019
└───data-release # Contains some raw unprocessed tsv files and processed output JSON files
│
└───frontend # Contains react.js app to render the visualization in the browser.
│  │   .env # Contains Backend Server Base Endpoint
│  │   package.json
│  │   package.lock.json
│  │
│  └───src # Contains all React Components
│  
└───backend # Includes Django server source code and scripts to build & update the database. 
   │   requirements.txt
   │   .env # Contains list of environment variables the project needs
   │
   └───scripts # Contains scripts to parse JSON data and upload it to MongoDB server
   └───src # Contains server side Django Apps which defines the API that feeds data to the visualization 

Setting Up Local Development Environment Without Docker

Prerequisites

The frontend application uses React, which requires Node.js v12+ and npm. Node.js can be installed from here.

The backend application uses Django, which requires Python v3.7+. Python can be installed from here.

Frontend

  1. Navigate to the frontend/ directory.
cd frontend/
  2. Install all dependencies (make sure that a package.json exists in the current path).
npm install
  3. To start the development server, run the following command in the terminal.
npm start
  4. To create an optimized build for production, run the following command in the terminal.
npm run build

Backend and Database

  1. Navigate to the backend/ directory.
cd backend/
  2. Before proceeding further, ensure that all the variables in the .env file are updated and that MONGO_HOSTNAME is set to localhost:27017.
  3. Install all dependencies.
pip install -r requirements.txt
  4. Navigate to the src/ directory, where the Django server code lives.
cd src/
  5. To start the development server, use the following command.
python manage.py runserver
  6. The backend should now be live at localhost:8000.
  7. The server needs a running instance of MongoDB. Start the MongoDB server and ensure that the authentication credentials are exactly the same as those defined in the .env file. If you wish to update the data inside the database, see the "Add data to MongoDB" section.
  8. Happy Contributing to Linked Commons! 🚀🚀🚀
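
As a quick sanity check on the MONGO_HOSTNAME setting, the following minimal sketch reads a .env file and compares the value with the one expected for a non-Docker setup. It assumes the file uses plain KEY=VALUE lines. The helper names `read_env` and `check_mongo_hostname` and the `backend/.env` path are illustrative assumptions, not part of this repository.

```python
from pathlib import Path


def read_env(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def check_mongo_hostname(env, expected="localhost:27017"):
    """Return (matches, actual) for the MONGO_HOSTNAME entry."""
    actual = env.get("MONGO_HOSTNAME")
    return actual == expected, actual


if __name__ == "__main__":
    env_path = Path("backend/.env")  # assumed location, relative to the repo root
    if env_path.exists():
        ok, actual = check_mongo_hostname(read_env(env_path))
        print("MONGO_HOSTNAME ok" if ok else f"unexpected MONGO_HOSTNAME: {actual!r}")
    else:
        print(f"{env_path} not found")
```

For the Docker-based setups, the same check could be run with `expected="mongodb:27017"`.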

Setting Up Local Development Environment using Docker

  1. Make sure that the root directory contains docker-compose.yml, and ensure that the backend/.env file is updated and MONGO_HOSTNAME is set to mongodb:27017.
  2. Run the following command to build and start the containers.
docker-compose up
  3. The frontend, backend and database should now be live.
  4. If this is the first time you have built the containers, see the "Add data to MongoDB" section to learn how to add data to MongoDB.
  5. Any changes in backend/ and frontend/ will trigger a rebuild, and you will be able to see the changes on the server.
  6. Happy Contributing to Linked Commons! 🚀🚀🚀

Building production version

Important: For simplicity, we use Docker to build the production version. Please note that changes to project files made after the build are not reflected in the running container; you need to rebuild the image.

  1. Before building images, ensure that all the variables in the .env file are updated and that MONGO_HOSTNAME is set to mongodb:27017.
  2. Navigate to backend/ and build the django-backend image.
cd backend/
docker build . -f Dockerfile.prod -t linked_commons/backend
  3. Create a new user-defined bridge network.
docker network create --driver=bridge linkedcommons-net
  4. Run the newly built linked_commons/backend image.
docker run --name backend \
   -p 8000:8000 --env-file ./.env \
   --network=linkedcommons-net \
   --rm -d linked_commons/backend
  5. Start the database in an isolated container.
docker run -it --name mongodb \
   --network=linkedcommons-net \
   -p 27017:27017 -v mongodbdata:/data/db \
   --env-file ./.env --rm -d mongo:4.0.8
  6. You can now access the backend at port 8000 and the database at port 27017 of localhost. If you wish to add data, see the "Add data to MongoDB" section.
  7. Now build the frontend. Navigate to the frontend directory and build the react-frontend image.
cd frontend
docker build . -f Dockerfile.prod -t linkedcommons/frontend
  8. To start the frontend application, run the following command.
docker run --name frontend \
   -p 3000:80 --rm -d linkedcommons/frontend
  9. The frontend can now be accessed at localhost:3000.
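
Once the containers are running, it can be useful to confirm that the published ports accept connections. The sketch below is one possible check using only the Python standard library. The port numbers come from the `docker run` commands in this section; the `is_port_open` helper and the `SERVICES` mapping are illustrative names, not part of this repository. A successful TCP connection only shows that something is listening, not that the service is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Host ports published by the docker run commands in this section.
SERVICES = {"backend": 8000, "database": 27017, "frontend": 3000}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name:8s} localhost:{port} {state}")
```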

Add data to MongoDB

  1. Navigate to the directory containing build_db_script.py.
cd backend/scripts
  2. Ensure that the directory contains fdg_input_file.json, or update the INPUT_FILE_PATH variable to point to the file that will be uploaded to the database. A sample fdg_input_file.json can be found inside the data-release/ directory.
  3. Ensure that all the variables in the .env file match the running MongoDB server.
  4. Run build_db_script.py in the terminal.
# It will connect to the database at `localhost:27017` and update the data. 
python build_db_script.py localhost
  5. This may take a while, depending on the JSON file size.
  6. Congrats! You have successfully updated the data. 🎉🎉🎉
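
Because the upload time depends on the size of the JSON file, it may be worth confirming that the input file parses before starting the script. The following is a minimal sketch of such a pre-check. It makes no assumption about the file's schema and only reports its size and top-level shape. The `describe_json_file` helper is an illustrative name and is not part of build_db_script.py.

```python
import json
from pathlib import Path


def describe_json_file(path):
    """Parse a JSON file and summarize its size and top-level shape."""
    file_path = Path(path)
    size_bytes = file_path.stat().st_size
    with file_path.open(encoding="utf-8") as handle:
        data = json.load(handle)  # raises json.JSONDecodeError if malformed
    summary = {"size_bytes": size_bytes, "type": type(data).__name__}
    if isinstance(data, dict):
        summary["keys"] = sorted(data)
    elif isinstance(data, list):
        summary["length"] = len(data)
    return summary


if __name__ == "__main__":
    input_path = Path("fdg_input_file.json")  # default file name from the steps above
    if input_path.exists():
        print(describe_json_file(input_path))
    else:
        print(f"{input_path} not found")
```

Note that this loads the whole file into memory, which is acceptable for a one-off check but may be slow for very large files.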

Archive

GSoC2019 - Google Summer of Code project by María Belén Guaranda
