
mesosphere / Rendler

A rendering web crawler for Apache Mesos.


RENDLER ⁉️

A rendering web-crawler framework for Apache Mesos.

YES RENDLER

See the accompanying slides for more context.

RENDLER consists of three main components:

  • CrawlExecutor extends mesos.Executor
  • RenderExecutor extends mesos.Executor
  • RenderingCrawler extends mesos.Scheduler and launches tasks with the executors

Quick Start with Vagrant

Requirements

Start the mesos-demo VM

$ wget http://downloads.mesosphere.io/demo/mesos.box -O /tmp/mesos.box
$ vagrant box add --name mesos-demo /tmp/mesos.box
$ git clone https://github.com/mesosphere/RENDLER.git
$ cd RENDLER
$ vagrant up

Now that the VM is running, you can view the Mesos Web UI here: http://10.141.141.10:5050

You can see that one slave is registered, with some idle CPUs and memory. So let's start the Rendler!

Run RENDLER in the mesos-demo VM

Check implementations of the RENDLER scheduler in the python, go, scala, and cpp directories. Run instructions are here:

Feel free to contribute your own!

Generating a pdf of your render graph output

With GraphViz installed (check with which dot):

vagrant@mesos:hostfiles $ bin/make-pdf
Generating '/home/vagrant/hostfiles/result.pdf'

Open result.pdf in your favorite viewer to see the rendered result!

Sample Output

[Image: Sample Crawl]

Shutting down the mesos-demo VM

# Exit out of the VM
vagrant@mesos:hostfiles $ exit
# Stop the VM
$ vagrant halt
# To delete all traces of the vagrant machine
$ vagrant destroy

Rendler Architecture

Crawl Executor

  • Interprets each incoming task's task.data field as a URL.
  • Fetches the resource and extracts links from the document.
  • Sends a framework message to the scheduler containing the crawl result.
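
The crawl step above can be sketched without any Mesos dependency. This is a minimal illustration, not the project's executor: the names LinkParser and crawl, the injected fetch callable, and the JSON key names (taken from the taskId, url, and links fields described under Intermediate Data Structures) are assumptions made for the example.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkParser(HTMLParser):
    """Collects absolute http(s) link targets from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop #fragments.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                is_web = absolute.startswith(("http://", "https://"))
                if is_web and absolute not in self.links:
                    self.links.append(absolute)


def crawl(task_id, url, fetch):
    """Hypothetical crawl step; fetch(url) must return the page HTML as a string."""
    parser = LinkParser(url)
    parser.feed(fetch(url))
    # JSON payload that an executor could send back as a framework message.
    return json.dumps({"taskId": task_id, "url": url, "links": parser.links})
```

Passing fetch in as a parameter keeps the sketch testable offline; a real executor would perform an HTTP request there and hand the resulting string to the Mesos driver as the framework message.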

Render Executor

  • Interprets each incoming task's task.data field as a URL.
  • Fetches the resource and saves a PNG image to a location accessible to the scheduler.
  • Sends a framework message to the scheduler containing the render result.

Intermediate Data Structures

We define some common data types to facilitate communication between the scheduler and the executors. Their default representation is JSON.

results.CrawlResult(
    "1234",                                 # taskId
    "http://foo.co",                        # url
    ["http://foo.co/a", "http://foo.co/b"]  # links
)
results.RenderResult(
    "1234",                                 # taskId
    "http://foo.co",                        # url
    "http://dl.mega.corp/foo.png"           # imageUrl
)
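
One way such result types could round-trip through JSON is sketched below. The dataclass layout and the to_json / from_json helpers are assumptions for illustration; only the field names (taskId, url, links, imageUrl) come from the constructor calls above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class CrawlResult:
    taskId: str
    url: str
    links: List[str] = field(default_factory=list)

    def to_json(self):
        # Serialize all fields under their own names.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


@dataclass
class RenderResult:
    taskId: str
    url: str
    imageUrl: str

    def to_json(self):
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))
```

With this shape, an executor would call to_json() before sending a framework message, and the scheduler would call from_json() on the received payload.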

Rendler Scheduler

Data Structures

  • crawlQueue: list of urls
  • renderQueue: list of urls
  • processedURLs: set of urls
  • crawlResults: list of url tuples
  • renderResults: map of urls to imageUrls

Scheduler Behavior

The scheduler accepts one URL as a command-line parameter to seed the render and crawl queues.

  1. For each URL, create a task in both the render queue and the crawl queue.

  2. Upon receipt of a crawl result, add an element to the crawl results adjacency list. Append to the render and crawl queues each URL that is not present in the set of processed URLs. Add these enqueued URLs to the set of processed URLs.

  3. Upon receipt of a render result, add an element to the render results map.

  4. The crawl and render queues are drained in first-come, first-served (FCFS) order at a rate determined by the resource offer stream. When the queues are empty, the scheduler declines resource offers to make them available to other frameworks running on the cluster.
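
The queue bookkeeping in the steps above can be sketched as plain Python, with no Mesos calls. The class name RendlerState, its method names, and the slots parameter (standing in for however many tasks a resource offer can hold) are hypothetical. Marking the seed URL as processed up front is also an assumption, made so that a page linking back to the seed does not re-enqueue it.

```python
from collections import deque


class RendlerState:
    """Hypothetical scheduler bookkeeping; resource offers are reduced to a slot count."""

    def __init__(self, seed_url):
        # Step 1: the seed URL goes into both queues.
        self.crawl_queue = deque([seed_url])
        self.render_queue = deque([seed_url])
        self.processed_urls = {seed_url}
        self.crawl_results = []   # adjacency list of (url, link) tuples
        self.render_results = {}  # url -> imageUrl

    def on_crawl_result(self, url, links):
        # Step 2: record every edge, enqueue only URLs not seen before.
        for link in links:
            self.crawl_results.append((url, link))
            if link not in self.processed_urls:
                self.processed_urls.add(link)
                self.crawl_queue.append(link)
                self.render_queue.append(link)

    def on_render_result(self, url, image_url):
        # Step 3: remember where the rendered image lives.
        self.render_results[url] = image_url

    def next_tasks(self, slots):
        """Step 4: drain up to `slots` tasks in FCFS order.

        An empty list means there is no work, so the offer would be declined.
        """
        tasks = []
        while len(tasks) < slots and (self.crawl_queue or self.render_queue):
            if self.crawl_queue:
                tasks.append(("crawl", self.crawl_queue.popleft()))
            if len(tasks) < slots and self.render_queue:
                tasks.append(("render", self.render_queue.popleft()))
        return tasks
```

Alternating between the two queues is one possible policy; the description above only requires that each queue is drained in arrival order.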
