oliver006 / Sockpuppet

Having fun with WebSockets, Python, Golang and nytimes.com



### What's this all about?

Did you ever wonder how **nytimes.com** pushes breaking news articles to the front page while you have it open in your browser? Well, I used my browser's developer tools to look at what's going on, and it turns out they don't periodically reload JSON data but use websockets to push new events directly to your browser ([see here](https://developer.mozilla.org/en-US/docs/WebSockets) for more information about websockets).
The system is called `nyt-fabrik`. For a few talks and presentations that give some insight into the architecture, [search Google for "nytimes fabrik websockets"](https://www.google.com/search?q=nytimes+fabrik+websockets).

Example code is included: `sockpuppet.py` for Python and `sockpuppet.go` for Golang.


### Cool, so how does it work?

When you go to nytimes.com, your browser establishes a websocket connection with the NYT fabrik server and, after a little login dance, starts listening for news events. Your browser opens a websocket TCP connection to e.g. `ws://blablabla.fabrik.nytimes.com./123/abcde123/websocket` and the server sends a one-character frame, `o`, which is a request to provide some sort of login identification.
The client (your browser) responds with `["{\"action\":\"login\",\"client_app\":\"hermes.push\",\"cookies\":{\"nyt-s\":\"SOME_COOKIE_VALUE_HERE\"}}"]`, and from then on you receive either an `h` every 20-30 seconds, which is some sort of keep-alive, or a frame that starts with `a` and has all sorts of data encoded as JSON.
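
As a rough illustration of the three frame types described above, the first character is enough to classify each incoming frame. This is a minimal sketch, not code from this repo: the function name `classify_frame` and the labels `open`, `heartbeat` and `data` are made up for the example.

```python
def classify_frame(payload):
    """Map a raw text frame to (kind, rest) based on its first character."""
    if not payload:
        return ("empty", "")
    # Hypothetical labels for the frame types described in the text.
    kinds = {"o": "open", "h": "heartbeat", "a": "data"}
    return (kinds.get(payload[0], "unknown"), payload[1:])


print(classify_frame("o"))                        # ('open', '')
print(classify_frame("h"))                        # ('heartbeat', '')
print(classify_frame('a{"type": "feeds_item"}'))  # ('data', '{"type": "feeds_item"}')
```

Anything that is not `o`, `h` or `a` is reported as `unknown` rather than raising, so a receive loop built on this can log and skip frames it does not understand.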

If we receive a message starting with `a`, we can strip the first character and JSON decode the rest.

```
{
    "body": "{\"status\":\"updated\",\"version\":1,\"links\":[{\"url\":\"http://www.nytimes.com/2015/05/26/us/cleveland-police.html\",\"count\":0,\"content_id\":\"100000003702598\",\"content_type\":\"article\",\"offset\":0}],\"title\":\"Cleveland Is Said to Settle Justice Department Lawsuit Over Policing\",\"start_time\":1432581057,\"display_duration\":null,\"label\":\"Breaking News\",\"last_modified\":1432581057,\"display_type_id\":1,\"end_time\":1432581057,\"id\":34131339,\"sub_type\":\"BreakingNews\"}",
    "timestamp": "2015-05-21T11:21:11.123456Z",
    "hash_key": "34131339",
    "uuid": "1234",
    ...
    "account": "nyt1",
    "type": "feeds_item"
}
```

If the decoded message has a `"body"` field, we can JSON decode that as well. For a breaking news item it looks something like this:

```json
{"status": "updated", "sub_type": "BreakingNews",
"links": [{"url": "http://www.nytimes.com/2015/05/26/us/cleveland-police.html", "count": 0, "content_id": "100000003702598", "content_type": "article", "offset": 0}],
"title": "Cleveland Is Said to Settle Justice Department Lawsuit Over Policing",
"start_time": 1432581057, "display_duration": null, "label": "Breaking News",
"version": 1, "display_type_id": 1, "end_time": 1432581057,
"last_modified": 1432581057, "id": 34131339}
```

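The two decoding steps above can be put together in a short, self-contained sketch using only the standard `json` module. The helper name `decode_data_frame` and the shortened sample frame are assumptions for illustration; real frames carry more fields.

```python
import json


def decode_data_frame(payload):
    """Return (frame, body) for an 'a' frame; body is None when absent."""
    if not payload.startswith("a"):
        raise ValueError("not a data frame")
    # First pass: strip the leading 'a' and decode the outer JSON document.
    frame = json.loads(payload[1:])
    body = None
    # Second pass: the "body" field is itself a JSON-encoded string.
    if isinstance(frame, dict) and "body" in frame:
        body = json.loads(frame["body"])
    return frame, body


# Shortened, made-up sample built to mimic the shape shown above.
inner = {"status": "updated", "sub_type": "BreakingNews", "id": 34131339}
sample = "a" + json.dumps({"type": "feeds_item", "body": json.dumps(inner)})

frame, body = decode_data_frame(sample)
print(frame["type"], body["sub_type"], body["id"])
```
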
### Neat but how do I access the feed programmatically?

Good question. We need to do a handful of things to get this to work. For the Python example, I'll be using the Tornado websocket framework, and for the Golang example I'll be using the `golang.org/x/net/websocket` package.

#### Connect to the websocket

In Python, this is easy:

```python
# Runs inside a Tornado coroutine, hence the `yield`.
url = "ws://blablabla.fabrik.nytimes.com./123/abcdef123/websocket"
try:
    w = yield tornado.websocket.websocket_connect(url, connect_timeout=5)
    logging.info("Connected to %s", url)
except Exception as ex:
    logging.error("couldn't connect, err: %s", ex)
```

In Golang, it looks about the same:

```go
addr := "ws://blablabla.fabrik.nytimes.com./123/abcdef123/websocket"
// arguments: URL, subprotocol (none), origin
ws, err := websocket.Dial(addr, "", "http://www.nytimes.com/")
if err != nil {
	log.Fatal(err)
}
log.Printf("Connected to %s", addr)
```

That was easy, wasn't it?

#### Listen for incoming messages

Good, we are now connected and have a websocket object/struct to work with. Let's listen for incoming messages.

Python:

```python
while True:
    payload = yield w.read_message()
    if payload is None:
        logging.error("uh oh, we got disconnected")
        return
```

and in Golang:

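```go
var msgBuf = make([]byte, 4096)
for {
	bufLen, err := ws.Read(msgBuf)
	if err != nil {
		log.Printf("read err: %s", err)
		return
	}
	// only the first bufLen bytes of the buffer belong to this read
	payload := msgBuf[:bufLen]
	// ... handle payload as described in the following steps
}
```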

One caveat here: the Golang version reads into a fixed 4096-byte buffer, so it can't handle messages longer than 4k in one go (it'll chunk them into 4k pieces). For our purposes that's not an issue.

#### Send the login message

If we receive `o`, we need to send the login message. We need a cookie value, so let's make one up:

```python
# needs: import json, random, string
if payload[0] == "o":
    cookie = ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(32))
    msg = json.dumps(['{"action":"login", "client_app":"hermes.push", "cookies":{"nyt-s":"%s"}}' % cookie])
    w.write_message(msg.encode('utf8'))
    logging.info("sent cookie: %s", cookie)
```

In Golang this is a bit more verbose:

```go
if msgBuf[0] == 'o' {
	// reply to the login request
	cookie := randCookie()
	msg := fmt.Sprintf(`["{\"action\":\"login\", \"client_app\":\"hermes.push\", \"cookies\":{\"nyt-s\":\"%s\"}}"]`, cookie)
	_, err := ws.Write([]byte(msg))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Sent cookie: %s\n", cookie)
}
```

and `randCookie()` looks like this:

```go
// randCookie returns 30 random alphanumeric characters (uses math/rand).
func randCookie() string {
	letters := []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890")
	b := make([]rune, 30)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}
```
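
The login frame is JSON inside JSON: an array holding a single string that is itself a JSON document. As a sketch of an alternative to writing the escaped string by hand, calling `json.dumps` twice produces an equivalent structure (only the whitespace differs). `make_login_frame` and `random_cookie` are hypothetical helper names; the field names are taken from the login message shown earlier.

```python
import json
import random
import string


def random_cookie(length=32):
    """Made-up cookie value: random letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


def make_login_frame(cookie):
    """Build the login frame: a JSON array with one JSON-encoded string."""
    inner = json.dumps({
        "action": "login",
        "client_app": "hermes.push",
        "cookies": {"nyt-s": cookie},
    })
    return json.dumps([inner])


login_frame = make_login_frame(random_cookie())
# Decoding twice gets back the original dictionary.
decoded = json.loads(json.loads(login_frame)[0])
print(decoded["action"], len(decoded["cookies"]["nyt-s"]))
```

Letting the JSON encoder do the escaping avoids mistakes if the cookie value ever contains a quote or backslash.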

#### Patiently wait and (mostly) ignore the `h` messages

Nothing much to do here: whenever we get an `h` message, we can simply write `ping` to the console.

```python
elif payload[0] == 'h':
    logging.info('ping')
```

and in Golang:

```go
// 'h' is the keep-alive frame; compare against a byte, not a string
if payload[0] == 'h' {
	log.Println("ping")
}
```

#### Decode the news alert message when we receive one

Messages from the server that start with `a` contain JSON-encoded data that we can decode. Python first:

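```python
elif payload[0] == 'a':
    # skip the leading 'a', then decode the outer JSON document
    frame = json.loads(payload[1:])
    if 'body' in frame:
        # the body is itself a JSON-encoded string
        body = json.loads(frame['body'])
```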

Now you can check whether `body['sub_type'] == "BreakingNews"`, or do whatever else you plan on doing with this.

In Golang everything is a bit more verbose but roughly works the same (inlined and shortened for brevity).

```go
if payload[0] == 'a' {

	// a single object, matching the sample frame shown earlier
	frame := struct {
		UUID    string `json:"uuid"`
		Product string `json:"product"`
		Project string `json:"project"`
		// ...
		Body string `json:"body,omitempty"`
	}{}

	// [1:] as we want to skip the leading character `a`
	err = json.Unmarshal(payload[1:], &frame)
	if err != nil {
		return
	}
	if len(frame.Body) > 1 {
		// here we should try to JSON unmarshal frame.Body
	}
}
```

`frame.Body` can now be unmarshaled in the same way as `payload[1:]` earlier. The resulting struct for it looks something like this:

```go
type MessageBody struct {
	ID           int    `json:"id"`
	Title        string `json:"title"`
	Status       string `json:"status"`
	Version      int    `json:"version"`
	SubType      string `json:"sub_type"`
	Label        string `json:"label"`
	StartTime    int    `json:"start_time"`
	EndTime      int    `json:"end_time"`
	LastModified int    `json:"last_modified"`
	Links        []struct {
		URL       string `json:"url"`
		ContentID string `json:"content_id"`
	} `json:"links"`
}
```


### Sweet but what do I do with this?

Totally up to you. For example, send yourself an email, or a text message using Twilio or Plivo, every time something happens.
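
As one possible starting point, the sketch below turns a decoded `body` into a one-line notification text that could be handed to whatever email or SMS client you use. The function name `format_alert` and the output format are assumptions for illustration; no Twilio or Plivo API is called here.

```python
def format_alert(body):
    """Return a one-line alert for a breaking news body, or None otherwise."""
    if body.get("sub_type") != "BreakingNews":
        return None
    links = body.get("links") or []
    # Use the first link, if there is one.
    url = links[0].get("url", "") if links else ""
    label = body.get("label") or "News"
    title = body.get("title", "")
    return " ".join(part for part in (label + ":", title, url) if part)


# Fields copied from the decoded sample body shown earlier.
sample_body = {
    "sub_type": "BreakingNews",
    "label": "Breaking News",
    "title": "Cleveland Is Said to Settle Justice Department Lawsuit Over Policing",
    "links": [{"url": "http://www.nytimes.com/2015/05/26/us/cleveland-police.html"}],
}
print(format_alert(sample_body))
```

Returning `None` for anything that is not breaking news lets the caller decide whether other message types deserve a notification at all.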

### Cool, how do I run the examples?

#### Python

```
python sockpuppet.py --ws_addr="ws://<<ADDRESS HERE>>"
```

#### Go

```
go run sockpuppet.go --ws_addr="ws://<<ADDRESS HERE>>"
```

You can find a valid websocket host by opening the developer console of your favorite browser, visiting nytimes.com, and looking for websocket connections in the network tab.
