
itielshwartz / asyncio-hn

License: MIT
Python (asyncio) wrapper for hackernews api

Programming Languages

python
139,335 projects - #7 most used programming language

Projects that are alternatives to or similar to asyncio-hn

Hackernews React Graphql
Hacker News clone rewritten with universal JavaScript, using React and GraphQL.
Stars: ✭ 4,242 (+15611.11%)
Mutual labels:  hacker-news, hn
hackernews-button
Privacy-preserving Firefox extension linking to Hacker News discussion; built with Bloom filters and WebAssembly
Stars: ✭ 73 (+170.37%)
Mutual labels:  hacker-news, hn
reading-list
A community-driven, high-quality curated reading list
Stars: ✭ 45 (+66.67%)
Mutual labels:  hacker-news
RARBG-scraper
A RARBG scraper with Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (+40.74%)
Mutual labels:  scraping
socials
👨‍👩‍👦 Social account detection and extraction in Python, e.g. for crawling/scraping.
Stars: ✭ 37 (+37.04%)
Mutual labels:  scraping
emacs-hnreader
Read Hacker News inside Emacs
Stars: ✭ 34 (+25.93%)
Mutual labels:  hacker-news
html-table-to-json
Generate JSON representations of HTML tables
Stars: ✭ 39 (+44.44%)
Mutual labels:  scraping
tophn
An application to recommend the topmost story of Hacker News from the last 24 hours
Stars: ✭ 31 (+14.81%)
Mutual labels:  hacker-news
shorter.recipes
A website dedicated to making recipes from any website easy to read.
Stars: ✭ 27 (+0%)
Mutual labels:  scraping
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+2533.33%)
Mutual labels:  scraping
4cat
The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.
Stars: ✭ 144 (+433.33%)
Mutual labels:  scraping
scrapers
scrapers for building your own image databases
Stars: ✭ 46 (+70.37%)
Mutual labels:  scraping
NBA-Fantasy-Optimizer
NBA Daily Fantasy Lineup Optimizer for FanDuel Using Python
Stars: ✭ 21 (-22.22%)
Mutual labels:  scraping
ScrapeBot
A Selenium-driven tool for automated website interaction and scraping.
Stars: ✭ 16 (-40.74%)
Mutual labels:  scraping
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (+355.56%)
Mutual labels:  scraping
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+96.3%)
Mutual labels:  scraping
gochanges
[ARCHIVED] website changes tracker 🔍
Stars: ✭ 12 (-55.56%)
Mutual labels:  scraping
etf4u
📊 Python tool to scrape real-time information about ETFs from the web and mixing them together by proportionally distributing their assets allocation
Stars: ✭ 29 (+7.41%)
Mutual labels:  scraping
Architeuthis
MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.
Stars: ✭ 35 (+29.63%)
Mutual labels:  scraping
ioweb
Web Scraping Framework
Stars: ✭ 31 (+14.81%)
Mutual labels:  scraping

asyncio-hn

Requires Python 3.6+

A simple asyncio wrapper for downloading Hacker News items with speed and ease.

The package supports all endpoints of the official Hacker News API.

Development process: Using asyncio to download Hacker News
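
For context, the official API is a small set of JSON endpoints served from https://hacker-news.firebaseio.com/v0 (topstories, item/<id>, user/<id>, and so on). Below is a minimal raw-aiohttp sketch of what the wrapper automates; the helper names fetch_json and main are illustrative, not part of the package.

import asyncio
import aiohttp

HN_API = "https://hacker-news.firebaseio.com/v0"

async def fetch_json(session, path):
    # Every endpoint returns plain JSON, e.g. /topstories.json or /item/<id>.json
    async with session.get(f"{HN_API}/{path}.json") as resp:
        return await resp.json()

async def main():
    async with aiohttp.ClientSession() as session:
        top_ids = await fetch_json(session, "topstories")      # up to 500 IDs
        story = await fetch_json(session, f"item/{top_ids[0]}")
        author = await fetch_json(session, f"user/{story['by']}")
        print(story["title"], "-", author["karma"], "karma")

loop = asyncio.get_event_loop()
loop.run_until_complete(main())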

Installation

pip install asyncio-hn

Usage

import asyncio
from asyncio_hn import ClientHN

async def main(loop):
    # Initialize the client - an extension of aiohttp.ClientSession
    async with ClientHN(loop=loop) as hn:
        # Up to 500 top stories (IDs only)
        hn_new_stories = await hn.top_stories()
        # Download the data for the top 10 stories
        top_posts = await hn.items(hn_new_stories[:10])
        # Download the user data for each story's author
        users = await hn.users([post.get("by") for post in top_posts])


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))
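
The kids field on each item holds the IDs of its top-level comments, and those are ordinary item IDs, so the same hn.items() call can fetch a story's comment thread as well. A short sketch along the same lines as the example above (the function name comments_example is illustrative, not part of the package):

async def comments_example(loop):
    async with ClientHN(loop=loop) as hn:
        top_ids = await hn.top_stories()
        # Fetch the highest-ranked story, then its top-level comments by ID
        story = (await hn.items(top_ids[:1]))[0]
        comments = await hn.items(story.get("kids", []))
        for comment in comments:
            print(comment.get("by"), "-", (comment.get("text") or "")[:80])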

Advanced usage

With this configuration you can reach 1000+ requests per second.

import asyncio
import aiohttp
from asyncio_hn import ClientHN

N = 1_000_000

async def advance_run(loop):
    # Allow up to 1000 concurrent connections and hand the connector to the client
    conn = aiohttp.TCPConnector(limit=1000, loop=loop)
    async with ClientHN(loop=loop, queue_size=1000, connector=conn, progress_bar=True, debug=True) as hn:
        # Download the last 1,000,000 items
        last_items = await hn.last_n_items(n=N)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(advance_run(loop))

Output examples:

Items:

items = [{'by': 'amzans', 'descendants': 25, 'id': 13566716,
          'kids': [13567061, 13567631, 13567027, 13567055, 13566798, 13567473], 'score': 171, 'time': 1486210548,
          'title': 'Network programming with Go (2012)', 'type': 'story',
          'url': 'https://jannewmarch.gitbooks.io/network-programming-with-go-golang-/content/'},
         {'by': 'r3bl', 'descendants': 1, 'id': 13567940, 'kids': [13568249], 'score': 24, 'time': 1486230224,
          'title': 'YouTube removes hundreds of the best climate science videos from the Internet',
          'type': 'story',
          'url': 'http://climatestate.com/2017/02/03/youtube-removes-hundreds-of-the-best-climate-science-videos-from-the-internet/'}]

User:

user = {'created': 1470758993, 'id': 'amzans', 'karma': 174,
        'submitted': [13567884, 13566716, 13566699, 13558456, 13539270, 13539151, 13514498, 13418469, 13417725,
                      13416562, 13416097, 13416034, 13415954, 13415894, 13395310, 13394996, 13392554, 12418804,
                      12418361, 12413958, 12411992, 12411732, 12411546, 12262383, 12255593]}
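
The two shapes combine naturally. For illustration, using the top_posts and users lists from the usage example above, the following sketch (not part of the package) prints each story with its score and the author's karma:

karma_by_user = {user["id"]: user["karma"] for user in users}
for post in sorted(top_posts, key=lambda p: p.get("score", 0), reverse=True):
    print(post["score"], post["title"], "by", post["by"],
          f"({karma_by_user.get(post['by'], '?')} karma)")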