genkio / Spider Less

Licence: MIT
Web spider as a service, spider on serverless

spider-less

Web spider on Serverless!

About Spiderless

Spiderless is the backend layer of KMPPP, a web-spider-as-a-service application that lets you monitor, and get notified about, nearly anything on the web. It is built on top of these technologies:

Technology         Used For
Bulma, Buefy       UI
Vue.js             Front-end logic
AWS S3             Website hosting
AWS Lambda         Backend API
AWS SNS            Message queue
AWS DynamoDB       Database
AWS API Gateway    API gateway
AWS CloudFront     CDN
AWS Route 53       DNS

Architecture

serverless application architecture

API Endpoints

GET subscriptions

Description

Get a list of subscriptions. A single response holds at most 1 MB of data, a limit imposed by DynamoDB.

Parameters

None

Request

curl /api/subscriptions

Response

[
  {
    "createdAt": 1544833435070,
    "targets": [
      {
        "selector":"#title-overview-widget > div.vital > div.title_block > div > div.ratings_wrapper > div.imdbRating > a > span",
        "label":"ratingCount"
      }
    ],
    "id": "b4d98de0-ffff-11e8-a4c9-9b9ee9089058",
    "url": "https://www.imdb.com/title/tt0111161/",
    "interval": 60
  }
]

POST subscriptions

Description

Create a new subscription to feed the spider.

Parameters

  • url (required) - Target website URL
  • targets (required) - List of labelled CSS selectors (objects with a label and a selector) whose text content is to be extracted
  • interval (required) - The interval (in minutes) between scrapes

Request

curl -X POST /api/subscriptions -d '{"url":"https://www.imdb.com/title/tt0111161/","targets":"[{\"label\":\"ratingCount\",\"selector\":\"#title-overview-widget > div.vital > div.title_block > div > div.ratings_wrapper > div.imdbRating > a > span\"}]","interval":"60"}' -H "Content-Type: application/json"

Response

{
  "id": "ef417d30-ffff-11e8-a4c9-9b9ee9089058",
  "url": "https://www.imdb.com/title/tt0111161/",
  "targets": [
    {
      "label":"ratingCount",
      "selector":"#title-overview-widget > div.vital > div.title_block > div > div.ratings_wrapper > div.imdbRating > a > span"
    }
  ],
  "interval": 60,
  "createdAt": 1544833533059,
  "updatedAt": 1544833533059
}
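
The request body above can also be assembled in code. The following is a minimal sketch, not part of Spiderless itself: buildSubscriptionBody is a hypothetical helper, and its validation rules (non-empty targets, a positive whole number of minutes) are assumptions. It mirrors the curl example, which sends targets as a JSON-encoded string and interval as a string; whether the API also accepts a plain array is not stated here.

```javascript
// Hypothetical helper: builds the JSON body for POST /api/subscriptions.
// The validation rules below are assumptions, not documented API behaviour.
function buildSubscriptionBody({ url, targets, interval }) {
  if (typeof url !== 'string' || url.length === 0) {
    throw new TypeError('url is required');
  }
  if (!Array.isArray(targets) || targets.length === 0) {
    throw new TypeError('targets must be a non-empty array');
  }
  for (const target of targets) {
    if (!target || typeof target.label !== 'string' || typeof target.selector !== 'string') {
      throw new TypeError('each target needs a string label and a string selector');
    }
  }
  const minutes = Number(interval);
  if (!Number.isInteger(minutes) || minutes <= 0) {
    throw new TypeError('interval must be a positive whole number of minutes');
  }
  // Mirror the curl example: targets as a JSON-encoded string, interval as a string.
  return JSON.stringify({
    url,
    targets: JSON.stringify(targets),
    interval: String(minutes),
  });
}
```

The returned string can be passed as the -d argument to curl, or as the body of any HTTP client call that sets Content-Type: application/json.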

DELETE subscriptions

Description

Delete a subscription.

Parameters

  • id (required) - Subscription id

Request

curl -X DELETE /api/subscriptions/:id

Response

{
  "id": "d72c05d0-ffff-11e8-a4c9-9b9ee9089058"
}
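
In the request above, :id is a placeholder for the subscription id. As a small sketch of how a client might fill it in, assuming a hypothetical buildDeleteRequest helper and a caller-supplied base URL (the host is not given in this document):

```javascript
// Hypothetical helper: describes the DELETE request for one subscription.
// baseUrl is supplied by the caller; this README does not state the host.
function buildDeleteRequest(baseUrl, id) {
  if (typeof id !== 'string' || id.length === 0) {
    throw new TypeError('id is required');
  }
  // Drop trailing slashes so the path is not doubled up, then escape the id.
  const root = baseUrl.replace(/\/+$/, '');
  return {
    method: 'DELETE',
    url: `${root}/api/subscriptions/${encodeURIComponent(id)}`,
  };
}
```

With Node.js 18 or later, the result can be handed to the built-in fetch as fetch(request.url, { method: request.method }).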

Functions List

scrape

Description

Scrape the target website and extract the text content matched by each target selector.

Invoke

yarn invoke:local scrape -d '{"createdAt":1544833435070,"updatedAt":1544833435070,"targets":[{"selector":"#title-overview-widget > div.vital > div.title_block > div > div.ratings_wrapper > div.imdbRating > a > span","label":"ratingCount"}],"id":"b4d98de0-ffff-11e8-a4c9-9b9ee9089058","url":"https://www.imdb.com/title/tt0111161/","interval":60}'

Response

[
  {
    "label": "ratingCount",
    "content": "2,025,796"
  }
]
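
The mapping from a subscription's targets to the response shape above can be sketched as follows. This is an illustration under assumptions, not the actual scrape function: extractTargets is a hypothetical name, the HTML lookup is injected as queryText because this document does not say which parser Spiderless uses, and returning null for a selector that matches nothing is a guess.

```javascript
// Hypothetical sketch: turn a list of targets into [{ label, content }].
// queryText(selector) is supplied by the caller and should return the text
// content of the first matching element, or undefined/null when none matches.
function extractTargets(targets, queryText) {
  return targets.map(({ label, selector }) => {
    const text = queryText(selector);
    return {
      label,
      // Assumption: unmatched selectors are reported as null.
      content: typeof text === 'string' ? text.trim() : null,
    };
  });
}
```

Injecting the lookup keeps the sketch independent of any particular DOM library and makes it easy to exercise with a plain Map of selector-to-text pairs.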

cron

Description

Fetch subscriptions from the database and select the ones that are due to be executed.
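
One way to decide which subscriptions are due is sketched below. The rule is an assumption, since this document does not describe the actual one: a subscription counts as due once at least interval minutes have passed since its updatedAt timestamp (or createdAt when updatedAt is absent). selectDueSubscriptions is a hypothetical name.

```javascript
const MS_PER_MINUTE = 60 * 1000;

// Hypothetical sketch: keep only the subscriptions whose interval has elapsed.
// Assumption: updatedAt (falling back to createdAt) marks the last run, in
// milliseconds since the Unix epoch, as in the sample records above.
function selectDueSubscriptions(subscriptions, nowMs = Date.now()) {
  return subscriptions.filter((subscription) => {
    const lastRunMs = subscription.updatedAt ?? subscription.createdAt;
    return nowMs - lastRunMs >= subscription.interval * MS_PER_MINUTE;
  });
}
```

Passing nowMs explicitly keeps the function deterministic, which helps when checking the rule against fixed timestamps.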

Invoke

yarn invoke:local cron

Response

None

Development

# install dependencies
yarn install

# start api server on port 8090
yarn start

# invoke function locally
yarn invoke:local function_name

# invoke remote function
yarn invoke function_name

Deploy

# first set up your AWS credentials: https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html
yarn deploy