omrilotan / Isbot

Licence: unlicense
💻 JavaScript module that detects bots/crawlers/spiders via the user agent

Projects that are alternatives to or similar to Isbot

Vytal
Browser extension to spoof timezone, geolocation, locale and user agent.
Stars: ✭ 1,449 (+399.66%)
Mutual labels:  user-agent
uainfer
Infer the user agent from its User Agent string
Stars: ✭ 21 (-92.76%)
Mutual labels:  user-agent
whoami.js
A simple and lightweight browser detection and logger library
Stars: ✭ 16 (-94.48%)
Mutual labels:  user-agent
php-useragent
A User-agent analyze project which written by PHP.
Stars: ✭ 83 (-71.38%)
Mutual labels:  user-agent
pumba
Fetch, store and access user agent strings for different browsers
Stars: ✭ 12 (-95.86%)
Mutual labels:  user-agent
vue-if-bot
Hide stuff from bots (especially cookie consents)
Stars: ✭ 62 (-78.62%)
Mutual labels:  user-agent
crawlerdetect
Golang module to detect bots and crawlers via the user agent
Stars: ✭ 22 (-92.41%)
Mutual labels:  user-agent
Useragent Switcher
A User-Agent spoofer browser extension that is highly configurable
Stars: ✭ 261 (-10%)
Mutual labels:  user-agent
userAgentLists
Get your lists of User-Agent Strings here
Stars: ✭ 57 (-80.34%)
Mutual labels:  user-agent
egjs-agent
Extracts browser and operating system information from the user agent string or user agent object(userAgentData).
Stars: ✭ 73 (-74.83%)
Mutual labels:  user-agent
user-agent
User-Agent parser for Clojure
Stars: ✭ 24 (-91.72%)
Mutual labels:  user-agent
react-ua
📱 React User Agent Component, Hook, and HOC. SSR-ready, full UT, using new React Context and Hooks API
Stars: ✭ 18 (-93.79%)
Mutual labels:  user-agent
sawmill
Sawmill is a JSON transformation Java library
Stars: ✭ 92 (-68.28%)
Mutual labels:  user-agent
browserslist-generator
A library that makes generating and validating Browserslists a breeze!
Stars: ✭ 77 (-73.45%)
Mutual labels:  user-agent
Server
Server written in PHP, provides a Javascript API for in the browser
Stars: ✭ 34 (-88.28%)
Mutual labels:  user-agent
robots-parser
NodeJS robots.txt parser with support for wildcard (*) matching.
Stars: ✭ 117 (-59.66%)
Mutual labels:  user-agent
http
Aplus Framework HTTP Library
Stars: ✭ 113 (-61.03%)
Mutual labels:  user-agent
Platform.js
A platform detection library.
Stars: ✭ 2,937 (+912.76%)
Mutual labels:  user-agent
Visitor-Parser-JS
Visitor Parser JS
Stars: ✭ 20 (-93.1%)
Mutual labels:  user-agent
bots-zoo
No description or website provided.
Stars: ✭ 59 (-79.66%)
Mutual labels:  user-agent

isbot 🤖/👨‍🦰

Detect bots/crawlers/spiders using the user agent string.

Usage

Simple detection

const isbot = require('isbot')

// Node.js HTTP (incoming requests expose lowercased header names on `headers`)
isbot(request.headers['user-agent'])

// Express
isbot(req.get('user-agent'))

// User Agent string
isbot('Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)') // true
isbot('Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36') // false

Add crawler user agents

Add rules to the user agent match RegExp

isbot('Mozilla/5.0') // false
isbot.extend([
    'istat',
    '^mozilla/\\d\\.\\d$'
])
isbot('Mozilla/5.0') // true

Remove matches of known crawlers

Remove rules from the user agent match RegExp (see the existing rules in the list.json file)

isbot('Chrome-Lighthouse') // true
isbot.exclude(['chrome-lighthouse']) // pattern is case insensitive
isbot('Chrome-Lighthouse') // false

Verbose result

Return the part of the user agent string that matched a bot rule

isbot.find('Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 DejaClick/2.9.7.2') // 'DejaClick'
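
To make the relationship between the three helpers above easier to picture, here is a minimal sketch of a matcher built from a list of pattern sources. It is not isbot's actual implementation: `createMatcher`, `compile` and the three sample patterns are hypothetical names and values chosen for illustration, and the real rules live in list.json.

```javascript
// Hypothetical sketch, NOT isbot's source: a matcher backed by a list of
// RegExp pattern sources that are joined into one case-insensitive RegExp.
function compile(patterns) {
  // An empty alternation would match everything, so fall back to a
  // RegExp that never matches.
  return patterns.length ? new RegExp(patterns.join('|'), 'i') : /(?!)/
}

function createMatcher(initialPatterns) {
  let patterns = [...initialPatterns]
  let regex = compile(patterns)

  // Simple detection: true when any pattern matches the user agent.
  const matcher = (userAgent) => Boolean(userAgent) && regex.test(userAgent)

  // Add rules, skipping exact duplicates, then rebuild the RegExp.
  matcher.extend = (additions) => {
    patterns = [...new Set([...patterns, ...additions])]
    regex = compile(patterns)
  }

  // Remove rules (compared case-insensitively), then rebuild the RegExp.
  matcher.exclude = (removals) => {
    const drop = new Set(removals.map((p) => p.toLowerCase()))
    patterns = patterns.filter((p) => !drop.has(p.toLowerCase()))
    regex = compile(patterns)
  }

  // Verbose result: the matched substring, or null when nothing matches.
  matcher.find = (userAgent) => {
    const match = String(userAgent || '').match(regex)
    return match ? match[0] : null
  }

  return matcher
}

// Usage with three illustrative patterns (assumed, not taken from list.json)
const looksLikeBot = createMatcher(['googlebot', 'chrome-lighthouse', 'dejaclick'])
looksLikeBot('Mozilla/5.0') // false
looksLikeBot.extend(['^mozilla/\\d\\.\\d$'])
looksLikeBot('Mozilla/5.0') // true
looksLikeBot.exclude(['Chrome-Lighthouse'])
looksLikeBot('Chrome-Lighthouse') // false
looksLikeBot.find('Firefox/52.0 DejaClick/2.9.7.2') // 'DejaClick'
```

Keeping the rules as a list and recompiling on every change keeps each lookup to a single RegExp test, at the cost of rebuilding the expression whenever rules are added or removed.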

Definitions

  • Bot. An autonomous program that imitates or replaces some aspect of human behaviour, performing repetitive tasks much faster than human users could.
  • Good bot. An automated program that visits websites in order to collect useful information. Web crawlers, site scrapers, stress testers, preview builders and similar programs are welcome on most websites because they serve purposes of mutual benefit.
  • Bad bot. A program designed to perform malicious actions, ultimately hurting businesses. Examples include testing credential databases, DDoS attacks and spam bots.

Clarifications

What does "isbot" do?

This package aims to identify "Good bots": those that voluntarily identify themselves with a unique, preferably descriptive, user agent, usually by setting a dedicated request header.

What doesn't "isbot" do?

It does not try to recognise malicious bots or programs disguising themselves as real users.

Why would I want to identify good bots?

Recognising good bots such as web crawlers is useful for multiple purposes. Although it is not recommended to serve different content to web crawlers like Googlebot, you can still elect to:

  • Flag bot pageviews so they can be accounted for in business analysis
  • Prefer serving cached content to relieve service load
  • Omit third-party solutions' code (tags, pixels)
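
As a rough illustration of the three options above, the sketch below turns a bot check into per-request decisions. `planResponse` and its field names are hypothetical, and the detection function is passed in as a parameter so the example runs without any package installed; in practice you would pass `isbot` itself.

```javascript
// Hypothetical helper, not part of isbot: maps a bot check onto the three
// choices listed above. `isBot` is any function (userAgent) => boolean.
function planResponse(userAgent, isBot) {
  const bot = Boolean(isBot(userAgent))
  return {
    flagAsBotPageview: bot,      // keep bot traffic separable in analytics
    preferCachedContent: bot,    // relieve service load for crawler traffic
    includeThirdPartyTags: !bot  // omit tags and pixels for bots
  }
}

// Stand-in predicate for the example only; far cruder than isbot's rules.
const stubIsBot = (userAgent) => /bot|crawler|spider/i.test(userAgent || '')

planResponse('Mozilla/5.0 (compatible; Googlebot/2.1)', stubIsBot)
// { flagAsBotPageview: true, preferCachedContent: true, includeThirdPartyTags: false }
```

Note that the page content itself is the same in both cases; only caching, analytics and auxiliary scripts differ.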

It is not recommended to whitelist requests for any reason based on the user agent header alone. Instead, add other methods of identification, such as a reverse DNS lookup.
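
One common form of the reverse DNS check is a forward-confirmed lookup: resolve the client IP to a hostname, check that the hostname belongs to a domain the crawler operator documents, then resolve that hostname forward and confirm it maps back to the same IP. The sketch below uses Node's built-in `dns.promises` API. `verifyCrawlerIp` is a hypothetical name, and the allowed suffixes are values you must supply from each crawler's own documentation; neither is part of isbot.

```javascript
const dns = require('node:dns')

// Hypothetical helper, not part of isbot. `allowedSuffixes` is an assumption
// supplied by the caller, e.g. hostname suffixes published by a crawler
// operator. `resolver` defaults to Node's dns.promises and can be swapped
// for a fake in tests.
async function verifyCrawlerIp(ip, allowedSuffixes, resolver = dns.promises) {
  let hostnames
  try {
    // Reverse lookup: IP address -> hostnames (PTR records)
    hostnames = await resolver.reverse(ip)
  } catch {
    return false
  }

  for (const hostname of hostnames) {
    const host = hostname.toLowerCase()
    if (!allowedSuffixes.some((suffix) => host.endsWith(suffix))) continue
    try {
      // Forward lookup: hostname -> addresses, which must include the IP.
      // Plain string comparison; IPv6 addresses may need normalising first.
      const records = await resolver.lookup(host, { all: true })
      if (records.some((record) => record.address === ip)) return true
    } catch {
      // Forward lookup failed for this hostname; try the next one.
    }
  }
  return false
}
```

A request would then be trusted as a given crawler only when both the user agent check and this IP check agree. Because each check costs two DNS queries, caching the result per IP is usually worthwhile.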

Data sources

Crawler user agents:

Non-bot user agents:

Missing something? Please open an issue
