imanhodjaev / pumba

License: Unlicense
Fetch, store and access user agent strings for different browsers

Language: Elixir

Pumba helps you

  1. Fetch user agent strings for different browsers,
  2. Keep them in in-memory state,
  3. Pick a random user agent,
  4. Profit - Hakuna Matata 🦄

Use cases 🔮

You might want to use Pumba to

  1. Simulate a real user agent when requesting a website or resource,
  2. Swap user agent strings at random in tandem with crawlers,
  3. Build a REST API (or any other API) on top of it, or expose it as a service,
  4. Cover other use cases I'm not aware of (open an issue to share one).

Installation 💾

If available in Hex, the package can be installed by adding pumba to your list of dependencies in mix.exs. If needed, also add it to your application list so that it is started:

def deps do
  [
    {:pumba, "~> 0.0.2"}
  ]
end

Usage 🧠

Load user agents

To load user agent strings for a given browser, call

iex> Pumba.load("Firefox")
:ok

Get random user agent

To get a random user agent string from the loaded browsers, call

iex> Pumba.random()
"Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0"
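For the crawler use case, a random user agent typically ends up in the request headers. The following is a minimal sketch of that idea, not part of Pumba: the module name CrawlerHeaders and the header list are assumptions made for this example. The picker is passed in as a function so the snippet runs on its own; with Pumba loaded, the picker could be &Pumba.random/0.

```elixir
defmodule CrawlerHeaders do
  # Hypothetical helper, not part of Pumba.
  # `pick` is any zero-arity function that returns a user agent string.
  def build(pick) when is_function(pick, 0) do
    [
      {"user-agent", pick.()},
      {"accept", "text/html"}
    ]
  end
end

# Stand-in picker so the sketch runs without Pumba started.
# In a real crawler this could be `&Pumba.random/0`.
pick = fn ->
  "Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0"
end

headers = CrawlerHeaders.build(pick)
```

Calling the picker once per request gives each request a freshly chosen user agent.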

Check if user agents loaded

iex> Pumba.ready?("Firefox")
true

Get user agents for browser

iex> Pumba.get("Firefox")
[
  "Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0",
  ...
]

Set custom client

The default client fetches user agents from http://www.useragentstring.com.

There are two ways to set a custom client: via configuration, or by overriding it at runtime.

Configuration

config :pumba,
  client: MyAwesomeClient

Runtime

iex> Pumba.set_client(MyAwesomeClient)

Create custom client

A custom client must implement the Pumba.Client behaviour.
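For readers new to behaviours, the general pattern looks like the following. This is only a sketch: the callbacks that Pumba.Client actually requires are not listed in this README, so the example defines its own stand-in behaviour, Sketch.Client, with a hypothetical load/1 callback, and a static client that implements it. Check the Pumba.Client documentation for the real callback names before writing a client.

```elixir
defmodule Sketch.Client do
  # Stand-in behaviour for illustration only; NOT the real Pumba.Client.
  # The callback name and types below are assumptions.
  @callback load(browser :: String.t()) ::
              {:ok, [String.t()]} | {:error, term()}
end

defmodule Sketch.StaticClient do
  # Implements the stand-in behaviour with a hard-coded list
  # instead of an HTTP request.
  @behaviour Sketch.Client

  @agents %{
    "Firefox" => [
      "Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0"
    ]
  }

  @impl true
  def load(browser) do
    case Map.fetch(@agents, browser) do
      {:ok, agents} -> {:ok, agents}
      :error -> {:error, :unknown_browser}
    end
  end
end
```

A client written against the real behaviour would then be registered through the configuration or runtime options described above.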

User agents storage

Storage is a GenServer that lives at Pumba.UserAgents and holds the following state:

%{
  client: Pumba.Client.DefaultClient,
  browsers: %{},
  names: []
}

Here browsers is a map keyed by browser name, and names is a list of the loaded browser names, from which a browser can be picked at random before returning one of its user agents.

Each value in browsers is a %Pumba.Result{} record, which keeps the total count of user agents and an indexed map of user agent strings for fast lookups.
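The shape described above can be sketched in plain Elixir. This is a minimal illustration, not Pumba's actual code: the module name IndexedAgents and the keys :total and :user_agents are assumptions made for this example, since the real fields of %Pumba.Result{} are not listed here. It shows why storing a count next to an integer-indexed map makes a random pick cheap: one random number and one map lookup, with no list traversal.

```elixir
defmodule IndexedAgents do
  # Hypothetical stand-in for the record kept per browser: a total count
  # plus a map from integer index to user agent string.
  def build(user_agents) when is_list(user_agents) do
    indexed =
      user_agents
      |> Enum.with_index()
      |> Map.new(fn {ua, index} -> {index, ua} end)

    %{total: map_size(indexed), user_agents: indexed}
  end

  # Picks a random entry with a single map lookup; returns nil when empty.
  def random(%{total: 0}), do: nil

  def random(%{total: total, user_agents: indexed}) do
    # :rand.uniform/1 returns an integer in 1..total, so shift to 0-based.
    Map.fetch!(indexed, :rand.uniform(total) - 1)
  end
end

firefox = [
  "Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0"
]

# Mirrors the storage state: browser name => record, plus loaded names.
state = %{
  browsers: %{"Firefox" => IndexedAgents.build(firefox)},
  names: ["Firefox"]
}

# Pick a random loaded browser, then a random user agent for it.
name = Enum.random(state.names)
user_agent = IndexedAgents.random(state.browsers[name])
```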

Get current state

To get the current state, use the Pumba.all/0 function.

Documentation 📜

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/pumba.

Assets 💄

https://www.flickr.com/photos/15622979@N07/4329873905
