dmitriiweb / extract-emails

License: MIT
Extract emails from a given website

Programming Languages

python
139335 projects - #7 most used programming language
Makefile
30231 projects

Projects that are alternatives of or similar to extract-emails

pe
Fastest general-purpose parsing library for Python with a familiar API
Stars: ✭ 21 (-63.79%)
Mutual labels:  parsing, parsing-library
Nearley
📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
Stars: ✭ 3,089 (+5225.86%)
Mutual labels:  parsing, parsing-library
FAParser
JSON Parsing + Archiving & Unarchiving in User Defaults
Stars: ✭ 67 (+15.52%)
Mutual labels:  parsing, parsing-library
angel.co-companies-list-scraping
No description or website provided.
Stars: ✭ 54 (-6.9%)
Mutual labels:  scraper, parsing
Scrapysharp
reborn of https://bitbucket.org/rflechner/scrapysharp
Stars: ✭ 226 (+289.66%)
Mutual labels:  scraper, parsing
yellowpages-scraper
Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.
Stars: ✭ 56 (-3.45%)
Mutual labels:  scraper, parsing
GreynirPackage
The Greynir NLP parser for Icelandic, packaged for PyPI
Stars: ✭ 49 (-15.52%)
Mutual labels:  parsing, parsing-library
Jikan
Unofficial MyAnimeList PHP+REST API which provides functions other than the official API
Stars: ✭ 531 (+815.52%)
Mutual labels:  scraper, parsing
Goose Parser
Universal scraping tool, which allows you to extract data using multiple environments
Stars: ✭ 211 (+263.79%)
Mutual labels:  scraper, parsing
DotGrok
Parse text with pattern. Inspired by grok filter.
Stars: ✭ 26 (-55.17%)
Mutual labels:  parsing, parsing-library
parson
Yet another PEG parser combinator library and DSL
Stars: ✭ 52 (-10.34%)
Mutual labels:  parsing, parsing-library
rose
Analyse all kinds of data for a TV series
Stars: ✭ 34 (-41.38%)
Mutual labels:  scraper
re-typescript
An opinionated attempt at finally solving typescript interop for ReasonML / OCaml.
Stars: ✭ 68 (+17.24%)
Mutual labels:  parsing
codeparser
Parse Wolfram Language source code as abstract syntax trees (ASTs) or concrete syntax trees (CSTs)
Stars: ✭ 84 (+44.83%)
Mutual labels:  parsing
trawler
scraper for facebook, gab, google and tiktok
Stars: ✭ 20 (-65.52%)
Mutual labels:  scraper
pysub-parser
Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt).
Stars: ✭ 40 (-31.03%)
Mutual labels:  parsing
twpy
Twitter High level scraper for humans.
Stars: ✭ 58 (+0%)
Mutual labels:  scraper
microformats-ruby
Ruby gem that parses HTML containing microformats/microformats2 and returns Ruby objects, a Ruby hash or a JSON hash
Stars: ✭ 89 (+53.45%)
Mutual labels:  parsing
elite-journal
Parsing the Elite: Dangerous journal and putting it into a cool format.
Stars: ✭ 34 (-41.38%)
Mutual labels:  parsing
python web scraping
Web scraping using python, requests and selenium
Stars: ✭ 40 (-31.03%)
Mutual labels:  scraper

Extract Emails

Extract emails and LinkedIn profiles from a given website.

Support the project with BTC: bc1q0cxl5j3se0ufhr96h8x0zs8nz4t7h6krrxkd6l

Documentation

Requirements

  • Python >= 3.7

Installation

pip install extract_emails

Simple Usage

As a library

from pathlib import Path

from extract_emails import DefaultFilterAndEmailFactory as Factory
from extract_emails import DefaultWorker
from extract_emails.browsers.requests_browser import RequestsBrowser as Browser
from extract_emails.data_savers import CsvSaver


# Websites to scan
websites = [
    "website1.com",
    "website2.com",
]

# One browser and one saver are reused for every website
browser = Browser()
# save_mode="a" appends to output.csv instead of overwriting it
data_saver = CsvSaver(save_mode="a", output_path=Path("output.csv"))

for website in websites:
    # depth and max_links_from_page limit how far the scan goes from the start page
    factory = Factory(
        website_url=website, browser=browser, depth=5, max_links_from_page=1
    )
    worker = DefaultWorker(factory)
    data = worker.get_data()
    data_saver.save(data)

As a CLI tool

$ extract-emails --help

$ extract-emails --url https://en.wikipedia.org/wiki/Email -of output.csv -d 1
$ cat output.csv
email,page,website
[email protected],https://en.wikipedia.org/wiki/Email,https://en.wikipedia.org/wiki/Email
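
Post-processing the CSV

The CSV shown above has three columns: email, page and website. The following is a minimal sketch, not part of extract_emails itself, of how such a file could be post-processed with the Python standard library. The helper name emails_by_website and the sample rows are hypothetical, and lowercasing addresses to deduplicate them is an assumption that may not suit every mail provider.

```python
import csv
import io
from collections import defaultdict


def emails_by_website(csv_file):
    """Group unique emails by the website they were found on.

    Hypothetical helper: expects a text file object whose header
    row is email,page,website (the columns shown in the CLI output).
    """
    grouped = defaultdict(set)
    for row in csv.DictReader(csv_file):
        # Assumption: addresses differing only in case are duplicates.
        grouped[row["website"]].add(row["email"].strip().lower())
    return {site: sorted(emails) for site, emails in grouped.items()}


# Made-up sample rows that reuse the header from the CLI output.
sample = io.StringIO(
    "email,page,website\n"
    "info@example.com,https://example.com/contact,https://example.com\n"
    "Info@Example.com,https://example.com/about,https://example.com\n"
    "sales@example.org,https://example.org/,https://example.org\n"
)
result = emails_by_website(sample)
print(result)
```

To run it against a real file, open the file with newline="" as the csv module recommends, for example: with open("output.csv", newline="") as f: emails_by_website(f).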