joeyism / Linkedin_scraper

Licence: gpl-3.0

A library that scrapes Linkedin for user data

Programming Languages

python

139335 projects - #7 most used programming language

Labels

chrome firefox driver scraper profile linkedin users

Projects that are alternatives of or similar to Linkedin scraper

Browser Addon

Kee adds free, secure and easy password management features to your browser which save time and keep your private data more secure.

Stars: ✭ 386 (-6.54%)

Mutual labels: chrome, firefox

Copy As Markdown

Copying Link, Image and Tab(s) as Markdown Much Easier.

Stars: ✭ 332 (-19.61%)

Mutual labels: chrome, firefox

Crx Selection Translate

One-stop translation extension for selected text, screenshots, full web pages, and audio/video.

Stars: ✭ 3,603 (+772.4%)

Mutual labels: chrome, firefox

Singlefile

Web Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a single HTML file

Stars: ✭ 4,417 (+969.49%)

Mutual labels: chrome, firefox

Extanalysis

Browser Extension Analysis Framework - Scan, Analyze Chrome, firefox and Brave extensions for vulnerabilities and intels

Stars: ✭ 351 (-15.01%)

Mutual labels: chrome, firefox

Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

Stars: ✭ 309 (-25.18%)

Mutual labels: linkedin, scraper

Easy To Rss

🚀 Chrome/Firefox extension to retrieve RSS feed URLs from websites, RSSHub supported

Stars: ✭ 386 (-6.54%)

Mutual labels: chrome, firefox

Searchwithmybrowser

Open Cortana searches with your default browser.

Stars: ✭ 285 (-30.99%)

Mutual labels: chrome, firefox

Org Capture Extension

A Chrome and firefox extension facilitating org-capture in emacs

Stars: ✭ 396 (-4.12%)

Mutual labels: chrome, firefox

Headereditor

Manage the browser's requests, including modifying request and response headers, redirecting requests, and cancelling requests

Stars: ✭ 338 (-18.16%)

Mutual labels: chrome, firefox

Paint Github

[WebExtension] Draw your GitHub comment.

Stars: ✭ 302 (-26.88%)

Mutual labels: chrome, firefox

Webextension Toolbox

Small CLI toolbox for cross-browser WebExtension development

Stars: ✭ 365 (-11.62%)

Mutual labels: chrome, firefox

Arduino Create Agent

The Arduino Create Agent

Stars: ✭ 298 (-27.85%)

Mutual labels: chrome, firefox

Roam Toolkit

Roam force multiplier

Stars: ✭ 390 (-5.57%)

Mutual labels: chrome, firefox

Fofa view

FOFA Pro view is a browser plugin for displaying FOFA Pro assets, currently compatible with Chrome, Firefox, and Opera.

Stars: ✭ 291 (-29.54%)

Mutual labels: chrome, firefox

Melonjs

a fresh & lightweight javascript game engine

Stars: ✭ 3,721 (+800.97%)

Mutual labels: chrome, firefox

Hackbrowserdata

Decrypt passwords/cookies/history/bookmarks from the browser. A cross-platform tool for exporting and decrypting browser data.

Stars: ✭ 3,864 (+835.59%)

Mutual labels: chrome, firefox

Jjb

A Chrome extension that helps you automatically apply for JD.com price protection

Stars: ✭ 3,083 (+646.49%)

Mutual labels: chrome, firefox

Surfingkeys

Map your keys for web surfing, expand your browser with javascript and keyboard.

Stars: ✭ 3,787 (+816.95%)

Mutual labels: chrome, firefox

History Master

📈📉📊A Firefox/Chrome extension to visualize browsing history, sync among different browsers!

Stars: ✭ 356 (-13.8%)

Mutual labels: chrome, firefox

Linkedin Scraper

Scrapes Linkedin User Data

Installation

pip3 install --user linkedin_scraper

Versions 2.0.0 and earlier are published as linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper

Setup

First, set the location of your chromedriver:

export CHROMEDRIVER=~/chromedriver
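
The exported variable can also be checked from Python before a browser is started. The sketch below is a hypothetical helper (resolve_chromedriver is not part of linkedin_scraper); it assumes only that the variable is named CHROMEDRIVER, as in the export above, and it expands the ~ and confirms that the file exists.

```python
import os


def resolve_chromedriver(env=None):
    """Return the absolute chromedriver path taken from CHROMEDRIVER.

    Hypothetical helper, not part of linkedin_scraper. `env` defaults to
    os.environ and is a parameter only so the lookup is easy to test.
    """
    env = os.environ if env is None else env
    raw = env.get("CHROMEDRIVER")
    if not raw:
        raise RuntimeError("CHROMEDRIVER is not set; export it first")
    # Expand a leading ~ the same way the shell would, then normalise.
    path = os.path.abspath(os.path.expanduser(raw))
    if not os.path.isfile(path):
        raise FileNotFoundError(f"no chromedriver found at {path}")
    return path
```

Failing early here gives a clearer message than a driver start-up error later on.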

Usage

To use it, just create the class.

Sample Usage

from linkedin_scraper import Person, actions
from selenium import webdriver
driver = webdriver.Chrome()

email = "[email protected]"
password = "password123"
actions.login(driver, email, password)  # if email and password aren't given, it prompts in the terminal
person = Person("https://www.linkedin.com/in/andre-iguodala-65b48ab5", driver=driver)

NOTE: The account used to log in should have its language set to English to make sure everything works as expected.

User Scraping

from linkedin_scraper import Person
person = Person("https://www.linkedin.com/in/andre-iguodala-65b48ab5")

Company Scraping

from linkedin_scraper import Company
company = Company("https://ca.linkedin.com/company/google")

Scraping sites where login is required first

Run ipython or python
In ipython/python, run the following code (you can modify it if you need to specify your driver)

from linkedin_scraper import Person
from selenium import webdriver
driver = webdriver.Chrome()
person = Person("https://www.linkedin.com/in/andre-iguodala-65b48ab5", driver = driver, scrape=False)

Log in to LinkedIn
[OPTIONAL] Log out of LinkedIn
In the same ipython/python session, run

person.scrape()

The reason is that LinkedIn has recently blocked people from viewing certain profiles without having previously signed in. Setting scrape=False stops the profile from being scraped automatically, but Chrome still opens the LinkedIn page. You can log in and log out; the cookie stays in the browser, and it won't affect your profile views. When you then run person.scrape(), it scrapes the profile and closes the browser. To keep the browser open so you can scrape other profiles, run

person.scrape(close_on_complete=False)

NOTE: For version >= 2.1.0, scraping can also occur while logged in. Beware that users will be able to see that you viewed their profile.
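
The keep-the-browser-open pattern described above extends naturally to a list of profiles. The following is a minimal sketch, not part of the library: scrape_many and its make_person factory argument are hypothetical names, and the only library behaviour it relies on is scrape(close_on_complete=False).

```python
def scrape_many(urls, make_person):
    """Scrape several profiles in turn, keeping the browser open between them.

    `make_person` is a caller-supplied factory (an assumption of this sketch),
    for example: lambda url: Person(url, driver=driver, scrape=False)
    """
    people = []
    for url in urls:
        person = make_person(url)
        # close_on_complete=False leaves the shared driver running for the next URL.
        person.scrape(close_on_complete=False)
        people.append(person)
    return people
```

With a driver that is already signed in, this might be called as scrape_many(urls, lambda url: Person(url, driver=driver, scrape=False)), followed by driver.quit() once every profile has been processed.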

Scraping sites and login automatically

From version 2.4.0 on, actions is a part of the library that allows signing in to LinkedIn first. The email and password can be passed as arguments to the function. If they are not provided, both will be prompted for in the terminal.

from linkedin_scraper import Person, actions
from selenium import webdriver
driver = webdriver.Chrome()
email = "[email protected]"
password = "password123"
actions.login(driver, email, password)  # if email and password aren't given, it prompts in the terminal
person = Person("https://www.linkedin.com/in/andre-iguodala-65b48ab5", driver=driver)
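
To avoid hardcoding a password in a script, the credentials could be read from the environment instead. This is a sketch under stated assumptions: credentials_from_env is a hypothetical helper, and LINKEDIN_EMAIL and LINKEDIN_PASSWORD are variable names chosen for illustration, not names the library reads.

```python
import os


def credentials_from_env(env=None):
    """Read login details from LINKEDIN_EMAIL / LINKEDIN_PASSWORD.

    Both variable names are assumptions of this sketch. Returns (None, None)
    unless both are set, so the caller can fall back to the terminal prompt.
    """
    env = os.environ if env is None else env
    email = env.get("LINKEDIN_EMAIL")
    password = env.get("LINKEDIN_PASSWORD")
    if email and password:
        return email, password
    return None, None
```

When both values are present they can be passed on as actions.login(driver, email, password); otherwise calling actions.login(driver) relies on the prompting behaviour described above.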

API

Person

A Person object can be created with the following inputs:

Person(linkedin_url=None, name=None, about=[], experiences=[], educations=[], interests=[], accomplishments=[], company=None, job_title=None, driver=None, scrape=True)

`linkedin_url`

This is the linkedin url of their profile

`name`

This is the name of the person

`about`

This is the small paragraph about the person

`experiences`

These are their past experiences. A list of linkedin_scraper.scraper.Experience

`educations`

This is their past education. A list of linkedin_scraper.scraper.Education

`interests`

These are their interests. A list of linkedin_scraper.scraper.Interest

`accomplishments`

These are their accomplishments. A list of linkedin_scraper.scraper.Accomplishment

`company`

This is the most recent company or institution they have worked at.

`job_title`

This is their most recent job title.

`driver`

This is the driver used to scrape the LinkedIn profile. A driver using Chrome is created by default. However, if a driver is passed in, that will be used instead.

For example

driver = webdriver.Chrome()
person = Person("https://www.linkedin.com/in/andre-iguodala-65b48ab5", driver = driver)

`scrape`

When this is True, scraping happens automatically when the object is created. To scrape later, set it to False and call scrape() on the Person object.

`scrape(close_on_complete=True)`

This is the core of the library: calling this function scrapes the profile. If close_on_complete is True (the default), the browser closes when scraping completes. To scrape other profiles afterwards, set it to False so you can keep using the same driver.
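
Once a profile has been scraped, the fields listed above can be gathered into a plain dict, for instance to log or serialise them. The sketch below is hypothetical (person_summary is not part of the library) and assumes that a Person exposes attributes with the same names as the constructor arguments. List entries are converted with str() because the layout of Experience, Education and the other classes is not described here.

```python
from types import SimpleNamespace

# Field names copied from the Person signature documented above.
PERSON_FIELDS = (
    "linkedin_url", "name", "about", "experiences", "educations",
    "interests", "accomplishments", "company", "job_title",
)


def person_summary(person):
    """Collect the documented Person fields into a plain dict."""
    summary = {}
    for field in PERSON_FIELDS:
        # Missing attributes become None rather than raising.
        value = getattr(person, field, None)
        if isinstance(value, (list, tuple)):
            value = [str(item) for item in value]
        summary[field] = value
    return summary


# Stand-in object for demonstration; a scraped Person would be passed instead.
demo = SimpleNamespace(name="Jane Doe", job_title="Engineer", experiences=["Acme"])
print(person_summary(demo)["name"])
```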

Company

Company(linkedin_url=None, name=None, about_us=None, website=None, headquarters=None, founded=None, company_type=None, company_size=None, specialties=None, showcase_pages=[], affiliated_companies=[], driver=None, scrape=True, get_employees=True)

`linkedin_url`

This is the LinkedIn URL of the company page

`name`

This is the name of the company

`about_us`

The description of the company

`website`

The website of the company

`headquarters`

The headquarters location of the company

`founded`

When the company was founded

`company_type`

The type of the company

`company_size`

How many people are employed at the company

`specialties`

What the company specializes in

`showcase_pages`

Pages that the company owns to showcase their products

`affiliated_companies`

Other companies that are affiliated with this one

`driver`

This is the driver used to scrape the LinkedIn company page. A driver using Chrome is created by default. However, if a driver is passed in, that will be used instead.

`get_employees`

Whether to get all the employees of the company

For example

driver = webdriver.Chrome()
company = Company("https://ca.linkedin.com/company/google", driver=driver)

`scrape(close_on_complete=True)`

This is the core of the library: calling this function scrapes the company. If close_on_complete is True (the default), the browser closes when scraping completes. To scrape other companies afterwards, set it to False so you can keep using the same driver.

Contribution

Stars: ✭ 413
