kuutsav / leetcode-compensation

Licence: other
Compensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.

Programming Languages

python

Leetcode Compensations report

Scraping and analysis of the leetcode-compensations page (for India).

Check out https://github.com/kuutsav/LeetComp for an interactive version of this report.

Salary

Reports

fixed salary - 5th Jan 2019 - 18th Jan 2022

total salary - 5th Jan 2019 - 18th Jan 2022

fixed salary, dark mode - 5th Jan 2019 - 18th Jan 2022

total salary, dark mode - 5th Jan 2019 - 18th Jan 2022

Directory structure

Folder          Description
1. data
1.1. imgs       images for reports
1.2. logs       scraping logs
1.3. mappings   standardized company, location and title mappings as well as unmapped entities
1.4. meta       meta information for the posts like post_id, date, title, href
1.5. out        data from info.all_info.get_clean_records_for_india()
1.6. posts      text from the post
1.7. reports    salary analysis by companies, titles and experience
2. info         functions to parse data from posts (along with the standardized entities) in a tabular format
3. leetcode     scraper
4. utils        constants and helper methods

Setup

  1. Clone the repo.
  2. Put the chromedriver in the utils directory.
  3. Set up a virtual environment - python -m venv leetcode.
  4. Install the necessary packages - pip install -r requirements.txt.
  5. To create the reports - npm install vega-lite vega-cli canvas (needed to save altair plots).

Generating reports

Scraping

$ export PYTHONPATH=<project_directory>
$ python leetcode/posts_meta.py --till_date 2021/08/03

# sample output
2021-08-03 19:36:07.474 | INFO     | __main__:<module>:48 - page no: 1 | # posts: 15

$ python leetcode/posts.py

# sample output
2021-08-03 19:36:25.997 | INFO     | __main__:<module>:45 - post_id: 1380805 done!
2021-08-03 19:36:28.995 | INFO     | __main__:<module>:45 - post_id: 1380646 done!
2021-08-03 19:36:31.631 | INFO     | __main__:<module>:45 - post_id: 1380542 done!
2021-08-03 19:36:34.727 | INFO     | __main__:<module>:45 - post_id: 1380068 done!
2021-08-03 19:36:37.280 | INFO     | __main__:<module>:45 - post_id: 1379990 done!
2021-08-03 19:36:40.509 | INFO     | __main__:<module>:45 - post_id: 1379903 done!
2021-08-03 19:36:41.096 | WARNING  | __main__:<module>:34 - sleeping extra for post_id: 1379487
2021-08-03 19:36:44.530 | INFO     | __main__:<module>:45 - post_id: 1379487 done!
2021-08-03 19:36:47.115 | INFO     | __main__:<module>:45 - post_id: 1379208 done!
2021-08-03 19:36:49.660 | INFO     | __main__:<module>:45 - post_id: 1378689 done!
2021-08-03 19:36:50.470 | WARNING  | __main__:<module>:34 - sleeping extra for post_id: 1378620
2021-08-03 19:36:53.866 | INFO     | __main__:<module>:45 - post_id: 1378620 done!
2021-08-03 19:36:57.203 | INFO     | __main__:<module>:45 - post_id: 1378334 done!
2021-08-03 19:37:00.570 | INFO     | __main__:<module>:45 - post_id: 1378288 done!
2021-08-03 19:37:03.226 | INFO     | __main__:<module>:45 - post_id: 1378181 done!
2021-08-03 19:37:05.895 | INFO     | __main__:<module>:45 - post_id: 1378113 done!
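
The scraper output above follows a regular `timestamp | level | source - message` layout, so it can be summarized after a run. The following is a minimal sketch, not part of the repo: the function name `summarize_scrape_log` and the regular expressions are assumptions derived only from the sample lines shown above.

```python
import re
from collections import Counter

# Layout assumed from the sample output:
# 2021-08-03 19:36:25.997 | INFO     | __main__:<module>:45 - post_id: 1380805 done!
LOG_LINE = re.compile(
    r"^(?P<ts>\S+ \S+) \| (?P<level>\w+)\s*\| (?P<source>\S+) - (?P<message>.*)$"
)
POST_ID = re.compile(r"post_id: (\d+)")


def summarize_scrape_log(lines):
    """Collect finished post ids, post ids that needed an extra sleep, and level counts."""
    done, slept = [], []
    levels = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines, shell prompts and comments
        levels[match["level"]] += 1
        post = POST_ID.search(match["message"])
        if post is None:
            continue  # e.g. the "page no: ..." lines from posts_meta.py
        if match["message"].endswith("done!"):
            done.append(post.group(1))
        elif "sleeping extra" in match["message"]:
            slept.append(post.group(1))
    return {"done": done, "slept": slept, "levels": dict(levels)}


sample = [
    "2021-08-03 19:36:40.509 | INFO     | __main__:<module>:45 - post_id: 1379903 done!",
    "2021-08-03 19:36:41.096 | WARNING  | __main__:<module>:34 - sleeping extra for post_id: 1379487",
    "2021-08-03 19:36:44.530 | INFO     | __main__:<module>:45 - post_id: 1379487 done!",
]
summary = summarize_scrape_log(sample)
print(summary)
```

Comparing the `done` list against the ids in the meta files is one way to spot posts that were skipped, assuming the ids are stored there as the directory table suggests.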

Generating pandas DataFrame for the reports

$ ipython

In [1]: from info.all_info import get_clean_records_for_india
In [2]: df = get_clean_records_for_india()
2021-08-04 15:47:11.615 | INFO     | info.all_info:get_raw_records:95 - n records: 4134
2021-08-04 15:47:11.616 | WARNING  | info.all_info:get_raw_records:97 - missing post_ids: ['1347044', '1193859', '1208031', '1352074', '1308645', '1206533', '1309603', '1308672', '1271172', '214751', '1317751', '1342147', '1308728', '1138584']
2021-08-04 15:47:11.696 | WARNING  | info.all_info:_save_unmapped_labels:54 - 35 unmapped company saved
2021-08-04 15:47:11.705 | WARNING  | info.all_info:_save_unmapped_labels:54 - 353 unmapped title saved
2021-08-04 15:47:11.708 | WARNING  | info.all_info:get_clean_records_for_india:122 - 1779 rows dropped(location!=india)
2021-08-04 15:47:11.709 | WARNING  | info.all_info:get_clean_records_for_india:128 - 385 rows dropped(incomplete info)
2021-08-04 15:47:11.710 | WARNING  | info.all_info:get_clean_records_for_india:134 - 7 rows dropped(internships)
In [3]: df.shape
Out[3]: (1963, 14)
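
The warnings in the session above account for every dropped row: 4134 raw records minus the three drop counts gives the 1963 rows reported by df.shape. A small sketch of that arithmetic, using only numbers copied from the log (the helper name `rows_remaining` is made up for illustration):

```python
# Counts copied from the log output of get_clean_records_for_india() shown above.
RAW_RECORDS = 4134
DROPPED = {
    "location!=india": 1779,
    "incomplete info": 385,
    "internships": 7,
}


def rows_remaining(raw_records, dropped):
    """Subtract each drop count in order and return the running totals."""
    totals = []
    remaining = raw_records
    for reason, count in dropped.items():
        remaining -= count
        totals.append((reason, remaining))
    return totals


steps = rows_remaining(RAW_RECORDS, DROPPED)
for reason, remaining in steps:
    print(f"after dropping {reason}: {remaining} rows")

# The final total should equal the first element of df.shape.
assert steps[-1][1] == 1963
```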

Generating the reports

$ python reports/plots.py # generate fixed comp. plots
$ python reports/report.py # fixed comp.
$ python reports/report_dark.py # fixed comp., dark mode

$ python reports/plots_total.py # generate total comp. plots
$ python reports/report_total.py # total comp.
$ python reports/report_dark_total.py # total comp., dark mode
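
The six commands above can also be run in one go. The sketch below is a hypothetical driver, not a script that ships with the repo: `build_commands` and `run_reports` are made-up names, the script paths are copied from the commands above, and it assumes PYTHONPATH must point at the project directory as in the scraping step.

```python
import os
import subprocess
import sys

# Script paths copied from the commands above, in the order they are listed.
REPORT_SCRIPTS = {
    "fixed": ["reports/plots.py", "reports/report.py", "reports/report_dark.py"],
    "total": [
        "reports/plots_total.py",
        "reports/report_total.py",
        "reports/report_dark_total.py",
    ],
}


def build_commands(kinds=("fixed", "total")):
    """Return one command per script, keeping the plots script first for each kind."""
    return [
        [sys.executable, script]
        for kind in kinds
        for script in REPORT_SCRIPTS[kind]
    ]


def run_reports(project_dir, kinds=("fixed", "total")):
    """Run the scripts from project_dir, stopping at the first failure."""
    # Assumption: the scripts need PYTHONPATH set to the project directory.
    env = dict(os.environ, PYTHONPATH=str(project_dir))
    for command in build_commands(kinds):
        subprocess.run(command, cwd=project_dir, env=env, check=True)


# Print the commands without running them.
for command in build_commands():
    print(" ".join(command))
```

Calling `run_reports("<project_directory>", kinds=("fixed",))` would regenerate only the fixed-compensation plots and reports.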

Sample

Key       Value
title     Flipkart Software Development Engineer-1, Bangalore
url       https://leetcode.com/discuss/compensation/834212/Flipkart-or-Software-Development-Engineer-1-or-Bangalore
company   flipkart
title     sde 1
yoe       0.0 years
salary    ₹ 1800000.0
location  bangalore
post      Education: B.Tech from NIT (2021 passout)\nYears of Experience: 0\nPrior Experience: Fresher\nDate of the Offer:\nAug 2020\nCompany: Flipkart\nTitle/Level: Software Development Engineer-1\nLocation: Bangalore\nSalary: INR 18,00,000\nPerformance Incentive: INR 1,80,000 (10% of base pay)\nESOPs: 48 units => INR 5,07,734 (vested over 4 years. 25% each year)\nRelocation Reimbursement: INR 40,000\nTelephone Reimbursement: INR 12,000\nHome Broadband Reimbursement: INR 12,000\nGratuity: INR 38,961\nInsurance: INR 27,000\nOther Benefits: INR 40,000 (15 days accomodation + travel) (this is different from the relocation reimbursement)\nTotal comp (Salary + Bonus + Stock): Total CTC: INR 26,57,695; First year: INR 22,76,895\nOther details: Standard Offer for On-Campus Hire\nAllowed Branches: B.Tech CSE/IT (6.0 CGPA & above)\nProcess consisted of Coding test & 3 rounds of interviews. I don't remember questions exactly. But they vary from topics such as Graph(Topological Sort, Bi-Partite Graph), Trie based questions, DP based questions both recursive and dp approach, trees, Backtracking.
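
In the sample above, the salary value 1800000.0 corresponds to the "Salary: INR 18,00,000" line of the post, written with Indian digit grouping. The sketch below shows one way such a line could be turned into a number. It is an illustration only, not the parser in the info package: the function name `parse_fixed_salary` and the regular expression are assumptions, and real posts vary far more than this pattern allows.

```python
import re

# Assumed pattern: "Salary:" followed by "INR" and a comma-grouped amount.
SALARY = re.compile(r"Salary:\s*INR\s*([\d,]+)")


def parse_fixed_salary(post_text):
    """Return the fixed salary as a float, or None if no matching line is found."""
    match = SALARY.search(post_text)
    if match is None:
        return None
    # Indian grouping (18,00,000) parses like any other once the commas are removed.
    return float(match.group(1).replace(",", ""))


# Excerpt of the sample post shown above.
post = (
    "Company: Flipkart\n"
    "Title/Level: Software Development Engineer-1\n"
    "Location: Bangalore\n"
    "Salary: INR 18,00,000\n"
    "Performance Incentive: INR 1,80,000 (10% of base pay)"
)
print(parse_fixed_salary(post))  # 1800000.0
```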