bryantbiggs / resume_tailor

Licence: MIT License
An unsupervised analysis combining topic modeling and clustering to preserve an individual's work history and credentials while tailoring their resume towards a new career field

Programming Languages

python

Projects that are alternatives of or similar to resume tailor

nlp workshop odsc europe20
Extensive tutorials for the Advanced NLP Workshop in Open Data Science Conference Europe 2020. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular tasks using NLP including NER, Classification, Recommendation \ Information Retrieval, Summarization, Classification, Language Translation, Q&A and T…
Stars: ✭ 127 (+746.67%)
Mutual labels:  scikit-learn, nltk, gensim
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+7446.67%)
Mutual labels:  scikit-learn, nltk, gensim
Adam qas
ADAM - A Question Answering System. Inspired by IBM Watson
Stars: ✭ 330 (+2100%)
Mutual labels:  scikit-learn, gensim
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (+33.33%)
Mutual labels:  scikit-learn, gensim
Ryuzaki bot
Simple chatbot in Python using NLTK and scikit-learn
Stars: ✭ 28 (+86.67%)
Mutual labels:  scikit-learn, nltk
Twitterldatopicmodeling
Uses topic modeling to identify context between follower relationships of Twitter users
Stars: ✭ 48 (+220%)
Mutual labels:  nltk, gensim
Doc2vec
📓 Long(er) text representation and classification using Doc2Vec embeddings
Stars: ✭ 92 (+513.33%)
Mutual labels:  scikit-learn, gensim
Text Classification
Machine Learning and NLP: Text Classification using python, scikit-learn and NLTK
Stars: ✭ 239 (+1493.33%)
Mutual labels:  scikit-learn, nltk
Shallowlearn
An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.
Stars: ✭ 196 (+1206.67%)
Mutual labels:  scikit-learn, gensim
Practical Machine Learning With Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Stars: ✭ 1,868 (+12353.33%)
Mutual labels:  scikit-learn, nltk
pygrams
Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
Stars: ✭ 52 (+246.67%)
Mutual labels:  scikit-learn, nltk
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (+100%)
Mutual labels:  nltk, gensim
frovedis
Framework of vectorized and distributed data analytics
Stars: ✭ 59 (+293.33%)
Mutual labels:  scikit-learn
wordfish-python
extract relationships from standardized terms from corpus of interest with deep learning 🐟
Stars: ✭ 19 (+26.67%)
Mutual labels:  gensim
live twitter sentiment analysis
Live Twitter sentiment analysis using Python, Apache Spark Streaming, Kafka, NLTK, SocketIO
Stars: ✭ 20 (+33.33%)
Mutual labels:  nltk
curso-IRI
Introduction to Information Retrieval
Stars: ✭ 14 (-6.67%)
Mutual labels:  nltk
AIPortfolio
Use AI to generate an optimized stock portfolio
Stars: ✭ 28 (+86.67%)
Mutual labels:  scikit-learn
hcn
Hybrid Code Networks https://arxiv.org/abs/1702.03274
Stars: ✭ 81 (+440%)
Mutual labels:  gensim
Awesome-Scripts
A collection of awesome scripts from developers around the globe.
Stars: ✭ 135 (+800%)
Mutual labels:  scikit-learn
python-machine-learning-book-2nd-edition
Code repository for <Machine Learning Textbook with Python, scikit-learn, TensorFlow>
Stars: ✭ 60 (+300%)
Mutual labels:  scikit-learn

Resume Tailor

Project Description

An unsupervised analysis combining topic modeling and clustering to preserve an individual's work history and credentials while tailoring their resume towards a new career field.

Motivation

While switching careers from mechanical engineering to data science and data engineering, I was unsure how to preserve what I had accomplished so far while creating a resume targeted at data science and data engineering roles. Through this project I hope to replace words and phrases in my current resume with similar ones that more closely match the language used in the data field, without removing any prior work experience or accomplishments.


Data Sources

  • Indeed Resume search - entering selected terms (mechanical engineer, data scientist, etc.) yields the individual resumes that fall within that category

Libraries Utilized

  • phantomjs, selenium - Spawn a pool of workers to request resumes from Indeed without being flagged as a crawler
  • beautifulsoup4, requests - Retrieve and extract data sources from web
  • pymongo - Upload and download retrieved resumes from MongoDB instance hosted on AWS
  • nltk, gensim, scikit-learn - Perform data cleansing (stop words, stemming), create the LDA topic model, create TF-IDF matrices, calculate LSI and cosine distance
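
As a rough illustration of the last bullet, the sketch below builds a TF-IDF matrix, projects it into an LSI space, and measures cosine similarity. It is not the project's code: it uses scikit-learn's `TruncatedSVD` in place of gensim's LSI model, skips stemming, and the toy resumes, the helper name `lsi_similarities`, and the choice of two components are all assumptions made for the example.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def lsi_similarities(target_docs, query_doc, n_components=2):
    """Cosine similarity of query_doc to each target doc in LSI space."""
    # TF-IDF with scikit-learn's built-in English stop-word list.
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(list(target_docs) + [query_doc])

    # LSI amounts to a truncated SVD of the TF-IDF matrix.
    lsi = TruncatedSVD(n_components=n_components, random_state=0)
    reduced = lsi.fit_transform(tfidf)

    # Last row is the query resume; the rest are target-field resumes.
    return cosine_similarity(reduced[-1:], reduced[:-1])[0]


# Toy data invented for illustration, not taken from the Indeed crawl.
target_resumes = [
    "built machine learning models in python and sql",
    "designed data pipelines and trained predictive models",
    "analyzed large datasets with python and statistics",
]
my_resume = "designed mechanical parts and analyzed test data with python"

similarities = lsi_similarities(target_resumes, my_resume)
distances = 1.0 - similarities  # cosine distance
print(similarities)
```

With a real corpus the number of LSI components would be far larger than two and would need tuning.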

Process

  1. Crawl Indeed Resumes to retrieve a collection of resumes matching the selected search terms (mechanical engineer, data scientist, data engineer, etc.)
  2. Upload each retrieved resume to MongoDB on AWS, since the data set is quite large (10+ GB)
  3. Clean the data set (remove stop words, punctuation, etc., and apply stemming)
  4. Create an LDA topic model from the cleaned corpus of resumes
  5. Cluster the corpus of resumes based on their topics
  6. Apply the same pre-processing transformations to the uploaded resume that is to be tailored to the new target career field, and extract its topics
  7. Replace the topics in the current resume that most closely match the current field with similar (synonymous) topics found in the intended target field
  8. Measure the cosine similarity between the modified resume and the target field resumes using TF-IDF and LSI
  9. Repeat the wording changes to the current resume so that it more closely matches the target field resumes in terms of cosine distance
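
Steps 7 to 9 describe an iterative loop, which could be sketched roughly as below. This is a simplified illustration rather than the project's implementation: it scores similarity on raw word counts instead of TF-IDF and LSI, and the function names, the `candidates` swap table, and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tailor(resume_tokens, target_counts, candidates):
    """Greedily apply word swaps that raise similarity to the target field."""
    tokens = list(resume_tokens)
    best = cosine_similarity(Counter(tokens), target_counts)
    improved = True
    while improved:  # step 9: repeat until no swap helps
        improved = False
        for old, new in candidates.items():
            if old not in tokens:
                continue
            # Step 7: try replacing a current-field term with a target-field one.
            trial = [new if t == old else t for t in tokens]
            # Step 8: re-measure similarity against the target field.
            score = cosine_similarity(Counter(trial), target_counts)
            if score > best:  # keep only swaps that strictly improve
                tokens, best, improved = trial, score, True
    return tokens, best


# Hypothetical data: word counts pooled from target-field resumes,
# and a hand-written table of candidate swaps.
target_counts = Counter("model data analysis model pipeline data".split())
resume = "designed simulation of test measurements".split()
candidates = {"simulation": "model", "measurements": "data", "designed": "drafted"}

tailored, score = tailor(resume, target_counts, candidates)
print(" ".join(tailored), round(score, 3))
```

Because a swap is accepted only when it strictly raises the score, the loop always terminates, and swaps that do not help (such as `designed` to `drafted` here) leave the original wording untouched.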

Results
