lorey / Social Media Profiles Regexs

📇 Extract social media profiles and more with regular expressions

Regular Expressions to Match Social Media Profiles

This repository lists regular expressions to match and extract information from URLs of social media profiles. So if you find a hyperlink to this repo somewhere on the web, e.g. https://github.com/lorey/social-media-profiles-regexs/, the regular expressions in this repo allow you to find out that it's a GitHub link pointing to a repo, and to extract the username lorey and the repo name social-media-profiles-regexs from the URL.
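As a minimal sketch of the extraction described above, the snippet below applies the github repo expression listed later in this document with Python's re module. The variable names are illustrative and not part of the repository.

```python
import re

# The "github / repo" expression from this document, compiled unchanged.
GITHUB_REPO = re.compile(
    r"(?:https?:)?\/\/(?:www\.)?github\.com\/(?P<login>[A-z0-9_-]+)\/(?P<repo>[A-z0-9_-]+)\/?"
)

url = "https://github.com/lorey/social-media-profiles-regexs/"
match = GITHUB_REPO.match(url)

# groupdict() returns the named subgroups as a plain dictionary.
info = match.groupdict() if match else None
print(info)  # {'login': 'lorey', 'repo': 'social-media-profiles-regexs'}
```

One caveat when reusing these expressions: in Python, the class [A-z] spans the ASCII range from A to z and therefore also matches the characters [, \, ], ^, _ and `, so it is safer to treat matches as candidates rather than validated names.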

Features:

  • detect the platform a url points to (all major platforms supported)
  • extract the information contained within the url (without opening the url, of course)
  • extract emails and phone numbers from hyperlinks

Please note: If you want to extract social media links, depending on your case, there are possibly easier ways:

  • I've created a Python library called socials that uses these expressions to automate url detection and data extraction. You input the urls, it extracts the type of platform as well as the contained information, e.g. the linked social media profiles.
  • There's also a Socials API which makes the socials Python package available via REST and JSON. You can use it for free at socials.karllorey.com or deploy it yourself. You simply input any URL you want to extract profiles from. It will then fetch the given website and return all social media links found on it.

If you're missing a particular platform, please feel free to add it. Also feel free to add a test case that currently fails. An explanation of how this repo works can be found in CONTRIBUTING.md. You can also open an issue, of course. I'm happy to help!

angellist

company

(?:https?:)?\/\/angel\.co\/company\/(?P<company>[A-z0-9_-]+)(?:\/(?P<company_subpage>[A-z0-9-]+))?

Examples:

job

(?:https?:)?\/\/angel\.co\/company\/(?P<company>[A-z0-9_-]+)\/jobs\/(?P<job_permalink>(?P<job_id>[0-9]+)-(?P<job_slug>[A-z0-9-]+))

Examples:

user

(?:https?:)?\/\/angel\.co\/(?P<type>u|p)\/(?P<user>[A-z0-9_-]+)

There are root-level direct links to users, e.g. angel.co/karllorey, that now get redirected to these new user links. Sometimes the path is /p/, sometimes it's /u/; I haven't figured out why that is.

Examples:

crunchbase

company

(?:https?:)?\/\/crunchbase\.com\/organization\/(?P<organization>[A-z0-9_-]+)

Examples:

person

(?:https?:)?\/\/crunchbase\.com\/person\/(?P<person>[A-z0-9_-]+)

Examples:

email

mailto

mailto:(?P<email>[A-z0-9_.+-]+@[A-z0-9_.-]+\.[A-z]+)

This is for scraping only and is not suitable for validating email addresses.

Examples:

facebook

profile

(?:https?:)?\/\/(?:www\.)?(?:facebook|fb)\.com\/(?P<profile>(?![A-z]+\.php)(?!marketplace|gaming|watch|me|messages|help|search|groups)[A-z0-9_\-\.]+)\/?

A profile can be a page, a user profile, or something else. Since Facebook redirects these URLs to all kinds of objects (user, pages, events, and so on), you have to verify that it's actually a user. See https://developers.facebook.com/docs/graph-api/reference/profile
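The two negative lookaheads in the expression above can be exercised with a short sketch. The helper name and the example URLs are hypothetical. The last case illustrates a side effect that seems to follow from the expression as written: the exclusion list has no trailing boundary, so names that merely start with an excluded word such as me are rejected as well.

```python
import re

# The "facebook / profile" expression from this document, compiled unchanged.
FACEBOOK_PROFILE = re.compile(
    r"(?:https?:)?\/\/(?:www\.)?(?:facebook|fb)\.com\/"
    r"(?P<profile>(?![A-z]+\.php)"
    r"(?!marketplace|gaming|watch|me|messages|help|search|groups)[A-z0-9_\-\.]+)\/?"
)

def facebook_profile(url):
    """Return the profile slug, or None if the URL is excluded or unrelated."""
    match = FACEBOOK_PROFILE.match(url)
    return match.group("profile") if match else None

print(facebook_profile("https://www.facebook.com/example.page"))     # example.page
print(facebook_profile("https://www.facebook.com/marketplace"))      # None (reserved path)
print(facebook_profile("https://www.facebook.com/profile.php?id=4")) # None (*.php is excluded)
print(facebook_profile("https://fb.com/meadowlark"))                 # None (starts with "me")
```

A match only says that the URL has the shape of a profile link; as noted above, whether it is a user, a page or something else still has to be verified separately.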

Examples:

profile by id

(?:https?:)?\/\/(?:www\.)facebook.com/(?:profile.php\?id=)?(?P<id>[0-9]+)

Examples:

github

repo

(?:https?:)?\/\/(?:www\.)?github\.com\/(?P<login>[A-z0-9_-]+)\/(?P<repo>[A-z0-9_-]+)\/?

Exclude subdomains other than www. as these sometimes redirect to GitHub Pages.

Examples:

user

(?:https?:)?\/\/(?:www\.)?github\.com\/(?P<login>[A-z0-9_-]+)\/?

Exclude subdomains other than www. as these sometimes redirect to GitHub Pages.

Examples:

google plus

user id

(?:https?:)?\/\/plus\.google\.com\/(?P<id>[0-9]{21})

Matches profile numbers with exactly 21 digits.

Examples:

username

(?:https?:)?\/\/plus\.google\.com\/\+(?P<username>[A-z0-9+]+)

Matches username.

Examples:

hackernews

item

(?:https?:)?\/\/news\.ycombinator\.com\/item\?id=(?P<item>[0-9]+)

An item can be a post or a direct link to a comment.

Examples:

user

(?:https?:)?\/\/news\.ycombinator\.com\/user\?id=(?P<user>[A-z0-9_-]+)

Examples:

instagram

profile

(?:https?:)?\/\/(?:www\.)?(?:instagram\.com|instagr\.am)\/(?P<username>[A-Za-z0-9_](?:(?:[A-Za-z0-9_]|(?:\.(?!\.))){0,28}(?:[A-Za-z0-9_]))?)

The rules:

  • A single . inside the name is matched (disco.dude), but two consecutive dots are not (disco..dude)
  • A trailing period is not matched (discodude.)
  • Underscores are matched (_disco__dude)
  • At most 30 characters are matched (only the first 30 of 1234567890123456789012345678901234567890)
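The rules above can be checked against the expression with a small sketch. The helper name instagram_username is illustrative; the inputs are the sample names from the list.

```python
import re

# The "instagram / profile" expression from this document, compiled unchanged.
INSTAGRAM_PROFILE = re.compile(
    r"(?:https?:)?\/\/(?:www\.)?(?:instagram\.com|instagr\.am)\/"
    r"(?P<username>[A-Za-z0-9_](?:(?:[A-Za-z0-9_]|(?:\.(?!\.))){0,28}(?:[A-Za-z0-9_]))?)"
)

def instagram_username(url):
    """Return the username part the expression captures, or None."""
    match = INSTAGRAM_PROFILE.match(url)
    return match.group("username") if match else None

base = "https://instagram.com/"
print(instagram_username(base + "disco.dude"))    # disco.dude
print(instagram_username(base + "disco..dude"))   # disco (stops before the double dot)
print(instagram_username(base + "discodude."))    # discodude (trailing dot dropped)
print(instagram_username(base + "_disco__dude"))  # _disco__dude
print(len(instagram_username(base + "1234567890" * 4)))  # 30
```

Note that an over-long or malformed name still produces a match on its valid prefix, so callers that need strict checking would have to compare the captured name against the rest of the URL themselves.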

Examples:

linkedin

company

(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/company\/(?P<company_permalink>[A-z0-9-\.]+)\/?

Permalink may be a company id or a regular permalink. The id redirects to the permalink as soon as one is created.

Examples:

post

(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/feed\/update\/urn:li:activity:(?P<activity_id>[0-9]+)\/?

Direct link to a LinkedIn post. The URL only contains the post id.

Examples:

profile

(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/in\/(?P<permalink>[\w\-\_À-ÿ%]+)\/?

These are the currently used, most-common urls ending in /in/

Examples:

profile_pub

(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/pub\/(?P<permalink_pub>[A-z0-9_-]+)(?:\/[A-z0-9]+){3}\/?

These are old public urls not used anymore, more info at quora

Examples:

medium

post

(?:https?:)?\/\/medium\.com\/(?:(?:@(?P<username>[A-z0-9]+))|(?P<publication>[a-z-]+))\/(?P<slug>[a-z0-9\-]+)-(?P<post_id>[A-z0-9]+)(?:\?.*)?

Examples:

post of subdomain publication

(?:https?:)?\/\/(?P<publication>(?!www)[a-z-]+)\.medium\.com\/(?P<slug>[a-z0-9\-]+)-(?P<post_id>[A-z0-9]+)(?:\?.*)?

These can't be matched with the regular post regex because Python's regex engine does not allow a subgroup name to be defined more than once.
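The limitation is easy to reproduce with Python's re module. The two patterns below are shortened, hypothetical stand-ins for the medium expressions; the alternative group names publication_sub and publication_path are made up for the illustration.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised, among other cases, when a group name is redefined.
        return False

# Same group name in both alternatives: rejected by Python.
duplicate = r"(?P<publication>[a-z-]+)\.medium\.com|medium\.com\/(?P<publication>[a-z-]+)"

# Distinct names per alternative: accepted.
distinct = r"(?P<publication_sub>[a-z-]+)\.medium\.com|medium\.com\/(?P<publication_path>[a-z-]+)"

print(compiles(duplicate))  # False
print(compiles(distinct))   # True
```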

Examples:

user

(?:https?:)?\/\/medium\.com\/@(?P<username>[A-z0-9]+)(?:\?.*)?

Examples:

user by id

(?:https?:)?\/\/medium\.com\/u\/(?P<user_id>[A-z0-9]+)(?:\?.*)

These URLs now redirect to the new user profiles. Follow the redirect with a HEAD or GET request to obtain the current profile URL.

Examples:

phone

phone number

(?:tel|phone|mobile):(?P<number>\+?[0-9. -]+)

Should be cleaned afterwards to strip dots, spaces, etc.

Examples:

  • tel:+49 900 123456
  • tel:+49900123456
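The clean-up step mentioned above could look like the following sketch. The function name clean_phone and the choice to keep only digits and a plus sign are assumptions, not something the repository prescribes.

```python
import re

# The "phone number" expression from this document, compiled unchanged.
PHONE = re.compile(r"(?:tel|phone|mobile):(?P<number>\+?[0-9. -]+)")

def clean_phone(href):
    """Extract the number from a tel:-style link and strip separators."""
    match = PHONE.match(href)
    if match is None:
        return None
    # Drop dots, spaces and dashes; keep digits and the plus sign.
    return re.sub(r"[^0-9+]", "", match.group("number"))

print(clean_phone("tel:+49 900 123456"))  # +49900123456
print(clean_phone("tel:+49900123456"))    # +49900123456
```

With this normalisation both example links above yield the same string, which makes deduplicating scraped numbers straightforward.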

reddit

user

(?:https?:)?\/\/(?:[a-z]+\.)?reddit\.com\/(?:u(?:ser)?)\/(?P<username>[A-z0-9\-\_]*)\/?

Examples:

skype

profile

(?:(?:callto|skype):)(?P<username>[a-z][a-z0-9\.,\-_]{5,31})(?:\?(?:add|call|chat|sendfile|userinfo))?

Matches Skype's URLs to add contact, call, chat. More info at Skype SDK's docs.

Examples:

  • skype:echo123
  • skype:echo123?call

snapchat

profile

(?:https?:)?\/\/(?:www\.)?snapchat\.com\/add\/(?P<username>[A-z0-9\.\_\-]+)\/?

Examples:

stackexchange

user

(?:https?:)?\/\/(?:www\.)?stackexchange\.com\/users\/(?P<id>[0-9]+)\/(?P<username>[A-z0-9-_.]+)\/?

This is the meta-platform above stackoverflow, etc. Username can be changed at any time, user_id is persistent.

Examples:

stackexchange network

user

(?:https?:)?\/\/(?:(?P<community>[a-z]+(?!www))\.)?stackexchange\.com\/users\/(?P<id>[0-9]+)\/(?P<username>[A-z0-9-_.]+)\/?

While there are some "named" communities in the stackexchange network like stackoverflow, many only exist as subdomains, e.g. gaming.stackexchange.com. Again, the username can be changed at any time, while the user id is persistent.
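Because the username can change while the id stays the same, it may be sensible to identify a user by community and id only. The sketch below assumes exactly that; the helper name user_key, the example URLs and the choice of key are illustrative.

```python
import re

# The "stackexchange network / user" expression from this document, unchanged.
SE_NETWORK_USER = re.compile(
    r"(?:https?:)?\/\/(?:(?P<community>[a-z]+(?!www))\.)?stackexchange\.com"
    r"\/users\/(?P<id>[0-9]+)\/(?P<username>[A-z0-9-_.]+)\/?"
)

def user_key(url):
    """Return (community, id) as a stable key, ignoring the mutable username."""
    match = SE_NETWORK_USER.match(url)
    if match is None:
        return None
    return match.group("community"), int(match.group("id"))

# Hypothetical URLs for the same user before and after a rename.
before = "https://gaming.stackexchange.com/users/12345/old-name"
after = "https://gaming.stackexchange.com/users/12345/new-name"
print(user_key(before))                     # ('gaming', 12345)
print(user_key(before) == user_key(after))  # True
```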

Examples:

stackoverflow

question

(?:https?:)?\/\/(?:www\.)?stackoverflow\.com\/questions\/(?P<id>[0-9]+)\/(?P<title>[A-z0-9-_.]+)\/?

Examples:

user

(?:https?:)?\/\/(?:www\.)?stackoverflow\.com\/users\/(?P<id>[0-9]+)\/(?P<username>[A-z0-9-_.]+)\/?

Username can be changed at any time, user_id is persistent.

Examples:

telegram

profile

(?:https?:)?\/\/(?:t(?:elegram)?\.me|telegram\.org)\/(?P<username>[a-z0-9\_]{5,32})\/?

Matches for t.me, telegram.me and telegram.org.

Examples:

twitter

status

(?:https?:)?\/\/(?:[A-z]+\.)?twitter\.com\/@?(?P<username>[A-z0-9_]+)\/status\/(?P<tweet_id>[0-9]+)\/?

Examples:

user

(?:https?:)?\/\/(?:[A-z]+\.)?twitter\.com\/@?(?!home|share|privacy|tos)(?P<username>[A-z0-9_]+)\/?

Usernames may contain alphanumeric characters and underscores.

Examples:

vimeo

user

(?:https?:)?\/\/vimeo\.com\/user(?P<id>[0-9]+)

Examples:

video

(?:https?:)?\/\/(?:(?:www)?vimeo\.com|player.vimeo.com\/video)\/(?P<id>[0-9]+)

Examples:

youtube

channel

(?:https?:)?\/\/(?:[A-z]+\.)?youtube.com\/channel\/(?P<id>[A-z0-9-\_]+)\/?

Examples:

user

(?:https?:)?\/\/(?:[A-z]+\.)?youtube.com\/user\/(?P<username>[A-z0-9]+)\/?

Examples:

video

(?:https?:)?\/\/(?:(?:www\.)?youtube\.com\/(?:watch\?v=|embed\/)|youtu\.be\/)(?P<id>[A-z0-9\-\_]+)

Matches youtube video links like https://www.youtube.com/watch?v=dQw4w9WgXcQ and shortlinks like https://youtu.be/dQw4w9WgXcQ
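A short sketch showing that the watch link and the shortlink above yield the same video id. The helper name youtube_video_id is illustrative.

```python
import re

# The "youtube / video" expression from this document, compiled unchanged.
YOUTUBE_VIDEO = re.compile(
    r"(?:https?:)?\/\/(?:(?:www\.)?youtube\.com\/(?:watch\?v=|embed\/)|youtu\.be\/)"
    r"(?P<id>[A-z0-9\-\_]+)"
)

def youtube_video_id(url):
    """Return the video id for watch, embed and youtu.be links, else None."""
    match = YOUTUBE_VIDEO.match(url)
    return match.group("id") if match else None

print(youtube_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
print(youtube_video_id("https://youtu.be/dQw4w9WgXcQ"))                 # dQw4w9WgXcQ
```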

Examples:

Monster Regex

If you want to match and extract the data from all URLs with one regex, use this monster. It returns the data for all the platforms above. Each subgroup is prefixed with its platform and type, e.g. angellist__company__company instead of just company in the angellist company regex. This is necessary because some regex implementations don't allow a subgroup name to be defined more than once, so reusing the same name across two or more platforms would cause errors.

(?P<angellist__company>(?:https?:)?\/\/angel\.co\/company\/(?P<angellist__company__company>[A-z0-9_-]+)(?:\/(?P<angellist__company__company_subpage>[A-z0-9-]+))?)|(?P<angellist__job>(?:https?:)?\/\/angel\.co\/company\/(?P<angellist__job__company>[A-z0-9_-]+)\/jobs\/(?P<angellist__job__job_permalink>(?P<angellist__job__job_id>[0-9]+)-(?P<angellist__job__job_slug>[A-z0-9-]+)))|(?P<angellist__user>(?:https?:)?\/\/angel\.co\/(?P<angellist__user__type>u|p)\/(?P<angellist__user__user>[A-z0-9_-]+))|(?P<crunchbase__company>(?:https?:)?\/\/crunchbase\.com\/organization\/(?P<crunchbase__company__organization>[A-z0-9_-]+))|(?P<crunchbase__person>(?:https?:)?\/\/crunchbase\.com\/person\/(?P<crunchbase__person__person>[A-z0-9_-]+))|(?P<email__mailto>mailto:(?P<email__mailto__email>[A-z0-9_.+-]+@[A-z0-9_.-]+\.[A-z]+))|(?P<facebook__profile>(?:https?:)?\/\/(?:www\.)?(?:facebook|fb)\.com\/(?P<facebook__profile__profile>(?![A-z]+\.php)(?!marketplace|gaming|watch|me|messages|help|search|groups)[A-z0-9_\-\.]+)\/?)|(?P<facebook__profile_by_id>(?:https?:)?\/\/(?:www\.)facebook.com/(?:profile.php\?id=)?(?P<facebook__profile_by_id__id>[0-9]+))|(?P<github__repo>(?:https?:)?\/\/(?:www\.)?github\.com\/(?P<github__repo__login>[A-z0-9_-]+)\/(?P<github__repo__repo>[A-z0-9_-]+)\/?)|(?P<github__user>(?:https?:)?\/\/(?:www\.)?github\.com\/(?P<github__user__login>[A-z0-9_-]+)\/?)|(?P<google_plus__user_id>(?:https?:)?\/\/plus\.google\.com\/(?P<google_plus__user_id__id>[0-9]{21}))|(?P<google_plus__username>(?:https?:)?\/\/plus\.google\.com\/\+(?P<google_plus__username__username>[A-z0-9+]+))|(?P<hackernews__item>(?:https?:)?\/\/news\.ycombinator\.com\/item\?id=(?P<hackernews__item__item>[0-9]+))|(?P<hackernews__user>(?:https?:)?\/\/news\.ycombinator\.com\/user\?id=(?P<hackernews__user__user>[A-z0-9_-]+))|(?P<instagram__profile>(?:https?:)?\/\/(?:www\.)?(?:instagram\.com|instagr\.am)\/(?P<instagram__profile__username>[A-Za-z0-9_](?:(?:[A-Za-z0-9_]|(?:\.(?!\.))){0,28}(?:[A-Za-z0-9_]))?))|(?P<linkedin__company>(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/company\/(?P<linkedin__company__company_permalink>[A-z0-9-\.]+)\/?)|(?P<linkedin__post>(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/feed\/update\/urn:li:activity:(?P<linkedin__post__activity_id>[0-9]+)\/?)|(?P<linkedin__profile>(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/in\/(?P<linkedin__profile__permalink>[\w\-\_À-ÿ%]+)\/?)|(?P<linkedin__profile_pub>(?:https?:)?\/\/(?:[\w]+\.)?linkedin\.com\/pub\/(?P<linkedin__profile_pub__permalink_pub>[A-z0-9_-]+)(?:\/[A-z0-9]+){3}\/?)|(?P<medium__post>(?:https?:)?\/\/medium\.com\/(?:(?:@(?P<medium__post__username>[A-z0-9]+))|(?P<medium__post__publication>[a-z-]+))\/(?P<medium__post__slug>[a-z0-9\-]+)-(?P<medium__post__post_id>[A-z0-9]+)(?:\?.*)?)|(?P<medium__post_of_subdomain_publication>(?:https?:)?\/\/(?P<medium__post_of_subdomain_publication__publication>(?!www)[a-z-]+)\.medium\.com\/(?P<medium__post_of_subdomain_publication__slug>[a-z0-9\-]+)-(?P<medium__post_of_subdomain_publication__post_id>[A-z0-9]+)(?:\?.*)?)|(?P<medium__user>(?:https?:)?\/\/medium\.com\/@(?P<medium__user__username>[A-z0-9]+)(?:\?.*)?)|(?P<medium__user_by_id>(?:https?:)?\/\/medium\.com\/u\/(?P<medium__user_by_id__user_id>[A-z0-9]+)(?:\?.*))|(?P<phone__phone_number>(?:tel|phone|mobile):(?P<phone__phone_number__number>\+?[0-9. -]+))|(?P<reddit__user>(?:https?:)?\/\/(?:[a-z]+\.)?reddit\.com\/(?:u(?:ser)?)\/(?P<reddit__user__username>[A-z0-9\-\_]*)\/?)|(?P<skype__profile>(?:(?:callto|skype):)(?P<skype__profile__username>[a-z][a-z0-9\.,\-_]{5,31})(?:\?(?:add|call|chat|sendfile|userinfo))?)|(?P<snapchat__profile>(?:https?:)?\/\/(?:www\.)?snapchat\.com\/add\/(?P<snapchat__profile__username>[A-z0-9\.\_\-]+)\/?)|(?P<stackexchange__user>(?:https?:)?\/\/(?:www\.)?stackexchange\.com\/users\/(?P<stackexchange__user__id>[0-9]+)\/(?P<stackexchange__user__username>[A-z0-9-_.]+)\/?)|(?P<stackexchange_network__user>(?:https?:)?\/\/(?:(?P<stackexchange_network__user__community>[a-z]+(?!www))\.)?stackexchange\.com\/users\/(?P<stackexchange_network__user__id>[0-9]+)\/(?P<stackexchange_network__user__username>[A-z0-9-_.]+)\/?)|(?P<stackoverflow__question>(?:https?:)?\/\/(?:www\.)?stackoverflow\.com\/questions\/(?P<stackoverflow__question__id>[0-9]+)\/(?P<stackoverflow__question__title>[A-z0-9-_.]+)\/?)|(?P<stackoverflow__user>(?:https?:)?\/\/(?:www\.)?stackoverflow\.com\/users\/(?P<stackoverflow__user__id>[0-9]+)\/(?P<stackoverflow__user__username>[A-z0-9-_.]+)\/?)|(?P<telegram__profile>(?:https?:)?\/\/(?:t(?:elegram)?\.me|telegram\.org)\/(?P<telegram__profile__username>[a-z0-9\_]{5,32})\/?)|(?P<twitter__status>(?:https?:)?\/\/(?:[A-z]+\.)?twitter\.com\/@?(?P<twitter__status__username>[A-z0-9_]+)\/status\/(?P<twitter__status__tweet_id>[0-9]+)\/?)|(?P<twitter__user>(?:https?:)?\/\/(?:[A-z]+\.)?twitter\.com\/@?(?!home|share|privacy|tos)(?P<twitter__user__username>[A-z0-9_]+)\/?)|(?P<vimeo__user>(?:https?:)?\/\/vimeo\.com\/user(?P<vimeo__user__id>[0-9]+))|(?P<vimeo__video>(?:https?:)?\/\/(?:(?:www)?vimeo\.com|player.vimeo.com\/video)\/(?P<vimeo__video__id>[0-9]+))|(?P<youtube__channel>(?:https?:)?\/\/(?:[A-z]+\.)?youtube.com\/channel\/(?P<youtube__channel__id>[A-z0-9-\_]+)\/?)|(?P<youtube__user>(?:https?:)?\/\/(?:[A-z]+\.)?youtube.com\/user\/(?P<youtube__user__username>[A-z0-9]+)\/?)|(?P<youtube__video>(?:https?:)?\/\/(?:(?:www\.)?youtube\.com\/(?:watch\?v=|embed\/)|youtu\.be\/)(?P<youtube__video__id>[A-z0-9\-\_]+))
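To illustrate the prefixing scheme on a small scale, the sketch below combines just two of the per-platform expressions from this document into one alternation and renames their subgroups the same way. It is not the script the repository uses to generate the monster regex; PATTERNS, prefix_groups and classify are hypothetical names.

```python
import re

# Two per-platform expressions from this document, keyed by platform__type.
PATTERNS = {
    "github__user": r"(?:https?:)?\/\/(?:www\.)?github\.com\/(?P<login>[A-z0-9_-]+)\/?",
    "vimeo__user": r"(?:https?:)?\/\/vimeo\.com\/user(?P<id>[0-9]+)",
}

def prefix_groups(name, pattern):
    """Rename every (?P<group>...) to (?P<name__group>...) to avoid clashes."""
    return re.sub(r"\(\?P<(\w+)>", lambda m: f"(?P<{name}__{m.group(1)}>", pattern)

# One outer named group per expression, joined into a single alternation.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{prefix_groups(name, p)})" for name, p in PATTERNS.items())
)

def classify(url):
    """Return (platform__type, extracted fields) or None if nothing matches."""
    match = COMBINED.match(url)
    if match is None:
        return None
    found = {k: v for k, v in match.groupdict().items() if v is not None}
    outer = min(found, key=len)  # the outer group has the shortest name
    prefix = outer + "__"
    fields = {k[len(prefix):]: v for k, v in found.items() if k.startswith(prefix)}
    return outer, fields

print(classify("https://github.com/lorey"))    # ('github__user', {'login': 'lorey'})
print(classify("https://vimeo.com/user12345")) # ('vimeo__user', {'id': '12345'})
```

Because alternatives are tried in order, the position of an expression in the alternation decides which platform wins when more than one could match the same URL.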