philipperemy / Facebook-Profile-Pictures-Downloader

Licence: other
😆 Download public profile pictures from Facebook.

Mining into Facebook public profiles with Deep Learning

Applying Deep Learning to Facebook public information to extract interesting patterns

Nothing very precise yet. We're just going to have fun and build a big Facebook dataset in the short term!



How to use it?

Install the latest facebook-sdk.

cd /tmp/
git clone git@github.com:mobolic/facebook-sdk.git
cd facebook-sdk
sudo pip3 install .

Then clone this repository and follow the instructions below.

# For Python 3.x
git clone git@github.com:philipperemy/Facebook-Profile-Deep-Learning.git facebook-explorer
cd facebook-explorer
sudo pip3 install -r requirements.txt
cp credentials.json.example credentials.json
vim credentials.json # Get your Token ID here https://developers.facebook.com/tools/explorer/
python3 profile_miner.py 10 # to start mining facebook profiles. Here we use 10 threads to query Facebook.
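
The argument 10 above is the number of worker threads. As a rough illustration only, here is a minimal sketch of how a fixed pool of threads could fan out over candidate profile IDs. The names fetch_profile and mine_profiles are hypothetical and are not taken from profile_miner.py; the real script may be organised differently.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_profile(profile_id):
    """Hypothetical stand-in for a Graph API lookup; returns a fake profile."""
    return {"id": str(profile_id)}


def mine_profiles(profile_ids, num_threads=10):
    """Query every ID using a fixed-size pool of worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in the same order as the input IDs.
        return list(pool.map(fetch_profile, profile_ids))


if __name__ == "__main__":
    print(mine_profiles(range(3), num_threads=10))
```

Threads suit this kind of work because each worker spends most of its time waiting on the network rather than computing.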

Facebook Token ID

Manual update

Get your Facebook Token ID here and load it into your credentials.json file. https://developers.facebook.com/tools/explorer/

Automatic update (much more useful)

Before using the automatic updates, make sure that the manual procedure (just above) has worked at least once. Browse to https://developers.facebook.com/tools/explorer/ and request a Token ID. This part relies on web scraping, so if anything is not set up correctly beforehand, it is very likely to fail.

Once that's done, let's start the server that will automatically ask Facebook servers for a new token. The main script profile_miner.py automatically detects when the token expires. When this happens, it calls the server started by auto_token_generator.py.
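
The refresh-on-expiry idea can be sketched as follows. This is not the code from profile_miner.py: TokenExpiredError is a hypothetical stand-in for the SDK's expiry error, and the sketch assumes the token server returns the token as plain text, which this README does not confirm.

```python
import urllib.request

TOKEN_SERVER_URL = "http://localhost:5000/"  # address used in this README


class TokenExpiredError(Exception):
    """Hypothetical stand-in for the expiry error raised by the SDK."""


def fetch_new_token(url=TOKEN_SERVER_URL):
    """Ask the local token server for a fresh token.

    Assumption: the response body is the token as plain text.
    """
    with urllib.request.urlopen(url, timeout=120) as response:
        return response.read().decode("utf-8").strip()


def call_with_refresh(query, token, get_token=fetch_new_token):
    """Run query(token); on expiry, fetch a new token once and retry.

    Returns (result, token) so the caller can keep the refreshed token.
    """
    try:
        return query(token), token
    except TokenExpiredError:
        token = get_token()
        return query(token), token
```

Passing get_token as a parameter keeps the retry logic testable without a running server.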

Start the server with this command:

export FB_EMAIL='<your_facebook_email>' FB_PASS='i_love_apple'; python3 auto_token_generator.py

Here FB_EMAIL is your Facebook email address and FB_PASS is your Facebook password. I advise you to create a dedicated Facebook account just for these tasks.

You can check if the server is responding by running this command:

curl http://localhost:5000/

Or just connect to http://localhost:5000/ from your favorite browser. Be patient, it can take up to one minute to query Facebook servers. The procedure is deliberately slow to avoid any bot detection.
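
The same check can be done from Python. A minimal sketch, assuming the server answers with HTTP 200 when it is healthy (the README does not state the status code); server_is_up is a hypothetical helper name, and the 90-second timeout is chosen only to leave room for the one-minute delay mentioned above.

```python
import urllib.request


def server_is_up(url="http://localhost:5000/", timeout=90,
                 opener=urllib.request.urlopen):
    """Return True if the server answers with HTTP 200 within `timeout` seconds."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError (refused connection, HTTP error) and
        # socket timeouts are all subclasses of OSError.
        return False


if __name__ == "__main__":
    print("token server up:", server_is_up())
```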

Scan data

python3 scan_data.py
This scripts refreshes every 10 seconds.
--------------------------------------------------------------------------------
Number of Facebook descriptions : 15097 (+15097)
Number of Facebook images       : 15088 (+15088)
--------------------------------------------------------------------------------
Number of Facebook descriptions : 15104 (+7)
Number of Facebook images       : 15096 (+8)
--------------------------------------------------------------------------------
Number of Facebook descriptions : 15115 (+11)
Number of Facebook images       : 15107 (+11)
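
For illustration, a report like the one above could be produced by counting files on disk. This is a sketch, not the actual scan_data.py: it assumes the .pkl descriptions and .jpg images sit together in one directory, and count_files and format_report are hypothetical names.

```python
from pathlib import Path


def count_files(data_dir):
    """Count stored descriptions (.pkl) and images (.jpg) in data_dir."""
    root = Path(data_dir)
    return len(list(root.glob("*.pkl"))), len(list(root.glob("*.jpg")))


def format_report(counts, previous=(0, 0)):
    """Format both totals with the signed change since the previous scan."""
    labels = ("Number of Facebook descriptions", "Number of Facebook images")
    return [
        f"{label:<32}: {now} ({now - before:+d})"
        for label, now, before in zip(labels, counts, previous)
    ]
```

Calling format_report(counts, previous) every 10 seconds, with previous set to the last counts, gives the running deltas shown in the output.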

Example of a public profile, stored in ###.pkl where ### is the ID of the user (undisclosed here for privacy reasons):

{
  'first_name': 'Susan',
  'updated_time': '2016-12-28T16:26:46+0000',
  'last_name': 'Cothran',
  'link': 'https://www.facebook.com/app_scoped_user_id/###/',
  'name': 'Susan Cothran',
  'id': '###'
}

The corresponding profile picture is located in ###.jpg.
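
A stored profile can be read back with the standard pickle module. A minimal sketch, assuming each .pkl file holds a plain dictionary like the one above and sits in the same directory as its .jpg; load_profile is a hypothetical helper, not part of this repository.

```python
import pickle
from pathlib import Path


def load_profile(data_dir, user_id):
    """Return (profile dict, picture path) for one stored user ID."""
    root = Path(data_dir)
    with open(root / f"{user_id}.pkl", "rb") as handle:
        # Only unpickle files you produced yourself: pickle can run code.
        profile = pickle.load(handle)
    return profile, root / f"{user_id}.jpg"
```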

Common errors

Sometimes a profile exists but is not available through the Graph API. Most of the time the profile is inactive, and it is better to move on than to raise an exception that would block the script:

INFO:facebook-deep-learning:Unsupported get request. Object with ID '827435111' does not exist, cannot be loaded due to missing permissions, or does not support this operation. Please read the Graph API documentation at https://developers.facebook.com/docs/graph-api
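
The "log it and move on" behaviour can be sketched like this. GraphError is a hypothetical stand-in for the SDK's error type, and fetch_all is an illustrative name rather than a function from this repository; only the logger name is taken from the log line above.

```python
import logging

# Logger name taken from the log line shown above.
logger = logging.getLogger("facebook-deep-learning")


class GraphError(Exception):
    """Hypothetical stand-in for the SDK's error type."""


def fetch_all(profile_ids, fetch_one):
    """Fetch each profile, logging and skipping the ones that fail."""
    profiles = {}
    for profile_id in profile_ids:
        try:
            profiles[profile_id] = fetch_one(profile_id)
        except GraphError as error:
            # Log and continue instead of letting one bad ID stop the run.
            logger.info(error)
    return profiles
```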

The token is only valid for one hour. If you know a better way to extend the expiration date, I'd be happy to hear it!

facebook.GraphAPIError: Error validating access token: Session has expired on Saturday, 08-Apr-17 23:00:00 PDT. The current time is Saturday, 08-Apr-17 23:01:30 PDT.

The Graph API enforces user request limits. In my experience the limit is around 10,000 calls per hour, but it seems to depend on the application, so treat that figure as a very rough rule of thumb. When the limit is reached, the script is put on hold for one hour before resuming.

INFO:facebook-deep-learning:(#17) User request limit reached
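
The one-hour pause can be sketched as follows. RateLimitError is a hypothetical exception carrying a numeric code, the code 17 is read off the "(#17)" log message, and max_retries is an assumption added so the sketch cannot loop forever; the actual script may behave differently.

```python
import time

RATE_LIMIT_CODE = 17      # code shown in the "(#17)" log message
PAUSE_SECONDS = 60 * 60   # one hour, as described above


class RateLimitError(Exception):
    """Hypothetical stand-in for an API error carrying a numeric code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def call_with_pause(query, sleep=time.sleep, max_retries=3):
    """Run query(); on a rate-limit error, wait one hour and try again."""
    for _ in range(max_retries):
        try:
            return query()
        except RateLimitError as error:
            if error.code != RATE_LIMIT_CODE:
                raise  # a different error: do not hide it
            sleep(PAUSE_SECONDS)
    return query()  # final attempt; any error propagates to the caller
```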