timgrossmann / Instagram Profilecrawl

License: MIT
📝 quickly crawl the information (e.g. followers, tags etc...) of an instagram profile.

Instagram-Profilecrawl

Quickly crawl the information (e.g. followers, tags etc...) of an instagram profile. No login required!

Automation script for crawling information from an Instagram profile,
such as the number of posts, the number of followers, and the tags of the posts.

Guide to Bot Creation: Learn to Build your own Bots and Automations with the Creators of InstaPy

Getting started

Just do:

git clone https://github.com/timgrossmann/instagram-profilecrawl.git

It uses Selenium and requests to collect the information, so install the dependencies with:

pip install -r requirements.txt

Copy the .env.example to .env

cp .env.example .env

Set your Instagram credentials inside .env:

IG_USERNAME=<Your Instagram Username>
IG_PASSWORD=<Your Instagram Password>

Install the chromedriver that matches your operating system. Once you have downloaded it, place it in the instagram-profilecrawl/assets directory.

Use it!

Now you can start using it following this example:

python3.7 crawl_profile.py username1 username2 ... usernameX

Download the post images to your local machine:

python3.7 extract_image.py <collected_profiles_path>

Settings: To limit the number of posts to be analyzed, change the variable limit_amount in settings.py. The default value is 12000.

Optional login

If you want to access more features (such as private accounts that your own account follows), enter your username and password in settings.py. Remember, this is optional.

Here are the steps to do so:

  1. Open settings.py
  2. Search for login_username & login_password
  3. Put your information inside the quotation marks

Second option: set the values directly in your script

Settings.login_username = 'my_insta_account'
Settings.login_password = 'my_password_xxx'

Run on Raspberry Pi

To run the crawler on Raspberry Pi with Firefox, follow these steps:

  1. Install Firefox: sudo apt-get install firefox-esr
  2. Get the geckodriver as described here
  3. Install pyvirtualdisplay: sudo pip3 install pyvirtualdisplay
  4. Run the script for RPi: python3 crawl_profile_pi.py username1 username2 ...

Collecting stats:

If you are interested in collecting and logging stats from a crawled profile, use the log_stats.py script after running crawl_profile.py (or crawl_profile_pi.py). For example, on Raspberry Pi:

  1. Run python3 crawl_profile_pi.py username
  2. Run python3 log_stats.py -u username for a specific user, or python3 log_stats.py for all users

This appends the collected profile info to stats.csv, which can be useful for monitoring the growth of an Instagram account over time. The logged stats are: time, username, total number of followers, following, posts, likes, and comments. Both commands can be triggered with crontab (make sure to trigger log_stats.py several minutes after crawl_profile_pi.py).
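As a rough illustration of the monitoring use case, the sketch below reads a stats file and reports the follower growth per user between the first and the last logged row. The column names (`username`, `followers`) and the presence of a header row are assumptions, not taken from log_stats.py; adjust them to match your actual stats.csv.

```python
import csv
import io


def follower_growth(csv_file, user_col="username", followers_col="followers"):
    """Return {username: last_followers - first_followers} from a stats CSV.

    Assumes a header row and the (hypothetical) column names given above.
    Rows are expected in chronological order, as produced by appending.
    """
    first, last = {}, {}
    for row in csv.DictReader(csv_file):
        user = row[user_col]
        count = int(row[followers_col])
        first.setdefault(user, count)  # remember the earliest value
        last[user] = count             # keep overwriting with the latest
    return {user: last[user] - first[user] for user in first}


# Tiny in-memory sample standing in for stats.csv (made-up values).
sample = io.StringIO(
    "time,username,followers\n"
    "2020-01-01,alice,100\n"
    "2020-01-02,alice,112\n"
)
growth = follower_growth(sample)
print(growth)
```

For a real file, pass an open handle instead of the sample, for example `with open("stats.csv", newline="") as f: follower_growth(f)`.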

Settings:

Path where the profile JSON files are saved:

Settings.profile_location = os.path.join(BASE_DIR, 'profiles')

Whether the profile JSON file name gets a timestamp

Settings.profile_file_with_timestamp = True

Path where the commenters file is saved:

Settings.profile_commentors_location = os.path.join(BASE_DIR, 'profiles')

Whether the commenters file name gets a timestamp

Settings.profile_commentors_file_with_timestamp = True

Scrape & save the posts json

Settings.scrape_posts_infos = True

Maximum number of posts to scrape

Settings.limit_amount = 12000

Whether the comments are also saved in the JSON files

Settings.output_comments = False

Whether the mentions in the post image are saved in the JSON files

Settings.mentions = True

Whether the users who liked each post are saved in the JSON files. Attention: this takes a long time, because the script can load only 12 likes at a time before it pauses and loads again.

Settings.scrape_posts_likers = True

Whether the profile's followers are scraped. Attention: the crawler must be logged in (see above), and it sometimes crashes on huge accounts.

Settings.scrape_follower = True

Time between post scrolls (increase if you get errors)

Settings.sleep_time_between_post_scroll = 1.5

Time between comment loads (increase if you get errors)

Settings.sleep_time_between_comment_loading = 1.5

Output debug messages to Console

Settings.log_output_toconsole = True

Path to the logfile

Settings.log_location = os.path.join(BASE_DIR, 'logs')

Output debug messages to File

Settings.log_output_tofile = True

New logfile for every run

Settings.log_file_per_run = False

The information will be saved in a JSON file at ./profiles/{username}.json

Example of a file's data:

{
  "alias": "Tim Gro\u00dfmann",
  "username": "grossertim",
  "num_of_posts": 127,
  "posts": [
    {
      "caption": "It was a good day",
      "location": {
        "location_url": "https://www.instagram.com/explore/locations/345421482541133/caffe-fernet/",
        "location_name": "Caffe Fernet",
        "location_id": "345421482541133",
        "latitude": 1.2839,
        "longitude": 103.85333
      },
      "img": "https://scontent.cdninstagram.com/t51.2885-15/e15/p640x640/16585292_1355568261161749_3055111083476910080_n.jpg?ig_cache_key=MTQ0ODY3MjA3MTQyMDA3Njg4MA%3D%3D.2",
      "date": "2018-04-26T15:07:32.000Z",
      "tags": ["#fun", "#good", "#goodday", "#goodlife", "#happy", "#goodtime", "#funny", ...],
      "likes": 284,
      "comments": {
        "count": 0,
        "list": []
      }
    },
    {
      "caption": "Wild Rocket Salad with Japanese Sesame Sauce",
      "location": {
        "location_url": "https://www.instagram.com/explore/locations/318744905241462/junior-kuppanna-restaurant-singapore/",
        "location_name": "Junior Kuppanna Restaurant, Singapore",
        "location_id": "318744905241462",
        "latitude": 1.31011,
        "longitude": 103.85672
      },
      "img": "https://scontent.cdninstagram.com/t51.2885-15/e35/16122741_405776919775271_8171424637851271168_n.jpg?ig_cache_key=MTQ0Nzk0Nzg2NDI2ODc5MTYzNw%3D%3D.2",
      "date": "2018-04-26T15:07:32.000Z",
      "tags": ["#vegan", "#veganfood", "#vegansofig", "#veganfoodporn", "#vegansofig", ...],
      "likes": 206,
      "comments": {
        "count": 1,
        "list": [
          {
            "user": "pastaglueck",
            "comment": "nice veganfood"
          }
        ]
      }
    },
    .
    .
    .
  ],
  "prof_img": "https://scontent.cdninstagram.com/t51.2885-19/s320x320/14564896_1313394225351599_6953533639699202048_a.jpg",
  "followers": 1950,
  "following": 310
}
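To show how such a file might be consumed, here is a minimal sketch that computes a few aggregates from a profile. It relies only on the `username`, `posts`, `caption`, and `likes` keys shown in the example; the function name `summarize_profile` and the inline sample are illustrative and not part of the project.

```python
import json


def summarize_profile(profile):
    """Compute a few aggregate numbers from a crawled profile dict."""
    posts = profile.get("posts", [])
    total_likes = sum(post.get("likes", 0) for post in posts)
    # Post with the highest like count, or None when nothing was crawled.
    top_post = max(posts, key=lambda post: post.get("likes", 0), default=None)
    return {
        "username": profile.get("username"),
        "crawled_posts": len(posts),
        "avg_likes": total_likes / len(posts) if posts else 0.0,
        "most_liked_caption": top_post.get("caption") if top_post else None,
    }


# Shortened stand-in for the contents of ./profiles/{username}.json
profile = json.loads("""
{
  "username": "grossertim",
  "posts": [
    {"caption": "It was a good day", "likes": 284},
    {"caption": "Wild Rocket Salad with Japanese Sesame Sauce", "likes": 206}
  ]
}
""")
summary = summarize_profile(profile)
print(summary)
```

With a real crawl result, replace the inline sample with `json.load(open(path, encoding="utf-8"))`, where `path` points at the saved profile file.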

The script also collects the usernames of users who commented on the posts and saves them in the file ./profiles/{username}_commenters.txt, sorted by comment frequency.
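A similar ranking can be derived from the profile JSON itself, provided the comments were saved there. The sketch below is not the script's own implementation; it simply counts the `user` field of each entry in `comments.list`, following the structure of the example above. The function name and the second sample user are made up for illustration.

```python
from collections import Counter


def commenters_by_frequency(profile):
    """Return (username, comment_count) pairs, most frequent first."""
    counts = Counter(
        comment["user"]
        for post in profile.get("posts", [])
        for comment in post.get("comments", {}).get("list", [])
    )
    return counts.most_common()


# Minimal sample following the JSON layout of a crawled profile.
profile = {
    "posts": [
        {"comments": {"count": 0, "list": []}},
        {"comments": {"count": 1, "list": [
            {"user": "pastaglueck", "comment": "nice veganfood"},
        ]}},
        {"comments": {"count": 2, "list": [
            {"user": "pastaglueck", "comment": "great"},
            {"user": "example_user", "comment": "wow"},  # hypothetical user
        ]}},
    ]
}
ranking = commenters_by_frequency(profile)
print(ranking)
```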

With the help of Wordcloud, you could generate a word cloud from the tags you have used.
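One possible way to do this is sketched below, assuming the third-party `wordcloud` package is installed (`pip install wordcloud`). The helper names and the output path `tags.png` are illustrative choices; only the `posts` and `tags` keys come from the profile JSON.

```python
from collections import Counter


def tag_frequencies(profile):
    """Count how often each tag appears, with the leading '#' removed."""
    return Counter(
        tag.lstrip("#")
        for post in profile.get("posts", [])
        for tag in post.get("tags", [])
    )


def save_tag_cloud(frequencies, out_path="tags.png"):
    """Render the tag frequencies as an image file."""
    from wordcloud import WordCloud  # third-party; imported only when needed

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)


# Minimal sample following the JSON layout of a crawled profile.
profile = {"posts": [{"tags": ["#fun", "#happy"]}, {"tags": ["#happy"]}]}
freqs = tag_frequencies(profile)
print(freqs.most_common())
# save_tag_cloud(freqs)  # uncomment once the wordcloud package is installed
```

Sizing words by frequency rather than feeding raw text keeps multi-word or unusual tags intact, since each tag is treated as a single token.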


Have Fun & Feel Free to report any issues