
bernsteining / Instaloctrack

An Instagram OSINT tool to collect all the geotagged locations available on an Instagram profile in order to plot them on a map, and dump them in a JSON.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Instaloctrack

InstagramLocationScraper
No description or website provided.
Stars: ✭ 13 (-84.71%)
Mutual labels:  instagram, scraper, selenium
Instagram-Scraper-2021
Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).
Stars: ✭ 57 (-32.94%)
Mutual labels:  instagram, scraper, selenium
Scrapstagram
An Instagram Scrapper
Stars: ✭ 50 (-41.18%)
Mutual labels:  scraper, instagram, selenium
Spam Bot 3000
Social media research and promotion, semi-autonomous CLI bot
Stars: ✭ 79 (-7.06%)
Mutual labels:  scraper, instagram, selenium
Instagram-Comments-Scraper
Instagram comment scraper using python and selenium. Save the comments into excel.
Stars: ✭ 73 (-14.12%)
Mutual labels:  instagram, scraper, selenium
Osi.ig
Information Gathering Instagram.
Stars: ✭ 377 (+343.53%)
Mutual labels:  osint, scraper, instagram
Instagram Crawler
Get Instagram posts/profile/hashtag data without using Instagram API
Stars: ✭ 643 (+656.47%)
Mutual labels:  scraper, instagram
Socialmanagertools Igbot
🤖 📷 Instagram Bot made with love and nodejs
Stars: ✭ 699 (+722.35%)
Mutual labels:  instagram, selenium
Gisaid Scrapper
Scrapping tool for GISAID data regarding SARS-CoV-2
Stars: ✭ 25 (-70.59%)
Mutual labels:  scraper, selenium
Vue World Map
A Vue JS component for displaying dynamic data on a world map.
Stars: ✭ 33 (-61.18%)
Mutual labels:  geolocation, map
Operative Framework
operative framework is a OSINT investigation framework, you can interact with multiple targets, execute multiple modules, create links with target, export rapport to PDF file, add note to target or results, interact with RESTFul API, write your own modules.
Stars: ✭ 511 (+501.18%)
Mutual labels:  osint, scraper
Instagramfirstcommenter
This bot will post a predefined comment as fast as possible to a new post on the target profile. I used this to successfully win tickets for a big music festival.
Stars: ✭ 26 (-69.41%)
Mutual labels:  instagram, selenium
Botvid 19
Messenger Bot that scrapes for COVID-19 data and periodically updates subscribers via Facebook Messages. Created using Python/Flask, MYSQL, HTML, Heroku
Stars: ✭ 34 (-60%)
Mutual labels:  scraper, selenium
Instagram4j
📷 Instagram private API in Java
Stars: ✭ 629 (+640%)
Mutual labels:  scraper, instagram
Holehe
holehe allows you to check if the mail is used on different sites like twitter, instagram and will retrieve information on sites with the forgotten password function.
Stars: ✭ 568 (+568.24%)
Mutual labels:  osint, instagram
Instagram Profilecrawl
📝 quickly crawl the information (e.g. followers, tags etc...) of an instagram profile.
Stars: ✭ 816 (+860%)
Mutual labels:  instagram, selenium
Instagram Scraper
Scrapes an instagram user's photos and videos
Stars: ✭ 5,664 (+6563.53%)
Mutual labels:  scraper, instagram
Snoop
Snoop: an intelligence-gathering tool based on open data (OSINT world)
Stars: ✭ 886 (+942.35%)
Mutual labels:  osint, geolocation
Social Scraper
A collection of scripts for crawling data from Vietnamese-language social networks and websites
Stars: ✭ 47 (-44.71%)
Mutual labels:  scraper, instagram
Skraper
Kotlin/Java library and cli tool for scraping posts and media from various sources with neither authorization nor full page rendering (Facebook, Instagram, Twitter, Youtube, Tiktok, Telegram, Twitch, Reddit, 9GAG, Pinterest, Flickr, Tumblr, IFunny, VK, Pikabu)
Stars: ✭ 72 (-15.29%)
Mutual labels:  scraper, instagram

instaloctrack

TL;DR: asciinema, video of the project

A tool to scrape geotagged locations on Instagram profiles. Output in JSON & interactive map.

Requirements

sudo apt install chromium-chromedriver && sudo chmod a+x /usr/bin/chromedriver

🛠️ Installation

git clone https://github.com/bernsteining/instaloctrack
cd instaloctrack
pip3 install .

Or use Docker:

sudo docker build -t instaloctrack -f Dockerfile .

Usage

instaloctrack -h
usage: instaloctrack [-h] [-t TARGET_ACCOUNT] [-l LOGIN] [-p PASSWORD] [-v]

Instagram location data gathering tool. Usage: python3 instaloctrack.py -t <target_account>

optional arguments:
  -h, --help            show this help message and exit
  -t TARGET_ACCOUNT, --target TARGET_ACCOUNT
                        Instagram profile to investigate
  -l LOGIN, --login LOGIN
                        Instagram profile to connect to, in order to access
                        the instagram posts of the target account
  -p PASSWORD, --password PASSWORD
                        Password of the Instagram profile to connect to
  -v, --visual          Spawns Chromium GUI, otherwise Chromium is headless

e.g.

instaloctrack -t <target_account>

If the target profile is private and you have an account that follows it, you can scrape the data with a connected session:

instaloctrack -t <target_account> -l <your_account> -p <your_password>

or with Docker:

sudo docker run -v /tmp/output:/tmp/output instaloctrack -t <target_account> -o /tmp/output

⚙️ How it works

First, we retrieve the links to all the pictures of the account by scrolling through the whole Instagram profile, thanks to selenium's webdriver.
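A minimal sketch of this scrolling step, not the project's actual code. It assumes `driver` is a Selenium WebDriver already pointed at the profile page (only its `page_source` and `execute_script` members are used), and that post links appear in the HTML as `/p/<shortcode>/`. The names `collect_post_links`, `pause` and `max_rounds` are made up for the illustration.

```python
import re
import time

# Assumption: post permalinks show up as href="/p/<shortcode>/" in the page.
POST_LINK = re.compile(r'href="(/p/[^"/]+)/?"')


def collect_post_links(driver, pause=1.0, max_rounds=1000, sleep=time.sleep):
    """Scroll a loaded profile page to the bottom and gather post links."""
    links = []           # ordered result
    seen = set()         # used to skip duplicates
    last_height = None
    for _ in range(max_rounds):
        # The page may keep only part of the grid in the DOM, so links are
        # harvested after every scroll rather than once at the end.
        for path in POST_LINK.findall(driver.page_source):
            if path not in seen:
                seen.add(path)
                links.append("https://www.instagram.com" + path)
        # Jump to the bottom and read the resulting page height.
        height = driver.execute_script(
            "window.scrollTo(0, document.body.scrollHeight);"
            "return document.body.scrollHeight;"
        )
        if height == last_height:   # page did not grow: bottom reached
            break
        last_height = height
        sleep(pause)                # give the next batch time to load
    return links
```

Stopping when the page height no longer grows is a common heuristic; on a slow connection a larger `pause` is probably needed so the loop doesn't stop too early.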

Then, we fetch each picture link asynchronously (asyncio), check whether the picture description contains a location and, if it does, retrieve the location's data along with the timestamp.

  • NB: Instagram deprecated its location API in 2018, so it's no longer possible to get the GPS coordinates of a picture; all we can retrieve is the name of the location. (If you can prove me wrong about this, please tell me!)
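The asynchronous fetching can be sketched with plain asyncio as below. This is an illustration under assumptions, not the project's code: `fetch` stands for a hypothetical coroutine that downloads one post and returns its location data and timestamp (or `None` when the post has no location), and `max_concurrent` is an invented parameter that caps the number of requests in flight.

```python
import asyncio


async def fetch_all(links, fetch, max_concurrent=10):
    """Run fetch(link) for every link, at most max_concurrent at a time.

    Returns a list of (link, result) pairs in the same order as `links`;
    when a fetch raises, the exception object is stored as the result.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(link):
        async with semaphore:           # wait for a free slot
            try:
                return link, await fetch(link)
            except Exception as exc:    # record the failure and keep going
                return link, exc

    # gather() returns results in the order the coroutines were passed in.
    return await asyncio.gather(*(bounded(link) for link in links))
```

A call would look like `asyncio.run(fetch_all(links, fetch))`. Keeping failures in the result list instead of raising makes it easy to report them alongside the successful lookups.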

Because Instagram doesn't provide GPS coordinates and only gives us place names, we have to geocode them (i.e. get the GPS coordinates from the place's name).

For this, I used Nominatim's awesome API, which uses OpenStreetMap data. For our usage, no API key is required, and we respect Nominatim's usage policy by requesting GPS coordinates at most once per second.
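Here is a rough sketch of rate-limited geocoding against Nominatim's public `/search` endpoint, using only the standard library. It is not the project's geocoding function. The helper names, the `User-Agent` string and the caching of repeated place names are assumptions made for the example.

```python
import json
import time
import urllib.parse
import urllib.request

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(place_name):
    """Return the Nominatim search URL for a free-form place name."""
    query = urllib.parse.urlencode({"q": place_name, "format": "json", "limit": 1})
    return NOMINATIM_SEARCH + "?" + query


def http_get_json(url):
    """Fetch a URL and decode its JSON body (User-Agent is a placeholder)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "instaloctrack-example"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def geocode_all(place_names, get_json=http_get_json, delay=1.0, sleep=time.sleep):
    """Map each distinct place name to {"lat", "lon"}, or None if not found."""
    coords = {}
    for name in place_names:
        if name in coords:      # already looked up: no extra request
            continue
        if coords:              # pause between requests, not before the first
            sleep(delay)
        results = get_json(build_search_url(name))
        if results:
            best = results[0]   # limit=1, so this is the only candidate
            coords[name] = {"lat": best["lat"], "lon": best["lon"]}
        else:
            coords[name] = None
    return coords
```

Looking up each distinct name only once keeps the request count down, which matters when a profile tags the same place many times and every request costs a second.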

Finally, once we have all the GPS coordinates, we generate an HTML page (thanks to jinja2 templating) with embedded JavaScript that plots an OpenStreetMap map (thanks to the Leaflet library) with all our locations pinned. Once again, no API key is required for this step.
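To give an idea of the map generation, here is a self-contained sketch. It uses `string.Template` from the standard library as a stand-in for jinja2 so that it runs without dependencies, and it is not the project's real template. The `render_map` function, the Leaflet version pinned in the CDN URLs and the shape of the `locations` list are assumptions.

```python
import html
import json
from string import Template

# Stand-in for a jinja2 template: $title and $points are filled in below.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title</title>
  <link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css">
  <script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
  <style>#map { height: 100vh; }</style>
</head>
<body>
  <div id="map"></div>
  <script>
    var points = $points;
    var map = L.map('map').setView([0, 0], 2);
    L.tileLayer('https://tile.openstreetmap.org/{z}/{x}/{y}.png', {
      attribution: '&copy; OpenStreetMap contributors'
    }).addTo(map);
    points.forEach(function (p) {
      L.marker([p.lat, p.lon]).addTo(map).bindPopup(p.name);
    });
  </script>
</body>
</html>
""")


def render_map(title, locations):
    """Return an HTML page with one Leaflet marker per geocoded location."""
    points = [
        {
            # Popups are rendered as HTML, so place names are escaped first.
            "name": html.escape(loc["name"]),
            "lat": float(loc["lat"]),
            "lon": float(loc["lon"]),
        }
        for loc in locations
    ]
    return PAGE.substitute(
        title=html.escape(title),
        points=json.dumps(points, ensure_ascii=False),
    )
```

Writing the returned string to a `.html` file and opening it in a browser should show the pinned locations on OpenStreetMap tiles, with no API key involved.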

Also, the data collected by the script (location names, timestamps, GPS coordinates, errors) is dumped to a JSON file so that it can be reused.

Example

As an example, here's the output on the former French President's Instagram profile, @fhollande:

Map of @fhollande's locations on Instagram

The Heatmap:

Heatmap of @fhollande's locations on Instagram

Information available when clicking on a marker:

available data when clicking on a marker

Stats about the location data:

stats about the location data

The JSON data dump (just a part of it to show the format for a given location):

{
    "link": "https://www.instagram.com/p/-Q_9EvR9eu",
    "place": {
      "id": "290297",
      "name": "Musée du quai Branly - Jacques Chirac",
      "slug": "musee-du-quai-branly-jacques-chirac",
      "street_address": " 37 quai Branly",
      " zip_code": " 75007",
      " city_name": " Paris",
      " region_name": " ",
      " country_code": " FR"
    },
    "timestamp": "2015-11-19",
    "gps": {
      "lat": "48.8566969",
      "lon": "2.3514616"
    }
  }
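When reusing the dump, note that some keys and values in the sample above carry a leading space (`" zip_code"`, `" Paris"`). A small sketch of one way to normalize a record after loading it; the `strip_whitespace` helper is not part of the tool, and the record below is a shortened copy of the sample.

```python
import json


def strip_whitespace(value):
    """Recursively trim surrounding spaces from dict keys and string values."""
    if isinstance(value, dict):
        return {key.strip(): strip_whitespace(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_whitespace(item) for item in value]
    if isinstance(value, str):
        return value.strip()
    return value


# Shortened copy of the sample record shown above.
record = json.loads("""
{
  "link": "https://www.instagram.com/p/-Q_9EvR9eu",
  "place": {
    "name": "Musée du quai Branly - Jacques Chirac",
    " zip_code": " 75007",
    " city_name": " Paris",
    " region_name": " ",
    " country_code": " FR"
  },
  "timestamp": "2015-11-19",
  "gps": {"lat": "48.8566969", "lon": "2.3514616"}
}
""")

clean = strip_whitespace(record)
print(clean["place"]["city_name"], clean["place"]["country_code"])  # Paris FR
```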

Possible Improvements

  • Cleaner code :D
  • Refactor the geocoding function, which is waaay too long and cryptic
  • Use beautifulsoup instead of regex parsing
  • Remove weird blank space caused by progress bar
  • Use other geocoding tools (e.g. https://geo.api.gouv.fr/adresse) than Nominatim when it fails? (specify arg?)
    • Use geopy ?
    • Use Overpass instead of Nominatim ?
  • Add an argument to select only a set of pictures (selected by date, or rank)
  • Report how long the script took to run