JonasSchroeder / InstaCrawlR

Licence: other
Crawl public Instagram data using R scripts without API access token. See InstaCrawlR Instructions.pdf

Programming language: R

Update September 2019: The script databaseCreator.R is working again. The previous version stopped working properly some time ago and left the lists mentions, hashtags, and text without values (NULL).

Update July 2019: I can confirm that the scripts still work as intended. In response to multiple requests on Twitter and LinkedIn, I've updated the jsonReader.R script so it exports an additional column (post_URL) that you can feed directly into databaseCreator.R as input.

Update March 2019: Instagram seems to change the structure of their response from time to time. I fixed the issue. Read more about it here: https://medium.com/@jonas.schroeder1991/update-instacrawlr-still-crawling-6500cd376ea3

Update October 2018: I added a new script (databaseCreator.R) which enables you to build your own Instagram database that you can use for Social Media Monitoring, comparing and selecting Influencers, or Competitive Analyses. databaseCreator scrapes Instagram based on a list of post URLs for Post Meta Data (text, hashtags, mentions, number of likes and comments) and Profile Meta Data (Author's @handle, number of followers, following, and posts).

More about databaseCreator in this Medium article: https://medium.com/@jonas.schroeder1991/build-your-own-instagram-database-134281e8ee92
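To make the idea of Post Meta Data more concrete, here is a minimal sketch of how hashtags and mentions could be pulled out of caption text once it has been collected. It is not the code from databaseCreator.R: the helper extract_field, the example URLs, and the captions are all hypothetical, and no request to Instagram is made. Fields with no match are stored as NA rather than NULL.

```r
# Hypothetical input: post URLs with caption text that has already been fetched.
# The URLs and captions are made up; no network request is made here.
posts <- data.frame(
  post_URL = c("https://www.instagram.com/p/EXAMPLE1/",
               "https://www.instagram.com/p/EXAMPLE2/"),
  text = c("Great day with @alice and @bob.smith. #friends #Weekend",
           "Just a plain caption"),
  stringsAsFactors = FALSE
)

# Return every match of `pattern` in one caption as a single space-separated
# string, or NA when nothing matches (so the field is never left as NULL).
extract_field <- function(text, pattern) {
  hits <- regmatches(text, gregexpr(pattern, text))[[1]]
  hits <- sub("\\.+$", "", hits)  # drop sentence-ending dots, e.g. "@bob.smith."
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = " ")
}

# Hashtags: "#" followed by letters, digits, or underscores.
posts$hashtags <- vapply(posts$text, extract_field, character(1),
                         pattern = "#[[:alnum:]_]+", USE.NAMES = FALSE)

# Mentions: "@" followed by letters, digits, underscores, or dots.
posts$mentions <- vapply(posts$text, extract_field, character(1),
                         pattern = "@[[:alnum:]_.]+", USE.NAMES = FALSE)

print(posts[, c("post_URL", "hashtags", "mentions")])
```

Storing the matches as one string per post keeps the result a flat table that can be written with write.csv; a list column would preserve the individual tags but is harder to export.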


InstaCrawlR

Crawl public Instagram data using R scripts without API access token.

Here's an example: https://medium.com/@jonas.schroeder1991/social-network-analysis-of-related-hashtags-on-instagram-using-instacrawlr-46c397cb3dbe

Please consult "InstaCrawlR Instructions.pdf" for more information on what InstaCrawlR can and can't do and how to use it.

Jonas


Instagram is constantly changing their API’s functionality (platform changelog). Following Facebook’s Cambridge Analytica incident and the resulting public pressure, API use was restricted even more severely in April 2018. The limit is now 200 calls per user per hour instead of 5,000. Further restrictions have been announced for July and December 2018.

The company’s rationale for restricting access to data is probably to prevent spamming behavior and data exploitation. However, since social media platforms are now an integral part of everyday life, data gathered from these services have become more and more interesting for academic researchers.

In 2016, Instagram totally changed their API system. Developers have to submit their app to a rigorous permission review process in order to get an access token. Since academic researchers are not programming applications that are suitable for this review process (e.g., video-screen casting the app’s functionality from an end user’s point of view), they are basically unable to officially access valuable data for their research.

InstaCrawlR is a collection of R scripts that can be used to crawl public Instagram data without the need to have access to the official API. Its functionality is limited compared to what is possible using the official API. However, it seems to be the only option for non-developers to gather and analyze Instagram data.

Please note two things: As of July 2018, the scripts run as intended. This can change at any time since Instagram is constantly limiting their API’s functionality. Also keep in mind that using these scripts can have legal consequences since Instagram does not allow automated scripts. I am not responsible for consequences of any kind.

USE AT YOUR OWN RISK. BE ETHICAL WITH USER DATA.


What it can do

InstaCrawlR consists of four scripts (jsonReader, hashtagExtractor, graphCreator, and g2gephi), which are described in the instruction PDF. InstaCrawlR can be used to download and analyze the most recent posts for any specific hashtag that can be found on Instagram’s Explore page (instagram.com/explore/tags/HASHTAG/). More specifically it can:

• Download the most recent posts for any hashtag

• Export a csv file that shows post ID, URL, number of likes, post owner ID, post text, and post date

• Automatically extract related hashtags from post text

• Automatically download images, too

• Export related hashtags and frequency

• Create a graph showing the relationship of related hashtags (social network analysis)

• Export graph for further analysis in Gephi
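As a rough illustration of the hashtag steps listed above (extracting related hashtags, counting their frequency, and preparing a graph export), here is a minimal sketch in base R. It is not the code of hashtagExtractor, graphCreator, or g2gephi: the example captions, the function extract_hashtags, and the output file name are hypothetical. The column names Source, Target, and Weight are an assumption based on the edge-list layout that Gephi's spreadsheet import commonly accepts.

```r
# Hypothetical example captions; real input would be the post text column
# of the exported csv file.
posts <- c(
  "Sunset at the beach #travel #sunset #beach",
  "City trip #travel #citylife",
  "No tags here",
  "Morning run #fitness #Sunset #travel"
)

# Extract the hashtags of one caption, lower-cased and de-duplicated.
extract_hashtags <- function(text) {
  tags <- regmatches(text, gregexpr("#[[:alnum:]_]+", text))[[1]]
  unique(tolower(tags))
}

tags_per_post <- lapply(posts, extract_hashtags)

# Frequency of each hashtag across all posts, most frequent first.
tag_freq <- sort(table(unlist(tags_per_post)), decreasing = TRUE)
freq_df <- data.frame(hashtag = names(tag_freq),
                      frequency = as.integer(tag_freq),
                      stringsAsFactors = FALSE)

# Co-occurrence edges: every pair of hashtags that appears in the same post.
# Tags are sorted first so that each pair is always stored in the same order.
# (Assumes at least one post contains two or more hashtags.)
pairs <- lapply(tags_per_post, function(tags) {
  if (length(tags) < 2) return(NULL)
  t(combn(sort(tags), 2))
})
edges <- as.data.frame(do.call(rbind, pairs), stringsAsFactors = FALSE)
names(edges) <- c("Source", "Target")
edges$Weight <- 1L

# Sum up repeated pairs so Weight counts how many posts share both hashtags.
edges <- aggregate(Weight ~ Source + Target, data = edges, FUN = sum)

# Write the edge list to a temporary csv file (hypothetical file name).
edge_file <- file.path(tempdir(), "hashtag_edges.csv")
write.csv(edges, edge_file, row.names = FALSE)

print(freq_df)
print(edges)
```

Counting each hashtag only once per post keeps a caption that repeats the same tag from inflating the frequencies and edge weights.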

What it can’t do

• No specification of a certain timeframe (only most recent)

• No information on who liked the posts (only counter)

• Only post owner ID, not profile name

• Suspicious posts must be filtered out by hand using Excel

• No location information available

Please consult the instructions PDF for details.

Closing Words

You can use the scripts or parts of them in your own code. Please note that I am not a professional developer or trained programmer. I am sure InstaCrawlR’s code can be simplified and improved a lot. Feel free to clean up my code or change it to increase its capabilities. Again, use the scripts at your own risk. I am not liable for any consequences. InstaCrawlR may only function for a limited time since Instagram is constantly changing their system. I will not necessarily support InstaCrawlR in the future. If you have any comments or suggestions, you can reach me on LinkedIn. I am always looking forward to a nice conversation about the future of digital marketing, entrepreneurship, and data science.

Best regards,
Jonas Schröder
University of Mannheim, July 2018
