dean9703111 / social_crawler

Licence: MIT License
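Sample code for the book 《JavaScript 爬蟲新思路!從零開始帶你用 Node.js 打造 FB&IG 爬蟲專案》 (A New Approach to JavaScript Crawlers: Building an FB & IG Crawler Project with Node.js from Scratch)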

Programming Languages

javascript
shell

Projects that are alternatives to, or similar to, social_crawler

Auto-Download-QQMail-Attach
Python + Selenium + Chrome: simulates logging in to QQ Mail, batch-downloads attachments, and renames them locally
Stars: ✭ 38 (+137.5%)
Mutual labels:  selenium-webdriver
google-sheet-to-github-website
This is a working project for operating a data driven website on Github Pages using Google Sheets as a data source.
Stars: ✭ 20 (+25%)
Mutual labels:  google-sheets
page-modeller
⚙️ Browser DevTools extension for modelling web pages for automation.
Stars: ✭ 66 (+312.5%)
Mutual labels:  selenium-webdriver
snap2
Advanced tooling for puzzle hunts: grid/crossword parser, crossword tool to fill in the grid when entering answers, heavy-duty pattern/anagram solver, and more
Stars: ✭ 14 (-12.5%)
Mutual labels:  google-sheets
awesome-placekey
😎 Awesome lists about awesome placekey related frameworks, libraries, software, tools, and resources
Stars: ✭ 21 (+31.25%)
Mutual labels:  google-sheets
discord2sheet-bot
Discord bot that stores messages to Google Sheet.
Stars: ✭ 40 (+150%)
Mutual labels:  google-sheets
Mosque-Screen
Chat: https://discord.gg/CG7frj2 - Email: [email protected]. We do not provide any support, this is a volunteer-based project therefore we cannot commit to any time to resolve local issues.
Stars: ✭ 54 (+237.5%)
Mutual labels:  google-sheets
esp32-weather-google-sheets
Weather station based on ESP32 and MicroPython with sending data to Google Sheets
Stars: ✭ 48 (+200%)
Mutual labels:  google-sheets
nakal
A MySQL backup tool for Google Sheets, written in Node.js.
Stars: ✭ 14 (-12.5%)
Mutual labels:  google-sheets
PWAF
Python Webdriver Automation Framework
Stars: ✭ 37 (+131.25%)
Mutual labels:  selenium-webdriver
gsheet to arb
Import translations (ARB/Dart) from Google Sheets
Stars: ✭ 21 (+31.25%)
Mutual labels:  google-sheets
sheets-database
Library to help use a Google Sheet as a database
Stars: ✭ 36 (+125%)
Mutual labels:  google-sheets
PaperScraper
A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university accessible journals.
Stars: ✭ 63 (+293.75%)
Mutual labels:  selenium-webdriver
BadgeHub
Raspberry Pi, Dymo Turbo Writer 450 badge printing service that logs user information such as name and email and prints a name badge and QR code associated with that information.
Stars: ✭ 25 (+56.25%)
Mutual labels:  google-sheets
LBDuoDian
No description or website provided.
Stars: ✭ 21 (+31.25%)
Mutual labels:  selenium-webdriver
Z-Spider
Tips and examples for crawler development
Stars: ✭ 33 (+106.25%)
Mutual labels:  selenium-webdriver
headless-chrome
Implementation of the new headless chrome with chromedriver and selenium.
Stars: ✭ 34 (+112.5%)
Mutual labels:  selenium-webdriver
atata-kendoui
A set of Atata components for Kendo UI
Stars: ✭ 17 (+6.25%)
Mutual labels:  selenium-webdriver
wdio-video-reporter
Reporter for WebdriverIO v6 that makes videos of failed tests and has optional allure integration
Stars: ✭ 54 (+237.5%)
Mutual labels:  selenium-webdriver
lifebot
Use Google Sheets to log your life by texting it Emojis and pulling in data from Fitbit automatically.
Stars: ✭ 15 (-6.25%)
Mutual labels:  google-sheets

A New Approach to JavaScript Crawlers!

Building an FB & IG Crawler Project with Node.js from Scratch

If you are interested, the book is available from Tenlong Bookstore (天瓏書局). Thank you for your support. Purchase link

Reference Resources by Chapter

PART 2 Development Environment Introduction & Setup

Ch3. Development Environment Introduction & Setup

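PART 3 Basic Knowledge for Programming (Node.js)

Ch4. Basic Principles to Keep in Mind When Writing Code
Ch5. Getting to Know a Node.js Project
Ch6. Installing and Managing Packages with Yarn
Ch7. Using ".env" to Manage Environment Variables and Migrate Projects Quickly
Ch8. Using ".gitignore" to Exclude Files from Version Control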
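
As a rough illustration of the idea behind Ch7, the sketch below reads settings from environment variables and fails early when one is missing. The key names `FB_USERNAME` and `FB_PASSWORD` and the helper `loadConfig` are hypothetical; this README does not show the keys the book actually uses. It assumes the variables have already been loaded into `process.env` (for example by a ".env" loader).

```javascript
// Hypothetical key names: the book's real ".env" keys are not listed in this README.
const REQUIRED_KEYS = ['FB_USERNAME', 'FB_PASSWORD'];

// Builds a config object from an env-like object (defaults to process.env)
// and throws a descriptive error if any required key is missing or empty.
function loadConfig(env = process.env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    fbUsername: env.FB_USERNAME,
    fbPassword: env.FB_PASSWORD,
  };
}
```

Keeping credentials out of the source this way is also why the ".env" file itself would normally be listed in ".gitignore" (Ch8).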

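PART 4 Crawling Web Page Data with selenium-webdriver

Ch9. Before You Crawl
Ch10. Getting to Know selenium-webdriver: A What-You-See-Is-What-You-Get Crawling Tool
Ch11. The First Step of Crawling: Logging In to FB
Ch12. Dismissing Browser Pop-ups and Getting an FB Fan Page's Follower Count
Ch13. Applying What You Learned: Details You Cannot Ignore When Crawling IG
Ch14. Merging the FB and IG Crawlers
Ch15. Refactoring the Code to Reduce Legacy Baggage
Ch16. Using try-catch to Catch Errors That Occur While Crawling
Ch17. JSON x Crawler = Automating Chores
Ch18. Validating That a JSON File's Contents Match the Expected Format
Ch19. Tips for Optimizing the Crawler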
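
The topics of Ch16 and Ch18 can be sketched without a browser. The example below is only an assumption about how such code might look: the JSON shape (an array of objects with `title` and `url`) and the helpers `validateFanpages` and `crawlAll` are hypothetical, and `crawlOne` stands in for whatever selenium-webdriver routine fetches a follower count.

```javascript
// Assumed input shape: [{ "title": "...", "url": "https://..." }, ...].
// The book's real JSON format may differ.
function validateFanpages(jsonText) {
  let pages;
  try {
    pages = JSON.parse(jsonText);
  } catch (err) {
    throw new Error(`Invalid JSON: ${err.message}`);
  }
  if (!Array.isArray(pages)) {
    throw new Error('Expected a JSON array of fan pages');
  }
  pages.forEach((page, index) => {
    if (page === null || typeof page !== 'object') {
      throw new Error(`Entry ${index} is not an object`);
    }
    for (const key of ['title', 'url']) {
      if (typeof page[key] !== 'string' || page[key].trim() === '') {
        throw new Error(`Entry ${index} is missing a non-empty "${key}"`);
      }
    }
  });
  return pages;
}

// Calls crawlOne(page) for each page; a failure on one page is recorded
// in the results instead of aborting the whole run.
async function crawlAll(pages, crawlOne) {
  const results = [];
  for (const page of pages) {
    try {
      const followers = await crawlOne(page);
      results.push({ title: page.title, followers });
    } catch (err) {
      results.push({ title: page.title, error: err.message });
    }
  }
  return results;
}
```

Validating the list before the browser starts means a malformed entry is reported immediately, while the try-catch inside the loop keeps one broken page from stopping the others.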

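PART 5 Storing Crawled Data in Google Sheets

Ch21. A Good Free Option for Storing Data: Getting Started on a Budget
Ch22. Understanding What the Official Example Does
Ch23. Lost in the Docs? Two Sheet-Handling Examples to Help You Navigate the Official Documentation
Ch24. Writing Crawled Data and Saying Goodbye to Copy & Paste
Ch25. Client: "The crawled data went into the wrong place!" How to Handle a Reported Bug
Ch26. Client: "I want new data inserted at the front!" How to Discuss a Requirements Change
Ch27. Improving Formatting to Meet Client Needs & a Word on User Experience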

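PART 6 Scheduling the Crawler to Run Automatically

Ch28. Using the schedule Package to Make the Crawler Run on Its Own
Ch29. Managing the Schedule with the pm2 Package: Running in the Background Is the Way to Go!
Ch30. Why Didn't the Crawler Run Today? Try the System's Built-in Scheduler!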
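
To show the idea of a daily schedule without any dependency, here is a minimal sketch that uses Node's built-in `setTimeout` rather than the schedule package the book uses. The helper names `msUntilNextRun` and `scheduleDaily` are hypothetical, and times are interpreted in the machine's local time zone.

```javascript
// Milliseconds from `now` until the next occurrence of hour:minute (local time).
// If that time has already passed today (or is exactly now), tomorrow is used.
function msUntilNextRun(hour, minute, now = new Date()) {
  const next = new Date(now);
  next.setHours(hour, minute, 0, 0);
  if (next <= now) {
    next.setDate(next.getDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Runs `task` every day at hour:minute by re-arming a timer after each run.
// Errors thrown by the task are logged so the schedule keeps going.
function scheduleDaily(hour, minute, task) {
  setTimeout(async () => {
    try {
      await task();
    } catch (err) {
      console.error('Scheduled task failed:', err);
    }
    scheduleDaily(hour, minute, task);
  }, msUntilNextRun(hour, minute));
}
```

For example, `scheduleDaily(22, 30, runCrawler)` would call a hypothetical `runCrawler` function at 22:30 each day for as long as the process stays alive, which is the gap a process manager or the system scheduler (Ch29, Ch30) is meant to cover.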

PART 7 Reporting Crawler Status via LINE

Ch31. Learning How to Use LINE Notify with POSTMAN
Ch32. Sending LINE Notifications with axios
Ch33. Integrating LINE Notifications into the Crawler: Project Complete!
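
The request that Ch31 and Ch32 revolve around can be sketched as a plain config object. LINE Notify takes a form-encoded `message` field and a bearer token at `https://notify-api.line.me/api/notify`; the helper name `buildLineNotifyRequest` and the `LINE_TOKEN` variable mentioned below are hypothetical, and the book's own code may be organised differently.

```javascript
const LINE_NOTIFY_URL = 'https://notify-api.line.me/api/notify';

// Builds a request config in the shape axios accepts (method, url, headers, data).
// Nothing is sent here; the caller decides when to perform the request.
function buildLineNotifyRequest(token, message) {
  if (!token) {
    throw new Error('A LINE Notify token is required');
  }
  return {
    method: 'post',
    url: LINE_NOTIFY_URL,
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    // URLSearchParams takes care of form-encoding the message text.
    data: new URLSearchParams({ message }).toString(),
  };
}
```

With axios installed, sending would look like `axios(buildLineNotifyRequest(process.env.LINE_TOKEN, 'Crawl finished'))`. Separating the request construction from the network call makes the message format easy to check on its own.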

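Disclaimer: The tutorials and sample code in the book only fetch publicly available data for research purposes. No organization or individual may use these techniques to steal others' intellectual property or to damage websites; otherwise, that organization or individual bears all consequences. The author assumes no legal or joint liability!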

Changelog

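2021.11.15: In response to an FB redesign, fine-tuned the crawler logic to fix an issue with determining the precision of the "follower count". See the related commit: link
2021.11.18: In response to an IG redesign, adjusted the login detection code and updated some example links. See the related commit: link
2021.12.13: In response to an IG redesign, adjusted the XPath used to fetch the follower count. See the related commit: link
2021.12.17: In response to an IG redesign, adjusted the XPath used to fetch the follower count (IG has been changing things back and forth a lot lately, QQ). See the related commit: link
2021.2.13: In response to an IG redesign, adjusted the XPath used to fetch the follower count (IG often makes subtle changes to the path). See the related commit: link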