23 open source projects by internetarchive

1. Wayback Machine Webextension
A web browser extension for Chrome, Firefox, Edge, and Safari 14.
✭ 231
javascript
2. Warc
Python library for reading and writing WARC files
✭ 209
python
3. Heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
4. Openlibrary Client
Python Client Library for the Archive.org OpenLibrary API
✭ 138
python
5. Warctools
Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
✭ 92
python
6. Umbra
A queue-controlled browser automation tool for improving web crawl quality
✭ 56
python
7. Internet Archive Voice Apps
Voice apps (Actions on Google, Alexa Skill) for the Internet Archive. Just say: "Ok Google, Ask Internet Archive to Play Jazz" or "Alexa, Ask Internet Archive to play Instrumental Music"
8. Epub
For code related to making ePub files
✭ 37
python
9. Bookreader
The Internet Archive BookReader
10. Brozzler
brozzler - distributed browser-based web crawler
✭ 457
python
12. Warcprox
WARC writing MITM HTTP/S proxy
✭ 255
python
13. liveweb
Liveweb proxy of the Wayback Machine project
✭ 40
python
14. xfetch
Cache stampede test harness. Code accompanies the presentation made at RedisConf 2017, 30 May to 1 June, 2017, in San Francisco.
✭ 18
PHP
17. dweb-transports
No description, website, or topics provided.
18. openlibrary-bots
A repository of cleanup bots implementing the openlibrary-client
19. dweb-archive
No description, website, or topics provided.
20. trough
Trough: Big data, small databases.
21. iaux
Monorepo for Archive.org UX development and prototyping.
23. webarchive-commons
No description, website, or topics provided.
✭ 16