@internetarchive

Internet Archive

The Internet Archive is "the library of the Internet", and a big supporter of Free Software.

Pinned

  1. openlibrary Public

    One webpage for every book ever published!

    Python · 5.4k stars · 1.4k forks

  2. bookreader Public

    The Internet Archive BookReader

    JavaScript · 1k stars · 428 forks

  3. heritrix3 Public

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java · 2.9k stars · 762 forks

  4. cicd Public

    build & test using github registry; deploy to nomad clusters

    14

Repositories

Showing 10 of 253 repositories
  • brozzler Public

    brozzler - distributed browser-based web crawler

    Python · 684 stars · Apache-2.0 · 98 forks · 32 open issues · 17 open pull requests · Updated Feb 15, 2025
  • bookreader Public

    The Internet Archive BookReader

    JavaScript · 1,022 stars · AGPL-3.0 · 428 forks · 136 open issues (3 issues need help) · 94 open pull requests · Updated Feb 14, 2025
  • archive-hocr-tools Public

    Efficient hOCR tooling

    Python · 42 stars · 9 forks · 2 open issues · 1 open pull request · Updated Feb 14, 2025
  • Zeno Public

    State-of-the-art web crawler 🔱

    HTML · 109 stars · AGPL-3.0 · 17 forks · 21 open issues (1 issue needs help) · 4 open pull requests · Updated Feb 14, 2025
  • heritrix3 Public

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java · 2,896 stars · 762 forks · 34 open issues · 4 open pull requests · Updated Feb 14, 2025
  • openlibrary Public

    One webpage for every book ever published!

    Python · 5,440 stars · AGPL-3.0 · 1,449 forks · 796 open issues (34 issues need help) · 145 open pull requests · Updated Feb 14, 2025
  • iaux-collection-browser
    TypeScript · 6 stars · AGPL-3.0 · 1 fork · 2 open issues · 15 open pull requests · Updated Feb 13, 2025
  • internetarchivebot
    PHP · 132 stars · AGPL-3.0 · 34 forks · 0 open issues · 2 open pull requests · Updated Feb 13, 2025
  • iare Public

    An interactive IARI JSON viewer

    JavaScript · 5 stars · AGPL-3.0 · 5 forks · 32 open issues · 2 open pull requests · Updated Feb 13, 2025
  • iaux-notification-toast Public

    displays notifications and automatically clears them

    TypeScript · 0 stars · AGPL-3.0 · 0 forks · 1 open issue · 12 open pull requests · Updated Feb 12, 2025