
jirkapinkas / jsitemapgenerator

Licence: MIT license
Java sitemap generator. This library generates web sitemaps, can ping Google, and can also generate RSS feeds, robots.txt and more, all through a friendly, easy-to-use Java 8 functional-style API.

Java sitemap generator

This library generates a web sitemap and can ping Google to announce that it has changed. It can also generate an RSS feed and robots.txt. It has a friendly, easy-to-use Java 8 functional API and is AWS Lambda friendly.

Typical usage:

First, add this library to the classpath:

<dependency>
  <groupId>cz.jiripinkas</groupId>
  <artifactId>jsitemapgenerator</artifactId>
  <version>4.5</version>
</dependency>

If you want to use the "ping Google / Bing" functionality, also add OkHttp to the classpath:

<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>4.2.2</version> <!-- latest version should be fine, get latest version from https://javalibs.com/artifact/com.squareup.okhttp3/okhttp -->
</dependency>

Typical usage (web sitemap):

String sitemap = SitemapGenerator.of("https://example.com")
    .addPage("foo2.html") // simplest way to add a page: shorthand for addPage(WebPage.of("foo2.html"))
    .addPage(WebPage.of("foo1.html")) // same as addPage("foo1.html")
    .addPage(WebPage.builder().name("bar.html").build()) // builder is more complex
    .addPage(WebPage.builder().maxPriorityRoot().build()) // builder has lots of useful methods
    .toString();

or sitemap in gzip format:

byte[] sitemap = SitemapGenerator.of("https://example.com")
    .addPage(WebPage.builder().maxPriorityRoot().build())
    .addPage("foo.html")
    .addPage("bar.html")
    .toGzipByteArray();
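
To check what ended up in the compressed bytes, they can be decompressed with plain JDK classes. The following is a minimal sketch, not part of jsitemapgenerator: the class GzipCheck and its methods are hypothetical names, gzip() is only a stand-in for toGzipByteArray() so the example runs on its own, and UTF-8 is assumed as the sitemap encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of jsitemapgenerator.
final class GzipCheck {

    // Compresses text to gzip bytes (stand-in for toGzipByteArray(); UTF-8 is an assumption).
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed at this point, so the buffer holds the complete archive.
        return buffer.toByteArray();
    }

    // Decompresses gzip bytes back into a UTF-8 string.
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the byte[] from the example above, `String xml = GzipCheck.gunzip(sitemap);` should give back the readable sitemap XML.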

You can also set defaults that apply to the WebPages added afterwards:

String sitemap = SitemapGenerator.of("https://example.com")
    .addPage(WebPage.builder().maxPriorityRoot().build()) // URL will be: "/"
    .defaultExtension("html")
    .defaultDir("dir1")
    .addPage("foo") // URL will be: "dir1/foo.html"
    .addPage("bar") // URL will be: "dir1/bar.html"
    .defaultDir("dir2")
    .addPage("hello") // URL will be: "dir2/hello.html"
    .addPage("yello") // URL will be: "dir2/yello.html"
    // btw. specifying dir and / or extension on WebPage overrides default settings
    .addPage(WebPage.builder().dir("dir3").extension(null).name("test").build()) // "dir3/test"
    .resetDefaultDir() // resets default dir
    .resetDefaultExtension() // resets default extension
    .addPage(WebPage.of("mypage")) // URL will be: "mypage"
    .toString();

or with a list of pages:

List<String> pages = Arrays.asList("firstPage", "secondPage", "otherPage");
String sitemap = SitemapGenerator.of("https://example.com")
        .addPage(WebPage.builder().nameRoot().priorityMax().build())
        .defaultDir("dirName")
        .addPages(pages, page -> WebPage.of(page))
        .toString();

or with a list of pages held in a custom type:

class News {
    private String name;
    public News(String name) { this.name = name; }
    public String getName() { return name; }
}
List<News> newsList = Arrays.asList(new News("a"), new News("b"), new News("c"));
String sitemap = SitemapGenerator.of("https://example.com")
        .addPage(WebPage.builder().nameRoot().priorityMax().build())
        .defaultDir("news")
        .addPages(newsList, news -> WebPage.of(news::getName))
        .toString();

or to store it in a file and ping Google:

Ping ping = Ping.builder()
        .engines(Ping.SearchEngine.GOOGLE)
        .build();
SitemapGenerator.of("https://example.com")
    .addPage(WebPage.builder().maxPriorityRoot().changeFreqNever().lastModNow().build())
    .addPage("foo.html")
    .addPage("bar.html")
    // generate sitemap and save it to file ./sitemap.xml
    .toFile(Paths.get("sitemap.xml"))
    // inform Google that this sitemap has changed
    .ping(ping) // this requires OkHttp on the classpath!
    .callOnSuccess(() -> System.out.println("Pinged Google")) // what will happen on success
    .catchOnFailure(e -> System.out.println("Could not ping Google!")); // what will happen on error

Note: To ping Google / Bing, you can either use the built-in support (requires OkHttp on the classpath) or supply your own HTTP client. Supported clients: a custom OkHttpClient, CloseableHttpClient (Apache HttpClient) and RestTemplate (from Spring). To use your own client, call the matching httpClient*() method on PingBuilder and pass in your instance.

How to create sitemap index:

String sitemapIndex = SitemapIndexGenerator.of("https://javalibs.com")
    .addPage("sitemap-plugins.xml")
    .addPage("sitemap-archetypes.xml")
    .toString();
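
A common reason to need an index is the sitemap protocol's limit of 50,000 URLs per sitemap file. The following is a minimal sketch, using only the JDK, of splitting a large page list into chunks that each fit in one sitemap. The class SitemapChunks, its methods and the "sitemap-N.xml" naming scheme are assumptions made for illustration, not part of jsitemapgenerator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jsitemapgenerator.
final class SitemapChunks {

    // Limit from the sitemaps.org protocol: at most 50,000 URLs per sitemap file.
    static final int MAX_URLS_PER_SITEMAP = 50_000;

    // Splits pages into consecutive chunks of at most chunkSize elements, keeping their order.
    static <T> List<List<T>> split(List<T> pages, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < pages.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, pages.size());
            // Copy the sub-list so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(pages.subList(from, to)));
        }
        return chunks;
    }

    // File name for the chunk at the given 0-based index; the naming scheme is an assumption.
    static String fileName(int index) {
        return "sitemap-" + (index + 1) + ".xml";
    }
}
```

Each chunk could then go to its own SitemapGenerator through addPages() and be written with toFile(), while every generated file name is registered in the index with SitemapIndexGenerator.addPage().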

How to create RSS channel:

RSS ISN'T a sitemap :-), but it's basically just a list of links (like a sitemap), and if you need a sitemap you probably also need RSS. Note: RssGenerator shares many methods with SitemapGenerator.

String rss = RssGenerator.of("https://topjavablogs.com", "Top Java Blogs", "Best Java Blogs")
    .addPage(WebPage.rssBuilder()
        .pubDate(LocalDateTime.now())
        .title("News Title")
        .description("News Description")
        .link("page-name")
        .build())
    .toString();

How to create robots.txt:

robots.txt ISN'T a sitemap :-), but it references your sitemap, and if you need a sitemap you probably need robots.txt as well :-)

String robotsTxt = RobotsTxtGenerator.of("https://example.com")
        .addSitemap("sitemap.xml")
        .addRule(RobotsRule.builder().userAgentAll().allowAll().build())
        .toString();

Best practices & performance

  • SitemapGenerator (and the other Generator classes) are builders, so they are mutable.
  • Sharing one SitemapGenerator as a singleton while several threads call addPage() and toString() on it is not advised: SitemapGenerator operations are not thread-safe. The one exception is SitemapGenerator.of(), which creates a new SitemapGenerator instance.
  • addPage() stores the page in a Map keyed by the page's URL, so a sitemap cannot contain two items with the same URL.
  • The terminal operations toString(), toFile() and toGzipByteArray() generate the final sitemap from that Map, so most of the time spent creating a sitemap goes into the terminal operation.
  • If you need raw speed when serving the sitemap, either:
    • save the sitemap to an external file and serve the data from that file, or
    • cache the result of the terminal operation.
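
The caching suggestion in the list above can be sketched with plain JDK classes. This is one possible approach under stated assumptions, not the library's own mechanism: CachedSitemap is a hypothetical class, and the Supplier passed to it is assumed to build a fresh generator on every call (for example by starting from SitemapGenerator.of(...)) and return the result of its terminal operation, so no generator instance is shared between threads.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of jsitemapgenerator.
final class CachedSitemap {

    // Builds a complete sitemap string; assumed to create a new generator on every call.
    private final Supplier<String> generator;

    // Last generated sitemap; volatile so readers on other threads see the latest value.
    private volatile String cached;

    CachedSitemap(Supplier<String> generator) {
        this.generator = generator;
    }

    // Returns the cached sitemap, generating it once on first use.
    String get() {
        String result = cached;
        if (result == null) {
            synchronized (this) {
                result = cached;
                if (result == null) {
                    result = generator.get();
                    cached = result;
                }
            }
        }
        return result;
    }

    // Regenerates the sitemap, for example after pages were added or removed.
    synchronized void refresh() {
        cached = generator.get();
    }
}
```

Readers only ever touch the immutable String, so the expensive terminal operation runs once per refresh instead of once per request.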
