Dridi / Libvmod Querystring

Licence: other
Query-string module for Varnish Cache

================
vmod-querystring
================

.. image:: https://travis-ci.org/Dridi/libvmod-querystring.svg
   :alt: Travis CI badge
   :target: https://travis-ci.org/Dridi/libvmod-querystring/
.. image:: https://codecov.io/gh/Dridi/libvmod-querystring/branch/master/graph/badge.svg
   :alt: Codecov badge
   :target: https://codecov.io/gh/Dridi/libvmod-querystring

Description

The purpose of this module is to give you fine-grained control over a URL's query-string in Varnish Cache. It's possible to remove the query-string, clean it, sort its parameters or filter it to only keep a subset of them.

This can greatly improve your hit ratio and efficiency with Varnish, because by default two URLs with the same path but different query-strings are treated as different URLs. This is what the RFCs mandate, but probably not what you usually want for your web site or application.

A query-string is just the character string that follows the question mark in a URL. In a web context, however, it is usually a structured key/value store encoded with the application/x-www-form-urlencoded media type. This module deals with this kind of query-string.

Examples

Consider the default hashing in Varnish::

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

Clients requesting /index.html and /index.html? will most likely get the exact same response with most web servers / frameworks / stacks / wossname but Varnish will see two different URLs and end up with two duplicate objects in the cache.

This problem is hard to solve with Varnish alone because it requires some knowledge of the back-end application, but it can usually be mitigated with a couple of assumptions:

  • the application doesn't need query-strings
  • except for POST requests that are not cached
  • and for analytics/tracking purposes

In this case it can be solved like this::

import querystring;

sub vcl_hash {
    if (req.method == "GET" || req.method == "HEAD") {
        hash_data(querystring.remove(req.url));
    }
    else {
        hash_data(req.url);
    }
    hash_data(req.http.host);
    return (lookup);
}

This way Varnish will get the same unique hash for both /index.html and /index.html? but the back-end application will receive the original client request. Depending on your requirements/goals, you may also take a different approach.

Surely enough this module can do more than what a simple regular expression substitution (regsub) could do, right? First, readability is improved. It should be obvious what the previous snippet does with no regex to decipher.

Second, it makes more complex operations easier to implement. For instance, you may want to remove Google Analytics parameters from requests because:

  • they could create cache duplicates for every campaign
  • the application does not need them, only marketing folks
  • the user's browser makes AJAX calls to GA regardless
  • they can be delivered to marketing via varnishncsa

It could be solved like this::

import std;
import querystring;

sub vcl_init {
    new ga = querystring.filter();
    ga.add_regex("^utm_.*");
}

sub vcl_recv {
    std.log("ga:" + ga.extract(req.url, mode = keep));
    set req.url = ga.apply(req.url);
}

This is enough to remove all Analytics parameters you may use (utm_source, utm_medium, utm_campaign, etc.) and keep the rest of the query-string. If there are no other parameters, the query-string is simply removed. The log statement allows you to get those analytics parameters (and only them) in varnishncsa using the format string %{VCL_Log:ga}x.
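As a rough sketch of the logging side, the snippet below wraps a varnishncsa(1) invocation that prints the analytics parameters next to each request. It assumes varnishncsa is installed and that varnishd is running the VCL shown earlier. The helper name log_ga and the choice of %t and %U in the format string are illustrative only; %{VCL_Log:ga}x is the only part taken from the text.

```shell
# Format string: request time, request URL path, then the value logged
# by std.log() under the "ga:" prefix.
GA_FORMAT='%t %U %{VCL_Log:ga}x'

# Hypothetical helper: only wraps the varnishncsa invocation.
# Extra arguments are passed through to varnishncsa unchanged.
log_ga() {
    varnishncsa -F "$GA_FORMAT" "$@"
}
```

Calling log_ga prints to standard output until interrupted. Something like log_ga -a -w ga.log would append to a file instead, ga.log being an arbitrary name.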

All functions are documented in the manual page vmod_querystring(3).

Installation

The module relies on the GNU Build System, also known as autotools. To install it, start by grabbing the latest release [1]_ and follow these steps::

# Get to the source tree
tar -xzf vmod-querystring-${VERSION}.tar.gz
cd vmod-querystring-${VERSION}

# Build and install
./configure
make
make check # optional
sudo make install

You only need to have Varnish (at least 6.0.6) and its development files installed on your system. Instead of manually installing the module you can build packages (see below). The configure script also needs pkg-config installed to find Varnish development files.
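Before running ./configure it can be useful to check these prerequisites by hand. The following is a minimal sketch, assuming the Varnish development files are exposed under the pkg-config module name varnishapi. The helper name check_varnish is made up for this example and is not part of the module's build system.

```shell
# Minimum Varnish version stated above.
MIN_VARNISH=6.0.6

# Hypothetical helper: reports whether pkg-config can find a recent
# enough Varnish. Returns 0 on success and 1 otherwise.
check_varnish() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed"
        return 1
    fi
    if ! pkg-config --exists varnishapi; then
        echo "varnishapi not found by pkg-config"
        return 1
    fi
    if ! pkg-config --atleast-version="$MIN_VARNISH" varnishapi; then
        echo "Varnish $(pkg-config --modversion varnishapi) is older than $MIN_VARNISH"
        return 1
    fi
    echo "Varnish $(pkg-config --modversion varnishapi) found"
}
```

Running check_varnish prints a single line describing the first problem it finds, or the detected version when everything is in place.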

If your Varnish installation did not use the default /usr prefix, you will likely need to at least set the pkg-config path to find your Varnish installation. For example, add this to your environment before running ./configure::

export PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig

Or the approach recommended by autoconf::

./configure PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig ...

The module is then configured for an installation inside ${PREFIX}, unless the --prefix option was used in the configure execution. For more information about what can be configured, run ./configure --help.

Alongside the release archive, you will find a PDF export of the module's manual.

RPM Packaging

Instead of directly installing the package you can build an RPM::

make rpm

The resulting packages can be found in the rpmbuild directory in your build tree.

If you need to build an RPM for a different platform you may use mock(1) with the proper --root option. All you have to do is run make mockbuild and set the desired flags in the MOCK_OPTS variable. For instance, to build RPMs for CentOS 7::

make mockbuild MOCK_OPTS='--root epel-7-x86_64'

The resulting packages can be found in the mockbuild directory in your build tree.

DPKG Packaging

DPKG packaging is also available with dpkg-buildpackage(1), using the deb target::

make deb

It is possible to either redefine the DPKG_BUILDPACKAGE command or simply add options via DPKG_BUILDPACKAGE_OPTS. For example, to select a particular privilege escalation method::

make deb DPKG_BUILDPACKAGE_OPTS=-rfakeroot

The resulting packages can be found in the dpkgbuild directory in your build tree. By default sources and changes are NOT signed. In order to sign packages, the DPKG_BUILDPACKAGE variable MUST be redefined.

If you need to build a deb package for a specific platform you may use pdebuild(1) and pbuilder(8) to set up the base tarball, then run make pdebuild and set the desired flags in the PDEBUILD_OPTS variable. For instance, to build debs for Debian Sid, assuming your environment is properly configured to switch between distributions::

make pdebuild PDEBUILD_OPTS='-- --distribution sid'

The resulting packages can be found in the pdebuild directory in your build tree.

As an alternative to pdebuild(1) you may prefer sbuild(1). Similarly, you may run make sbuild and set the desired flags in the SBUILD_OPTS variable. For instance, to build debs for Debian Sid, assuming your environment is properly configured to switch between distributions::

make sbuild SBUILD_OPTS='--dist sid'

The resulting packages can be found in the sbuild directory in your build tree.

Hacking

When working on the source code, there are additional dependencies:

  • autoconf
  • automake
  • libtool
  • rst2man (python3-docutils)
  • varnish (at least 6.0.6)

You will notice the lack of a configure script: it needs to be generated with the various autotools programs. Rather than running them by hand, you can use the bootstrap script that takes care of both generating and running configure. It also works for VPATH_ builds.

.. _VPATH: https://www.gnu.org/software/automake/manual/html_node/VPATH-Builds.html

Arguments to the bootstrap script are passed to the underlying execution of the generated configure script. Once bootstrap is done, you can later run the configure script directly if you need to reconfigure your build tree or use more than one VPATH.
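A VPATH build with bootstrap could be sketched as follows. This assumes bootstrap is invoked from the build directory, the way configure would be in a VPATH build, and that it sits at the top of the source tree. The helper name vpath_build and the directory arguments are hypothetical and only serve the example.

```shell
# Hypothetical helper: bootstrap and build in a separate (VPATH) directory.
# $1 is the source tree, $2 the build directory, and any remaining
# arguments are forwarded to bootstrap, which hands them to configure.
# The body runs in a subshell so the caller's working directory is kept.
vpath_build() (
    src=$(cd "$1" && pwd) || exit 1
    build=$2
    shift 2
    if [ ! -x "$src/bootstrap" ]; then
        echo "no bootstrap script in $src" >&2
        exit 1
    fi
    mkdir -p "$build" && cd "$build" || exit 1
    "$src/bootstrap" "$@" && make
)
```

For instance, vpath_build . build --prefix=/usr would configure and build inside a build directory under the current checkout, leaving the source tree itself free of build products.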

If your Varnish installation did not use the default /usr prefix, you need this in your environment before running ./bootstrap::

export ACLOCAL_PATH=${PREFIX}/share/aclocal

See also

To learn more about query-strings and HTTP caching, you can have a look at the relevant RFCs:

  • `RFC 1866 Section 8.2.1`__: The form-urlencoded Media Type
  • `RFC 3986 Section 3`__: Syntax Components
  • `RFC 7234 Section 2`__: Overview of Cache Operation

__ https://tools.ietf.org/html/rfc1866#section-8.2.1
__ https://tools.ietf.org/html/rfc3986#section-3
__ https://tools.ietf.org/html/rfc7234#section-2

The test suite also shows the differences in cache hits and misses with and without the use of this module.

.. [1] https://github.com/Dridi/libvmod-querystring/releases/latest
