All Projects → sean-public → Fast Skiplist

sean-public / Fast Skiplist

Licence: mit
A fast, threadsafe skip list in Go

Programming Languages

go
31211 projects - #10 most used programming language
golang
3204 projects

Projects that are alternatives of or similar to Fast Skiplist

Skiplist
A Go library for an efficient implementation of a skip list: https://godoc.org/github.com/MauriceGit/skiplist
Stars: ✭ 140 (-6.67%)
Mutual labels:  data-structures
Data Structures And Algorithms
Stars: ✭ 144 (-4%)
Mutual labels:  data-structures
Competitive Programming
VastoLorde95's solutions to 2000+ competitive programming problems from various online judges
Stars: ✭ 147 (-2%)
Mutual labels:  data-structures
Lago
📕 Data Structures and Algorithms library in TypeScript
Stars: ✭ 1,966 (+1210.67%)
Mutual labels:  data-structures
Crdts
A library of Conflict-Free Replicated Data Types for JavaScript
Stars: ✭ 143 (-4.67%)
Mutual labels:  data-structures
Recipe
RECIPE : high-performance, concurrent indexes for persistent memory (SOSP 2019)
Stars: ✭ 145 (-3.33%)
Mutual labels:  data-structures
Data Structures
Common data structures and algorithms implemented in JavaScript
Stars: ✭ 139 (-7.33%)
Mutual labels:  data-structures
Leetcode In Swift
My solutions to LeetCode problems written in Swift
Stars: ✭ 150 (+0%)
Mutual labels:  data-structures
Play With Data Structures
Codes of my MOOC Course <Play Data Structures in Java>. Updated contents and practices are also included. 我在慕课网上的课程《Java语言玩转数据结构》示例代码。课程的更多更新内容及辅助练习也将逐步添加进这个代码仓。
Stars: ✭ 1,878 (+1152%)
Mutual labels:  data-structures
Data Structure And Algorithms
Every thing related to data structure and algorithms.
Stars: ✭ 146 (-2.67%)
Mutual labels:  data-structures
Algorithms
A collection of common algorithms and data structures implemented in java, c++, and python.
Stars: ✭ 142 (-5.33%)
Mutual labels:  data-structures
Ultimate Java Resources
Java programming. All in one Java Resource for learning. Updated every day and up to date. All Algorithms and DS along with Development in Java. Beginner to Advanced. Join the Discord link.
Stars: ✭ 143 (-4.67%)
Mutual labels:  data-structures
Hackerrank
A collection of algorithms and solutions to problems in various languages from the site Hacker Rank.
Stars: ✭ 145 (-3.33%)
Mutual labels:  data-structures
Interviews
A list of fancy questions I've been asked during the interviews I had. Some of them I ask when interviewing people.
Stars: ✭ 140 (-6.67%)
Mutual labels:  data-structures
Graphlib
Simple but powerful graph library for Rust
Stars: ✭ 148 (-1.33%)
Mutual labels:  data-structures
19 udacity dsa
Data Structures & Algorithms Nanodegree Program from Udacity
Stars: ✭ 140 (-6.67%)
Mutual labels:  data-structures
Jupyter
Stars: ✭ 145 (-3.33%)
Mutual labels:  data-structures
Opends
Template Library of Data Structures in C++17
Stars: ✭ 151 (+0.67%)
Mutual labels:  data-structures
Immer
Postmodern immutable and persistent data structures for C++ — value semantics at scale
Stars: ✭ 1,935 (+1190%)
Mutual labels:  data-structures
Data Structures And Algorithms
Data Structures and Algorithms implemented In Python, C, C++, Java or any other languages. Aimed to help strengthen the concepts of DS&A. Give a Star 🌟 if it helps you.
Stars: ✭ 146 (-2.67%)
Mutual labels:  data-structures

fast-skiplist

Purpose

As the basic building block of an in-memory data structure store, I needed an implementation of skip lists in Go. It needed to be easy to use and thread-safe while maintaining the properties of a classic skip list.

There are several skip list implementations in Go. However, they all are implemented in slightly different ways with sparse optimizations and occasional shortcomings. Please see the skiplist-survey repo for a comparison of Go skip list implementations (including benchmarks).

The purpose of this repo is to offer a new, fast implementation with an easy-to-use interface that will suit general data storage purposes.

Operation Time Complexity
Insertion O(log N)
Removal O(log N)
Check if contains O(log N)
Enumerate in order O(N)

Quickstart

To start using the library right away, just do:

go get github.com/sean-public/fast-skiplist

There are no external dependencies, so you can start using it right away:

import github.com/sean-public/fast-skiplist

list := skiplist.New()
list.Set(123, "This string data is stored at key 123!")
fmt.Println(list.Get(123).value)
fmt.Println(list.Length)	// prints 1
list.Remove(123)
fmt.Println(list.Length)	// prints 0

Of course there are tests, including benchmarks and race condition detection with concurrency:

$ go test -cover
PASS
coverage: 100.0% of statements
ok      github.com/sean-public/fast-skiplist    0.006s

$ go test -race
Structure sizes: SkipList is 136, Element is 48 bytes
PASS
ok  	github.com/sean-public/fast-skiplist	41.530s

$ go test -bench=.
Structure sizes: SkipList is 136, Element is 48 bytes
goos: darwin
goarch: amd64
pkg: github.com/sean-public/fast-skiplist
BenchmarkIncSet-8   5000000    370 ns/op    13484040.32 MB/s    62 B/op    3 allocs/op
BenchmarkIncGet-8   10000000   205 ns/op    48592107.58 MB/s    0 B/op     0 allocs/op
BenchmarkDecSet-8   10000000   281 ns/op    35547886.82 MB/s    62 B/op    3 allocs/op
BenchmarkDecGet-8   10000000   212 ns/op    47124462.78 MB/s    0 B/op     0 allocs/op
PASS
ok  	github.com/sean-public/fast-skiplist	21.709s

About fast-skiplist

"Perfection is achieved not when there is nothing more to add, but rather when there is nothing more to take away" — Antoine de Saint-Exupery

If fast-skiplist is faster than other packages with the same features, it's because it does less wherever possible. It locks less, it blocks less, and it traverses less data. Even with these tricks up its sleeve, it has fewer lines of code than most implementations.

Calculating the Height of New Nodes

When inserting, it calculates "height" directly instead of consecutive "coin tosses" to add levels. Additionally, it uses a local PRNG source that isn't blocked globally for improved concurrent insert performance.

The probability of adding new nodes to each level of the structure (it's height) is determined by successively "rolling the dice" at each level until it doesn't meet a fixed value P. The default P values for skip lists in the wild range from 0.25 to 0.5. In this implementation, the default is 1/e, which is optimal for a general-purpose skip list. To find the derivation of this number, see Analysis of an optimized search algorithm for skip lists Kirschenhofer et al (1995).

Almost all other implementations are using common functions in math/rand, which will block because querying the PRNG to determine the height of new nodes waits then acquires a lock via the system-wide random number generator. We get around this by assigning a new rand source to each skip list instantiated, so each skip list can only ever block itself. This significantly speeds up insert times when you are managing multiple lists with high concurrency.

Additionally, this implementation always requests just one number from the PRNG. A pre-computed probability table is used to look up what the height of the new node will be. This is faster and offers a fixed calculation time compared to successive "dice rolls" for each level. The table is computed for each level L using the default P value of 1/e: math.Pow(1.0/math.E, L-1). It is consulted during inserts by querying for a random number in range [0.0,1.0) and finding the highest level in the table where the random number is less than or equal to the computed number.

For example, let's say math.Float64() returned r=0.029 and the table was pre-computed to contain (with a maximum height of 6):

height probability
1 1.000000000
2 0.367879441
3 0.135335283
4 0.049787068
5 0.018315639
6 0.006737947

So the height for the new node would be 5 because p5 > r ≥ p6, or 0.018315639 > 0.029 ≥ 0.006737947.

I believe this fast new node height calculation to be novel and faster than any others with user-defined P values. Ticki, for example, proposes an O(1) calculation but it is fixed to P=0.5 and I haven't encountered any other optimizations of this calculation. In local benchmarks, this optimization saves 10-25ns per insert.

Better Cooperative Multitasking

Why not a lock-free implementation? The overhead created is more than the time spent in contention of a locking version under normal loads. Most research on lock-free structures assume manual alloc/free as well and have separate compaction processes running that are unnecessary in Go (particularly with improved GC as of 1.8). The same is true for the newest variation, the rotating skip list, which claims to be the fastest to date for C/C++ and Java because the compared implementations have maintenance threads with increased overhead for memory management.

Caching and Search Fingers

As Pugh described in A Skip List Cookbook, search "fingers" can be retained after each lookup operation. When starting the next operation, the finger will point to where the last one occurred and afford the opportunity to pick up the search there instead of starting at the head of the list. This offers O(log m) search times, where m is the number of elements between the last lookup and the current one (m is always less than n).

This implementation of a search finger does not suffer the usual problem of "climbing" up in levels when resuming search because it stores pointers to previous nodes for each level independently.

Benchmarks

Speed is a feature! Below is a set of results demonstrating the flat performance (time per operation) as the list grows to millions of elements. Please see the skiplist-survey repo for complete benchmark results from this and other Go skip list implementations.

benchmark results chart

Todo

  • Build more complex test cases (specifically to prove correctness during high concurrency).
  • Benchmark memory usage.
  • Add "span" to each element to store the distance to the next node on every level. This gives each node a calculable index (ZRANK and associated commands in Redis).
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].