
sargun / wat

Licence: other
How fast are computers?

Programming Languages

  • C
  • CMake

Projects that are alternatives of or similar to wat

Gear Lib
Gear-Lib, C library for IOT Embedded Multimedia and Network
Stars: ✭ 2,381 (+9057.69%)
Mutual labels:  ipc
go-ml-benchmarks
⏱ Benchmarks of machine learning inference for Go
Stars: ✭ 27 (+3.85%)
Mutual labels:  benchmarks
lf-os amd64
Hobby operating system for amd64 with high ambitions - mirror
Stars: ✭ 30 (+15.38%)
Mutual labels:  lowlevel
Iceoryx
iceoryx - true zero-copy inter-process-communication
Stars: ✭ 208 (+700%)
Mutual labels:  ipc
memo wise
The wise choice for Ruby memoization
Stars: ✭ 486 (+1769.23%)
Mutual labels:  benchmarks
procbridge
A super-lightweight IPC (Inter-Process Communication) protocol over TCP socket.
Stars: ✭ 118 (+353.85%)
Mutual labels:  ipc
Andromeda
Andromeda simplifies local/remote communication for Android modularization
Stars: ✭ 2,203 (+8373.08%)
Mutual labels:  ipc
Static-Sort
A simple C++ header-only library for fast sorting of small arrays. Generates sorting networks at compile time via templates.
Stars: ✭ 30 (+15.38%)
Mutual labels:  benchmarks
anybench
CPU Benchmarks Set
Stars: ✭ 54 (+107.69%)
Mutual labels:  benchmarks
every
Fluent API job scheduling engine for .NET
Stars: ✭ 22 (-15.38%)
Mutual labels:  scheduling
Tosdatabridge
A collection of resources for pulling real-time streaming data off of TDAmeritrade's ThinkOrSwim(TOS) platform; providing C, C++, Java and Python interfaces.
Stars: ✭ 229 (+780.77%)
Mutual labels:  ipc
iris
Lightweight Component Model and Messaging Framework based on ØMQ
Stars: ✭ 50 (+92.31%)
Mutual labels:  message-passing
production-scheduling
Solving an item production scheduling problem with the help of mathematical optimization
Stars: ✭ 35 (+34.62%)
Mutual labels:  scheduling
Ohsce
High-reliability PHP communication & control framework: SOCKET TCP/UDP/ICMP, hardware serial (RS232/RS422/RS485), and more!
Stars: ✭ 206 (+692.31%)
Mutual labels:  ipc
gr-eventstream
gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and inserting them into, or extracting them from, normal data streams. It allows the definition of high-speed, time-synchronous C++ burst event handlers, as well as bridging to standard GNU Radio async PDU messages with precise timing.
Stars: ✭ 38 (+46.15%)
Mutual labels:  message-passing
Chromium Ipc Sniffer
A tool to capture communication between Chromium processes on Windows
Stars: ✭ 197 (+657.69%)
Mutual labels:  ipc
Chime
Time scheduler for Vert.x
Stars: ✭ 28 (+7.69%)
Mutual labels:  scheduling
FactoryOrchestrator
A cross-platform system service which provides a simple way to run and manage factory line validation, developer inner-loop, diagnostics, and fault analysis workflows.
Stars: ✭ 36 (+38.46%)
Mutual labels:  ipc
simple-wallet
This is a simple wallet REST API capable of account deposits and withdrawals, checking the account balance, and providing a mini-statement. It follows domain-driven design (DDD) practices.
Stars: ✭ 32 (+23.08%)
Mutual labels:  message-passing
spring-batch-quartz-admin
Spring Batch scheduling using Quartz
Stars: ✭ 16 (-38.46%)
Mutual labels:  scheduling

Wat?

This project came out of the question: What's the fastest way to communicate between processes, and just how "fast" is it?

For the impatient: you need the following build dependencies on Ubuntu 16.04+. Earlier versions of Ubuntu don't ship all the requisite dependencies, so you'll have trouble building there. You must also have an x86_64 processor with the constant_tsc feature. You can run cat /proc/cpuinfo | grep constant_tsc to determine whether your processor supports it: a processor that supports it will output one line per logical CPU, while one that doesn't will output nothing.
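
A quick check (the count shown is illustrative; any non-zero count means the feature is present):

# grep -c constant_tsc /proc/cpuinfo
8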

Dependencies:

  • libck-dev
  • build-essential
  • cmake

It's recommended you install the following tools as well:

  • cpufrequtils
  • schedtool
  • linux-tools-common
  • linux-tools-generic
  • linux-tools-`uname -r`

Or, in a few quick commands:

# apt-get install -y build-essential libck-dev cmake 
## Recommended:
# apt-get install -y strace schedtool cpufrequtils linux-tools-common linux-tools-generic linux-tools-`uname -r`

# Debug symbols, based on: https://wiki.ubuntu.com/Debug%20Symbol%20Packages
# echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | \
sudo tee -a /etc/apt/sources.list.d/ddebs.list
# sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01 C8CAB6595FDFF622
# sudo apt-get update
# apt-get install -y libc6-dbg

In order to build it:

# mkdir build/
# cd build/
# cmake ..
# make

What's included?

Message passing benchmarks

These benchmarks test how long it takes to pass a 1-byte number back and forth between processes with as little latency as possible. Read on for how to run them.

This is the message-passing benchmark. It comes in a few different flavours:

  • bench: Shared memory based message passing
  • bench2: Shared-memory, single core based message passing with serialized execution
  • bench3: POSIX Queue based message passing benchmark
  • bench4: Unix Socket based message passing benchmark

All of these EXCEPT bench2 require invoking two separate processes (from the same directory): a sender-mode process and a receiver-mode process:

Sender: ./bench s

Receiver: ./bench r
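
To give a sense of the general technique, here is a minimal sketch of shared-memory ping-pong between two processes. It is illustrative only, not wat's actual code: the file name, SHM_NAME, the round count, and the layout are all made up, and the real benchmarks also collect cycle counts and percentiles.

/* ping.c -- minimal sketch of shared-memory message passing.
 * Build: cc -O2 ping.c -o ping -lrt
 * Run "./ping s" in one terminal and "./ping r" in another.
 * Clean up afterwards with: rm /dev/shm/ping_demo */
#include <fcntl.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_NAME "/ping_demo"
#define ROUNDS   1000000

struct channel {
    _Atomic int turn;      /* 0: sender's turn to write, 1: receiver's */
    unsigned char payload; /* the 1-byte message */
};

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s s|r\n", argv[0]);
        return 1;
    }
    int sender = (argv[1][0] == 's');

    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0 || ftruncate(fd, sizeof(struct channel)) < 0) {
        perror("shm");
        return 1;
    }
    struct channel *ch = mmap(NULL, sizeof(*ch), PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (ch == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    for (int i = 0; i < ROUNDS; i++) {
        if (sender) {
            /* Spin until it's our turn, then publish one byte. */
            while (atomic_load_explicit(&ch->turn, memory_order_acquire) != 0)
                ;
            ch->payload = (unsigned char)i;
            atomic_store_explicit(&ch->turn, 1, memory_order_release);
        } else {
            while (atomic_load_explicit(&ch->turn, memory_order_acquire) != 1)
                ;
            (void)ch->payload; /* "receive" the byte */
            atomic_store_explicit(&ch->turn, 0, memory_order_release);
        }
    }
    return 0;
}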

"How slow is my computer?" benchmarks

cgt

CGT determines how long it takes for your computer to call clock_gettime(CLOCK_MONOTONIC, *). The output is in cycles.
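
A minimal sketch of this kind of measurement, using the __rdtscp intrinsic from x86intrin.h (illustrative only; the real cgt also reports minimum, median, and percentile figures):

/* cgt_sketch.c -- cycles per clock_gettime(), bracketed with RDTSCP.
 * Build: cc -O2 cgt_sketch.c -o cgt_sketch */
#include <stdio.h>
#include <time.h>
#include <x86intrin.h>

int main(void)
{
    enum { N = 1000000 };
    struct timespec ts;
    unsigned aux;

    unsigned long long start = __rdtscp(&aux);
    for (int i = 0; i < N; i++)
        clock_gettime(CLOCK_MONOTONIC, &ts);
    unsigned long long end = __rdtscp(&aux);

    printf("Average cycles per clock_gettime: %llu\n", (end - start) / N);
    return 0;
}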

gp

GP determines how long a syscall takes on your computer, as an indirect mechanism to measure the cost of a context-switch. It uses the getpid syscall, and therefore it's very cheap. The output is in cycles.
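
The same shape of harness works here; calling syscall(SYS_getpid) directly guarantees a real kernel entry even on libcs that cache getpid(). Again, a sketch rather than gp's actual code:

/* gp_sketch.c -- cycles per getpid syscall.
 * Build: cc -O2 gp_sketch.c -o gp_sketch */
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <x86intrin.h>

int main(void)
{
    enum { N = 1000000 };
    unsigned aux;

    unsigned long long start = __rdtscp(&aux);
    for (int i = 0; i < N; i++)
        syscall(SYS_getpid);
    unsigned long long end = __rdtscp(&aux);

    printf("Average cycles per getpid: %llu\n", (end - start) / N);
    return 0;
}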

rdtscp

RDTSCP measures how long it takes for your system to execute the RDTSCP instruction, and derives the conversion factors between nanoseconds and cycles. The output looks something like this:

# schedtool -F -p 99 -e taskset -c 0 ./rdtscp 
Nanoseconds per rdtscp: 9
Total Nanoseconds: 48489389
Total Counter increases: 169902778
Nanoseconds per cycle: 0.285395
Cycles per nanosecond: 3.503917

To convert from cycles to nanoseconds, multiply by the "Nanoseconds per cycle" constant (0.285395 in this run); for example, 92 cycles ≈ 92 × 0.285395 ≈ 26 ns. Also, "Nanoseconds per rdtscp" tells you roughly how many nanoseconds of measurement overhead every benchmark reading carries.
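
The calibration itself can be sketched like this, timing the same busy interval with both CLOCK_MONOTONIC and the TSC (illustrative only; the loop length and names are made up, not the rdtscp tool's code):

/* tsc_calib.c -- derive ns-per-cycle by timing one interval two ways.
 * Build: cc -O2 tsc_calib.c -o tsc_calib */
#include <stdio.h>
#include <time.h>
#include <x86intrin.h>

static unsigned long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (unsigned long long)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main(void)
{
    enum { N = 10000000 };
    unsigned aux;

    unsigned long long ns0 = now_ns();
    unsigned long long c0 = __rdtscp(&aux);
    for (volatile int i = 0; i < N; i++)
        ; /* spin so both clocks advance */
    unsigned long long c1 = __rdtscp(&aux);
    unsigned long long ns1 = now_ns();

    printf("Nanoseconds per cycle: %f\n", (double)(ns1 - ns0) / (c1 - c0));
    printf("Cycles per nanosecond: %f\n", (double)(c1 - c0) / (ns1 - ns0));
    return 0;
}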

sy

SY measures how long your kernel / computer / processor takes to call sched_yield(). sched_yield() is a syscall that switches into the kernel and in turn calls schedule(), which hands the current execution context back to the system's pool of tasks. It's useful for estimating what involuntary scheduler switches cost you, and running it under perf can help you find out where your computer is being slow.
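
For example (assuming the binary is named sy, matching the heading above):

# perf record -g ./sy
# perf report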

Recommendations on running tests

CPU Frequency Utilities

You would be surprised how much modern processors' power-saving features hurt performance: in my testing, latency was roughly 220% higher with power saving enabled. For this reason, I recommend turning on the performance governor like so:

# PROCESSORS=$(nproc)
# cpufreq-set -c 0-$(($PROCESSORS-1)) -g performance
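
You can read the governor back from sysfs to confirm it took effect:

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance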

Core Pinning

Because we're testing how long it takes to pass messages, it's valuable to pin your tasks to cores. For this, you can use the taskset command like so:

# taskset -c 2 ./bench2

I also recommend denying the scheduler the ability to schedule processes on the sibling hyperthread of whichever core you've pinned your process(es) to. Usually, while testing, I run benchmarks on cores 0 and 2, and I deny scheduling on 1 and 3 because they're the thread siblings of those cores.
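
Sysfs will tell you which logical CPUs are thread siblings of a given core (the output shown is illustrative; your enumeration may differ):

# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0,1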

Scheduler Fun

WARNING WARNING WARNING WARNING

Do not even think of doing this if you have fewer than 3 physical cores

Be very careful with this. If you use SCHED_FIFO you are completely taking over a CPU core, and stealing it away from the operating system for everything other than hardware interrupts.

In order to do this, you prefix the command with schedtool -F -p 99 -e. -F tells schedtool to use SCHED_FIFO, under which the Linux kernel will not preempt your task in favour of ordinary (non-real-time) work. -p 99 sets real-time priority 99, the highest. -e indicates that the command to run follows.

Example:

### A run with SCHED_FIFO:
# schedtool -F -p 99 -e ./bench2
Average cycles: 92
Median Iteration Cycles: 90
Min Cycles: 70
95th Percentile Cycles: 110
Invol Ctx Switches: 2
Voluntary Ctx Switches: 0

### A run without SCHED_FIFO:
# ./bench2
Average cycles: 93
Median Iteration Cycles: 90
Min Cycles: 72
95th Percentile Cycles: 112
Invol Ctx Switches: 82
Voluntary Ctx Switches: 0

All Together

Preparation:

# PROCESSORS=$(nproc)
# cpufreq-set -c 0-$(($PROCESSORS-1)) -g performance
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > /sys/devices/system/cpu/cpu3/online

Process Sender (Start first):

# schedtool -F -p 99 -e taskset -c 0 ./bench s 

Process Receiver (Start second):

# schedtool -F -p 99 -e taskset -c 2 ./bench r 

Cleanup:

# echo 1 > /sys/devices/system/cpu/cpu1/online
# echo 1 > /sys/devices/system/cpu/cpu3/online

Samples:

All tests were done on packet.net's Type 1 server, as root, with the performance governor enabled.

Example CPU:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz
stepping	: 3
microcode	: 0x6a
cpu MHz		: 3899.628
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		:
bogomips	: 7008.62
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual

Bench: Shared Memory

# schedtool -F -p 99 -e taskset -c 0 ./bench s
Median Iteration Time: 492
Min Time: 300
95th percentile Time: 600
Invol Ctx Switches: 0
Voluntary Ctx Switches: 0

Bench2: Sequential Execution

# schedtool -F -p 99 -e taskset -c 0 ./bench2 
Average cycles: 73
Median Iteration Cycles: 74
Min Cycles: 68
95th Percentile Cycles: 76
Invol Ctx Switches: 0
Voluntary Ctx Switches: 0

Bench3: POSIX Queues

# schedtool -F -p 99 -e taskset -c 0 ./bench3 s 
Median Iteration Time: 10454
Min Time: 5612
95th Percentile Time: 10850
Invol Ctx Switches: 1
Voluntary Ctx Switches: 3999695

Bench4: UNIX Domain Sockets

# schedtool -F -p 99 -e taskset -c 0 ./bench4 s 
Median Iteration Time: 11250
Min Time: 7326
95th percentile Time: 11698
Invol Ctx Switches: 0
Voluntary Ctx Switches: 4001850

CGT: How long do VDSO syscalls take?

# schedtool -F -p 99 -e taskset -c 2 ./cgt
Average cycles per clock_gettime: 57
Minimum cycles per iteration: 84
Median cycles per iteration: 92
95th percentile cycles per iteration: 94

GP: How long do normal syscalls take?

# ./gp
Average cycles per getpid: 129
Minimum cycles per iteration: 156
Median cycles per iteration: 162
95th percentile cycles per iteration: 164

RDTSCP: How long is a nanosecond (or a cycle)?

Use this to convert the cycle counts above into nanoseconds. For example, bench2's median of 74 cycles is roughly 74 × 0.285394 ≈ 21 ns.

Nanoseconds per rdtscp: 8
Total Nanoseconds: 424982501
Total Counter increases: 1489109408
Nanoseconds per cycle: 0.285394
Cycles per nanosecond: 3.503931