
JustinAzoff / flow-indexer

Licence: other
flow-indexer indexes flows found in chunked log files from bro, nfdump, syslog, or pcap.

Programming Languages

go

Flow Indexer

flow-indexer indexes flows

Usage: 
  flow-indexer [command]

Available Commands: 
  compact     Compact the database
  daemon      Start daemon
  expandcidr  Expand a CIDR range from those seen in the database
  index       Index flows
  search      Search flows
  help        Help about any command

Flags:
      --dbpath="flows.db": Database path
  -h, --help[=false]: help for flow-indexer


Use "flow-indexer [command] --help" for more information about a command.

Quickstart

Install

$ export GOPATH=~/go
$ go get github.com/JustinAzoff/flow-indexer

Create configuration

$ cp ~/go/src/github.com/JustinAzoff/flow-indexer/example_config.json config.json
$ vi config.json # Adjust log paths and database paths.

The indexer configuration is as follows:

  • name - The name of the indexer. Keep this short and lowercase, as you will use it as an HTTP query parameter.
  • backend - The backend log IP extractor to use. Choices: bro, bro_json, nfdump, syslog, pcap, and argus.
  • file_glob - The shell globbing pattern that should match all of your log files.
  • recent_file_glob - The strftime+shell globbing pattern that should match today's log files.
  • filename_to_database_regex - A regular expression applied to each filename to extract the information used to name the database.
  • database_root - Where databases will be written to. Should be indexer specific.
  • datapath_path - The name of an individual database. This can contain $variables set in filename_to_database_regex.

The deciding factor in how to partition the databases is how many unique IPs you see per day. I suggest starting with monthly indexes. If indexing performance degrades badly by the end of the month, switch to daily indexes.
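
To make the relationship between filename_to_database_regex and the $variables in the database name concrete, here is a minimal sketch in Python. It is not flow-indexer's own code: the log layout, the regular expression, and the two path templates are hypothetical values chosen for illustration. It relies only on the fact that Python and Go both accept the (?P<name>...) named-group syntax, and it uses Python's string.Template to stand in for the $variable substitution.

```python
import re
from string import Template

# Hypothetical values for illustration; the real ones live in your config.json.
FILENAME_TO_DATABASE_REGEX = r"/logs/(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})/"
MONTHLY_PATH = "$year-$month.db"       # one database per month
DAILY_PATH = "$year-$month-$day.db"    # one database per day


def database_for(filename, pattern, path_template):
    """Return the database name a log file would map to, or None if no match."""
    match = re.search(pattern, filename)
    if match is None:
        return None
    # Each named group becomes a $variable available to the path template.
    return Template(path_template).substitute(match.groupdict())


if __name__ == "__main__":
    log = "/logs/2016-02-06/conn.00:00:00-01:00:00.log.gz"
    print(database_for(log, FILENAME_TO_DATABASE_REGEX, MONTHLY_PATH))
    print(database_for(log, FILENAME_TO_DATABASE_REGEX, DAILY_PATH))
```

Switching from monthly to daily partitioning in this sketch only means capturing one more group and using it in the template, so every log file from the same day lands in the same database.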

Run initial index

The indexall command expands file_glob and indexes every log file that matches.

$ ~/go/bin/flow-indexer indexall

Start Daemon

Once the initial index is complete, start the daemon. Starting the daemon will expand recent_file_glob and index any recently created log file that matches.

$ ~/go/bin/flow-indexer daemon

It repeats this every 60 seconds to keep itself up to date.
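
The "strftime+shell globbing" expansion of recent_file_glob can be approximated in two steps: fill in the date directives, then glob the result. The sketch below is in Python and only illustrates the idea. The pattern is a hypothetical example, and how the daemon performs the expansion internally is not described here.

```python
import glob
import time


def recent_pattern(recent_file_glob, when=None):
    """Fill in strftime directives such as %Y-%m-%d, leaving shell wildcards alone."""
    return time.strftime(recent_file_glob, when or time.localtime())


def recent_files(recent_file_glob, when=None):
    """Return the files that currently match the date-expanded glob."""
    return sorted(glob.glob(recent_pattern(recent_file_glob, when)))


if __name__ == "__main__":
    # Hypothetical pattern; use the recent_file_glob from your own config.json.
    pattern = "/logs/%Y-%m-%d/conn.*"
    print(recent_pattern(pattern))
    print(recent_files(pattern))
```

Running recent_files once a minute would pick up each newly rotated log file as soon as it appears in today's directory.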

Query API

$ curl -s 'localhost:8080/search?i=conn&q=1.2.3.0/24'
$ curl -s 'localhost:8080/dump?i=conn&q=1.2.3.0/24'
$ curl -s 'localhost:8080/stats?i=conn&q=1.2.3.0/24'
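
The same requests can be issued from a script. The following Python sketch builds the three URLs shown above and fetches the raw response body. It assumes the daemon is listening on localhost:8080 as in the curl examples, and it makes no assumption about the response format, returning the text as-is. The helper names build_url and fetch are made up for this example.

```python
import urllib.parse
import urllib.request

# Assumption: the daemon listens here, as in the curl examples.
BASE_URL = "http://localhost:8080"


def build_url(endpoint, indexer, query, base_url=BASE_URL):
    """Build a query URL; i is the indexer name, q is an IP or CIDR range."""
    params = urllib.parse.urlencode({"i": indexer, "q": query}, safe="/")
    return f"{base_url}/{endpoint}?{params}"


def fetch(endpoint, indexer, query, base_url=BASE_URL, timeout=10):
    """Return the raw response body as text, without interpreting its format."""
    url = build_url(endpoint, indexer, query, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only prints the URLs; call fetch() once a daemon is actually running.
    for endpoint in ("search", "dump", "stats"):
        print(build_url(endpoint, "conn", "1.2.3.0/24"))
```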

Service Configuration

Running flow-indexer as a service

systemd

To run flow-indexer as a service on a system using systemd, you can use the provided flow-indexer.service file.

upstart

To run flow-indexer as a service on a system that uses upstart, consider a conf file like the following. It sends stdout and stderr from flow-indexer to syslog and runs the service as a non-root user.

# flow-indexer - Flow Indexer
#
# flow-indexer is a service that indexes and allows retrieval of flows using bro logs

description     "Flow Indexer Daemon"

start on runlevel [345]
stop on runlevel [!345]

setuid flowindexer
setgid flowindexer

exec /path/to/bin/flow-indexer daemon --config /path/to/flow-indexer/config.json 2>&1 | logger -t flow-indexer

Common Issues

To avoid "too many open files" errors, you may want to raise the number of open files allowed for the user that flow-indexer runs as. This can be done by changing the nofile setting in /etc/security/limits.conf as shown below.

flowindexer soft nofile 65535
flowindexer hard nofile 65535
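
To confirm which limit is actually in effect, a small check can be run as the flow-indexer user. This Python sketch uses the standard resource module, which is available on Unix-like systems only. It reports the limits of the current process and does not change anything.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile: {describe(soft)}")
    print(f"hard nofile: {describe(hard)}")
```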

Lower level commands example

These commands are rarely used in practice anymore; the daemon is the recommended way to use flow-indexer. They can still be useful for testing and development.

Index flows

./flow-indexer --dbpath /tmp/f/flows.db index /tmp/f/conn*
2016/02/06 23:36:51 /tmp/f/conn.00:00:00-01:00:00.log.gz: Read 4260 lines in 24.392765ms
2016/02/06 23:36:51 /tmp/f/conn.00:00:00-01:00:00.log.gz: Wrote 281 unique ips in 2.215219ms
2016/02/06 23:36:51 /tmp/f/conn.01:00:00-02:00:00.log.gz: Read 4376 lines in 24.186168ms
2016/02/06 23:36:51 /tmp/f/conn.01:00:00-02:00:00.log.gz: Wrote 310 unique ips in 1.495277ms
[...]
2016/02/06 23:36:51 /tmp/f/conn.22:00:00-23:00:00.log.gz: Read 7799 lines in 18.350788ms
2016/02/06 23:36:51 /tmp/f/conn.22:00:00-23:00:00.log.gz: Wrote 775 unique ips in 5.155262ms
2016/02/06 23:36:51 /tmp/f/conn.23:00:00-00:00:00.log.gz: Read 5255 lines in 15.296847ms
2016/02/06 23:36:51 /tmp/f/conn.23:00:00-00:00:00.log.gz: Wrote 400 unique ips in 2.910344ms

Re-Index flows

./flow-indexer --dbpath /tmp/f/flows.db index /tmp/f/conn*
2016/02/06 23:37:36 /tmp/f/conn.00:00:00-01:00:00.log.gz Already indexed
2016/02/06 23:37:36 /tmp/f/conn.01:00:00-02:00:00.log.gz Already indexed
2016/02/06 23:37:36 /tmp/f/conn.02:00:00-03:00:00.log.gz Already indexed
2016/02/06 23:37:36 /tmp/f/conn.03:00:00-04:00:00.log.gz Already indexed
[...]
2016/02/06 23:37:36 /tmp/f/conn.20:00:00-21:00:00.log.gz Already indexed
2016/02/06 23:37:36 /tmp/f/conn.21:00:00-22:00:00.log.gz Already indexed
2016/02/06 23:37:36 /tmp/f/conn.22:00:00-23:00:00.log.gz Already indexed
2016/02/06 23:37:36 /tmp/f/conn.23:00:00-00:00:00.log.gz Already indexed
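
The two runs above suggest the general shape of the index: each file contributes its set of unique IPs, and a file that has already been recorded is skipped. The Python sketch below models that behaviour in memory. It is a toy illustration, not flow-indexer's storage format: TinyIndex is a made-up name, it matches IPv4 addresses only, and it keeps everything in dictionaries rather than a database.

```python
import re

# Simplified IPv4 matcher, for illustration only.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class TinyIndex:
    """Toy inverted index: IP address -> set of log files that mention it."""

    def __init__(self):
        self.files_by_ip = {}
        self.indexed_files = set()

    def index(self, filename, lines):
        """Index one file; return the unique IP count, or None if already indexed."""
        if filename in self.indexed_files:
            return None  # corresponds to the "Already indexed" message
        unique_ips = set()
        for line in lines:
            unique_ips.update(IPV4.findall(line))
        for ip in unique_ips:
            self.files_by_ip.setdefault(ip, set()).add(filename)
        self.indexed_files.add(filename)
        return len(unique_ips)

    def search(self, ip):
        """Return the files in which the given IP was seen."""
        return sorted(self.files_by_ip.get(ip, ()))


if __name__ == "__main__":
    idx = TinyIndex()
    sample = ["10.0.0.1 -> 192.30.252.86", "10.0.0.1 -> 192.30.252.87"]
    print(idx.index("conn.00.log", sample))
    print(idx.index("conn.00.log", sample))
    print(idx.search("10.0.0.1"))
```

Storing only the set of unique IPs per file is what keeps the index small: a search answers "which files should I look in", and the flows themselves stay in the original logs.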

Expand CIDR Range

./flow-indexer --dbpath /tmp/f/flows.db expandcidr 192.30.252.0/24
192.30.252.86
192.30.252.87
192.30.252.92
192.30.252.124
192.30.252.125
192.30.252.126
192.30.252.127
192.30.252.128
192.30.252.129
192.30.252.130
192.30.252.131
192.30.252.141
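
Conceptually, expandcidr returns the addresses inside the range that have been seen, not every address the range could contain. The Python sketch below shows that filtering step with the standard ipaddress module. The list of seen addresses is passed in directly as an assumption of the example, whereas the real command reads them from the database.

```python
import ipaddress


def expand_cidr(cidr, seen_ips):
    """Return the seen addresses that fall inside cidr, in numeric order."""
    network = ipaddress.ip_network(cidr)
    # Membership is False for addresses of a different IP version.
    inside = [ip for ip in map(ipaddress.ip_address, seen_ips) if ip in network]
    return [str(ip) for ip in sorted(inside)]


if __name__ == "__main__":
    # Hypothetical set of addresses standing in for the database contents.
    seen = ["192.30.252.141", "10.0.0.1", "192.30.252.86", "192.30.253.1"]
    for ip in expand_cidr("192.30.252.0/24", seen):
        print(ip)
```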

Search

./flow-indexer --dbpath /tmp/f/flows.db search 192.30.252.0/24
/tmp/f/conn.03:00:00-04:00:00.log.gz
/tmp/f/conn.04:00:00-05:00:00.log.gz
/tmp/f/conn.06:00:00-07:00:00.log.gz
/tmp/f/conn.14:00:00-15:00:00.log.gz
/tmp/f/conn.18:00:00-19:00:00.log.gz
/tmp/f/conn.20:00:00-21:00:00.log.gz
/tmp/f/conn.22:00:00-23:00:00.log.gz