francelabs / Datafari

Licence: apache-2.0
Open Source, Distributed, Big Data Enterprise Search Engine

--------------------------- DATAFARI V. 4.5-dev ------------------------

NOTE: For the changes compared to the previous version of DATAFARI, please check CHANGES.txt.

Datafari is the perfect product for anyone who needs to search and analyze their corporate big data, based on the most advanced open source technologies. Datafari combines Apache Solr, Cassandra and ManifoldCF with ELK. It lets users search file shares, cloud shares (Dropbox, Google Drive), databases, SharePoint, Alfresco and many more sources.

Available in community and enterprise editions, Datafari differs from the competition:

  • Its open source license is permissive: under the Apache v2 license you are free to use, modify and redistribute it, provided you keep the license and attribution notices.
  • It combines renowned Apache projects, namely Cassandra, Solr and ManifoldCF, which gives Datafari a long-term vision.
  • It leverages ELK, the reference stack for analyzing unstructured big data.

The complete documentation (for users, admins and developers) is available here: https://datafari.atlassian.net/wiki/display/DATAFARI/Datafari

Requirements:

  • Debian 8 or higher, 64-bit (a Docker image is available if you are on Windows). The recommended version is Debian 9.
  • Processor: 1 GHz; RAM: 8 GB
  • Ports 8080, 5432, 9200 and 5601 must be open
  • Debian packages: curl, debconf, unzip, sudo, libc6-dev, jq, lsof
  • Java JDK 8
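
As a minimal sketch, assuming a POSIX shell, the command-line tools from the list above can be checked before installing. The function name missing_commands is hypothetical and not part of Datafari. debconf and libc6-dev are packages rather than commands, so they would need a separate check such as dpkg -s.

```shell
#!/bin/sh
# Hypothetical pre-install helper, not shipped with Datafari.
# Prints the name of every given command that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Example: report which of the required tools still need to be installed.
missing_commands curl unzip sudo jq lsof
```

An empty output means all five tools were found; any printed name would have to be installed first.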

How to install and start Datafari:

You can build the Debian installer with the ant script Datafari/debian7/build.xml. Alternatively, a Debian installer, an OVA and a Docker image can be downloaded from https://www.datafari.com/en/download.html

  1. Install Datafari:

dpkg -i datafari.deb

  2. Start Datafari as a non-root user:

cd /opt/datafari/bin

bash start-datafari.sh

  3. Stop Datafari:

cd /opt/datafari/bin

bash stop-datafari.sh
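
The start and stop steps above could be wrapped as follows. This is a sketch only: the function names and the optional uid argument are hypothetical, while the /opt/datafari/bin path and the script names come from the steps above.

```shell
#!/bin/sh
# Hypothetical wrappers around the documented start/stop scripts.

# Succeeds only when the given uid (default: the current user's) is not root.
require_non_root() {
    uid="${1:-$(id -u)}"
    if [ "$uid" -eq 0 ]; then
        echo "Datafari should be started as a non-root user" >&2
        return 1
    fi
    return 0
}

# Each script runs in a subshell so the caller's working directory
# is left unchanged.
start_datafari() {
    require_non_root || return 1
    ( cd /opt/datafari/bin && bash start-datafari.sh )
}

stop_datafari() {
    ( cd /opt/datafari/bin && bash stop-datafari.sh )
}
```

Once these functions are defined, start_datafari and stop_datafari can be called from an interactive shell. The package installation itself (dpkg -i) still requires root privileges.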

You can find video tutorials on how to install and start Datafari from the installer (Warning: the videos are for version 1.x) :

If you want to crawl file shares, you need to activate the jcifs-ng connector in ManifoldCF by following this documentation: https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/662700036/Add+the+JCIFS-NG+Connector+to+Datafari+-+Community+Edition

To add documents to Datafari, you have to configure a repository connector and a job. A video tutorial on indexing a local file share is available here (warning: the video is for version 1.x): https://www.youtube.com/watch?v=w0FtsvZO9SI

Documentation on creating connectors and jobs is available here: http://manifoldcf.apache.org/release/release-2.13/en_US/end-user-documentation.html

For all the changes please check CHANGES.txt

--------------------------- DATAFARI V. 3.2.1 ------------------------

NOTE: For the major changes compared to DATAFARI V3.1.0, please check the end of this section.

Datafari 3.2.1 is the perfect product for anyone who needs to search and analyze their corporate big data, based on the most advanced open source technologies. Datafari 3.2.1 combines Apache Solr, Cassandra and ManifoldCF with ELK. It lets users search file shares, cloud shares (Dropbox, Google Drive), databases, emails and many more sources.

Available in community and enterprise editions, Datafari differs from the competition:

  • Its open source license is permissive: under the Apache v2 license you are free to use, modify and redistribute it, provided you keep the license and attribution notices.
  • It combines three renowned Apache projects, namely Cassandra, Solr and ManifoldCF, which gives Datafari a long-term vision.
  • It leverages ELK, the reference stack for analyzing unstructured big data.

The complete documentation (for users, admins and developers) is available here: https://datafari.atlassian.net/wiki/display/DATAFARI/Datafari

Requirements:

  • Debian 7 or higher, 64-bit (a Docker image is available if you are on Windows). The recommended version is Debian 8; on Debian 7 you will need to add the testing repo in /etc/apt/sources.list.
  • Processor: 1 GHz; RAM: 8 GB
  • Ports 8080, 5432, 9200 and 5601 must be open
  • Debian packages: curl, debconf, unzip, sudo, libc6-dev, jq, lsof

How to install and start Datafari:

You can build the Debian installer with the ant script Datafari/debian7/build.xml. Alternatively, a Debian installer and a Docker image can be downloaded from www.datafari.com.

  1. Install Datafari:

dpkg -i datafari.deb

  2. Start Datafari as a non-root user:

cd /opt/datafari/bin

bash start-datafari.sh

  3. Stop Datafari:

cd /opt/datafari/bin

bash stop-datafari.sh

You can find video tutorials on how to install and start Datafari from the installer (Warning: the videos are for version 1.x) :

If you want to use the jcifs connector in ManifoldCF, download jcifs-1.3.xx.jar from http://jcifs.samba.org/src/ to DATAFARI_SOURCE_DIR\mcf\mcf_home\connector-lib-proprietary. Then edit the file Datafari/mcf/mcf_home/connectors.xml and uncomment the line:

And restart Datafari

To add documents to Datafari, you have to configure a repository connector and a job. A video tutorial on indexing a local file share is available here (warning: the video is for version 1.x): https://www.youtube.com/watch?v=w0FtsvZO9SI

Documentation on creating connectors and jobs is available here: http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html

Major changes compared to v3.1.0

  • Tika updated to version 1.15
  • ManifoldCF updated to version 2.6
  • Cassandra updated to version 3.10
  • New UI
  • New Advanced Search
  • New languages:
    • German
    • Portuguese/Brazilian

--------------------------- DATAFARI V. 3.1.2 ------------------------

NOTE: For the major changes compared to DATAFARI V2.2, please check at the bottom of this page.

Datafari 3.1 is the perfect product for anyone who needs to search and analyze their corporate big data, based on the most advanced open source technologies. Datafari 3.1 combines Apache Solr, Cassandra and ManifoldCF with ELK. It lets users search file shares, cloud shares (Dropbox, Google Drive), databases, emails and many more sources.

Available in community and enterprise editions, Datafari differs from the competition:

  • Its open source license is permissive: under the Apache v2 license you are free to use, modify and redistribute it, provided you keep the license and attribution notices.
  • It combines three renowned Apache projects, namely Cassandra, Solr and ManifoldCF, which gives Datafari a long-term vision.
  • It leverages ELK, the reference stack for analyzing unstructured big data.

The complete documentation (for users, admins and developers) is available here: https://datafari.atlassian.net/wiki/display/DATAFARI/Datafari

Requirements:

  • Debian 7 or higher, 64-bit (a Docker image is available if you are on Windows). The recommended version is Debian 8; on Debian 7 you will need to add the testing repo in /etc/apt/sources.list.
  • Processor: 1 GHz; RAM: 8 GB
  • Ports 8080, 5432, 9200 and 5601 must be open
  • Debian packages: curl, debconf, unzip, sudo, libc6-dev, jq, lsof

How to install and start Datafari:

You can build the Debian installer with the ant script Datafari/debian7/build.xml. Alternatively, a Debian installer and a Docker image can be downloaded from www.datafari.com.

  1. Install Datafari:

dpkg -i datafari.deb

  2. Start Datafari as a non-root user:

cd /opt/datafari/bin

bash start-datafari.sh

  3. Stop Datafari:

cd /opt/datafari/bin

bash stop-datafari.sh

You can find video tutorials on how to install and start Datafari from the installer (Warning: the videos are for version 1.x) :

If you want to use the jcifs connector in ManifoldCF, download jcifs-1.3.xx.jar from http://jcifs.samba.org/src/ to DATAFARI_SOURCE_DIR\mcf\mcf_home\connector-lib-proprietary. Then edit the file Datafari/mcf/mcf_home/connectors.xml and uncomment the line:

And restart Datafari

To add documents to Datafari, you have to configure a repository connector and a job. A video tutorial on indexing a local file share is available here (warning: the video is for version 1.x): https://www.youtube.com/watch?v=w0FtsvZO9SI

Documentation on creating connectors and jobs is available here: http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html

For the major changes, see CHANGES.txt.

--------------------------- DATAFARI V. 3.0.0 ------------------------

NOTE: For the major changes compared to DATAFARI V2.2, please check at the bottom of this page.

Datafari 3.0 is the perfect product for anyone who needs to search and analyze their corporate big data, based on the most advanced open source technologies. Datafari 3.0 combines Apache Solr, Cassandra and ManifoldCF with ELK. It lets users search file shares, cloud shares (Dropbox, Google Drive), databases, emails and many more sources.

Available in community and enterprise editions, Datafari differs from the competition:

  • Its open source license is permissive: under the Apache v2 license you are free to use, modify and redistribute it, provided you keep the license and attribution notices.
  • It combines three renowned Apache projects, namely Cassandra, Solr and ManifoldCF, which gives Datafari a long-term vision.
  • It leverages ELK, the reference stack for analyzing unstructured big data.

The complete documentation (for users, admins and developers) is available here: https://datafari.atlassian.net/wiki/display/DATAFARI/Datafari

Requirements:

  • Debian 7 or higher, 64-bit (a Docker image is available if you are on Windows). The recommended version is Debian 8; on Debian 7 you will need to add the testing repo in /etc/apt/sources.list.
  • Processor: 1 GHz; RAM: 8 GB
  • Ports 8080, 5432, 9200 and 5601 must be open
  • Debian packages: curl, debconf, unzip, sudo, libc6-dev

How to install and start Datafari:

You can build the Debian installer with the ant script Datafari/debian7/build.xml. Alternatively, a Debian installer and a Docker image can be downloaded from www.datafari.com.

  1. Install Datafari:

dpkg -i datafari.deb

  2. Start Datafari as a non-root user:

cd /opt/datafari/bin

bash start-datafari.sh

  3. Stop Datafari:

cd /opt/datafari/bin

bash stop-datafari.sh

You can find video tutorials on how to install and start Datafari from the installer (Warning: the videos are for version 1.x) :

If you want to use the jcifs connector in ManifoldCF, download jcifs-1.3.xx.jar from http://jcifs.samba.org/src/ to DATAFARI_SOURCE_DIR\mcf\mcf_home\connector-lib-proprietary. Then edit the file Datafari/mcf/mcf_home/connectors.xml and uncomment the line:

And restart Datafari

To add documents to Datafari, you have to configure a repository connector and a job. A video tutorial on indexing a local file share is available here (warning: the video is for version 1.x): https://www.youtube.com/watch?v=w0FtsvZO9SI

Documentation on creating connectors and jobs is available here: http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html

Major changes compared to v2.2

  • SolrCloud on a single node activated by default
  • Added query elevator admin functionality
  • Solr updated to version 5.5.1
  • Postgres updated to version 9.5.3
  • Usage of Tika embedded in MCF instead of Solr

--------------------------- DATAFARI V. 2.2 ------------------------

NOTE: For the major changes compared to DATAFARI V1.x, please check at the bottom of this page.

Datafari is the perfect product for anyone who needs to search within their corporate big data, based on the most advanced open source technologies. Datafari 2.2 combines Apache Solr, Cassandra and ManifoldCF. It lets users search file shares, cloud shares (Dropbox, Google Drive), databases, emails and many more sources.

Available in community and enterprise editions, Datafari differs from the competition:

  • Its open source license is permissive: under the Apache v2 license you are free to use, modify and redistribute it, provided you keep the license and attribution notices.
  • It combines three renowned Apache projects, namely Cassandra, Solr and ManifoldCF, which gives Datafari a long-term vision.

Prerequisites:

  • Debian, 64-bit (a Docker image is available if you are on Windows)
  • Processor: 1 GHz; RAM: 2 GB
  • Ports 8080 and 5432 must be open
  • Debian packages: curl, debconf, unzip, sudo, libc6-dev

How to install and start Datafari:

You can build the Debian installer with the ant script Datafari/debian7/build.xml. Alternatively, a Debian installer and a Docker image can be downloaded from www.datafari.com.

  1. Install Datafari:

dpkg -i datafari.deb

  2. Start Datafari as a non-root user:

cd /opt/datafari/bin

bash start-datafari.sh

  3. Stop Datafari:

cd /opt/datafari/bin

bash stop-datafari.sh

You can find video tutorials on how to install and start Datafari from the installer :

If you want to use the jcifs connector in ManifoldCF, download jcifs-1.3.xx.jar from http://jcifs.samba.org/src/ to DATAFARI_SOURCE_DIR\mcf\mcf_home\connector-lib-proprietary. Then edit the file Datafari/mcf/mcf_home/connectors.xml and uncomment the line:

And restart Datafari

To add documents to Datafari, you have to configure a repository connector and a job. A video tutorial on indexing a local file share is available here: https://www.youtube.com/watch?v=w0FtsvZO9SI

Documentation on creating connectors and jobs is available here: http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html

Major changes compared to v1.0

  • Integration of Apache Cassandra
  • Proper user management including an admin UI
  • Complete overhaul of the admin UI, using the great Devoops v2 template.
  • Complete overhaul of the Ajaxfrancelabs search UI, with new widgets and a cool responsive design
  • Migration to Apache Solr 5
  • Admin UI to configure connection to an Active Directory
  • Admin UI to manage promolinks
  • Admin UI to boost Solr fields at search time
  • Admin UI to configure the autocomplete
  • Admin UI to configure the synonyms
  • Migration to JDK 1.8 u66
  • Restructuring of the configuration files to facilitate update processes
  • Bugfix for the alerts feature
  • Added unit testing
  • Added SKOS and OWL ontologies support through Apache Jena 3.0.1

Enjoy :-)
