
akabe / docker-ocaml-jupyter-datascience

Licence: MIT License
Dockerfiles for data science in OCaml on Jupyter

Programming Languages: Jupyter Notebook, Shell

Projects that are alternatives of or similar to docker-ocaml-jupyter-datascience

Data-Scientist-In-Python
This repository contains notes and projects of Data scientist track from dataquest course work.
Stars: ✭ 23 (-39.47%)
Mutual labels:  datascience, dataanalysis
genie
Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)
Stars: ✭ 21 (-44.74%)
Mutual labels:  datascience
lumberjack
Track changes in data with ease
Stars: ✭ 58 (+52.63%)
Mutual labels:  datascience
datascience-environment
Docker Environment for data science
Stars: ✭ 18 (-52.63%)
Mutual labels:  datascience
nl4dv
A python toolkit to create Visualizations (Vis) using natural language (NL) or add an NL interface to existing Vis.
Stars: ✭ 63 (+65.79%)
Mutual labels:  datascience
genstar
Generation of Synthetic Populations Library
Stars: ✭ 17 (-55.26%)
Mutual labels:  datascience
awesome-open-mlops
The Fuzzy Labs guide to the universe of open source MLOps
Stars: ✭ 304 (+700%)
Mutual labels:  datascience
neptune-examples
Examples of using Neptune to keep track of your experiments (maintenance only).
Stars: ✭ 22 (-42.11%)
Mutual labels:  datascience
data science chile
List of Data Science courses in Chile 📈📊🇨🇱
Stars: ✭ 22 (-42.11%)
Mutual labels:  datascience
data-science-best-practices
The goal of this repository is to enable data scientists and ML engineers to develop data science use cases and making it ready for production use. This means focusing on the versioning, scalability, monitoring and engineering of the solution.
Stars: ✭ 53 (+39.47%)
Mutual labels:  datascience
r-resources-for-data-science
A biggest collection of free books and other resources for R programming
Stars: ✭ 24 (-36.84%)
Mutual labels:  datascience
analytics-platform-ops
Ops and deployment resources for MOJ Analytics platform
Stars: ✭ 18 (-52.63%)
Mutual labels:  datascience
DataScience-Squad
Data Science Squad Roadmap
Stars: ✭ 28 (-26.32%)
Mutual labels:  dataanalysis
genero-nomes
Classifies names by gender according to the IBGE API
Stars: ✭ 33 (-13.16%)
Mutual labels:  datascience
Data-Science-and-Machine-Learning-Resources
List of Data Science and Machine Learning Resource that I frequently use
Stars: ✭ 19 (-50%)
Mutual labels:  datascience
dst
yet another custom data science template via cookiecutter
Stars: ✭ 59 (+55.26%)
Mutual labels:  datascience
ETL-Starter-Kit
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.
Stars: ✭ 21 (-44.74%)
Mutual labels:  datascience
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-31.58%)
Mutual labels:  datascience
hmac-timing-attacks
HMAC timing attack's w/ statistical analysis
Stars: ✭ 22 (-42.11%)
Mutual labels:  datascience
HackyHourHandbook
A handbook for those who want to start coordinating Hacky Hour events in their University/Institute
Stars: ✭ 43 (+13.16%)
Mutual labels:  datascience

akabe/ocaml-jupyter-datascience


A ready-to-use environment of Jupyter (IPython notebook) and OCaml Jupyter (OCaml kernel) with libraries for data science and machine learning.

Getting started

First, launch a Jupyter server as follows.

$ docker run -it -p 8888:8888 akabe/ocaml-jupyter-datascience
[I 15:38:04.170 NotebookApp] Writing notebook server cookie secret to /home/opam/.local/share/jupyter/runtime/notebook_cookie_secret
[W 15:38:04.190 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 15:38:04.197 NotebookApp] Serving notebooks from local directory: /notebooks
[I 15:38:04.197 NotebookApp] 0 active kernels
[I 15:38:04.197 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8888/?token=4df0fee0719115f474c8dd9f9281abed28db140d25f933e9
[I 15:38:04.197 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 15:38:04.198 NotebookApp] No web browser found: could not locate runnable browser.
[C 15:38:04.198 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=4df0fee0719115f474c8dd9f9281abed28db140d25f933e9

Second, open the URL on the last line of the output above in your web browser.

[Screenshot of Jupyter with OCaml]

You can now create OCaml notebooks!
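If the container runs in the background, the login URL has to be recovered from its logs. The sketch below assumes the log format shown above. The function name `token_url` and the container name `ocaml-jupyter` are illustrative and not part of the image.

```shell
# Print the last tokenised login URL found in Jupyter log text on stdin.
# Assumes a log line containing "http://localhost:<port>/?token=<hex>",
# as in the output above. token_url is a hypothetical helper.
token_url() {
  grep -Eo 'http://localhost:[0-9]+/\?token=[0-9a-f]+' | tail -n 1
}

# Usage (hypothetical container name "ocaml-jupyter"):
#   docker run -d --name ocaml-jupyter -p 8888:8888 akabe/ocaml-jupyter-datascience
#   docker logs ocaml-jupyter 2>&1 | token_url
```

Jupyter writes its log to standard error, hence the `2>&1` before the pipe.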

Notebooks on your host machine can be mounted into the Docker container as follows:

$ docker run -it -p 8888:8888 -v $PWD:/notebooks akabe/ocaml-jupyter-datascience
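The mount can be wrapped in a small helper so that the host directory and the host port become parameters. This is only a sketch: `run_notebooks_cmd` is a hypothetical name, and it prints the command instead of running it so the result can be inspected first.

```shell
# Print a `docker run` command that mounts a host directory at /notebooks.
# run_notebooks_cmd is a hypothetical helper, not part of the image.
#   $1: host directory to mount (default: current directory)
#   $2: host port to publish (default: 8888)
run_notebooks_cmd() {
  dir=${1:-$PWD}
  port=${2:-8888}
  # With -v, a source without a leading slash is treated as a named volume,
  # so relative paths are made absolute first.
  case $dir in
    /*) ;;
    *) dir=$PWD/$dir ;;
  esac
  printf 'docker run -it -p %s:8888 -v "%s":/notebooks akabe/ocaml-jupyter-datascience\n' "$port" "$dir"
}

run_notebooks_cmd /home/me/notebooks 9999
```

To run the printed command directly, pass it to the shell, for example `eval "$(run_notebooks_cmd /home/me/notebooks 9999)"`.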

Distributions

The default images are built on Debian 8:

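| Tag    | OCaml  | OPAM  | Command                                            | Dockerfile |
|--------|--------|-------|----------------------------------------------------|------------|
| latest | 4.05.0 | 1.2.2 | docker pull akabe/ocaml-jupyter-datascience        | Dockerfile |
| 4.04.1 | 4.04.1 | 1.2.2 | docker pull akabe/ocaml-jupyter-datascience:4.04.1 | Dockerfile |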

CentOS

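| Distribution | OCaml  | OPAM  | Command                                                         | Dockerfile |
|--------------|--------|-------|-----------------------------------------------------------------|------------|
| CentOS 7     | 4.05.0 | 1.2.2 | docker pull akabe/ocaml-jupyter-datascience:centos7_ocaml4.05.0 | Dockerfile |
| CentOS 7     | 4.04.1 | 1.2.2 | docker pull akabe/ocaml-jupyter-datascience:centos7_ocaml4.04.1 | Dockerfile |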

Debian

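| Distribution | OCaml  | OPAM  | Command                                                         | Dockerfile |
|--------------|--------|-------|-----------------------------------------------------------------|------------|
| Debian 8     | 4.05.0 | 1.2.2 | docker pull akabe/ocaml-jupyter-datascience:debian8_ocaml4.05.0 | Dockerfile |
| Debian 8     | 4.04.1 | 1.2.2 | docker pull akabe/ocaml-jupyter-datascience:debian8_ocaml4.04.1 | Dockerfile |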
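The distribution-specific tags listed above follow the pattern `<distribution>_ocaml<version>`. A minimal sketch that composes the image reference from those two parts is shown below. `image_ref` is a hypothetical helper, and only the combinations listed above are known to exist.

```shell
# Compose an image reference from a distribution and an OCaml version,
# following the tag pattern of the distribution-specific images.
# image_ref is a hypothetical helper; it does not check that the tag exists.
image_ref() {
  printf 'akabe/ocaml-jupyter-datascience:%s_ocaml%s\n' "$1" "$2"
}

# Usage: docker pull "$(image_ref centos7 4.05.0)"
image_ref centos7 4.05.0
```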

Pre-installed packages

Standard libraries

The OCaml standard library is too small for practical use. The following packages provide popular data structures and many frequently used functions, such as string operations and various ways of iterating over collections.

  • Jane Street Core (GitHub, API) — A huge extended standard library developed by Jane Street Capital. The library is actively maintained and reliable thanks to its industrial use at Jane Street. Its interface is designed differently from that of the OCaml standard library.
  • Batteries Included (GitHub, API) — A well-known extended standard library that is compatible with the OCaml standard library. It is smaller than Jane Street Core, but it implements the commonly used functions.

Numerical computation

Visualization

Data sources

Concurrent programming

Other packages

  • Re (GitHub) — A fast and easy-to-use regular expression library for OCaml. This library supports Glob, POSIX, Perl, PCRE, and OCaml-Str-style syntaxes.
  • Camomile (GitHub, API) — Camomile is a library for character encoding conversion and Unicode utilities.
  • LambdaSoup (GitHub, API) — Lambda Soup is a functional HTML scraping and manipulation library for OCaml aimed at being easy to use.
  • OCaml CSV (GitHub, API) — A library to read and write comma-separated-values (CSV) format files.
  • ppx_sexp_conv — Automatic generation of converters between OCaml datatypes and S-expressions.
  • ppx_deriving_yojson — Automatic generation of converters between OCaml datatypes and JSON.
  • ppx_regexp — Pattern matching by PCRE-style regular expressions.

CUI tools

  • ImageMagick — ImageMagick is a program to create, edit, compose, or convert bitmap images. This supports many formats, e.g., PNG, JPEG, GIF, TIFF, PDF, etc.
  • FFmpeg — FFmpeg is a powerful tool for converting audio and video files.
  • PhantomJS — PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.

Examples

$ git clone https://github.com/akabe/docker-ocaml-jupyter-datascience.git
$ docker run -it -p 8888:8888 -v $PWD/docker-ocaml-jupyter-datascience/notebooks:/notebooks akabe/ocaml-jupyter-datascience

Contribution

If you know of a widely used numerical library for OCaml, find a bug, or have an idea for improving this environment, please open an issue or submit a pull request.
