datasweet / Datatable
Licence: apache-2.0
A Go in-memory table
Stars: ✭ 215
Projects that are alternatives of or similar to Datatable
daany
Daany - .NET DAta ANalYtics .NET library with the implementation of DataFrame, Time series decompositions and Linear Algebra routines BLASS and LAPACK.
Stars: ✭ 49 (-77.21%)
Mutual labels: series, dataframe
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+332.09%)
Mutual labels: dataframe, dataset
tv
📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.
Stars: ✭ 1,763 (+720%)
Mutual labels: datatable, dataframe
Weihanli.npoi
NPOI Extensions, excel/csv importer/exporter for IEnumerable<T>/DataTable, fluentapi(great flexibility)/attribute configuration
Stars: ✭ 157 (-26.98%)
Mutual labels: dataset, datatable
Tech.ml.dataset
A Clojure high performance data processing system
Stars: ✭ 205 (-4.65%)
Mutual labels: dataframe, dataset
Trump Lies
Tutorial: Web scraping in Python with Beautiful Soup
Stars: ✭ 201 (-6.51%)
Mutual labels: dataset
Mini Imagenet Tools
Tools for generating mini-ImageNet dataset and processing batches
Stars: ✭ 209 (-2.79%)
Mutual labels: dataset
Ballista
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+957.67%)
Mutual labels: dataframe
Inspectdf
🛠️ 📊 Tools for Exploring and Comparing Data Frames
Stars: ✭ 195 (-9.3%)
Mutual labels: dataframe
Dialogrpt
EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"
Stars: ✭ 216 (+0.47%)
Mutual labels: dataset
Covid19za
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Stars: ✭ 208 (-3.26%)
Mutual labels: dataset
Semantic Segmentation Suite
Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
Stars: ✭ 2,395 (+1013.95%)
Mutual labels: dataset
Awesome Json Datasets
A curated list of awesome JSON datasets that don't require authentication.
Stars: ✭ 2,421 (+1026.05%)
Mutual labels: dataset
Ava downloader
⏬ Download AVA dataset (A Large-Scale Database for Aesthetic Visual Analysis)
Stars: ✭ 214 (-0.47%)
Mutual labels: dataset
Peroxide
Rust numeric library with R, MATLAB & Python syntax
Stars: ✭ 191 (-11.16%)
Mutual labels: dataframe
React Table
⚛️ Hooks for building fast and extendable tables and datagrids for React
Stars: ✭ 15,739 (+7220.47%)
Mutual labels: datatable
Omnianomaly
KDD 2019: Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network
Stars: ✭ 208 (-3.26%)
Mutual labels: dataset
datatable
datatable is a Go package for manipulating tabular data, much like an Excel spreadsheet. It is inspired by the pandas Python package and R's data.frame structure. Although it is production ready, be aware that the API is still being improved.
Installation
go get github.com/datasweet/datatable
Features
- Create custom Series (i.e. custom columns). Currently available: serie.Int, serie.String, serie.Time, serie.Float64.
- Apply expressions
- Selects (head, tail, subset)
- Sorting
- InnerJoin, LeftJoin, RightJoin, OuterJoin, Concats
- Aggregate
- Import from CSV
- Export to map, slice
Creating a DataTable
package main

import (
	"fmt"

	"github.com/datasweet/datatable"
)

func main() {
	dt := datatable.New("test")
	dt.AddColumn("champ", datatable.String, datatable.Values("Malzahar", "Xerath", "Teemo"))
	dt.AddColumn("champion", datatable.String, datatable.Expr("upper(`champ`)"))
	dt.AddColumn("win", datatable.Int, datatable.Values(10, 20, 666))
	dt.AddColumn("loose", datatable.Int, datatable.Values(6, 5, 666))
	dt.AddColumn("winRate", datatable.Float64, datatable.Expr("`win` * 100 / (`win` + `loose`)"))
	dt.AddColumn("winRate %", datatable.String, datatable.Expr(" `winRate` ~ \" %\""))
	dt.AddColumn("sum", datatable.Float64, datatable.Expr("sum(`win`)"))

	fmt.Println(dt)
}

/*
CHAMP <NULLSTRING>  CHAMPION <NULLSTRING>  WIN <NULLINT>  LOOSE <NULLINT>  WINRATE <NULLFLOAT64>  WINRATE % <NULLSTRING>  SUM <NULLFLOAT64>
Malzahar            MALZAHAR               10             6                62.5                   62.5 %                  696
Xerath              XERATH                 20             5                80                     80 %                    696
Teemo               TEEMO                  666            666              50                     50 %                    696
*/
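To make the expression columns concrete, here is a minimal sketch in plain Go (standard library only, no datatable calls) that reproduces the arithmetic behind the winRate and sum columns. The helper names winRate and sumInts are hypothetical and exist only for this illustration; they say nothing about how the library evaluates expressions internally.

```go
package main

import "fmt"

// winRate mirrors the expression `win` * 100 / (`win` + `loose`),
// evaluated in floating point to match the Float64 column type.
// Hypothetical helper, not part of datatable.
func winRate(win, loose int) float64 {
	return float64(win) * 100 / float64(win+loose)
}

// sumInts mirrors sum(`win`): a single total over the whole column,
// which the table then repeats on every row.
// Hypothetical helper, not part of datatable.
func sumInts(values []int) int {
	total := 0
	for _, v := range values {
		total += v
	}
	return total
}

func main() {
	wins := []int{10, 20, 666}
	looses := []int{6, 5, 666}

	for i := range wins {
		// Same shape as the "winRate %" column: the rate followed by " %".
		fmt.Printf("%g %%\n", winRate(wins[i], looses[i]))
	}
	fmt.Println(sumInts(wins))
}
```

Running it yields the same 62.5, 80, 50 and 696 values that appear in the printed table above.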
Reading a CSV and aggregating
package main

import (
	"fmt"
	"log"
	"os"
	"time"

	"github.com/datasweet/datatable"
	"github.com/datasweet/datatable/import/csv"
)

func main() {
	// Import the CSV, accepting dates written in either of these Go time layouts.
	dt, err := csv.Import("csv", "phone_data.csv",
		csv.HasHeader(true),
		csv.AcceptDate("02/01/06 15:04"),
		csv.AcceptDate("2006-01"),
	)
	if err != nil {
		log.Fatalf("reading csv: %v", err)
	}

	dt.Print(os.Stdout, datatable.PrintMaxRows(24))

	// Aggregate over the whole table: COUNT('index').
	dt2, err := dt.Aggregate(datatable.AggregateBy{Type: datatable.Count, Field: "index"})
	if err != nil {
		log.Fatalf("aggregate COUNT('index'): %v", err)
	}
	fmt.Println(dt2)

	// Group rows by the year of their "date" column.
	// Rows without a time.Time date are reported as having no key.
	groups, err := dt.GroupBy(datatable.GroupBy{
		Name: "year",
		Type: datatable.Int,
		Keyer: func(row datatable.Row) (interface{}, bool) {
			if d, ok := row["date"]; ok {
				if tm, ok := d.(time.Time); ok {
					return tm.Year(), true
				}
			}
			return nil, false
		},
	})
	if err != nil {
		log.Fatalf("GROUP BY 'year': %v", err)
	}

	// Aggregate per group: SUM('duration') and COUNT_DISTINCT('network').
	dt3, err := groups.Aggregate(
		datatable.AggregateBy{Type: datatable.Sum, Field: "duration"},
		datatable.AggregateBy{Type: datatable.CountDistinct, Field: "network"},
	)
	if err != nil {
		log.Fatalf("Aggregate SUM('duration'), COUNT_DISTINCT('network') GROUP BY 'year': %v", err)
	}
	fmt.Println(dt3)
}
Creating a custom serie
To create a custom serie you must provide:
- a caster function, to cast a generic value to your serie value. The signature must be func(i interface{}) T
- a comparator, to compare your serie value. The signature must be func(a, b T) int
Example with a NullInt
// IntN is an alias to create the custom Serie that manages NullInt values
func IntN(v ...interface{}) Serie {
	s, _ := New(NullInt{}, asNullInt, compareNullInt)
	if len(v) > 0 {
		s.Append(v...)
	}
	return s
}

type NullInt struct {
	Int   int
	Valid bool
}

// Interface() renders the current struct as a value.
// If not provided, serie.All() or serie.Get() will return the embedded value,
// i.e. NullInt{}
func (i NullInt) Interface() interface{} {
	if i.Valid {
		return i.Int
	}
	return nil
}

// asNullInt is our caster function
func asNullInt(i interface{}) NullInt {
	var ni NullInt
	if i == nil {
		return ni
	}

	if v, ok := i.(NullInt); ok {
		return v
	}

	if v, err := cast.ToIntE(i); err == nil {
		ni.Int = v
		ni.Valid = true
	}
	return ni
}

// compareNullInt is our comparator function,
// used to sort
func compareNullInt(a, b NullInt) int {
	if !b.Valid {
		if !a.Valid {
			return Eq
		}
		return Gt
	}
	if !a.Valid {
		return Lt
	}
	if a.Int == b.Int {
		return Eq
	}
	if a.Int < b.Int {
		return Lt
	}
	return Gt
}
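The ordering implied by this comparator can be checked in isolation. The sketch below is self-contained and standard-library only: it redeclares NullInt and the comparator (written as a switch, same logic), and sorts a slice with sort.SliceStable. The numeric values given to Lt, Eq and Gt are an assumption, since the snippet above uses these names without showing their definitions, and sortNullInts is a hypothetical helper; how the library itself applies the comparator when sorting is not shown in this README.

```go
package main

import (
	"fmt"
	"sort"
)

// Assumed values for the comparison results; only their being
// distinct matters for this sketch.
const (
	Lt = -1
	Eq = 0
	Gt = 1
)

type NullInt struct {
	Int   int
	Valid bool
}

// compareNullInt follows the same rules as the comparator above:
// an invalid (null) value compares as smaller than any valid value,
// and two invalid values compare as equal.
func compareNullInt(a, b NullInt) int {
	switch {
	case !a.Valid && !b.Valid:
		return Eq
	case !b.Valid:
		return Gt
	case !a.Valid:
		return Lt
	case a.Int == b.Int:
		return Eq
	case a.Int < b.Int:
		return Lt
	}
	return Gt
}

// sortNullInts sorts values in ascending order according to the comparator.
// Hypothetical helper, not part of datatable.
func sortNullInts(values []NullInt) {
	sort.SliceStable(values, func(i, j int) bool {
		return compareNullInt(values[i], values[j]) == Lt
	})
}

func main() {
	values := []NullInt{{Int: 3, Valid: true}, {}, {Int: 1, Valid: true}}
	sortNullInts(values)
	fmt.Println(values)
}
```

Under these rules an ascending sort places null entries first, followed by the valid integers in increasing order.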
Who are we?
We are Datasweet, a French startup providing full-service (big) data solutions.
Questions? Problems? Suggestions?
If you find a bug or want to request a feature, please create a GitHub Issue.
Contributors
Cléo Rebert
License
This software is licensed under the Apache License, version 2 ("ALv2"), quoted below.
Copyright 2017-2020 Datasweet <http://www.datasweet.fr>
Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.