AI learning roadmap: nearly 200 hands-on case studies and projects with free companion course materials, suitable for complete beginners and aimed at job readiness. Covers Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, TensorFlow, Caffe, Keras, NumPy, pandas, Matplotlib, seaborn, data mining, data science, algorithms, and other popular areas.

Stars: ✭ 4,387 (+3125.74%)

Mutual labels: numpy, pandas, data-analysis

Udacity-Data-Analyst-Nanodegree

Repository for the projects needed to complete the Data Analyst Nanodegree.

Stars: ✭ 31 (-77.21%)

Mutual labels: numpy, pandas, data-analysis

Mlcourse.ai

Open Machine Learning Course

Stars: ✭ 7,963 (+5755.15%)

Mutual labels: numpy, pandas, data-analysis

Data Analysis

Mainly a summary of web-scraping and data-analysis projects, plus modeling, machine learning, and model evaluation.

Stars: ✭ 142 (+4.41%)

Mutual labels: numpy, pandas, data-analysis

Pyda 2e Zh

📖 [Chinese translation] Python for Data Analysis, 2nd Edition

Stars: ✭ 866 (+536.76%)

Mutual labels: numpy, pandas, data-analysis

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+16111.76%)

Mutual labels: spark, numpy, pandas

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (+122.79%)

Mutual labels: spark, pandas, data-analysis

Data Science Hacks

Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.

Stars: ✭ 273 (+100.74%)

Mutual labels: numpy, pandas, data-analysis

Data-Analyst-Nanodegree

Kai Sheng Teh - Udacity Data Analyst Nanodegree

Stars: ✭ 42 (-69.12%)

Mutual labels: numpy, pandas, data-analysis

Seaborn Tutorial

This repository is my attempt to help Data Science aspirants gain necessary Data Visualization skills required to progress in their career. It includes all the types of plot offered by Seaborn, applied on random datasets.

Stars: ✭ 114 (-16.18%)

Mutual labels: numpy, pandas, data-analysis

Awkward 1.0

Manipulate JSON-like data with NumPy-like idioms.

Stars: ✭ 203 (+49.26%)

Mutual labels: numpy, pandas, data-analysis

Datscan

DatScan is an initiative to build an open-source CMS that will have the capability to solve any problem using data Analysis just with the help of various modules and a vast standardized module library

Stars: ✭ 13 (-90.44%)

Mutual labels: numpy, pandas, data-analysis

Data-Science-Resources

A guide to getting started with Data Science and ML.

Stars: ✭ 17 (-87.5%)

Mutual labels: numpy, pandas, data-analysis

Exploratory Data Analysis Visualization Python

Data analysis and visualization with the PyData ecosystem: Pandas, Matplotlib, NumPy, and Seaborn

Stars: ✭ 78 (-42.65%)

Mutual labels: numpy, pandas

Data-Scientist-In-Python

This repository contains notes and projects of Data scientist track from dataquest course work.

Stars: ✭ 23 (-83.09%)

Mutual labels: numpy, pandas

Algorithmic-Trading

I have been deeply interested in algorithmic trading and systematic trading algorithms. This repository contains the code of what I have learnt on the way. It starts from some basic simple statistics and will lead up to complex machine learning algorithms.

Stars: ✭ 47 (-65.44%)

Mutual labels: numpy, pandas

Product-Categorization-NLP

Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).

Stars: ✭ 30 (-77.94%)

Mutual labels: pandas, data-analysis


And these visions of data types, they kept us up past the dawn.

The Semantic Data Library

Visions provides a set of tools for defining and using semantic data types.

Semantic type detection & inference on sequence data.
Automated data processing.
Completely customizable. Visions makes it easy to build and modify semantic data types for domain-specific purposes.
Out-of-the-box support for multiple backend implementations including pandas, spark, numpy, and python.
A robust set of default types and typesets covering the most common use cases.

Check out the complete documentation here.

Installation

Source code is available on GitHub, and binary installers are available via pip.

# Pip
pip install visions

Complete installation instructions (including extras) are available in the docs.

Quick Start Guide

If you want to play immediately, check out the examples folder on GitHub. Otherwise, let's get some data:

import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
df.head(2)

PassengerId	Survived	Pclass	Name	Sex	Age	SibSp	Parch	Ticket	Fare	Cabin	Embarked
1	0	3	Braund, Mr. Owen Harris	male	22.0	1	0	A/5 21171	7.2500	NaN	S
2	1	1	Cumings, Mrs. John Bradley (Florence Briggs Thayer)	female	38.0	1	0	PC 17599	71.2833	C85	C

The most important abstraction in visions is the Type: each Type represents a semantic notion about data. You have access to a range of well-tested types like Integer, Float, and Files covering the most common software development use cases. Types can be bundled together into typesets. Behind the scenes, visions builds a traversable graph for any collection of types.

from visions import types, typesets

# CompleteSet is one of the builtin typesets (StandardSet is the more basic one)
typeset = typesets.CompleteSet()
typeset.plot_graph()

Note: Plots require pygraphviz to be installed.

Because the graph encodes how types relate to one another, it can be used to detect the type of your data or to infer a more appropriate one.

# Detection looks like this
typeset.detect_type(df)

# While inference looks like this
typeset.infer_type(df)

# Inference works well even if we monkey with the data, say by converting everything to strings
typeset.infer_type(df.astype(str))
>> {
    'PassengerId': Integer,
    'Survived': Integer,
    'Pclass': Integer,
    'Name': String,
    'Sex': String,
    'Age': Float,
    'SibSp': Integer,
    'Parch': Integer,
    'Ticket': String,
    'Fare': Float,
    'Cabin': String,
    'Embarked': String
}
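To make the difference between detection and inference more concrete, here is a minimal, library-free sketch of the idea. It is not how visions is implemented: `ToyType`, `detect`, `infer`, and the three toy types are hypothetical names invented for this illustration, and the relations cover only plain Python lists.

```python
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass(eq=False)
class ToyType:
    """Hypothetical semantic type: a name, a membership test, and outgoing edges."""
    name: str
    contains: Callable[[Sequence], bool]
    # Each relation is (target type, "does it apply?" test, conversion function).
    relations: list = field(default_factory=list)

    def __repr__(self) -> str:
        return self.name


def _parses_as(caster):
    """Build a test that is True when every value can be converted by `caster`."""
    def check(values):
        try:
            for value in values:
                caster(value)
            return True
        except (TypeError, ValueError):
            return False
    return check


Integer = ToyType("Integer", lambda vs: all(type(v) is int for v in vs))
Float = ToyType("Float", lambda vs: all(type(v) is float for v in vs))
String = ToyType("String", lambda vs: all(type(v) is str for v in vs))

# Edges of the toy graph: String -> Integer, String -> Float, Float -> Integer.
String.relations = [
    (Integer, _parses_as(int), lambda vs: [int(v) for v in vs]),
    (Float, _parses_as(float), lambda vs: [float(v) for v in vs]),
]
Float.relations = [
    (Integer, lambda vs: all(v.is_integer() for v in vs), lambda vs: [int(v) for v in vs]),
]


def detect(values, candidates=(Integer, Float, String)):
    """Detection: report the type the values already have, without converting."""
    for toy_type in candidates:
        if toy_type.contains(values):
            return toy_type
    raise TypeError("no toy type matches these values")


def infer(values):
    """Inference: start at the detected type and follow edges while one applies."""
    current = detect(values)
    moved = True
    while moved:
        moved = False
        for target, applies, transform in current.relations:
            if applies(values):
                values, current, moved = transform(values), target, True
                break
    return current, values
```

With this sketch, `detect(["1.0", "2.0"])` reports `String`, while `infer(["1.0", "2.0"])` walks String to Float to Integer and returns the converted values alongside the final type.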

Visions solves many of the most common problems of working with tabular data. For example, sequences of Integers are still recognized as integers whether they have trailing decimal 0's from being cast to float, missing values, or something else altogether. Much of this cleaning is performed automatically, so visions can also return the cleaned and processed data.

cleaned_df = typeset.cast_to_inferred(df)
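The kind of cleaning described above can be illustrated without any library. The following is only a sketch under stated assumptions: `is_missing`, `looks_like_integers`, and `cast_to_optional_int` are hypothetical helpers written for this example, they operate on plain Python lists, and they say nothing about how visions performs the cast internally.

```python
import math
from typing import List, Optional


def is_missing(value) -> bool:
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def _is_whole(value) -> bool:
    """True for ints and for floats with no fractional part (bools excluded)."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()


def looks_like_integers(values) -> bool:
    """True when at least one value is present and every present value is whole."""
    present = [v for v in values if not is_missing(v)]
    return bool(present) and all(_is_whole(v) for v in present)


def cast_to_optional_int(values) -> List[Optional[int]]:
    """Convert whole-valued numbers to int and normalize missing values to None."""
    if not looks_like_integers(values):
        raise ValueError("values are not integer-like")
    return [None if is_missing(v) else int(v) for v in values]
```

For instance, `cast_to_optional_int([1.0, 2.0, float("nan")])` yields integers with the missing entry kept as `None`, whereas a sequence containing `1.5` is rejected.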

This is only a small taste of everything visions can do, including building your own domain-specific types and typesets, so please check out the API documentation or the examples/ directory for more info!

Supported frameworks

Thanks to its dispatch-based implementation, Visions is able to exploit framework-specific capabilities offered by libraries like pandas and spark. It currently works with the following backends by default:

Pandas (feature complete)
Numpy (boolean, complex, date time, float, integer, string, time deltas, objects)
Spark (boolean, categorical, date, date time, float, integer, numeric, object, string)
Python (string, float, integer, date time, time delta, boolean, categorical, object, complex - other datatypes are untested)

If you're using pandas it will also take advantage of parallelization tools like swifter if available.

It also offers a simple annotation-based API for registering new implementations as needed. For example, if you wished to extend the categorical data type to include a Dask-specific implementation, you might do something like:

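from pandas.api import types as pdt
import dask.dataframe as dd  # `import dask` alone does not load the dataframe submodule

from visions.types.categorical import Categorical


# Register a Dask-specific membership test for the Categorical type;
# the implementation is selected by the type annotation on `series`.
@Categorical.contains_op.register
def categorical_contains(series: dd.Series, state: dict) -> bool:
    # isinstance replaces pdt.is_categorical_dtype, which newer pandas deprecates
    return isinstance(series.dtype, pdt.CategoricalDtype)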
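The general pattern behind annotation-based registration can be tried out with the standard library alone. This sketch uses `functools.singledispatch` as a stand-in, so it is an analogy rather than visions' own dispatch machinery, and `contains_integer` is a hypothetical function defined only for this example.

```python
from array import array
from functools import singledispatch


@singledispatch
def contains_integer(sequence) -> bool:
    """Fallback used when no backend is registered for the container type."""
    raise NotImplementedError(f"no backend registered for {type(sequence).__name__}")


@contains_integer.register
def _(sequence: list) -> bool:
    # Generic backend: inspect every element (bool is excluded on purpose).
    return all(type(v) is int for v in sequence)


@contains_integer.register
def _(sequence: array) -> bool:
    # Backend-specific shortcut: the typecode answers the question
    # without looking at any element.
    return sequence.typecode in "bBhHiIlLqQ"
```

Calling `contains_integer([1, 2, 3])` uses the list backend and `contains_integer(array("i", [1, 2]))` uses the array backend. Supporting another container only requires registering one more function, which is the property the Dask example above relies on.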

Contributing and support

Contributions to visions are welcome. For more information, please visit the community contributions page and join us on Slack. The GitHub issue tracker is used for reporting bugs, feature requests, and support questions.

Also, please check out some of the other companies and packages using visions including:

If you're currently using visions or would like to be featured here please let us know.

Acknowledgements

This package is part of the dylan-profiler project and is a core component of pandas-profiling. More information can be found here. This work was partially supported by SIDN Fonds.
