Python and PySpark Tips for Data Scientist
What?
The book focuses on Python and PySpark tips for data scientists, and includes the following topics:
- Setting up Python on a local computer or in the cloud
- Connecting to databases
- Working with AWS S3
- Python primer functions
- Data Structures: List, Tuple, Dictionary
- pandas.DataFrame vs PySpark DataFrame
- Package Wrapper
- Model Deployment with Flask
- API Book
- Online machine learning courses and useful websites
Resources
- Online version: https://runawayhorse001.github.io/PythonTipsDS
- PDF version: https://runawayhorse001.github.io/PythonTipsDS/pythonTipsDS.pdf
- Source code: https://github.com/runawayhorse001/PythonTipsDS
Authors
Wenqiang Feng, Ph.D. and Jing Yang, Ph.D.
Book for Apache Spark (by Wenqiang Feng, Ph.D.)
If you are interested in learning data science with Apache Spark with PySpark, I recommend Learning Apache Spark with Python by Wenqiang Feng, Ph.D.
What?
The book focuses on the core data science applications of Apache Spark with PySpark, and includes the following topics:
- Introduction to Spark
- Working and exploring data
- Core stats and ML - regression, classification, clustering
- Text mining and social media analysis
- Monte Carlo and MCMC simulation
- Neural network
- Working with Spark wrappers
Resources
- Online version: https://runawayhorse001.github.io/LearningApacheSpark/index.html
- PDF version: https://runawayhorse001.github.io/LearningApacheSpark/pyspark.pdf
- Source code: https://github.com/runawayhorse001/LearningApacheSpark