January 8 · Issue #171 |
|
This week’s pick is a pair of lists for current or budding data scientists and analysts still working on their New Year’s resolutions: five books and five GitHub repos. We also have roundups of what 2020 brought to Redshift and Snowflake, and a couple of good examples using Matplotlib. Stay healthy!
|
|
|
5 Books Every Data Scientist Should Read in 2021 | by Arthur Mello | Jan, 2021 | Medium
Our first pick is five data science books for you to read in 2021.
|
Top Python Github Repos Jan 2021 for Data Scientists | by Mohammad Ahmad | Jan, 2021 | Towards Data Science
To complement the five books, here are five GitHub repos with code for data scientists or analysts.
|
|
BigTips: Removing Duplicates while Maintaining Row History | by Brian Suk | Google Cloud - Community | Dec, 2020 | Medium
Getting rid of duplicates is a common task for any data pipeline. This piece describes how to do it within BigQuery, using native BigQuery tools.
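For readers who want a feel for the idea before opening the article, here is a minimal plain-Python sketch of deduplicating while keeping history. It is not the article's BigQuery approach: the function `latest_per_key` and the columns `id`, `updated_at`, and `status` are hypothetical names chosen for illustration. The full list of rows is left untouched as the history, and a separate "current" view keeps only the newest row per key.

```python
def latest_per_key(rows, key="id", version="updated_at"):
    """Return one row per key, keeping the row with the greatest version value.

    The input list is not modified, so it still holds the full row history.
    """
    latest = {}
    for row in rows:
        k = row[key]
        # Keep this row if the key is new or this row is more recent.
        if k not in latest or row[version] > latest[k][version]:
            latest[k] = row
    return list(latest.values())


# Hypothetical sample data: two versions of row 1, one version of row 2.
rows = [
    {"id": 1, "updated_at": "2020-12-01", "status": "new"},
    {"id": 1, "updated_at": "2020-12-05", "status": "shipped"},
    {"id": 2, "updated_at": "2020-12-03", "status": "new"},
]

current = latest_per_key(rows)  # deduplicated view; `rows` remains the history
```

ISO-formatted date strings compare correctly as plain strings, which is why no date parsing is needed in this sketch.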
|
Xplenty | Simplified ETL & ELT to BigQuery, Snowflake, Redshift & Azure
Rapid data preparation and transformation for ever-evolving and changing data requirements. Secure & compliant data pipelines. Get started today. [Sponsored Content]
|
|
Snowflake Data Cloud Summit 2020. Snowflake Data Cloud Summit 2020 is a… | by Hari Bairaju | Jan, 2021 | Medium
The title of this piece is a little misleading – it’s mainly a list of new features introduced in Snowflake during 2020, but it’s a good roundup with pointers to documentation for every new feature.
|
The best new features for data analysts in Amazon Redshift in 2020 | Amazon Web Services
A good roundup of the (significant) advancements in Redshift during 2020. If you haven’t been paying attention to Redshift lately, this is worth a read.
|
|
Advanced NumPy: Master stride tricks with 25 illustrated exercises | by Raimi Karim | Jan, 2021 | Towards Data Science
A set of nicely illustrated NumPy array transforms for readers with some Python knowledge.
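As a taste of what a stride trick looks like, the sketch below builds overlapping sliding windows over a small array without copying data. It is not taken from the article's exercises; the array and window size are assumptions chosen for illustration. It uses `numpy.lib.stride_tricks.as_strided`, which reinterprets the same memory with a new shape and new strides.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(6, dtype=np.int64)  # [0, 1, 2, 3, 4, 5]
step = x.strides[0]               # bytes between consecutive elements

# Four windows of length three. Moving one row down or one column right
# both advance by a single element, so the rows overlap in memory.
windows = as_strided(x, shape=(4, 3), strides=(step, step), writeable=False)
```

Passing `writeable=False` is a precaution: the windows share memory with `x`, so a write through the view would touch several windows at once.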
|
7 Most Recommended Skills to Learn in 2021 to be a Data Scientist | by Terence Shin | Jan, 2021 | Towards Data Science
Here’s a good piece for those of you still thinking about your New Year’s resolutions – seven solid skills to learn in 2021.
|
|
How to Create Animation using Matplotlib and Celluloid | by Rizky MN | Data Driven Investor
How to create animation using Matplotlib and Celluloid in Python, with both 2D and 3D examples.
|
How to visualize hypergraphs with Python and networkx — The Easy Way | by Alessandro Angioi | Jan, 2021 | Towards Data Science
Hypergraphs are a generalization of graphs where one relaxes the requirement for edges to connect just two nodes and allows instead edges to connect multiple nodes. This piece includes a nice tutorial-style
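To make the definition of a hypergraph concrete, here is a small plain-Python sketch in which each hyperedge is simply a set of nodes. It is not necessarily the article's approach: the edge names, node names, and helper functions are hypothetical. The second helper expands the hypergraph into (hyperedge, node) pairs, a bipartite form that ordinary graph tools can draw.

```python
from collections import defaultdict

# Each hyperedge connects any number of nodes, not just two.
hyperedges = {
    "e1": {"a", "b", "c"},
    "e2": {"c", "d"},
    "e3": {"a", "d", "e", "f"},
}


def incidence(hyperedges):
    """Map each node to the set of hyperedges that contain it."""
    node_to_edges = defaultdict(set)
    for name, nodes in hyperedges.items():
        for node in nodes:
            node_to_edges[node].add(name)
    return dict(node_to_edges)


def bipartite_edges(hyperedges):
    """List (hyperedge, node) pairs: one ordinary edge per membership."""
    return sorted(
        (name, node) for name, nodes in hyperedges.items() for node in nodes
    )
```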
|
|
Testing data quality at scale with PyDeequ | Amazon Web Services
PyDeequ is a Python wrapper for the Scala-based tool Deequ, Amazon’s in-house tool for verifying the quality of large production datasets.
|
|
|
|
|
650 California St., San Francisco, CA 94108
|