October 15 · Issue #211
|
This week’s pick is a humorous data visualization of the experiences a new mum faces, including tears, tummy time and advice. We also have a piece on multiprocessing to speed up BigQuery data retrieval, and a deep dive into WeatherBug’s use of Redshift Spectrum to decrease ETL latency. Stay healthy!
|
How I Survived Being a New Mum Using My Dataviz Goggles | Nightingale
When the author found out she was pregnant in the middle of the first lockdown, she decided to use a data-driven approach to document her pregnancy and early motherhood.
|
|
BigQuery fetching + multiprocessing | by Tristan Bilot | Oct, 2021 | Towards Data Science
Benchmarking different approaches to fetching data using multiple threads, on a laptop and on a GCP Compute Engine instance.
|
The fastest real-time data replication on the market | FlyData
FlyData has the fastest change data capture (CDC) on the market. Get real-time data insights and updates so you don’t miss important changes to your data. [Sponsored]
|
|
How To Build a Robust Data Infrastructure | *instinctools
A basic high-level summary describing five must-dos for creating an infrastructure that allows analysts easy access to your data.
|
Automate your Amazon Redshift performance tuning with automatic table optimization | Amazon Web Services
An example with benchmarks showing how automatic distribution keys and sort keys can keep your Redshift tables performing well without manual tuning.
|
|
Why You Should Rethink Where You Write Your SQL | by Robert Yi | Oct, 2021 | Towards Data Science
“Where there once was only a thin trail of Slack whispers leading to an obscure Google doc, there now stands an almanac of past SQL queries, insights, and resulting business decisions.”
|
Understanding Python imports, __init__.py and pythonpath — once and for all | by Dr. Varshita Sher | Oct, 2021 | Towards Data Science
A clear, example-based explanation of the workings of Python imports.
|
|
Big Data Visualization Using Datashader in Python | by Sophia Yang | Oct, 2021 | Towards Data Science
Four lines of code and six milliseconds to plot 11 million rows of data on a laptop, with Datashader.
|
Visualize data on a Choropleth map with Geopandas and Matplotlib | by Aindriya Barua (She/They) | Oct, 2021 | Medium
A nice example that plots the knowledge of English in India on a map.
|
|
WeatherBug reduced ETL latency to 30 times faster using Amazon Redshift Spectrum | Amazon Web Services
Instead of using a Hadoop/Airflow transform, the WeatherBug team stored data directly in S3 for a striking increase in data pipeline performance.