January 15 · Issue #172
|
This week’s pick is an interesting counterpoint to the notion that adding more engineers to a problem will make it better – worth a read if only for a different perspective. We also have pieces on Hive, metric standardization at Uber, and a complex visualization in Power BI. Stay healthy!
|
|
|
Hiring additional data engineers is a problem | Mammoth Analytics
Adding engineers to a messy data problem is usually counterproductive. This piece argues that data quality work (cleansing and consolidation) is best handled by the knowledge worker.
|
|
Introducing SAYN: A Simple Yet Powerful Data Processing Framework | by Robin Watteaux | Jan, 2021 | Towards Data Science
An introduction to SAYN, an open source data processing framework, built to be simple yet flexible.
|
Building complex workflows with Amazon MWAA, AWS Step Functions, AWS Glue, and Amazon EMR | Amazon Web Services
Using a set of AWS tools to create a complex workflow to load your dataset – including the relatively new Amazon MWAA (Managed Workflows for Apache Airflow).
|
Xplenty | Simplified ETL & ELT to BigQuery, Snowflake, Redshift & Azure
Rapid data preparation and transformation for ever-evolving and changing data requirements. Secure & compliant data pipelines. Get started today. [Sponsored Content]
|
|
Apache Hive Class: Tasting The Honey | by Or Bar Ilan | Jan, 2021 | Medium
Hive is a data warehouse and query interface on top of Hadoop’s native MapReduce, which allows SQL-style queries in the Hive Query Language. This piece is a good overview, plus the source of a couple of Winnie-the-Pooh memes.
|
|
Assessing batch-processed analytics of anonymized rideshare data | by Farhan Juneja | Jan, 2021 | Medium
The City of Chicago is the first city in the country to publish anonymized rideshare data from companies including Uber, Lyft, and Via.
|
Change Data Analysis with Debezium and Apache Pinot | by Kenny Bastani | Apache Pinot Developer Blog | Jan, 2021 | Medium
Exploring real-time analytics based on combining the popular CDC tool, Debezium, with the real-time OLAP datastore, Apache Pinot.
|
|
COVID-19 Global Vaccination - DataChant
This complex, well-done visualization of COVID-19 vaccination data is a good one for the Power BI aficionados.
|
Visualizing the Sustainable Development Goals | by Claire Santoro | Nightingale | Jan, 2021 | Medium
|
|
The Journey Towards Metric Standardization | Uber Engineering Blog
How Uber implemented uMetric, their unified metric platform, with the goal of making sure the entire organization is aware of and agrees on the key metrics that drive decision-making and measure progress.