|
November 26 · Issue #217
|
This week’s pick is a look at data governance: why it got a bad name and how to save its reputation. We also have a piece on choosing a data catalog, as well as one on data integrity with Great Expectations. Stay healthy!
|
|
|
Data Governance Has a Serious Branding Problem | by Prukalpa | Nov, 2021 | Towards Data Science
Why data governance is having an identity crisis, what it was envisioned as decades ago, and how we can save the reputations of data stewards everywhere.
|
|
Provide data reliability in Amazon Redshift at scale using Great Expectations library | Amazon Web Services
Data reliability or data integrity is always a difficult challenge. This solution shows how to use the Great Expectations data validation library in a data pipeline with a Redshift endpoint.
|
How to create a Salesforce ETL pipeline in less than 30 minutes | Xplenty
We’ll show you how easy it is to create a Salesforce ETL pipeline so you can migrate your Salesforce data to a data warehouse or data lake for analytics and reporting. [Sponsored]
|
|
Use the Amazon Redshift SQLAlchemy dialect to interact with Amazon Redshift | Amazon Web Services
SQLAlchemy is a SQL toolkit and Object Relational Mapper (ORM) for Python. This piece shows how to use it with Redshift.
|
How to evaluate a data catalog. The data catalog is becoming a… | by Grant Seward | Nov, 2021 | Medium
The data catalog, along with the warehouse and BI tool, is one of the three pillars of a data ecosystem. This piece explains how to choose one.
|
|
An introduction to Probability Sampling Methods | by Eugenia Anello | Nov, 2021 | Towards Data Science
Five different probabilistic methods for choosing a sample for a statistical study.
|
Use Cloud Storage as a mounted local file system in Vertex AI and AI Platform to store the training data and outputs. | Google Cloud Blog
This blog post introduces Google Cloud Storage FUSE to Vertex AI and AI Platform users. The feature lets training jobs on these platforms read and write data on Cloud Storage through a mounted file system.
|
|
Visualising Global Population Datasets with Python | by Parvathy Krishnan | Nov, 2021 | Towards Data Science
Using publicly available data to explore administrative boundaries and population, and evaluating which data sources and techniques work best for this application.
|
5 Steps To Choosing Great Data Visualizations for Your Data Science Projects | by Benjamin Nweke | Nov, 2021 | Towards Data Science
A good step-by-step guide for those new to data science visualizations.
|
|
The Rise (and Lessons Learned) of ML Models to Personalize Content on Home (Part II) : Spotify Engineering
Part 1 of this series explained the models used at Spotify to recommend content. This piece explains how Spotify evaluates those models.
|
Parameter Exploration at Lyft. What is Parameter Exploration | by Henry Quan | Nov, 2021 | Lyft Engineering
How the Lyft experimentation team uses A/B testing, time-split and region-split experiments to evaluate product changes.
|
|
|
|
|
650 California St., San Francisco, CA 94108
|