March 25 · Issue #232
|
This week’s pick is an introduction to the data artist role at a grocery retailer. We also have a piece explaining the performance benefits of sharding, as well as a Python implementation of the Levenshtein algorithm for plagiarism detection. Stay healthy!
|
|
|
So what does a data artist in grocery retail work on? | by Cognetry Labs Inc. | Mar, 2022 | Medium
If you think you’re a data artist (and who doesn’t?), this piece is for you.
|
|
Accelerate your data warehouse migration to Amazon Redshift – Part 5 | Amazon Web Services
Part 5 of this series delves into the details of replicating SET tables from Teradata and optimizing INSERT-SELECT statements. The first four parts of this series are similarly detailed and comprehensive.
|
Magento2 vs. Shopify Plus: How to Choose the Right Ecommerce Platform | Integrate.io
We compare two of the top e-commerce solutions, Magento 2 and Shopify Plus, so you can make the right decision for your business. [Sponsored]
|
|
How sharding a database can make it faster - Stack Overflow Blog
Sharding was one of the first ways databases were distributed to improve performance. This piece explains why recent innovations have made it one of the best.
|
Data Mesh Architecture
A data mesh architecture is a decentralized approach that enables domain teams to perform cross-domain data analysis on their own. This site is a deep dive into the theory and implementation of data mesh.
|
A Fundamental Guide to SQL Query Optimization | by Koushik Thota | Mar, 2022 | Medium
A simple but important set of tips and tricks to optimize your SQL queries.
|
|
Data Sampling Methods in Python. A ready-to-run code with different data… | by Tatev Karen | Mar, 2022 | Towards Data Science
Data sampling is an essential part of most research, scientific, and data experiments. This piece shares sample code to create random and representative samples in Python.
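For a quick feel of the difference between a random and a representative sample, here is a minimal standard-library sketch. It is not the article's code: the helper names `simple_random_sample` and `stratified_sample`, the `fraction` parameter, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def simple_random_sample(rows, k, seed=0):
    """Draw k rows uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps runs repeatable
    return rng.sample(rows, k)


def stratified_sample(rows, key, fraction, seed=0):
    """Draw roughly `fraction` of the rows from each group defined by `key`.

    Sampling per group keeps the group proportions of the original data,
    which is one common reading of a "representative" sample.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    sample = []
    for members in groups.values():
        # Take at least one row per group so small groups are not dropped.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    # Hypothetical data: 80 rows in group "a", 20 rows in group "b".
    rows = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
    picked = stratified_sample(rows, key=lambda r: r[0], fraction=0.1)
    print(len(picked), "rows sampled")
```

With a simple random sample, the share of each group in the result varies from draw to draw; the stratified version fixes that share up front.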
|
Text Similarity w/ Levenshtein Distance in Python | by Vatsal | Mar, 2022 | Towards Data Science
How Levenshtein distance works, and how to use Levenshtein distance in building a plagiarism detection pipeline.
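As a taste of the technique before you click through, here is a minimal sketch of the classic dynamic-programming edit distance. It is not the article's implementation: the function names `levenshtein` and `similarity`, and the choice to normalize by the longer string's length, are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the first i-1 characters of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a score in [0, 1], where 1.0 means identical.

    Dividing by the longer length is one possible normalization (an assumption).
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

A plagiarism check could flag pairs of passages whose `similarity` exceeds some threshold; where to set that threshold depends on the texts being compared.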
|
|
How to use Color Palettes for your Data Visualization | by Dr. Gregor Scheithauer | Mar, 2022 | Towards Data Science
A step-by-step color palette tutorial for Seaborn, Altair and ggplot2.
|
Who are the finalists of the 2022 Iron Viz Qualifiers?
The top ten visualizations and three finalists selected to compete in the 2022 Iron Viz Championship at Tableau Conference.
|
|
How Beike built its Unified Metrics Platform using Apache Kylin | by Coco Li | Kyligence | Mar, 2022 | Medium
Beike is China’s combination of Zillow and MLS. This piece traces the history of their move from Hive + MySQL to Kylin, including details of their contribution to the Kylin project.
|
Graph machine learning with missing node features
Two Twitter engineers explain how feature propagation can be an efficient and scalable approach for handling missing features in graph machine learning applications.
|
|
650 California St., San Francisco, CA 94108
|