SF Data Weekly - Wish's Data Infrastructure, Spark Stream-Stream Joins, Kafka Event Sourcing, Python for Analytics

March 19 · Issue #59
Our Pick
Building The Analytics Team At Wish | Wish Engineering And Data Science
A data infrastructure built on Amazon Redshift to support data-driven decisions.
Data Pipelines
Introducing Stream-Stream Joins in Apache Spark 2.3
Sample timeline of a join between the ad impression and ad click streams.
How to Migrate Mainframe Batch to Cloud Microservices
Microservices design using the "Velocity" framework.
Apache Spark 2.3 with Native Kubernetes Support
Apache Spark running natively in a Kubernetes cluster.
Data Storage
Event Sourcing Using Apache Kafka | Confluent
Aggregated streams, called projections, used to create data views.
Analyzing Amazon RDS Database Workloads with Performance Insights
A snapshot of the Performance Insights dashboard.
Data Analysis
Data Pre-Processing in Python: How I Learned to Love Parallelized Applies with Dask and Numba
Performance comparison among different setups.
Deep Neural Network Implemented in Pure SQL over BigQuery
Data Visualization
5 Quick and Easy Data Visualizations in Python with Code
An example scatter plot with color groupings and size encoding for the country size.
Data-driven Products
Serverless Dynamic Web Pages in AWS: Provisioned with CloudFormation
Data Engineering Jobs
Data Engineer - Atlassian
Senior Data Engineer - ThousandEyes
Each week we feature job postings that are relevant to all members of our community. 🎉 If you want to post a job you can do it here.
650 California St., San Francisco, CA 94108