SF Data Weekly - Uber's HDFS, Serverless Architectures, Analytics Pipelines, Redshift & Kafka Diagnostics

April 16 · Issue #63
Our Pick
Serverless — the Future of Software Architecture?
The serverless solution at A Cloud Guru.
Data Pipelines
Give Meaning to 100 Billion Analytics Events a Day
Data aggregation spanning AWS and GCP.
We love syslogs: Real-time syslog Processing with Apache Kafka and KSQL—Part 1: Filtering
Filtering log messages with KSQL.
A Simple and Scalable Analytics Pipeline
An example analytics architecture in GCP.
Data Storage
Scaling Uber’s Hadoop Distributed File System for Growth
Using ViewFs in multiple data centers to help manage HDFS namespaces.
Cross-cluster Diagnostic Queries on Multiple Amazon Redshift Clusters
Of Streams and Tables in Kafka and Stream Processing, Part 1
Data Analysis
Deep Learning With Apache Spark — Part 1
Deep Learning Pipelines, an open source library created by Databricks.
Sentiment Analysis with PySpark
Data Visualization
Color: From Hexcodes to Eyeballs
Combination of subpixels in the frequency domain.
Data-driven Products
How Artificial Intelligence and Data Add Value to Businesses | McKinsey & Company
Data Engineering Jobs
Data Engineer - Amazon
At the end of each SF Data Weekly issue you can find job postings that are relevant to all members of our community. 🎉
650 California St., San Francisco, CA 94108