SF Data Weekly - Uber's HDFS, Serverless Architectures, Analytics Pipelines, Redshift & Kafka Diagnostics

April 16 · Issue #63
SF Data Weekly
Our Pick
Serverless — the Future of Software Architecture?
The serverless solution at A Cloud Guru.
Data Pipelines
Give Meaning to 100 Billion Analytics Events a Day
Data aggregation spanning AWS and GCP.
We love syslogs: Real-time syslog Processing with Apache Kafka and KSQL—Part 1: Filtering
Filtering log messages with KSQL.
A Simple and Scalable Analytics Pipeline
An example analytics architecture in GCP.
Data Storage
Scaling Uber’s Hadoop Distributed File System for Growth
Using ViewFs in multiple data centers to help manage HDFS namespaces.
Cross-cluster Diagnostic Queries on Multiple Amazon Redshift Clusters
Of Streams and Tables in Kafka and Stream Processing, Part 1
Data Analysis
Deep Learning With Apache Spark — Part 1
Deep Learning Pipelines - an open source library created by Databricks.
Sentiment Analysis with PySpark
Data Visualization
Color: From Hexcodes to Eyeballs
Combination of subpixels in the frequency domain.
Data-driven Products
How Artificial Intelligence and Data Add Value to Businesses | McKinsey & Company
Data Engineering Jobs
Data Engineer - Amazon
At the end of each SF Data Weekly issue you can find job postings that are relevant to all members of our community. 🎉
If you want to post a job for your company, you can do it here.