
SF Data Weekly - Airflow Key Concepts, Kafka Internals, Analytics with PySpark, AWS Pipelines

May 14 · Issue #67
Our Pick
Platforms & The Myth of Data
Data Pipelines
Analyze Data in Amazon DynamoDB Using Amazon SageMaker for Real-time Prediction
Architecture of the presented solution in AWS.
Analyze Apache Parquet optimized data using Amazon Kinesis Data Firehose, Amazon Athena, and Amazon Redshift
The solution, presented in five steps.
Understanding Apache Airflow’s Key Concepts
An example workflow represented with a Directed Acyclic Graph (DAG).
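Airflow defines workflows with its own `DAG` and operator classes; as a minimal, library-free sketch of the underlying idea (tasks plus upstream dependencies, executed in a topological order), here is a hypothetical four-task pipeline (the task names are illustrative, not taken from the article):

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key maps a task to the set of tasks
# it depends on (its "upstream" tasks, in Airflow terms).
dag = {
    "extract": set(),
    "transform": {"extract"},
    "audit": {"extract"},
    "load": {"transform", "audit"},
}

# Because the graph is acyclic, at least one valid execution order exists;
# a scheduler runs tasks only after all their upstream tasks finish.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

`transform` and `audit` share an upstream task and no dependency on each other, so a scheduler is free to run them in parallel, which is exactly the property a DAG representation buys you.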
Data Storage
How Kafka’s Storage Internals Work
Kafka uses segments for faster message retrieval.
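Kafka names each segment file of a partition after the first offset it contains, so locating the segment that holds a given message reduces to a binary search over the base offsets. A minimal sketch of that lookup, with hypothetical base offsets:

```python
import bisect

# Hypothetical base offsets of a partition's log segments
# (in Kafka, each corresponds to one segment file on disk).
segment_base_offsets = [0, 1000, 2000, 3000]

def segment_for(offset: int) -> int:
    """Return the base offset of the segment containing `offset`."""
    # Binary search: the rightmost base offset <= the requested offset.
    i = bisect.bisect_right(segment_base_offsets, offset) - 1
    return segment_base_offsets[i]

print(segment_for(1500))  # -> 1000
print(segment_for(2000))  # -> 2000
```

This is why segmented storage makes retrieval fast: the search over segments is logarithmic, and within a segment Kafka uses a sparse index to narrow down to the message's file position.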
Level Up Your KSQL
Data Analysis
Get Started with PySpark and Jupyter Notebook in 3 Minutes
TED Talks Analysis — EDA for Beginners
Distribution of views and durations of TED talks.
Data Visualization
Fonts for Complex Data
An example of presenting different types of data in a single view.
Data-driven Products
The Intersection of Big Data and 5G
Data Engineering Jobs
Data Engineer - DialPad
Data Warehouse Engineer - Coffee Meets Bagel
At the end of each SF Data Weekly issue you can find job postings that are relevant to all members of our community. 🎉
If you want to post a job for your company, you can do it here.
Powered by Revue
650 California St., San Francisco, CA 94108