February 4 · Issue #225
|
This week’s pick is a project to predict whether a given hip-hop song will be successful on Spotify. We also have a piece explaining the basics of K-means clustering, and a visualization contest to re-create W.E.B. Du Bois’ visualizations from the 1900 Paris Exposition using modern tools. Stay healthy!
|
Using Deep Learning to Predict Hip-Hop Popularity on Spotify | by Nicholas Indorf | Jan, 2022 | Towards Data Science
The author built a tool to help his cousin, a hip-hop artist named “KC Makes Music,” assess whether his unreleased songs would be successful on Spotify.
|
Validate streaming data over Amazon MSK using schemas in cross-account AWS Glue Schema Registry | Amazon Web Services
How the AWS Glue Schema Registry can be used to centrally publish, discover, control, validate, and evolve schemas for stream processing applications across a number of AWS services, including Amazon MSK (Kafka).
|
Develop a Tailored View of Your Salesforce Customer Data, Acquisition, and Billing
If you have outgrown Salesforce Data Loader and the Salesforce Data Import Wizard, you need a more scalable solution. Check out Integrate.io’s Salesforce capabilities. [Sponsored]
|
How I Discovered Thousands of Open Databases on AWS | by Avi Lumelsky | Jan, 2022 | InfoSec Write-ups
This scary piece details how the author found thousands of Elasticsearch databases and Kibana dashboards in just one day.
|
Working with JSON data in BigQuery | by Lak Lakshmanan | Google Cloud - Community | Jan, 2022 | Medium
BigQuery recently announced that it will support JSON as a data type. This piece explores what that means for BigQuery users.
|
Executing Multiple SQL Statements in a Stored Procedure | Snowflake
Example code showing the basics of stored procedures in Snowflake.
|
K-Means Clustering: Explain It To Me Like I’m 10 | by Shreya Rao | Jan, 2022 | Towards Data Science
|
Neural Network From Scratch In Excel | by Angela Shi | Jan, 2022 | Towards Data Science
Using Excel or Google Sheets to implement a simple Neural Network model for exploration and learning purposes.
|
The #DuBois Challenge | Nightingale
In February 2021, people on Twitter were challenged to re-create the historical data visualizations of W.E.B. Du Bois. A year later, this piece looks at the wide variety of visualizations created as part of the challenge.
|
Submit Your Work for the Outlier Viz Exhibit! | Nightingale
It’s not too late to submit your visualization to Outlier 2022.
|
Cost Efficiency @ Scale in Big Data File Format
Uber uses Apache Parquet as the file format for its data lake. This piece shows the different strategies used to store data more efficiently, including compression, column deletion and column re-ordering.