November 25 · Issue #165
|
This week’s pick is a side-by-side comparison of Exploratory Data Analysis (EDA) in Pandas and BigQuery SQL. We also have pieces on using Amazon EMR (Elastic MapReduce) in pipelines and on the Flyte orchestration platform at Lyft. Have a safe and happy Thanksgiving!
|
|
|
Exploratory Data Analysis with BigQuery SQL? Easy! | by Mike Shakhomirov | Nov, 2020 | Towards Data Science
EDA is often done with tools like Python Pandas. This piece compares doing EDA in Pandas with doing it in BigQuery SQL, with plenty of side-by-side examples showing both approaches. One of the better EDA articles we’ve seen in a long time.
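To give a flavor of that kind of side-by-side comparison, here is a minimal sketch of our own (not taken from the article). The table name `t`, the column names, and the toy data are all hypothetical; each Pandas step has the rough SQL equivalent in a comment above it.

```python
import pandas as pd

# Hypothetical toy data standing in for a real table called `t`.
df = pd.DataFrame({
    "country": ["US", "US", "DE", "DE", "FR"],
    "revenue": [120.0, 80.0, 50.0, None, 30.0],
})

# SQL: SELECT COUNT(*) AS n_rows, COUNT(revenue) AS non_null_revenue FROM t
n_rows = len(df)                                # counts every row
non_null_revenue = int(df["revenue"].count())   # skips missing values, like COUNT(col)

# SQL: SELECT country, COUNT(*) AS n, AVG(revenue) AS avg_revenue
#      FROM t GROUP BY country ORDER BY n DESC
summary = (
    df.groupby("country")
      .agg(n=("revenue", "size"), avg_revenue=("revenue", "mean"))
      .sort_values("n", ascending=False)
)
print(summary)
```

Both `AVG` in SQL and `mean` in Pandas ignore missing values, which is why the DE average is computed from a single row here.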
|
|
Using the Amazon Redshift Data API to interact from an Amazon SageMaker Jupyter notebook | Amazon Web Services
This piece shows how to use the Redshift Data API, instead of a traditional JDBC connection, to connect a Jupyter notebook to Redshift.
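As a rough illustration of the Data API's submit-then-poll pattern, here is a sketch of our own (not the article's code). It assumes `client` is a boto3 `redshift-data` client and that the statement is a `SELECT` with a small result; the helper name `run_query` and its arguments are hypothetical, and result pagination is left out.

```python
import time


def run_query(client, sql, cluster_id, database, db_user, poll_seconds=1.0):
    """Submit one SQL statement through the Redshift Data API and return its rows.

    `client` is assumed to behave like boto3.client("redshift-data").
    Only the first page of results is read (no NextToken handling).
    """
    stmt = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )

    # The Data API is asynchronous, so poll until the statement reaches a final state.
    while True:
        desc = client.describe_statement(Id=stmt["Id"])
        if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(poll_seconds)

    if desc["Status"] != "FINISHED":
        raise RuntimeError(desc.get("Error", desc["Status"]))

    result = client.get_statement_result(Id=stmt["Id"])
    # Each field is a one-key dict such as {"stringValue": "x"} or {"isNull": True}.
    return [
        [None if field.get("isNull") else next(iter(field.values())) for field in record]
        for record in result["Records"]
    ]
```

In a notebook this might be called as `run_query(boto3.client("redshift-data"), "SELECT 1", "my-cluster", "dev", "awsuser")`, where the cluster, database, and user names are placeholders.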
|
Xplenty | Simplified ETL & ELT to BigQuery, Snowflake, Redshift & Azure
Rapid data preparation and transformation for ever-evolving and changing data requirements. Secure & compliant data pipelines. Get started today. [Sponsored Content]
|
|
Snowflake vs BigQuery - IMO. I know what you’re thinking - great… | by Simon Darr | Servian | Nov, 2020 | Medium
Another A vs B post, but this one turns out to be well written and compelling, detailing some of the reasons that Snowflake is easier to use than BigQuery.
|
Is BigQuery Omni the next revolution in Data Warehousing? | by Janaka Ekanayake | Nov, 2020 | Medium
“Revolution” might be an overstatement, but this article is a good overview of the features of BigQuery Omni, Google’s new multi-cloud version of BigQuery.
|
|
Orchestrating analytics jobs by running Amazon EMR Notebooks programmatically | Amazon Web Services
Amazon EMR is a big data service offered by AWS to run Apache Spark and other open-source applications. Amazon EMR Notebooks is a managed environment based on Jupyter Notebook. This piece shows how to schedule and run EMR notebooks using the AWS CLI, and how to chain notebooks using CloudWatch Events.
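For a sense of what running notebooks programmatically can look like from Python, here is a sketch of our own. It swaps the CloudWatch Events chaining for a plain polling loop, and it assumes `emr` is a boto3 `emr` client; the helper name `run_notebooks_in_order` and all IDs passed to it are hypothetical.

```python
import time

# Final states after which a notebook execution will not change again.
TERMINAL_STATES = {"FINISHED", "FAILED", "STOPPED"}


def run_notebooks_in_order(emr, editor_id, cluster_id, service_role, paths,
                           poll_seconds=30):
    """Run notebooks one after another, stopping at the first that does not finish.

    `emr` is assumed to behave like boto3.client("emr").
    Returns a dict mapping each attempted notebook path to its final status.
    """
    statuses = {}
    for path in paths:
        started = emr.start_notebook_execution(
            EditorId=editor_id,
            RelativePath=path,
            ExecutionEngine={"Id": cluster_id, "Type": "EMR"},
            ServiceRole=service_role,
        )
        execution_id = started["NotebookExecutionId"]

        # Poll until this execution reaches a final state.
        while True:
            described = emr.describe_notebook_execution(
                NotebookExecutionId=execution_id
            )
            status = described["NotebookExecution"]["Status"]
            if status in TERMINAL_STATES:
                break
            time.sleep(poll_seconds)

        statuses[path] = status
        if status != "FINISHED":
            break  # do not start downstream notebooks after a failure
    return statuses
```

Polling keeps the sketch short but ties up the caller while notebooks run; an event-driven setup like the one the article describes avoids that.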
|
Python, Pandas & XlsxWriter | by Dean McGrath | Towards Data Science
Everybody needs to push out an Excel spreadsheet once in a while, whether for further analysis or for a colleague. This piece shows how to output a Pandas DataFrame to Excel with table formatting using XlsxWriter and Python.
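The general pattern looks something like the sketch below, which is ours and not the article's code. It assumes the `xlsxwriter` package is installed; the file name `report.xlsx`, the sheet name, and the data are hypothetical.

```python
import pandas as pd

# Hypothetical data to export.
df = pd.DataFrame({"region": ["North", "South"], "sales": [1200, 950]})

with pd.ExcelWriter("report.xlsx", engine="xlsxwriter") as writer:
    # Write only the data rows, leaving row 0 free for the table's own header row.
    df.to_excel(writer, sheet_name="Sales", startrow=1, header=False, index=False)

    worksheet = writer.sheets["Sales"]
    n_rows, n_cols = df.shape

    # Cover the header row plus all data rows with an Excel table.
    worksheet.add_table(
        0, 0, n_rows, n_cols - 1,
        {"columns": [{"header": str(col)} for col in df.columns]},
    )
```

Letting `add_table` write the headers avoids Pandas' default header cell formatting clashing with the table style.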
|
|
Is Tableau Really Better Than Power BI? | by Jenna Eagleson | Nightingale | Nov, 2020 | Medium
For a tool with a big market share, Power BI doesn’t have big mind share. This piece by a Power BI user lists a number of compelling features that she misses while using Tableau.
|
Applying row-level and column-level security on Amazon QuickSight dashboards | Amazon Web Services
Amazon QuickSight is a cloud-scale business intelligence (BI) service. One of its enterprise features is column- and row-level security, and this piece shows how to set it up.
|
|
Migrating Slack Airflow to Python 3 Without Disruption - Slack Engineering
With the EOL of Python 2, Slack wanted to upgrade their Airflow workflows to Python 3. This article shows how they did it while keeping the migration transparent to users and improving reliability.
|
Building a Gateway to Flyte. At the beginning of the year we… | by Katrina Rogan | Nov, 2020 | Lyft Engineering
Flyte is Lyft’s open source, Python-based orchestration platform for data and machine learning workflows. This piece looks at Flyte Admin, the control plane and gateway to Flyte.
|
|
|