Athena now supports Apache Spark, an open-source distributed processing system for running fast analytics workloads. Using Jupyter Notebook in Athena, data analysts and engineers can process data and interact with Spark applications programmatically.
With Athena, customers can query data from various sources and visualize the results of their analyses within seconds, even for interactive Spark workloads.
Complementing Amazon Athena’s SQL capabilities, Apache Spark on Athena offers on-demand scaling to meet changing data volumes and processing requirements. Donnie Prakoso, principal developer advocate at AWS, has explained the main benefit of the new serverless option, which is based on Spark 3.2.
Apache Spark is commonly used across industries to perform complex data analysis and to gain insights from data lakes. With Athena, data engineers can create Apache Spark applications using notebooks in the AWS console or programmatically through the Athena APIs.
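To make the programmatic route more concrete, here is a minimal sketch of how a single Spark code block might be submitted through the Athena APIs. It assumes a boto3 Athena client and its `start_session`, `get_session_status`, `start_calculation_execution`, `get_calculation_execution`, and `terminate_session` operations. The helper name `run_spark_snippet`, the workgroup name, and the polling interval are hypothetical choices for illustration, not something described in the announcement.

```python
import time


def run_spark_snippet(athena, workgroup, code, max_dpus=4, poll_seconds=2.0):
    """Run one code block in a fresh Athena Spark session and return its final state.

    `athena` is expected to be a boto3 Athena client (or any object exposing
    the same methods). `workgroup` must be a Spark-enabled workgroup.
    """
    session_id = athena.start_session(
        WorkGroup=workgroup,
        EngineConfiguration={"MaxConcurrentDpus": max_dpus},
    )["SessionId"]
    try:
        # Wait until the session is ready to accept calculations.
        while True:
            state = athena.get_session_status(SessionId=session_id)["Status"]["State"]
            if state == "IDLE":
                break
            if state in ("FAILED", "TERMINATED", "DEGRADED"):
                raise RuntimeError(f"session ended in state {state}")
            time.sleep(poll_seconds)

        # Submit the code block and poll until it reaches a terminal state.
        calc_id = athena.start_calculation_execution(
            SessionId=session_id, CodeBlock=code
        )["CalculationExecutionId"]
        while True:
            state = athena.get_calculation_execution(
                CalculationExecutionId=calc_id
            )["Status"]["State"]
            if state in ("COMPLETED", "FAILED", "CANCELED"):
                return state
            time.sleep(poll_seconds)
    finally:
        # Always release the session so it stops accruing DPU time.
        athena.terminate_session(SessionId=session_id)


# Hypothetical usage (requires boto3, AWS credentials, and a Spark-enabled workgroup):
#   import boto3
#   state = run_spark_snippet(boto3.client("athena"), "spark-workgroup",
#                             "print(spark.version)")
```

Passing the client in as a parameter keeps the sketch independent of credentials and makes it easy to exercise against a stub. A production version would likely add timeouts and fetch the calculation's output location rather than only its state.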
Some developers question the advantages of the new option, since Apache Spark workloads are already supported on AWS through Jupyter notebooks with Glue or EMR Serverless. Athena for Spark also does not support Athena SQL views, which makes cross-engine queries impossible. AWS has dedicated a podcast episode to discussing the new feature in greater detail.
Apache Spark code execution is charged at 0.35 USD per data processing unit (DPU) per hour, billed per second, and Athena notebooks are provided at no additional cost. A single DPU provides four virtual CPUs and 16 GB of memory.
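As a rough illustration of the per-second billing, the following sketch estimates the cost of a run from the number of DPUs and the runtime in seconds, using the 0.35 USD per DPU-hour rate quoted above. The function name is hypothetical, and the calculation assumes no minimum billing duration or rounding rules, since none are stated here.

```python
from decimal import Decimal

# Rate quoted in the article: 0.35 USD per DPU per hour, billed per second.
DPU_HOUR_USD = Decimal("0.35")


def estimate_spark_cost_usd(dpus, seconds):
    """Estimate cost as DPUs x hours x rate, rounded to four decimal places."""
    dpu_hours = Decimal(str(dpus)) * Decimal(str(seconds)) / Decimal(3600)
    return (dpu_hours * DPU_HOUR_USD).quantize(Decimal("0.0001"))


# 20 DPUs running for 15 minutes (900 seconds) is 5 DPU-hours.
print(estimate_spark_cost_usd(20, 900))  # 1.7500
```

Under these assumptions, a workload holding 20 DPUs for 15 minutes would cost about 1.75 USD, while a single DPU running for a full hour would cost 0.35 USD.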
Currently, Athena for Apache Spark is available in the following AWS regions: Ohio, Northern Virginia, Oregon, Tokyo, and Ireland.