At AWS re:Invent 2021, a learning conference hosted by Amazon Web Services for the global cloud computing community, we learned about many cool new features and services. In the Data Analytics track, the primary focus was on serverless computing, data security, archiving, and ease of use. Let’s briefly discuss these new offerings and how they will prove helpful for AWS customers.
Amazon EMR Serverless, the new release, offers the flexibility to run Big Data applications without provisioning clusters. It automatically provisions and scales the compute and memory the applications require, so you only pay for the resources used.
With pre-initialized workers, jobs run immediately when the application starts up. Benefits include:
Until now, a Redshift cluster was necessary to use Redshift’s powerful query engine. AWS introduced Redshift Serverless to run high-performance analytics.
It helps manage variable workloads with unpredictable spikes, where it can be challenging to continually manage capacity in a cluster.
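As a rough illustration of querying without a cluster, here is a minimal sketch that submits SQL through the Redshift Data API and waits for the result. It assumes a boto3 `redshift-data` client is passed in; the workgroup and database names in the usage note are hypothetical, and the helper name `run_serverless_query` is ours, not part of any AWS SDK.

```python
import time


def run_serverless_query(client, workgroup, database, sql, poll_seconds=1.0):
    """Submit one SQL statement to a Redshift Serverless workgroup and wait for it.

    `client` is expected to behave like boto3.client("redshift-data").
    Returns the statement id once the statement has finished.
    """
    # WorkgroupName targets a serverless workgroup; a provisioned cluster
    # would be addressed with ClusterIdentifier instead.
    started = client.execute_statement(
        WorkgroupName=workgroup, Database=database, Sql=sql
    )
    statement_id = started["Id"]

    # The Data API is asynchronous, so poll until a terminal state is reached.
    while True:
        status = client.describe_statement(Id=statement_id)
        state = status["Status"]
        if state == "FINISHED":
            return statement_id
        if state in ("FAILED", "ABORTED"):
            raise RuntimeError(
                f"statement {statement_id} {state}: {status.get('Error')}"
            )
        time.sleep(poll_seconds)
```

With boto3 installed and credentials configured, a call might look like `run_serverless_query(boto3.client("redshift-data"), "analytics-wg", "dev", "SELECT 1")`, where `analytics-wg` and `dev` are placeholder names.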
Benefits include:
With Amazon MSK Serverless, it is now possible to run Apache Kafka clusters without managing and scaling their capacity. It supports native AWS integrations that provide private connectivity with AWS PrivateLink, secure client access with IAM, and schema evolution control with the AWS Glue Schema Registry. Benefits include:
Governed Tables is a new kind of S3 table that supports ACID transactions, lets multiple users insert and delete data concurrently, and ensures data is consistent and up to date. It includes automatic compaction for storage optimization.
A Row/Cell-Level Security feature gives users fine-grained access control of the data stored in a Data Lake. It supports both governed and traditional S3 tables.
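To make the row- and cell-level idea concrete, the sketch below builds the payload for a Lake Formation data cells filter, which restricts both the rows (through a filter expression) and the columns a principal can see. The helper name `build_row_filter` and all values in the usage note are hypothetical; the field names follow the `TableData` structure of the `create_data_cells_filter` call as we understand it.

```python
def build_row_filter(catalog_id, database, table, filter_name,
                     row_expression, columns):
    """Return a TableData payload for a row- and column-restricted filter."""
    if not columns:
        raise ValueError("list at least one column to expose")
    return {
        "TableCatalogId": catalog_id,   # AWS account id that owns the catalog
        "DatabaseName": database,
        "TableName": table,
        "Name": filter_name,
        # Only rows matching this expression are visible through the filter.
        "RowFilter": {"FilterExpression": row_expression},
        # Only these columns are visible through the filter.
        "ColumnNames": list(columns),
    }
```

The payload could then be passed as `boto3.client("lakeformation").create_data_cells_filter(TableData=payload)`, for example with `row_expression="country = 'US'"` and `columns=["order_id", "amount"]` (placeholder values). Granting the filter to a principal is a separate permissions step not shown here.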
Reduce storage costs by up to 60% by moving infrequently accessed data into Amazon’s new DynamoDB Standard-Infrequent Access table class.
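Whether the infrequent-access class pays off depends on how storage and request charges balance out for a given table. The sketch below compares the two classes under placeholder prices and builds the arguments for switching a table's class. The prices and the request multiplier are illustrative assumptions, not current AWS pricing, and the helper names are ours.

```python
STANDARD = "STANDARD"
STANDARD_IA = "STANDARD_INFREQUENT_ACCESS"

# Placeholder prices for illustration only; check the DynamoDB pricing page.
# The assumption is cheaper storage but pricier requests for Standard-IA.
PRICES = {
    STANDARD: {"storage_per_gb": 0.25, "request_multiplier": 1.0},
    STANDARD_IA: {"storage_per_gb": 0.10, "request_multiplier": 1.25},
}


def monthly_cost(table_class, storage_gb, standard_request_cost):
    """Estimate a month's bill: storage plus (scaled) read/write charges."""
    price = PRICES[table_class]
    storage = storage_gb * price["storage_per_gb"]
    requests = standard_request_cost * price["request_multiplier"]
    return storage + requests


def cheaper_table_class(storage_gb, standard_request_cost):
    """Pick the class with the lower estimated bill (ties go to STANDARD)."""
    return min(
        (STANDARD, STANDARD_IA),
        key=lambda cls: monthly_cost(cls, storage_gb, standard_request_cost),
    )


def table_class_update(table_name, table_class):
    """Build keyword arguments for a DynamoDB update_table call."""
    if table_class not in PRICES:
        raise ValueError(f"unknown table class: {table_class}")
    return {"TableName": table_name, "TableClass": table_class}
```

Under these assumed prices, a storage-heavy table with little traffic favors Standard-IA, while a small, busy table stays on Standard. The resulting arguments could be applied with `boto3.client("dynamodb").update_table(**table_class_update("orders-archive", STANDARD_IA))`, where `orders-archive` is a hypothetical table name.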
Benefits include:
Finding a suitable third-party dataset and then licensing and loading the data can be time-consuming. AWS Data Exchange for APIs makes it easy to subscribe to datasets in AWS Marketplace. Once subscribed to a data product, you can load data directly into Amazon S3 or Redshift and analyze it with AWS analytics and ML services.
Benefits include:
Businesses rely heavily on ML models for critical use cases like fraud detection, churn reduction, and inventory optimization to make better decisions. However, they often have to wait for experts to build such models for them. With SageMaker Canvas, any engineer or business analyst can use the drag-and-drop editor to generate accurate ML predictions without any ML experience.
Benefits include:
Building labeling applications and managing raw data such as images, text files, and videos can be tedious. SageMaker Ground Truth Plus is more efficient than its predecessor, SageMaker Ground Truth. It helps prepare high-quality training datasets by adding informative labels for ML models.
Benefits include:
Amazon’s SageMaker Serverless Inference enables the deployment of ML models without configuring or managing the underlying infrastructure. When you select the serverless option, SageMaker automatically provisions, scales, and turns off compute capacity based on the volume of inference requests.
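In practice, selecting the serverless option means replacing the instance type and count in an endpoint configuration with a memory size and a concurrency cap. The sketch below builds such a configuration. The helper name, the variant name, and the defaults are our assumptions, and the list of memory sizes reflects the documented 1 GB increments as we understand them, so verify it against current documentation.

```python
# Memory sizes assumed to be accepted for serverless endpoints (in MB).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_endpoint_config(config_name, model_name,
                               memory_mb=2048, max_concurrency=5):
    """Build keyword arguments for a SageMaker create_endpoint_config call."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # ServerlessConfig takes the place of InstanceType and
                # InitialInstanceCount used for instance-backed endpoints.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }
```

The result could be passed as `boto3.client("sagemaker").create_endpoint_config(**cfg)`, followed by a `create_endpoint` call that references the same configuration name. The model named in `model_name` must already exist in SageMaker.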
Benefits include:
SageMaker Studio Lab is a free, zero-configuration service for training on CPUs or GPUs with 15 GB of persistent storage. Based on the open-source JupyterLab web app, it lets people learn and experiment with Machine Learning using most of the popular open-source frameworks.
Serverless lets customers save costs and removes the need to manage or monitor resources. It also means customers need not worry much about configuration before trying and testing a service. In addition, AWS has added several point-and-click features to its ML offerings, allowing users to design ML solutions without coding skills or previous experience. Overall, AWS has simplified several services so that customers can stop worrying about cost-saving plans and handle unpredictable workloads seamlessly.