AWS re:Invent 2021: New Data Analytics offerings

A roundup of AWS’s cool new data analytics offerings, announced at re:Invent 2021

December 20, 2021 | Parvez Ahmad

AWS re:Invent highlights – Serverless computing

At AWS re:Invent 2021, a learning conference hosted by Amazon Web Services for the global cloud computing community, we learned about many new features and services. In the Data Analytics track, the primary focus was on serverless computing, data security, archiving, and ease of use. Let’s briefly look at these new offerings and how they can help AWS customers.

New release of Amazon EMR Serverless

The new release offers the flexibility to run Big Data applications without acquiring clusters. It automatically provisions and scales the compute and memory the applications require, so you only pay for the resources used.

With pre-initialized workers, jobs run immediately when the application starts up. Benefits include:

  • No need to configure, optimize, operate, re-size, or secure clusters
  • Each job runs in a single Availability Zone (AZ), avoiding the performance cost of cross-AZ network traffic
  • No need to rewrite code: existing applications should work on EMR Serverless
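
As a rough illustration of what submitting a job without a cluster can look like, the sketch below assembles the parameters for the `start_job_run` call of the boto3 `emr-serverless` client. The helper name `build_spark_job_request`, the application ID, the role ARN, and the S3 paths are hypothetical placeholders, and the commented-out call assumes an EMR Serverless application already exists.

```python
def build_spark_job_request(application_id, execution_role_arn, entry_point, args=None):
    """Assemble keyword arguments for a Spark job run on EMR Serverless."""
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                "entryPointArguments": list(args or []),
            }
        },
    }


# All identifiers below are hypothetical placeholders.
request = build_spark_job_request(
    application_id="00example1234567",
    execution_role_arn="arn:aws:iam::123456789012:role/ExampleEmrServerlessJobRole",
    entry_point="s3://example-bucket/scripts/wordcount.py",
    args=["s3://example-bucket/output/"],
)

# Submitting the job would need boto3 and AWS credentials:
# import boto3
# boto3.client("emr-serverless").start_job_run(**request)
```

Note that the request names an application and a job, but no instance types or cluster size.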

Introducing Amazon Redshift Serverless

Until now, a Redshift cluster was necessary to use Redshift’s powerful query engine. AWS introduced Redshift Serverless to run high-performance analytics without provisioning or managing a cluster.

It helps manage variable workloads with unpredictable spikes, where continually adjusting cluster capacity is challenging.

Benefits include:

  • Provisions the right compute resources and scales seamlessly to demand
  • Pay only for querying and loading data, with no compute charges while the warehouse is idle
  • Query data stores such as Amazon S3, Aurora, and RDS
  • Use the web-based query editor without configuring an SQL client
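
Queries can also be sent programmatically instead of through the web-based editor. The sketch below is one possible approach, assuming the Redshift Data API (`redshift-data` client in boto3) and its `WorkgroupName` parameter for serverless workgroups. The helper `build_serverless_query`, the workgroup name, the database name, and the table name are hypothetical.

```python
def build_serverless_query(workgroup_name, database, sql):
    """Assemble keyword arguments for the Data API's execute_statement call."""
    if not sql.strip():
        raise ValueError("sql must not be empty")
    # A serverless workgroup is addressed by name; no cluster identifier is passed.
    return {"WorkgroupName": workgroup_name, "Database": database, "Sql": sql}


# Hypothetical workgroup, database, and table names.
query = build_serverless_query(
    workgroup_name="example-workgroup",
    database="dev",
    sql="SELECT COUNT(*) FROM example_sales;",
)

# Running the statement would need boto3 and AWS credentials:
# import boto3
# boto3.client("redshift-data").execute_statement(**query)
```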

New cluster type: Amazon MSK Serverless

It is now possible to run Apache Kafka clusters without managing and scaling their capacity. MSK Serverless supports native AWS integrations that provide private connectivity with AWS PrivateLink, secure client access with IAM, and schema evolution control with the AWS Glue Schema Registry. Benefits include:

  • Fewer configuration options make it easy to launch a Kafka cluster
  • Automatically manage cluster capacity and partitions
  • Pay-as-you-go pricing, no upfront fees. Hourly rate per cluster, per partition
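
To make the IAM-based client access more concrete, the sketch below renders a Kafka `client.properties` file with the settings used by the AWS MSK IAM authentication library for Java clients. It assumes that library (`aws-msk-iam-auth`) is on the client's classpath. The helper `render_client_properties` and the bootstrap address are hypothetical.

```python
# Client settings for IAM authentication, as used with the aws-msk-iam-auth library.
IAM_CLIENT_PROPERTIES = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class": (
        "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
    ),
}


def render_client_properties(bootstrap_servers):
    """Return the text of a client.properties file for an IAM-authenticated client."""
    props = {"bootstrap.servers": bootstrap_servers, **IAM_CLIENT_PROPERTIES}
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Hypothetical bootstrap address; a real one comes from the cluster's details.
properties_text = render_client_properties(
    "boot-example.kafka-serverless.us-east-1.amazonaws.com:9098"
)
print(properties_text)
```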

AWS Lake Formation: new table type and security

Governed tables are a new type of Amazon S3 table that supports ACID transactions, lets multiple users insert and delete data concurrently, and keeps data consistent and up to date. They include automatic compaction for storage optimization.

Row- and cell-level security gives fine-grained control over access to the data stored in a data lake. It supports both governed and traditional S3 tables.

New DynamoDB Standard-Infrequent Access table class

Reduce storage costs by up to 60% by moving infrequently accessed data into Amazon’s new DynamoDB Standard-Infrequent Access table class.

Benefits include:

  • No need for a separate process to archive data to S3
  • Enjoy the same performance as the Standard table class
  • Get single-digit millisecond read and write performance
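
Whether the new table class pays off depends on how storage costs compare with request costs. The sketch below works through that arithmetic. All prices are illustrative assumptions and not quoted AWS prices: storage that is 60% cheaper in Standard-Infrequent Access, and read/write charges assumed to be 25% higher than in the Standard class. The function `monthly_cost` is hypothetical. Check current DynamoDB pricing before relying on the result.

```python
# Illustrative, assumed prices (USD). Not quoted from AWS pricing pages.
STANDARD_STORAGE_PER_GB = 0.25   # assumed storage price per GB-month, Standard
IA_STORAGE_PER_GB = 0.10         # assumed: 60% lower than Standard
IA_REQUEST_MARKUP = 1.25         # assumed: reads/writes cost 25% more in Standard-IA


def monthly_cost(storage_gb, request_cost_standard, table_class="STANDARD"):
    """Estimate the monthly cost of a table under the assumed prices above.

    request_cost_standard is what the reads and writes would cost per month
    in the Standard table class.
    """
    if table_class == "STANDARD":
        return storage_gb * STANDARD_STORAGE_PER_GB + request_cost_standard
    if table_class == "STANDARD_INFREQUENT_ACCESS":
        return (storage_gb * IA_STORAGE_PER_GB
                + request_cost_standard * IA_REQUEST_MARKUP)
    raise ValueError(f"unknown table class: {table_class}")


# A storage-heavy, rarely read table: 1,000 GB and 50 USD of requests per month.
standard = monthly_cost(1000, 50.0, "STANDARD")
infrequent = monthly_cost(1000, 50.0, "STANDARD_INFREQUENT_ACCESS")
print(f"Standard: {standard:.2f} USD, Standard-IA: {infrequent:.2f} USD")
```

Under these assumptions the storage-heavy table is cheaper in Standard-Infrequent Access, while a small, heavily accessed table would be cheaper in the Standard class.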

New AWS Data Exchange for APIs

Finding a suitable third-party dataset, then licensing and loading the data, can be time-consuming. AWS Data Exchange for APIs makes it easy to subscribe to the datasets in AWS Marketplace. Once subscribed to a data product, you can load data directly into Amazon S3 or Redshift and analyze it with AWS analytics and ML services.

Benefits include:

  • Choose quickly from diverse datasets covering areas such as climate, healthcare, and media
  • No need to request physical media, manage FTP, or integrate APIs from multiple providers

Introducing Amazon SageMaker Canvas

Businesses rely heavily on ML models for critical use cases such as fraud detection, churn reduction, and inventory optimization to make better decisions. However, they often have to wait for experts to build such models for them. With SageMaker Canvas, any engineer or business analyst can use the drag-and-drop editor to generate accurate ML predictions without any ML experience.

Benefits include:

  • Quick access to cloud or on-premises data
  • Combine datasets to create a unified dataset for training
  • Automatically detect data errors
  • Use SageMaker’s AutoML technology to build and train models automatically

Successor to SageMaker Ground Truth

Building labeling applications and managing raw data such as images, text files, and videos can be tedious. SageMaker Ground Truth Plus is more efficient than its predecessor SageMaker Ground Truth. It helps prepare high-quality training sets by adding informative labels for the ML models.

Benefits include:

  • Reduce data labeling costs by up to 40%
  • No need for deep ML experience
  • Serves a wide variety of use cases like computer vision, NLP, and speech recognition

New SageMaker Serverless Inference

Amazon SageMaker Serverless Inference enables the deployment of ML models without configuring or managing the underlying infrastructure. When you select the serverless option, SageMaker automatically provisions, scales, and turns off compute capacity based on the volume of inference requests.

Benefits include:

  • Pay only for the time spent running the inference code and for the data processed, not for idle time
  • Handle intermittent or unpredictable traffic efficiently
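
Selecting the serverless option comes down to describing memory and concurrency instead of instance types. The sketch below builds a production variant with a `ServerlessConfig` block, as accepted by the SageMaker `create_endpoint_config` API. The helper `build_serverless_variant`, the model name, and the chosen values are hypothetical, and the list of allowed memory sizes is an assumption to verify against the current documentation.

```python
# Assumed set of memory sizes accepted for serverless endpoints (in MB).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def build_serverless_variant(model_name, memory_mb=2048, max_concurrency=5):
    """Build one production variant that uses serverless compute."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        # No instance type or instance count: capacity is managed by SageMaker.
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,
            "MaxConcurrency": max_concurrency,
        },
    }


# Hypothetical model name; the model must already be registered in SageMaker.
variant = build_serverless_variant("example-churn-model", memory_mb=4096)

# Creating the endpoint configuration would need boto3 and AWS credentials:
# import boto3
# boto3.client("sagemaker").create_endpoint_config(
#     EndpointConfigName="example-serverless-config",
#     ProductionVariants=[variant],
# )
```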

No-cost Amazon SageMaker Studio Lab

SageMaker Studio Lab is a free, zero-configuration service for training on CPUs or GPUs with 15 GB of persistent storage. Based on the open-source JupyterLab web app, it lets people learn and experiment with machine learning using most of the popular open-source frameworks.

Conclusion

Serverless lets customers save costs and removes the need to manage or monitor resources. It also means less configuration up front before trialing and testing a service. In addition, AWS has added several point-and-click features to its ML offerings, allowing users to design ML solutions without coding skills or previous experience. Overall, AWS has simplified several services so that customers can spend less effort on capacity and cost planning and handle unpredictable workloads seamlessly.

About the author
Parvez Ahmad

Data Analyst, T-Systems International GmbH
