Amazon EMR with Spark and Bigstream

Same EMR, but Faster and Cheaper

Bigstream is available to you today with just a few additional clicks when you launch an EMR cluster.

Sign up for our free 30-day trial and follow our simple step-by-step instructions. This software-driven version of hyperacceleration is a no-brainer and you’ll see results today.

Speed Your EMR Jobs

Bigstream’s software examines each Spark job and applies further optimizations on top of Spark’s own. It can optimize across different cluster hardware configurations.

Easy to Deploy

In just minutes, you can add Bigstream Hyperacceleration to your existing Spark EMR jobs. Simply register with Bigstream and add a bootstrap link to your EMR cluster provisioning.
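As a rough illustration of what adding a bootstrap link to cluster provisioning can look like, here is a minimal sketch using boto3's `run_job_flow` call. The script location, cluster name, release label, and instance settings are all placeholders chosen for this example, not Bigstream's actual values; use the bootstrap link you receive when you register.

```python
def build_emr_request(bootstrap_script_uri, cluster_name="spark-with-bootstrap"):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    bootstrap_script_uri is a placeholder for the bootstrap link you
    receive on registration; the value used below is hypothetical.
    """
    return {
        "Name": cluster_name,
        "ReleaseLabel": "emr-5.30.0",  # assumption: any Spark-capable release
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumption: example sizing
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # The bootstrap action runs on every node before Spark starts.
        "BootstrapActions": [
            {
                "Name": "Install acceleration layer",
                "ScriptBootstrapAction": {
                    "Path": bootstrap_script_uri,
                    "Args": [],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(request):
    """Submit the request to EMR; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the request can be built without it

    emr = boto3.client("emr")
    return emr.run_job_flow(**request)["JobFlowId"]


# Hypothetical S3 location, for illustration only.
request = build_emr_request("s3://example-bucket/bootstrap/install.sh")
```

Calling `launch_cluster(request)` would submit the cluster. The Spark jobs you then run on it stay exactly as they are; only the provisioning step changes.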

Reduce Your AWS Costs

TCO Calculator

No Risk

With the current 30-day free trial, you can try Bigstream on all your workloads to see the full performance gains and AWS cost reduction. Beyond the trial, you choose the workloads where you want to enable Bigstream.

Deliver Results

Fast Insights

Jobs run up to 10x faster without changing your Apache Spark code, increasing data science productivity and making it easier to meet tight SLAs.

Enriched Analytics

Incorporate fuller data sets and more data sources by eliminating the bottlenecks that force you to make compromises.

Lower TCO

Instead of scaling out or scaling up, optimize your existing Spark environment with Bigstream, lowering hardware and operating expenses.

Zero Code Change Means Fast Implementation

Bigstream provides the communication and translation between Spark code and the computing environment. This means you can be up and running in minutes.

FPGA/F1 Instances

Hyperacceleration also lets Spark tap into specialized hardware such as the FPGAs on AWS F1 instances.

F1 instances are currently not available for EMR users, but if you run Spark directly on EC2, see the EC2 page to take advantage of F1 and Bigstream.

How It Works

Bigstream’s Hyperacceleration improves on Spark’s native memory management, using and reusing memory more efficiently, thus avoiding unnecessary data copies. Spark’s in-memory benefits are often lost when garbage collection consumes excess memory. With Bigstream’s approach, no time is spent on garbage collection.