Moving data is often one of the most expensive and most critical operations in a Big Data pipeline, which is why data ingestion tools and ELT processes are prime candidates for acceleration. Data engineering and performance engineering teams know that the best option is to leave data in place, but for some operations moving it is unavoidable.
Bigstream focuses on these key parts of the ingest pipeline (a sketch of a pipeline exercising all three follows the list):
- High-speed data connectors to Amazon S3, Apache Kafka and the Hadoop Distributed File System (HDFS)
- Hyper-accelerated compression and decompression of the GZIP, Snappy, DEFLATE and LZO formats
- Hyper-accelerated parsing of JSON, Avro, CSV and Parquet data
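
To make the list concrete, here is a minimal PySpark sketch of an ingest job that touches all three stages: an S3 connector, on-the-fly GZIP decompression, JSON parsing, and a Parquet re-encode. The bucket, paths and column names are hypothetical placeholders, and the `s3a://` scheme assumes the `hadoop-aws` package is on the classpath; the code is plain Spark rather than anything Bigstream-specific.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("ingest-sketch")
    .getOrCreate()
)

# Connector + decompression: the S3 connector streams the objects in,
# and Spark decompresses them transparently based on the .gz extension.
# Bucket and path are hypothetical.
raw = spark.read.json("s3a://example-bucket/events/*.json.gz")

# Parsing happened during the read; a light projection keeps only the
# columns downstream consumers need (column names are hypothetical).
events = raw.select("event_id", "user_id", "ts", "payload")

# Re-encode as Parquet so later queries avoid re-parsing JSON.
events.write.mode("overwrite").parquet("s3a://example-bucket/parquet/events/")

spark.stop()
```

Each of these steps (object transfer, decompression, parsing, re-encoding) is CPU- and I/O-bound, which is what makes them the natural targets for the hyper-acceleration described above.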