July 8, 2022

StreamSets Product Roadmap Fall 22

July 8, 2022

Data pipeline architecture: Building a path from ingestion to analytics

Data pipelines transport raw data from software-as-a-service (SaaS) platforms and database sources to data warehouses for use by analytics and business intelligence (BI) tools. Developers can build pipelines themselves by writing code and manually interfacing with source databases — or they can avoid reinventing the wheel and use a SaaS data pipeline instead. To understand [...]

July 8, 2022

Data Mesh Architecture

This presentation covers how to migrate from a data lake to a data mesh architecture. It weighs the pros and cons of a data mesh against a data lake, and traces the evolution of the mesh and the need for it in today’s data world.

July 8, 2022

StreamSets + Kafka: Match Made in Heaven

Data engineers are data professionals tasked with creating and maintaining complex systems that collect, store, and analyze big data for companies. Kafka treats the concept of data in a way that is entirely different from how we have traditionally thought about it. Kafka provides a more robust, fault-tolerant, and distributed [...]

July 8, 2022

Democratizing Data Engineering

Data engineers must be adept in several areas. Even with years of SME experience on conventional tools, moving data to Hadoop involves a steep learning curve. The Hadoop platform also offers a wide variety of services, each custom-built to serve a specific purpose. A data engineer is expected to integrate with an assortment of data [...]

July 8, 2022

The Case for Declarative Machine Learning

The future of machine learning is declarative and data-first. This talk makes the case for declarative ML and offers a peek at what that future could look like.

July 8, 2022

GitOps – StreamSets Platform Automation

GitOps is an operational framework that takes DevOps best practices used for application development, such as version control, collaboration, compliance, and CI/CD, and applies them to infrastructure automation. In cloud environments, Terraform infrastructure as code (IaC) is commonly used to deploy infrastructure resources. This automation is now encapsulated in well-established patterns. In this presentation, using [...]