June 10, 2022

Data pipeline architecture: Building a path from ingestion to analytics

Data pipelines transport raw data from software-as-a-service (SaaS) platforms and database sources to data warehouses for use by analytics and business intelligence (BI) tools. Developers can build pipelines themselves by writing code and manually interfacing with source databases — or they can avoid reinventing the wheel and use a SaaS data pipeline instead. To understand [...]

June 10, 2022

Give me a sequential number and lock my data please, Ms. Cassandra!

Distributed databases present a unique challenge, especially when data must be read and written in strict sequential order. Cassandra favors availability and partition tolerance over strong consistency, although its consistency level is tunable. In transactions for creating user accounts or blocking MTNs, race conditions between two potential writes must be regulated to ensure that one write [...]

June 10, 2022

The Case for Declarative Machine Learning

The future of machine learning is declarative and data-first. This talk makes the case for declarative ML and offers a peek at what this potential future could look like.

June 10, 2022

GitOps – StreamSets platform automation

GitOps is an operational framework that takes DevOps best practices used for application development, such as version control, collaboration, compliance, and CI/CD, and applies them to infrastructure automation. In cloud environments, Terraform infrastructure as code (IaC) is commonly used to deploy infrastructure resources. This automation is now encapsulated in well-established patterns. In this presentation, using [...]