Erin Sheehan is a consultant at Avaap, a management and technology consulting firm that specializes in data and analytics. Erin helps clients solve their toughest data challenges through innovative, sustainable solutions. As a data engineer, Erin works directly with StreamSets, Python, and Spark to build pipelines and automate processes. Avaap clients benefit from Erin’s experience and leadership, which help empower new data enthusiasts every day.
Data engineers must be adept in several areas. Even with years of subject-matter expertise on conventional tools, moving data to Hadoop can be a steep learning curve. The Hadoop platform also offers a wide variety of services, each custom-built to serve a specific purpose. A data engineer is expected to integrate with an assortment of data sources at scale, read and stage data, and transform and load it across on-premises and cloud ecosystems into those purpose-built targets. StreamSets gives us the capability to lower the barrier for developers adopting a distributed ecosystem and to load and transform data on big data platforms. Reusable, parameterized pipelines, fragments, connections, and service-specific modules improve the usability of the pipelines.