Teradata open sources Kylo data lake pipeline

Hadoop, Spark, NiFi bundle released under Apache 2.0 license to enable ‘fit for purpose’ data lakes.

Teradata has open sourced its ‘Kylo’ data pipeline technology that is said to simplify and accelerate data lake deployment. Kylo embeds a suite of open source technologies including Apache Hadoop, Spark and NiFi. The Teradata-sponsored project has now been released under an Apache 2.0 license. Kylo evolved from code harvested from data lake engagements led by Think Big Analytics, a company that Teradata acquired in 2014.

Commenting on the release, Enterprise Strategy Group’s Nik Rouda observed that for many, implementing the Hadoop stack is a complex endeavor: ‘Big data technologies are heavily oriented to software engineering, developers and system administrators.’ ESG research found that many struggle to staff teams with BI and analytics talent, and that expertise in big data and open source solutions is even harder to come by. Most of those surveyed said that their big data initiatives take between seven months and three years to show significant business value. Even when a data lake has been built, it may fail to attract users who find it difficult to explore.

Kylo addresses such challenges by simplifying development of the data pipeline and common data management tasks. Kylo’s user interface allows for code-free, self-service data ingest and wrangling, while reusable templates increase productivity. Teradata’s Duncan Irving told Oil IT Journal, ‘Kylo looks really promising for oil and gas as part of the data governance story around populating a fit-for-purpose data lake.’

© Oil IT Journal - all rights reserved.