CloudEXPO 2018
Monday, November 12 • 10:00am - 10:40am
Addressing Critical Shortcomings of ETL Tools for Better Data Analytics

Poor data quality and analytics drive down business value. In fact, Gartner estimated that the average financial impact of poor data quality on organizations is $9.7 million per year. But bad data is much more than a cost center. By eroding trust in information, analytics and the business decisions based on these, it is a serious impediment to digital transformation.

Extract, transform and load (ETL) tools like AWS Glue bring much-needed functionality. This tool enables new approaches to pulling, processing and pushing data from source to target, and introduces concepts such as performing data transformation tasks using Spark SQL scripts in an Apache Spark environment. However, there are shortcomings with AWS Glue, leading to a number of challenges and questions:

Where are the necessary coding techniques for dealing with specific data types and the many other technical aspects that traditional ETL tools handle?

How can data analysts rely on the Glue DynamicFrame concept for key design aspects such as incremental load design and the data conversion process?

How can accurate reporting be achieved when the data processing step truncates decimal places, creating huge data discrepancies?

At Infostretch, we believe these challenges can be handled by using specific database-level functions. In this session, we will share real-life experience with such deal-breaking scenarios and demonstrate how to mitigate these issues without jeopardizing reporting accuracy, compromising on quality, or endlessly waiting for new releases.

Attendees will learn how to overcome the critical shortcomings of the AWS Glue ETL tool to achieve success:

How to use specific database-level functions without jeopardizing quality

How to address the issue of truncated decimals to produce accurate reports

How to ensure source system KPIs match up with target system KPIs for complete business insights
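
The decimal truncation problem described above can be illustrated with a small sketch. This is not the presenters' method and does not use any AWS Glue API; it is a plain Python example, using made-up amounts and the hypothetical helpers `truncate` and `kpi_gap`, that shows how dropping decimal places during processing makes a target-side total drift away from the source-side total.

```python
from decimal import Decimal, ROUND_DOWN

# Hypothetical source amounts with four decimal places (illustrative data only).
source_amounts = [Decimal("19.9975"), Decimal("5.4999"), Decimal("100.0050")]


def truncate(value: Decimal, places: int) -> Decimal:
    """Drop digits beyond `places` decimal places without rounding."""
    quantum = Decimal(10) ** -places  # places=2 -> Decimal("0.01")
    return value.quantize(quantum, rounding=ROUND_DOWN)


def kpi_gap(source, target) -> Decimal:
    """Difference between a summed KPI on the source and on the target."""
    return sum(source, Decimal(0)) - sum(target, Decimal(0))


# Simulate a processing step that silently keeps only two decimal places.
target_amounts = [truncate(v, 2) for v in source_amounts]

gap = kpi_gap(source_amounts, target_amounts)
print("source total:", sum(source_amounts, Decimal(0)))
print("target total:", sum(target_amounts, Decimal(0)))
print("discrepancy: ", gap)
```

With only three rows the gap is a few hundredths, but the same per-row loss accumulates over millions of rows, which is why a source-versus-target comparison of this kind is a useful check after each load.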

Maulik Parikh

Enterprise Architect, Infostretch
Maulik Parikh is Enterprise Architect - Data and Cloud Engineering at Infostretch, enabling enterprises to accelerate digital initiatives with data analytics, state-of-the-art cloud enablement and enterprise applications. He brings more than 11 years of experience in Architecture and...

Deven Samant

Director of Enterprise Cloud, Infostretch
Deven Samant is Director of Enterprise Cloud and Mobility solutions at Infostretch, helping enterprises accelerate digital initiatives with quality engineering, IoT solutions and data analytics. He has more than 18 years of IT experience in architecture and design coupled with...

Monday November 12, 2018 10:00am - 10:40am EST
07 Big Data and Analytics, Data Science (PLAZA SUITE)