Spark and Hadoop

 Spark and Hadoop are among the most widely used tools today for ingesting and analyzing big data

 Big data is commonly characterized by four properties:

–  Volume – large amount of data

–  Velocity – near real time data flow

–  Variety – mix of structured, semi-structured and unstructured data

–  Veracity – data of uncertain or low quality

We will present concepts used to ingest and process data on a Hadoop cluster using Spark with the most up-to-date tools and techniques.

 What challenges do Spark developers face when designing and building data lake applications?

 How to identify the right tool for a given situation