Develops and maintains scalable data pipelines and builds new integrations and processes required for optimal extraction, transformation, and loading (ETL) of data from a wide variety of sources using HQL and Big Data technologies
Assembles large, complex data sets that meet functional and non-functional business requirements, fostering data-driven decision making across the organization
Implements processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available to the key stakeholders and business processes that depend on it
Writes unit and integration tests, contributes to the engineering wiki, and documents work
Performs root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
A seasoned professional with 5+ years of relevant experience, excited to apply their current skills and grow their knowledge base
BS or MS degree in Computer Science or a related technical field
Experience with data pipelines, data analytics, data warehousing, and big data
Experience with SQL/NoSQL, schema design, and dimensional data modeling
Experience with the Big Data technology stack, such as HBase, Hadoop, Hive, Oozie, and MapReduce
Experience with AWS, Spark, and Java (required)
Python development experience is a plus
Familiarity with Agile methodology, test-driven development, source control management, and automated testing