Data Engineer (Python, Spark, PySpark, Scala, Java, or C++) - Remote/Telecommute

Posted: 14 days ago · Location: United States · Salary: undisclosed

Job Description

This position is remote/telecommuting until July 1, 2021; after that, the employee must be on-site in our Arlington, TX office.

JOB SUMMARY

We are expanding our decision-support efforts into complementary data technologies for ingesting and processing large data sets, including data commonly referred to as semi-structured or unstructured.

Our interest is in enabling data science and search-based applications over large, low-latency data sets, processed in both batch and streaming contexts. To that end, this role will engage with team counterparts to explore and deploy technologies for creating data sets through a combination of batch and streaming transformation processes.
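
As a rough illustration of the kind of combined batch and streaming transformation described above, here is a minimal PySpark sketch. The Kafka topic, event schema, and file paths are hypothetical assumptions for illustration, not details from this posting:

    # Hedged sketch: stream JSON events from Kafka, enrich them against a
    # batch-loaded Parquet table, aggregate hourly, and land the results as
    # Parquet for downstream ML training. Requires the spark-sql-kafka
    # connector on the classpath; all names and paths are assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import (StructType, StructField, StringType,
                                   DoubleType, TimestampType)

    spark = SparkSession.builder.appName("batch-stream-sketch").getOrCreate()

    event_schema = StructType([
        StructField("user_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_time", TimestampType()),
    ])

    # Batch side: a static lookup table already landed as Parquet
    # (hypothetical path and columns).
    profiles = spark.read.parquet("/data/warehouse/user_profiles")

    # Streaming side: parse JSON payloads arriving on a Kafka topic.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "events")
              .load()
              .select(F.from_json(F.col("value").cast("string"),
                                  event_schema).alias("e"))
              .select("e.*"))

    # Stream-static join: combine the live stream with the batch table.
    enriched = events.join(profiles, "user_id", "left")

    # Hourly per-user totals, tolerating events up to 1 hour late.
    hourly = (enriched
              .withWatermark("event_time", "1 hour")
              .groupBy(F.window("event_time", "1 hour"), "user_id")
              .agg(F.sum("amount").alias("total_amount")))

    # Land results as Parquet so offline training jobs can pick them up.
    (hourly.writeStream
           .outputMode("append")
           .format("parquet")
           .option("path", "/data/features/hourly")
           .option("checkpointLocation", "/data/checkpoints/hourly")
           .start()
           .awaitTermination())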

These data sets support both offline and inline machine learning training and model execution; other data sets support search-engine-based analytics. Exploration and deployment activities include identifying opportunities that impact business strategy, collaborating on the selection of data-solutions software, and contributing to the identification of hardware requirements based on business requirements.

Responsibilities also include coding, testing, and documenting new or modified scalable analytic data systems, including automation for deployment and monitoring. This role works with team counterparts to develop solutions in an end-to-end framework on a group of core data technologies.

JOB DUTIES

  • Contribute to evaluation, research, and experimentation efforts with batch and streaming data engineering technologies in a lab setting to keep pace with industry innovation
  • Work with data engineering related groups to inform on and showcase capabilities of emerging technologies and to enable the adoption of these new technologies and associated techniques
  • Contribute to the definition and refinement of processes and procedures for the data engineering practice
  • Work closely with data scientists, data architects, ETL developers, other IT counterparts, and business partners to identify, capture, collect, and format data from external sources, internal systems, and the data warehouse to extract features of interest
  • Code, test, deploy, monitor, document, and troubleshoot data engineering processes and associated automation
  • Perform other duties as assigned
  • Comply with all company policies and procedures

Knowledge

  • Experience processing large data sets using Hadoop, HDFS, Spark, Kafka, Flume, HBase, Solr, or similar distributed systems
  • Experience ingesting various source data formats and systems, such as JSON, Parquet, SequenceFile, cloud databases, MQ, and relational databases (see the ingestion sketch after this list)
  • Experience with Azure cloud services and AWS S3 data fabric
  • Experience with containerization, including Docker and Kubernetes
  • Experience with continuous integration and deployment (CI/CD) pipelines
  • Experience with NoSQL data stores such as MongoDB, Cassandra, HBase, Redis, or Riak, or with technologies that embed NoSQL with search, such as MarkLogic or Lily Enterprise
  • Prior exposure to Cloudera/HPE Ezmeral Container Platform or other Big Data technologies is desirable
  • Experience or familiarity with ETL and Business Intelligence technologies such as Informatica, DataStage, Ab Initio, Cognos, Business Objects or Oracle Business Intelligence
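
As a minimal sketch of the mixed-format ingestion mentioned in the list above (JSON, Parquet, and a relational database over JDBC), the following PySpark fragment illustrates the idea; every path, URL, table name, column, and credential is a placeholder assumption, not a detail from this posting:

    # Hedged example of batch ingestion across source formats with PySpark.
    # A real job would pull credentials from a secret store, not literals,
    # and the JDBC driver jar must be on the classpath.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("format-ingest").getOrCreate()

    # Semi-structured JSON landed by an upstream collector.
    events = spark.read.json("/landing/events/*.json")

    # Columnar Parquet from a previous batch run.
    history = spark.read.parquet("/landing/history/")

    # A relational source read over JDBC (hypothetical Postgres warehouse).
    customers = (spark.read.format("jdbc")
                 .option("url", "jdbc:postgresql://db:5432/warehouse")
                 .option("dbtable", "public.customers")
                 .option("user", "etl_reader")
                 .option("password", "change-me")
                 .load())

    # Conform the sources to one layout before feature extraction;
    # assumes a shared customer_id key across events and customers.
    unified = (events.join(customers, "customer_id", "left")
                     .unionByName(history, allowMissingColumns=True))

    unified.write.mode("overwrite").parquet("/curated/unified")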

Skills

  • Ability to quickly prototype, perform critical analysis, and apply creative approaches to solving complex problems
  • Excellent written and verbal communication skills

Education

  • High School Diploma or equivalent required
  • Bachelor's Degree in related field or equivalent work experience preferred

Experience

  • Minimum of 2 years of software engineering experience, including Scala, Python, Java, or C++, required
  • 2-4 years of hands-on experience with schema design, data modeling, SQL, and relational databases such as Oracle, DB2, and Postgres required

GM Financial is an equal-opportunity employer, and is committed to diversity and inclusiveness in its employment practices. Employees must meet qualification standards that are job-related and consistent with business necessity and must be able to perform the "essential job functions" of the position, with or without reasonable accommodation.
