As a Master Data Engineer, you will provide engineering leadership to create and enhance data solutions that enable seamless integration and flow of data across our data ecosystem. You will also provide senior-level technical consulting to peer data engineers during the design and development of highly complex and critical data projects. Some of these projects will include designing and developing data ingestion and processing/transformation frameworks leveraging open source tools such as NiFi, Sqoop, Hive, Java, Pig, Python, Spark, etc. In addition, you will work on real-time processing solutions using tools such as Apex, Flink, Storm and Kafka. You will deploy application code and analytical models using CI/CD tools and techniques and provide support for deployed data applications and analytical models. In this role, you will have the following duties:
* Develop data-driven solutions utilizing current and next-generation technologies to meet evolving business needs.
* Quickly identify opportunities and recommend possible technical solutions.
* Utilize multiple development languages/tools such as Python, Spark, HBase, Hive, Microsoft R and Java to build prototypes and evaluate results for effectiveness and feasibility.
* Operationalize open source data-analytic tools for enterprise use.
* Develop real-time data ingestion and stream-analytic solutions leveraging technologies such as Kafka, Apache Spark, NiFi, Python, HBase and Hadoop.
* Develop custom data pipelines (cloud and locally hosted).
* Work heavily within the Hadoop ecosystem and migrate data from Teradata to Hadoop.
* Support deployed data applications and analytical models by serving as a trusted advisor to Data Scientists and other data consumers, identifying data problems and guiding issue resolution with partner Data Engineers and source data providers.
* Provide subject matter expertise in the analysis and in the preparation of specifications and plans for the development of data processes.
* Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
Technical Skills / Experience
* Deep understanding of the Hadoop technology stack, preferably the Hortonworks distribution
* Experience in ETL processing, preferably using Ab Initio
* Experience in migrating ETL processes (not just data) from relational warehouse databases to Hive
* Building custom NiFi processors
* Data pipeline development
* Experience in developing Python / R applications
* Spark application coding in Scala / Python (PySpark)
* Deep knowledge of SQL and relational databases
* Strong in Unix / Shell scripting
* Experience in creating highly efficient HiveQL and Spark SQL queries, and the ability to educate peers on the topic
* 7+ years of experience as a lead engineer, with the ability to coach and guide peer and junior engineers
* Excellent written and verbal communication, presentation and professional speaking skills
* Collaborative individual who excels in working within a team and with business partners to identify, develop and deliver innovative data solutions
* Influencing skills/ability. Must be able to work effectively with different levels of management and all business areas.
* Ability to demonstrate leadership to managers and peer-level staff.
* Ability to build and leverage external relationships.
* Ability to gather information, make decisions and put those decisions into action.
* Passionate learner who enjoys education through classroom training and self-discovery on a variety of emerging technologies
Promote a risk-aware culture and ensure efficient and effective risk and compliance management practices by adhering to required standards and processes.
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state or local protected class.
Discover is one of the most recognized brands in U.S. financial services. We’re a direct banking and payment services company built on a legacy of innovation and customer service.