Great, and always improving, production support strategies are an essential ingredient of our current and future success. We are actively seeking talented system support and engineering specialists who can own support of applications and systems in production and drive reliability and performance at massive scale by mastering the full depth of the stack. As an engineering-focused Production Support Specialist, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of mission-critical services. With a passion for DevOps and continuous improvement, you will be encouraged to raise the bar for other support specialists and help take their capabilities to the next level.
- Resolve incidents and support production system deployments while ensuring SLAs are met.
- Deliver on Time to Resolve and Time to Detect reduction efforts.
- Identify and contribute to long-term solutions and preventative techniques.
- Increase self-healing through closed-loop automation.
- Progress, protect, and provide for the applications and sub-systems behind all of Capital One’s external and internal customer-facing services with an ever-watchful eye on their availability, latency, performance, and capacity.
- Collaborating with other tech leads and support teams to ensure integrated end-to-end availability, reliability, and performance.
- Support and deliver within Continuous Integration/Continuous Delivery pipelines.
- Influencing resiliency and scalability in production environments in Amazon Web Services and other cloud platforms.
- Equipping systems with automated monitoring and alerting.
- Support the team and contribute to designing, writing and delivering technical and process automations to improve the availability, scalability, latency, and efficiency of Capital One’s services.
- Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions.
- Influence and support new designs, architectures, standards and methods for large-scale distributed systems.
- Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
- Identifying and remediating risk to critical and non-critical system KPIs.
- This position will require you to troubleshoot and correct issues in urgent situations.
- Serve as an ETL expert for projects being transitioned to our team for production support, including but not limited to: Ab Initio batch transformation processes, SQL database management, and Teradata and Big Data database population.
- At least 5 years of experience in one or more of the following:
- Candidates should have experience with Enterprise Monitoring Tools such as Splunk, BlueStripe, Zabbix, HPOM, Diagnostics, SiteScope, BSM, AppResponse, and CA Unicenter.
- Candidates should have 4 years of experience working in one or more of the following:
- Candidates should be familiar with running web services at scale and should understand UNIX system internals and networking.
- Candidates should have an understanding of UNIX/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.
- High School Diploma, GED or Equivalent Certification or military experience.
- At least 7 years of experience in Enterprise production support or 5 years of experience in software development teams.
- At least 1 year of experience with ITIL practices and principles.
- At least 1 year of experience with Enterprise Monitoring Tools.
- Bachelor’s Degree or Military experience
- 1+ years of experience hosting applications in the cloud
- Cloud certification or open source technology certification
At this time, Capital One will not sponsor a new applicant for employment authorization for this position.