About the Role and what you will be doing
* Ownership of the Entire DevOps Fleet and Team
o Excellent people-management skills and the ability to implement frameworks and steps that streamline processes
o Ensure appropriate levels of up-time
o Proactively recommend hardware for capacity, end-of-life, or improvements
o Resolve hardware and software issues in a timely fashion
* System Health
o You get all alerts 15 minutes before anyone else
o You resolve all alerts quickly, implementing and streamlining processes to meet the defined SLA of 100 minutes
o Regularly improve the quality of our monitoring
o Ensure all monitors are functioning as expected
* Primary Point of Contact for Production Issues
o On call (or have arranged suitable coverage) for all production issues, including but not limited to DoS attacks, hardware failures, and customer-impacting software bugs
* Documentation / Knowledge Sharing
o Keep all Ops related documentation up-to-date
o Document current processes
o Train current staff on changes and improvements to previous processes prior to disrupting operations
* Tier 3 Support
o Correctly and efficiently answer customer service inquiries by interrogating logs and databases
o Handle routine Tier 3 tickets. Identify, isolate, track, and close issues by looping in the developers and making sure fixes go into current or future sprint cycles. Maintain traceability and ownership of issues
* Office Network
o Ensure appropriate configuration and reliability of the office network
o Proactively monitor for potential issues
o Resolve network issues in a timely fashion
o Provide reporting and dashboards for executive consumption
o Communicate production issues to stakeholders frequently and regularly
Required Knowledge, Skills, and Abilities:
* Experience working in a high-capacity, highly scalable, mission-critical web-serving environment
* Proven ability to participate with other functional teams in systems integration and design including writing operational specifications, test plans and requirements management with attention to detail
* UNIX/Linux and Windows server experience, including expertise in system installation, configuration, administration, troubleshooting, performance tuning, preventative maintenance, capacity planning, monitoring, and security procedures
* Web (IIS, Apache), .NET, and Java application (Tomcat, JBoss, etc.) server expertise, including installation, administration, configuration, troubleshooting, performance tuning, preventative maintenance, capacity planning, monitoring, and security procedures
* Database Administration - setup, configuration and basic database troubleshooting skills
* Understanding of high availability hardware and database systems design and implementation including cluster management, redundancy and failover testing
* Deep understanding of cloud technologies such as virtualization, storage and network domains in a cloud service model
* Network hardware architecting experience with load balancing equipment, switches, routers, and network troubleshooting
* Ability to produce system documentation, including writing requirements, operational specifications, system architecture, test plans and as-built documentation, all with attention to detail
* Ability to build strong relationships and influence others across the organization
* Demonstrated knowledge of agile project methodologies
* 7+ years' experience designing, supporting and deploying Internet-based products or services
* 5+ years' experience operating complex, large-scale, enterprise guest-facing applications or websites
* 2+ years' experience leading project or functional teams
* Working knowledge of cloud monitoring tools (CloudWatch, Splunk, ELK, application monitoring, etc.)
* Experience in at least two relevant scripting or programming languages (Ruby, Perl, Python, Shell, PowerShell, etc.)
* Experience with Configuration Management platforms (Chef, Ansible, CFEngine, Puppet, etc.)
* Understanding of internet standards such as HTTP, DNS, FTP, SSH, HTML, XML, JDBC, ODBC, SNMP and other protocols
* Experience managing the AWS cloud delivery platform
Join TEKsystems®, a leading IT staffing, IT talent management and IT services firm, and get your career on the fast track. We have more than 100 offices worldwide, and we partner with over 6,000 clients and place over 80,000 consultants per year. At TEKsystems, we seek to understand our consultants' skills, goals and interests, allowing us to present targeted job opportunities on a contract, contract-to-hire or direct placement basis. TEKsystems' leadership in the market stems from our sincere and personal commitment to driving the success of our customers, consultants and each other.
The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, genetic information or any other characteristic protected by law.
If you would like to request a reasonable accommodation, such as the modification or adjustment of the job application process or interviewing process due to a disability, please call 888 472-3411 or email email@example.com for other accommodation options.
A little about us:
TEKsystems provides corporations with IT staffing, talent management expertise and IT services, enabling them to meet their business objectives.