System Monitoring - Lead Engineer:
As a Monitoring Tools engineer, you will be responsible for building and enhancing the monitoring tools infrastructure for FICO’s cloud and on-premises systems. You will be part of a larger Site Reliability Engineering team responsible for running our 24/7/365 services. Our services are hosted in our data centers as well as on AWS. You must have a solid understanding of enterprise monitoring tools and an engineering background that allows you to propose and enhance monitoring solutions. You will work closely with our infrastructure engineering teams as well as our development team, which is developing sophisticated log mining tools built using analytics on a big data platform.
- Create and implement system monitoring that improves the performance and reliability of FICO cloud solutions
- Ensure the configuration and accuracy of monitoring for existing infrastructure (physical servers, virtual machines, network and hardware)
- Work with members of the infrastructure and development teams to gather requirements and design solutions that meet the monitoring needs of new and existing products and services
- Develop key business reports and metrics based on monitoring data
- Work with our automation teams to deliver monitoring as part of our standard builds
- Assist in developing a process to ensure applications are correctly instrumented when developed by our Professional Services team or by our customers
- Relevant experience to act as subject-matter expert (SME) for application, infrastructure, and network monitoring and to identify process gaps
- Comfortable with open-source products
- Bachelor of Science degree in CS/CE/CIS/MIS or Engineering/Technology or related field or equivalent experience/training.
- 8+ years of experience implementing and supporting tools infrastructure for a large-scale enterprise monitoring center operating on a 24x7 basis
- Basic understanding of all infrastructure components, including system, storage, networking, security, and their role in an enterprise architecture
- Experience administering application, infrastructure, and network monitoring tools such as Zabbix, Nagios, New Relic, AppDynamics, SolarWinds, Oracle Enterprise Manager (OEM), Dynatrace, Splunk, and the ELK stack
- Strong Red Hat Linux and Oracle/SQL Server/MySQL experience
- Scripting experience with Bash, Shell, Perl, Python or Java
- Demonstrated ability to set a vision and execute on it
- Demonstrated analytical skills, efficiently diagnosing issues and implementing or proposing solutions
- Excellent verbal and written communication, including presentation expertise
- Ability to manage conflicting priorities and customer expectations
- Good time management skills, including planning, organizing, and providing status on a variety of tasks, assignments, projects, and reports
- Effective team building and motivational skills
- The role requires shift work