Site Reliability Engineer
Position Summary:
We are seeking a highly skilled SRE/DevOps Engineer to join our dynamic and innovative team. As an SRE/DevOps Engineer, you will play a key role in designing, implementing, and maintaining the infrastructure and systems required to support the development and deployment of our software applications.
Responsibilities:
- Design and implement infrastructure automation using tools like Terraform or CloudFormation to streamline the provisioning and configuration of cloud resources.
- Set up and maintain CI/CD pipelines using tools like GitHub Actions, Ansible, or Argo CD to automate the build, test, and deployment processes.
- Manage and maintain configuration management tools like Ansible to ensure consistent and reproducible environments across different stages of the software development lifecycle.
- Implement monitoring and alerting systems using tools like AppDynamics, SolarWinds, and Splunk to proactively identify and resolve issues in the infrastructure and applications.
- Implement security best practices and ensure compliance with industry standards and regulations.
- Work closely with development, operations, and QA teams to streamline the overall software development process.
Education, Knowledge and Skills:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 3+ years of SRE/DevOps experience building and maintaining production environments.
- Experience with container deployment and orchestration using Kubernetes and Docker.
- Knowledge of network, server, database, and container architecture.
- Familiarity with best practices for designing infrastructure for applications built on microservices and APIs.
- Strong experience with tools such as GitHub for version control and code management.
- Knowledge of infrastructure automation using tools like Terraform or CloudFormation.
- Experience with CI/CD pipeline setup and maintenance using tools like GitHub Actions or Argo CD.
- Solid understanding of configuration management tools like Ansible.
- Familiarity with monitoring and alerting systems like AppDynamics, SolarWinds, and Splunk.
- Knowledge of security best practices and compliance standards.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.