Cloud Site Reliability Engineer
Ixia
(Calabasas, California)
Ixia (NASDAQ: XXIA) provides testing, visibility, and security solutions, strengthening applications across physical and virtual networks for enterprises and governments, service providers, and network equipment manufacturers.
Site Reliability Engineers are hybrid systems and software engineers who take ownership of reliability, automation, and other issues related to Ixia’s customer-facing cloud service. We are looking for an engineer to work as part of a team delivering Ixia’s cloud products. This position requires a strong technical candidate with experience in automation and system integration.
As a Site Reliability Engineer (SRE), you will work collaboratively with multiple partners to build tools that ensure this software remains online for all the people who rely on it. You come from a systems or development background (or are comfortable in both areas). You are self-directed and can track solutions from design through implementation, owning every step along the way.
SREs combine engineering experience, a desire to improve existing systems and processes, and creative problem solving to develop novel solutions to evolving challenges. Our team strives to automate processes whenever possible, using whatever tools are best for the job.
Required:
- Strong experience developing for cloud-based platforms such as Amazon Web Services, Microsoft Azure, Google Cloud, or Rackspace
- Familiarity with the concepts and development of Docker containers and microservices
- Understanding of Service-Oriented Architecture (SOA) and REST
- Very good understanding of large-scale distributed systems
- Automation – Bash, Python, REST, etc
- Continuous Integration/Delivery – Jenkins, Buildbot, OpsWorks, Chef, Puppet, Ansible, SaltStack
- Monitoring – Prometheus, BorgMon, Zabbix, ELK stack
- Work with product owner(s) to establish service level objectives (SLOs) for a SaaS-based product offering
- Build automation and system integrations that allow fully automated operations and deployments to meet SLOs
- Manage escalations with vendors and internal support teams
- Strong oral and written communication skills
- Ability to think clearly, analyze quantitatively, solve problems, scope technical requirements and prioritize work
- BS/MS in Computer Science or a related technical field, or equivalent with 7+ years of relevant experience
Preferred:
- Experience with Chef, Puppet, Salt, or Ansible in production environments
- Experience with ElasticSearch
- Knowledge of best practices and IT operations in an always-up, always-available service
- Working knowledge of HTML5
- Understanding of and experience with code deployment (tagging)
- Familiarity with serverless computing technologies such as AWS Lambda and Azure Functions
- Linux and Windows-based systems administration skills in a cloud or virtualized environment
- Experience building sophisticated and highly automated infrastructure
- Management of continuous integration servers such as Jenkins, Bamboo, and TeamCity
- Proficiency in C++, Java, or C
- Strong scripting skills, e.g., Python, Bash, Ruby, or Perl
- Experience with revision control systems (Git, Perforce, SVN)