Microsoft

Senior Software Engineer - SRE

Microsoft

Sydney, Australia

full time

26 Aug 2020

The vision of the Azure Production Infrastructure Engineering group is to make it easy for everyone to create, consume, and manage planetary-scale, reliable cloud production services and infrastructure to achieve more. As a team, we bring together significant and complementary capabilities with tooling, infrastructure, monitoring and insights in new ways to increase our perspective. Our diversity of knowledge and experience comes together for the benefit of our users, our colleagues, our business, and ourselves. 

If you enjoy analyzing complicated problems, coming up with creative solutions, and working in focused teams to build reliable and novel solutions, we want you to join our Site Reliability Engineering (SRE) team.

The Azure SRE team is looking for engineers with broad experience in distributed systems to join their team. SREs are people who take engineering-based approaches to solve operations problems: we like infrastructure, we like seeing how big, complicated things work, and most importantly, we gain great satisfaction from making them better.

You will be working across Azure with a focus on increasing the quality, performance, and reliability of the most essential services within Azure. The infrastructure SRE team works to iterate on and grow SRE practices from idea to planetary-scale adoption.

Our team has a wide variety of backgrounds, from Computer Science, Mathematics, and Engineering to Physics, Philosophy, Psychology, and English. Our diversity of knowledge and experience comes together for the benefit of our billions of daily users, our business, our colleagues, and ourselves.

As SREs, we are members of the Production Infrastructure Engineering (PIE) team and our vision is to make it easy for everyone to create, consume, and manage planetary-scale, reliable cloud production services and infrastructure to achieve more. As a team, we bring our complementary capabilities together with tooling and infrastructure in new ways to increase reliability through improved service health, incident response, planning, analysis, and change management.

If you are excited by this type of challenge, and you love to work in groups of people who are similarly excited, come join us. We value the input of people who aren't afraid to be learning all the time, who embrace mistakes as they show the way forward, and who are excited to continuously improve both services and themselves. We strongly believe that diverse experiences and backgrounds, together with an environment where everyone can feel safe to contribute their own insights in a data-driven, objective, and supportive way, are the key to making the best workplace possible, and the best workplace makes the best products and services. Not only is it the smart thing, it's the right thing.

Responsibilities

  • Work across Azure’s internal systems and services to design, develop, and improve platforms and processes that result in improved end-to-end reliability and maintainability for all.
  • Work across Azure PIE to drive tools that help deliver insights and automation to simplify the complex world of planetary-scale services.
  • Communicate effectively and partner well with other disciplines of the project team to deliver high-quality solutions from ideas to production code.
  • Write clean and thorough design documents and code that exemplify quality, simplicity, and maintainability.
  • Be a mentor for design reviews, code, and test cases.
  • Design systems that prioritize the customer perspective and experience.
  • Quickly adapt and apply new technologies, tools, methods, and processes from both internal and external sources.
  • Influence design, implementation, and architectural direction.
  • Drive architectural consolidation and simplification.
  • Exemplify the Microsoft values of leveraging the work of others and helping others be successful through your behaviors and actions.

Qualifications

  • Bachelor of Science, Computer Science degree, or 5+ years in software development.
  • 3+ years of software development in distributed systems or automation.
  • 3+ years of experience using languages such as C, C++, C# or Java (others are acceptable).
  • 2+ years of design, build, or implementation of distributed service health and telemetry.

Preferred Qualifications

  • Collaboration to accomplish large projects with excellent communication and demonstrated initiative.
  • Awareness of, and ability to reason about, modern software & systems architectures, including load-balancing, queueing, caching, distributed systems failure modes, and microservices.
  • Associated troubleshooting skills, including the ability to follow service dependency chains across arbitrary network steps.
  • Experience running large-scale cloud systems.
  • Ability to analyze, understand, and solve complex problems by leveraging and extending existing technology.
  • Willingness and ability to respectfully challenge the status quo.
  • Able to operate in ambiguity and drive clarity through partnerships.


