Site Reliability Expert (I, II, Sr.) at Lightspeed POS
Berlin, DE / Montreal, CA / Toronto, CA / Ottawa, CA

As a Site Reliability Expert (SRE) on our global cloud platform & operations team, you'll support Lightspeed's growing development teams with the infrastructure and tools needed to run our products reliably, efficiently and securely by implementing, advising on and advocating for established DevOps principles.

What you’ll be responsible for

  • Initiating and contributing to the continuous improvement of our software delivery processes and practices in a multi-location, multidisciplinary team to empower and accelerate product development
  • Using automation extensively to design, configure, manage, and monitor systems in support of our product development teams
  • Contributing to the development of CI/CD pipelines that adhere to the performance and security standards defined by the organization, emphasizing cloud platform integration and self-service workflows
  • Assisting with infrastructure and tooling hardening to meet business and compliance requirements
  • Designing and architecting operational solutions with the specific goal of increasing the standardization, automation, repeatability, cost-efficiency and consistency of operational tasks
  • Working with developers and other SREs to design and build scalable, reliable and cost-efficient cloud infrastructure
  • Writing and maintaining architectural, stakeholder, policy and process documentation
  • Adhering to and advocating for best practices, including Infrastructure as Code, monitoring, high availability, disaster recovery, security, and DevOps methodologies
  • Collaborating with development teams and using intuition, experience, and understanding to create SLIs, SLOs, and SLAs 
  • Providing timely assistance and remediation during critical situations and production incidents to help resolve service problems (you will be part of an on-call rotation)

What you’ll be bringing to the team

  • Proficiency developing in one or more languages such as Python, Golang, Ruby, PHP, JavaScript and/or others
  • Experience delivering scalable CI/CD solutions to organizations
  • Good knowledge of Amazon Web Services and/or Google Cloud Platform
  • Good understanding of Agile development and continuous delivery best practices, software engineering tools, processes, methods, and testing
  • Strong experience with Docker, Kubernetes, Helm, Linux systems and databases (SQL and/or NoSQL)
  • Strong experience with monitoring and alerting tools (New Relic, PMM, etc.)

Who we are

Lightspeed powers small and medium-sized businesses in over 100 countries around the world with its cloud-based commerce platform. Its smart, scalable, and dependable all-in-one point-of-sale software helps restaurants and retailers sell across channels, manage operations, engage with consumers, accept payments, and grow their business. Founded in 2005, with offices in Canada, the USA, Europe and Australia, Lightspeed recently completed its initial public offering on the Toronto Stock Exchange (TSX: LSPD).

We're passionate about enabling people to do their best work. Come work with us and find out what you can do.