Overview
DCT
Bogota, D.C., Capital District, Colombia
Site Reliability Engineer

Responsibilities
Service & Infrastructure Management: Oversee and manage core platform web services, including API and database servers, to ensure optimal performance and health.
System Monitoring & Emergency Response: Proactively monitor application and infrastructure health using tools like Grafana, ELK, and Sentry. Participate in a compensated 24/7 on-call rotation that is professionally managed and structured for fairness, conducted virtually (no need to be on-site). You will be backed up by a senior engineer for immediate support, troubleshooting, and swift emergency resolution.
Automate recurring operational tasks, system deployments, backups, and maintenance procedures to improve efficiency.
Partner with the Software Development team to provide guidance and embed modern DevOps practices directly into their development workflows.
Security & Compliance: Assist the IT team in implementing security policies across the entire infrastructure.

Requirements
4+ years of experience in a Site Reliability, DevOps, or Software Engineering role with a primary focus on production infrastructure.
Willingness and ability to participate in a compensated on-call rotation to respond to and resolve after-hours emergencies.
Linux Expertise: Strong practical experience with Linux system administration, including usage of the command line, shell scripting (Bash), and advanced system-level troubleshooting.
Containerization: Good understanding of container technologies, with hands-on proficiency using Docker and Docker Compose in a production context.
Web Server Configuration: Experience configuring and managing web servers, specifically NGINX, for tasks like reverse proxying, load balancing, and SSL termination.
Strong analytical and problem-solving skills, with the ability to take ownership and drive complex technical challenges to resolution.

Nice to Have
Knowledge of the Amazon Web Services (AWS) cloud platform.
Proficiency with infrastructure and application monitoring tools (e.g., Grafana, Amplify, Sentry, ELK stack).
Networking Fundamentals: Solid understanding of core networking concepts and essential protocols like HTTP/HTTPS and DNS, along with basic familiarity with firewall and interface configuration.
Experience with database administration (experience with AWS Aurora and PostgreSQL is a strong plus).
Experience with Redis.
Experience building and maintaining CI/CD pipelines.
Experience with modern software development workflows based on Pull Requests, Continuous Delivery, and TDD, as well as an understanding of Agile principles.
Experience with container orchestration technologies (e.g., Swarm, Podman, Kubernetes/K8s).
Familiarity with Infrastructure as Code (IaC) principles and tools like Terraform.
Familiarity with project management tools such as Jira or ClickUp.

The Team You Will Join
You will join a growing Engineering team based in Bogotá as a Software Engineer focused on Site Reliability. You will report directly to our SRE Lead, receiving technical guidance and mentorship. In addition, you will be paired with a dedicated Line Manager whose primary focus is to support your long-term career progression and professional development.

Who we are
DCT is a global leader in the fleet telematics industry, with over 25 years of software and hardware development and headquarters in Miami, FL, USA. Our platform is the backbone for hundreds of customers across diverse industries in more than 25 countries, with a significant and strategic focus on LATAM.

What we offer
Career Growth & Mentorship: A dedicated Line Manager and a personal training budget are provided to ensure you have the resources and guidance to advance your professional skills and career path.
A Generative & Collaborative Culture: Join a dynamic and innovative team that embraces a generative culture to achieve quality products. We encourage curiosity and an open, creative mindset as part of our core principles.
Flexible Work Environment: We offer a flexible work-from-home policy designed to support a healthy work-life balance for our team members.
Stability & Impactful Work: Be part of a globally recognized company with a 25-year track record of financial stability and technological innovation. Your work will have a direct and meaningful impact on a platform used by hundreds of leading businesses in the fleet telematics space, handling massive streams of real-world data.

We want to hear from you! Even if the salary or benefits aren't exactly what you're looking for, we encourage you to apply if you believe you're a great fit for the role and the team.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Technology, Information and Internet
Posted 21 days ago