| Department: | Digital Development |
| Location: | Boston, MA |
America's Test Kitchen (ATK) is seeking a Senior Site Reliability Engineer (SRE) to focus on the stability, scalability, and performance of our core Cloud Infrastructure and Database Systems. This high-impact role applies software engineering principles to operations, reducing toil and ensuring the reliability of our high-traffic website, app, and digital subscription platforms.
The successful candidate will be a proficient software developer and an expert in cloud architecture who thrives on designing and implementing automated infrastructure solutions, optimizing complex database performance, and collaborating closely with development teams to build resilient services.
This newly created role reports to the VP, Engineering and will be a key contributor to ATK's DevOps and infrastructure strategy.
Reliability Engineering & Cloud Infrastructure:
Infrastructure-as-Code (IaC): Design, implement, and maintain our cloud infrastructure using AWS CDK. Focus on high availability, disaster recovery, and cost efficiency.
Automated Operations: Develop robust automation using code to manage infrastructure, deploy applications, handle monitoring, and execute system recovery, driving down manual effort.
Observability: Implement and manage comprehensive monitoring, logging, and alerting systems to provide deep visibility into system health, performance, and key Service Level Objectives (SLOs).
Incident Response: Lead incident response, root cause analysis (RCA), and post-mortem processes to identify and resolve systems-level issues and prevent recurrence.
Database Performance and Development:
Database Management: Own the operational health and performance tuning of critical relational and NoSQL database systems in the cloud.
Software Development: Act as a contributing developer, writing clean, well-tested code in core ATK services.
Security and Compliance: Implement and enforce security best practices across infrastructure and data layers, including network segmentation, access control (IAM), and encryption.
Technical Expertise & Development Proficiency:
SRE/DevOps Experience: 5+ years of progressive experience in an SRE or highly technical systems engineering role.
Cloud Architecture: Expert-level, hands-on experience designing and managing production environments in AWS (e.g., EC2, Lambda, ECS/EKS, VPC, RDS).
Database Mastery: Deep understanding of database internals, performance tuning, and operational management for data stores.
Coding Skills: Proven proficiency in at least one modern programming language used for systems automation, tooling, and backend service development.
Containerization: Strong experience with container orchestration technologies.
IaC Tools: Hands-on expertise with Infrastructure-as-Code tools.
Execution and Communication:
Problem Solver: Exceptional ability to diagnose and solve complex production issues across multiple domains (network, application, database, and infrastructure).
Collaboration: Strong track record of successfully partnering with software development teams to improve service reliability and delivery pipelines.
Technical Communication: Ability to clearly and concisely communicate technical concepts, status, and post-mortems to both engineering and leadership teams.
Qualifications
Bachelor's degree in Computer Science, Engineering, or equivalent professional experience
5+ years of experience in software development and/or site reliability engineering
Extensive AWS experience
Experience with high-traffic, customer-facing websites and apps
Mastery of Node.js and Java is a bonus
Location: This role can be based at our Boston, MA headquarters and is also open to qualified remote applicants.
Salary Range:
$180,000 to $225,000
The range provided is based on what we reasonably expect to pay for this job as of the time of posting. The actual salary offered will be determined based on multiple factors, including but not limited to the candidate’s relevant experience, job-related knowledge, skills, geographical location, and other job-related factors permitted by law.
About America's Test Kitchen
The mission of America's Test Kitchen (ATK) is to empower and inspire confidence, community, and creativity in the kitchen. Founded in 1992, the company is the leading multimedia cooking resource serving millions of fans with TV shows (America's Test Kitchen, Cook's Country, and America's Test Kitchen: The Next Generation), magazines (Cook's Illustrated and Cook's Country), cookbooks, a podcast (Proof), FAST channels, short-form video series, and the ATK Essential Membership for digital content. Based in a state-of-the-art 15,000-square-foot test kitchen in Boston's Seaport District, ATK has earned the trust of home cooks and culinary experts alike thanks to its one-of-a-kind processes and best-in-class techniques. Fifty full-time (admittedly very meticulous) test cooks, editors, and product testers spend their days tweaking every variable to find the very best recipes, equipment, ingredients, and techniques. Learn more at https://www.americastestkitchen.com/.
Why America's Test Kitchen:
We're passionate about cooking, and about creating the best place to work. We're small enough for your ideas to make a big impact, and large enough to offer you opportunities to grow professionally at any stage of your career. We want you to take risks and make mistakes — that's how innovation happens in our test kitchen, in our offices, and in life.
We at America's Test Kitchen believe food media can be a powerful force for social change. We are passionate about building an inclusive workforce that represents many different cultures, backgrounds, abilities, identities, and perspectives.
We welcome your application.