We are a global leader in digital assets and data center infrastructure, providing solutions that drive progress in finance and artificial intelligence. We believe that innovations in blockchain and digital assets will revolutionize the movement of value worldwide, and we are dedicated to crafting the products and services that will realize this vision. Headquartered in New York City, we have offices throughout North America, Europe, the Middle East, and Asia.
Who You Are
We are looking for a Site Reliability Engineer (SRE) to join our expanding infrastructure team in Asia. In this role, you will be crucial in ensuring the stability, scalability, and resilience of our production systems. You will help us scale our services, establish and enforce reliability standards, and continually enhance our engineering infrastructure. Your enthusiasm for contributing to the digital asset and blockchain sectors will drive innovation at the intersection of finance and technology, aligning with our mission to engineer a more open and accessible financial system through institutional-grade infrastructure and services.
What You'll Do
- Ensure reliability, uptime, and performance of critical infrastructure and services across our digital asset platform.
- Develop and maintain Infrastructure as Code (IaC) using Terraform, promoting reproducibility and automation of our environments.
- Manage and operate Kubernetes-based container deployments, ensuring efficient scaling and fault tolerance.
- Optimize our AWS cloud infrastructure for cost-effectiveness and security.
- Build, configure, and enhance monitoring and alerting pipelines with Datadog to proactively detect and resolve issues.
- Own and improve the CI/CD pipeline using Jenkins, JFrog, Flux, and GitHub Actions for fast, safe, and auditable deployments.
- Collaborate with security, platform, and product engineering teams to guarantee high availability, disaster recovery, and incident response capabilities.
- Participate in on-call rotations, incident retrospectives, and operational reviews to continually enhance service delivery.
What We're Looking For
- 8+ years of experience in an SRE, DevOps, or related role in a high-availability production environment.
- Strong experience with Terraform and Kubernetes for managing and scaling infrastructure.
- Proficient in AWS services such as EC2, EKS, S3, RDS, and IAM.
- Hands-on experience with CI/CD tools like Jenkins, JFrog Artifactory, Flux, and GitHub workflows.
- Solid understanding of system monitoring, alerting, and performance management using tools like Datadog.
- Knowledge of scripting languages (e.g., Python, Bash) for automation tasks.
- Familiarity with GitOps practices and version-controlled infrastructure.
- Strong communication and collaboration skills for cross-functional teamwork.
Bonus Points
- Experience in a regulated financial environment or with crypto-native infrastructure.
- Background in systems security, performance tuning, or cost optimization.
- Contributions to open-source projects or community engagement in the DevOps/SRE field.