Job Description
What you will be responsible for
As a Staff Data Platform Engineer, Cyber Data Science, you will:
- Collaborate across a variety of teams to support our Data Platform and MLOps needs, and design, implement, and maintain a secure and scalable infrastructure platform spanning AWS/Azure and our data center
- Use Infrastructure as Code and containerization to create immutable, reproducible deployments, and establish best practices to scale that Infrastructure as Code (IaC) project in a maintainable way
- Own internal and external SLAs, ensure they meet and exceed expectations, and continuously monitor system-centric KPIs
- Create tools for automating deployment, monitoring, alerting and operations of the overall platform and establish best practices for CI/CD environments and methodologies such as GitOps
- Analyze our AWS/Azure resource usage to optimize the balance between performance and cost
- Work on our data lake, data warehouse, and stream processing systems to create a unified query engine, multi-model databases, analytics extracts and reports, as well as dashboards and visualizations
- Design and build petabyte-scale systems for high availability, high throughput, data consistency, security, and end-user privacy, defining our next generation of data analytics tooling
- Mentor other engineers and promote software engineering best practices across the organization, designing systems with monitoring, auditing, reliability, and security at their core
- Devise solutions for scaling data systems to meet various business needs, and collaborate in a dynamic, consultative environment
Education & Qualifications
Minimum Qualifications
- 8+ years of software development or DevOps experience and at least a bachelor’s degree in Computer Science or Engineering.
- 8+ years of cloud IaC experience with deep expertise in Terraform/CloudFormation and Ansible/Salt deployments, and 2+ years of Kubernetes experience focused on DevOps
- Practical experience with Data Engineering and the accompanying DevOps & DataOps workflows.
- A deep understanding of CI/CD tools and a strong desire to help teams release frequently to production, with a focus on creating reliable, high-quality results.
- Extensive experience building large-scale distributed systems and data analytics processes on cloud-native, in-memory, and fit-for-purpose hybrid infrastructure. Experience with cybersecurity data and globally distributed log & event processing systems with data mesh and data federation as the architectural core is highly desirable.
- Expertise in DevOps and DevSecOps, emerging experience with DataSecOps and Data Governance practices, and deep experience managing and scaling container-based infrastructure-as-code technologies from the CNCF and related ecosystems.
- Experience with big data technologies such as Presto/Trino, Spark & Flink, Airflow & Prefect, Redpanda & Kafka, Iceberg & Delta Lake, Cassandra & Scylla, PlanetScale Vitess & CockroachDB, Snowflake & Databricks, and Memgraph & Neo4j, as well as modern security tooling such as Splunk, Panther, Datadog, Elastic, ArcSight, etc.
- Experience designing and building data warehouses, data lakes, or lakehouses using batch, streaming, lambda, and data mesh solutions, and with improving the efficiency, scalability, and stability of systems
- Knowledge of dbt, Airflow, Ansible, Argo, or other data pipeline systems; ideally, experience building and maintaining a data warehouse and an understanding of simple data science workflows and terminology.
- Expertise with AWS, GCP, or Azure, and with services/tooling such as or similar to: Terraform, Packer, Docker, Kubernetes, Helm, Prometheus, Grafana, Fluent Bit, Istio (service mesh)
- Strong background integrating continuous delivery (CD) with Kubernetes using tools such as Argo, GitLab, and Spinnaker, plus strong Git experience and familiarity with development methodologies such as trunk-based development vs. Git flow
- Strong end-to-end ownership and a good sense of urgency to enable proper self-prioritization
Preferred Experience
- 10+ years of experience with Python, Java, or similar languages and with cloud infrastructure (e.g., AWS, GCP, Azure), plus deep experience working with big data processing infrastructures and cloud architecture
- Deep experience with cloud DevOps tooling and expertise in container-native systems and the associated security and scaling considerations, including the ability to work with and build tooling for multi/hybrid cloud environments using modern CI/CD, IaC, DataOps, and DevSecOps best practices.
- Have built and scaled hybrid cloud, streaming data, and ML platforms from scratch, and are up to date with tools and methods for deploying, managing, and connecting data mesh in complex enterprise environments
- Experience working with or developing Kubernetes autoscaling tools, and deep experience with container and cloud security principles
- Deep experience designing and delivering data platforms with built-in lineage, federation, governance, compliance, security, and privacy.
- Experience building and delivering MLOps and DataOps platforms, and familiarity with processes for MRM coordination, deployment, and monitoring of production-grade ML models in a regulated, high-growth tech environment
Job ID: 123330