Data Engineer - Quality Assurance
Transcenda
Key Responsibilities:
Automate data quality and reconciliation checks across varied storage layers, including Snowflake, SQL, and RDF/SPARQL databases
Test and verify data lineage, governance, and visualization components using Snowflake, data catalogs (e.g., DataHub), ThoughtSpot, and other visualization tools
Integrate test suites into the core infrastructure, which is orchestrated by Apache Airflow and uses the Iceberg table format, and monitor data pipeline health, alerting, and observability metrics using Prometheus and Grafana Cloud
Establish AI Evaluation Loops (Evals) and Guardrails: Build rigorous verification protocols, including structural tests, checks, and watchdog agents, to validate AI-generated artifacts, catch false positives, and ensure all automated outputs are secure, reliable, and free from hallucinations
Integrate automated testing workflows into CI/CD pipelines using GitHub Actions, ensuring continuous stability and quality gates across all deployment environments
Validate ETL and dbt transformations across Data Lakehouses, rigorously testing data progression through a Medallion Architecture
Test and automate complex API workflows, validating data payloads across OpenAPI integrations, third-party APIs, GraphQL, and AWS APIs (specifically S3)
Must Haves:
Data engineering & data testing: dbt, data lakehouse concepts, Medallion architecture
Databases & storage testing: SQL, Snowflake, AWS S3, Iceberg
Integrating quality checks into data pipelines: Apache Airflow
API testing & automation: REST/OpenAPI, GraphQL
Integrating test automation into CI/CD: GitHub Actions (or a similar tool such as ArgoCD or GoCD)
Cloud / infrastructure and observability basics: Kubernetes (K8s), Prometheus, Grafana
Nice to Have:
Graph databases: RDF / SPARQL
Data governance & analytics tools: DataHub, ThoughtSpot
AI/ML testing & MLOps: AI evals, guardrails, RAG, vector databases, AI drift monitoring
Advanced / emerging data tech: StarRocks, DuckDB
Regulated environments: GxP, 21 CFR Part 11, HIPAA
Clinical / domain-specific data standards: CDISC, ODM, FHIR
AI-native tooling: Cursor, Claude Code, Copilot, QA Wolf
Job posted 2 months ago
