$ whoami
Yiming /i' miŋ/ Peng

I build data platforms that power decisions at scale.

Senior Data Engineer with a PhD in Machine Learning and 8+ years building scalable DataOps/MLOps platforms. Currently at Wētā FX, contributing to the operational data infrastructure behind world-class visual effects. Apache Airflow contributor.

$ ls ./featured-projects

DataOps Platform @ Wētā FX

Python Airflow ClickHouse Ansible

Production on-premises data platform powering BI and ML workflows across Wētā FX. Built and maintain 25+ ETL pipelines ingesting from disparate sources, solving the persistent problem of unreliable, undocumented data hand-offs between departments.

Impact: Eliminated recurring pipeline failures; stakeholder teams now rely on it as production-critical infrastructure.

Apache Airflow — OSS Contributions

Python Open Source Apache Airflow

Contributor to Apache Airflow, the de facto standard for data pipeline orchestration, used by thousands of organisations worldwide. Contributions focus on stability, usability, and operator improvements drawn from real production experience.

Links: GitHub → [PR links — add when available]

Data Quality & Observability System

Great Expectations Python GitLab CI

Designed and built a platform-wide data quality framework from the ground up using Great Expectations — moving the team from reactive fire-fighting to proactive anomaly detection. Integrated into the CI/CD pipeline so quality checks run automatically on every deploy.

Impact: Dramatically reduced data incidents; stakeholders gained confidence to act on data without manual verification.

Kubernetes MLOps Platform

Kubernetes Kubeflow AWS EKS

Co-designed and built a Kubernetes-based MLOps platform at Chorus NZ to close the gap between data science experimentation and production deployment. Enabled model training, versioning, and serving pipelines within a unified, reproducible infrastructure.

Impact: First production ML infrastructure at the organisation — made model deployment a routine operation rather than a heroic effort.

$ cat experience.log

Senior Software Engineer, Data | Wētā FX | Oct 2023 – Present
Data Architect | MBIE (via Emergence CLI) | Jul 2022 – Jul 2023
Principal Data & Integration Architect | IHC New Zealand | Jul 2021 – Jul 2022
Senior Data Engineer | Chorus New Zealand | Dec 2019 – Jun 2021
Data Engineer | KPMG New Zealand | Mar 2019 – Nov 2019
Research Scientist & Research Assistant | Victoria University of Wellington | Sep 2015 – Mar 2019
Research & Teaching Assistant | Unitec Institute of Technology | Sep 2011 – Jun 2015

Certifications

Certified Kubernetes Administrator (CKA) | Linux Foundation | 2025 | [link TBD]
Certified Kubernetes Application Developer (CKAD) | Linux Foundation | 2025 | [link TBD]

$ ls ./writing

Articles on data engineering, platform design, and lessons from production systems. Coming soon; they will be published from Obsidian.

$ cat publications.bib

[Conference TBD] 2018

[Paper title — please add real title here]

Yiming Peng, [Co-authors TBD]

Add PDF / DOI link when available

[Conference TBD] 2017

[Paper title — please add real title here]

Yiming Peng, [Co-authors TBD]

Add PDF / DOI link when available

Let's Connect

Open to research collaborations, consulting projects, and open source contributions.