...
- Recent talks
- Ross, “What Is Data Lineage and Why Should I Care?”
- Maciej & Paweł, “OpenLineage & Airflow: Data Lineage has never been Easier”
- Willy, “Automating Airflow Backfills with Marquez”
- Michael C., “Data Lineage with Apache Airflow and Apache Spark”
- Ross & Michael R., “An Introduction to Data Lineage with Airflow and Marquez”
- Julien, “Observability for Data Pipelines with OpenLineage”
- Michael C., “Cross-platform Lineage with OpenLineage”
- Release 0.10.0
Added:
- Extend SaveIntoDataSourceCommandVisitor to extract schema from LocalRelation and LogicalRdd in Spark integration (#794) @pawel-big-lebowski
- Add InMemoryRelationInputDatasetBuilder for InMemory datasets to Spark integration (#818) @pawel-big-lebowski
- Add SnowflakeOperatorAsync extractor support to Airflow integration (#869) @denimalpaca
- Add PMD analysis to proxy project (#889) @howardyoo
- Add static code analysis tool mypy to run in CI against all Python modules (#802) @howardyoo
- Add copyright to source files (#755) @merobi-hub
Changed:
- Skip FunctionRegistry.class serialization in Spark integration (#828) @mobuchowski
- Reduce OL event payload size by excluding local data and including output node in start events (#881) @collado-mike
- Install new Rust-based SQL parser by default in Airflow integration (#835) @mobuchowski
- Improve overall pytest and integration tests for Airflow integration (#851, #858) @denimalpaca
- Split Spark integration into submodules (#834, #890) @tnazarew @mobuchowski
June 9th, 2022 (10am PT)
Attendees:
...