Announcements:
- We recently removed support for Airflow 1.x
- Ross gave a talk on OpenLineage at ApacheCon in New Orleans last week
- Upcoming opportunities to give talks about OpenLineage:
- Giving a talk on data lineage soon? Ping Michael R. on Slack to let us know.
Recent release 0.15.1 [Michael R.]
- Added
  - Airflow: improve development experience #1101 @JDarDagran
  - Documentation: update issue templates for proposal & add new integration template #1116 @rossturk
  - Spark: add description for URL parameters in readme, change overwriteName to appName #1130 @tnazarew
- Changed
  - Airflow: lazy load BigQuery client #1119 @mobuchowski
- Fixed
  - Spark: fix column lineage #1069 @pawel-big-lebowski
  - Spark: set log level of Init OpenLineageContext to DEBUG #1064 **new contributor @varuntestaz**
  - Java client: update version of SnakeYAML #1090 **new contributor Lukáš AKA @TheSpeedding**
  - CI: build macos release package on medium resource class #1131 @mobuchowski
- Additional bug fixes and more details: https://github.com/OpenLineage/OpenLineage/blob/main/CHANGELOG.md
Project roadmap review [Harel]
- Improved understanding of Airflow
- Increased adoption of OpenLineage consumers
- Coverage by event producers
- Continue experimenting with a Flink integration, streaming in general
- Increased support of column level lineage (e.g., SQL operators)
Column-level lineage workshop [Howard]
- Tutorial available in the OpenLineage/workshops GitHub repo
- Uses Jupyter and Spark
- Covers:
  - Installing Marquez and Jupyter
  - Using column lineage feature in a Jupyter notebook
- Requires:
  - Docker 17.05+
  - Docker Compose 1.29.1+
  - Git (preinstalled on most versions of macOS; verify with git version)
  - 4 GB of available memory (the minimum for Docker; more is strongly recommended)
- Preconfigured, including a token for Jupyter
- Notebook contains scripts to set up environment, run Marquez, start Spark session
- Allows you to see Marquez in action and understand how the APIs work
  - Scripts return the JSON payloads
- Other features are also well-suited to Jupyter notebooks, so more tutorials will be forthcoming
- We welcome your contribution of additional tutorials!
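
To give a sense of what the JSON payloads mentioned above can look like, here is a minimal sketch (not taken from the workshop notebook) that reads column-level lineage out of an OpenLineage event. It assumes the columnLineage dataset facet layout, in which each output field lists its inputFields. The sample event, the dataset and column names, and the helper column_lineage_edges are all hypothetical, and required facet metadata such as _producer and _schemaURL is omitted for brevity.

```python
import json

# Hypothetical, hand-written event fragment for illustration only.
# Real events carry more fields (run, eventTime, producer, facet metadata).
SAMPLE_EVENT = json.loads("""
{
  "eventType": "COMPLETE",
  "job": {"namespace": "workshop", "name": "orders_summary"},
  "outputs": [
    {
      "namespace": "file",
      "name": "/tmp/orders_summary",
      "facets": {
        "columnLineage": {
          "fields": {
            "total": {
              "inputFields": [
                {"namespace": "file", "name": "/tmp/orders", "field": "price"},
                {"namespace": "file", "name": "/tmp/orders", "field": "quantity"}
              ]
            },
            "customer_id": {
              "inputFields": [
                {"namespace": "file", "name": "/tmp/orders", "field": "customer_id"}
              ]
            }
          }
        }
      }
    }
  ]
}
""")


def column_lineage_edges(event):
    """Return sorted (input_column, output_column) pairs found in an event."""
    edges = []
    for output in event.get("outputs", []):
        # Datasets without the facet simply contribute no edges.
        facet = output.get("facets", {}).get("columnLineage", {})
        for out_field, detail in facet.get("fields", {}).items():
            for src in detail.get("inputFields", []):
                edges.append(
                    (f"{src['name']}.{src['field']}", f"{output['name']}.{out_field}")
                )
    return sorted(edges)


if __name__ == "__main__":
    for source, target in column_lineage_edges(SAMPLE_EVENT):
        print(f"{source} -> {target}")
```

In this sample, the output column total depends on two input columns (price and quantity), while customer_id is passed through unchanged.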

September 8, 2022 (10am PT)
