...
- TSC:
- Mandy Chessell: Egeria Project Lead
- Maciej Obuchowski: Software Engineer, GetInData, OpenLineage contributor
- Willy Lulciuc: Co-creator of Marquez
- Mike Collado: Staff Software Engineer, Datakin
- And:
- Ernie Ostic, SVP of Product, Manta
- Šimon Rajčan, Senior Business Intelligence Consultant, Profinit
- Sheeri Cabral: Technical Product Manager, Lineage, Collibra
- Ross Turk, Senior Director of Community, Astronomer
- Howard Yoo, Staff Product Manager, Astronomer
- Minkyu Park, Senior Software Engineer, Astronomer
- Peter Hicks, Senior Software Engineer, Astronomer
- Jakub Moravec, Software Architect, Manta
- Michael Robinson, Software Engineer, Dev. Rel., Astronomer
...
Notes:
- Release 0.9.0 [Michael R.]
- We added:
- Spark: Column-level lineage introduced for Spark integration (#698, #645) @pawel-big-lebowski
- Java: Spark to use Java client directly (#774) @mobuchowski
- Clients: Add OPENLINEAGE_DISABLED environment variable which overrides config to NoopTransport (#780) @mobuchowski
- For the bug fixes and more information, see the GitHub repo.
- Shout out to new contributor Jakub Dardziński, who contributed a bug fix to this release!
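To illustrate the OPENLINEAGE_DISABLED item above, here is a minimal sketch of how an environment variable could override a configured transport with a no-op one. This is not the client's actual code: `NoopTransport` here is a local stand-in class, `RecordingTransport` and `resolve_transport` are hypothetical names, and treating the value `true` as the "disabled" signal is an assumption.

```python
import os


class NoopTransport:
    """Hypothetical stand-in: accepts events and silently drops them."""

    def emit(self, event):
        return None


class RecordingTransport:
    """Hypothetical stand-in for whatever transport the config selected."""

    def __init__(self):
        self.sent = []

    def emit(self, event):
        self.sent.append(event)


def resolve_transport(configured, environ=None):
    """Return a no-op transport when OPENLINEAGE_DISABLED is set to 'true'.

    Assumption: only the (case-insensitive) value 'true' disables emission.
    Any other value, or an unset variable, leaves the configured transport
    in place.
    """
    environ = os.environ if environ is None else environ
    flag = environ.get("OPENLINEAGE_DISABLED", "").strip().lower()
    if flag == "true":
        return NoopTransport()
    return configured
```

The point of the design is that the override wins regardless of what the config file says, so lineage emission can be switched off per environment without editing configuration.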
- Snowflake Blog Post [Ross]
- topic: a new integration between OL and Snowflake
- integration is the first OL extractor to process query logs
- design:
- an Airflow pipeline processes queries against Snowflake
- separate job: pulls access history and assembles lineage metadata
- two angles: Airflow sees it, Snowflake records it
- the meat of the integration: a view that does untold SQL madness to emit JSON to send to OL
- result: you can study the transformation by asking Snowflake AND Airflow about it
- required: having access history enabled in your Snowflake account (which requires a special access level)
- Q & A
- Howard: is the access history task part of the DAG?
- Ross: yes, there's a separate DAG that pulls the view and emits the events
- Howard: what's the scope of the metadata?
- Ross: the account level
- Michael C.: in the Airflow integration, there's a parent/child relationship; is this captured?
- Ross: there are 2 jobs/runs, and there's work ongoing to emit metadata from Airflow (task name)
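The "pull access history, assemble lineage metadata" step described above can be sketched roughly as follows. The integration discussed does this in a SQL view inside Snowflake; this Python version is a swapped-in illustration only. The row shape (`query_id`, `query_start_time`, `base_objects_accessed`, `objects_modified`, each object carrying an `objectName`) is an assumption loosely modeled on Snowflake access history, and the function name, the use of the query ID as job name, and the namespace and producer values are all hypothetical.

```python
import json
import uuid


def access_row_to_event(row, namespace, producer):
    """Turn one access-history-like row into an OpenLineage-style run event.

    Assumptions: `row` is a dict with the keys used below, and every
    accessed/modified object has an 'objectName' field.
    """

    def datasets(objects):
        return [{"namespace": namespace, "name": o["objectName"]} for o in objects]

    return {
        "eventType": "COMPLETE",
        "eventTime": row["query_start_time"],
        # Derive a stable run ID from the query ID so re-processing the
        # same row yields the same event.
        "run": {"runId": str(uuid.uuid5(uuid.NAMESPACE_URL, row["query_id"]))},
        "job": {"namespace": namespace, "name": row["query_id"]},
        "inputs": datasets(row["base_objects_accessed"]),
        "outputs": datasets(row["objects_modified"]),
        "producer": producer,
    }


# Hypothetical sample row, for illustration only.
sample_row = {
    "query_id": "example-query-id",
    "query_start_time": "2022-05-01T10:00:00Z",
    "base_objects_accessed": [{"objectName": "DB.PUBLIC.ORDERS"}],
    "objects_modified": [{"objectName": "DB.PUBLIC.ORDERS_SUMMARY"}],
}

event = access_row_to_event(
    sample_row, "snowflake://example-account", "https://example.com/sketch"
)
print(json.dumps(event, indent=2))
```

A separate job (a second DAG, per the Q & A) would then send each such event to an OpenLineage backend; that transport step is left out here.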
May 19th, 2022 (10am PT)
Agenda:
...