All are welcome.

Next meeting: November 10, 2022 (10am PT)

October 13, 2022 (10am PT)

...

  • Announcements:
    • We recently removed support for Airflow 1.x
    • Ross gave a talk on OpenLineage at ApacheCon in New Orleans last week
    • Upcoming opportunities to give talks about OpenLineage: 
      • Data Teams Summit (January 2023)
      • Subsurface Live (January 2023)
      • Data Council Austin (March 2023)
    • Giving a talk on data lineage soon? Ping Michael R. on Slack to let us know.
  • Recent release 0.15.1 [Michael R.]
  • Project roadmap review [Harel]
    • Improved understanding of Airflow
      • Track DAG runs
      • Native lineage in operators
    • Increased adoption of OpenLineage consumers
      • Collaborate with data catalogs
    • Coverage by event producers
      • Increased support for Snowflake access history using tags
      • Data quality frameworks
      • Start thinking about data consumption integrations (e.g., on the BI layer)
    • Continue experimenting with a Flink integration and with streaming in general
    • Increased support of column level lineage (e.g., SQL operators)
  • Column-level lineage workshop [Howard]
    • Tutorial by Pawel Leszczynski available in the OpenLineage/workshops GitHub repo
    • Uses Jupyter and Spark
    • Covers:
      • Installing Marquez and Jupyter
      • Using the column-level lineage feature in a Jupyter notebook
    • Requires:
      • Docker 17.05+
      • Docker Compose 1.29.1+
      • Git (preinstalled on most versions of macOS; verify by running git version)
      • 4 GB of available memory (the minimum for Docker; more is strongly recommended)
    • Preconfigured, including a token for Jupyter
    • The notebook contains scripts to set up the environment, run Marquez, and start a Spark session
    • Allows you to see Marquez in action and understand how the APIs work
      • Scripts return the JSON payloads
    • Other features are also well-suited to Jupyter notebooks, so more tutorials will be forthcoming
    • We welcome your contribution of additional tutorials!
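The workshop requirements listed above (Docker 17.05+, Docker Compose 1.29.1+, Git) can be checked before starting. The following is a minimal sketch, not part of the workshop itself: the helper names (parse_version, meets_minimum, check_tool, report) are hypothetical, and it assumes Compose is installed as a standalone docker-compose command. It does not check the 4 GB memory requirement.

```python
import re
import shutil
import subprocess

# Minimum versions copied from the workshop requirements in these notes.
MINIMUMS = {"docker": (17, 5), "docker-compose": (1, 29, 1)}


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    padded_found = found + (0,) * (width - len(found))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def check_tool(name, minimum=None):
    """Return (ok, detail) for one command-line tool on this machine."""
    if shutil.which(name) is None:
        return False, "not found on PATH"
    result = subprocess.run([name, "--version"], capture_output=True, text=True)
    output = (result.stdout or result.stderr).strip()
    if minimum is None:
        # No minimum given in the notes (Git): being installed is enough.
        return True, output
    found = parse_version(output)
    return meets_minimum(found, minimum), ".".join(str(part) for part in found)


def report():
    """Build a one-line-per-tool summary of the prerequisite checks."""
    lines = []
    for tool in ("docker", "docker-compose", "git"):
        ok, detail = check_tool(tool, MINIMUMS.get(tool))
        lines.append(f"{tool}: {'OK' if ok else 'PROBLEM'} ({detail})")
    return "\n".join(lines)
```

Calling print(report()) from a terminal session or a notebook cell prints one status line per tool.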

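To give a sense of the JSON payloads mentioned in the workshop notes, here is a small sketch that reads column-level lineage out of an OpenLineage run event. It assumes the layout of the OpenLineage columnLineage dataset facet (a fields map whose entries list inputFields with namespace, name, and field). The sample event is hand-written for illustration, with hypothetical dataset names and columns; it is not output captured from the workshop notebook, and column_edges is a hypothetical helper.

```python
import json

# Hand-written sample fragment of a run event; dataset names and columns
# are made up for illustration only.
SAMPLE_EVENT = json.loads("""
{
  "outputs": [
    {
      "namespace": "file",
      "name": "/tmp/target_table",
      "facets": {
        "columnLineage": {
          "fields": {
            "a": {
              "inputFields": [
                {"namespace": "file", "name": "/tmp/source_table", "field": "a"}
              ]
            },
            "total": {
              "inputFields": [
                {"namespace": "file", "name": "/tmp/source_table", "field": "b"},
                {"namespace": "file", "name": "/tmp/source_table", "field": "c"}
              ]
            }
          }
        }
      }
    }
  ]
}
""")


def column_edges(event):
    """Return sorted (input_column, output_column) pairs from a run event dict."""
    edges = []
    for dataset in event.get("outputs", []):
        # Datasets without the facet simply contribute no edges.
        facet = dataset.get("facets", {}).get("columnLineage", {})
        for out_field, detail in facet.get("fields", {}).items():
            for source in detail.get("inputFields", []):
                edges.append((
                    f"{source['name']}.{source['field']}",
                    f"{dataset['name']}.{out_field}",
                ))
    return sorted(edges)


for source_column, target_column in column_edges(SAMPLE_EVENT):
    print(f"{source_column} -> {target_column}")
```

Each printed line is one column-to-column dependency, which is the relationship the column-level lineage feature records for an output dataset.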
...