...

  • TSC:
    • Willy Lulciuc: Co-creator of Marquez
    • Mike Collado: Staff Software Engineer, Astronomer
    • Julien Le Dem: OpenLineage Project lead
  • And:
    • Ernie Ostic: SVP of Product, Manta
    • Ross Turk: Senior Director of Community, Astronomer
    • Minkyu Park: Senior Software Engineer, Astronomer
    • Peter Hicks: Senior Software Engineer, Astronomer
    • Michael Robinson: Software Engineer, Dev. Rel., Astronomer
    • Sandeep Adwankar: Senior Technical Product Manager, AWS
    • Will Johnson: Senior Cloud Solution Architect, Azure Cloud, Microsoft
    • John Thomas: Software Engineer, Dev. Rel., Astronomer
    • Chandru Sugunan: Product Manager, Azure Cloud, Microsoft
    • Petr Hajek: Information Management Professional, Profinit
    • Colin Schaub: Lead API Engineer, API Platform Lead, Cargill
    • Mark Chiarelli: Senior Consultant, MarkLogic
    • Sam Holmberg: Software Engineer, Astronomer
    • Paweł Leszczyński: Software Engineer, GetInData

Agenda:

  • Recent talks [Julien]
  • Recent release: 0.10.0 [Michael R.]
  • Flink integration [Paweł, Maciej]
  • New docs site [Ross]
  • Discuss: streaming services in Flink integration [Will]
  • Open discussion
    • OL philosophy for streaming in general

...

  • Recent talks
  • Release 0.10.0
  • Flink integration
    • Entry point: built a Flink example app to find out whether metadata and schema are extractable
    • Maciej also successfully read data from Iceberg
    • Flink provides two APIs
    • Created integration tests for all use cases, added them to CircleCI
    • New Java client: different configs for HTTP, Kafka endpoints
    • Missing feature: make sure a crash in the integration doesn't kill the Flink job
    • Coming soon: experimental version
      • not focused on streaming currently
      • focus: how to extract info from Flink
      • feedback from community desired
    • Q & A
      • Will: is the code an extension of OL or an integration?
        • an integration akin to the dbt integration
      • Willy: any changes to the spec/schema? Is the state part of the payload?
        • new state should be added (currently "other")
  • New docs site
    • Up until today, docs have been on the website and spread throughout READMEs
    • Docusaurus deployment now available
    • Changes to structure as well as content welcome
    • Not currently live but will be soon
    • Can be hosted at docs.openlineage.io
    • Everything is in Markdown
    • Another motivation: the Keboola use case is not part of the codebase, so a docs site could describe it
    • Next milestone: we all decide to publish it
    • Q & A
      • Willy: let's add a section on defining custom facets
      • Ross: feel free to add another page stub
      • Ross: also need a FAQ
      • Julien: we could autogenerate some docs
      • Ross: there are downsides to such an approach
      • Julien: let's open issues when answers aren't good enough
      • Willy: descriptions of facets could be improved
      • Julien: we could version them
      • Ross: I'll look for signs that people are not finding docs for the version they are using
  • Streaming in Flink integration
    • Has there been any evolution in the thinking on support for streaming?
      • Julien: a start event, a complete event, and snapshots in between, limited to a certain number per time interval
      • Paweł: we can make the snapshot volume configurable
    • Does Flink support sending data to multiple tables like Spark?
      • yes, multiple outputs supported by OpenLineage model
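
The streaming idea discussed above (a start event, a complete event, and a capped, configurable number of snapshots in between) could be sketched roughly as follows. This is only an illustration, not code from the Flink integration: `SnapshotThrottle`, `run_streaming_job`, and the `emit` callback are hypothetical names, a simple fixed-window counter stands in for whatever limiting strategy the integration ends up using, and snapshots are labeled "OTHER" only because the notes say that is the current state.

```python
import time
from typing import Callable, Iterable, Optional


class SnapshotThrottle:
    """Hypothetical helper: allows at most `max_snapshots` per time window."""

    def __init__(
        self,
        max_snapshots: int,
        interval_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_snapshots = max_snapshots        # configurable snapshot volume
        self.interval_seconds = interval_seconds  # length of one window
        self._clock = clock
        self._window_start: Optional[float] = None
        self._sent_in_window = 0

    def allow(self) -> bool:
        """Return True if another snapshot may be sent in the current window."""
        now = self._clock()
        # Open a new window on first use or once the current one has expired.
        if (
            self._window_start is None
            or now - self._window_start >= self.interval_seconds
        ):
            self._window_start = now
            self._sent_in_window = 0
        if self._sent_in_window < self.max_snapshots:
            self._sent_in_window += 1
            return True
        return False


def run_streaming_job(
    ticks: Iterable[object],
    throttle: SnapshotThrottle,
    emit: Callable[[str], None],
) -> None:
    """Emit START, throttled snapshots (assumed state "OTHER"), then COMPLETE."""
    emit("START")
    for _ in ticks:
        if throttle.allow():
            emit("OTHER")  # snapshot; a dedicated state was proposed in the Q & A
    emit("COMPLETE")
```

For example, `SnapshotThrottle(max_snapshots=2, interval_seconds=10.0)` would let through at most two snapshots in any ten-second window, while the start and complete events are always sent.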

June 9th, 2022 (10am PT)

Attendees:

...