Updated catalog browse, search and dataset view

Description

Problem Statement
The current catalog only lets users browse data. It provides little value beyond a simple data preview. It can be updated, reusing much of our existing code, to make it far more useful and feature-rich.

Goals

  • Make the Catalog more than a data browser (TBK-162)

  • Dataset-centric UI: a full schema view with metadata descriptions, samples, lineage, column profiling, and links to feeds and Projects* (new feature)

  • Allow users to annotate datasets (add descriptions to datasets, tag data)

  • Allow users to update column metadata descriptions

  • View column profiles of the data (leverage existing code that does this in the data wrangler)

  • View column summary analysis (leverage existing code that does this in the data wrangler)

  • Jump to wrangler from the catalog dataset

  • Ability to add items to the catalog.

  • When browsing, see which datasets have already been curated by Kylo

  • Better searching for data

    • Index names into Elasticsearch for catalog browsing/searching.
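
As a rough illustration of the name-indexing goal above, the sketch below splits dataset and column names into search tokens and looks them up in a small in-memory inverted index. The in-memory index only stands in for Elasticsearch; `name_tokens`, `NameIndex`, and the sample dataset ids are hypothetical names made up for this example, not existing Kylo code.

```python
import re
from collections import defaultdict


def name_tokens(name):
    """Split a dataset or column name into lowercase search tokens."""
    tokens = set()
    # Break on anything that is not a letter or digit (underscores, dots, spaces).
    for word in re.split(r"[^A-Za-z0-9]+", name):
        # Then break camelCase boundaries: "orderDate" -> "order Date".
        for piece in re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", word).split():
            tokens.add(piece.lower())
    return tokens


class NameIndex:
    """Hypothetical in-memory stand-in for the search index."""

    def __init__(self):
        # token -> set of dataset ids whose names contain that token
        self._postings = defaultdict(set)

    def add(self, dataset_id, names):
        """Index a dataset under every token found in its names."""
        for name in names:
            for token in name_tokens(name):
                self._postings[token].add(dataset_id)

    def search(self, query):
        """Return the ids of datasets that match every token in the query."""
        tokens = name_tokens(query)
        if not tokens:
            return set()
        matches = [self._postings.get(token, set()) for token in tokens]
        return set.intersection(*matches)


index = NameIndex()
index.add("ds1", ["customer_orders", "orderDate", "total"])
index.add("ds2", ["customer_profile", "email"])
print(index.search("customer orders"))
```

Requiring every query token to match keeps results narrow; a real search would more likely rank partial matches instead of dropping them.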

Framework model changes

  • Add a URN to a dataset: a unique name for the data source regardless of the connection

    • i.e., if you and I have different data sources representing the same physical database, we would want them to tie to the same dataset and metadata

  • The unique name can be set (and changed) for curating / de-duping data sources.
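
One possible way to get a connection-independent name is to derive it only from the physical location of the data and ignore who owns the connection. The sketch below is a minimal illustration under that assumption: the `DataSource` class, the `dataset_urn` function, and the `urn:dataset:...` layout are all hypothetical, and lowercasing table names assumes a case-insensitive store.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DataSource:
    """Hypothetical stand-in for one user's connection to a physical store."""
    owner: str
    jdbc_url: str
    table: str


def dataset_urn(source):
    """Derive a name from the physical location only, not from the connection."""
    url = source.jdbc_url
    # Drop the "jdbc:" prefix so urlsplit sees an ordinary scheme://host/path URL.
    if url.startswith("jdbc:"):
        url = url[len("jdbc:"):]
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    port = f":{parts.port}" if parts.port else ""
    database = parts.path.strip("/").lower()
    # Credentials and query parameters are dropped on purpose: they describe
    # the connection, not the data. Lowercasing the table is an assumption.
    return f"urn:dataset:{parts.scheme}:{host}{port}:{database}:{source.table.lower()}"


mine = DataSource("alice", "jdbc:mysql://DB-Host:3306/sales?user=alice", "Orders")
yours = DataSource("bob", "jdbc:mysql://bob:pw@db-host:3306/sales", "orders")
print(dataset_urn(mine) == dataset_urn(yours))
```

A derived value like this could serve as the default, with the curated name from the last bullet stored separately so it can be changed without touching the connection.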

Considerations

  • We may need a simplified "browse" view vs the detailed view above.

  • The dataset picker used in the data wrangler should be simple (as it is today), whereas the catalog should show the details

Status

Assignee

Unassigned

Reporter

Scott Reisdorf

Labels

None

Reviewer

None

Priority

Medium

Epic Name

Enhanced Catalog