
3 Open Source Data Catalogs That You Should Definitely Try Out

Tanmay Deshpande
4 min read · May 18, 2021
Image Source — https://undraw.co/

Data discovery has become one of the most important capabilities that enterprise data platforms must provide. Over the years, many large companies such as Airbnb, LinkedIn, Uber, Netflix, and Lyft have described how they solved the data discovery problem by building an in-house metadata search engine.

With the rise of analytics, machine learning, and data science projects, data discovery has become a top priority for many data teams.

Even modern enterprise data architectures such as Data Mesh emphasize the importance of data catalogs.

Source — https://martinfowler.com/articles/data-monolith-to-mesh.html

As the Data Mesh article puts it:

This centralized discoverability service allows data consumers, engineers and scientists in an organization, to find a dataset of their interest easily.

In general, a data catalog provides the following features:

  • Data Discovery/Search
  • Data Classification
  • Data Lineage
  • Data Governance
  • Etc.
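To make these features concrete, here is a minimal in-memory sketch in Python. It is not how any particular catalog product works; the `DatasetEntry` and `Catalog` classes, their fields, and the sample dataset names are all hypothetical, chosen only to show how search, classification tags, lineage, and ownership metadata might fit together.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Hypothetical metadata record for one dataset."""
    name: str
    description: str
    tags: set[str] = field(default_factory=set)         # classification labels
    upstream: list[str] = field(default_factory=list)   # lineage: direct sources
    owner: str = "unknown"                              # governance: accountable team


class Catalog:
    """Toy catalog that stores metadata only, never the data itself."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> list[str]:
        """Discovery: case-insensitive match on name, description, or tags."""
        kw = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if kw in e.name.lower()
            or kw in e.description.lower()
            or any(kw in t.lower() for t in e.tags)
        )

    def lineage(self, name: str) -> list[str]:
        """Lineage: all transitive upstream datasets, nearest first."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


# Sample (made-up) datasets forming a small pipeline.
catalog = Catalog()
catalog.register(DatasetEntry("raw_orders", "Order events from the web shop", {"raw"}, [], "ingestion"))
catalog.register(DatasetEntry("clean_orders", "Deduplicated orders", {"curated", "pii"}, ["raw_orders"], "data-eng"))
catalog.register(DatasetEntry("daily_revenue", "Revenue per day", {"curated"}, ["clean_orders"], "analytics"))

print(catalog.search("orders"))           # ['clean_orders', 'raw_orders']
print(catalog.search("pii"))              # ['clean_orders']
print(catalog.lineage("daily_revenue"))   # ['clean_orders', 'raw_orders']
```

A real catalog would add persistent storage, a proper search index, and automated metadata ingestion, but the core idea is the same: the catalog indexes information about datasets so that people can find and trust them.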
