Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
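As a rough illustration of what a "unit test for data" can look like, the sketch below declares a few constraints on a small DataFrame and runs them with Deequ's `VerificationSuite`. The `Item` record, its column names, the sample rows, and the local Spark session are assumptions made for this example only; it also assumes the Deequ and Spark artifacts are already on the classpath.

```scala
import com.amazon.deequ.VerificationSuite
import com.amazon.deequ.checks.{Check, CheckLevel, CheckStatus}
import org.apache.spark.sql.SparkSession

object DeequSketch {
  // Hypothetical record type, used only for this illustration.
  case class Item(id: Long, name: String, numViews: Long)

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; a real job would use its own cluster config.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for a large dataset.
    val data = Seq(
      Item(1L, "Thingy A", 0L),
      Item(2L, "Thingy B", 5L),
      Item(3L, "Thingy C", 12L)
    ).toDF()

    // Declare the expectations, then let Deequ compute the metrics and evaluate them.
    val result = VerificationSuite()
      .onData(data)
      .addCheck(
        Check(CheckLevel.Error, "basic integrity checks")
          .hasSize(_ == 3)            // expected row count
          .isComplete("id")           // no null ids
          .isUnique("id")             // no duplicate ids
          .isNonNegative("numViews")) // no negative view counts
      .run()

    if (result.status == CheckStatus.Success) println("All checks passed")
    else println("Some checks failed")

    spark.stop()
  }
}
```

Each constraint plays the role of an assertion in a conventional unit test, except that it is evaluated against the dataset as a whole rather than against a single function call.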
The awslabs/deequ project is written in Scala.
Report bugs or request features on the deequ issue tracker on GitHub.