deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

How to download and set up deequ

Open a terminal and run the following command:
git clone https://github.com/awslabs/deequ.git
git clone creates a local copy (a clone) of the deequ repository. You pass it a repository URL;
it supports several network protocols, such as HTTPS and SSH, each with its own URL format.
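
As a rough illustration of how the URL format differs by protocol, here is a minimal sketch. The function name deequ_clone_url is a hypothetical helper made up for this example; it is not part of git or deequ. The script only prints the clone commands, so it does not touch the network.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or deequ):
# print the deequ clone URL for the given protocol.
deequ_clone_url() {
    case "$1" in
        https) echo "https://github.com/awslabs/deequ.git" ;;
        ssh)   echo "git@github.com:awslabs/deequ.git" ;;
        *)     echo "unknown protocol: $1" >&2; return 1 ;;
    esac
}

# Print the commands instead of running them, so nothing is downloaded here.
for proto in https ssh; do
    echo "git clone $(deequ_clone_url "$proto")"
done
```

To actually clone, run one of the printed commands. The SSH form assumes you have an SSH key registered with your GitHub account; the HTTPS form works without one for a public repository.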

Alternatively, you can download deequ as a zip file: https://github.com/awslabs/deequ/archive/master.zip

Or clone deequ over SSH:
git clone git@github.com:awslabs/deequ.git

If you have problems with deequ

You can open an issue on the deequ issue tracker: https://github.com/awslabs/deequ/issues