e2e-data-engineering

An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.

How to download and set up e2e-data-engineering

Open a terminal and run the following command:
git clone https://github.com/airscholar/e2e-data-engineering.git
git clone creates a local copy of the e2e-data-engineering repository. You pass it the repository URL as an argument.
It supports several network protocols, each with its own URL format.

Alternatively, you can download e2e-data-engineering as a zip file: https://github.com/airscholar/e2e-data-engineering/archive/master.zip

Or clone e2e-data-engineering over SSH:
git clone git@github.com:airscholar/e2e-data-engineering.git
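
The three ways of fetching the repository described above differ only in the URL form. As a rough sketch, the shell function below prints each form for a given owner and repository name. The function name repo_url and its scheme keywords (https, ssh, zip) are hypothetical helpers made up for this illustration and are not part of git or of this project. The master branch in the zip URL is taken from the download link above.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print the fetch URL for a
# GitHub repository in one of the three forms mentioned above.
# Usage: repo_url <https|ssh|zip> <owner> <repo>
repo_url() {
    case "$1" in
        https) printf 'https://github.com/%s/%s.git\n' "$2" "$3" ;;
        ssh)   printf 'git@github.com:%s/%s.git\n' "$2" "$3" ;;
        # Assumes the default branch is "master", as in the zip link above.
        zip)   printf 'https://github.com/%s/%s/archive/master.zip\n' "$2" "$3" ;;
        *)     echo "unknown scheme: $1" >&2; return 1 ;;
    esac
}

# Print all three forms for this repository.
repo_url https airscholar e2e-data-engineering
repo_url ssh airscholar e2e-data-engineering
repo_url zip airscholar e2e-data-engineering
```

The HTTPS and SSH forms can be passed straight to git clone, for example git clone "$(repo_url ssh airscholar e2e-data-engineering)". Cloning over SSH only works if an SSH key is registered with your GitHub account; the HTTPS form needs no key for a public repository.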

If you have problems with e2e-data-engineering

You can open an issue on the project's issue tracker: https://github.com/airscholar/e2e-data-engineering/issues