There are times, such as when a feed is botched or after a pre-production smoke test, when the operator role wants to roll back one or more data ingestion jobs and the data they loaded.
Allow the user to roll back a data ingestion job. Give the user a button that, once pressed, rolls back all of the data that was loaded into the cluster for that job.
Additionally, give the user the ability to roll back a time-based range of jobs.
To maintain provenance, the loaded jobs would have to remain recorded in the metadata repo but the data applied to HDFS would have to be removed. This includes all of the zones...
To implement this feature, a whole standard rollback flow has to be implemented and executed.
The high-water mark will also need to be reset to the event time of the last remaining load.
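The requirements above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed implementation: the `IngestJob` record, the `rollback_jobs` function, the `LOADED`/`ROLLED_BACK` status values, the example paths, and the `delete_path` callback (a stand-in for whatever actually removes data from HDFS) are all hypothetical names. It also assumes the time-based range is matched against each job's event time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class IngestJob:
    """Hypothetical metadata record for one ingestion job."""
    job_id: str
    feed: str
    event_time: datetime      # event time of the data this job loaded
    paths: List[str]          # one storage path per zone the job wrote to
    status: str = "LOADED"    # set to "ROLLED_BACK"; the record is never deleted


def rollback_jobs(
    jobs: List[IngestJob],
    feed: str,
    start: datetime,
    end: datetime,
    delete_path: Callable[[str], None],
) -> Optional[datetime]:
    """Roll back every loaded job of `feed` with start <= event_time <= end.

    Returns the new high-water mark: the event time of the last remaining
    load, or None if no loads remain for the feed.
    """
    for job in jobs:
        in_range = start <= job.event_time <= end
        if job.feed == feed and job.status == "LOADED" and in_range:
            for path in job.paths:
                delete_path(path)       # remove the job's data from every zone
            job.status = "ROLLED_BACK"  # keep the job record for provenance

    remaining = [j.event_time for j in jobs
                 if j.feed == feed and j.status == "LOADED"]
    return max(remaining) if remaining else None


# Usage: roll back the jobs from Jan 2 through Jan 3 (paths are made up).
deleted: List[str] = []
jobs = [
    IngestJob("j1", "orders", datetime(2024, 1, 1), ["/zone_a/orders/j1", "/zone_b/orders/j1"]),
    IngestJob("j2", "orders", datetime(2024, 1, 2), ["/zone_a/orders/j2", "/zone_b/orders/j2"]),
    IngestJob("j3", "orders", datetime(2024, 1, 3), ["/zone_a/orders/j3"]),
]
new_hwm = rollback_jobs(jobs, "orders", datetime(2024, 1, 2),
                        datetime(2024, 1, 3), deleted.append)
```

Marking the job record as rolled back, rather than deleting it, is one way to keep the history in the metadata repo while the data itself is removed. A single-job rollback is the same call with `start` and `end` both set to that job's event time.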