Code
The ActiveClean codebase is written in Python and includes the core ActiveClean algorithm and a data cleaning benchmark.
The Data Cleaning Benchmark automatically injects errors into a dataset to test the robustness of a machine learning model against them. It can be installed using pip:
pip install cleaningbenchmark
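To give a rough idea of what "injecting data errors" means, here is a minimal sketch in plain Python. It does not use the cleaningbenchmark API; the function name `inject_missing_values` and its parameters are hypothetical and chosen only for illustration. It blanks out a fixed fraction of the values in one column, which is one simple kind of error a benchmark like this could introduce.

```python
import random


def inject_missing_values(rows, column, fraction, seed=0):
    """Return a copy of `rows` with `fraction` of the `column` values set to None.

    Hypothetical helper for illustration only; not part of cleaningbenchmark.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    corrupted = [dict(row) for row in rows]  # copy rows, leave the input untouched
    n_errors = int(round(fraction * len(corrupted)))
    # Pick distinct row indices and blank out the chosen column in each.
    for i in rng.sample(range(len(corrupted)), n_errors):
        corrupted[i][column] = None
    return corrupted


# Small toy dataset: ten records with an "age" feature and a binary label.
rows = [{"age": 20 + i, "label": i % 2} for i in range(10)]
dirty = inject_missing_values(rows, "age", fraction=0.3)
n_missing = sum(1 for r in dirty if r["age"] is None)
print(n_missing, "of", len(dirty), "records now have a missing age")
```

A model trained on `dirty` can then be compared with one trained on `rows` to see how much the injected errors hurt accuracy.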
To reproduce the results and run the code, simply download the files from the following link ([login to view URL]) and run the Python file using:
python [login to view URL]
The script is quite simple, so you can read it to see everything in action.
We want this to be accomplished using Python 3.x, placed in a Jupyter Notebook, and well documented, with an explanation of roughly what is happening in each cell/step.
We can provide additional references to help with the above (papers/links).
Our aim is to find someone who can achieve the above, enjoys the work, and would help us extend the code, leading to more ongoing work.
I am an experienced software engineer with 3+ years in the software engineering sector. I hold a Bachelor of Engineering degree in Computer Engineering from AITR Indore, India, and I am also a certified Microsoft Technology Associate.
My main skills are: development of productivity-driven scripts/utilities; optimisation/automation of processes and operations; software architecture; digital asset management and end-to-end (from delivery to ingest) workflow design. My applications are typically single-window utilities, built with a single purpose in mind. I take care to ensure the UI in my utilities behaves properly.
I'm responsible and always meet deadlines. My goal is to make every client satisfied. Thank you!