Generic Cloning Utility

Published Apr 09, 2018. Last updated May 21, 2018.

About me

I am a full stack engineer who has been working with Django for almost two years.

The problem I wanted to solve

We had multiple actions where data needed to be copied from one model to another, or where entries had to be cloned within the same model. Each of these actions came with conditions: sometimes only a few fields had to be copied, some fields had to be ignored, or a different value had to be supplied for certain fields. We also had to write the same handling code again for each field type. Every cloning or copying action meant writing a lot of code that was not only redundant but also difficult to maintain and scale.

What is Generic Cloning Utility?

I set out to develop a generic cloning utility that could copy data from any model object to any other model object. It can copy all kinds of fields in Django, from a simple CharField to the more complicated TreeForeignKey (from django-mptt). You can specify which fields need to be copied or cloned, any values that need to be set, and any fields that need to be ignored. You can also bulk copy and bulk clone to reduce the load that creation places on the database.

Tech stack

I used Python and Django because that is what our project used. I mention both because the utility was monkey-patched onto Django's model class and internally used some of the methods defined on the model and field classes.

The process of building Generic Cloning Utility

The first step in writing the cloning utility was identifying how each type of field available in Django can be copied from one model object to another. Some fields can simply be copied using setattr, but others, such as ManyToManyField and TreeForeignKey, required some processing before they could be copied.

Next, I looked at how I could customize the values during copying, i.e., how I could provide some other value for a field instead of the source object's value. I also added the ability to ignore some fields while copying, in which case the default value defined on the model is used.
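The copy rules described above (copy everything, copy only some fields, ignore some fields, or override a value) can be sketched roughly as follows. This is only an illustration, not the utility's actual code: Article, copy_fields, only, ignore, and overrides are hypothetical names, and a dataclass stands in for a Django model so the example runs without a database. A Django version would presumably read the field names from the model's _meta.get_fields() instead.

```python
from dataclasses import dataclass, fields as dataclass_fields


@dataclass
class Article:
    # Hypothetical stand-in for a Django model; the defaults play the role
    # of the default values defined on the model's fields.
    title: str = ""
    body: str = ""
    status: str = "draft"
    views: int = 0


def copy_fields(source, target, only=None, ignore=(), overrides=None):
    """Copy field values from source onto target and return target.

    only      -- if given, copy just these field names
    ignore    -- field names to skip, so target keeps its default value
    overrides -- mapping of field name to a value used instead of the source's
    """
    overrides = overrides or {}
    for field in dataclass_fields(target):
        name = field.name
        if name in overrides:
            # An explicit value always wins over the source value.
            setattr(target, name, overrides[name])
        elif name in ignore or (only is not None and name not in only):
            # Skipped fields keep whatever default the target already has.
            continue
        elif hasattr(source, name):
            setattr(target, name, getattr(source, name))
    return target


original = Article(title="Hello", body="Some text", status="published", views=120)

# Clone everything except the view counter, and reset the status.
clone = copy_fields(
    original,
    Article(),
    ignore=("views",),
    overrides={"status": "draft"},
)

# Copy only the title; every other field keeps its default.
partial = copy_fields(original, Article(), only=("title",))
```

Checking overrides first means a caller can set a field even when it is also listed in ignore, which is one possible design choice among several.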

After getting copying of all fields working, I looked at optimizing the queries that were being made to the database. I coupled this with the bulk copying/cloning action.

I did this because the usual way of copying most fields wasn't database friendly or did not work with bulk_create. For example, the most common way of adding a file to a FileField, using a ContentFile object, required a database call exclusively for that field.
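The idea behind bulk cloning can be sketched as: build every copy in memory first, then hand the whole list to a single create call. The example below is an assumption-laden illustration rather than the utility's code. Task, FakeManager, and bulk_clone are hypothetical names, and FakeManager only counts calls in place of a real database. In Django the single call would be the model manager's bulk_create.

```python
from dataclasses import dataclass, replace


@dataclass
class Task:
    # Hypothetical stand-in for a Django model row.
    name: str
    project_id: int


class FakeManager:
    """Counts write calls in place of a real database (illustration only)."""

    def __init__(self):
        self.rows = []
        self.calls = 0

    def bulk_create(self, objs):
        self.calls += 1
        self.rows.extend(objs)
        return objs


def bulk_clone(sources, manager, **overrides):
    # Build all copies in memory, applying the same overrides to each one.
    copies = [replace(source, **overrides) for source in sources]
    # One call for the whole batch instead of one call per object.
    return manager.bulk_create(copies)


manager = FakeManager()
tasks = [Task("design", 1), Task("build", 1), Task("ship", 1)]
cloned = bulk_clone(tasks, manager, project_id=2)
```

This only works when every field value can be prepared without touching the database, which is why fields that need their own save step get in the way of bulk creation.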

This approach had another problem: the file was first loaded into memory on the server. I solved this by making a copy API call to Amazon S3 (which is what we used to store files) so that the file is copied within S3 and only the new file path is returned. Once we have the path of the new file, we can treat the FileField like any ordinary CharField.
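A server-side S3 copy might look roughly like the sketch below. It assumes a boto3 S3 client, whose copy_object call takes Bucket, Key, and CopySource arguments. The helper name copy_s3_file, the bucket name, and the keys are made up for illustration, and RecordingClient is a stand-in so the example runs without AWS credentials. The post does not say which client library was actually used.

```python
def copy_s3_file(client, bucket, source_key, dest_key):
    """Ask S3 to copy an object within a bucket and return the new key.

    The file contents never pass through this process; only the key of
    the new object comes back.
    """
    client.copy_object(
        Bucket=bucket,
        Key=dest_key,
        CopySource={"Bucket": bucket, "Key": source_key},
    )
    return dest_key


class RecordingClient:
    # Minimal stand-in for boto3.client("s3") so the sketch runs offline.
    def __init__(self):
        self.calls = []

    def copy_object(self, **kwargs):
        self.calls.append(kwargs)
        return {}


client = RecordingClient()
new_path = copy_s3_file(
    client,
    "my-bucket",
    "uploads/report.pdf",
    "uploads/report-copy.pdf",
)
```

The returned key is a plain string, so it can be assigned to the new object's file field like any other text value before a bulk create.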

Lastly, when the utility was ready, I monkey-patched the clone and copy methods onto the Django model class so that they are available on all models in the project.
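Monkey-patching a method onto a base class can be sketched as below. To keep the example runnable on its own, the Model class here is a hypothetical stand-in for django.db.models.Model, and Invoice and this simplified clone function are invented for illustration. The real utility's clone does considerably more.

```python
import copy


class Model:
    # Hypothetical stand-in for django.db.models.Model.
    def __init__(self, **values):
        self.pk = None
        for name, value in values.items():
            setattr(self, name, value)


class Invoice(Model):
    pass


def clone(self, **overrides):
    """Return an unsaved copy of this object with optional overrides."""
    duplicate = copy.copy(self)
    # A clone represents a new row, so it must not reuse the primary key.
    duplicate.pk = None
    for name, value in overrides.items():
        setattr(duplicate, name, value)
    return duplicate


# Attaching the function to the base class makes it available on every
# subclass, including ones defined before the patch ran.
Model.clone = clone

invoice = Invoice(number="INV-1", total=250)
invoice.pk = 7
duplicate = invoice.clone(number="INV-2")
```

Because Python looks methods up on the class at call time, Invoice picks up clone even though it was defined before the assignment.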

Challenges I faced

The biggest challenge I faced was copying fields like FileField, ManyToManyField, and TreeForeignKey. Writing the copying action for these was not complicated, but minimizing the number of database queries made it a lot more difficult. I had to spend a lot of time brainstorming and reading through the code of these fields to optimize the queries.

Key learnings

The biggest lesson from this was to always have an idea of all of the use cases for your project. If I had not spent time understanding and listing all of the use cases, I would have had to keep adding to the utility every time something new came up. I believe the more changes you make to something that was intended to be generic, the greater the chance of introducing bugs.

Tips and advice

My advice is to spend ample time with pen and paper designing what you are going to implement, working out the use cases, and designing test cases for the project. Whenever we start something new, we want to jump straight into coding, but having some written work first makes the coding smoother and more fun.

Final thoughts and next steps

I won't be working on it any time soon because the Cloning Utility is complete and can handle cloning of all kinds of fields available.

Discover and read more posts from Apoorva Somani