
How and why I built a list of every domain and IP address on the internet

Published Nov 01, 2019

About me

I'm a web developer who started in the mid-1990s and have been learning, growing, and loving it every day since!

The problem I wanted to solve

I wanted to build a list of every domain name on the internet and its associated IP address. Originally, the reason was to create a tool for checking whether a domain name was registered. Since then, I've discovered a number of other exciting uses for the project.

What is this project about?

The project is an asynchronous DNS resolver that stores over 255 million domain names and their related IP addresses.
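
To make the idea of an asynchronous resolver concrete, here is a minimal sketch in Python using asyncio. It is not the project's actual code (which runs on Node.js). The function names resolve_ipv4 and resolve_all, the concurrency limit of 100, and the IPv4-only lookups are all assumptions made for illustration.

```python
import asyncio
import socket


async def resolve_ipv4(domain):
    """Resolve one domain to a sorted list of IPv4 addresses ([] on failure)."""
    loop = asyncio.get_running_loop()
    try:
        infos = await loop.getaddrinfo(
            domain, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except OSError:  # socket.gaierror (name not found, etc.) is a subclass of OSError
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


async def resolve_all(domains, resolver=resolve_ipv4, concurrency=100):
    """Resolve many domains concurrently, with at most `concurrency` lookups in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(domain):
        async with semaphore:
            return domain, await resolver(domain)

    pairs = await asyncio.gather(*(bounded(d) for d in domains))
    return dict(pairs)


# Example usage (performs real DNS lookups):
# results = asyncio.run(resolve_all(["example.com", "example.org"]))
```

Note that loop.getaddrinfo hands each lookup to a thread pool, so a resolver working through hundreds of millions of names would likely need a dedicated asynchronous DNS library, and would feed domains in batches rather than all at once.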

Tech stack

I originally used Python as the backend programming language, but I later switched to Node.js because asynchronous programming is much nicer to work with in JavaScript.

For a database, I'm currently using MySQL, with some configuration modifications to speed up access times on such a large dataset.

The system runs on a Dell R710 rackmount server with 64 GB of RAM and dual 4-core Xeon processors.

The process of building a list of every domain name on the internet

The first step was gathering domain names. Next, I had to design a database schema. From there, I wrote the code to retrieve results from the database, look up IP addresses, and perform other network functions to see whether a domain name is being used.
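
The steps above can be sketched roughly as follows. This is a hypothetical illustration, not the project's real schema or code: the table and column names are invented, SQLite (from Python's standard library) stands in for MySQL so the example runs anywhere, and "in use" is simplified to "at least one IP address was recorded", which misses registered domains that have no address record.

```python
import sqlite3

# Hypothetical schema: one row per domain, one row per (domain, IP) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS domains (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    checked_at TEXT
);
CREATE TABLE IF NOT EXISTS addresses (
    domain_id INTEGER NOT NULL REFERENCES domains(id),
    ip        TEXT NOT NULL,
    PRIMARY KEY (domain_id, ip)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_domains(conn, names):
    """Insert gathered domain names, skipping ones already present."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO domains (name) VALUES (?)",  # SQLite syntax
            [(n.lower(),) for n in names],
        )


def unchecked_batch(conn, limit=1000):
    """Return up to `limit` domain names that have not been looked up yet."""
    rows = conn.execute(
        "SELECT name FROM domains WHERE checked_at IS NULL ORDER BY id LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]


def record_lookup(conn, name, ips):
    """Store the IPs found for a domain and mark the domain as checked."""
    with conn:
        (domain_id,) = conn.execute(
            "SELECT id FROM domains WHERE name = ?", (name,)
        ).fetchone()
        conn.executemany(
            "INSERT OR IGNORE INTO addresses (domain_id, ip) VALUES (?, ?)",
            [(domain_id, ip) for ip in ips],
        )
        conn.execute(
            "UPDATE domains SET checked_at = datetime('now') WHERE id = ?",
            (domain_id,),
        )


def is_in_use(conn, name):
    """Simplified check: a domain counts as 'in use' if any IP was recorded for it."""
    row = conn.execute(
        "SELECT 1 FROM addresses a JOIN domains d ON d.id = a.domain_id "
        "WHERE d.name = ? LIMIT 1",
        (name,),
    ).fetchone()
    return row is not None
```

A worker loop would call unchecked_batch, resolve each name, and pass the results to record_lookup. Processing in batches keeps memory use flat no matter how many rows the table holds.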

Challenges I faced

When working with a database of over 255 million rows, changes to the schema can take a long time, as can backups, restores, and other operations. However, overcoming interesting big-data challenges is one of the things I love about the project.

Key learnings

I would have started with Node.js rather than Python. While Python is a great language, and I prefer it to JavaScript most of the time, asynchronous development is much simpler in Node.js/JavaScript.

Tips and advice

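When backing up very large databases, it is quicker to copy the MySQL database files directly from their filesystem location than to use database tools. This can save days of processing time. The server should be stopped (or the tables locked and flushed) while the files are copied, so that the copy is consistent.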
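
As a rough illustration of this tip, the sketch below copies a data directory into a timestamped backup folder. The function name copy_data_dir and the paths in the usage comment are assumptions (/var/lib/mysql is a common default on Linux, but your installation may differ), and the code assumes the database server has already been stopped.

```python
import shutil
import time
from pathlib import Path


def copy_data_dir(data_dir, backup_root):
    """Copy a database data directory into a new timestamped folder under backup_root.

    Assumes the database server is stopped (or its tables are locked and flushed);
    otherwise the copied files may be inconsistent.
    """
    source = Path(data_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"data directory not found: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree raises if target already exists, so an earlier backup is never overwritten.
    shutil.copytree(source, target)
    return target


# Example usage (hypothetical paths):
# copy_data_dir("/var/lib/mysql", "/mnt/backups")
```

Restoring is the reverse: with the server stopped, copy the saved directory back into place and make sure file ownership and permissions match what the server expects.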

Final thoughts and next steps

My next steps are to build out a number of business opportunities I have discovered while building this dataset and system.

Discover and read more posts from Ken Smith
Ken Smith
4 years ago

I don’t plan to publish all the code as open source at this point, as I’d like to make a business out of it as well. I plan to build a number of different tools, such as an API for checking whether a domain exists.

Shyam Makwana
5 years ago

Are you going to publish its code to the OSS community? Or have you hosted it somewhere to check domain existence?
