Linux – fast file system search with locate and updatedb
Typically, the find command is the most commonly used search utility in Linux. GNU find searches the directory tree rooted at each given starting point by evaluating the given expression from left to right, according to the rules of precedence. Because it walks the live file system on every run, it can be slow on large trees.
There is a faster alternative for finding files and directories by name, though: the locate command. It goes hand in hand with the updatedb utility, which maintains an indexed database of the files on your system. The locate tool simply reads the database created by updatedb, so its results are only as fresh as the last updatedb run.
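To make the index-then-search idea concrete, here is a toy sketch that mimics it with plain find and grep. It is only an illustration: the real mlocate database is a binary format, and the directory layout and variable names below are made up for the demo.

```shell
#!/bin/sh
# Toy illustration only: mimic "index once, search many times" with find and grep.
# The tree below is a throwaway sample, not a real system layout.
demo_root=$(mktemp -d)
index_file=$(mktemp)
mkdir -p "$demo_root/etc/cron.d" "$demo_root/home/user"
touch "$demo_root/etc/crontab" "$demo_root/home/user/notes.txt"

# "updatedb" step: walk the tree once and store every path in the index.
(cd "$demo_root" && find .) > "$index_file"

# "locate" step: answer the query from the stored list, without touching the tree.
matches=$(grep -c 'cron' "$index_file")
echo "$matches"        # prints 2 (./etc/cron.d and ./etc/crontab)

rm -rf "$demo_root" "$index_file"
```

The expensive tree walk happens once, and every later search is just a scan of the stored list. That is also why a file created after the index was built will not show up until the index is rebuilt.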
Installation –
sudo apt-get -y install mlocate   # Debian/Ubuntu
sudo yum -y install mlocate       # CentOS/Red Hat
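If you are not sure which of the two commands applies to a machine, a small check like the following can pick one. This is a minimal sketch: the function name detect_pkg_manager is my own, and it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: report which of the two package managers is available.
detect_pkg_manager() {
    if command -v apt-get >/dev/null 2>&1; then
        echo apt-get
    elif command -v yum >/dev/null 2>&1; then
        echo yum
    else
        echo unknown
    fi
}

pkg_manager=$(detect_pkg_manager)
if [ "$pkg_manager" = unknown ]; then
    echo "neither apt-get nor yum found; install mlocate with your distribution's tool"
else
    # Print the command rather than running it, so nothing is installed by accident.
    echo "sudo $pkg_manager -y install mlocate"
fi
```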
updatedb usually runs from a daily cron job that refreshes the default database (‘/var/lib/mlocate/mlocate.db’). To refresh the database yourself, run the ‘updatedb’ command as root or with sudo. How long that takes depends on the number of files on your system and on how much has changed since updatedb last ran.
First time – to build or update the default database, run one of the commands below, depending on your requirements. Most likely, the first and/or third command is what you need.
updatedb
updatedb -U /some/path   # index only a specific directory tree that you search frequently; without -o this replaces the default database
updatedb -v              # verbose mode
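Before running updatedb by hand, it can be useful to check whether the index is actually out of date. The sketch below assumes the default database path mentioned above and uses a function name of my own (db_is_stale); it treats a missing database, or one older than a day, as stale.

```shell
#!/bin/sh
# Hypothetical helper: succeed (return 0) if the database is missing or older than 24 hours.
db_is_stale() {
    # A missing (or inaccessible) file counts as stale.
    [ -e "$1" ] || return 0
    # find prints the path only if the file was last modified at least a day ago.
    [ -n "$(find "$1" -mtime +0 2>/dev/null)" ]
}

# Assumed default path; the directory is often readable only by root,
# so run this as root for an accurate answer.
if db_is_stale /var/lib/mlocate/mlocate.db; then
    echo "index missing or older than a day; consider running: sudo updatedb"
else
    echo "index was refreshed within the last day"
fi
```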
Time to search
The locate command searches for entries in an mlocate database.
Some examples –
locate cron                             # any file or directory with cron in its name
locate -i cron                          # case insensitive
locate -c cron                          # only print the number of matching entries
locate -r 'cron$'                       # regex: only files or directories with names ending in cron
locate -r '/usr/.*ipaddress.*whl$'      # regex, e.g. /usr/share/python-wheels/ipaddress-0.0.0-py2.py3-none-any.whl
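The regex examples are easier to trust once you have seen what they match. Since locate -r applies a basic regular expression to the whole path, grep can serve as a rough stand-in for trying a pattern on a few sample paths first. The /etc and /usr/sbin paths below are invented samples; only the wheel path comes from the example above.

```shell
#!/bin/sh
# Sample paths for trying out patterns; all but the last are made-up examples.
sample_paths='/etc/cron
/etc/cron.d
/etc/crontab
/usr/sbin/cron
/usr/share/python-wheels/ipaddress-0.0.0-py2.py3-none-any.whl'

# Same pattern as: locate -r 'cron$'
ends_in_cron=$(printf '%s\n' "$sample_paths" | grep -c 'cron$')
echo "$ends_in_cron"   # prints 2 (/etc/cron and /usr/sbin/cron)

# Same pattern as: locate -r '/usr/.*ipaddress.*whl$'
wheel_match=$(printf '%s\n' "$sample_paths" | grep '/usr/.*ipaddress.*whl$')
echo "$wheel_match"
```

Note that /etc/cron.d and /etc/crontab are not matched by 'cron$', because the pattern is anchored to the end of the path.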
locate can also print statistics about a database: the number of directories and files it holds, and the space used by the file names and by the database itself.
root@cloudclient:/tmp# locate -S
Database /var/lib/mlocate/mlocate.db:
    28,339 directories
    185,661 files
    11,616,040 bytes in file names
    4,481,938 bytes used to store database
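If you want these numbers in a script, the output is simple enough to parse with awk. The sketch below runs against a copy of the sample output above rather than a live database; the function name stat_count is my own, and it assumes the English output layout with comma thousands separators shown here.

```shell
#!/bin/sh
# Copy of the sample "locate -S" output shown above.
sample_stats='Database /var/lib/mlocate/mlocate.db:
    28,339 directories
    185,661 files
    11,616,040 bytes in file names
    4,481,938 bytes used to store database'

# Hypothetical helper: read stats on stdin and print the count for
# "directories" or "files" with the thousands separators removed.
stat_count() {
    awk -v kind="$1" '$2 == kind { gsub(/,/, "", $1); print $1 }'
}

dirs=$(printf '%s\n' "$sample_stats" | stat_count directories)
files=$(printf '%s\n' "$sample_stats" | stat_count files)
echo "$dirs directories, $files files"   # prints: 28339 directories, 185661 files
```

On a real system the same function could be fed directly, for example with locate -S | stat_count files.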
Customizing updatedb
updatedb can write the search database to a file other than the default one (-o), and it can index a directory tree other than the default root (-U). We can then tell locate to use the custom database with -d.
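In the example below, I index the files under my home directory into the /tmp/home.db database and then run locate against this custom database. As you can see, the number of files and directories is far lower, so the database is smaller and searches are faster, since only that one directory tree is indexed. If you run updatedb as a non-root user, the mlocate man page's recipe for a private database adds -l 0 to these options.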
$ updatedb -U ~ -o /tmp/home.db
$ locate -d /tmp/home.db cron
$ locate -d /tmp/home.db -S
Database /tmp/home.db:
    3,530 directories
    29,943 files
    2,635,675 bytes in file names
    762,621 bytes used to store database
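The two steps above can be wrapped in a pair of small shell functions so that a private index is easy to rebuild and query. This is a sketch under assumptions: the function names are my own, the paths in the usage note are placeholders, and -l 0 is included so that the database can be created by a non-root user.

```shell
#!/bin/sh
# Hypothetical helpers for a private, per-directory locate database.

# build_private_db ROOT DBFILE: index only the tree under ROOT into DBFILE.
build_private_db() {
    # -l 0 disables the visibility requirement, which lets a regular user create the database.
    updatedb -l 0 -U "$1" -o "$2"
}

# search_private_db DBFILE PATTERN: query the custom database instead of the default one.
search_private_db() {
    locate -d "$1" "$2"
}
```

For example, build_private_db "$HOME" /tmp/home.db followed by search_private_db /tmp/home.db cron reproduces the session above. Keep in mind that anyone who can read the database file can list every path it contains.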