James Haver

Coding for 10 years, with 5 years of functional programming focusing on full-stack web and testing. Prior to that Machine Translation focusing on Chinese, Japanese, Korean and English. Interested in gaining experience as a tutor.

Finding Files on the Command Line

Published Oct 20, 2019. Last updated Apr 16, 2020.

When I first started programming, I used a GUI program to find files. I thought
it was a great tool. It even supported regular expressions, but one day it
started crashing and I could no longer depend on it. Then I started using a
combination of Unix command line tools and they became my first choice for
file search. Searching directly in IDEs and Emacs is still useful.

We will look at how to use find and grep to find files.

Find files by name

Open your command line program and go to a directory with not too many files.
Run the following command.

find .

This will recursively print out all of the files (directories, hidden files,
regular files, symbolic links, etc.) in the current directory and its
subdirectories. They will be shown as relative paths. With GNU find, running
find with no arguments does the same thing; the BSD find that ships with macOS
requires a path.

We can change it to return the contents of the current directory with the
absolute path.

find "$(pwd)"

"$(...)" is command substitution: the shell runs the command inside and
substitutes its output as a string. pwd prints the present working directory,
so find prints every file it finds with an absolute path.

Now let's find only files that have the .js extension.

find . -iname "*.js"

-iname matches the file name against the given pattern, ignoring case.
The case sensitive version is -name. * is a wildcard that matches 0 or more
characters, so "*.js" matches any name that ends with .js. The quotes keep the
shell from expanding the pattern before find sees it. This is a very common
pattern in find.
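To see the difference between -name and -iname, here is a small sketch that runs both in a throwaway directory. The file names (app.js, LEGACY.JS, notes.txt) are made up for the demo and are not from the article.

```shell
# Build a scratch directory so the real file system is untouched.
demo=$(mktemp -d)
touch "$demo/app.js" "$demo/LEGACY.JS" "$demo/notes.txt"

# -name is case sensitive: only app.js matches "*.js".
sensitive=$(cd "$demo" && find . -name "*.js" | sort)

# -iname ignores case: LEGACY.JS matches as well.
insensitive=$(cd "$demo" && find . -iname "*.js" | sort)

echo "case sensitive:"
echo "$sensitive"
echo "case insensitive:"
echo "$insensitive"

# Clean up the scratch directory.
rm -rf "$demo"
```

The first list should contain one path and the second two, since only -iname treats .JS and .js as the same.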

What if we want to find a file that starts with i, ends with .js, and
may have other characters in between?

find . -iname "i*.js"

Finally, we can also look for words in the middle of the file name.

find . -iname "*index*"

This does not restrict the file to any particular extension. What would the
query look like if we wanted to find something that has index somewhere in the
name and ends with the extension .js? Also try finding files that end in another
extension in your directory.
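One possible answer to the exercise, sketched in a scratch directory: put a wildcard on both sides of index and keep .js at the end. The directory layout and file names are invented for the demo.

```shell
# Scratch directory with a few names that do and do not qualify.
demo=$(mktemp -d)
mkdir -p "$demo/src"
touch "$demo/index.js" "$demo/src/reindex-all.js" \
      "$demo/index.html" "$demo/src/util.js"

# "index" anywhere in the name, ".js" at the very end.
matches=$(cd "$demo" && find . -iname "*index*.js" | sort)
echo "$matches"

# Clean up the scratch directory.
rm -rf "$demo"
```

index.html is skipped because the pattern must match the whole file name, and that name does not end in .js.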

Find files by content

grep searches plain text with regular expressions. It performs a global search
and prints all lines that match. It is a great tool for searching the contents
of one or more files. A simple search on a single file follows the pattern
grep "regexp" /file/path. Here is an example.

grep "Hello" main.js

This returns every line from main.js that contains the string Hello, with
exactly that capitalization.

We can make it case insensitive by adding -i.

grep -i "Hello" main.js

If we want it to print out the line number, add the -n flag.

grep -n "Hello" main.js

We can search a directory recursively by adding the -r flag. It will print
out the filename where it was found. Add the -n flag as well if you want the
line number.

grep -r "div" src/
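As a quick check of what -r and -n print together, the sketch below builds a small src/ tree and searches it. The file names and contents are assumptions made for the demo.

```shell
# Scratch project with two files that mention "div" and one that does not.
demo=$(mktemp -d)
mkdir -p "$demo/src/components"
printf '<div>home</div>\n' > "$demo/src/index.html"
printf 'const a = 1;\nconst el = "div";\n' > "$demo/src/components/box.js"
printf 'nothing here\n' > "$demo/src/readme.txt"

# -r walks src/ recursively; -n adds the line number after the file name.
hits=$(cd "$demo" && grep -rn "div" src/ | sort)
echo "$hits"

# Clean up the scratch directory.
rm -rf "$demo"
```

Each output line has the form file:line:text, which is handy for jumping straight to the match in an editor.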

We can even restrict the files we search by extension.

grep -r --include="*.js" "div" src

And we can restrict the search to multiple file extensions. Here bash brace
expansion turns the single option into one --include per extension.

grep -r --include=*.{css,html,js} "div" src
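The brace form depends on the shell, not on grep. The sketch below shows the spelled-out equivalent, which should work in any POSIX shell, and then asks bash what it expands the brace form into. The files are invented for the demo, and bash is assumed to be installed.

```shell
# Scratch project: two files we want searched, one we do not.
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf '<div></div>\n' > "$demo/src/page.html"
printf 'div { margin: 0; }\n' > "$demo/src/site.css"
printf 'div is mentioned here\n' > "$demo/src/notes.txt"

# Spelled out: one --include per extension, no brace expansion needed.
# -l lists the matching file names instead of the matching lines.
explicit=$(cd "$demo" && grep -rl --include="*.css" --include="*.html" \
  --include="*.js" "div" src | sort)

# What bash turns the brace form into before grep ever runs.
expanded=$(cd "$demo" && bash -c 'echo --include=*.{css,html,js}')

echo "$explicit"
echo "$expanded"

# Clean up the scratch directory.
rm -rf "$demo"
```

notes.txt contains the search term but is skipped because its extension is not in the include list.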

Combining find and grep

There is some overlap between find and grep when filtering by file name, but
I like to use find to filter the file names, then pass the results through the
pipe | operator to grep to filter the file contents. It is a nice way to
separate concerns even though it might not be necessary.

First example: find all the JavaScript files that have "hello" in them.

find . -iname "*.js" | xargs grep -i "hello"

What is xargs? It reads the file paths from its standard input and passes them
to grep as arguments, so grep opens and searches each file. Without xargs,
grep would search the text of the piped file names instead of the file contents.
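One caveat with plain xargs is that it splits its input on whitespace, so a path that contains a space is passed as two arguments. A common workaround, sketched here with made-up file names, is to separate the paths with NUL bytes using -print0 and xargs -0, or to let find call grep itself with -exec.

```shell
# Scratch directory with a file name that contains a space.
demo=$(mktemp -d)
printf 'console.log("Hello");\n' > "$demo/my app.js"
printf 'console.log("bye");\n' > "$demo/other.js"

# -print0 and -0 delimit paths with NUL bytes, so the space survives.
# -l makes grep list matching file names instead of matching lines.
with_xargs=$(cd "$demo" && find . -iname "*.js" -print0 | xargs -0 grep -il "hello")

# Same search without xargs: find passes the paths to grep directly.
with_exec=$(cd "$demo" && find . -iname "*.js" -exec grep -il "hello" {} +)

echo "$with_xargs"
echo "$with_exec"

# Clean up the scratch directory.
rm -rf "$demo"
```

Both commands should report the same single file; which form to prefer is mostly a matter of taste.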

More find commands

find can exclude hidden files.

find . -not -path "*/\.*"
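To check what that pattern filters out, here is a sketch against a scratch directory holding a hidden directory, a hidden file at the top level, and a hidden file in a subdirectory. All names are invented for the demo.

```shell
# Scratch directory with hidden and visible entries at two levels.
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/src"
touch "$demo/.git/config" "$demo/.env" "$demo/src/main.js" "$demo/src/.cache"

# Drop every path that has a component starting with a dot.
visible=$(cd "$demo" && find . -not -path "*/\.*" | LC_ALL=C sort)
echo "$visible"

# Clean up the scratch directory.
rm -rf "$demo"
```

The starting point . is still printed because it contains no slash. Note that this form only filters the output: find still walks into hidden directories such as .git before discarding their contents.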

find can also include multiple search queries. However, this is a bit more
verbose than --include=*.{html,js} from grep. -o (logical OR) is used to
connect the queries.

find . -name "*.html" -o -name "*.js"

find can exclude directories from your search. Unfortunately this option is
very verbose: for each directory you have to add -path "./exclude/path" -prune,
connect it with -o, and make sure the query ends in -print.

Search current directory excluding ./node_modules.

find . -path "./node_modules" -prune -o -print

Search current directory excluding ./node_modules and ./lib.

find . -path "./node_modules" -prune -o -path "./lib" -prune -o -print
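The sketch below runs the two-directory exclusion against a scratch tree to show which paths survive. The directory contents are assumptions made for the demo.

```shell
# Scratch project with two directories to skip and one to keep.
demo=$(mktemp -d)
mkdir -p "$demo/node_modules/pkg" "$demo/lib" "$demo/src"
touch "$demo/node_modules/pkg/index.js" "$demo/lib/vendor.js" "$demo/src/main.js"

# Each -prune stops find from descending into the matched directory;
# everything else falls through to -print.
kept=$(cd "$demo" && find . -path "./node_modules" -prune \
  -o -path "./lib" -prune -o -print | LC_ALL=C sort)
echo "$kept"

# Clean up the scratch directory.
rm -rf "$demo"
```

Neither the pruned directories nor anything inside them appear in the output, and find never descends into them.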

There is a way that is simpler to remember, combining find with grep, but it
performs worse because find still searches all the files and then grep removes
the unwanted ones.

find . -name "*.js" | grep -v "node_modules"

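Finally we can combine a couple of the things we learned here. Exclude
node_modules and search for files that have the .html or .js extension. The
-prune clause goes first, and the two -name tests are grouped in escaped
parentheses because -o binds more loosely than the implicit "and" between
adjacent tests.

find . -path "./node_modules" -prune -o \( -name "*.html" -o -name "*.js" \) -print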
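Operator precedence is easy to get wrong when -o and -prune are mixed, so it is worth testing the query on a scratch tree. This sketch puts the -prune clause first and groups the two -name tests in escaped parentheses, since -o binds more loosely than the implied "and" between adjacent tests. The layout and file names are made up for the demo.

```shell
# Scratch project: one dependency file to skip, three source files.
demo=$(mktemp -d)
mkdir -p "$demo/node_modules/pkg" "$demo/src"
touch "$demo/node_modules/pkg/index.js" \
      "$demo/src/main.js" "$demo/src/page.html" "$demo/src/style.css"

# Prune node_modules, then print only names ending in .html or .js.
found=$(cd "$demo" && find . -path "./node_modules" -prune \
  -o \( -name "*.html" -o -name "*.js" \) -print | LC_ALL=C sort)
echo "$found"

# Clean up the scratch directory.
rm -rf "$demo"
```

Only the two files under src with the wanted extensions should be listed; style.css and everything under node_modules are left out.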

Learn more

I suggest you try some queries on your own and write them down. The more you
practice the easier it will be to recall the syntax, but if you forget, you can
take a look at what you have written down previously.

The reverse-i-search (CTRL-r) is a
great way to build searches incrementally by making a small search, running it,
then recalling it from your shell history and refining it.

If there are some search queries you use repeatedly you can create a bash alias
to save time.
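As a sketch of what that could look like, the lines below define one alias and one shell function. The names jsfiles and grepjs are hypothetical; a function is used for the second one because an alias cannot take the search term and directory as arguments. To keep them between sessions they would go in a startup file such as ~/.bashrc.

```shell
# Alias for a fixed query: list JavaScript files under the current directory.
alias jsfiles='find . -iname "*.js"'

# Function for a query with arguments.
# Usage: grepjs PATTERN [DIR]
# Searches .js files under DIR (default: .) and skips node_modules.
grepjs() {
  find "${2:-.}" -path "*/node_modules" -prune \
    -o -iname "*.js" -exec grep -inH "$1" {} +
}

# Try the function on a scratch project (made-up files).
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/node_modules/pkg"
printf 'sayHello();\n' > "$demo/src/main.js"
printf 'hello from a dependency\n' > "$demo/node_modules/pkg/index.js"

out=$(grepjs "hello" "$demo")
echo "$out"

# Clean up the scratch directory.
rm -rf "$demo"
```

The -H flag makes grep print the file name even when only one file is searched.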

find and grep are easy to get started with, but have a lot of options. To
learn more check out their man pages, Stack Overflow, or the DigitalOcean
tutorials.

Other tools

Here are some other search tools that are worth looking at.


1 Reply

Chris

6 years ago

Cool usage examples. If you can install rg and fd, give those a try. I’d suggest installing those via cargo.

I’ve been really enjoying using rg instead of grep, and NO, ripgrep does not mean RIP grep, even though you could make a solid case for it.