A mini-tutorial on the Unix/Linux find command

14 April, 2007

Locating Files:

The find command is used to locate files on a Unix or Linux system.  find will search any set of directories you specify for files that match the supplied search criteria.  You can search for files by name, owner, group, type, permissions, date, and other criteria.  The search is recursive in that it will search all subdirectories too.  The syntax looks like this:

find where-to-look criteria what-to-do

All arguments to find are optional, and there are defaults for all parts.  (This may depend on which version of find is used.  Here we discuss the freely available GNU version of find, which is the version available on YborStudent.)  For example where-to-look defaults to . (that is, the current working directory), criteria defaults to none (that is, show all files), and what-to-do (known as the find action) defaults to -print (that is, display found files to standard output).

For example:

find

will display all files in the current directory and all subdirectories.  The commands

find . -print
find .

do exactly the same thing.  Here’s an example find command using a search criterion and the default action:

find / -name foo

will search the whole system for any files named foo and display them.  Here we are using the criterion -name with the argument foo to tell find to perform a name search for the filename foo.  The output might look like this:

/home/wpollock/foo
/home/ua02/foo
/tmp/foo

If find doesn’t locate any matching files, it produces no output.

The above example said to search the whole system, by specifying the root directory (“/”) to search.  If you don’t run this command as root, find will display an error message for each directory on which you don’t have read permission.  This can be a lot of messages, and the matching files that are found may scroll right off your screen.  A good way to deal with this problem is to redirect the error messages so you don’t have to see them at all:

find / -name foo 2>/dev/null
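
The effect of the redirection can be tried safely on a scratch directory instead of the whole system.  This is only a sketch: the directory layout and the names demo, open and locked are made up for illustration, and when run as root no error message appears in the first place.

```shell
# Scratch tree with one unreadable directory (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/open" "$demo/locked"
touch "$demo/open/foo"
chmod 000 "$demo/locked"

# Error messages go to /dev/null; matching names still arrive on standard output.
found=$(find "$demo" -name foo 2>/dev/null)
echo "$found"

# Restore access so the scratch tree can be removed.
chmod 700 "$demo/locked"
rm -rf "$demo"
```

Only standard error is discarded, so the list of matches can still be piped or saved as usual.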

Other Features And Applications:

The “-print” action writes each found file name on its own line, with no quoting.  This can lead to a problem if any found files contain spaces (or newlines) in their names, because commands such as xargs split their input on any whitespace.  In such cases, when the output of find contains a file name such as “foo bar” and is piped into such a command, that command “sees” two file names, not one file name containing a space.

In such cases you can specify the action “-print0” instead, which terminates each found file name not with a newline, but with a null character (which is not a legal character in Unix or Linux file names).  Of course the command that reads the output of find must be able to handle such a list of file names.  Many commands commonly used with find (such as xargs, tar, or cpio) have special options to read in file names separated with nulls instead of whitespace.
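
The difference is easy to see by counting how many arguments the receiving command gets.  A minimal sketch, assuming an xargs that supports the -0 option (GNU and BSD versions do); the scratch directory and file names are invented for the demonstration:

```shell
# Scratch directory holding one file whose name contains a space.
demo=$(mktemp -d)
touch "$demo/foo bar" "$demo/baz"

# Plain output: xargs splits on whitespace, so "foo bar" arrives as two
# separate arguments and the count comes out too high.
plain=$(find "$demo" -type f -print | xargs sh -c 'echo $#' sh)

# Null-terminated output read with xargs -0: each name stays in one piece.
safe=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo $#' sh)

echo "with -print: $plain arguments; with -print0: $safe arguments"
rm -rf "$demo"
```

With two files present, the first count is 3 and the second is 2.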

You can use shell-style wildcards in the -name search argument:

find . -name foo\*bar

This will search from the current directory down for foo*bar (that is, any filename that begins with foo and ends with bar).  Note that wildcards in the name argument must be quoted so the shell doesn’t expand them before passing them to find.  Also, unlike regular shell wildcards, these will match leading periods in filenames.  (For example “find -name \*.txt”.)
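
Here is a small sketch of a quoted wildcard at work, using a throwaway directory and made-up file names.  It assumes a find (such as current GNU find) whose -name patterns match a leading period:

```shell
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/.hidden.txt" "$demo/readme.md"

# The single quotes stop the shell from expanding *.txt, so find receives
# the pattern itself and applies it to every file name it visits.
matches=$(find "$demo" -name '*.txt' | sort)
echo "$matches"

rm -rf "$demo"
```

Both notes.txt and .hidden.txt are listed, whereas a shell expansion of *.txt would have skipped the hidden file.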

You can search for criteria other than the name, and you can list multiple search criteria.  When you have multiple criteria, any found files must match all of them; that is, there is an implied Boolean AND operator between the listed search criteria.  find also allows OR and NOT Boolean operators, as well as grouping, to combine search criteria in powerful ways (the final example in this tutorial uses them).
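
As a brief illustration of the OR and NOT operators and of grouping, here is a sketch run against a scratch directory; the file names are invented for the example:

```shell
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/a.h" "$demo/a.o" "$demo/notes.txt"

# Regular files named *.c OR *.h.  The escaped parentheses group the OR so
# that -type f applies to both alternatives.
sources=$(find "$demo" -type f \( -name '*.c' -o -name '*.h' \) | sort)

# Regular files whose names do NOT end in .o.
not_objects=$(find "$demo" -type f ! -name '*.o' | sort)

printf 'sources:\n%s\nnot objects:\n%s\n' "$sources" "$not_objects"
rm -rf "$demo"
```

The parentheses must be escaped (or quoted) because they are special to the shell.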

Here’s an example using two search criteria:

find / -type f -mtime -7 | xargs tar -rf weekly_incremental.tar
gzip weekly_incremental.tar

will find any regular files (i.e., not directories or other special files) with the criterion “-type f”, and only those modified seven or fewer days ago (“-mtime -7”).  Note the use of xargs, a handy utility that converts a stream of input (in this case the output of find) into command line arguments for the supplied command (in this case tar, used to create a backup archive).

Another use of xargs is illustrated below.  This command will efficiently remove all files named core from your system (provided you run the command as root of course):

find / -name core | xargs /bin/rm -f
find / -name core -exec /bin/rm -f {} \;   # same thing
find / -name core -delete                  # same if using GNU find

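(The -exec form runs the rm command once per file, and is not as efficient as the first form, which passes many file names to each rm invocation.  The -delete form removes the files from within find itself, without running rm at all.)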
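
Two batching variants that also cope with spaces in path names can be sketched as follows.  This is an illustration on a scratch directory, not a replacement for the commands above; the directory and file names are made up, and xargs -0 is assumed to be available:

```shell
demo=$(mktemp -d)
mkdir "$demo/sub dir"
touch "$demo/core" "$demo/sub dir/core" "$demo/keep"

# Null-terminated names keep "sub dir/core" in one piece, and xargs still
# hands many names to each rm invocation.
find "$demo" -name core -type f -print0 | xargs -0 rm -f

# The same batching without xargs: "{} +" appends many names to one rm call.
touch "$demo/core"
find "$demo" -name core -type f -exec rm -f {} +

remaining=$(find "$demo" -type f | sort)
echo "$remaining"
rm -rf "$demo"
```

Only the file named keep is left afterwards.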

One of my favorite find criteria is to locate files modified less than 10 minutes ago.  I use this right after using some system administration tool, to learn which files got changed by that tool:

find / -mmin -10

(This search is also useful when I’ve downloaded some file but can’t locate it.)

Another common use is to locate all files owned by a given user (“-user username”).  This is useful when deleting user accounts.

You can also find files with various permissions set.  “-perm /permissions” means to find files with any of the specified permissions on (older versions of GNU find wrote this as “-perm +permissions”, a form that current versions no longer accept), “-perm -permissions” means to find files with all of the specified permissions on, and “-perm permissions” means to find files with exactly permissions.  Permissions can be specified either symbolically (preferred) or with an octal number.  The following will locate files that are writable by “others”:

find . -perm /o=w
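
The three kinds of -perm test can be compared on a scratch directory.  A minimal sketch, assuming a current GNU find (the “any” test is written with the / prefix that such versions use); the file names and modes are chosen only for illustration:

```shell
demo=$(mktemp -d)
touch "$demo/private" "$demo/shared" "$demo/script"
chmod 600 "$demo/private"
chmod 666 "$demo/shared"
chmod 755 "$demo/script"

# All of the listed bits set: files writable by "others".
world_writable=$(find "$demo" -type f -perm -o=w)

# Exactly this mode, no more and no fewer bits.
exact=$(find "$demo" -type f -perm 600)

# Any of the listed bits set: files executable by user, group or others.
any_exec=$(find "$demo" -type f -perm /u=x,g=x,o=x)

printf '%s\n%s\n%s\n' "$world_writable" "$exact" "$any_exec"
rm -rf "$demo"
```

Each test selects a different one of the three files: shared, private and script, in that order.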

When using find to locate files for backups, it often pays to use the “-depth” option, which forces the output to be depth-first; that is, files first and then the directories containing them.  This helps when the directories have restrictive permissions, and restoring the directory first could prevent the files from restoring at all (and would change the time stamp on the directory in any case).
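
The change in ordering can be seen on a tiny scratch tree.  This is just a sketch with invented names:

```shell
demo=$(mktemp -d)
mkdir "$demo/dir"
touch "$demo/dir/file"

# Default order: a directory is listed before its contents.
default_first=$(find "$demo" | head -n 1)

# With -depth: contents come first, so the starting directory is listed last.
depth_first=$(find "$demo" -depth | head -n 1)
depth_last=$(find "$demo" -depth | tail -n 1)

echo "default starts with: $default_first"
echo "-depth starts with:  $depth_first"
echo "-depth ends with:    $depth_last"
rm -rf "$demo"
```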

When specifying time with find options such as -mmin (minutes) or -mtime (24 hour periods, starting from now), you can specify a number “n” to mean exactly n, “-n” to mean less than n, and “+n” to mean more than n.  For example:

find . -mtime 0            # find files modified within the past 24 hours
find . -mtime -1           # find files modified within the past 24 hours
find . -mtime 1            # find files modified between 24 and 48 hours ago
find . -mtime +1           # find files modified more than 48 hours ago
find . -mmin +5 -mmin -10  # find files modified between 6 and 9 minutes ago
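
These rules can be checked against files with known ages.  The sketch below assumes GNU touch, whose -d option accepts relative dates such as '36 hours ago'; the file names are made up:

```shell
demo=$(mktemp -d)
touch "$demo/fresh"                          # modified just now
touch -d '36 hours ago' "$demo/yesterday"    # GNU touch date syntax (assumption)
touch -d '5 days ago' "$demo/old"

recent=$(find "$demo" -type f -mtime 0)      # age under 24 hours
one_day=$(find "$demo" -type f -mtime 1)     # age between 24 and 48 hours
older=$(find "$demo" -type f -mtime +1)      # age of 48 hours or more

printf '%s\n%s\n%s\n' "$recent" "$one_day" "$older"
rm -rf "$demo"
```

find discards the fractional part of a file's age in days before comparing, which is why a 36 hour old file counts as exactly 1 day old.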

The following displays non-hidden (no leading dot) files in the current directory only (no subdirectories), with an arbitrary output format (see the man page for the dozens of possibilities with the -printf action):

find . -maxdepth 1 -name '[!.]*' -printf 'Name: %16f Size: %6s\n'

As a system administrator you can use find to locate suspicious files (e.g., world writable files, files with no valid owner and/or group, SetUID files, files with unusual permissions, sizes, names, or dates).  Here’s a final more complex example (which I save as a shell script):

find / -noleaf -path '/proc' -prune \
    -o -path '/sys' -prune \
    -o -path '/dev' -prune \
    -o -path '/windows-C-Drive' -prune \
    -o -perm -2 ! -type l ! -type s \
    ! \( -type d -perm -1000 \) -print

This says to search the whole system, skipping the directories /proc, /sys, /dev, and /windows-C-Drive (presumably a Windows partition on a dual-booted computer).  The -noleaf option tells find to not assume all remaining mounted filesystems are Unix file systems (you might have a mounted CD for instance).  The “-o” is the Boolean OR operator, and “!” is the Boolean NOT operator (it applies to the criterion that follows).  So these criteria say to locate files that are world writable (“-perm -2”) and NOT symlinks (“! -type l”) and NOT sockets (“! -type s”) and NOT directories with the sticky (or text) bit set (“! \( -type d -perm -1000 \)”).  (Symlinks, sockets and directories with the sticky bit set are often world-writable and generally not suspicious.)
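
The same logic can be tried without scanning the whole system.  The following is a scaled-down sketch on a scratch tree: the directory names skip, data and tmp are stand-ins invented for the demonstration, and -noleaf is left out because the scratch tree lives on an ordinary file system:

```shell
demo=$(mktemp -d)
mkdir "$demo/skip" "$demo/data" "$demo/tmp"
chmod 755 "$demo/skip" "$demo/data"
touch "$demo/skip/open" "$demo/data/open" "$demo/data/closed"
chmod 666 "$demo/skip/open" "$demo/data/open"
chmod 644 "$demo/data/closed"
chmod 1777 "$demo/tmp"     # sticky, world-writable directory, like /tmp

# Prune one subtree, then report world-writable entries that are not
# symlinks, sockets, or sticky directories.
suspicious=$(find "$demo" -path "$demo/skip" -prune \
    -o -perm -2 ! -type l ! -type s \
    ! \( -type d -perm -1000 \) -print)

echo "$suspicious"
rm -rf "$demo"
```

Only data/open is reported: skip/open sits in the pruned subtree, data/closed is not world-writable, and tmp is excluded as a sticky directory.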

The find command can be amazingly useful.  See the man page to learn all the criteria and options you can use.