Tuesday, July 8, 2008

Recursive search for a pattern

I wish I could use a command "grep -R", like other commands that take an -R flag, viz., rm -R, chmod -R and chgrp -R.

This is a situation most of us have come across: you know that some file contains a specific pattern, but you do not know exactly where the file is located. It gets worse when the file is about 5 levels down the path, and worse still at 10, 15 or 20 levels.

Let us walk through this in detail. We are currently at "/user/oracle/dba/tst/tst1/" as the Present Working Directory (pwd). Let us look at the files and directories here with a simple ls -ltr.


HOST : /user/oracle/dba/tst/tst1> ls -ltr
total 6
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 ./
drwxrwxr-x 3 oracle oracle 512 Mar 29 20:46 tst2/
drwxrwxr-x 3 oracle oracle 512 Mar 29 20:48 ../


Let us now change to tst2 and do an ls -ltr.


HOST : /user/oracle/dba/tst/tst1> cd tst2
HOST : /user/oracle/dba/tst/tst1/tst2> ls -ltr
total 8
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 ../
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 tst3/
drwxrwxr-x 3 oracle oracle 512 Mar 29 20:46 ./
-rw-rw-r-- 1 oracle oracle 45 Mar 29 20:46 one


Under tst2 there is a file and a directory. This file "one" has the following contents.


HOST : /user/oracle/dba/tst/tst1/tst2> cat one
This is file named one under tst2/ directory


The directory "tst3" leads to "tst4", which contains another directory "tst5", which in turn contains a file named "two". Its contents are shown below.


HOST : /user/oracle/dba/tst/tst1/tst2> cd tst3
HOST : /user/oracle/dba/tst/tst1/tst2/tst3> ls -ltr
total 6
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 tst4/
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 ./
drwxrwxr-x 3 oracle oracle 512 Mar 29 20:46 ../


HOST : /user/oracle/dba/tst/tst1/tst2/tst3> cd tst4
HOST : /user/oracle/dba/tst/tst1/tst2/tst3/tst4> ls -tlr
total 6
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 ../
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 ./
drwxrwxr-x 2 oracle oracle 512 Mar 29 20:44 tst5/


HOST : /user/oracle/dba/tst/tst1/tst2/tst3/tst4> cd tst5
HOST : /user/oracle/dba/tst/tst1/tst2/tst3/tst4/tst5> ls -tlr
total 6
drwxrwxr-x 3 oracle oracle 512 Mar 3 15:36 ../
drwxrwxr-x 2 oracle oracle 512 Mar 29 20:44 ./
-rw-rw-r-- 1 oracle oracle 46 Mar 29 20:45 two


HOST : /user/oracle/dba/tst/tst1/tst2/tst3/tst4/tst5> cat two
This is file named two under tst5/ directory.


Let us ignore everything except the files one and two, their contents and their locations. File "one" is located under the "tst2/" directory, and file "two" is located three directories further down the path, under the "tst5/" directory.
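For anyone who wants to follow along, here is a minimal sketch that recreates this layout in a scratch directory and lists every regular file in one go, instead of changing directory level by level. The use of mktemp -d and the variable names demo and files are my own assumptions, not part of the original session.

```shell
# Scratch copy of the tree described above (demo is a hypothetical name).
demo=$(mktemp -d)
mkdir -p "$demo/tst2/tst3/tst4/tst5"
printf '%s\n' "This is file named one under tst2/ directory" > "$demo/tst2/one"
printf '%s\n' "This is file named two under tst5/ directory." > "$demo/tst2/tst3/tst4/tst5/two"

# List every regular file below the starting directory, however deep it sits.
files=$(cd "$demo" && find . -type f | sort)
printf '%s\n' "$files"
```

The -type f test keeps directories out of the listing, so only the two files show up.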

Now, from the "/user/oracle/dba/tst/tst1/" directory, we are looking for files containing the pattern "file named" in the current directory and, recursively, in every directory below it. The crux of the exercise is that we do not know how far down we have to go: it could be 5 levels, or more than that.

Say, for instance, a file is sitting about 30 levels down the path. Going through each directory and running grep for the required pattern on all the files would make me a little jittery. Instead, I would grin at it and use a simple "find" command to get the results, as shown below.


HOST : /user/oracle/dba/tst/tst1> find ./ -print -exec grep "file named" {} \;
./
./tst2
./tst2/tst3
./tst2/tst3/tst4
./tst2/tst3/tst4/tst5
./tst2/tst3/tst4/tst5/two
This is file named two under tst5/ directory.
./tst2/one
This is file named one under tst2/ directory


Picking out the exact file name you are looking for takes a little care. Because of -print, find lists every file and directory it visits, and the matching lines show up in between those names. Thank goodness, "grep" prints the line that matches the pattern, like it always does.

Now, the file name just above a matching line is the actual source of that line. As shown above, "./tst2/tst3/tst4/tst5/two" is a file that has the pattern we are looking for. Likewise, "./tst2/one" is easy to spot since it is at the end of the output.
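If reading the file name off the line above feels error-prone, one common trick is to pass /dev/null to grep as an extra file. With more than one file to search, grep prefixes each matching line with the name of the file it came from. The sketch below assumes a scratch copy of the tree built with mktemp -d. The variable names demo and out are hypothetical.

```shell
# Scratch copy of the tree used in this post (demo is a hypothetical name).
demo=$(mktemp -d)
mkdir -p "$demo/tst2/tst3/tst4/tst5"
printf '%s\n' "This is file named one under tst2/ directory" > "$demo/tst2/one"
printf '%s\n' "This is file named two under tst5/ directory." > "$demo/tst2/tst3/tst4/tst5/two"

# -type f skips directories; the extra /dev/null operand makes grep
# print "filename:matching line" for every hit.
out=$(cd "$demo" && find . -type f -exec grep "file named" /dev/null {} \; | sort)
printf '%s\n' "$out"
```

Each line of output then carries its own file name, so there is no need for -print and no guessing about which name belongs to which match.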

USAGE : This method can be used when you need to find a file containing a known pattern that has to be changed for a fix. I have come across this situation quite a few times, and I expect you have too.
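When only the names of the files to be fixed are needed, a variation with grep -l may be handier, since it prints each matching file name once and nothing else. This is a sketch under my own assumptions: a scratch tree created with mktemp -d, and the hypothetical variable names demo and matches.

```shell
# Scratch copy of the tree used in this post (demo is a hypothetical name).
demo=$(mktemp -d)
mkdir -p "$demo/tst2/tst3/tst4/tst5"
printf '%s\n' "This is file named one under tst2/ directory" > "$demo/tst2/one"
printf '%s\n' "This is file named two under tst5/ directory." > "$demo/tst2/tst3/tst4/tst5/two"
printf '%s\n' "nothing to see here" > "$demo/tst2/tst3/other"

# grep -l lists only the names of files that contain the pattern;
# "{} +" hands many files to each grep run instead of one at a time.
matches=$(cd "$demo" && find . -type f -exec grep -l "file named" {} + | sort)
printf '%s\n' "$matches"
```

The resulting list can be fed straight into an editor or a fix-up script. On systems that ship GNU grep, grep -r "file named" . should give a similar result without find, but I have not assumed it is available here.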
