File Searching Using Python



File searching in Python can be done in several ways. One is the os module's os.walk() function, which takes a directory path and, for each directory in the tree below it, yields a 3-tuple of dirpath, dirnames, and filenames.

Another is the pathlib module, which provides an object-oriented interface for working with file system paths. Some widely used modules for searching for specific files are as follows:

  • os Module: This module allows us to interact with the operating system.

  • glob Module: It finds all the path names matching a specific pattern.

  • pathlib Module: This module provides an object-oriented approach to handle file system paths.

Using 'os' Module

With the os module we can navigate the file system, manage directories, and access system-level information.

The os.walk() function traverses the directory tree and, for each directory it visits, yields a tuple (dirpath, dirnames, filenames).

Example

import os

def find_files(filename, search_path):
    result = []

    # Recursively walk through the directory tree
    for root, dirs, files in os.walk(search_path):
        if filename in files:
            # If the file is found, add its full path to the result list
            result.append(os.path.join(root, filename))
    return result

# On Windows, "D:" refers to the current directory on drive D
print(find_files("smpl.htm", "D:"))

Output

['D:TP\\smpl.htm', 'D:TP\\spyder_pythons\\smpl.htm']
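
To see the (dirpath, dirnames, filenames) tuples without scanning a real drive, the following minimal sketch builds a small throwaway tree with tempfile and walks it. The folder and file names are borrowed from the output above purely for illustration, and the helper build_sample_tree is hypothetical, not part of the os module.

```python
import os
import tempfile


def find_files(filename, search_path):
    """Return the full paths of every file named `filename` under `search_path`."""
    result = []
    for root, _dirs, files in os.walk(search_path):
        if filename in files:
            result.append(os.path.join(root, filename))
    return result


def build_sample_tree(base):
    """Create a tiny directory tree for the demo (hypothetical layout)."""
    nested = os.path.join(base, "TP", "spyder_pythons")
    os.makedirs(nested)
    for folder in (os.path.join(base, "TP"), nested):
        with open(os.path.join(folder, "smpl.htm"), "w") as handle:
            handle.write("<html></html>")


with tempfile.TemporaryDirectory() as base:
    build_sample_tree(base)
    # Each iteration describes one directory: its path, its subdirectories, its files.
    for dirpath, dirnames, filenames in os.walk(base):
        print(os.path.relpath(dirpath, base), dirnames, filenames)
    matches = find_files("smpl.htm", base)
    # Report paths relative to the temporary directory so the result is stable.
    relative_matches = sorted(os.path.relpath(m, base) for m in matches)
    print(relative_matches)
```

Because the tree lives in a temporary directory, the sketch cleans up after itself and behaves the same way on Windows, Linux, and macOS apart from the path separator.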

Using 'glob' Module

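With the glob module, we can find all pathnames matching a specified pattern according to the rules used by the Unix shell.

This method supports wildcards such as ' * ' and ' ? ': ' * ' matches any number of characters (including none), whereas ' ? ' matches exactly one character. When recursive=True is passed, the ' ** ' pattern additionally matches any files and zero or more directories and subdirectories.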

Example

import glob

def find_files(pattern):
    # Find all files matching the pattern; recursive=True enables '**'
    return glob.glob(pattern, recursive=True)

files = find_files('/path/to/directory/**/*.txt')
for file in files:
    print(file)

Output

/path/to/directory/file1.txt
/path/to/directory/subdir1/file2.txt
/path/to/directory/subdir2/subsubdir/file3.txt
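
The difference between the wildcards is easier to see side by side. The sketch below is a minimal illustration, assuming a made-up set of files (file1.txt, file22.txt, notes.md, subdir1/file2.txt) created in a temporary directory; the helper names() is hypothetical and only wraps glob.glob().

```python
import glob
import os
import tempfile


def names(base, pattern, recursive=False):
    """Return sorted matches for `pattern`, as paths relative to `base`."""
    # glob.escape() protects any special characters in the base path itself.
    full_pattern = os.path.join(glob.escape(base), pattern)
    hits = glob.glob(full_pattern, recursive=recursive)
    return sorted(os.path.relpath(hit, base) for hit in hits)


with tempfile.TemporaryDirectory() as base:
    # Build the sample files (illustrative names only).
    os.makedirs(os.path.join(base, "subdir1"))
    for relative in ("file1.txt", "file22.txt", "notes.md",
                     os.path.join("subdir1", "file2.txt")):
        with open(os.path.join(base, relative), "w") as handle:
            handle.write("sample")

    star = names(base, "*.txt")              # any run of characters before .txt
    question = names(base, "file?.txt")      # exactly one character after "file"
    deep = names(base, "**/*.txt", recursive=True)  # also descends into subdirectories
    print(star)
    print(question)
    print(deep)
```

Here 'file?.txt' skips file22.txt because ' ? ' stands for a single character, while only the recursive ' ** ' pattern reaches the file inside subdir1.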

Using 'pathlib' Module

The pathlib module provides an object-oriented approach to handling file system paths. The Path class encapsulates a file system path and provides methods to interact with files and directories.

Example

The example below creates a Path object from the given directory string and uses its rglob() method to find, recursively, every entry whose name ends with the given extension.

from pathlib import Path

def find_files(directory, extension):
    # Create a Path object for the given directory
    path = Path(directory)

    # Find all entries matching the extension recursively
    return list(path.rglob(f'*{extension}'))

files = find_files('/path/to/directory', '.txt')
for file in files:
    # Print each found file's path
    print(file)

Output

/path/to/directory/file1.txt
/path/to/directory/subdir1/file2.txt
/path/to/directory/subdir2/subsubdir/file3.txt
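
One detail worth knowing is that rglob() matches on names only, so a directory called archive.txt would also be returned. The following minimal sketch, assuming a made-up tree in a temporary directory, filters the results with is_file() and contrasts rglob() with the non-recursive glob(). This variant of find_files is an illustration, not the only way to write it.

```python
import tempfile
from pathlib import Path


def find_files(directory, extension):
    """Recursively collect regular files under `directory` ending in `extension`."""
    root = Path(directory)
    # rglob() also yields directories whose names match, so keep files only.
    return sorted(p for p in root.rglob(f"*{extension}") if p.is_file())


with tempfile.TemporaryDirectory() as base:
    root = Path(base)
    # Build the sample tree (illustrative names only).
    (root / "subdir1").mkdir()
    (root / "archive.txt").mkdir()  # a directory whose name looks like a text file
    (root / "file1.txt").write_text("a")
    (root / "subdir1" / "file2.txt").write_text("b")

    # Recursive search, reported relative to the root with forward slashes.
    found = [p.relative_to(root).as_posix() for p in find_files(root, ".txt")]
    # glob() without '**' looks at the top level only.
    top_level = sorted(p.name for p in root.glob("*.txt") if p.is_file())
    print(found)
    print(top_level)
```

The as_posix() call is used only to make the printed paths look the same on every platform.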
Updated on: 2024-11-20T17:41:08+05:30
