File Operations
Learning Objectives
- By the end of this lesson, you will be able to:
- - Use the `os.path` module for path operations
- - Use the `pathlib` module (modern approach)
- - Manipulate file and directory paths
- - Join and split paths correctly
- - Check if paths exist
- - Get path components (directory, filename, extension)
- - Work with absolute and relative paths
- - Handle cross-platform path differences
- - Navigate directory structures
- - Understand when to use `os.path` vs `pathlib`
Lesson 9.2: Working with File Paths
Learning Objectives
By the end of this lesson, you will be able to:
- Use the
os.pathmodule for path operations - Use the
pathlibmodule (modern approach) - Manipulate file and directory paths
- Join and split paths correctly
- Check if paths exist
- Get path components (directory, filename, extension)
- Work with absolute and relative paths
- Handle cross-platform path differences
- Navigate directory structures
- Understand when to use
os.pathvspathlib
Introduction to File Paths
File paths specify the location of files and directories in a file system. Python provides tools to work with paths in a cross-platform way.
Why Path Handling Matters
- Cross-platform compatibility: Windows uses
\, Unix uses/ - Path manipulation: Join, split, normalize paths
- Path validation: Check if paths exist
- Path information: Get directory, filename, extension
Path Types
- Absolute paths: Full path from root
- Relative paths: Path relative to current directory
The os.path Module
The os.path module provides functions for path manipulation (traditional approach).
Common os.path Functions
| Function | Description |
|---|---|
join() |
Join path components |
split() |
Split path into directory and filename |
dirname() |
Get directory name |
basename() |
Get filename |
splitext() |
Split filename and extension |
exists() |
Check if path exists |
isfile() |
Check if path is a file |
isdir() |
Check if path is a directory |
abspath() |
Get absolute path |
normpath() |
Normalize path |
Joining Paths
import os
# Join path components
path = os.path.join("folder", "subfolder", "file.txt")
print(path) # Output: folder/subfolder/file.txt (or folder\subfolder\file.txt on Windows)
# Multiple components
path = os.path.join("/", "home", "user", "documents", "file.txt")
print(path)
Splitting Paths
import os
path = "/home/user/documents/file.txt"
# Split into directory and filename
directory, filename = os.path.split(path)
print(f"Directory: {directory}") # Output: /home/user/documents
print(f"Filename: {filename}") # Output: file.txt
# Get just directory name
dirname = os.path.dirname(path)
print(dirname) # Output: /home/user/documents
# Get just filename
basename = os.path.basename(path)
print(basename) # Output: file.txt
Splitting Extension
import os
path = "document.txt"
# Split filename and extension
filename, extension = os.path.splitext(path)
print(f"Filename: {filename}") # Output: Filename: document
print(f"Extension: {extension}") # Output: Extension: .txt
# Works with full paths
path = "/home/user/document.txt"
filename, extension = os.path.splitext(path)
print(filename) # Output: /home/user/document
print(extension) # Output: .txt
Checking Path Existence
import os
# Check if path exists
if os.path.exists("file.txt"):
print("File exists")
# Check if it's a file
if os.path.isfile("file.txt"):
print("It's a file")
# Check if it's a directory
if os.path.isdir("folder"):
print("It's a directory")
Absolute and Relative Paths
import os
# Get absolute path
relative_path = "file.txt"
absolute_path = os.path.abspath(relative_path)
print(absolute_path) # Output: /current/directory/file.txt
# Normalize path
path = "/home/user/../user/documents/./file.txt"
normalized = os.path.normpath(path)
print(normalized) # Output: /home/user/documents/file.txt
# Check if absolute
print(os.path.isabs("/home/user")) # Output: True
print(os.path.isabs("user")) # Output: False
Path Information
import os
path = "/home/user/documents/report.txt"
# Get directory
directory = os.path.dirname(path)
print(directory) # Output: /home/user/documents
# Get filename
filename = os.path.basename(path)
print(filename) # Output: report.txt
# Get filename without extension
name = os.path.splitext(filename)[0]
print(name) # Output: report
# Get extension
ext = os.path.splitext(filename)[1]
print(ext) # Output: .txt
The pathlib Module (Modern Approach)
The pathlib module provides an object-oriented approach to path handling (Python 3.4+).
Path Objects
from pathlib import Path
# Create Path object
path = Path("file.txt")
print(path) # Output: file.txt
# Path objects are cross-platform
path = Path("folder") / "subfolder" / "file.txt"
print(path) # Works on all platforms
Creating Paths
from pathlib import Path
# From string
path = Path("file.txt")
# From multiple components
path = Path("folder", "subfolder", "file.txt")
# Using / operator
path = Path("folder") / "subfolder" / "file.txt"
# Absolute path
path = Path("/home/user/file.txt")
# Current directory
current = Path.cwd()
print(current)
# Home directory
home = Path.home()
print(home)
Path Properties
from pathlib import Path
path = Path("/home/user/documents/report.txt")
# Get parts
print(path.parts) # Output: ('/', 'home', 'user', 'documents', 'report.txt')
# Get parent directory
print(path.parent) # Output: /home/user/documents
# Get filename
print(path.name) # Output: report.txt
# Get stem (filename without extension)
print(path.stem) # Output: report
# Get suffix (extension)
print(path.suffix) # Output: .txt
# Get all suffixes
path = Path("archive.tar.gz")
print(path.suffixes) # Output: ['.tar', '.gz']
Path Methods
from pathlib import Path
path = Path("file.txt")
# Check existence
print(path.exists()) # True or False
print(path.is_file()) # True if file
print(path.is_dir()) # True if directory
# Get absolute path
abs_path = path.absolute()
print(abs_path)
# Resolve (absolute + normalize)
resolved = path.resolve()
print(resolved)
# Check if absolute
print(path.is_absolute()) # True or False
Joining Paths with pathlib
from pathlib import Path
# Using / operator
path = Path("folder") / "subfolder" / "file.txt"
# Multiple levels
path = Path("a") / "b" / "c" / "d.txt"
# With absolute paths
path = Path("/home") / "user" / "file.txt"
# Join method
path = Path("folder").joinpath("subfolder", "file.txt")
Reading and Writing with pathlib
from pathlib import Path
path = Path("file.txt")
# Read text
content = path.read_text()
print(content)
# Read bytes
data = path.read_bytes()
# Write text
path.write_text("Hello, World!")
# Write bytes
path.write_bytes(b"Binary data")
# Read lines
lines = path.read_text().splitlines()
Directory Operations
from pathlib import Path
# Create directory
path = Path("new_folder")
path.mkdir() # Create directory
path.mkdir(exist_ok=True) # Don't error if exists
# Create parent directories
path = Path("a/b/c")
path.mkdir(parents=True, exist_ok=True)
# Remove directory (must be empty)
path.rmdir()
# List directory contents
directory = Path(".")
for item in directory.iterdir():
print(item)
# List files only
for file in directory.iterdir():
if file.is_file():
print(file)
# List directories only
for dir_item in directory.iterdir():
if dir_item.is_dir():
print(dir_item)
Path Pattern Matching
from pathlib import Path
directory = Path(".")
# Find files matching pattern
for file in directory.glob("*.txt"):
print(file)
# Recursive search
for file in directory.rglob("*.txt"):
print(file)
# Find specific files
for file in directory.glob("**/*.py"):
print(file)
Comparing os.path and pathlib
os.path (Traditional)
import os
# Join paths
path = os.path.join("folder", "file.txt")
# Split
directory, filename = os.path.split(path)
# Check existence
if os.path.exists(path):
print("Exists")
# Get extension
_, ext = os.path.splitext(path)
pathlib (Modern)
from pathlib import Path
# Create path
path = Path("folder") / "file.txt"
# Get parts
directory = path.parent
filename = path.name
# Check existence
if path.exists():
print("Exists")
# Get extension
ext = path.suffix
When to Use Each
os.path: Legacy code, compatibility with older Pythonpathlib: New code, Python 3.4+, more intuitive
Practical Examples
Example 1: File Information
from pathlib import Path
def file_info(filepath):
path = Path(filepath)
if path.exists():
print(f"Path: {path}")
print(f"Absolute: {path.absolute()}")
print(f"Directory: {path.parent}")
print(f"Filename: {path.name}")
print(f"Stem: {path.stem}")
print(f"Extension: {path.suffix}")
print(f"Size: {path.stat().st_size} bytes")
else:
print("File does not exist")
file_info("example.txt")
Example 2: Finding Files
from pathlib import Path
def find_files(directory, pattern="*.txt"):
"""Find all files matching pattern in directory."""
path = Path(directory)
if path.is_dir():
return list(path.glob(pattern))
return []
# Find all .txt files
txt_files = find_files(".", "*.txt")
for file in txt_files:
print(file)
# Find all .py files recursively
py_files = find_files(".", "**/*.py")
for file in py_files:
print(file)
Example 3: Organizing Files
from pathlib import Path
import shutil
def organize_files(source_dir, target_dir):
"""Organize files by extension."""
source = Path(source_dir)
target = Path(target_dir)
# Create target directory
target.mkdir(exist_ok=True)
# Process files
for file in source.iterdir():
if file.is_file():
# Create extension directory
ext_dir = target / file.suffix[1:] # Remove dot
ext_dir.mkdir(exist_ok=True)
# Move file
dest = ext_dir / file.name
shutil.move(str(file), str(dest))
print(f"Moved {file.name} to {ext_dir}")
organize_files(".", "organized")
Example 4: Path Validation
from pathlib import Path
def validate_path(filepath):
"""Validate and normalize path."""
path = Path(filepath)
# Resolve to absolute
try:
resolved = path.resolve()
print(f"Resolved path: {resolved}")
# Check if exists
if resolved.exists():
if resolved.is_file():
print("Valid file path")
elif resolved.is_dir():
print("Valid directory path")
else:
print("Path does not exist")
return resolved
except Exception as e:
print(f"Invalid path: {e}")
return None
validate_path("example.txt")
Example 5: Building Paths
from pathlib import Path
def build_project_structure(base_path):
"""Create project directory structure."""
base = Path(base_path)
# Create directories
directories = [
base / "src",
base / "tests",
base / "docs",
base / "data" / "raw",
base / "data" / "processed"
]
for directory in directories:
directory.mkdir(parents=True, exist_ok=True)
print(f"Created: {directory}")
# Create files
files = [
base / "README.md",
base / "requirements.txt",
base / ".gitignore"
]
for file in files:
file.touch()
print(f"Created: {file}")
build_project_structure("my_project")
Cross-Platform Path Handling
Platform Differences
from pathlib import Path
import os
# Windows uses backslashes
# Unix uses forward slashes
# pathlib handles this automatically
path = Path("folder") / "file.txt" # Works on all platforms
# os.path.join also handles this
path = os.path.join("folder", "file.txt") # Works on all platforms
Path Separators
import os
from pathlib import Path
# Get platform separator
print(os.sep) # Output: / (Unix) or \ (Windows)
# pathlib uses appropriate separator
path = Path("a") / "b" / "c"
print(path) # Uses correct separator for platform
Common Operations
Getting Current Directory
import os
from pathlib import Path
# os.path way
current = os.getcwd()
print(current)
# pathlib way
current = Path.cwd()
print(current)
Changing Directory
import os
# Change directory
os.chdir("/path/to/directory")
# Get current directory
current = os.getcwd()
Listing Directory Contents
from pathlib import Path
directory = Path(".")
# List all items
for item in directory.iterdir():
print(item)
# List only files
files = [item for item in directory.iterdir() if item.is_file()]
# List only directories
dirs = [item for item in directory.iterdir() if item.is_dir()]
Recursive Directory Traversal
from pathlib import Path
def find_all_files(directory):
"""Find all files recursively."""
path = Path(directory)
files = []
for item in path.rglob("*"):
if item.is_file():
files.append(item)
return files
all_files = find_all_files(".")
for file in all_files:
print(file)
Common Mistakes and Pitfalls
1. String Concatenation for Paths
# WRONG: String concatenation
path = "folder" + "/" + "file.txt" # Platform-specific
# CORRECT: Use os.path.join or pathlib
import os
path = os.path.join("folder", "file.txt")
# OR
from pathlib import Path
path = Path("folder") / "file.txt"
2. Hardcoding Path Separators
# WRONG: Hardcoded separator
path = "folder\\file.txt" # Windows only
# CORRECT: Use path functions
from pathlib import Path
path = Path("folder") / "file.txt" # Cross-platform
3. Not Checking Path Existence
# WRONG: Assume path exists
path = Path("file.txt")
content = path.read_text() # May fail
# CORRECT: Check first
path = Path("file.txt")
if path.exists():
content = path.read_text()
else:
print("File not found")
4. Confusing pathlib Path with String
# pathlib Path is not a string
path = Path("file.txt")
# WRONG: Treating as string
# with open(path, "r") as file: # May work but not ideal
# CORRECT: Use pathlib methods
content = path.read_text()
# OR convert to string
with open(str(path), "r") as file:
content = file.read()
Best Practices
1. Use pathlib for New Code
# Preferred: pathlib
from pathlib import Path
path = Path("folder") / "file.txt"
2. Always Check Existence
from pathlib import Path
path = Path("file.txt")
if path.exists():
# Process file
pass
3. Use resolve() for Absolute Paths
from pathlib import Path
path = Path("file.txt")
absolute = path.resolve() # Absolute + normalized
4. Use parents for Navigation
from pathlib import Path
path = Path("/a/b/c/file.txt")
parent = path.parent # /a/b/c
grandparent = path.parent.parent # /a/b
5. Use glob() for Pattern Matching
from pathlib import Path
# Find files matching pattern
for file in Path(".").glob("*.txt"):
print(file)
Practice Exercise
Exercise: Path Manipulation
Objective: Create a Python program that demonstrates path operations.
Instructions:
-
Create a file called
path_manipulation_practice.py -
Write a program that:
- Uses
os.pathfor path operations - Uses
pathlibfor path operations - Manipulates file paths
- Checks path existence
- Gets path components
- Works with directories
- Uses
-
Your program should include:
- Path joining and splitting
- Path information extraction
- Directory operations
- File finding
- Cross-platform examples
Example Solution:
"""
Path Manipulation Practice
This program demonstrates path operations using os.path and pathlib.
"""
import os
from pathlib import Path
print("=" * 60)
print("PATH MANIPULATION PRACTICE")
print("=" * 60)
print()
# 1. os.path.join()
print("1. os.path.join()")
print("-" * 60)
path = os.path.join("folder", "subfolder", "file.txt")
print(f"Joined path: {path}")
path = os.path.join("/", "home", "user", "file.txt")
print(f"Absolute path: {path}")
print()
# 2. os.path.split()
print("2. os.path.split()")
print("-" * 60)
path = "/home/user/documents/file.txt"
directory, filename = os.path.split(path)
print(f"Directory: {directory}")
print(f"Filename: {filename}")
print()
# 3. os.path.dirname() and basename()
print("3. os.path.dirname() and basename()")
print("-" * 60)
path = "/home/user/documents/report.txt"
print(f"Path: {path}")
print(f"Directory: {os.path.dirname(path)}")
print(f"Filename: {os.path.basename(path)}")
print()
# 4. os.path.splitext()
print("4. os.path.splitext()")
print("-" * 60)
path = "document.txt"
filename, extension = os.path.splitext(path)
print(f"Filename: {filename}")
print(f"Extension: {extension}")
path = "/home/user/document.txt"
filename, extension = os.path.splitext(path)
print(f"Full path filename: {filename}")
print(f"Extension: {extension}")
print()
# 5. os.path.exists()
print("5. os.path.exists()")
print("-" * 60)
test_path = "example.txt"
if os.path.exists(test_path):
print(f"{test_path} exists")
else:
print(f"{test_path} does not exist")
print()
# 6. os.path.isfile() and isdir()
print("6. os.path.isfile() and isdir()")
print("-" * 60)
current_dir = "."
if os.path.isdir(current_dir):
print(f"{current_dir} is a directory")
test_file = "example.txt"
if os.path.isfile(test_file):
print(f"{test_file} is a file")
print()
# 7. os.path.abspath()
print("7. os.path.abspath()")
print("-" * 60)
relative_path = "file.txt"
absolute_path = os.path.abspath(relative_path)
print(f"Relative: {relative_path}")
print(f"Absolute: {absolute_path}")
print()
# 8. os.path.normpath()
print("8. os.path.normpath()")
print("-" * 60)
path = "/home/user/../user/documents/./file.txt"
normalized = os.path.normpath(path)
print(f"Original: {path}")
print(f"Normalized: {normalized}")
print()
# 9. pathlib Path creation
print("9. pathlib Path creation")
print("-" * 60)
path = Path("file.txt")
print(f"Path: {path}")
path = Path("folder") / "subfolder" / "file.txt"
print(f"Joined path: {path}")
path = Path("/home") / "user" / "file.txt"
print(f"Absolute path: {path}")
print()
# 10. pathlib Path properties
print("10. pathlib Path properties")
print("-" * 60)
path = Path("/home/user/documents/report.txt")
print(f"Path: {path}")
print(f"Parent: {path.parent}")
print(f"Name: {path.name}")
print(f"Stem: {path.stem}")
print(f"Suffix: {path.suffix}")
print(f"Parts: {path.parts}")
print()
# 11. pathlib exists() and is_file()
print("11. pathlib exists() and is_file()")
print("-" * 60)
path = Path("example.txt")
print(f"Path: {path}")
print(f"Exists: {path.exists()}")
print(f"Is file: {path.is_file()}")
print(f"Is directory: {path.is_dir()}")
print()
# 12. pathlib absolute() and resolve()
print("12. pathlib absolute() and resolve()")
print("-" * 60)
path = Path("file.txt")
print(f"Path: {path}")
print(f"Absolute: {path.absolute()}")
print(f"Resolved: {path.resolve()}")
print()
# 13. pathlib read_text() and write_text()
print("13. pathlib read_text() and write_text()")
print("-" * 60)
path = Path("test_file.txt")
path.write_text("Hello, World!\nThis is a test file.")
print(f"Written to {path}")
content = path.read_text()
print(f"Content:\n{content}")
# Cleanup
path.unlink()
print(f"Deleted {path}")
print()
# 14. pathlib directory operations
print("14. pathlib directory operations")
print("-" * 60)
# Create test directory
test_dir = Path("test_directory")
test_dir.mkdir(exist_ok=True)
print(f"Created directory: {test_dir}")
# Create file in directory
test_file = test_dir / "test.txt"
test_file.write_text("Test content")
print(f"Created file: {test_file}")
# List directory contents
print(f"Contents of {test_dir}:")
for item in test_dir.iterdir():
print(f" {item.name} ({'file' if item.is_file() else 'directory'})")
# Cleanup
test_file.unlink()
test_dir.rmdir()
print(f"Cleaned up {test_dir}")
print()
# 15. pathlib glob()
print("15. pathlib glob()")
print("-" * 60)
current_dir = Path(".")
print("Finding .txt files:")
txt_files = list(current_dir.glob("*.txt"))
if txt_files:
for file in txt_files:
print(f" {file}")
else:
print(" No .txt files found")
print("\nFinding .py files recursively:")
py_files = list(current_dir.rglob("*.py"))
if py_files:
for file in py_files[:5]: # Show first 5
print(f" {file}")
if len(py_files) > 5:
print(f" ... and {len(py_files) - 5} more")
else:
print(" No .py files found")
print()
# 16. Path.cwd() and Path.home()
print("16. Path.cwd() and Path.home()")
print("-" * 60)
current = Path.cwd()
print(f"Current directory: {current}")
home = Path.home()
print(f"Home directory: {home}")
print()
# 17. Building paths
print("17. Building paths")
print("-" * 60)
base = Path("project")
src = base / "src"
tests = base / "tests"
readme = base / "README.md"
print(f"Project structure:")
print(f" {base}")
print(f" {src.relative_to(base)}")
print(f" {tests.relative_to(base)}")
print(f" {readme.relative_to(base)}")
print()
# 18. Path comparison
print("18. Path comparison")
print("-" * 60)
path1 = Path("folder/file.txt")
path2 = Path("folder") / "file.txt"
print(f"path1: {path1}")
print(f"path2: {path2}")
print(f"Equal: {path1 == path2}")
print()
# 19. Getting file size
print("19. Getting file size")
print("-" * 60)
test_file = Path("example.txt")
if test_file.exists():
size = test_file.stat().st_size
print(f"{test_file.name}: {size} bytes")
else:
print(f"{test_file} does not exist")
print()
# 20. Path with multiple extensions
print("20. Path with multiple extensions")
print("-" * 60)
path = Path("archive.tar.gz")
print(f"Path: {path}")
print(f"Suffix: {path.suffix}") # Only last extension
print(f"Suffixes: {path.suffixes}") # All extensions
print(f"Stem: {path.stem}") # Without last extension
print()
print("=" * 60)
print("PRACTICE COMPLETE!")
print("=" * 60)
Expected Output (truncated):
============================================================
PATH MANIPULATION PRACTICE
============================================================
1. os.path.join()
------------------------------------------------------------
Joined path: folder/subfolder/file.txt
Absolute path: /home/user/file.txt
[... rest of output ...]
Challenge (Optional):
- Create a file organizer by extension
- Build a path validator utility
- Create a directory tree generator
- Implement a file search utility
Key Takeaways
os.pathprovides traditional path functionspathlibprovides modern object-oriented path handling (Python 3.4+)- Always use path functions instead of string concatenation
os.path.join()joins path components cross-platformPath /operator joins paths in pathlibpath.exists()checks if path existspath.is_file()andpath.is_dir()** check path typepath.parentgets parent directorypath.namegets filename,path.stemgets name without extensionpath.suffixgets file extensionpath.glob()finds files matching patternpath.resolve()gets absolute normalized path- Use
pathlibfor new code,os.pathfor legacy compatibility - Always check path existence before operations
- Path operations are cross-platform when using proper functions
Quiz: File Paths
Test your understanding with these questions:
-
What module provides modern path handling?
- A)
os.path - B)
pathlib - C)
filepath - D)
path
- A)
-
What does
os.path.join()do?- A) Splits paths
- B) Joins path components
- C) Checks existence
- D) Gets extension
-
What does
Path /operator do?- A) Divides paths
- B) Joins paths in pathlib
- C) Splits paths
- D) Checks equality
-
What does
path.parentreturn?- A) Filename
- B) Parent directory
- C) Extension
- D) Full path
-
What does
path.stemreturn?- A) Filename with extension
- B) Filename without extension
- C) Extension only
- D) Directory
-
What does
path.suffixreturn?- A) Filename
- B) Directory
- C) File extension
- D) Full path
-
What does
path.glob()do?- A) Creates files
- B) Finds files matching pattern
- C) Deletes files
- D) Moves files
-
What does
os.path.splitext()return?- A) Single string
- B) Tuple of (filename, extension)
- C) List of parts
- D) Path object
-
What does
path.resolve()do?- A) Checks existence
- B) Gets absolute normalized path
- C) Gets relative path
- D) Creates path
-
When should you use pathlib vs os.path?
- A) Always use os.path
- B) Use pathlib for new code, os.path for legacy
- C) Always use pathlib
- D) Never use either
Answers:
- B)
pathlib(modern path handling module) - B) Joins path components (cross-platform path joining)
- B) Joins paths in pathlib (pathlib path joining operator)
- B) Parent directory (directory containing the path)
- B) Filename without extension (stem is name without suffix)
- C) File extension (suffix is the extension)
- B) Finds files matching pattern (pattern matching for files)
- B) Tuple of (filename, extension) (splits filename and extension)
- B) Gets absolute normalized path (resolves to absolute and normalizes)
- B) Use pathlib for new code, os.path for legacy (pathlib is preferred for new code)
Next Steps
Excellent work! You've mastered file path operations. You now understand:
- How to use
os.pathfor path operations - How to use
pathlibfor modern path handling - How to manipulate and navigate paths
- How to work with directories and files
What's Next?
- Lesson 9.3: Context Managers
- Learn about the
withstatement - Create custom context managers
- Understand resource management
Additional Resources
- pathlib: docs.python.org/3/library/pathlib.html
- os.path: docs.python.org/3/library/os.path.html
- Path Objects: docs.python.org/3/library/pathlib.html#pathlib.Path
Lesson completed! You're ready to move on to the next lesson.