Using pathlib in Python
Handling file paths with the pathlib library

Have you looked at Python code that handles file paths?
It's not unusual to find code that uses either os.path or pathlib. If you haven't, don't worry! In this post, I'll introduce you to pathlib and why it has become the primary library for this over os.path.
The os.path library was the traditional module for handling path maniputlation for quite some time. It is strings-based and it provides functions like os.path.join(). It works well with older versions of Python.
In modern code, it can be argued that pathlib has become the primary library for handling file paths in Python. It was introduced in Python 3.4 and it provides an object-oriented approach to filesystem paths. One of its key features is that it automatically handles differences between operating systems (e.g., Windows vs. Unix).
For new projects, pathlib is usually the better choice because it is easier to use and has a more modern design. Well take a quick look at how to compose paths and what the best approach is currently to establishing a base directory.
Absolute vs Relative Paths
In Python, pathlib represents file system paths as objects, and it handles both absolute and relative paths.
Relative Paths
A relative path is interpreted relative to the current working directory (the directory from which the Python process is running).
from pathlib import Path
p = Path("data/input.csv")
print(p)
# data/input.csv
Here, when we print the path, it doesn't specify the full location of the csv file. Python will look for:
<current working directory>/data/input.csv
Here's how you can use pathlib to see the current working directory by defining it using Path.cwd():
from pathlib import Path
cwd = Path.cwd()
print(cwd)
For example, if your current working directory is C:\projects\my_app
Path("data/input.csv")
Then this path to the input.csv file really refers to:
C:\projects\my_app\data\input.csv
Common relative path examples
Here are some ways you can refer to a subdirectory, parent directory, etc.
Path("file.py") # current directory
Path("logs/app.log") # subdirectory
Path("../config.json") # parent directory
Path("../../data.csv") # two levels up
Absolute paths
An absolute path specifies a file's exact location starting from the root directory of the filesystem.
In Windows
Path(r"C:\Users\tmp\data\input.csv")
# or it can also be like this
Path("C:\Users\tmp\data\input.csv")
In Linux/macOS
Path("/home/tmp/data/input.csv")
Notice that the path starts at the root directory. Unlike a relative path, an absolute path does not change based on the current working directory. An absolute path works no matter which directory your program is currently in. No matter where the script is run from, it refers to the same location.
Downside of Absolute Paths
However, hardcoding absolute paths is often discouraged because they can make your code less portable and more fragile. Absolute paths may break when:
The project is moved to a different directory on the same machine.
Another developer checks out the project to a different location on their computer.
The code is deployed to a server where the directory structure is different.
A user runs the application on a different operating system, where path formats and root locations may not match.
Files are relocated during refactoring or reorganization of the project.
For example, a path like /Users/johndoe/projects/myapp/data/config.json works only as long as that exact directory structure exists. If the project is moved or shared with someone else, the path becomes invalid.
Relative paths, configuration files, environment variables, or paths constructed dynamically with pathlib are often more flexible and maintainable alternatives. But first, we need to take a look at how we would define the current working directory (cwd).
Using the Current Working Directory
Earlier in this article, we already saw how we can use Path.cwd()to define the cwd:
from pathlib import Path
cwd = Path.cwd()
print(cwd)
With Path.cwd() , we can then build a file path from it and access it.
Note: the "
/" operator is used to join paths, making it easier to construct file paths in a readable way
data_file = Path.cwd() / "data" / "sales.csv"
This approach works well when you know exactly where the program will be launched from. If Path.cwd() is the project root, the path resolves correctly.
CWD Can Be Fragile
The challenge is that Path.cwd() is determined by how the program is started, not where the script itself lives.
Suppose your project looks like this:
project/
βββ data/
β βββ sales.csv
βββ scripts/
βββ report.py
You can run the script from the project root:
python scripts/report.py
In this scenario, Path.cwd() points to project/, and the path resolves correctly.
However, if you first change into the scripts directory and run:
cd scripts
python report.py
Path.cwd() now points to project/scripts/, causing the program to look for:
project/scripts/data/sales.csv
This file path is not correct!
The file doesn't exist.
It's important to understand this behavior as it can lead to bugs when you run your code from:
an IDE
test runner
deployment environment
another script that uses a different working directory
Note: Using
cwd = Path.cwd()is still appropriate when you intentionally choose the working directory. For example, command-line tools that operate on files in the directory where they're run. The fragility appears whenPath.cwd()is used to locate application resources that are part of the project itself.
Defining a Reliable Base Path
In modern Python code, a more reliable approach is to define a base path relative to the source file itself rather than the current working directory.
Consider our previous scenario:
project/
βββ data/
β βββ sales.csv
βββ scripts/
βββ report.py
Let's say we define the following withinreport.py :
from pathlib import Path
# Determine the project's base directory:
# Path(__file__) gives the path of the current Python file
# resolve() converts it to an absolute path
# parent gets the directory containing the file
# parent.parent moves up one more level
BASE_DIR = Path(__file__).resolve().parent.parent
# Build the path to the sales.csv file:
# Using the "/" operator with Path objects
# creates platform-independent paths
data_file = BASE_DIR / "data" / "sales.csv"
Important to know: The
__file__variable in Python is a special attribute that contains the path to the current script being executed. It is set by the Python interpreter automatically when a script runs or a module is imported.
The __file__refers to the location of the current Python file, this approach works regardless of where the program is launched from. The path remains tied to the project's structure rather than the user's current directory.
Path(__file__).resolve()- give us the absolute path of the current fileproject/scripts/report.py
the first
.parentgives usproject/scripts/
the second
.parentgives usproject/
Defining our BASE_DIR based on Path(__file__) allows us to guard our code from the bugs that comes from relying on Path.cwd().
Using Path(__file__) helps make Python scripts more portable and adaptable by enabling file paths to be constructed relative to the script's location rather than the current working directory.
This approach is particularly useful when creating reliable paths to project resources. For applications, libraries, and scripts that need to access files included with the project, defining a base directory from Path(__file__) is typically the most robust and platform-independent solution.





