Directory

DirectoryLoader

Bases: BaseLoader

Loads files from a directory, optionally filtering by file extension and allowing recursive directory traversal.

Attributes:

required_exts (list[str]): List of file extensions to filter by. Only files with these extensions are loaded. Each extension must start with a dot. Defaults to [".pdf", ".docx", ".html"].

recursive (bool): Whether to search subdirectories recursively for files. Defaults to False.

file_loader (dict[str, Type[BaseFileLoader]] | None): Custom mapping of file extensions to loader classes. If None, the default loaders are used.

input_dir (str): Path of the directory from which to load the documents.
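To make the interaction between required_exts and recursive concrete, here is a minimal standard-library sketch of extension filtering with optional recursion. It is not DirectoryLoader's actual implementation: the function name select_files is hypothetical, and the case-insensitive suffix comparison and the sorted return order are assumptions made for the illustration.

```python
from pathlib import Path


def select_files(input_dir, required_exts=(".pdf", ".docx", ".html"), recursive=False):
    """Return the sorted file paths under input_dir whose suffix is in required_exts.

    Hypothetical helper for illustration only; not part of novastack.
    """
    for ext in required_exts:
        if not ext.startswith("."):
            raise ValueError(f"extension must start with a dot: {ext!r}")

    root = Path(input_dir)
    # rglob descends into subdirectories; glob only looks at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")

    # Assumption: suffixes are compared case-insensitively.
    wanted = {ext.lower() for ext in required_exts}
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in wanted)
```

With recursive=False, a matching file inside a subdirectory is not returned; with recursive=True it is.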

Example
from novastack.core.loaders import DirectoryLoader

# Using default loaders
directory_loader = DirectoryLoader(input_dir="/path/to/directory")
documents = directory_loader.load_data()

# Using custom extensions and recursive traversal
directory_loader = DirectoryLoader(
    input_dir="/path/to/directory",
    required_exts=[".pdf", ".txt"],
    recursive=True,
)
documents = directory_loader.load_data()

load_data

load_data() -> list[Document]

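Loads the files in input_dir that match required_exts, searching subdirectories when recursive is True, and returns them as a list of documents.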
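The idea behind a file_loader mapping, dispatching each file to a loader class chosen by its extension, can be sketched as follows. This is an illustration under stated assumptions, not novastack's code: PlainTextLoader, its load method, and load_with_mapping are hypothetical stand-ins, plain strings stand in for Document objects, and skipping files with no registered loader is an assumed policy.

```python
from pathlib import Path


class PlainTextLoader:
    """Hypothetical stand-in for a file loader class; not part of novastack."""

    def load(self, path):
        # Return the file's text wrapped in a list, one entry per document.
        return [Path(path).read_text(encoding="utf-8")]


def load_with_mapping(paths, file_loader):
    """Load each path with the loader class registered for its extension."""
    documents = []
    for path in paths:
        loader_cls = file_loader.get(Path(path).suffix)
        if loader_cls is None:
            # Assumption: files without a registered loader are skipped.
            continue
        documents.extend(loader_cls().load(path))
    return documents
```

Keying the mapping on the dotted extension mirrors the rule that entries in required_exts start with a dot, so the same strings can be used for both filtering and dispatch.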