Docling
% pip install novastack-loaders-docling
DoclingLoader
Bases: BaseFileLoader
A document loader that uses the docling library to extract and structure content from various file types
including PDF, DOCX, and HTML.
For more information, see Docling.
Attributes:
| Name | Type | Description |
|---|---|---|
| detached_tables | bool | If True, separates extracted tables from the main document text and treats them as individual documents. Default is False. |
| export_table_format | str | Format used when exporting tables. Applicable only if |
| input_file | str | File path to load. |
Example
from novastack.loaders.docling import DoclingLoader
docling_loader = DoclingLoader(input_file="path/to/file.pdf")
documents = docling_loader.load_data()
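
Because each loader instance is tied to a single input_file, loading several files means creating one loader per path. The following is a minimal sketch of that pattern, not part of the library: the helper name load_many is hypothetical, and it assumes that load_data() returns an iterable of documents and that attributes such as detached_tables can be passed as constructor keyword arguments. The loader class is passed in as an argument, so the helper itself does not import novastack.

```python
from pathlib import Path
from typing import Any, Iterable, List, Union


def load_many(
    paths: Iterable[Union[str, Path]],
    loader_cls: Any,
    **loader_kwargs: Any,
) -> List[Any]:
    """Load every file in `paths` and return all documents in one flat list.

    Hypothetical helper. `loader_cls` is expected to behave like
    DoclingLoader: constructed with `input_file=...` and exposing
    `load_data()`. Extra keyword arguments are forwarded unchanged
    (assumption: the constructor accepts them).
    """
    documents: List[Any] = []
    for path in paths:
        # One loader per file, since input_file is a single path.
        loader = loader_cls(input_file=str(path), **loader_kwargs)
        # Assumption: load_data() returns an iterable of documents.
        documents.extend(loader.load_data())
    return documents


# Intended usage (requires novastack-loaders-docling to be installed):
#
#   from novastack.loaders.docling import DoclingLoader
#   docs = load_many(
#       ["path/to/a.pdf", "path/to/b.docx"],
#       DoclingLoader,
#       detached_tables=True,
#   )
```

With detached_tables=True, the returned list would be expected to contain the extracted tables as separate documents alongside the main text, as described in the attribute table above.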