langchain_community.document_loaders.parsers.pdf.PyMuPDFParser
- class langchain_community.document_loaders.parsers.pdf.PyMuPDFParser(text_kwargs: Optional[Mapping[str, Any]] = None, extract_images: bool = False)
Parse PDF using PyMuPDF.
Initialize the parser.
- Parameters
text_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to fitz.Page.get_text(). Defaults to None.
extract_images (bool) – Whether to extract images from the PDF. Defaults to False.
Methods
__init__([text_kwargs, extract_images]) – Initialize the parser.
lazy_parse(blob) – Lazily parse the blob.
parse(blob) – Eagerly parse the blob into a document or documents.
- __init__(text_kwargs: Optional[Mapping[str, Any]] = None, extract_images: bool = False) → None
Initialize the parser.
- Parameters
text_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to fitz.Page.get_text(). Defaults to None.
extract_images (bool) – Whether to extract images from the PDF. Defaults to False.
- Return type
None
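
Example

The following is a minimal sketch of how the constructor and lazy_parse(blob) might be used together, not an official recipe. The helper name parse_pdf_pages is hypothetical, and the sketch assumes that langchain-community and pymupdf are installed and that Blob can be imported from langchain_community.document_loaders.blob_loaders.

```python
from typing import Any, List, Mapping, Optional


def parse_pdf_pages(
    path: str,
    text_kwargs: Optional[Mapping[str, Any]] = None,
    extract_images: bool = False,
) -> List[str]:
    """Return the text of each Document that PyMuPDFParser yields for `path`.

    `parse_pdf_pages` is a hypothetical helper, not part of langchain_community.
    """
    # Imported lazily so the helper can be defined even when the optional
    # dependencies (langchain-community, pymupdf) are not installed.
    from langchain_community.document_loaders.blob_loaders import Blob
    from langchain_community.document_loaders.parsers.pdf import PyMuPDFParser

    parser = PyMuPDFParser(text_kwargs=text_kwargs, extract_images=extract_images)
    blob = Blob.from_path(path)

    # lazy_parse yields Documents one at a time; parse(blob) would instead
    # return all of them as a list.
    return [doc.page_content for doc in parser.lazy_parse(blob)]
```

A call such as parse_pdf_pages("report.pdf", text_kwargs={"sort": True}) would forward sort=True to fitz.Page.get_text(). Here "report.pdf" is a placeholder path, and the sort option is assumed to be supported by the installed PyMuPDF version.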