Skip to content

Streamlit based application to load, parse and answer questions about user uploaded PDFs

License

Notifications You must be signed in to change notification settings

samson-bakos/ChatPDF

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 

Repository files navigation

ChatPDF: Interactive Document Conversational Interface

ChatPDF is a Streamlit-based web application designed to facilitate interactive querying of PDF documents.

Features

  • PDF Text Extraction: Utilizes PyMuPDF (fitz) to accurately extract text from uploaded PDF documents, ensuring that the full content is available for querying.
  • Natural Language Querying: Integrates OpenAI's language models via LangChain to understand and respond to user queries in natural language, making document exploration intuitive and efficient.
  • Adaptive Text Chunking: Implements a recursive function to split document text into manageable chunks. This method ensures that large documents are processed effectively, enhancing the model's ability to provide accurate and relevant responses.
  • Intelligent Summarization: To avoid repetitive or fragmented responses from individual text chunks, the app employs a summarization step. This step consolidates responses into a single, coherent answer, enhancing user experience by providing clear and concise information.
  • Streamlit Interface: Offers a user-friendly web interface built with Streamlit, enabling easy upload of PDF files and seamless interaction with the document content through natural language queries.

How It Works

  1. Upload a PDF: Users start by uploading a PDF document through the Streamlit interface.
  2. Extract Text: The application extracts text from the PDF using PyMuPDF, displaying the extracted content for user reference.
  3. Enter Queries: Users can then enter natural language queries regarding the content of the uploaded document.
  4. Receive Answers: The app processes the document text in chunks, queries OpenAI's language models for each chunk, and then summarizes the responses to provide a clear and concise answer to the user's question.

Local Setup and Usage

  • Ensure you have Python and Streamlit installed.
  • Install the required Python packages, see the requirements
  • Run the app locally using Streamlit: streamlit run src/app.py.

License

MIT License

Acknowledgements

About

Streamlit based application to load, parse and answer questions about user uploaded PDFs

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages