VoxNovel

VoxNovel: GPT-4-powered audiobooks with unique character voices. Currently compatible only with Ubuntu Linux.

Notebook versions are also available, so you can run VoxNovel for free in Google Colab if you don't have Ubuntu.

Overview

VoxNovel uses GPT-4 to analyze literature, attribute quotations to specific characters, and generate an audiobook in which each character has a distinct voice. Giving every character their own voice makes the listening experience more immersive and engaging.

How to Run VoxNovel

  1. Setup:

    • Clone the repository: git clone https://github.com/DrewThomasson/VoxNovel
    • Navigate to the directory: cd VoxNovel
    • Install the balacoon package via pip:
    • pip install -i https://pypi.fury.io/balacoon/ balacoon-tts
    • Install the other necessary packages:
    • pip install numpy
    • pip install huggingface_hub
    • pip install openai
  2. Run the full VoxNovel GUI:

    • In a terminal, run run_gui.py with Python (for example, python3 run_gui.py).
    • Close each GUI when it says "Complete"; the next GUI in the sequence then starts automatically.
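The hand-off between GUIs described above can be pictured as a simple sequential launcher. The following is only a sketch of that idea, not the actual contents of run_gui.py: the `PIPELINE` list and the `run_pipeline` helper are hypothetical names, the script order is taken from the "About Files" section below, and the last filename assumes a single `.py` extension.

```python
import subprocess
import sys

# Script order taken from the "About Files" section of this README.
# The list itself is an illustrative assumption, not read from run_gui.py.
PIPELINE = [
    "gui_create_quotes_files.py",
    "speaker_find_attribute.py",
    "create_book_csv.py",
    "book_display_and_generate_with_preview.py",
    "book_display_and_combine_gui.py",
]


def run_pipeline(scripts, run=subprocess.run):
    """Run each script in order, waiting for one to exit before starting the next."""
    completed = []
    for script in scripts:
        # subprocess.run blocks until the GUI window is closed.
        # check=True stops the chain if a stage exits with an error.
        run([sys.executable, script], check=True)
        completed.append(script)
    return completed
```

Calling `run_pipeline(PIPELINE)` from the repository directory would launch the stages one after another. The `run` parameter exists only so the launcher can be exercised without opening real windows.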

About Files

  1. gui_create_quotes_files.py:

    • This program accepts a .txt file of your book. Make sure the book marks quotes consistently throughout.
    • It either detects the delimiters used for character quotes automatically or lets you enter them manually.
    • Outputs a quotes.csv file containing all character quotes found between the specified start and end delimiters.
    • Outputs a non_quotes.csv file for narration, containing everything that isn't a character quote.
    • Finally, displays a merged view of narration and character quotes as a single CSV file.

    Requirements:

    • Your book in .txt format.
    • Your OpenAI key.
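To make the quote/narration split concrete, here is a minimal sketch of delimiter-based extraction. It is not the code in gui_create_quotes_files.py: the `split_quotes` function and its tuple layout `(text, start, end)` are assumptions chosen for illustration, and it only handles non-nested quotes.

```python
import re


def split_quotes(text, start_delim='"', end_delim='"'):
    """Split text into quote and non-quote spans with character offsets."""
    pattern = re.compile(
        re.escape(start_delim) + r".*?" + re.escape(end_delim), re.DOTALL
    )
    quotes, non_quotes = [], []
    cursor = 0
    for match in pattern.finditer(text):
        # Narration between the previous quote and this one.
        if match.start() > cursor:
            non_quotes.append((text[cursor:match.start()], cursor, match.start()))
        quotes.append((match.group(), match.start(), match.end()))
        cursor = match.end()
    # Trailing narration after the last quote.
    if cursor < len(text):
        non_quotes.append((text[cursor:], cursor, len(text)))
    return quotes, non_quotes
```

For a book that uses curly quotes, the same function can be called as `split_quotes(text, "“", "”")`. Keeping the start and end offsets is what later allows quotes and narration to be put back in reading order.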

  2. speaker_find_attribute.py:

    • Identifies the speaker of each quote using OpenAI's GPT-4.

    Functionality:

    • Input your OpenAI API key and select the .txt file.
    • Specify the number of requests or allow the program to decide based on quotes.csv length.
    • Real-time progress updates.
    • Observes a 60-second pause every 20 requests due to API limits.
    • Appends speaker names to the quotes.csv file upon completion.

    Requirements:

    • Text in .txt format.
    • OpenAI API key.
    • quotes.csv with columns for each quote's start and end locations.
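The pause of 60 seconds after every 20 requests can be sketched as a small batching loop. This is an illustration under stated assumptions rather than the script's actual code: `attribute_speakers` is a hypothetical name, and `ask_model` stands in for whatever function sends one quote to GPT-4 and returns a speaker name (no real OpenAI call is shown here).

```python
import time


def attribute_speakers(quotes, ask_model, batch_size=20, pause_seconds=60,
                       sleep=time.sleep):
    """Call ask_model once per quote, pausing after every full batch of requests."""
    speakers = []
    for count, quote in enumerate(quotes, start=1):
        speakers.append(ask_model(quote))
        # Pause after every batch_size requests, but not after the final one.
        if count % batch_size == 0 and count < len(quotes):
            sleep(pause_seconds)
    return speakers
```

The `sleep` parameter defaults to `time.sleep` and is injectable so the loop can be checked without actually waiting.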

  3. create_book_csv.py:

    • Merges quotes.csv and non_quotes.csv to produce a unified book.csv, ensuring chronological order.

    Functionality:

    • On start, a GUI button initiates processing.
    • Reads quotes.csv and non_quotes.csv, marking quotes with 'True' and non-quotes with 'False'.
    • Combines data, sorts by start location, and outputs book.csv.
    • GUI progress bar fills as processing progresses, signaling completion with a message.

    Requirements:

    • quotes.csv and non_quotes.csv with columns for text content, start/end locations, and speaker (speaker in 4th column).
    • A GUI-compatible system.
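The merge step amounts to tagging each row, concatenating, and sorting by start location. The sketch below shows that logic with only the standard library. The column names (`text`, `start`, `end`, `speaker`, `is_quote`) and both function names are assumptions for illustration; the real CSV headers may differ.

```python
import csv
import io


def merge_rows(quote_rows, non_quote_rows):
    """Tag each row as quote or narration, then sort by start location."""
    tagged = [dict(row, is_quote=True) for row in quote_rows]
    tagged += [dict(row, is_quote=False) for row in non_quote_rows]
    # Start locations read from CSV are strings, so convert before sorting.
    return sorted(tagged, key=lambda row: int(row["start"]))


def to_csv(rows, fieldnames=("text", "start", "end", "speaker", "is_quote")):
    """Serialize merged rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Sorting on the numeric start offset, rather than on the string, matters once offsets pass one digit: as strings, "14" would sort before "9".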

  4. book_display_and_generate_with_preview.py:

    • This program provides a GUI to display book content and generate voice-over audio for it.
    • Books should be in .csv format with at least two columns: Speaker and Text.
    • Allows users to select a TTS model and map unique speakers in the book to different voice-over characters.
    • Supports voice preview for selected characters to help users make an informed choice.
    • Outputs audio files (in .wav format) for each row in the book, synthesized using the selected voices.
    • Features a progress bar to show completion status while processing the CSV to generate voiceovers.
    • Displays the book content in a color-coded format, with different colors for different speakers. Hovering over the colored sections will display the name of the speaker.
    • Users can click on any section in the book display to play the corresponding audio.

    Requirements:

    • Your book in .csv format with columns Speaker and Text.
    • Internet connection to download the TTS models.
    • Required Python libraries: csv, wave, random, os, subprocess, pandas, pygame, tkinter, threading, balacoon_tts, huggingface_hub.
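As a rough picture of the speaker-to-voice mapping, the sketch below assigns a default voice to each unique speaker in order of first appearance. In the GUI the user picks the voices, so this only illustrates the data structure involved; `assign_voices` and the voice names in the usage note are hypothetical, not balacoon_tts identifiers.

```python
def assign_voices(speakers, voices):
    """Map each unique speaker to a voice, cycling when speakers outnumber voices."""
    if not voices:
        raise ValueError("at least one voice is required")
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique_speakers = list(dict.fromkeys(speakers))
    return {
        speaker: voices[index % len(voices)]
        for index, speaker in enumerate(unique_speakers)
    }
```

For example, `assign_voices(["Narrator", "Alice", "Narrator", "Bob"], ["voice_a", "voice_b"])` gives the narrator and Bob `voice_a` and Alice `voice_b`. Each row of the book can then look up its voice by the value in the Speaker column.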

  5. book_display_and_combine_gui.py:

    • A multifunctional GUI application allowing:
      1. Book display with speaker-based background colors.
      2. Audio previews for book sections.
      3. Combining audio chunks with specified silence durations.

    Functionality:

    • Book Display: Loads book.csv, differentiating speakers by background color. Hover to see speaker names; click to play audio.
    • Audio Preview: Links text segments to corresponding audio_INDEX.wav files. Click to play/stop audio.
    • Audio Combination: Specify silence duration between chunks and combine them into combined_audio.wav with a progress bar.

    Requirements:

    • book.csv with columns for text and speaker.
    • Audio files named audio_INDEX.wav.
    • Libraries: pandas, torch, torchaudio, tkinter, pygame.
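Combining chunks with silence in between can be sketched as follows. Note the swap: this example uses Python's standard-library wave module instead of torch and torchaudio, which the script itself lists as requirements, so it shows the idea rather than the project's implementation. The `combine_wavs` name is hypothetical, and the sketch assumes all chunks are 16-bit PCM with the same channel count and sample rate.

```python
import wave


def combine_wavs(chunk_paths, output_path, silence_seconds=0.5):
    """Concatenate WAV files, inserting silence between consecutive chunks."""
    params = None
    frames = []
    for path in chunk_paths:
        with wave.open(path, "rb") as chunk:
            if params is None:
                params = chunk.getparams()
            elif chunk.getparams()[:3] != params[:3]:
                # Channels, sample width, and frame rate must all match.
                raise ValueError(f"{path} has a different audio format")
            frames.append(chunk.readframes(chunk.getnframes()))
    if params is None:
        raise ValueError("no chunks to combine")

    # Zero bytes represent silence for 16-bit (signed) PCM audio.
    bytes_per_frame = params.sampwidth * params.nchannels
    silence = b"\x00" * (int(params.framerate * silence_seconds) * bytes_per_frame)

    with wave.open(output_path, "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        # join places one block of silence between each pair of chunks.
        out.writeframes(silence.join(frames))
```

A call such as `combine_wavs(["audio_0.wav", "audio_1.wav"], "combined_audio.wav", silence_seconds=0.5)` would produce a single file with half a second of silence between the two chunks.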

  6. Enjoy!

    • Relish your auto-generated audiobook, with each character uniquely voiced.

    DEMO

High-Quality Tortoise Demos

HIgh.qual.Gardians.ch.1.demo.Tortoise.mp4

Super-Fast Balacoon Audio Demos

guardians.of.ga.hoole.chapter.one.mp4
Harry_potter.mp4

Contributing

We welcome VoxNovel contributions! Open an issue or submit a pull request for suggestions, improvements, or feature additions.
