Skip to content

A set of scripts to capture pictures from a webcam feed and then getting the description of them from openai. After that, we covert the text to speech using Elevenlabs.

License

Notifications You must be signed in to change notification settings

Aavato-c/20231124_ai_voice_camera_demo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Elevenlabs webcam descriptor

The source is quite simple. Go ahead and check it out, I don't think it's necessary to explain everything here but in short:

  • This will save frames from your webcam and store them in the media -folder
    • The most current frame will be updated
  • The other app makes calls to openai to get a description of the image, then send a request to elevenlabs to make a dub.
    • The sound files will also be stored

To start using this little thing do the following:

  • Make a venv using python3.8
    • python3.8 venv -m venv
  • Activate the environment
    • source venv/bin/activate
  • Install dependencies
    • pip install -r requirements.txt
  • Make your own env
    • You can use the env.example provided, remove the .example extension and fill in your api keys
  • Run the save_video_frames.py
    • python3 save_video_frames.py
  • Run the main_app.py
    • python3 main_app.py
  • Enjoy

Made with a mac M1

About

A set of scripts to capture pictures from a webcam feed and then getting the description of them from openai. After that, we covert the text to speech using Elevenlabs.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages