Audiobooks for Quarto
AI narration solutions are now sophisticated enough to be viable for audiobook creation. This project would create a Quarto extension that does the following:
- Interface with text-to-speech APIs to generate audio files from Quarto markdown files (this will require breaking the text up into chunks that are small enough to fit within API limits).
- Link the audio files into the compiled HTML page using a Quarto/Pandoc filter.
- Generate an audiobook playlist in a format suitable for download.
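The chunking requirement in the first bullet can be sketched as follows. This is a minimal illustration in Python, not the extension's actual design: the function name `chunk_text` and the `max_chars` default are hypothetical, and the real character limit depends on the TTS provider chosen. The sketch prefers sentence boundaries so that narration does not break mid-sentence, and only hard-splits a sentence that is itself longer than the limit.

```python
import re


def chunk_text(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    max_chars is a placeholder value; check the chosen provider's
    documented input limit before relying on it.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            # The sentence still fits in the chunk being built.
            current = candidate
        else:
            # Close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be sent to the TTS API as a separate request, and the resulting audio files concatenated or played in sequence.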
The R openai package provides a way to interface with the OpenAI TTS endpoint, but does not facilitate Quarto integration.
We are not aware of any comprehensive solutions for automatically generating an audiobook using Pandoc. This functionality is fairly new, and most audiobook generation services exist behind steep paywalls and assume static text, rather than the dynamic and frequently updated content of many books authored with Quarto and hosted on the web.
During the coding period, a contributor will:
- Develop a Quarto extension with the following functionality:
  - Based on the content of Quarto markdown files (.qmd), generate mp3 files using a TTS API.
  - Modify the Quarto markdown files or Pandoc-generated HTML files in order to link the mp3 files into the page. (Note: This functionality doesn't need to be implemented from scratch if something like gitbook audio will work.)
  - Provide markup modifications to indicate the presence of built-in accessible content.
- Document the Quarto extension with a README, vignette, and usage examples.
- Provide tests and error handling that identify where in the pipeline a failure occurs, to facilitate debugging of this admittedly somewhat complex pipeline.
- (Stretch goal) Leverage Quarto's cache system or git to identify when chunks need to be re-generated in order to be more efficient with API calls/resources.
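One possible approach to the stretch goal, sketched in Python under stated assumptions: instead of Quarto's cache or git, this illustration keeps a small JSON manifest of content hashes and only sends chunks to the API when their hash is not yet recorded. The names `chunk_key`, `stale_chunks`, `record_chunks`, and the manifest file layout are all hypothetical. The voice name is included in the hash so that switching voices also triggers regeneration.

```python
import hashlib
import json
from pathlib import Path


def chunk_key(text: str, voice: str = "default") -> str:
    """Stable identifier for a chunk: a hash of the voice name and the text."""
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return digest[:16]


def _load_manifest(manifest_path: Path) -> set[str]:
    """Read the set of known chunk keys, or an empty set if no manifest exists."""
    if manifest_path.exists():
        return set(json.loads(manifest_path.read_text(encoding="utf-8")))
    return set()


def stale_chunks(chunks: list[str], manifest_path: Path, voice: str = "default") -> list[str]:
    """Return the chunks whose key is not yet recorded in the manifest."""
    known = _load_manifest(manifest_path)
    return [c for c in chunks if chunk_key(c, voice) not in known]


def record_chunks(chunks: list[str], manifest_path: Path, voice: str = "default") -> None:
    """Record chunks as generated so later runs can skip them."""
    known = _load_manifest(manifest_path)
    known.update(chunk_key(c, voice) for c in chunks)
    manifest_path.write_text(json.dumps(sorted(known)), encoding="utf-8")
```

A render script would call `stale_chunks` before contacting the API and `record_chunks` only after the audio for those chunks has been written successfully, so a failed run does not mark chunks as done.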
This project will make Quarto-generated products more accessible not only for visually impaired individuals but also for people who prefer to consume content via audio. It will also provide features that will set Quarto apart from other ebook-generation platforms.
MENTORS: fill in this part. Each project needs 2 mentors. One should be an expert R programmer with previous package development experience, and the other can be a domain expert in some other field or application area (optimization, bioinformatics, machine learning, data viz, etc). Ideally one of the two mentors should have previous experience with GSOC (either as a contributor or mentor). Please provide contact info for each mentor, along with qualifications.
IMPORTANT: you MUST write "EVALUATING" for one mentor, who will be required to do the three evaluations of the contributor during the summer. In previous years we have had issues with mentors who do not fill in evaluations, and when this happens the R project is penalized (money is taken away), although contributors are not penalized (contributors are passed by default if no mentor eval is submitted). Therefore one mentor must take responsibility for doing the evaluations, and you must indicate that here, and your contributor must indicate that as well in the application. If it is not clear which mentor will be the EVALUATING mentor then your project will not be accepted. Example:
Contributors, please contact mentors below after completing at least one of the tests below.
- EVALUATING MENTOR: Susan Vanderplas [email protected] is a seasoned R-project GSOC mentor and former R-project GSOC participant who has written several niche R packages and contributed to more well-known R packages including animint, ggpcp, youdrawitR, x3ptools, and bulletxtrctr.
- Mine Çetinkaya-Rundel [email protected] works with Posit, PBC as a Developer Educator with the Tidyverse and Quarto teams.
- Heike Hofmann [email protected] is an R-project GSOC mentor for projects including ggpcp and youdrawitR.
- Easy:
  - Create a Quarto book that will be compiled to HTML.
- Medium:
  - Sign up for a text-to-speech API and demonstrate use of the API to convert a paragraph of text to mp3 with R code. This does not have to be a pay-for-use API like OpenAI's TTS option: free options (VoiceRSS, Large Text-to-Speech) sound more mechanical but will demonstrate the skill required to complete this task.
  - Create a function (with tests and documentation) that takes a Quarto markdown file path as input and outputs audio files to a folder. Use this code with pre- and post-render scripts to systematically generate audio files for a Quarto book or webpage.
- Hard:
  - Write a Lua filter which will modify rendered HTML to add an mp3 filename corresponding to a paragraph of text.
  - Incorporate a JavaScript library into a Quarto book which will play linked audio files inside the rendered HTML book/website (using an extension or linking in the JS file directly).
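To make the Medium and Hard tasks more concrete, here is a rough sketch of the text-preparation logic they share: pulling narratable paragraphs out of a .qmd source and assigning each one a predictable mp3 filename that a filter could later attach to the matching paragraph. It is written in Python purely for illustration (the tests themselves ask for R code and a Lua filter), and the function names, the regex-based parsing, and the `stem-NNN.mp3` naming scheme are all assumptions rather than part of the project specification. A real implementation would more likely work on the Pandoc AST than on raw text.

```python
import re
from pathlib import Path


def narratable_paragraphs(qmd_source: str) -> list[str]:
    """Return prose paragraphs from .qmd text, skipping YAML and code chunks."""
    # Drop a leading YAML front matter block delimited by --- lines.
    body = re.sub(r"\A---\n.*?\n---\n", "", qmd_source, flags=re.DOTALL)
    # Drop fenced code blocks (three backticks), including executable chunks.
    body = re.sub(r"^`{3}.*?^`{3}[ \t]*$", "", body,
                  flags=re.DOTALL | re.MULTILINE)
    paragraphs = []
    # Paragraphs are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", body):
        # Collapse internal line breaks and strip heading markers.
        text = " ".join(block.split()).lstrip("# ").strip()
        if text:
            paragraphs.append(text)
    return paragraphs


def audio_filenames(qmd_path: str, paragraphs: list[str]) -> list[str]:
    """Assign one mp3 filename per paragraph, e.g. intro-001.mp3."""
    stem = Path(qmd_path).stem
    return [f"{stem}-{i:03d}.mp3" for i, _ in enumerate(paragraphs, start=1)]
```

Because the filenames depend only on the file stem and the paragraph position, the audio-generation step and the filter that links audio into the page can agree on names without sharing any state. The trade-off is that inserting a paragraph shifts every later filename.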
Contributors, please post a link to your test results here.
- EXAMPLE CONTRIBUTOR 1 NAME, LINK TO GITHUB PROFILE, LINK TO TEST RESULTS.
- Lydia Gibson | GitHub Profile | Quarto Book | GitHub Repo
- Quynh Nguyen | GitHub Profile | Quarto Book | GitHub Repo