Injecting or disrupting cadence in TTS output #5

Single spaces in text that is to be converted to speech appear to have two effects on the output. First, under voice cloning, pitch may increase, seemingly because of the 'time of a space' when the text is rendered as speech.
(This can be corrected afterwards with `ffmpeg` if it was built with `rubberband`, though this issue hopes to avoid that extra post-processing step.)
Second, the uniformity of single spacing between words sounds unnatural to the ear. Several experiments suggest that, on the input-text side, the number of spaces between words can only be increased from 1 to 2. At 3 or more spaces the output becomes chaotic (single letters are pronounced, interspersed among whole words). The following functions work on the text to be sent through the TTS process and are limited to changing 1 space to 2 in some places:

```r
cadence_for2 <- \(sent) {  # sentence(s) or a paragraph; 'some' of the spaces are doubled
  space_where <- unlist(gregexpr(' ', sent, fixed = TRUE))  # character positions of the spaces
  sent_split <- unlist(strsplit(sent, split = ''))          # one element per character
  # keep only the spaces whose character position is divisible by 3
  space_mod3 <- space_where[space_where %% 3 == 0]
  for (k in seq_along(space_mod3)) {  # seq_along() skips the loop when nothing qualifies
    sent_split[space_mod3[k]] <- '  '  # swap to 2 spaces
  }
  sent_for2 <- paste0(sent_split, collapse = '')
  return(sent_for2)
}
```

Then pass the returned text to `tts_chunked`.
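Note that `cadence_for2` selects spaces by their character position, so which spaces are doubled depends on word lengths. If the intent is instead to lengthen every third space, a variant could count the spaces themselves. The following is only a sketch under that assumption; `double_every_nth_space` and its parameter `n` are hypothetical names, and only base R is used:

```r
# Hypothetical variant: double every n-th space, counting spaces rather than
# character positions. Assumes `sent` is a single string.
double_every_nth_space <- function(sent, n = 3L) {
  chars <- strsplit(sent, split = '', fixed = TRUE)[[1]]  # one element per character
  space_idx <- which(chars == ' ')                        # positions of all spaces
  pick <- space_idx[seq_along(space_idx) %% n == 0]       # the n-th, 2n-th, ... space
  chars[pick] <- '  '                                     # swap to 2 spaces
  paste0(chars, collapse = '')
}

double_every_nth_space('a b c d e f g')
# "a b c  d e f  g"
```

As with the function above, the result would then go to `tts_chunked`.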

```r
cadence2 <- \(sent) {  # sentence(s); every space encountered may be replaced
  spaces <- c(' ', '  ')  # candidate separators: 1 or 2 spaces
  num_space <- length(unlist(gregexpr(' ', sent, fixed = TRUE)))
  set.seed(42)  # fixed seed: the same input always gets the same spacing
  # Interleave words and sampled separators. There is usually one more word than
  # separators, so rbind() recycles the first separator (hence suppressWarnings).
  inter <- suppressWarnings(rbind(unlist(strsplit(sent, split = ' ', fixed = TRUE)),
                                  sample(spaces, num_space, replace = TRUE)))
  attributes(inter) <- NULL  # flatten; the padded ending is useful in tts_chunked as between-sentence space
  sent_cad2 <- paste0(inter, collapse = '')
  return(sent_cad2)
}
```

Then pass the returned text to `tts_chunked`.
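Because runs of 3 or more spaces were observed to garble the output, it may be worth guarding the text just before it is handed to `tts_chunked`, for example when the input already contains double spaces. The helper below is a sketch, not part of any TTS package; `clamp_spaces` and `max_run` are hypothetical names, and it relies only on base R `gsub`:

```r
# Hypothetical helper: cap every run of spaces at max_run (default 2).
clamp_spaces <- function(text, max_run = 2L) {
  stopifnot(is.character(text), max_run >= 1L)
  # match any run longer than max_run spaces, e.g. ' {3,}' when max_run is 2
  pattern <- sprintf(' {%d,}', as.integer(max_run) + 1L)
  gsub(pattern, strrep(' ', max_run), text)
}

clamp_spaces('one  two   three     four')
# "one  two  three  four"
```

Runs of 1 or 2 spaces are left untouched, so the variation introduced by the functions above is preserved.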

Are there model-side variants of the special character [SPACE] that could be invoked to introduce more variability in inter-word spacing?
