Modifying session variable breaks get_trackdata() #260

@kirbyj

I imagine there is some good reason for this, but I don't understand this behaviour:

> create_emuRdemoData(dir = tempdir())
> demo_data_dir = file.path(tempdir(), "emuR_demoData")
> tg_col_dir = file.path(demo_data_dir, "TextGrid_collection")
> path2directory = file.path(tempdir(), "my-first_emuDB")
> convert_TextGridCollection(dir = tg_col_dir,
                            dbName = "my-first",
                            targetDir = tempdir(),
                            tierNames = c("Word", "Syllable", "Phoneme", "Phonetic"))
> db_handle = load_emuDB(path2directory, verbose = FALSE)

Now create a seglist:

> sl_vowels = query(db_handle, "Phonetic == @")
> sl_vowels
# A tibble: 28 × 16
   labels start   end db_uuid  session bundle start_item_id end_item_id level attribute start_item_seq_… end_item_seq_idx type  sample_start sample_end
   <chr>  <dbl> <dbl> <chr>    <chr>   <chr>          <int>       <int> <chr> <chr>                <int>            <int> <chr>        <int>      <int>
 1 @      1506. 1548. af022ed… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124      30968
 2 @      1715. 1741. af022ed… 0000    msajc…           107         107 Phon… Phonetic                22               22 SEGM…        34309      34828
 3 @      1967. 2034. af022ed… 0000    msajc…           112         112 Phon… Phonetic                27               27 SEGM…        39334      40673
 4 @      2303. 2362. af022ed… 0000    msajc…           117         117 Phon… Phonetic                32               32 SEGM…        46059      47238
 5 @      2447. 2506. af022ed… 0000    msajc…           119         119 Phon… Phonetic                34               34 SEGM…        48949      50125
 6 @      1917. 1958. af022ed… 0000    msajc…           118         118 Phon… Phonetic                26               26 SEGM…        38340      39155
 7 @      2022. 2078. af022ed… 0000    msajc…           120         120 Phon… Phonetic                28               28 SEGM…        40439      41569
 8 @      2382. 2431. af022ed… 0000    msajc…           126         126 Phon… Phonetic                34               34 SEGM…        47650      48619
 9 @       330.  380. af022ed… 0000    msajc…            91          91 Phon… Phonetic                 3                3 SEGM…         6609       7590
10 @      1472. 1490. af022ed… 0000    msajc…           108         108 Phon… Phonetic                20               20 SEGM…        29441      29808
# … with 18 more rows, and 1 more variable: sample_rate <int>

Get trackdata:

> td_vowels = get_trackdata(db_handle,
                              seglist = sl_vowels,
                              onTheFlyFunctionName = "forest",
                              verbose = F)
> td_vowels
# A tibble: 287 × 24
   sl_rowIdx labels start   end db_uuid   session bundle start_item_id end_item_id level attribute start_item_seq_… end_item_seq_idx type  sample_start
       <int> <chr>  <dbl> <dbl> <chr>     <chr>   <chr>          <int>       <int> <chr> <chr>                <int>            <int> <chr>        <int>
 1         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 2         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 3         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 4         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 5         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 6         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 7         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 8         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
 9         1 @      1506. 1548. af022edb… 0000    msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124
10         2 @      1715. 1741. af022edb… 0000    msajc…           107         107 Phon… Phonetic                22               22 SEGM…        34309
# … with 277 more rows, and 9 more variables: sample_end <int>, sample_rate <int>, times_orig <dbl>, times_rel <dbl>, times_norm <dbl>, T1 <int>,
#   T2 <int>, T3 <int>, T4 <int>

So far, so good. Now I want session to have some other value:

> sl_vowels$session <- recode(sl_vowels$session, `0000` = "F1")
> sl_vowels
# A tibble: 28 × 16
   labels start   end db_uuid  session bundle start_item_id end_item_id level attribute start_item_seq_… end_item_seq_idx type  sample_start sample_end
   <chr>  <dbl> <dbl> <chr>    <chr>   <chr>          <int>       <int> <chr> <chr>                <int>            <int> <chr>        <int>      <int>
 1 @      1506. 1548. af022ed… F1      msajc…           103         103 Phon… Phonetic                18               18 SEGM…        30124      30968
 2 @      1715. 1741. af022ed… F1      msajc…           107         107 Phon… Phonetic                22               22 SEGM…        34309      34828
 3 @      1967. 2034. af022ed… F1      msajc…           112         112 Phon… Phonetic                27               27 SEGM…        39334      40673
 4 @      2303. 2362. af022ed… F1      msajc…           117         117 Phon… Phonetic                32               32 SEGM…        46059      47238
 5 @      2447. 2506. af022ed… F1      msajc…           119         119 Phon… Phonetic                34               34 SEGM…        48949      50125
 6 @      1917. 1958. af022ed… F1      msajc…           118         118 Phon… Phonetic                26               26 SEGM…        38340      39155
 7 @      2022. 2078. af022ed… F1      msajc…           120         120 Phon… Phonetic                28               28 SEGM…        40439      41569
 8 @      2382. 2431. af022ed… F1      msajc…           126         126 Phon… Phonetic                34               34 SEGM…        47650      48619
 9 @       330.  380. af022ed… F1      msajc…            91          91 Phon… Phonetic                 3                3 SEGM…         6609       7590
10 @      1472. 1490. af022ed… F1      msajc…           108         108 Phon… Phonetic                20               20 SEGM…        29441      29808
# … with 18 more rows, and 1 more variable: sample_rate <int>

Note that session is still a <chr>. But now:

> td_vowels = get_trackdata(db_handle,
+                              seglist = sl_vowels,
+                              onTheFlyFunctionName = "forest",
+                              verbose = F)
Error in get_trackdata(db_handle, seglist = sl_vowels, onTheFlyFunctionName = "forest",  : 
  Following utts entry not found: 
In addition: Warning messages:
1: In get_trackdata(db_handle, seglist = sl_vowels, onTheFlyFunctionName = "forest",  :
  The emusegs/emuRsegs object passed in refers to bundles with in-homogeneous sampling rates in their audio files! Here is a list of all refered to bundles incl. their sampling rate: 
[1] session        bundle         media_file     sample_rate    md5_annot_json
<0 rows> (or 0-length row.names)
2: Unknown or uninitialised column: `utts`. 

I am willing to accept that part of the solution here is "don't mess with session", but at the very least this seems like the wrong error message: nothing has touched the audio files or their sample rates.
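
One possible workaround, sketched here on the assumption (suggested by the empty bundle list in the warning, but not confirmed by the maintainers) that get_trackdata() looks up each row's session and bundle by name in the database: leave the seglist untouched and relabel session only in the returned trackdata. The helper `relabel_session()` and the mock data frame below are hypothetical and not part of emuR; the mock stands in for the get_trackdata() result so the snippet runs without a database.

```r
# Hypothetical helper (not part of emuR): relabel the session column of a
# data frame, e.g. one that get_trackdata() has already returned.
relabel_session <- function(df, mapping) {
  # mapping is a named character vector: names are old labels,
  # values are new labels, e.g. c("0000" = "F1")
  hit <- df$session %in% names(mapping)
  df$session[hit] <- unname(mapping[df$session[hit]])
  df
}

# Mock stand-in for a get_trackdata() result; bundle names are made up
# and only the columns needed for the illustration are included.
td_mock <- data.frame(
  sl_rowIdx = c(1L, 1L, 2L),
  session   = c("0000", "0000", "0000"),
  bundle    = c("bndl1", "bndl1", "bndl2"),
  stringsAsFactors = FALSE
)

# Relabel after extraction, so the original session names are still
# present at the point where the database lookup happens.
td_relabelled <- relabel_session(td_mock, c("0000" = "F1"))
print(unique(td_relabelled$session))
```

With the real data this would amount to calling get_trackdata() with the unmodified sl_vowels and then running `td_vowels = relabel_session(td_vowels, c("0000" = "F1"))`. This only sidesteps the failure; it does not address the misleading error message.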
