feat(library): add new property _title for sources data
rhahao authored May 25, 2024
1 parent cba2446 commit 8e5dc5d
Showing 5 changed files with 100 additions and 97 deletions.
114 changes: 58 additions & 56 deletions README.md
@@ -1,4 +1,5 @@
 # JW EPUB Parser
+
 [![CI](https://github.com/sws2apps/jw-epub-parser/actions/workflows/ci.yml/badge.svg)](https://github.com/sws2apps/jw-epub-parser/actions/workflows/ci.yml)
 [![CD](https://github.com/sws2apps/jw-epub-parser/actions/workflows/deploy.yml/badge.svg)](https://github.com/sws2apps/jw-epub-parser/actions/workflows/deploy.yml)
 [![semantic-release: angular](https://img.shields.io/badge/semantic--release-angular-e10079?logo=semantic-release)](https://github.com/semantic-release/semantic-release)
@@ -11,6 +12,7 @@
 [![Vulnerabilities](https://sonarcloud.io/api/project_badges/measure?project=sws2apps_jw-epub-parser&metric=vulnerabilities)](https://sonarcloud.io/summary/new_code?id=sws2apps_jw-epub-parser)

 ![epub-badge@3x](https://github.com/sws2apps/jw-epub-parser/assets/80993061/c7d7c280-f838-4ff3-a021-d669de4e195c)
+
 #### An EPUB Parser to extract the needed source materials from Meeting Workbook and Watchtower Study EPUB files.

 ## Install
@@ -60,29 +62,38 @@ By calling the `loadEPUB` function, it will return an array of objects with the
 | mwb_weekly_bible_reading | string | Weekly Bible Reading |
 | mwb_song_first | integer | First song |
 | mwb_tgw_talk | string | 10 min talk title of the Treasures from God’s Word |
+| mwb_tgw_talk_title\* | string | 10 min talk full title of the Treasures from God’s Word |
 | mwb_tgw_bread | string | Bible Reading for student |
+| mwb_tgw_bread_title\* | string | Bible Reading assignment full title for student |
 | mwb_ayf_count | integer | Number of parts in Apply Yourself to the Field Ministry |
 | mwb_ayf_part1 | string | Part 1 in Apply Yourself to the Field Ministry |
 | mwb_ayf_part1_time\* | integer | Timing of Part 1 in Apply Yourself to the Field Ministry |
 | mwb_ayf_part1_type\* | string | Type of Part 1 in Apply Yourself to the Field Ministry |
+| mwb_ayf_part1_title\* | string | Assignment full title of Part 1 in Apply Yourself to the Field Ministry |
 | mwb_ayf_part2 | string | Part 2 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is 1 |
 | mwb_ayf_part2_time\* | integer | Timing of Part 2 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is 1 |
 | mwb_ayf_part2_type\* | string | Type of Part 2 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is 1 |
+| mwb_ayf_part2_title\* | string | Assignment full title of Part 2 in Apply Yourself to the Field Ministry |
 | mwb_ayf_part3 | string | Part 3 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is less than 3 |
 | mwb_ayf_part3_time\* | integer | Timing of Part 3 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is less than 3 |
 | mwb_ayf_part3_type\* | string | Type of Part 3 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is less than 3 |
+| mwb_ayf_part3_title\* | string | Assignment full title of Part 3 in Apply Yourself to the Field Ministry |
 | mwb_ayf_part4 | string | Part 4 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is less than 4 |
 | mwb_ayf_part4_time\* | integer | Timing of Part 4 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is less than 4 |
 | mwb_ayf_part4_type\* | string | Type of Part 4 in Apply Yourself to the Field Ministry. This property will not be available if `mwb_ayf_count` is less than 4 |
+| mwb_ayf_part4_title\* | string | Assignment full title of Part 4 in Apply Yourself to the Field Ministry |
 | mwb_song_middle | integer | Middle song |
 | mwb_lc_count | integer | Number of parts in Living as Christians |
 | mwb_lc_part1 | string | Part 1 in Living as Christians |
 | mwb_lc_part1_time\* | integer | Timing of Part 1 in Living as Christians |
 | mwb_lc_part1_content\* | string | Content of Part 1 in Living as Christians |
+| mwb_lc_part1_title\* | string | Full title of Part 1 in Living as Christians |
 | mwb_lc_part2 | string | Part 2 in Living as Christians. This property will not be available if `mwb_lc_count` is 1 |
 | mwb_lc_part2_time\* | integer | Timing of Part 2 in Living as Christians. This property will not be available if `mwb_lc_count` is 1 |
 | mwb_lc_part2_content\* | string | Content of Part 2 in Living as Christians. This property will not be available if `mwb_lc_count` is 1 |
+| mwb_lc_part2_title\* | string | Full title of Part 2 in Living as Christians |
 | mwb_lc_cbs | string | Congregation Bible Study source material |
+| mwb_lc_cbs_title\* | string | Congregation Bible Study assignment full title |
 | mwb_song_conclude | integer or string | Concluding song. When the song number is out of range, it will be the default text from the Meeting Workbook. |

 #### Watchtower Study Data
@@ -127,34 +138,31 @@ Here are how the results of this module look like:
 ```js
 [
   {
-    mwb_week_date: '2023/09/04',
-    mwb_week_date_locale: 'September 4-10',
-    mwb_weekly_bible_reading: 'ESTHER 1-2',
-    mwb_song_first: 137,
-    mwb_tgw_talk: '“Strive to Be Modest Like Esther”',
-    mwb_tgw_bread: 'Es 1:13-22 (th study 10)',
-    mwb_ayf_count: 3,
-    mwb_ayf_part1:
-      'Discussion. Play the video Initial Call: Kingdom​—Mt 6:9, 10. Stop the video at each pause, and ask the audience the questions that appear in the video.',
-    mwb_ayf_part1_time: 5,
-    mwb_ayf_part1_type: 'Initial Call Video',
-    mwb_ayf_part2: 'Begin with the sample conversation topic. Offer the Enjoy Life Forever! brochure. (th study 1)',
-    mwb_ayf_part2_time: 3,
-    mwb_ayf_part2_type: 'Initial Call',
-    mwb_ayf_part3: 'w20.11 12-14 ¶3-7​—Theme: Help From Jesus and the Angels. (th study 14)',
-    mwb_ayf_part3_time: 5,
-    mwb_ayf_part3_type: 'Talk',
-    mwb_song_middle: 106,
-    mwb_lc_count: 2,
-    mwb_lc_part1: 'What Your Peers Say​—Body Image',
-    mwb_lc_part1_time: 5,
-    mwb_lc_part1_content:
-      'Discussion. Play the video. Then ask the audience: Why can it be difficult to have a balanced view of our appearance?',
-    mwb_lc_part2: 'Organizational Accomplishments',
-    mwb_lc_part2_time: 10,
-    mwb_lc_part2_content: 'Play the Organizational Accomplishments video for September.',
-    mwb_lc_cbs: 'lff lesson 56 and endnotes 6 and 7',
-    mwb_song_conclude: 101,
+    mwb_week_date: '2024/07/01',
+    mwb_week_date_locale: 'JULY 1-7',
+    mwb_weekly_bible_reading: 'PSALMS 57-59',
+    mwb_song_first: 148,
+    mwb_tgw_talk: 'Jehovah Frustrates Those Who Oppose His People',
+    mwb_tgw_talk_title: '1. Jehovah Frustrates Those Who Oppose His People',
+    mwb_tgw_bread: 'Ps 59:1-17 (th study 12)',
+    mwb_tgw_bread_title: '3. Bible Reading',
+    mwb_ayf_count: 2,
+    mwb_ayf_part1: 'Discussion. Play the VIDEO, and then discuss lmd lesson 7 points 1-2.',
+    mwb_ayf_part1_time: 7,
+    mwb_ayf_part1_type: 'Perseverance​—What Paul Did',
+    mwb_ayf_part1_title: '4. Perseverance​—What Paul Did',
+    mwb_ayf_part2: 'Discussion based on lmd lesson 7 points 3-5 and “See Also.”',
+    mwb_ayf_part2_time: 8,
+    mwb_ayf_part2_type: 'Perseverance​—Imitate Paul',
+    mwb_ayf_part2_title: '5. Perseverance​—Imitate Paul',
+    mwb_song_middle: 65,
+    mwb_lc_count: 1,
+    mwb_lc_part1: 'Local Needs',
+    mwb_lc_part1_time: 15,
+    mwb_lc_part1_title: '6. Local Needs',
+    mwb_lc_cbs: 'bt chap. 12 ¶1-6, box on p. 96',
+    mwb_lc_cbs_title: '7. Congregation Bible Study',
+    mwb_song_conclude: 78
   },
   ...
 ]
@@ -165,11 +173,11 @@ Here are how the results of this module look like:
 ```js
 [
   {
-    w_study_date: '2023/11/06',
-    w_study_date_locale: 'Study Article 37: November 6-12, 2023',
-    w_study_title: 'Rely on Jehovah, as Samson Did',
-    w_study_opening_song: 30,
-    w_study_concluding_song: 3,
+    w_study_date: '2024/09/09',
+    w_study_date_locale: 'Study Article 27: September 9-15, 2024',
+    w_study_title: 'Be Courageous Like Zadok',
+    w_study_opening_song: 73,
+    w_study_concluding_song: 126
   },
   ...
 ]
@@ -182,25 +190,19 @@ Here are how the results of this module look like:
 ```js
 [
   {
-    mwb_week_date: '4-10 de septiembre',
-    mwb_weekly_bible_reading: 'ESTER 1, 2',
-    mwb_song_first: 137,
-    mwb_tgw_talk: '“Esfuércese por ser modesto como Ester” (10 mins.)',
-    mwb_tgw_bread: 'Lectura de la Biblia (4 mins.): Est 1:13-22 (th lec. 10).',
-    mwb_ayf_count: 3,
-    mwb_ayf_part1:
-      'Video de la primera conversación (5 mins.): Análisis con el auditorio. Ponga el video Primera conversación: El Reino (Mt 6:9, 10). Detenga el video en cada pausa y haga las preguntas que aparecen en él.',
-    mwb_ayf_part2:
-      'Primera conversación (3 mins.): Use el tema de las ideas para conversar. Luego ofrezca el folleto Disfrute de la vida (th lec. 1).',
-    mwb_ayf_part3: 'Discurso (5 mins.): w20.11 12-14 párrs. 3-7. Título: Jesús y los ángeles nos ayudan (th lec. 14).',
-    mwb_song_middle: 106,
-    mwb_lc_count: 2,
-    mwb_lc_part1:
-      'Lo que opinan otros jóvenes: La apariencia (5 mins.): Análisis con el auditorio. Ponga el video. Luego pregunte: ¿por qué puede ser difícil mantener una actitud equilibrada sobre nuestra apariencia física?',
-    mwb_lc_part2:
-      'Logros de la organización (10 mins.): Ponga el video Logros de la organización para el mes de septiembre.',
-    mwb_lc_cbs: 'Estudio bíblico de la congregación (30 mins.): lff lección 56 y notas 6 y 7.',
-    mwb_song_conclude: 101,
+    mwb_week_date: '7月1-7日',
+    mwb_weekly_bible_reading: '诗篇57-59篇',
+    mwb_song_first: 148,
+    mwb_tgw_talk: '1.耶和华不会让反对我们的人得逞 (10分钟)',
+    mwb_tgw_bread: '3.经文朗读 (4分钟)诗59:1-17(《教导》第12课)',
+    mwb_ayf_count: 2,
+    mwb_ayf_part1: '4.坚持不懈——保罗怎么做 (7分钟)节目包括讨论。先观看短片,然后讨论《爱心》第7课1-2点。',
+    mwb_ayf_part2: '5.坚持不懈——向保罗学习 (8分钟)讨论《爱心》第7课3-5点以及“请看”。',
+    mwb_song_middle: 65,
+    mwb_lc_count: 1,
+    mwb_lc_part1: '6.本地需要 (15分钟)',
+    mwb_lc_cbs: '7.会众研经班 (30分钟)《作见证》第12章1-6段以及96页的附栏',
+    mwb_song_conclude: 78
   },
   ...
 ]
@@ -211,10 +213,10 @@ Here are how the results of this module look like:
 ```js
 [
   {
-    w_study_date: 'Artículo de estudio 37 (del 6 al 12 de noviembre de 2023)',
-    w_study_title: 'Apóyese en Jehová, tal como lo hizo Sansón',
-    w_study_opening_song: 30,
-    w_study_concluding_song: 3,
+    w_study_date: '研究班课文27:2024年9月9-15日',
+    w_study_title: '效法撒督,显出勇气',
+    w_study_opening_song: 73,
+    w_study_concluding_song: 126
   },
   ...
 ]
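
In the English sample from the README changes, each new `_title` value is the part's heading with its number, for example `'6. Local Needs'` next to `mwb_lc_part1: 'Local Needs'`. A minimal sketch of how a consumer might split such a title into number and label, assuming that `N. Label` shape holds; `PartTitle` and `splitPartTitle` are hypothetical names and are not part of the library:

```ts
// Hypothetical helper, not part of jw-epub-parser.
type PartTitle = { num: number | null; label: string };

// Splits a title such as '6. Local Needs' into its number and label.
// Assumes an optional leading "<digits>." prefix; any other input is
// returned unchanged as the label with num set to null.
const splitPartTitle = (title: string): PartTitle => {
  const trimmed = title.trim();
  const match = /^(\d+)\.\s*(.*)$/.exec(trimmed);
  if (!match) return { num: null, label: trimmed };
  return { num: Number(match[1]), label: match[2] ?? '' };
};

const sample = splitPartTitle('6. Local Needs');
console.log(sample.num, sample.label); // 6 Local Needs
```

Titles in other languages may not follow the same numbering format, so the fallback branch keeps the original text rather than guessing.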
39 changes: 14 additions & 25 deletions src/common/html_utils.ts
@@ -128,36 +128,25 @@ export const getWStudyDate = (htmlItem: HTMLElement) => {
   return result!;
 };

-export const getWSTudySongs = async ({ htmlItem, zip }: { htmlItem: HTMLElement; zip: JSZip }) => {
-  const articleLink = htmlItem.nextElementSibling!.querySelector('a')!.getAttribute('href') as string;
-  const article = await getHTMLWTArticleDoc(zip, articleLink);
-
-  if (article) {
-    let songText;
-    const themeScrp = article.querySelector('.themeScrp')!;
-    songText = themeScrp.nextElementSibling;
-
-    if (songText === null) {
-      const firstSongContainer = article.querySelector('.du-color--textSubdued')!;
-      songText = firstSongContainer.querySelector('p');
-    }
+export const getWSTudySongs = (content: HTMLElement) => {
+  const pubRefs = content.querySelectorAll('.pubRefs');

-    const WTOpeningSong = extractSongNumber(songText!.textContent);
+  const openingSongText = pubRefs.at(0)!;
+  const w_study_opening_song = extractSongNumber(openingSongText.textContent) as number;

-    const blockTeach = article.querySelector('.blockTeach');
-    if (blockTeach !== null) {
-      songText = blockTeach.nextElementSibling;
-    }
+  let concludingSongText = <HTMLElement>pubRefs.at(-1);

-    if (blockTeach === null) {
-      const artDivs = article.querySelectorAll('.du-color--textSubdued');
-      songText = artDivs.slice(-1)[0].querySelector('p');
-    }
+  if (pubRefs.length === 2) {
+    const blockTeach = content.querySelector('.blockTeach');
+    concludingSongText = blockTeach!.nextElementSibling!;
+  }

-    const WTConcludingSong = extractSongNumber(songText!.textContent);
+  const w_study_concluding_song = extractSongNumber(concludingSongText.textContent) as number;

-    return { WTOpeningSong, WTConcludingSong };
-  }
+  return {
+    w_study_opening_song,
+    w_study_concluding_song,
+  };
 };

 export const getWStudyTitle = (htmlItem: HTMLElement) => {
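
The rewritten `getWSTudySongs` takes the opening song from the first `.pubRefs` element and the concluding song from the last one, except when exactly two `.pubRefs` exist, in which case it uses the element after `.blockTeach`. A self-contained sketch of that selection rule over plain strings instead of parsed HTML; `pickSongs` is a hypothetical name, and the `extractSongNumber` here is only a stand-in (the library's own implementation is not shown in this diff and may differ):

```ts
// Stand-in for the library's extractSongNumber.
// Assumption: the song number is the first run of digits in the text.
const extractSongNumber = (text: string): number | string => {
  const match = /\d+/.exec(text);
  return match ? Number(match[0]) : text;
};

// pubRefTexts: text content of each `.pubRefs` element, in document order.
// afterBlockTeach: text of the element following `.blockTeach`, or null.
const pickSongs = (pubRefTexts: string[], afterBlockTeach: string | null) => {
  if (pubRefTexts.length === 0) {
    throw new Error('no .pubRefs elements found');
  }

  const openingText = pubRefTexts[0];
  let concludingText = pubRefTexts[pubRefTexts.length - 1];

  // Same rule as the diff: with exactly two .pubRefs, read the concluding
  // song from the element that follows .blockTeach.
  if (pubRefTexts.length === 2 && afterBlockTeach !== null) {
    concludingText = afterBlockTeach;
  }

  return {
    w_study_opening_song: extractSongNumber(openingText),
    w_study_concluding_song: extractSongNumber(concludingText),
  };
};
```

Unlike the committed code, which uses non-null assertions (`blockTeach!.nextElementSibling!`), this sketch falls back to the last `.pubRefs` text when no `.blockTeach` sibling is available. That is a design choice of the example, not the behavior of the commit.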
33 changes: 18 additions & 15 deletions src/common/parser.ts
@@ -123,15 +123,19 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
   // 10min TGW Source
   tmpSrc = splits[3].trim();
   if (isEnhancedParsing) {
-    weekItem.mwb_tgw_talk = extractSourceEnhanced(tmpSrc, mwbLang).type;
+    const enhanced = extractSourceEnhanced(tmpSrc, mwbLang);
+    weekItem.mwb_tgw_talk = enhanced.type;
+    weekItem.mwb_tgw_talk_title = enhanced.fulltitle;
   } else {
     weekItem.mwb_tgw_talk = tmpSrc;
   }

   //Bible Reading Source
   tmpSrc = splits[7].trim();
   if (isEnhancedParsing) {
-    weekItem.mwb_tgw_bread = extractSourceEnhanced(tmpSrc, mwbLang).src;
+    const enhanced = extractSourceEnhanced(tmpSrc, mwbLang);
+    weekItem.mwb_tgw_bread = enhanced.src;
+    weekItem.mwb_tgw_bread_title = enhanced.fulltitle;
   } else {
     weekItem.mwb_tgw_bread = tmpSrc;
   }
@@ -149,6 +153,7 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
     weekItem.mwb_ayf_part1 = partEnhanced.src;
     weekItem.mwb_ayf_part1_time = partEnhanced.time;
     weekItem.mwb_ayf_part1_type = partEnhanced.type;
+    weekItem.mwb_ayf_part1_title = partEnhanced.fulltitle;
   } else {
     weekItem.mwb_ayf_part1 = tmpSrc;
   }
@@ -161,6 +166,7 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
     weekItem.mwb_ayf_part2 = partEnhanced.src;
     weekItem.mwb_ayf_part2_time = partEnhanced.time;
     weekItem.mwb_ayf_part2_type = partEnhanced.type;
+    weekItem.mwb_ayf_part2_title = partEnhanced.fulltitle;
   } else {
     weekItem.mwb_ayf_part2 = tmpSrc;
   }
@@ -174,6 +180,7 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
     weekItem.mwb_ayf_part3 = partEnhanced.src;
     weekItem.mwb_ayf_part3_time = partEnhanced.time;
     weekItem.mwb_ayf_part3_type = partEnhanced.type;
+    weekItem.mwb_ayf_part3_title = partEnhanced.fulltitle;
   } else {
     weekItem.mwb_ayf_part3 = tmpSrc;
   }
@@ -187,6 +194,7 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
     weekItem.mwb_ayf_part4 = partEnhanced.src;
     weekItem.mwb_ayf_part4_time = partEnhanced.time;
     weekItem.mwb_ayf_part4_type = partEnhanced.type;
+    weekItem.mwb_ayf_part4_title = partEnhanced.fulltitle;
   } else {
     weekItem.mwb_ayf_part4 = tmpSrc;
   }
@@ -210,6 +218,7 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
     const lcEnhanced = extractSourceEnhanced(tmpSrc, mwbLang);
     weekItem.mwb_lc_part1 = lcEnhanced.type;
     weekItem.mwb_lc_part1_time = lcEnhanced.time;
+    weekItem.mwb_lc_part1_title = lcEnhanced.fulltitle;
     if (lcEnhanced.src && lcEnhanced.src !== '') {
       weekItem.mwb_lc_part1_content = lcEnhanced.src;
     }
@@ -226,6 +235,7 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
     const lcEnhanced = extractSourceEnhanced(tmpSrc, mwbLang);
     weekItem.mwb_lc_part2 = lcEnhanced.type;
     weekItem.mwb_lc_part2_time = lcEnhanced.time;
+    weekItem.mwb_lc_part2_title = lcEnhanced.fulltitle;
     if (lcEnhanced.src && lcEnhanced.src !== '') {
       weekItem.mwb_lc_part2_content = lcEnhanced.src;
     }
@@ -239,7 +249,9 @@ export const parseMWBSchedule = (htmlItem: HTMLElement, mwbYear: number, mwbLang
   tmpSrc = splits[nextIndex].trim();

   if (isEnhancedParsing) {
-    weekItem.mwb_lc_cbs = extractSourceEnhanced(tmpSrc, mwbLang).src;
+    const enhanced = extractSourceEnhanced(tmpSrc, mwbLang);
+    weekItem.mwb_lc_cbs = enhanced.src;
+    weekItem.mwb_lc_cbs_title = enhanced.fulltitle;
   } else {
     weekItem.mwb_lc_cbs = tmpSrc;
   }
@@ -273,19 +285,10 @@ export const parseWSchedule = (article: HTMLElement, content: HTMLElement, wLang
   const studyTitle = getWStudyTitle(article);
   weekItem.w_study_title = studyTitle;

-  const pubRefs = content.querySelectorAll('.pubRefs');
+  const songs = getWSTudySongs(content);

-  const openingSongText = pubRefs.at(0)!;
-  weekItem.w_study_opening_song = extractSongNumber(openingSongText.textContent) as number;
-
-  let concludingSongText = <HTMLElement>pubRefs.at(-1);
-
-  if (pubRefs.length === 2) {
-    const blockTeach = content.querySelector('.blockTeach');
-    concludingSongText = blockTeach!.nextElementSibling!;
-  }
-
-  weekItem.w_study_concluding_song = extractSongNumber(concludingSongText.textContent) as number;
+  weekItem.w_study_opening_song = songs.w_study_opening_song;
+  weekItem.w_study_concluding_song = songs.w_study_concluding_song;

   return weekItem;
 };
2 changes: 1 addition & 1 deletion src/common/parsing_rules.ts
@@ -99,7 +99,7 @@ export const extractSourceEnhanced = (src: string, lang: string) => {

       assignment = assignment.replace(regexStartColumn, '').replace(regexEndColumn, '').trim();

-      result = { type: assignment, time: duration, src: source };
+      result = { type: assignment, time: duration, src: source, fulltitle: tmpAssignment };
     }
   }

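
With `fulltitle` added to the object built in `extractSourceEnhanced`, each enhanced branch in `parser.ts` copies it onto the matching `_title` property. A sketch of that mapping for one Apply Yourself to the Field Ministry part; the field names `type`, `time`, `src`, and `fulltitle` come from the diff, while `EnhancedSource`, `WeekItem`, and `assignAyfPart` are hypothetical and do not exist in the repository (the numeric type of `time` is assumed from the README table):

```ts
// Result shape of extractSourceEnhanced, as far as this diff shows it.
type EnhancedSource = { type: string; time: number; src: string; fulltitle: string };

// Loose stand-in for the library's week item type.
type WeekItem = Record<string, string | number>;

// Hypothetical helper: copies one enhanced AYF part onto the week item,
// mirroring the four assignments repeated for parts 1 to 4 in parser.ts.
const assignAyfPart = (weekItem: WeekItem, part: 1 | 2 | 3 | 4, enhanced: EnhancedSource): WeekItem => {
  const key = `mwb_ayf_part${part}`;
  weekItem[key] = enhanced.src;
  weekItem[`${key}_time`] = enhanced.time;
  weekItem[`${key}_type`] = enhanced.type;
  weekItem[`${key}_title`] = enhanced.fulltitle;
  return weekItem;
};

// Illustrative values only.
const week = assignAyfPart({}, 1, { type: 'Talk', time: 5, src: 'w20.11 12-14', fulltitle: '6. Talk' });
console.log(week.mwb_ayf_part1_title); // 6. Talk
```

The commit itself keeps the assignments inline per part; a helper like this is only one way the repetition could be expressed.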
