
fix: follow up issue #13 data ingestion and parsing #16

Open

nrvim wants to merge 5 commits into main from dev

Conversation


@nrvim nrvim commented Apr 1, 2026

Summary

Follow-up for #13.

v0.1.8 fixed the weight date-format bug, but users still reported missing weight rows, empty calories, empty body battery payloads, and bad activity rows. This PR focuses on the collector and normalization path that v0.1.8 did not change.

What changed since v0.1.8

Collector changes

  • Remove the activities_range GraphQL fetch from the sync path
  • Add optional raw-response capture with --save-raw
  • Preserve multiple raw captures instead of overwriting them
  • Update Garmin Connect URL from /modern/ to /app/ (old one now 302-redirects)

Ingestion and persistence fixes

  • Reject wrapper and error payloads when ingesting activities
  • Unwrap nested activity and weight payload shapes before insert
  • Improve weight persistence to retain timestamp and source data
  • Backfill and merge calories from daily summary data
  • Extract training status from the newer nested Garmin payload shape
  • Persist HRV timeline data and correct reading counts
  • Persist lactate threshold data
  • Extract running dynamics from activity detail payloads
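
The reject-and-unwrap step for activity payloads can be illustrated like this. It is a sketch under assumptions: beyond common Garmin wrapper keys, the field names here are hypothetical and not the PR's actual code.

```python
def unwrap_activity_payload(payload):
    """Normalize the several shapes an activity-list response can take.

    Returns a list of activity dicts; error envelopes and junk entries
    yield an empty list instead of being persisted as bad rows.
    """
    if not isinstance(payload, (dict, list)):
        return []
    if isinstance(payload, dict):
        # Reject explicit error envelopes rather than inserting them.
        if "error" in payload or "errorMessage" in payload:
            return []
        # Unwrap common wrapper keys before giving up (names are assumptions).
        for key in ("activityList", "activities", "payload"):
            inner = payload.get(key)
            if isinstance(inner, list):
                return [a for a in inner if isinstance(a, dict)]
        return []
    # Already a bare list: keep only dict entries.
    return [a for a in payload if isinstance(a, dict)]
```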

Code quality (Copilot review follow-ups)

  • Gate startup backfill/cleanup routines — only run when rows actually need it
  • Document Garmin's hearRate API typo in lactate threshold endpoint
  • Remove unused import

Local validation

  • Incremental sync completes successfully
  • Full re-authentication with MFA tested
  • ruff check and ruff format pass
  • All imports OK

Refs #13

Copilot AI review requested due to automatic review settings April 1, 2026 23:33

Copilot AI left a comment


Pull request overview

This PR is a follow-up to #13 focused on fixing Garmin data ingestion/normalization issues (missing weight rows, empty calories, bad activity rows, and incomplete training/HRV payload handling) and adding optional raw response capture for debugging.

Changes:

  • Add --save-raw support to preserve raw API payloads (without overwriting prior captures) to aid debugging/replay.
  • Harden ingestion/normalization for activities, weight, calories, training status, HRV timeline, lactate threshold, and running dynamics; update MCP tools to surface new fields.
  • Extend SQLite schema and add migrations/backfills/cleanup steps to correct existing persisted data.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 8 comments.

Summary per file:

  • garmin_mcp/sync.py: Adds save_raw option to incremental sync and forwards it into the client fetch path.
  • garmin_mcp/server.py: Exposes new persisted fields (training status loads, running dynamics) and improves weight ordering/fields returned.
  • garmin_mcp/db.py: Adds new tables/columns, new upsert/backfill/cleanup logic, and stronger payload unwrapping/extraction for ingestion.
  • garmin_givemydata.py: Adds --save-raw CLI flag and plumbs it into direct-to-DB fetching.
  • garmin_client/endpoints.py: Updates endpoint set: adds lactate threshold, nutrition meals, HRV timeline; removes activities_range GraphQL query.
  • garmin_client/client.py: Improves session refresh behavior and implements raw response capture to debug/raw.
  • .gitignore: Ignores debug/ output directory.


garmin_mcp/db.py Outdated
Comment on lines +659 to +665
conn.executescript(_SCHEMA_SQL)
migrate_weight_table(conn)
migrate_device_table(conn)
migrate_training_status_table(conn)
cleanup_invalid_rows(conn)
backfill_hrv_timeline_counts(conn)
backfill_calories_from_daily_summaries(conn)

Copilot AI Apr 1, 2026


init_db() runs cleanup/backfill routines (cleanup_invalid_rows, backfill_hrv_timeline_counts, backfill_calories_from_daily_summaries) on every startup. At least backfill_hrv_timeline_counts does a full-table UPDATE, which can cause unnecessary writes/WAL growth and slow startup as the DB grows. Consider gating these operations (e.g., only when a migration is needed, only for rows with NULL/0 values, or behind an explicit one-time migration flag/version).
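
One way to implement the gating this comment suggests is a cheap existence probe followed by a WHERE-restricted UPDATE, so a clean database does no writes at all. This is a sketch against a hypothetical hrv(date, readings, reading_count) schema, not the PR's actual table:

```python
import sqlite3

def backfill_hrv_timeline_counts(conn: sqlite3.Connection) -> int:
    """Gated backfill sketch: touch only rows whose count is missing."""
    # Cheap probe first: skip the UPDATE (and any WAL growth) when nothing
    # needs fixing, so startup stays fast as the DB grows.
    needs_fix = conn.execute(
        "SELECT 1 FROM hrv WHERE reading_count IS NULL OR reading_count = 0 LIMIT 1"
    ).fetchone()
    if needs_fix is None:
        return 0
    # Heuristic count: readings stored as a comma-separated string.
    cur = conn.execute(
        """
        UPDATE hrv
        SET reading_count = LENGTH(readings) - LENGTH(REPLACE(readings, ',', '')) + 1
        WHERE reading_count IS NULL OR reading_count = 0
        """
    )
    return cur.rowcount
```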

garmin_mcp/db.py Outdated
Comment on lines +648 to +654
if "acute_load" not in cols:
    log.info("Migrating training_status table...")
    try:
        conn.execute("ALTER TABLE training_status ADD COLUMN acute_load REAL")
        conn.execute("ALTER TABLE training_status ADD COLUMN chronic_load REAL")
    except Exception as e:
        log.debug("Migration note: %s", e)

Copilot AI Apr 1, 2026


migrate_training_status_table() only checks for the presence of acute_load; if a DB is in a partially-migrated state (acute_load exists but chronic_load does not), this will skip migration and later queries/selects for chronic_load will fail. Consider checking/adding each column independently (and not treating the table as fully migrated until both columns exist).

Suggested change

-if "acute_load" not in cols:
-    log.info("Migrating training_status table...")
-    try:
-        conn.execute("ALTER TABLE training_status ADD COLUMN acute_load REAL")
-        conn.execute("ALTER TABLE training_status ADD COLUMN chronic_load REAL")
-    except Exception as e:
-        log.debug("Migration note: %s", e)
+missing_cols = []
+if "acute_load" not in cols:
+    missing_cols.append("acute_load")
+if "chronic_load" not in cols:
+    missing_cols.append("chronic_load")
+if missing_cols:
+    log.info(
+        "Migrating training_status table, adding columns: %s",
+        ", ".join(missing_cols),
+    )
+    try:
+        for col in missing_cols:
+            conn.execute(
+                f"ALTER TABLE training_status ADD COLUMN {col} REAL"
+            )
+    except sqlite3.OperationalError as e:
+        log.debug("Migration note (training_status): %s", e)

garmin_mcp/db.py Outdated
Comment on lines +674 to +681
if "battery_status" not in cols:
    log.info("Migrating device table...")
    try:
        conn.execute("ALTER TABLE device ADD COLUMN software_version TEXT")
        conn.execute("ALTER TABLE device ADD COLUMN battery_status TEXT")
        conn.execute("ALTER TABLE device ADD COLUMN battery_voltage REAL")
    except Exception as e:
        log.debug("Migration note: %s", e)

Copilot AI Apr 1, 2026


migrate_device_table() treats the table as migrated based solely on battery_status. If software_version already exists but battery_status/battery_voltage do not (partial migration), the first ALTER TABLE will raise and the remaining columns won't be added, leaving the schema inconsistent. Consider adding missing columns one-by-one (each in its own try/except) based on PRAGMA table_info results.

Suggested change

-if "battery_status" not in cols:
-    log.info("Migrating device table...")
-    try:
-        conn.execute("ALTER TABLE device ADD COLUMN software_version TEXT")
-        conn.execute("ALTER TABLE device ADD COLUMN battery_status TEXT")
-        conn.execute("ALTER TABLE device ADD COLUMN battery_voltage REAL")
-    except Exception as e:
-        log.debug("Migration note: %s", e)
+# Add any missing columns one-by-one so partial migrations are safely completed.
+missing_cols = [
+    name
+    for name in ("software_version", "battery_status", "battery_voltage")
+    if name not in cols
+]
+if missing_cols:
+    log.info("Migrating device table...")
+    if "software_version" not in cols:
+        try:
+            conn.execute("ALTER TABLE device ADD COLUMN software_version TEXT")
+        except Exception as e:
+            log.debug("Migration note (software_version): %s", e)
+    if "battery_status" not in cols:
+        try:
+            conn.execute("ALTER TABLE device ADD COLUMN battery_status TEXT")
+        except Exception as e:
+            log.debug("Migration note (battery_status): %s", e)
+    if "battery_voltage" not in cols:
+        try:
+            conn.execute("ALTER TABLE device ADD COLUMN battery_voltage REAL")
+        except Exception as e:
+            log.debug("Migration note (battery_voltage): %s", e)

(
    d,
    record.get("speed"),
    record.get("hearRate") or record.get("heartRate"),

Copilot AI Apr 1, 2026


The key hearRate looks like a typo; if the Garmin payload uses heartRate, this branch is unnecessary/confusing and risks masking real schema issues. Prefer using the correctly spelled heartRate key (and add a comment only if you’ve confirmed Garmin sometimes returns the misspelled key).

Suggested change

-record.get("hearRate") or record.get("heartRate"),
+record.get("heartRate"),

Activity IDs that already have detail data (splits, HR zones, weather).
These will be skipped during per-activity detail fetching.
save_raw : bool, default False
Whether to save raw JSON responses to the 'raw' directory.

Copilot AI Apr 1, 2026


The save_raw parameter docs say it saves to the 'raw' directory, but _save_raw() writes to <profile_dir>/../debug/raw (i.e., debug/raw under the data dir). Update the docstring to match the actual location so users can find captures.

Suggested change

-Whether to save raw JSON responses to the 'raw' directory.
+Whether to save raw JSON responses under the ``debug/raw`` directory
+(i.e., ``<profile_dir>/../debug/raw``).

ISO date string (``YYYY-MM-DD``) to treat as "today". Defaults to
the actual current date.
save_raw:
Whether to save raw JSON responses to the 'raw' directory.

Copilot AI Apr 1, 2026


The save_raw parameter docs say it saves to the 'raw' directory, but the client implementation writes to debug/raw under the data dir (next to browser_profile). Update the docstring/help text to match the actual output path so users can locate captures.

Suggested change

-Whether to save raw JSON responses to the 'raw' directory.
+Whether to save raw JSON responses under the ``debug/raw`` directory
+in the data directory (next to ``browser_profile``).

Comment on lines +684 to +693
def migrate_weight_table(conn: sqlite3.Connection) -> None:
    """Migrate the weight table to include the 'source' column and 'timestamp' PK if needed."""
    cursor = conn.execute("PRAGMA table_info(weight)")
    cols = {row["name"] for row in cursor.fetchall()}

    # Check if we need to migrate (either missing 'source' or wrong PK)
    # The new schema has 'timestamp' as PK.
    cursor = conn.execute("PRAGMA index_list(weight)")
    is_legacy = "timestamp" not in cols


Copilot AI Apr 1, 2026


migrate_weight_table() decides whether to migrate solely based on the presence of the timestamp column. If a DB ever ends up with timestamp present but source missing (partial/older schema), this migration will be skipped and upsert_weight() will later fail because it inserts into source. Consider handling missing source via an ALTER TABLE ... ADD COLUMN source path independently of the PK migration. (Also, the PRAGMA index_list(weight) cursor is currently unused and can be removed.)
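
The reviewer's "handle missing source independently of the PK migration" suggestion could look like the sketch below. The helper name is hypothetical; only the column names come from the discussion above.

```python
import sqlite3

def ensure_weight_source_column(conn: sqlite3.Connection) -> None:
    """Add the 'source' column if absent, independent of any PK rebuild.

    Hypothetical sketch: row[1] is the column name in PRAGMA table_info
    output, which works for both tuple rows and sqlite3.Row.
    """
    cols = {row[1] for row in conn.execute("PRAGMA table_info(weight)")}
    if "source" not in cols:
        conn.execute("ALTER TABLE weight ADD COLUMN source TEXT")
```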

garmin_mcp/db.py Outdated
Comment on lines +643 to +647
def migrate_training_status_table(conn: sqlite3.Connection) -> None:
    """Migrate the training_status table to include acute and chronic load columns."""
    cursor = conn.execute("PRAGMA table_info(training_status)")
    cols = {row["name"] for row in cursor.fetchall()}


Copilot AI Apr 1, 2026


migrate_*_table() uses row["name"] from PRAGMA results, which only works when the connection’s row_factory is sqlite3.Row. init_db() can be (and in CI is) called with a plain sqlite3.connect(':memory:') connection, where PRAGMA rows are tuples—this will raise at runtime. Consider either setting conn.row_factory = sqlite3.Row inside init_db(), or extracting PRAGMA column names using positional indexes (e.g., row[1]) to work with both tuple and Row connections.
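
The positional-index fix this comment describes can be sketched as a small helper. PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk), so index 1 works whether or not row_factory is set:

```python
import sqlite3

def table_columns(conn: sqlite3.Connection, table: str) -> set:
    """Row-factory-agnostic column lookup for migration checks.

    Indexing position 1 works for both plain tuple rows and sqlite3.Row,
    so this is safe on a bare sqlite3.connect(':memory:') connection.
    Note: `table` is interpolated directly, so it must be a trusted name.
    """
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
```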

nrvim added 3 commits April 8, 2026 20:55
…Rate typo

- Only run cleanup_invalid_rows, backfill_hrv_timeline_counts, and
  backfill_calories_from_daily_summaries when rows actually need it
- Add comment explaining Garmin's "hearRate" spelling in lactate threshold
- Remove unused Any import from client.py
nrvim added 2 commits April 8, 2026 21:04
/modern/ now 302-redirects to /app/. Extract into CONNECT_URL constant
to avoid 5 hardcoded URLs and an unnecessary redirect on every call.
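
A minimal sketch of that constant extraction follows. The base URL reflects the stated /modern/ to /app/ change; the helper name and example paths are illustrative assumptions, not the project's real routing.

```python
# One shared base URL instead of five hardcoded strings; the old /modern/
# base now answers with a 302 redirect to /app/ on every call.
CONNECT_URL = "https://connect.garmin.com/app"

def page_url(path: str) -> str:
    """Build a Connect page URL from the shared base (hypothetical helper)."""
    return f"{CONNECT_URL}/{path.lstrip('/')}"
```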
Without GARMIN_DATA_DIR, the MCP server falls back to
~/.garmin-givemydata/ which may not match the git clone location.

Refs #13 (SPooKfpv report)