feat(firestore-bigquery-export): Add Gemini agent to gen-schema-view #2242

cabljac · 2024-12-18T14:34:32Z

This PR adds the option to use a Gemini Agent to generate the schema files, fixes some tests, and corrects a parameter description.

firestore-bigquery-export/scripts/gen-schema-view/src/schema/index.ts

Gustolandia · 2024-12-19T14:57:11Z

Testing

Deeply Nested Documents:

No explicit test cases for collections with deeply nested fields, such as maps within arrays or arrays within maps.

Large Collections:

There are no tests for large collections that exceed the agent-sample-size. A test case should simulate the agent gracefully sampling the first N documents and stopping without errors.

test("should handle collections with more than 100 documents", async () => {
  const largeSampleData = Array.from({ length: 200 }, (_, i) => ({ id: i, value: `doc${i}` }));
  const response = await runAgent(apiKey, "./schemas", "large_collection", "test_table", largeSampleData);
  expect(response).toBeDefined();
  expect(response.success).toBeTruthy();
});

cabljac · 2024-12-19T15:11:09Z

Deeply Nested Documents:

No explicit test cases for collections with deeply nested fields, such as maps within arrays or arrays within maps.

Suggestion: Add a test in e2e.test.ts with a deeply nested document like:

const sampleData = [
    {
        user: {
            profile: {
                name: "Alice",
                age: 25,
            },
            settings: {
                notifications: {
                    email: true,
                    push: false,
                },
            },
        },
    },
];
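The sample above nests only maps inside maps. A variant that also covers the "maps within arrays or arrays within maps" case could look like the sketch below. The field names and the `nestingDepth` helper are hypothetical, added here only to confirm that a fixture really is as deep as intended; neither comes from the PR.

```typescript
// Hypothetical fixture: field names are illustrative only.
const nestedSampleData = [
  {
    user: {
      // Array within a map
      tags: ["admin", "beta"],
      // Maps within an array, each holding another array
      devices: [
        { id: "phone", tokens: ["a1", "b2"] },
        { id: "laptop", tokens: [] },
      ],
    },
  },
];

// Returns the deepest level of map/array nesting found in a value.
// Primitives (and null) contribute no nesting.
function nestingDepth(value: unknown): number {
  if (Array.isArray(value)) {
    return 1 + Math.max(0, ...value.map((item) => nestingDepth(item)));
  }
  if (value !== null && typeof value === "object") {
    return 1 + Math.max(0, ...Object.values(value).map((item) => nestingDepth(item)));
  }
  return 0;
}
```

A test could assert on `nestingDepth(nestedSampleData[0])` before handing the fixture to the schema generator, so the fixture cannot silently become shallow after later edits.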

Large Collections:

There are no tests for large collections that exceed the agent-sample-size. A test case should simulate the agent gracefully sampling the first N documents and stopping without errors.

test("should handle collections with more than 100 documents", async () => {
    const largeSampleData = Array.from({ length: 200 }, (_, i) => ({ id: i, value: `doc${i}` }));
    const response = await runAgent(apiKey, "./schemas", "large_collection", "test_table", largeSampleData);
    expect(response).toBeDefined();
    expect(response.success).toBeTruthy();
});

I think I'll leave these out; they're fair points but not relevant to this PR in particular.

Actually the "Large Collections" comment isn't correct; we're taking just a sample

Gustolandia · 2024-12-19T15:23:26Z

Actually the "Large Collections" comment isn't correct; we're taking just a sample

The current tests don't appear to verify that sampling handles large collections correctly. They should explicitly check that only the specified number of documents (agent-sample-size) is processed, regardless of the total size of the collection.

What I wrote should be like this instead (apologies):

  test("should handle collections with more than 100 documents by sampling", async () => {
      const largeSampleData = Array.from({ length: 200 }, (_, i) => ({ id: i, value: `doc${i}` }));
      const agentSampleSize = 50; // Example sample size
      const response = await runAgent(apiKey, "./schemas", "large_collection", "test_table", largeSampleData);

      // Verify correct sampling behavior
      const sampledData = largeSampleData.slice(0, agentSampleSize); // Expected sample
      expect(response).toBeDefined();
      expect(response.success).toBeTruthy();

      // Assert sampled data is passed for schema generation
      expect(response.schemaGeneratedFrom).toEqual(sampledData); // Assuming `response.schemaGeneratedFrom` reflects sampled data
  });
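The test above depends on a `schemaGeneratedFrom` field that the real response may not have, and it never passes `agentSampleSize` to `runAgent`. One alternative is to check the sampling step in isolation. The sketch below assumes a hypothetical `sampleDocuments` helper that takes the first N documents; the extension's actual sampling code and the `runAgent` signature may differ.

```typescript
// Hypothetical helper: returns at most `sampleSize` documents from the front
// of the collection. An assumption for illustration, not the extension's
// actual sampling implementation.
function sampleDocuments<T>(docs: readonly T[], sampleSize: number): T[] {
  if (!Number.isInteger(sampleSize) || sampleSize < 0) {
    throw new RangeError(`sampleSize must be a non-negative integer, got ${sampleSize}`);
  }
  return docs.slice(0, sampleSize);
}

// Same fixture shape as in the comment above.
const largeSampleData = Array.from({ length: 200 }, (_, i) => ({ id: i, value: `doc${i}` }));
const agentSampleSize = 50; // example value for agent-sample-size

// Only the first `agentSampleSize` documents should reach schema generation.
const sampled = sampleDocuments(largeSampleData, agentSampleSize);
```

Keeping the sampling step in a pure function like this would let a unit test assert on the sample size without calling Gemini at all.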

firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md

pr-Mais · 2024-12-23T16:06:10Z

+
+```bash
+# Interactive mode
+npx @firebaseextensions/fs-bq-schema-views


I think it's better to include a minimal copy-paste usage example here, e.g. add --use-gemini-agent and the other required params.

We have a non-interactive example below. Can you explain what you mean? I'm confused.

pr-Mais · 2024-12-23T16:10:20Z

firestore-bigquery-export/scripts/gen-schema-view/src/index.ts

-      config.schemas[schemaName]
-    );
+
+  if (config.useGemini) {


Should we check whether the schema file already exists and exit?

Yes, good catch.
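A guard along the lines discussed here might look like the following sketch. The function name, the parameter names, and the `.json` file extension are assumptions for illustration; the check that was actually merged into the script may be structured differently.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical guard: reports whether generating `schemaName` would overwrite
// an existing file in `schemaDirectory`. Names and the ".json" extension are
// assumptions, not taken from the PR.
function schemaFileExists(schemaDirectory: string, schemaName: string): boolean {
  const filePath = path.join(schemaDirectory, `${schemaName}.json`);
  return fs.existsSync(filePath);
}

// Possible use before invoking Gemini (illustrative only):
//
// if (config.useGemini && schemaFileExists("./schemas", "users")) {
//   console.error("Schema file already exists; refusing to overwrite.");
//   process.exit(1);
// }
```

Exiting before the model is called avoids spending an API request on output that would then be discarded or would clobber a hand-edited schema.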

cabljac · 2025-03-10T09:14:48Z

firestore-bigquery-export/scripts/gen-schema-view/src/__tests__/e2e/e2e.legacy.test.ts

+    expect(result[0].name).toBe("test");
+  }, 80000);
+
+  test("should generate a schame view based on a nestedMapSchema dataset and schema", async () => {


Suggested change

-  test("should generate a schame view based on a nestedMapSchema dataset and schema", async () => {
+  test("should generate a schema view based on a nestedMapSchema dataset and schema", async () => {

cabljac commented Dec 19, 2024 on firestore-bigquery-export/scripts/gen-schema-view/src/schema/index.ts (outdated, resolved)

cabljac marked this pull request as ready for review December 19, 2024 12:12

cabljac requested a review from a team as a code owner December 19, 2024 12:12

cabljac changed the title ~~@invertase/gen schema agent~~ feat(firestore-bigquery-export: add Gemini agent to gen-schema-view Dec 19, 2024

Ehesp changed the title ~~feat(firestore-bigquery-export: add Gemini agent to gen-schema-view~~ feat(firestore-bigquery-export): Add Gemini agent to gen-schema-view Dec 19, 2024

pr-Mais reviewed Dec 23, 2024 on firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md (outdated, resolved)

pr-Mais suggested changes Dec 23, 2024

cabljac force-pushed the next branch 2 times, most recently from 32631be to bc45fb1 Compare February 3, 2025 13:01

cabljac mentioned this pull request Mar 3, 2025

[gen-schema-views]: use Gemini to automatically create schema file #2313

Closed

cabljac added 4 commits March 3, 2025 11:03

chore: run npm audit fix --force

41978c4

chore: run npm audit fix --force again

6967401

feat(firestore-bigquery-export): add AI agent option to gen-schema-script

b238d33

feat(firestore-bigquery-export): add human-in-the-loop and docs to gen schema agent

0ccb55e

cabljac force-pushed the @invertase/gen-schema-agent branch from cbe021b to 39e704e Compare March 3, 2025 11:05

CorieW force-pushed the @invertase/gen-schema-agent branch 2 times, most recently from fdebfd5 to 9c95a3f Compare March 6, 2025 07:23

cabljac force-pushed the @invertase/gen-schema-agent branch 2 times, most recently from 9d846a2 to 07624aa Compare March 7, 2025 14:55

cabljac commented Mar 10, 2025

cabljac and others added 7 commits March 10, 2025 09:34

chore(gen-schema-view): add todo for checking the table prefix option

08c7439

test(firestore-bigquery-export): update schema-loader-utils tests

0c2fa91

test(firestore-bigquery-export): fix e2e tests

c8d34ff

Update firestore-bigquery-export/scripts/gen-schema-view/src/schema/index.ts

2085a5b

chore(gen-schema-view): format

44f1724

refactor(firestore-bigquery-export): update gen-schema gemini approach

f37cc30

chore(firestore-bigquery-export): remove traces of ai agent wording

b90661f

cabljac and others added 7 commits March 10, 2025 09:34

Update firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md

1be7755

Co-authored-by: Mais Alheraki <[email protected]>

refactor(gen-schema-view): extract config parsing to their own modules and update gemini model import

f7a9aae

refactor(gen-schema-view): simplify genkit flow

3940aba

fix(gen-schema-view): made some changes

d8f28cb

fix(gen-schema-view): get rid of some redundancy and fix problem with validating existing files when using gemini

3016862

WIP

d2e1901

test(gen-schema-view): fix e2e testing

7f395b6

cabljac force-pushed the @invertase/gen-schema-agent branch from 07624aa to 7f395b6 Compare March 10, 2025 09:35

cabljac mentioned this pull request Mar 10, 2025

refactor(gen-schema-view): extract config parsing to their own modules #2316

Closed

cabljac added 2 commits March 10, 2025 15:35

test(gen-schema-view): add config testing

6d19d5b

fix(gen-schema-view): update complete gen log and point to filename in gemini

2a98f18

cabljac closed this Mar 14, 2025
