Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add operation to read and store data from definition blob store instead of db #82

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ajhenry
Copy link

@ajhenry ajhenry commented Jun 24, 2024

There is currently a bug where the data that is stored in the db is out of date with what is in the blob store.

Instead of storing data that's fetched from the mongo instance, we use the coordinates from the database to fetch from the definition blob store to replicate it to the changeset blob store.

This is my first crack at writing C# in many many years so open to all feedback and suggestions!

This fixes #81

I did not test this code out due to the unfamiliarity with running this system locally against our dev/prod env so it needs to be tested prior to going to production.

This makes a change so that we no longer store data that is derived from the db as it may be out of data. Instead we go straight to the source of truth, which is the definition blob store.
Copy link
Collaborator

@qtomlinson qtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!
Retrieving the definition from the blob definition store requires some additional effort. Should we contemplate incorporating a feature toggle based on an environment variable? In Program.cs, we could potentially pass in the definitionBlobContainerClient only when FetchFromBlobDefinitionStore is enabled. This would allow us to easily switch back to the previous behavior (using configuration) once the database is fully synchronized.

On the other hand, if we intend to always read from the blob definition store, we might want to optimize the database query to only return coordinates for improved efficiency. Additionally, Readme documentation may need update.

try
{
var data = await ReadFromBlob(DefinitionBlobContainerClient, coordinatePath);
return data;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on Readme, published changed definition json is definition minus files. The definition blob storage content contains the entire definition (with files). so probably need to remove the "files" section here.

}
catch (Exception e)
{
Logger.LogError("Failed to read from definition blob: {blobName}, error message: {exceptionMessage}", blobName, e.Message);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is used to read definition and index. so may want to use "Failed to read from blob" in error message.

/// <returns>returns the constructed definition url from the coordinates field</returns>
internal static string ConstructBlobUrl(this JObject jObject)
{
if (jObject?["coordinates"]?.Type == null || jObject["coordinates"]!.Type == JTokenType.Null)
Copy link
Collaborator

@qtomlinson qtomlinson Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we query from the db, coordinates were excluded in the projection on Line 66. So without modifying the find option, coordinates will not be available. There is a _id field from the db, e.g. "_id": "pypi/npmjs/-/lodash/3.4.0", which may be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

changes-notifications blobs are out of sync with production-definition blobs
2 participants