Nikita-search-bar#25

Merged
Shrinks99 merged 5 commits into main from nikita-search-bar on May 16, 2025

Conversation

@nikitalokhmachev-ai
Collaborator

@nikitalokhmachev-ai nikitalokhmachev-ai commented May 15, 2025

User description

This PR tackles #22 and #10.

  1. The search bar is fully functional.
    a) Whenever the user types something, the webpage title, link, and text are indexed.
    b) On the UI side, the matching text is highlighted.

  2. When the user has an archived page open and clicks another archived page in the app, that page opens in the same tab.
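
The same-tab behaviour in item 2 can be sketched roughly as follows. This is only an illustration, not the code in this PR: `TabsLike` and `ArchiveTabOpener` are hypothetical names, and the injected object is assumed to behave like the `create`/`update` calls of the extension tabs API.

```typescript
// Hypothetical minimal surface of a tabs API, injected so the sketch runs anywhere.
interface TabsLike {
  create(props: { url: string }): Promise<{ id?: number }>;
  update(tabId: number, props: { url: string }): Promise<{ id?: number }>;
}

// Remembers the tab used for the last archived page and reuses it.
class ArchiveTabOpener {
  private tabs: TabsLike;
  private tabId: number | null = null;

  constructor(tabs: TabsLike) {
    this.tabs = tabs;
  }

  async open(url: string): Promise<number | null> {
    if (this.tabId !== null) {
      try {
        // Navigate the remembered tab instead of opening a new one.
        await this.tabs.update(this.tabId, { url });
        return this.tabId;
      } catch {
        // The remembered tab is likely gone; forget it and create a new one.
        this.tabId = null;
      }
    }
    const tab = await this.tabs.create({ url });
    this.tabId = tab.id ?? null;
    return this.tabId;
  }
}
```

The first call creates a tab, and later calls navigate that tab for as long as it still exists.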


PR Type

Enhancement


Description

  • Search functionality for archived pages

  • Open archived pages in same tab

  • Highlight search matches in results

  • Improved navigation between archived pages


Changes walkthrough 📝

Relevant files

Enhancement

  argo-archive-list.ts: Implement search and improve navigation
  src/argo-archive-list.ts (+117/-9)

  • Added search functionality with FlexSearch index
  • Implemented text highlighting for search matches
  • Added method to display search results with context
  • Modified page opening logic to reuse existing tabs

  sidepanel.ts: Add search UI and connect to archive list
  src/sidepanel.ts (+21/-3)

  • Added searchQuery property and input handler
  • Connected search input to archive list component
  • Removed unused WebTorrent import

Bug fix

  bg.ts: Fix TypeScript errors in background script
  src/ext/bg.ts (+16/-5)

  • Fixed TypeScript errors with proper annotations
  • Improved parameter handling in startRecorder function

Dependencies

  package.json: Add FlexSearch dependency
  package.json (+1/-0)

  • Added flexsearch dependency (version 0.7.31)

    @pr-agent-monadical

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis ✅

    #10 - Fully compliant

    Compliant requirements:

    • Implement archive search functionality
    • Surface the same extracted text info that ArchiveWeb.page's search does
    • Follow the provided mockup for search state

    #22 - Fully compliant

    Compliant requirements:

    • Make the extension open archived pages in the same tab instead of creating a new tab each time
    • When viewing an archived page and clicking another archived page in the sidebar, navigate within the same tab
    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Search Index Efficiency

    The search index is rebuilt completely whenever pages change. For large archives, this could be inefficient. Consider incremental updates to the index.

    if (changed.has("pages")) {
      this.flex = new FlexIndex<string>({
        tokenize: "forward",
        resolution: 3,
      });
      this.pages.forEach((p) => {
        // include title + text (and URL if you like)
    
        const toIndex = [p.title ?? "", p.text ?? ""].join(" ");
        this.flex.add(p.ts, toIndex);
      });
    }
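
One way the incremental update mentioned above might look is sketched here. It is an assumption-laden illustration, not the PR's code: `TextIndex`, `PageLike`, and `syncIndex` are hypothetical names, `ts` is assumed to be a string id, and the index is assumed to offer `add`, `update`, and `remove` (FlexSearch's `Index` has methods with these names, but check them against the installed version).

```typescript
// Hypothetical minimal index surface used by this sketch.
interface TextIndex {
  add(id: string, text: string): unknown;
  update(id: string, text: string): unknown;
  remove(id: string): unknown;
}

// Hypothetical page shape; only the fields the sketch needs.
interface PageLike {
  ts: string;
  title?: string;
  text?: string;
}

// Applies only the differences between the previously indexed pages and `pages`.
// `indexed` maps page id -> the content that was last put in the index.
function syncIndex(
  index: TextIndex,
  indexed: Map<string, string>,
  pages: PageLike[],
): void {
  const seen = new Set<string>();
  for (const p of pages) {
    const content = [p.title ?? "", p.text ?? ""].join(" ");
    seen.add(p.ts);
    const previous = indexed.get(p.ts);
    if (previous === content) continue; // unchanged, nothing to do
    if (previous === undefined) {
      index.add(p.ts, content); // new page
    } else {
      index.update(p.ts, content); // content changed
    }
    indexed.set(p.ts, content);
  }
  // Drop pages that are no longer present.
  for (const id of Array.from(indexed.keys())) {
    if (!seen.has(id)) {
      index.remove(id);
      indexed.delete(id);
    }
  }
}
```

Keeping the full content in the map is the simplest option. Storing a hash of the content instead would reduce memory use for large archives.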
    Regex Security

    The regex used for highlighting matches could potentially cause performance issues with certain input patterns, despite the attempt to escape special characters.

    const safeQuery = query.trim().replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const regex = new RegExp(safeQuery, "ig");
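
Because the escaped query is matched literally, locating the first match does not strictly need a regular expression. The sketch below uses a plain substring search; `findLiteralMatch` is a hypothetical helper, not part of the PR. It also guards the empty query, which as a `RegExp` would match at every position.

```typescript
// Returns the index of the first case-insensitive literal match, or -1.
// Caveat: toLowerCase() can change the length of a few non-ASCII strings,
// so the returned index is only guaranteed to line up for typical text.
function findLiteralMatch(text: string, query: string): number {
  const needle = query.trim().toLowerCase();
  if (needle === "") return -1; // treat an empty query as "no match"
  return text.toLowerCase().indexOf(needle);
}
```

Regex metacharacters such as `.` need no escaping here because nothing is ever interpreted as a pattern.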
    TS-Ignore Comments

    Multiple TypeScript errors are suppressed with @ts-expect-error comments rather than properly fixing the type issues, which could lead to runtime errors.

    //@ts-expect-error tabs has any type
    async (tabs) => {
      for (const tab of tabs) {
        if (!isValidUrl(tab.url)) continue;
    
        await startRecorder(
          tab.id,
          {
            // @ts-expect-error - collId implicitly has an 'any' type.
            collId: defaultCollId,
            port: null,
            autorun,
          },
          //@ts-expect-error - 2 parameters but 3
          tab.url,

    Comment on lines +220 to +237
    private _highlightMatch(
      text?: string,
      query: string = "",
      maxLen = 180,
    ): string {
      if (!text) return "";

      const safeQuery = query.trim().replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      const regex = new RegExp(safeQuery, "ig");

      const matchIndex = text.search(regex);
      if (matchIndex === -1) return text.slice(0, maxLen) + "...";

      const previewStart = Math.max(0, matchIndex - 30);
      const preview = text.slice(previewStart, previewStart + maxLen);

      return preview.replace(regex, (m) => `<b>${m}</b>`) + "...";
    }

    Suggestion: The current implementation is vulnerable to XSS attacks because it directly sets innerHTML with user-controlled content. Even though you're escaping regex special characters, HTML special characters aren't being escaped, which could allow injection of malicious HTML. [security, importance: 9]

    Suggested change
    -  private _highlightMatch(
    -    text?: string,
    -    query: string = "",
    -    maxLen = 180,
    -  ): string {
    -    if (!text) return "";
    -    const safeQuery = query.trim().replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    -    const regex = new RegExp(safeQuery, "ig");
    -    const matchIndex = text.search(regex);
    -    if (matchIndex === -1) return text.slice(0, maxLen) + "...";
    -    const previewStart = Math.max(0, matchIndex - 30);
    -    const preview = text.slice(previewStart, previewStart + maxLen);
    -    return preview.replace(regex, (m) => `<b>${m}</b>`) + "...";
    -  }
    +  private _highlightMatch(
    +    text?: string,
    +    query: string = "",
    +    maxLen = 180,
    +  ): string {
    +    if (!text) return "";
    +    const safeQuery = query.trim().replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    +    const regex = new RegExp(safeQuery, "ig");
    +    const matchIndex = text.search(regex);
    +    if (matchIndex === -1) return text.slice(0, maxLen) + "...";
    +    const previewStart = Math.max(0, matchIndex - 30);
    +    const preview = text.slice(previewStart, previewStart + maxLen);
    +    // Escape HTML special characters first
    +    const escapedPreview = preview.replace(/[&<>"']/g, (m) => {
    +      return {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'}[m] || m;
    +    });
    +    return escapedPreview.replace(regex, (m) => `<b>${m}</b>`) + "...";
    +  }
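
A possible variation on this suggestion, offered as a sketch rather than a drop-in fix: escaping the whole preview before highlighting may miss queries that contain `&` or `<`, and may match inside entities such as `&amp;`. Matching on the raw preview and escaping each segment separately avoids both. `escapeHtml` and `highlightEscaped` are hypothetical helper names.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escapes the five HTML-significant characters.
function escapeHtml(s: string): string {
  return s.replace(/[&<>"']/g, (c) => HTML_ESCAPES[c] ?? c);
}

// Wraps each literal, case-insensitive match of `query` in <b>...</b>,
// escaping matched and unmatched segments independently.
function highlightEscaped(preview: string, query: string): string {
  const needle = query.trim();
  if (needle === "") return escapeHtml(preview); // nothing to highlight
  const safeQuery = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp(safeQuery, "ig");
  let out = "";
  let last = 0;
  // Matches are never empty because the needle is non-empty, so exec() always advances.
  for (let m = regex.exec(preview); m !== null; m = regex.exec(preview)) {
    out += escapeHtml(preview.slice(last, m.index));
    out += `<b>${escapeHtml(m[0])}</b>`;
    last = m.index + m[0].length;
  }
  return out + escapeHtml(preview.slice(last));
}
```

For example, `highlightEscaped("x<y", "<")` yields `x<b>&lt;</b>y`, so the only markup in the output is the `<b>` tags the function adds itself.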

    Comment on lines +179 to +185
      this.flex = new FlexIndex();
      this.pages.forEach((p) => {
        const text = p.url + (p.title ? ` ${p.title}` : "");
        this.flex.add(p.ts, text); // use ts (timestamp) as a unique id
      });
    }

    Suggestion: The buildIndex() method is defined but never called in the code. Additionally, it creates a new index without the same configuration options used in the updated() method, and it doesn't include the page text in the indexed content, unlike the implementation in updated(). [possible issue, importance: 7]

    Suggested change
    -  private buildIndex() {
    -    this.flex = new FlexIndex();
    -    this.pages.forEach((p) => {
    -      const text = p.url + (p.title ? ` ${p.title}` : "");
    -      this.flex.add(p.ts, text); // use ts (timestamp) as a unique id
    -    });
    -  }
    +  private buildIndex() {
    +    this.flex = new FlexIndex<string>({
    +      tokenize: "forward",
    +      resolution: 3,
    +    });
    +    this.pages.forEach((p) => {
    +      const toIndex = [p.title ?? "", p.text ?? "", p.url].join(" ");
    +      this.flex.add(p.ts, toIndex);
    +    });
    +  }

    Comment on lines 240 to 251
       await startRecorder(
         tabId,
    -    { collId: defaultCollId, port: null, autorun },
    +    {
    +      // @ts-expect-error - collId implicitly has an 'any' type.
    +      collId: defaultCollId,
    +      port: null,
    +      autorun,
    +    },

         // @ts-expect-error - 2 parameters but 3
         tab.url,
       );

    Suggestion: The code is passing three parameters to startRecorder() but adding a TypeScript error comment indicating it expects only two. This suggests a mismatch between the function signature and its usage. Either the function should be updated to accept three parameters, or the third parameter should be removed from all calls. [possible issue, importance: 8]

    Suggested change
    -  await startRecorder(
    -    tabId,
    -    { collId: defaultCollId, port: null, autorun },
    -    {
    -      // @ts-expect-error - collId implicitly has an 'any' type.
    -      collId: defaultCollId,
    -      port: null,
    -      autorun,
    -    },
    -    // @ts-expect-error - 2 parameters but 3
    -    tab.url,
    -  );
    +  await startRecorder(
    +    tabId,
    +    {
    +      collId: defaultCollId,
    +      port: null,
    +      autorun,
    +      url: tab.url, // Include URL in the options object instead of as a separate parameter
    +    }
    +  );
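
To show how the `@ts-expect-error` suppressions could be dropped, here is a minimal typed sketch of the two-argument form. All names (`RecorderOptions`, `TabLike`, `StartRecorder`, `startForTabs`) are hypothetical, and the real `startRecorder` signature in `src/ext/bg.ts` may differ.

```typescript
// Hypothetical options shape; the real one in bg.ts may carry more fields.
interface RecorderOptions {
  collId: string;
  port: object | null;
  autorun: boolean;
  url: string;
}

// Hypothetical subset of a browser tab.
interface TabLike {
  id?: number;
  url?: string;
}

type StartRecorder = (tabId: number, opts: RecorderOptions) => Promise<void>;

// Starts a recorder for every tab with an id and a valid URL; returns how many were started.
async function startForTabs(
  tabs: TabLike[],
  collId: string,
  autorun: boolean,
  isValidUrl: (u: string | undefined) => u is string,
  startRecorder: StartRecorder,
): Promise<number> {
  let started = 0;
  for (const tab of tabs) {
    if (tab.id === undefined || !isValidUrl(tab.url)) continue;
    // tab.url is narrowed to string by the type guard above.
    await startRecorder(tab.id, { collId, port: null, autorun, url: tab.url });
    started++;
  }
  return started;
}
```

With the callback parameter and the options typed explicitly, the compiler checks the argument count and the `collId` type instead of having those errors silenced.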

    Comment on lines 764 to 774
       style="flex: 1; overflow-y: auto; position: relative; flex-grow: 1;"
     >
       <div id="my-archives" class="tab-panel" active>
    -    <argo-archive-list id="archive-list"></argo-archive-list>
    +    <argo-archive-list
    +      id="archive-list"
    +      .filterQuery=${
    +        //@ts-expect-error - TS2339 - Property 'searchQuery' does not exist on type 'ArgoViewer'.
    +        this.searchQuery
    +      }
    +    ></argo-archive-list>
       </div>

    Suggestion: Instead of using TypeScript error suppressions for the missing searchQuery property, properly define it in the class properties. This will make the code more maintainable and type-safe. [general, importance: 3]

    New proposed code:
     <div
       class="tab-panels"
       style="flex: 1; overflow-y: auto; position: relative; flex-grow: 1;"
     >
       <div id="my-archives" class="tab-panel" active>
         <argo-archive-list
           id="archive-list"
    -      .filterQuery=${
    -        //@ts-expect-error - TS2339 - Property 'searchQuery' does not exist on type 'ArgoViewer'.
    -        this.searchQuery
    -      }
    +      .filterQuery=${this.searchQuery}
         ></argo-archive-list>
       </div>

    import "@material/web/divider/divider.js";
    import { mapIntegerToRange, truncateString } from "./utils";
    import { CollectionLoader } from "@webrecorder/wabac/swlib";
    import WebTorrent from "webtorrent";
    Collaborator

    is this not needed? @nikitalokhmachev-ai

    Collaborator Author

    @yamijuan No, since we have a global import.

    @Shrinks99 Shrinks99 merged commit fde1602 into main May 16, 2025
    1 of 4 checks passed
    @Shrinks99 Shrinks99 deleted the nikita-search-bar branch May 16, 2025 15:13
    3 participants