Tools for Misfigured Urls #18

@greebie

There is an unresolved issue when parsing URLs that bleed into the surrounding text (often because of rich-text features such as tables).

For example,

https://www.example.com/index.html.Beginning_of_following_paragraph could be resolved by accepting only one period after the URL, except that

https://www.example.com/index.htmlBeginning_of_following_paragraph would still not be resolved.

I think an easier solution might be to offer some optional cleaning functions for the dataframes that archivr produces, but there could be other ideas.
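
To make the idea of an optional cleaning function concrete, here is a minimal sketch of one possible heuristic. It is written in Python purely for illustration and is not archivr code. The function names `trim_bled_url` and `clean_urls` and the extension list are hypothetical. The assumption is that a URL ending in a known file extension, followed directly by a capital letter or by a period and a capital letter, has picked up the start of the next paragraph.

```python
import re

# Hypothetical list of file extensions that usually end a URL path.
# This list is an assumption for illustration, not something archivr defines.
_KNOWN_EXTENSIONS = ("html", "htm", "php", "asp", "aspx", "pdf")

# Match everything up to the first known extension that is immediately
# followed by an uppercase letter, optionally preceded by a period.
# Longer extensions are tried first so "html" is preferred over "htm".
_BLEED = re.compile(
    r"^(.*?\.(?:"
    + "|".join(sorted(_KNOWN_EXTENSIONS, key=len, reverse=True))
    + r"))(?=\.?[A-Z])"
)


def trim_bled_url(url: str) -> str:
    """Return the URL cut at the extension if trailing text bled into it."""
    match = _BLEED.match(url)
    return match.group(1) if match else url


def clean_urls(urls):
    """Apply trim_bled_url to every URL in an iterable, e.g. one column."""
    return [trim_bled_url(u) for u in urls]
```

Applied to the two examples above, both would be cut back to https://www.example.com/index.html, while a URL with no trailing text is returned unchanged. The heuristic can misfire on legitimate paths that contain a capital letter right after an extension, which is one reason to keep it optional rather than building it into the parser.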

    Labels

    enhancement: New feature or request
