Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rendering code replaces @xyz with github link in URLs, breaking link targets #204

Open
joshtriplett opened this issue Dec 19, 2024 · 3 comments
Labels
help wanted Extra attention is needed meta Issues related to running the Rust Project Goals program

Comments

@joshtriplett
Copy link
Member

If you write something like [this document](https://hackmd.org/@xyz/document), the rendering code will replace @xyz with a github link, breaking the URL in the hackmd link.

This replacement needs to be markdown-aware, and shouldn't happen in (at least) code blocks and link targets. If that's not easily possible, then we should disable the username-replacement logic until it can be fixed.

#203 was a PR that resulted from having to work around this breakage.

@nikomatsakis
Copy link
Contributor

Yeah, it's currently just some simple regexs. It might be worth breaking out a real markdown parser.

@nikomatsakis
Copy link
Contributor

It seems like comrak can be easily made to do what is needed...

https://github.com/kivikakk/comrak?tab=readme-ov-file#usage

@nikomatsakis nikomatsakis added meta Issues related to running the Rust Project Goals program help wanted Extra attention is needed labels Jan 13, 2025
@nikomatsakis
Copy link
Contributor

Mentoring instructions

The linkifier code is part of the mdbook plugin. It can be found here:

fn linkify(&self, chapter: &mut Chapter) -> anyhow::Result<()> {
for (regex, string) in &self.linkifiers {
chapter.content = regex
.replace_all(&chapter.content, |c: &Captures<'_>| -> String {
// we add `[]` around it
assert!(c[0].starts_with("[") && c[0].ends_with("]"));
let mut href = String::new();
href.push_str(&c[0]);
href.push('(');
c.expand(string, &mut href);
href.push(')');
href
})
.to_string();
}

It's pretty dumb. It iterates over regular expressions read from the TOML file and just blindly replaces them. Username replacement is similar, except the regular expression is hardcoded.

fn link_users(&mut self, chapter: &mut Chapter) -> anyhow::Result<()> {
let usernames: BTreeSet<String> = self
.username
.find_iter(&chapter.content)
.map(|m| m.as_str().to_string())
.collect();
for username in &usernames {
chapter.content = chapter
.content
.replace(username, &format!("[{}][]", self.display_name(username)?));
}
chapter.content.push_str("\n\n");
for username in &usernames {
chapter.content.push_str(&format!(
"[{}]: https://github.com/{}\n",
self.display_name(username)?,
&username[1..]
));
}
Ok(())
}

What I think we want to do is to use the comrak crate as described here to parse the markdown (presumably around here in the code). We would then apply the linkifiers, username substitution, and the other translations to the text nodes in the markdown rather than just applying it to the entire string.

We MIGHT need to change from using <!-- XXX --> comments in that case, not sure how comrak works. But we could also do the comment replacement before we invoke comrak.

The regular expression construction is here:

if let Some(value) = config.get(LINKIFIERS) {
linkifiers = value
.as_table()
.with_context(|| format!("`{}` must be a table", LINKIFIERS))?
.iter()
.map(|(k, v)| {
if let Some(v) = v.as_str() {
Ok((Regex::new(&format!(r"\[{}\]", k))?, v.to_string()))
} else {
Err(anyhow::anyhow!(
"linkifier value for `{}` must be a string",
k
))
}
})
.collect::<Result<_, _>>()?;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed meta Issues related to running the Rust Project Goals program
Projects
None yet
Development

No branches or pull requests

2 participants