Simplify accent and punctuation correction #1439

Closed
wcjord opened this issue Jan 14, 2025 Discussed in #1438 · 8 comments · Fixed by #1511

Comments

@wcjord

wcjord commented Jan 14, 2025

Discussed in #1438

Originally posted by wcjord January 14, 2025
If a "grammar" correction is just about adding accents or punctuation, then maybe we should skip the generation of distractors and correction activity, instead giving the user just the correction.

In this case, we'd also exclude this from learning analytics, and consider these "without assistance" uses of the words involved.

Skipping this step would be 1) faster and 2) less expensive.

Code for identifying corrections of this type:

import 'package:diacritic/diacritic.dart';

String normalizeString(String input) {
  // Step 1: Replace accented letters with their base letters (e.g. "ó" -> "o").
  // Assumes the `diacritic` package from pub.dev is a dependency; Dart's core
  // libraries have no built-in Unicode normalization, and simply deleting
  // non-ASCII characters would turn "Cómo" into "Cmo" instead of "Como".
  var normalized = removeDiacritics(input);

  // Step 2: Remove punctuation, keeping only letters, digits, and whitespace
  normalized =
      normalized.replaceAll(RegExp(r'[^\p{L}\p{N}\s]', unicode: true), '');

  // Step 3: Convert to lowercase
  normalized = normalized.toLowerCase();

  // Step 4: Trim and normalize whitespace
  normalized = normalized.replaceAll(RegExp(r'\s+'), ' ').trim();

  return normalized;
}

bool areEqualIgnoringPunctuationAndAccents(String str1, String str2) {
  return normalizeString(str1) == normalizeString(str2);
}

void main() {
  String text1 = "¿Cómo estás?";
  String text2 = "Como estas?";
  
  bool isEqual = areEqualIgnoringPunctuationAndAccents(text1, text2);
  print(isEqual); // Output: true
}

@linhtphung If you're fine with this from a learning perspective (and I think you are), we can convert to an issue and knock it out pretty quickly.
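The decision described above (give only the correction when the difference is just accents or punctuation, otherwise run the usual choice activity) could be sketched roughly as follows. This is a minimal, dependency-free illustration, not the project's implementation: `CorrectionFlow`, `chooseFlow`, `normalize`, and the small Spanish-only accent map are all hypothetical names and assumptions made for the example.

```dart
// Hypothetical sketch: none of these names come from the project.

/// Illustrative accent map covering Spanish vowels only (an assumption);
/// real coverage would need a fuller table or a diacritics package.
const _baseLetters = {
  'á': 'a',
  'é': 'e',
  'í': 'i',
  'ó': 'o',
  'ú': 'u',
  'ü': 'u',
};

/// Lowercases, strips the accents listed above, drops punctuation,
/// and collapses whitespace.
String normalize(String input) {
  final buffer = StringBuffer();
  for (final rune in input.toLowerCase().runes) {
    final ch = String.fromCharCode(rune);
    buffer.write(_baseLetters[ch] ?? ch);
  }
  return buffer
      .toString()
      .replaceAll(RegExp(r'[^\p{L}\p{N}\s]', unicode: true), '')
      .replaceAll(RegExp(r'\s+'), ' ')
      .trim();
}

enum CorrectionFlow { showCorrectionOnly, showChoiceActivity }

/// Skips distractor generation when the correction only changes
/// accents, punctuation, case, or spacing.
CorrectionFlow chooseFlow(String original, String corrected) {
  final cosmeticOnly =
      original != corrected && normalize(original) == normalize(corrected);
  return cosmeticOnly
      ? CorrectionFlow.showCorrectionOnly
      : CorrectionFlow.showChoiceActivity;
}

void main() {
  // Accent-only difference: the single correction would be shown.
  print(chooseFlow('¿Como estas?', '¿Cómo estás?'));
  // A different word form: the regular choice activity would run.
  print(chooseFlow('Yo tener hambre', 'Yo tengo hambre'));
}
```

Because the sketch lowercases before comparing, case-only differences are also treated as cosmetic, which mirrors Step 3 of the code above. Whether such cases should be excluded from learning analytics as well is a product decision this example does not settle.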

@wcjord
Author

wcjord commented Jan 14, 2025

@ggurdin Let me know if there are any open questions on this!

@ggurdin
Collaborator

ggurdin commented Jan 14, 2025

@wcjord Does the initial list of choices from grammar_lite ever contain tokens for each choice? Right now they seem to be empty, so constructs can't be assigned without calling getSpanDetails.

@ggurdin
Collaborator

ggurdin commented Jan 15, 2025

Put this info in span data model / match

ggurdin linked a pull request Jan 21, 2025 that will close this issue
@ggurdin
Collaborator

ggurdin commented Jan 23, 2025

Waiting on https://github.com/pangeachat/2-step-choreographer/issues/356 to finish this

@sienna-sterling
Collaborator

In Spanish, correction for just accents and punctuation works fine: only one solution is given. Tested on Android. However, the closing '?' does not appear in corrections of questions.

@linhtphung

How do I test this @ggurdin?

@ggurdin
Collaborator

ggurdin commented Jan 30, 2025

@linhtphung You would type a message with a grammar error that's only an error with accents or punctuation (e.g., in Spanish, "¿Como estas?" => "¿Cómo estás?", where the only difference is accents). When you open this error during IGC, you should see only one choice in the activity.

@linhtphung

I saw one choice
