Open
Labels
status: waiting-for-triage, type: feature
Description
Expected Behavior
The DefaultFieldSet.indexOf() method should look up field names in O(1) expected time using a precomputed name-to-index map, rather than the current O(n) linear search through the names list.
    // Expected: O(1) lookup using HashMap
    protected int indexOf(String name) {
        if (nameIndexMap == null) {
            throw new IllegalArgumentException("Cannot access columns by name without meta data");
        }
        Integer index = nameIndexMap.get(name);
        if (index != null) {
            return index;
        }
        throw new IllegalArgumentException("Cannot access column [" + name + "] from " + names);
    }
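The snippet above assumes a `nameIndexMap` field that is built once from `names`. A minimal sketch of how that map could be precomputed follows; the class `FieldNameIndex` and its `build` method are hypothetical names used for illustration and are not part of Spring Batch. It uses `putIfAbsent` so that a duplicated column name still resolves to its first position, which is what `List.indexOf()` returns today.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Spring Batch): builds the name -> index map once.
final class FieldNameIndex {

    private FieldNameIndex() {
    }

    static Map<String, Integer> build(List<String> names) {
        // Pre-size the map so it does not rehash while being filled (default load factor 0.75).
        Map<String, Integer> index = new HashMap<>((int) (names.size() / 0.75f) + 1);
        for (int i = 0; i < names.size(); i++) {
            // Keep the first occurrence of a duplicate name,
            // matching the result of List.indexOf().
            index.putIfAbsent(names.get(i), i);
        }
        return index;
    }
}
```

If something like this were adopted, the map would presumably be built wherever `names` is assigned, so that the two never drift apart.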
Current Behavior
Currently, DefaultFieldSet.indexOf() uses List.indexOf(), which performs a linear search through the field names every time a field is accessed by name. Each lookup is therefore O(n) in the number of fields.
    // Current: O(n) linear search
    protected int indexOf(String name) {
        if (names == null) {
            throw new IllegalArgumentException("Cannot access columns by name without meta data");
        }
        int index = names.indexOf(name);
        if (index >= 0) {
            return index;
        }
        throw new IllegalArgumentException("Cannot access column [" + name + "] from " + names);
    }
The performance difference becomes significant when processing CSV files with many columns (50+ fields) and accessing fields by name frequently during batch processing. For a 100-column CSV, reading the last column by name takes up to 100 string comparisons on every access, and that cost is paid again for every row, which adds up in high-volume batch jobs.
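To make the cost concrete, here is a rough sketch that counts the comparisons a first-match linear scan performs. The class `LookupCost` and the `col0`, `col1`, ... column names are made up for illustration. It counts `equals()` calls only and does not measure wall-clock time, which depends on the JVM, string lengths, and access pattern.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: counts comparisons instead of timing them.
final class LookupCost {

    private LookupCost() {
    }

    // Comparisons a first-match linear scan makes before it finds `name`
    // (or before it gives up at the end of the list).
    static long linearComparisons(List<String> names, String name) {
        long comparisons = 0;
        for (String candidate : names) {
            comparisons++;
            if (candidate.equals(name)) {
                return comparisons;
            }
        }
        return comparisons;
    }

    // Total comparisons when every column of one row is read once by name.
    static long comparisonsPerRow(int columns) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < columns; i++) {
            names.add("col" + i);
        }
        long total = 0;
        for (String name : names) {
            total += linearComparisons(names, name);
        }
        return total;
    }
}
```

Under these assumptions, `LookupCost.comparisonsPerRow(100)` returns 5050 (1 + 2 + ... + 100), so reading every column of a 100-column row by name costs 5050 comparisons with the linear scan, versus 100 hash lookups with a precomputed map.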