Commit 3965c92

Merge branch 'main' of https://github.com/LidkeLab/smite into main
2 parents 5cba70b + 0ab1a48 commit 3965c92

85 files changed

Lines changed: 58194 additions & 132 deletions

Lines changed: 345 additions & 0 deletions
@@ -0,0 +1,345 @@
# Context-Dependent Requirements in smite

## The Problem

Many SMF (SingleMoleculeFitting) fields are documented as "optional" with default values (often `[]`), but are actually **required in specific contexts**. This leads to:

1. **Confusing errors** - Missing required field causes cryptic downstream error
2. **Documentation ambiguity** - "Optional" doesn't tell full story
3. **User frustration** - Code works in one context, fails in another
4. **Testing gaps** - Field never tested in context where it's required

## Examples of Context-Dependent Fields

### 1. SMF.Data.DataROI

**Documented as:**
```matlab
SMF.Data.DataROI = [];  % Optional, default []
```

**Reality:**
- **Optional** when loading from files (gets size from data)
- **REQUIRED** when calling `gaussBlobImage()` (needs output dimensions)
- **Optional** for most analysis workflows
- **Required** for image generation workflows

**Better documentation:**
```matlab
SMF.Data.DataROI = [YStart, XStart, YEnd, XEnd];
% [1, 1, 128, 128] for full 128x128 image
%
% REQUIRED for:
%   - smi_sim.GaussBlobs.gaussBlobImage() - defines output size
%   - Image generation from SMD
%
% OPTIONAL for:
%   - File-based analysis (loaded from data)
%   - LocalizeData workflows (gets from image stack)
%
% If omitted when required: Causes index error in image generation
```
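
The `[YStart, XStart, YEnd, XEnd]` layout implies an output size. A minimal sketch of that arithmetic, assuming both ends of the ROI are inclusive (consistent with `[1, 1, 128, 128]` describing a full 128x128 image). The variable names are illustrative and not taken from smite:

```matlab
% DataROI uses the [YStart, XStart, YEnd, XEnd] convention.
DataROI = [1, 1, 128, 128];

% Both ends are assumed inclusive, so add 1 when converting to a size.
YSize = DataROI(3) - DataROI(1) + 1;
XSize = DataROI(4) - DataROI(2) + 1;
fprintf('ROI covers %d x %d pixels (rows x columns)\n', YSize, XSize);

% A sub-region works the same way: rows 33..96, columns 17..48.
SubROI = [33, 17, 96, 48];
SubSize = [SubROI(3) - SubROI(1) + 1, SubROI(4) - SubROI(2) + 1];
```

An empty `DataROI` gives no size at all, which is why the field cannot be left at its default when an image has to be generated.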

**Detection in code:**
```matlab
% WRONG - will fail
SMF = smi_core.SingleMoleculeFitting();
[~, img] = smi_sim.GaussBlobs.gaussBlobImage(SMD, SMF);  % ERROR!

% RIGHT - includes required context
SMF = smi_core.SingleMoleculeFitting();
SMF.Data.DataROI = [1, 1, 128, 128];  % Required for this operation
[~, img] = smi_sim.GaussBlobs.gaussBlobImage(SMD, SMF);  % Works!
```

### 2. SMF.Data.CameraGain

**Documented as:**
```matlab
SMF.Data.CameraGain = 1;  % Default 1
```

**Reality:**
- **Acceptable default** for simulated data (unit gain)
- **Must match calibration** for real camera data
- **Critical** for photon number accuracy
- **Less critical** for position-only analysis

**Better documentation:**
```matlab
SMF.Data.CameraGain = 1;  % electrons/ADU
%
% Simulated data: Use 1 (default)
% Real sCMOS: Load from calibration file (typical: 0.5-5)
% Real EMCCD: Set to EM gain value (typical: 50-300)
%
% Impact if incorrect:
%   - Photon numbers wrong (affects SNR, precision estimates)
%   - Positions usually OK (gain cancels in ratio)
%   - Thresholding may fail (wrong photon-based thresholds)
```
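
The point that gain "cancels in ratio" for positions but not for photon numbers can be checked with a toy calculation. The sketch below uses an intensity-weighted centroid on a synthetic spot as a stand-in for a real fit (no smite fitting routine is called), and it assumes the camera offset has already been subtracted. `GainError` is a hypothetical multiplicative error in the gain:

```matlab
% Synthetic 2D Gaussian spot on a small grid (no smite functions used).
[X, Y] = meshgrid(1:15, 1:15);
Spot = 1000 * exp(-((X - 7.3).^2 + (Y - 8.1).^2) / (2 * 1.3^2));

% The same image as it would appear with a gain that is off by a factor of 2.
GainError = 2;
SpotScaled = Spot / GainError;

% Intensity-weighted centroid: the scale factor cancels in the ratio.
CentroidX = sum(X(:) .* Spot(:)) / sum(Spot(:));
CentroidXScaled = sum(X(:) .* SpotScaled(:)) / sum(SpotScaled(:));

% The total signal does not cancel: it scales with the gain error.
SignalRatio = sum(SpotScaled(:)) / sum(Spot(:));

fprintf('Centroid shift: %.2g pixels\n', abs(CentroidX - CentroidXScaled));
fprintf('Signal ratio:   %.2f\n', SignalRatio);
```

A wrong offset would not cancel this way, so the argument only covers a purely multiplicative gain error.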

### 3. SMF.Data.PixelSize

**Documented as:**
```matlab
SMF.Data.PixelSize = 0.1;  % micrometers, default 0.1
```

**Reality:**
- **Any value works** for pixel-space analysis
- **Must be accurate** for physical units
- **Required for scale calibration** (converting camera pixels to µm in the sample plane)
- **Affects clustering** if distance-based

**Better documentation:**
```matlab
SMF.Data.PixelSize = 0.1;  % micrometers per pixel
%
% Common values (camera pixel size / total magnification):
%   - 10µm pixel, 100x objective, no magnifier: 0.1 µm
%   - 16µm pixel, 100x, 1.6x magnifier: 0.1 µm
%   - 6.5µm pixel, 60x: 0.108 µm
%
% Used for:
%   - Converting pixel coordinates to physical units
%   - Drift correction distance thresholds
%   - Clustering distance parameters
%   - Physical diffusion coefficients
%
% If incorrect:
%   - Wrong physical scale (everything scaled incorrectly)
%   - Drift correction may fail (wrong search radius)
%   - Clustering may under/over-connect
```
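
The 16µm and 6.5µm values listed above follow from dividing the physical camera pixel size by the total magnification. A small sketch of that calculation, using hypothetical variable names, plus the effect of a wrong value on a physical distance:

```matlab
% Physical camera pixel size (µm) and optics for two example setups.
CameraPixel = [16, 6.5];
Objective   = [100, 60];
Magnifier   = [1.6, 1];     % 1 means no additional magnifier

% Back-projected pixel size in the sample plane (µm per pixel).
PixelSize = CameraPixel ./ (Objective .* Magnifier);
fprintf('PixelSize = %.3f um\n', PixelSize);

% A 5-pixel separation converted to physical units with the first setup.
DistancePixels = 5;
DistanceMicrons = DistancePixels * PixelSize(1);

% If PixelSize were left at 0.1 for the second setup, every physical
% distance would be off by this relative amount:
RelativeError = 0.1 / PixelSize(2) - 1;
fprintf('Relative distance error: %.1f%%\n', 100 * RelativeError);
```

Quantities that depend on squared distances, such as diffusion coefficients, would pick up roughly twice that relative error.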

## General Pattern

### When a field is context-dependent:

1. **Don't just say "optional"** - Specify contexts
2. **List when required** - Explicit use cases
3. **List when optional** - Where defaults work
4. **Document failure mode** - What happens if missing/wrong
5. **Provide examples** - Show correct usage in both contexts

### Documentation Template

```matlab
SMF.Category.FieldName = <default_value>;  % <units>
%
% REQUIRED for:
%   - <operation 1> - <why needed>
%   - <operation 2> - <why needed>
%
% OPTIONAL for:
%   - <operation A> - <why default works>
%   - <operation B> - <why not needed>
%
% DEFAULTS to:
%   <default_value> - <what this means>
%
% EXAMPLES:
%   <common value 1> - <use case>
%   <common value 2> - <use case>
%
% IMPACT if incorrect:
%   - <consequence 1>
%   - <consequence 2>
%
% SEE ALSO: <related fields>, <related docs>
```

## API Design Implications

### Current Issues

1. **Late failures** - A missing field is not caught at the call site and only surfaces as a downstream error
2. **Inconsistent validation** - Some functions check, others don't
3. **Poor error messages** - Index errors instead of "missing DataROI"
4. **No hints** - Function doesn't tell you what's needed

### Improvements to Consider

#### Option 1: Validate at function entry

```matlab
function [Model, Data] = gaussBlobImage(SMD, SMF, Bg, Density)
    % Validate required fields
    if isempty(SMF.Data.DataROI)
        % Passing an identifier plus a message makes error() expand \n;
        % with a single argument the escape would be printed literally.
        % The identifier below is a suggested name, not an existing one.
        error('smi_sim:GaussBlobs:missingDataROI', ...
            ['gaussBlobImage requires SMF.Data.DataROI to be set.\n' ...
             'Example: SMF.Data.DataROI = [1, 1, 128, 128];']);
    end

    % ... rest of function
end
```

**Pros:** Clear error message, fail fast
**Cons:** Adds validation code to every function
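
One way to soften that cost is a single shared check that each function calls in one line. The sketch below is hypothetical: `requireDataField` and the `smite:missingField` identifier are not part of smite, and a plain struct stands in for an SMF object so the snippet runs on its own:

```matlab
% Hypothetical shared check: errors with a field-specific message when the
% named SMF.Data field is empty. Not part of smite.
requireDataField = @(SMF, Field, Caller) assert( ...
    ~isempty(SMF.Data.(Field)), ...
    'smite:missingField', ...
    '%s requires SMF.Data.%s to be set.', Caller, Field);

% Plain struct standing in for an SMF object with DataROI left at [].
SMF = struct('Data', struct('DataROI', []));

ErrorID = '';
ErrorMessage = '';
try
    requireDataField(SMF, 'DataROI', 'gaussBlobImage');
catch ME
    ErrorID = ME.identifier;
    ErrorMessage = ME.message;
    disp(ErrorMessage);
end
```

With a helper like this, the per-function cost is one line per required field, and every message names both the calling function and the missing field.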

#### Option 2: Separate setup functions

```matlab
% Instead of:
SMF = smi_core.SingleMoleculeFitting();
SMF.Data.DataROI = [1, 1, 128, 128];
SMF.Fitting.PSFSigma = 1.3;

% Provide:
SMF = smi_core.SingleMoleculeFitting.forImageGeneration(128, 128);
% Automatically sets DataROI and other required fields
```

**Pros:** Clearer intent, groups related settings
**Cons:** More API surface, duplicate code

#### Option 3: Context-aware validation

```matlab
function SMF = validateForContext(SMF, context)
    % context = 'analysis' | 'simulation' | 'image_generation' | 'real_data'

    switch context
        case 'image_generation'
            assert(~isempty(SMF.Data.DataROI), ...
                'DataROI required for image generation');
        case 'real_data'
            assert(~isempty(SMF.Data.CalibrationFilePath), ...
                'Calibration required for real data');
        % ... etc
    end
end
```

**Pros:** Explicit context checking
**Cons:** User must remember to call, adds complexity
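
The added complexity can be kept in one place by driving the check from a table instead of a growing `switch`. The following is only a sketch under assumptions: the context names and the required fields listed per context are illustrative and should be checked against the actual SMF class, and a plain struct stands in for an SMF object:

```matlab
% Hypothetical table of required SMF.Data fields per context.
Required.image_generation = {'DataROI'};
Required.real_data = {'CalibrationFilePath'};
Required.analysis = {};

% Plain struct standing in for an SMF object.
SMF.Data.DataROI = [];
SMF.Data.CalibrationFilePath = 'calibration.mat';

% Collect every required field that is still empty for the chosen context.
Context = 'image_generation';
Fields = Required.(Context);
IsMissing = cellfun(@(F) isempty(SMF.Data.(F)), Fields);
Missing = Fields(IsMissing);

if ~isempty(Missing)
    fprintf('Context "%s" is missing: %s\n', Context, strjoin(Missing, ', '));
end
```

Reporting all missing fields at once, rather than stopping at the first failed assertion, saves the user from fixing them one error at a time.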

#### Option 4: Documentation improvements only

Keep API as-is, but improve documentation:
- Mark fields as "Context-Dependent"
- Add "Required For" sections
- Improve error messages where possible
- Add validation examples to docs

**Pros:** No API changes, backward compatible
**Cons:** Still relies on user reading docs

## Recommended Approach

**Short term** (for v1.3.0 documentation):

1. Add "Context-Dependent Requirements" section to docs
2. Mark affected fields explicitly in SMF structure docs
3. Add validation examples to function docs
4. Improve error messages where possible (non-breaking)

**Medium term** (v1.4.0):

5. Add input validation to high-risk functions
6. Create helper constructors for common contexts
7. Add warnings for suspicious configurations
8. Create troubleshooting guide for context errors

**Long term** (v2.0.0):

9. Consider context-aware SMF variants
10. Redesign API to make requirements explicit
11. Add comprehensive validation framework
12. Breaking changes OK in major version
248+
249+
## Audit Checklist
250+
251+
When reviewing documentation for context-dependent requirements:
252+
253+
- [ ] Field documented with "optional" or "default"?
254+
- [ ] Check: Is it truly optional in ALL contexts?
255+
- [ ] List contexts where field is required
256+
- [ ] List contexts where field is optional
257+
- [ ] Document what happens if missing when required
258+
- [ ] Add example code showing correct setup
259+
- [ ] Check if function validates input (add if not)
260+
- [ ] Add error message improvement if needed
261+
- [ ] Cross-reference other docs using same field
262+
- [ ] Test code examples in both contexts

## Testing Strategy

Create tests that verify context-dependent behavior:

```matlab
function test_gaussBlobImage_requires_DataROI()
    % This should FAIL with a helpful error
    SMF = smi_core.SingleMoleculeFitting();
    % Deliberately omit DataROI

    SMD = smi_core.SingleMoleculeData.createSMD();
    SMD.X = 64; SMD.Y = 64;
    SMD.Photons = 1000; SMD.Bg = 5;
    SMD.PSFSigma = 1.3; SMD.FrameNum = 1;

    % Track the outcome with a flag. Calling error() inside the try block
    % would be caught below, and its own message would mention DataROI,
    % so the test would pass even if the call succeeded.
    DidFail = false;
    try
        [~, ~] = smi_sim.GaussBlobs.gaussBlobImage(SMD, SMF);
    catch ME
        DidFail = true;
        % Check error message is helpful
        assert(contains(ME.message, 'DataROI'), ...
            'Error should mention DataROI requirement');
    end
    assert(DidFail, 'Should have failed without DataROI');
end

function test_gaussBlobImage_works_with_DataROI()
    % This should SUCCEED
    SMF = smi_core.SingleMoleculeFitting();
    SMF.Data.DataROI = [1, 1, 128, 128];  % Required!

    SMD = smi_core.SingleMoleculeData.createSMD();
    SMD.X = 64; SMD.Y = 64;
    SMD.Photons = 1000; SMD.Bg = 5;
    SMD.PSFSigma = 1.3; SMD.FrameNum = 1;

    [Model, Data] = smi_sim.GaussBlobs.gaussBlobImage(SMD, SMF);

    assert(~isempty(Model), 'Should return model');
    assert(size(Model, 1) == 128, 'Size from DataROI');
    assert(size(Model, 2) == 128, 'Size from DataROI');
end
```

## Summary

**Key Principles:**

1. **Be explicit** - Don't hide requirements in context
2. **Document both** - When required AND when optional
3. **Show examples** - Correct usage in both contexts
4. **Fail helpfully** - Clear error messages
5. **Test both** - Required context AND optional context

**For LLM Documentation:**

When generating examples:
- Always check if field is context-dependent
- Include required setup for the example context
- Add comments explaining why field is needed
- Cross-reference to full explanation
- Test code actually runs in stated context

**For Code Review:**

Red flags:
- Function assumes field is set without checking
- Error message doesn't mention missing field
- Documentation says "optional" without context
- Example code missing required setup
- No tests for missing required field

## Related Documents

- `.claude/commands/smite_audit_docs.md` - How to audit for these issues
- `.claude/commands/smite_validate_llm_docs.md` - Comprehensive validation
- `doc/llm-guide/core-concepts/smf-structure.md` - SMF field documentation
- `doc/llm-guide/troubleshooting/common-mistakes.md` - User-facing errors

## Version History

- 2025-10-14: Initial documentation of pattern
- Future: Update as API changes or more patterns discovered
