Skip to content

Conversation

yaugenst-flex
Copy link
Collaborator

@yaugenst-flex yaugenst-flex commented Apr 28, 2025

Previously face gradients came from a fixed 30 × 30 grid so the accuracy depended on the size of the polygon. Edge gradients were evaluated only at the center of each face, so long edges and tall slabs produced incorrect adjoint results depending on the simulation.

Changed:

  • Adaptive, spacing-based sampling for both faces and vertices/edges.
  • Extra edge-cases covered: slab fully outside sim, faces at +/- inf, and fix for correct normal sign for axis==1.

I also added pretty comprehensive analytic tests for the face and edge integration.

The new implementation is also significantly faster, especially for cases where many structures are included in a geometry group:
image

Memory usage is about the same as before.

The speedup comes mainly from interpolator caching and removing Pydantic and xarray overhead. DerivativeInfo is now a regular dataclass, and the DerivativeSurfaceMesh abstraction is gone in favor of a new evaluate_gradient_at_points function in DerivativeInfo. Most operations in the "hot path" now use numpy instead of xarray interpolation.

Quadrature scheme

First, the code determines a hypothetical number of uniform samples to classify an edge as "short" or "long." Short edges are then evaluated with a direct Gauss quadrature that uses about 40% fewer points than the uniform method. Long edges, which would require more than seven sample points, are instead broken into smaller segments. A 7-point Gauss quadrature is then applied to each segment to maintain accuracy without excessive computational cost.

Greptile Summary

Significantly improves polyslab gradient calculations in autograd by implementing adaptive sampling and performance optimizations. Key changes:

  • Replaced fixed 30x30 grid with adaptive spacing-based sampling for faces and edges, making accuracy independent of polygon size
  • Improved performance through interpolator caching and reduced overhead by replacing Pydantic/xarray operations with numpy, showing significant speedup
  • Added intelligent quadrature scheme that uses fewer points (~40%) for short edges while segmenting long edges with 7-point Gauss quadrature
  • Added comprehensive handling of edge cases including slabs outside simulation bounds, faces at infinity, and correct normal sign for axis==1
  • Added extensive analytic tests validating face and edge integration accuracy

@yaugenst-flex yaugenst-flex self-assigned this Apr 28, 2025
@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch from f37734b to 00b99de Compare April 28, 2025 17:24
@yaugenst-flex yaugenst-flex requested a review from tylerflex April 28, 2025 17:24
@yaugenst-flex yaugenst-flex marked this pull request as ready for review April 28, 2025 17:25
@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 2 times, most recently from d73a485 to d188b31 Compare April 28, 2025 17:42
Copy link
Collaborator

@tylerflex tylerflex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @yaugenst-flex. This is pretty complex, I think it will require a lot of testing for both performance and gradient accuracy. Probably in all of the notebooks. I'd say maybe best left to 2.9 to be safe. how much testing in notebooks have you done already? either way will be great to have more accurate gradients for PolySlab!

Copy link
Contributor

@groberts-flex groberts-flex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really exciting and will be a huge improvement. Is it possible to run some of the numerical testing and possibly modify the polyslab numerical test file as well to make sure we can test going forward the surface integration (and keep testing cases where it is adding the most improvement).

Copy link
Contributor

@groberts-flex groberts-flex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updates look good to me!

@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 2 times, most recently from 332d8f4 to c0ab734 Compare May 2, 2025 07:25
@groberts-flex
Copy link
Contributor

Was in the structure.py file and noticed that in make_adjoint_monitors (

def make_adjoint_monitors(
), the monitor size for polyslabs along the extrusion axis is set to 0. I am assuming to get the full integration to work, we would need that to be the height of the polyslab.

@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch from c0ab734 to 95c1619 Compare May 9, 2025 19:04
@yaugenst-flex yaugenst-flex added the 2.9 will go into version 2.9.* label May 9, 2025
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

11 files reviewed, 4 comments
Edit PR Review Bot Settings | Greptile

@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 5 times, most recently from 0ae943a to f03db4f Compare July 9, 2025 10:41
@flexcompute flexcompute deleted a comment from github-actions bot Jul 9, 2025
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

293 files reviewed, 1 comment
Edit PR Review Bot Settings | Greptile

@yaugenst-flex
Copy link
Collaborator Author

@groberts-flex @tylerflex this is ready for final review. Apologize in advance, this got pretty huge. I grouped the changes into a few distinct commits but there is still one monster diff 😅

I ran all relevant notebooks and results are the same as or better than before. In particular the metalens is doing quite a bit better now.

@groberts-flex
Copy link
Contributor

haven't made it too far on this, but will continue on it tomorrow! really cool how much performance improvement there is. I'm currently running some of the numerical test benchmarks I have for polyslab with it and will especially check when we have thicker polyslabs that are high index and will try some cases with further spaced vertices.

@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 2 times, most recently from 17067da to 85414dd Compare July 10, 2025 11:20
Copy link
Contributor

github-actions bot commented Jul 10, 2025

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

  • tidy3d/compat.py (100%)
  • tidy3d/components/autograd/constants.py (100%)
  • tidy3d/components/autograd/derivative_utils.py (87.7%): Missing lines 169-170,172-174,194-197,223,313,351,355-356,358,424-427
  • tidy3d/components/autograd/utils.py (100%)
  • tidy3d/components/geometry/base.py (100%)
  • tidy3d/components/geometry/polyslab.py (93.7%): Missing lines 1457,1461-1463,1496,1529,1571-1572,1579,1587-1588,1595,1656,1737,1753,1788,1888
  • tidy3d/components/geometry/primitives.py (80.0%): Missing lines 305
  • tidy3d/components/structure.py (100%)
  • tidy3d/web/api/autograd/autograd.py (100%)

Summary

  • Total: 471 lines
  • Missing: 37 lines
  • Coverage: 92%

tidy3d/components/autograd/derivative_utils.py

  165     @staticmethod
  166     def _nan_to_num_if_needed(coords: np.ndarray) -> np.ndarray:
  167         """Convert NaN and infinite values to finite numbers, optimized for finite inputs."""
  168         # skip check for small arrays - overhead exceeds benefit
! 169         if coords.size < 1000:
! 170             return np.nan_to_num(coords, posinf=LARGE_NUMBER, neginf=-LARGE_NUMBER)
  171 
! 172         if np.isfinite(coords).all():
! 173             return coords
! 174         return np.nan_to_num(coords, posinf=LARGE_NUMBER, neginf=-LARGE_NUMBER)
  175 
  176     @staticmethod
  177     def _evaluate_with_interpolators(
  178         interpolators: dict, coords: np.ndarray

  190         -------
  191         dict[str, np.ndarray]
  192             Dictionary mapping component names to field values at coordinates.
  193         """
! 194         coords = DerivativeInfo._nan_to_num_if_needed(coords)
! 195         if coords.dtype != GRADIENT_DTYPE_FLOAT and coords.dtype != GRADIENT_DTYPE_COMPLEX:
! 196             coords = coords.astype(GRADIENT_DTYPE_FLOAT, copy=False)
! 197         return {name: interp(coords) for name, interp in interpolators.items()}
  198 
  199     def create_interpolators(self, dtype=GRADIENT_DTYPE_FLOAT) -> dict:
  200         """Create interpolators for field components and permittivity data.

  219         from scipy.interpolate import RegularGridInterpolator
  220 
  221         cache_key = str(dtype)
  222         if cache_key in self._interpolators_cache:
! 223             return self._interpolators_cache[cache_key]
  224 
  225         interpolators = {}
  226         coord_cache = {}

  309             (N,) array of gradient values at each surface point. Must be integrated
  310             with appropriate quadrature weights to get total gradient.
  311         """
  312         if interpolators is None:
! 313             raise NotImplementedError(
  314                 "Direct field evaluation without interpolators is not implemented. "
  315                 "Please create interpolators using 'create_interpolators()' first."
  316             )

  347         # get permittivity jumps across interface
  348         if "eps_inf" in interpolators:
  349             eps_in = interpolators["eps_inf"](spatial_coords)
  350         else:
! 351             eps_in = self.eps_in
  352 
  353         if "eps_no" in interpolators:
  354             eps_out = interpolators["eps_no"](spatial_coords)
! 355         elif self.eps_background is not None:
! 356             eps_out = self.eps_background
  357         else:
! 358             eps_out = self.eps_out
  359 
  360         delta_eps_inv = 1.0 / eps_in - 1.0 / eps_out
  361         delta_eps = eps_in - eps_out

  420             dx_candidates.append(wl_fraction * lambda_min)
  421 
  422         # skin depth-based sampling for metallic materials
  423         if np.any(eps_real <= 0):
! 424             omega = 2 * np.pi * self.frequency
! 425             eps_neg = eps_real[eps_real <= 0]
! 426             delta_min = C_0 / (omega * np.sqrt(np.abs(eps_neg).max()))
! 427             dx_candidates.append(wl_fraction * delta_min)
  428 
  429         return max(min(dx_candidates), min_allowed_spacing)
  430 

tidy3d/components/geometry/polyslab.py

  1453 
  1454         # early return if polyslab is not in simulation domain
  1455         slab_min, slab_max = self.slab_bounds
  1456         if (slab_max <= sim_min[self.axis]) or (slab_min >= sim_max[self.axis]):
! 1457             log.warning(
  1458                 "'PolySlab' lies completely outside the simulation domain.",
  1459                 log_once=True,
  1460             )
! 1461             for p in derivative_info.paths:
! 1462                 vjps[p] = np.zeros_like(self.vertices) if p == ("vertices",) else 0.0
! 1463             return vjps
  1464 
  1465         # create interpolators once for ALL derivative computations
  1466         # use provided interpolators if available to avoid redundant field data conversions
  1467         interpolators = derivative_info.interpolators or derivative_info.create_interpolators(

  1492                 if idx == 0:
  1493                     v *= -1
  1494                 vjps[path] = v
  1495             else:
! 1496                 raise ValueError(f"No derivative defined w.r.t. 'PolySlab' field '{path}'.")
  1497 
  1498         return vjps
  1499 
  1500     def _compute_derivative_slab_bounds(

  1525         r2_max = min(r2_max, poly_max_r2)
  1526 
  1527         if (r1_max <= r1_min) and (r2_max <= r2_min):
  1528             # the polygon does not intersect the current simulation slice
! 1529             return 0.0
  1530 
  1531         # re-compute the extents after clipping to the polygon bounds
  1532         extents = np.array([r1_max - r1_min, r2_max - r2_min])

  1567         line_dim = 1 if np.isclose(extents[0], 0) else 0
  1568 
  1569         poly_min_r1, poly_min_r2, poly_max_r1, poly_max_r2 = face_poly.bounds
  1570         if line_dim == 0:  # x varies, y is fixed
! 1571             l_min = max(r1_min, poly_min_r1)
! 1572             l_max = min(r1_max, poly_max_r1)
  1573         else:  # y varies, x is fixed
  1574             l_min = max(r2_min, poly_min_r2)
  1575             l_max = min(r2_max, poly_max_r2)

  1575             l_max = min(r2_max, poly_max_r2)
  1576 
  1577         length = l_max - l_min
  1578         if np.isclose(length, 0):
! 1579             return 0.0
  1580 
  1581         dx = derivative_info.adaptive_vjp_spacing()
  1582         n_seg = max(1, int(np.ceil(length / dx)))
  1583         coords = np.linspace(l_min, l_max, 2 * n_seg + 1, dtype=GRADIENT_DTYPE_FLOAT)[1::2]

  1583         coords = np.linspace(l_min, l_max, 2 * n_seg + 1, dtype=GRADIENT_DTYPE_FLOAT)[1::2]
  1584 
  1585         # build XY coordinates and in-plane direction vectors
  1586         if line_dim == 0:
! 1587             xy = np.column_stack((coords, np.full_like(coords, r2_min)))
! 1588             dir_vec_plane = np.column_stack((np.ones_like(coords), np.zeros_like(coords)))
  1589         else:
  1590             xy = np.column_stack((np.full_like(coords, r1_min), coords))
  1591             dir_vec_plane = np.column_stack((np.zeros_like(coords), np.ones_like(coords)))

  1591             dir_vec_plane = np.column_stack((np.zeros_like(coords), np.ones_like(coords)))
  1592 
  1593         inside = shapely.contains_xy(face_poly, xy[:, 0], xy[:, 1])
  1594         if not inside.any():
! 1595             return 0.0
  1596 
  1597         xy = xy[inside]
  1598         dir_vec_plane = dir_vec_plane[inside]
  1599         n_pts = len(xy)

  1652         pts = np.column_stack((r1_flat, r2_flat))
  1653 
  1654         in_face = shapely.contains_xy(face_poly, pts[:, 0], pts[:, 1])
  1655         if not in_face.any():
! 1656             return 0.0
  1657 
  1658         xyz = self.unpop_axis_vect(
  1659             np.full(in_face.sum(), ax_val, dtype=GRADIENT_DTYPE_FLOAT), pts[in_face]
  1660         )

  1733         z1 = min(self.slab_bounds[1], sim_max[self.axis])
  1734 
  1735         # early return if no z-slices
  1736         if (not is_2d) and z1 <= z0:
! 1737             return np.zeros_like(vertices)
  1738 
  1739         if is_2d:
  1740             z_centers = np.array([self.center_axis], dtype=GRADIENT_DTYPE_FLOAT)
  1741             dz_surf = 1.0

  1749         normals_2d = np.delete(basis["norm"], self.axis, axis=1)
  1750 
  1751         # use provided interpolators or create them if not provided
  1752         if interpolators is None:
! 1753             interpolators = derivative_info.create_interpolators(dtype=GRADIENT_DTYPE_FLOAT)
  1754 
  1755         def compute_edge_clip_bounds(v0_3d, v1_3d, sim_min, sim_max):
  1756             """
  1757             Compute parametric bounds [t_start, t_end] for the portion of edge

  1784                     return None
  1785 
  1786             # avoid numerical issues at boundaries
  1787             if t_end <= t_start + EDGE_CLIP_TOLERANCE:
! 1788                 return None
  1789 
  1790             return (t_start, t_end)
  1791 
  1792         # estimate total number of patches for pre-allocation

  1884         for ei, (v0, v1) in enumerate(zip(vertices, next_v)):
  1885             edge_vec = v1 - v0
  1886             L = np.linalg.norm(edge_vec)
  1887             if np.isclose(L, 0.0):
! 1888                 continue
  1889 
  1890             # check if edge is definitely inside 2D bounds
  1891             edge_min_2d = np.minimum(v0, v1)
  1892             edge_max_2d = np.maximum(v0, v1)

tidy3d/components/geometry/primitives.py

  301             "paths": [("vertices",), ("slab_bounds", 0), ("slab_bounds", 1)],
  302             "deep": False,
  303         }
  304         if derivative_info.interpolators is not None:
! 305             update_kwargs["interpolators"] = derivative_info.interpolators
  306 
  307         derivative_info_polyslab = derivative_info.updated_copy(**update_kwargs)
  308         vjps_polyslab = polyslab._compute_derivatives(derivative_info_polyslab)

@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 2 times, most recently from dc24380 to 674dea9 Compare July 12, 2025 19:10
@momchil-flex momchil-flex added the rc2 2nd pre-release label Jul 14, 2025
@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 2 times, most recently from e7670d6 to 9ecad6b Compare July 15, 2025 08:53
Copy link
Contributor

@groberts-flex groberts-flex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good to me. Left one extra very minor question.

I've also been able to test this numerically and seen either same or improved gradients for those tests. And I was able to test this week that it worked and solved out of the box a sampling issue in polyslab gradient computation for a user.

Everything seems well tested and the performance is great!

@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch 2 times, most recently from 2809d02 to 9da31d9 Compare July 16, 2025 07:53
@momchil-flex momchil-flex enabled auto-merge July 16, 2025 08:06
@momchil-flex momchil-flex added this pull request to the merge queue Jul 16, 2025
@daquinteroflex daquinteroflex removed this pull request from the merge queue due to a manual request Jul 16, 2025
@daquinteroflex daquinteroflex enabled auto-merge July 16, 2025 08:32
- Create tidy3d/components/autograd/constants.py
- Move MAX_NUM_TRACED_STRUCTURES and MAX_NUM_ADJOINT_PER_FWD from autograd.py
- Move PTS_PER_WVL_MAT_CYLINDER_DISCRETIZE from primitives.py
- Add functools.lru_cache decorator to _shapely_is_older_than()
- Remove global _SHAPELY_VERSION variable
- Cache up to 8 version comparison results
- Reduces repeated parsing of version strings in hot paths
… edges

- Implement adaptive field sampling for PolySlab and Cylinder for shape
  derivatives
- Replace fixed grid with Gauss-Legendre quadrature
- Sample fields along all surface boundaries (edges and faces)
- Handle edge cases: slabs outside simulation, faces at ±inf, 2D simulations
- Add analytical tests for gradient computation
- Remove `DerivativeSurfaceMesh` abstraction in favor of direct evaluation
- Convert `DerivativeInfo` from Pydantic model to dataclass for efficiency
- Implement interpolator sharing at `GeometryGroup` level
- Add Gauss-Legendre quadrature caching
- Optimize memory allocation with pre-allocated arrays
- Ensure dtype consistency throughout gradient pipeline
@yaugenst-flex yaugenst-flex force-pushed the yaugenst-flex/polyslab-grads branch from 9da31d9 to 1422bad Compare July 16, 2025 09:37
@daquinteroflex daquinteroflex added this pull request to the merge queue Jul 16, 2025
Merged via the queue into develop with commit 26e5981 Jul 16, 2025
10 checks passed
@daquinteroflex daquinteroflex deleted the yaugenst-flex/polyslab-grads branch July 16, 2025 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.9 will go into version 2.9.* rc2 2nd pre-release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants