|
| 1 | +# 2024-12-09 Implementation Plan: Proxy Frontend API Requests Through Nuxt |
| 2 | + |
| 3 | +**Author**: @obulat |
| 4 | + |
| 5 | +## Reviewers |
| 6 | + |
| 7 | +- [ ] @dhruvkb |
| 8 | +- [ ] @krysal |
| 9 | + |
| 10 | +## Project Links |
| 11 | + |
| 12 | +- [Project Thread](https://github.com/WordPress/openverse/issues/3473) |
| 13 | +- [Project Milestone](https://github.com/WordPress/openverse/milestone/35) |
| 14 | + |
| 15 | +This project does not have a project proposal because the scope and rationale of |
| 16 | +the project are clear, as defined in the project thread. |
| 17 | + |
| 18 | +## Overview |
| 19 | + |
| 20 | +Currently, client requests[^1] from the frontend to the API are sent without |
| 21 | +authentication. As a result, strict API rate limiting on unauthenticated |
| 22 | +requests would inadvertently throttle legitimate traffic from openverse.org |
| 23 | +users. By proxying all client requests[^1] through the Nuxt server, we can |
| 24 | +authenticate them and thus distinguish them from the anonymous traffic in the |
| 25 | +API. |
| 26 | + |
| 27 | +This implementation plan outlines how we will route all frontend traffic to the |
| 28 | +API through Nuxt server routes, effectively “authenticating” these requests at |
| 29 | +the API level, and configure Cloudflare rate-limiting to protect these routes |
| 30 | +from excessive automated traffic that violates the Openverse Terms of Service. |
| 31 | + |
| 32 | +While this plan does not include lowering unauthenticated rate limits in the |
| 33 | +API, it establishes the groundwork for it. This approach will ensure better |
| 34 | +stability and fairness of the API service, while also preserving the normal user |
| 35 | +experience on the frontend. |
| 36 | + |
| 37 | +## Expected Outcomes |
| 38 | + |
| 39 | +1. **Distinguishable Frontend Requests in the API**: Frontend requests are |
| 40 | + authenticated at the API level and no longer constrained by unauthenticated |
| 41 | + rate limits. |
| 42 | + |
| 43 | +2. **Protected Nuxt Server Routes and Fair Access**: Nuxt server routes are |
| 44 | + protected from automated abuse and suspicious traffic by Cloudflare |
| 45 | + rate-limiting, ensuring that legitimate users (including those in shared IP |
| 46 | + environments, such as schools and libraries) maintain reliable access without |
| 47 | + undue friction. |
| 48 | + |
| 49 | +## Request Flow |
| 50 | + |
| 51 | +From the Nuxt client perspective, the request flow will not change much: instead |
| 52 | +of going directly to the API, requests will go to the Nuxt server routes |
| 53 | +(`/api/**`). The main difference is that, when rate-limited, the client will |
| 54 | +receive a Cloudflare challenge response that will be handled by |
| 55 | +[Nuxt Turnstile Module](https://github.com/nuxt-modules/turnstile). |
| 56 | + |
| 57 | +The Nuxt server routes will authenticate the requests by adding the API token, |
| 58 | +and sign requests with HMAC to allow Cloudflare to apply rate-limiting on a |
| 59 | +per-user basis, even when multiple users share the same IP (e.g., in NAT |
| 60 | +environments). |
| 61 | + |
| 62 | +```{mermaid} |
| 63 | +sequenceDiagram |
| 64 | + participant O as Client |
| 65 | + participant C as Cloudflare |
| 66 | + participant NS as Nuxt Server |
| 67 | + participant A as API |
| 68 | + O ->> C: HMAC-signed request |
| 69 | + Note right of C: "Check the rate limits based on cookie&IP" |
| 70 | + alt successful |
| 71 | + C ->> NS: request |
| 72 | + NS ->> A: authenticated request |
| 73 | + A ->> NS: response |
| 74 | + NS ->> O: response |
| 75 | + else rate-limited |
| 76 | + C ->> O: turnstile widget |
| 77 | + O ->> C: Cloudflare challenge cookie |
| 78 | + C ->> NS: request |
| 79 | + NS ->> A: authenticated request |
| 80 | + A ->> NS: response |
| 81 | + NS ->> O: response |
| 82 | + end |
| 83 | +``` |
| 84 | + |
| 85 | +## Cloudflare Rate Limiting |
| 86 | + |
| 87 | +Originally, I thought of using Cloudflare managed challenges to protect the Nuxt |
| 88 | +server routes from abuse. However, this would have required issuing a challenge |
| 89 | +to every new user, which would have been a poor user experience. Instead, we |
| 90 | +will use Cloudflare rate limiting to protect the Nuxt server routes from abuse. |
| 91 | +The rate limiting will be based on the IP and cookie combination, and will |
| 92 | +trigger a managed challenge only when the request volume from a user exceeds the |
| 93 | +defined threshold. This way, legitimate users will not be inconvenienced by |
| 94 | +challenges, while automated traffic will be rate-limited and challenged. |
| 95 | + |
| 96 | +- **Supported Browsers for Challenges**: |
| 97 | + [Cloudflare Documentation](https://developers.cloudflare.com/waf/reference/cloudflare-challenges/#supported-browsers) |
| 98 | + notes that managed challenges support modern browsers, but does not show the |
| 99 | + minimum supported versions. Users with older browsers may struggle, |
| 100 | + potentially creating accessibility issues. |
| 101 | + |
| 102 | +- **Referer Header Changes After Challenge**: After passing a managed challenge, |
| 103 | + the referer header may change; see |
| 104 | + [Referer Header after Challenge](https://developers.cloudflare.com/waf/reference/cloudflare-challenges/#referer-header). |
| 105 | + We must ensure this does not disrupt our request logic. |
| 106 | + |
| 107 | +- **Multi-language Support**: Challenges support multiple languages [^2], |
| 108 | + although the number of supported languages is lower than the number of |
| 109 | + languages supported by Openverse. |
| 110 | + |
| 111 | +### Impact on the Nuxt Server |
| 112 | + |
| 113 | +- **Load on Nuxt CPU/memory**: Proxying requests through the Nuxt server will |
| 114 | + increase the server load. We need to ensure that the server can handle that |
| 115 | + load by monitoring it and adjusting the frontend task values if necessary. The |
| 116 | + server proxy routes should be written so that they do not consume CPU |
| 117 | + resources heavily. |
| 118 | + |
| 119 | +## Prior Art |
| 120 | + |
| 121 | +- HMAC signing: |
| 122 | + [Sign k6 requests with HMAC to enable WAF bypass](https://github.com/WordPress/openverse/pull/4908) |
| 123 | + |
| 124 | +## Step-by-step plan |
| 125 | + |
| 126 | +We’ll implement this plan in the following order. I’ve noted which steps depend |
| 127 | +on others, and included references to the detailed instructions below: |
| 128 | + |
| 129 | +1. [Move API Token to Nuxt server middleware](#move-api-token-to-nuxt-server-middleware) |
| 130 | +2. [Set up Nuxt Server proxy routes](#set-up-nuxt-server-proxy-routes) |
| 131 | +3. [Add the frontend feature flag](#add-the-frontend-feature-flag) |
| 132 | +4. [Set up Cloudflare rate limiting for staging](#set-up-cloudflare-rate-limiting-for-staging) |
| 133 | +5. [Use server routes instead of the `api-client` in the frontend when the feature flag is on](#use-server-routes-in-the-frontend-when-the-feature-flag-is-on) |
| 134 | +6. [Set up the Nuxt turnstile module](#set-up-the-nuxt-turnstile-module) |
| 135 | +7. [Set up Cloudflare rate limiting for production](#set-up-cloudflare-rate-limiting-for-production) |
| 136 | +8. [Switch on the flag in production](#switch-the-feature-flag-on-in-production) |
| 137 | +9. [Monitor the Cloudflare dashboard and Sentry logs for challenge occurrences](#monitor-the-cloudflare-dashboard-and-sentry-logs-for-challenge-occurrences) |
| 138 | + |
| 139 | +## Step Details |
| 140 | + |
| 141 | +### Move API Token to Nuxt server middleware |
| 142 | + |
| 143 | +Extract the functionality that generates the API token from |
| 144 | +[`frontend/src/plugins/01.api-token.server.ts`](https://github.com/WordPress/openverse/blob/a9441f7e38fcf0d56f9732122c9d3106a87eabe5/frontend/src/plugins/01.api-token.server.ts) |
| 145 | +to `frontend/server/utils/api-token.ts`. |
| 146 | + |
| 147 | +Create the |
| 148 | +[Nuxt server middleware](https://nuxt.com/docs/guide/directory-structure/server#server-middleware) |
| 149 | +that requests the API token for every request, and adds the token to the event |
| 150 | +context. In the new Nuxt server routes, we will be able to use it as |
| 151 | +`event.context.apiToken`. |
| 152 | + |
| 153 | +To use the token in the current frontend code, update the |
| 154 | +[`use-api-client` composable](https://github.com/WordPress/openverse/blob/a9441f7e38fcf0d56f9732122c9d3106a87eabe5/frontend/src/composables/use-api-client.ts#L7) |
| 155 | +to use the token from the event context: |
| 156 | + |
| 157 | +```ts |
| 158 | +/** Old code */ |
| 159 | +// const { $openverseApiToken: accessToken } = useNuxtApp() |
| 160 | +/** New code */ |
| 161 | +const apiToken = useRequestEvent()?.context.apiToken |
| 162 | +``` |
| 163 | + |
| 164 | +### Set up Nuxt Server proxy routes |
| 165 | + |
| 166 | +- Create the Nuxt server routes under `/api/`: |
| 167 | + |
| 168 | + - `/api/search/[type]` - for search requests |
| 169 | + - `/api/[type]/[id]` - for single result requests |
| 170 | + - `/api/[type]/related/[id]` - for related media requests |
| 171 | + |
| 172 | +- Create a helper function that accepts the H3 `event` object, and extracts the |
| 173 | + request details to forward to the API: |
| 174 | + - media type |
| 175 | + - the API media type slug (i.e., the special case of "images" for "image") |
| 176 | + - media id, if applicable |
| 177 | + - search query |
| 178 | + - headers that should be passed on to the API |
| 179 | +- From the |
| 180 | + [k6 implementation](https://github.com/WordPress/openverse/pull/4908), copy |
| 181 | + the helper function that signs the request with HMAC. |
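
The signing helper can be sketched as follows. This is a minimal sketch that follows the token shape Cloudflare documents for `is_timed_hmac_valid_v0` (a Unix timestamp, a dash, then a URL-encoded base64 HMAC-SHA256 of the message concatenated with the timestamp). It is not a copy of the k6 helper: `signPath`, its parameters, and the choice of the URL path as the signed message are assumptions made for illustration.

```typescript
import { createHmac } from "node:crypto"

/**
 * Builds a `<timestamp>-<mac>` token for a URL path.
 * `signPath` is a hypothetical name; the secret must match the one
 * configured in the Cloudflare rule.
 */
function signPath(path: string, secret: string, timestampSec: number): string {
  // The MAC covers the message (the path) followed by the timestamp.
  const mac = createHmac("sha256", secret)
    .update(`${path}${timestampSec}`)
    .digest("base64")
  // Base64 may contain "+", "/" and "=", so it is URL-encoded.
  return `${timestampSec}-${encodeURIComponent(mac)}`
}
```

A caller would pass the current time in seconds, for example `signPath("/api/search/image/", secret, Math.floor(Date.now() / 1000))`. How the token is attached to the request (query parameter or header) depends on the expression used in the Cloudflare rule.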
| 182 | + |
| 183 | +The server route handler should: |
| 184 | + |
| 185 | +- generate the relevant API request URL (e.g., converting |
| 186 | + `/api/search/image/?q=cat` to `<apiUrl>/v1/images/?q=cat`) |
| 187 | +- add the headers (headers from the original client request that can be proxied, |
| 188 | + authentication, HMAC) |
| 189 | +- send the request using `ofetch` (or `ofetch.raw` to extract the response |
| 190 | + headers for the `SEARCH_RESPONSE_TIME` Plausible event) |
| 191 | +- handle errors and return the appropriate response to the client |
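
The URL and header steps of the handler might look like the sketch below. It is framework-free so it can be run on its own; in a real route the inputs would come from the H3 `event`. The function name, the header allowlist, and the assumption that `apiUrl` is an origin with a trailing slash are all hypothetical, and only the search route is covered.

```typescript
// All names here (`buildApiRequest`, `API_SLUGS`, `PROXIED_HEADERS`) are
// hypothetical and only illustrate the handler steps listed above.

/** Frontend media type -> API slug ("image" maps to the special-case "images"). */
const API_SLUGS = new Map<string, string>([
  ["image", "images"],
  ["audio", "audio"],
])

/** Assumed allowlist of client headers that may be forwarded to the API. */
const PROXIED_HEADERS = ["accept", "accept-language", "user-agent"]

function buildApiRequest(
  apiUrl: string,
  routePath: string,
  query: Record<string, string>,
  clientHeaders: Record<string, string>,
  apiToken?: string
): { url: string; headers: Record<string, string> } {
  // Extract the media type from a search route such as `/api/search/image/`.
  const match = routePath.match(/^\/api\/search\/([^/]+)\/?$/)
  const slug = match ? API_SLUGS.get(match[1]) : undefined
  if (!slug) {
    throw new Error(`Unsupported proxy route: ${routePath}`)
  }

  // `<apiUrl>/v1/<slug>/?<query>`; URLSearchParams handles the encoding.
  const url = new URL(`v1/${slug}/`, apiUrl)
  url.search = new URLSearchParams(query).toString()

  // Forward only allowlisted headers, then add the authentication header.
  const headers: Record<string, string> = {}
  for (const [name, value] of Object.entries(clientHeaders)) {
    const lower = name.toLowerCase()
    if (PROXIED_HEADERS.includes(lower)) {
      headers[lower] = value
    }
  }
  if (apiToken) {
    headers["authorization"] = `Bearer ${apiToken}`
  }
  return { url: url.toString(), headers }
}
```

Using an allowlist rather than a blocklist keeps cookies and other client-only headers from reaching the API by default. The resulting `url` and `headers` could then be passed to `ofetch`.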
| 192 | + |
| 193 | +### Add the frontend feature flag |
| 194 | + |
| 195 | +Add the `proxy_requests` feature flag to `frontend/feat/feature-flags.json`: |
| 196 | + |
| 197 | +```json |
| 198 | +{ |
| 199 | + "proxy_requests": { |
| 200 | + "description": "Proxy frontend requests through Nuxt server", |
| 201 | + "status": { |
| 202 | + "staging": "switchable", |
| 203 | + "production": "disabled" |
| 204 | + }, |
| 205 | + "defaultState": "off", |
| 206 | + "storage": "cookie" |
| 207 | + } |
| 208 | +} |
| 209 | +``` |
| 210 | + |
| 211 | +### Set up Cloudflare rate limiting for staging |
| 212 | + |
| 213 | +Add the initial rate limiting to the Nuxt server routes (`/api/**`) in the |
| 214 | +infrastructure repository (_maintainers only_). |
| 215 | + |
| 216 | +Cloudflare rules consist of two parts: the expression to match the request and |
| 217 | +the action to take when the request matches the expression [^3]. |
| 218 | + |
| 219 | +#### Expression |
| 220 | + |
| 221 | +The rate limiting expression should be the opposite of the one we use in k6 tests: |
| 222 | +we want to limit the requests that are not authenticated with HMAC. |
| 223 | + |
| 224 | +``` |
| 225 | +http.host eq "staging.openverse.org" |
| 226 | +and starts_with(http.request.uri.path, "/api/") |
| 227 | +and not is_timed_hmac_valid_v0(...) |
| 228 | +``` |
| 229 | + |
| 230 | +#### Action |
| 231 | + |
| 232 | +The action should be `managed_challenge`, which will issue a challenge to the |
| 233 | +user when the rate limit is exceeded. |
| 234 | + |
| 235 | +#### Testing the rate limits |
| 236 | + |
| 237 | +When testing this feature, we can adjust the staging limits using the UI. I |
| 238 | +envisage that we will need to adjust the limits many times because testing the |
| 239 | +Cloudflare rate limiting locally is not possible. |
| 240 | + |
| 241 | +### Use server routes in the frontend when the feature flag is on |
| 242 | + |
| 243 | +Create two fetch functions in the stores: one that uses the current direct API |
| 244 | +requests with `api-service`, and the other that sends the request to the Nuxt |
| 245 | +server route. The fetch function to use should be selected based on the feature |
| 246 | +flag status. |
| 247 | + |
| 248 | +The change will need to be added to the `related`, `single-media` and `media` |
| 249 | +stores. Here's a sample implementation in the media store, |
| 250 | +`frontend/stores/media/index.ts`: |
| 251 | + |
| 252 | +```ts |
| 253 | +if (featureFlagStore.isOn("proxy_requests")) { |
| 254 | + const result = await getProxiedResponse(mediaType, queryParams) |
| 255 | +} else { |
| 256 | + const client = useApiClient() |
| 257 | + const result = await getApiClientResponse(client, mediaType, queryParams) |
| 258 | + if (result.error) { |
| 259 | + // handle the error |
| 260 | + return null |
| 261 | + } |
| 262 | + // handle the response |
| 263 | +} |
| 264 | + |
| 265 | +const getProxiedResponse = async ( |
| 266 | + mediaType: string, |
| 267 | + queryParams: Record<string, string> |
| 268 | +) => { |
| 269 | + try { |
| 270 | + const { eventPayload, data } = await $fetch( |
| 271 | + `/api/search/${mediaType}/?${new URLSearchParams(queryParams)}` |
| 272 | + ) |
| 273 | + return { eventPayload, data } |
| 274 | + } catch (error) { |
| 275 | + // return the $fetch error |
| 276 | + } |
| 277 | +} |
| 278 | + |
| 279 | +const getApiClientResponse = async ( |
| 280 | + client: ApiClient, |
| 281 | + mediaType: string, |
| 282 | + queryParams: Record<string, string> |
| 283 | +) => { |
| 284 | + try { |
| 285 | + const { eventPayload, data } = await client.search(mediaType, queryParams) |
| 286 | + return { eventPayload, data } |
| 287 | + } catch (error) { |
| 288 | + // return the error |
| 289 | + } |
| 290 | +} |
| 291 | +``` |
| 292 | + |
| 293 | +### Set up the Nuxt turnstile module |
| 294 | + |
| 295 | +When the Cloudflare rate limiting triggers a managed challenge, the client will |
| 296 | +receive an HTML Cloudflare challenge response. To handle this response, we can |
| 297 | +use the [Nuxt turnstile module](https://nuxt.com/modules/turnstile). |
| 298 | + |
| 299 | +This is the part of the plan I'm most unsure about. I couldn't understand the |
| 300 | +details of how this module works, so we would probably need to do a lot of |
| 301 | +testing to make sure it works as expected. Unfortunately, we cannot test |
| 302 | +Cloudflare challenge responses locally because the proxy is only set up in |
| 303 | +staging/production. We will have to use staging for this, and to make the |
| 304 | +testing easier, we can set the staging rate limits to be very low. |
| 305 | + |
| 306 | +We should capture any problems with the turnstile module in Sentry to make sure |
| 307 | +that no users are affected negatively by the rate-limiting. Sentry will also |
| 308 | +give us an idea of how older browsers are handling the challenges. |
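
Before the client can show a challenge or report a problem to Sentry, it has to recognize that a response is a challenge page and not API data. Cloudflare documents a `cf-mitigated: challenge` response header for this purpose. The sketch below checks for it. `isCloudflareChallenge` is a hypothetical helper name, and whether the frontend or the turnstile module should perform this check is an assumption that still needs testing in staging.

```typescript
/**
 * Returns true when the response headers mark a Cloudflare challenge page.
 * `isCloudflareChallenge` is a hypothetical helper, not part of any module.
 */
function isCloudflareChallenge(headers: Headers): boolean {
  // `Headers.get` matches header names case-insensitively.
  return headers.get("cf-mitigated") === "challenge"
}
```

With `$fetch`, a challenge would most likely surface as a thrown error, so the check would run on the headers of the error's `response`, if one is present.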
| 309 | + |
| 310 | +### Set up Cloudflare rate limiting for production |
| 311 | + |
| 312 | +The rate limiting rule should be similar to that of the staging environment. |
| 313 | + |
| 314 | +The rates should be generous enough to accommodate normal user behavior. Later, |
| 315 | +they can be made stricter to deter automated abuse. |
| 316 | + |
| 317 | +[More information on how request counts are calculated](https://developers.cloudflare.com/waf/rate-limiting-rules/request-rate/) |
| 318 | + |
| 319 | +### Switch the feature flag on in production |
| 320 | + |
| 321 | +Set the `proxy_requests` feature flag to `enabled` in the production |
| 322 | +environment. This will start routing all frontend requests through the Nuxt |
| 323 | +server. |
| 324 | + |
| 325 | +### Monitor the Cloudflare dashboard and Sentry logs for challenge occurrences |
| 326 | + |
| 327 | +The following data need to be monitored: |
| 328 | + |
| 329 | +- Cloudflare dashboard (Security > WAF > Rate limiting rules) shows the number |
| 330 | + of issued challenges, and the |
| 331 | + [Challenge solve rate (CSR)](https://developers.cloudflare.com/bots/concepts/challenge-solve-rate/) |
| 332 | + for the rule. The CSR should be very high since we expect that most of the |
| 333 | + frontend usage is not automated. |
| 334 | +- Nuxt CPU and memory usage. In case the server is under heavy load, we might |
| 335 | + need to adjust the rate limits and/or frontend task values. |
| 336 | + |
| 337 | +## Rejected Alternatives |
| 338 | + |
| 339 | +- **Always Issuing a Challenge Without Thresholds**: Issuing a managed challenge |
| 340 | + to every new user, regardless of load or conditions, would inconvenience users |
| 341 | + behind NATs and degrade the initial user experience. While simpler to |
| 342 | + implement (we would not need to calculate the exact rate limit), it fails our |
| 343 | + design goal of minimizing friction for normal users. |
| 344 | + |
| 345 | +- **Proxying thumbnail requests**: The thumbnail requests from the frontend will |
| 346 | + remain anonymous on the API level. Since controlling search and related |
| 347 | + requests naturally throttles excessive thumbnail retrieval, we consider it |
| 348 | + unnecessary to further load the Nuxt server. |
| 349 | + |
| 350 | +## Dependencies |
| 351 | + |
| 352 | +- **Feature Flags**: |
| 353 | + |
| 354 | + - `proxy_requests` |
| 355 | + |
| 356 | +- **Infrastructure**: Cloudflare WAF rules, monitoring (Sentry, Cloudflare |
| 357 | + dashboard). |
| 358 | + |
| 359 | +- **Tools & Packages**: |
| 360 | + - `ofetch`: This would be a great occasion to move away from `axios`. `ofetch` |
| 361 | + is used internally by Nuxt, so it's a good choice for the server proxying. |
| 362 | + The store requests can be made with `$fetch`, Nuxt's helper wrapper over |
| 363 | + `ofetch`. |
| 364 | + - `h3` for server route handling |
| 365 | + - [Nuxt turnstile module](https://nuxt.com/modules/turnstile) - the module for |
| 366 | + handling challenges returned by the Cloudflare rate limiting from the Nuxt |
| 367 | + server requests |
| 368 | + |
| 369 | +## Accessibility and Privacy |
| 370 | + |
| 371 | +- **Accessibility**: |
| 372 | + |
| 373 | + - Cloudflare challenges are generally accessible, but older browsers may fail. |
| 374 | + - Multilingual support helps non-English speakers. |
| 375 | + |
| 376 | +- **Privacy**: |
| 377 | + - No personal data is stored; cookies are short-lived and anonymous. |
| 378 | + |
| 379 | +## Rollback Strategy |
| 380 | + |
| 381 | +- Set the `proxy_requests` feature flag to `off`. The rate limiting rules will |
| 382 | + then stop affecting users, since the requests won't be going through the Nuxt |
| 383 | + server anymore. |
| 384 | + |
| 385 | +[^1]: |
| 386 | + `Client requests` - Requests made by the frontend client to the API after |
| 387 | + the first page load. These requests are sent whenever a user submits a |
| 388 | + search, selects a filter or a search result. These requests are currently |
| 389 | + sent directly to the Openverse API without any authentication, because using the |
| 390 | + API credentials in the client code would expose them to the public. |
| 391 | + |
| 392 | +[^2]: |
| 393 | + [Cloudflare multi-language support](https://developers.cloudflare.com/waf/reference/cloudflare-challenges/#multi-language-support) |
| 394 | + |
| 395 | +[^3]: |
| 396 | + [Cloudflare rate limiting rules](https://developers.cloudflare.com/waf/rate-limiting-rules/) |