|  | 
|  | 1 | +# Remote image thumbnailing | 
|  | 2 | + | 
|  | 3 | +## Types of remote images | 
|  | 4 | + | 
|  | 5 | +### Inline image URLs | 
|  | 6 | + | 
|  | 7 | +These are `![alt text](https://example.com/image.png)` images, which are | 
|  | 8 | +explicit[^1]. At message send time, we should render these as spinners in our | 
|  | 9 | +initial `rendered_content`[^2], and insert a background process to fetch them. If | 
|  | 10 | +the background fetch returns an image type, we should resize the image into our | 
|  | 11 | +set of supported sizes/types, and cache these in-memory[^3]. The duration of | 
|  | 12 | +this cache should be based on the caching headers of the remote image. | 
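
As a rough illustration of deriving a cache duration from the remote image's caching headers, the sketch below reads only the `Cache-Control` header. The default and maximum durations, and the function name `cache_ttl_seconds`, are assumptions made for this example; other headers such as `Expires` are ignored for brevity.

```python
import re
from typing import Optional

# Hypothetical values; this document does not specify defaults or limits.
DEFAULT_TTL_SECONDS = 3600
MAX_TTL_SECONDS = 7 * 24 * 3600


def cache_ttl_seconds(cache_control: Optional[str]) -> int:
    """Derive a cache lifetime, in seconds, from a Cache-Control header value."""
    if not cache_control:
        return DEFAULT_TTL_SECONDS
    directives = [part.strip().lower() for part in cache_control.split(",")]
    # Treat "do not cache" directives as a zero lifetime.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            # Clamp so a remote server cannot pin an entry indefinitely.
            return min(int(match.group(1)), MAX_TTL_SECONDS)
    return DEFAULT_TTL_SECONDS
```

Clamping to an upper bound is a design choice of this sketch, not something the plan above requires.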
|  | 13 | + | 
|  | 14 | +The message should then be silently updated to point to a signed `/thumbnail` | 
|  | 15 | +URL with a "reasonable" size/format; the signing covers the URL, but not the | 
|  | 16 | +size, since clients may rewrite it to their preferred size. | 
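
One possible way to sign the URL but not the size is an HMAC over the `url` value alone. The following is a minimal sketch under that assumption; the key, the `sig` parameter name, and all helper names are hypothetical and not part of the existing endpoint.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical key; a real server would load this from its secrets store.
SIGNING_KEY = b"example-secret-key"


def sign_url(url: str, key: bytes = SIGNING_KEY) -> str:
    """Return a hex HMAC-SHA256 digest that covers only the remote URL."""
    return hmac.new(key, url.encode("utf-8"), hashlib.sha256).hexdigest()


def make_thumbnail_path(url: str, size: str, key: bytes = SIGNING_KEY) -> str:
    """Build a /thumbnail path whose hypothetical `sig` parameter covers `url` only."""
    query = urlencode({"url": url, "size": size, "sig": sign_url(url, key)})
    return f"/thumbnail?{query}"


def is_valid_thumbnail_path(path: str, key: bytes = SIGNING_KEY) -> bool:
    """Verify the signature; `size` is deliberately not part of the signed data."""
    params = parse_qs(urlsplit(path).query)
    url = params.get("url", [""])[0]
    sig = params.get("sig", [""])[0]
    expected = sign_url(url, key)
    # Compare as bytes, in constant time, to avoid leaking timing information.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

Because `size` is excluded from the signed data, a client can change `size=thumbnail` to another value and the signature still verifies, while changing `url` invalidates it.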
|  | 17 | + | 
|  | 18 | +On the server side, the `/thumbnail` URL validates the signature, and returns | 
|  | 19 | +400 (possibly with the content of a static failure image, varying on the `Accept` | 
|  | 20 | +header) if the signature is invalid. It then checks that the requested | 
|  | 21 | +size/format is in its supported set, and "rounds" to the closest match if it is | 
|  | 22 | +not (based on the `Accept` header for format). If the thumbnail size/format is in | 
|  | 23 | +cache, it serves it. | 
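
The "rounding" step might look something like the sketch below. The supported widths and formats are invented for illustration (this document does not fix them), and the format choice ignores `q` values in the header for brevity.

```python
# Hypothetical supported set; the real sizes and formats are not specified here.
SUPPORTED_WIDTHS = (100, 300, 840)
SUPPORTED_FORMATS = ("image/webp", "image/jpeg")  # in server preference order
FALLBACK_FORMAT = "image/jpeg"


def round_width(requested: int) -> int:
    """Return the supported width closest to the requested width."""
    return min(SUPPORTED_WIDTHS, key=lambda width: abs(width - requested))


def pick_format(accept_header: str) -> str:
    """Pick the first supported format that the request header lists explicitly.

    Wildcards such as */* and q-values are not interpreted; anything that
    does not name a supported format falls back to FALLBACK_FORMAT.
    """
    accepted = {
        part.split(";")[0].strip().lower() for part in accept_header.split(",")
    }
    for fmt in SUPPORTED_FORMATS:
        if fmt in accepted:
            return fmt
    return FALLBACK_FORMAT
```

For example, a request for width 250 would be served the 300-wide rendition under these assumed sizes.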
|  | 24 | + | 
|  | 25 | +If the requested image is not in our local cache, it must re-fetch it. This | 
|  | 26 | +happens synchronously, after which it must resize and store it in cache, and | 
|  | 27 | +then provide the appropriate resized image to the client. Any non-expired[^4] | 
|  | 28 | +size/format combinations should be re-rendered and inserted into the cache at | 
|  | 29 | +the same time, for three reasons: the network fetch time is likely significant | 
|  | 30 | +compared to the resize time; we should endeavor to provide a consistent preview | 
|  | 31 | +if the image is mutating over time; and one access may herald other accesses | 
|  | 32 | +from other clients. | 
|  | 33 | + | 
|  | 34 | +In the event that either the initial fetch, or subsequent re-fetches, times out, | 
|  | 35 | +returns a document with a non-image `Content-Type`, or cannot be parsed as its | 
|  | 36 | +purported image type, then we cache and return stock "invalid image" | 
|  | 37 | +content. We may wish to set an upper time bound on this (or multiple different | 
|  | 38 | +bounds, based on the failure type), to handle intermittent failures. | 
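
To make the idea of per-failure time bounds concrete, here is a small sketch. The failure categories mirror the three cases named above; the durations and all names (`FetchFailure`, `CachedFailure`, `FAILURE_TTL_SECONDS`) are assumptions chosen for illustration.

```python
import enum
from dataclasses import dataclass


class FetchFailure(enum.Enum):
    TIMEOUT = "timeout"
    NOT_AN_IMAGE = "not_an_image"
    UNPARSEABLE = "unparseable"


# Hypothetical bounds: timeouts are more likely to be intermittent, so the
# failure is remembered for a shorter time than a wrong content type.
FAILURE_TTL_SECONDS = {
    FetchFailure.TIMEOUT: 60,
    FetchFailure.NOT_AN_IMAGE: 3600,
    FetchFailure.UNPARSEABLE: 3600,
}


@dataclass(frozen=True)
class CachedFailure:
    failure: FetchFailure
    cached_at: float  # seconds since the epoch

    def is_expired(self, now: float) -> bool:
        """Return True once the failure should be forgotten and the fetch retried."""
        return now - self.cached_at >= FAILURE_TTL_SECONDS[self.failure]
```

With these assumed values, a cached timeout would be retried after a minute, while a non-image response would keep serving the "invalid image" content for an hour.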
|  | 39 | + | 
|  | 40 | +The content requests must be made through Smokescreen, to ensure that they | 
|  | 41 | +cannot be redirected (via DNS or HTTP) into private IP space. | 
|  | 42 | + | 
|  | 43 | +[^1]: | 
|  | 44 | +    We render these even if image previews are disabled, presumably? Since | 
|  | 45 | +    that's mostly about not fetching random network resources, not about | 
|  | 46 | +    preventing image uploads from rendering inline? | 
|  | 47 | + | 
|  | 48 | +[^2]: | 
|  | 49 | +    How do we know how much space the spinner should take up? We do not know | 
|  | 50 | +    anything about the height of the returned image yet, and yet need to | 
|  | 51 | +    choose a height that minimizes or avoids vertical movement. | 
|  | 52 | + | 
|  | 53 | +[^3]: In memcached? Or on disk, but then we need to do manual flushing of it? | 
|  | 54 | +[^4]: Possibly _all_ size/format combinations, for maximum consistency? | 
|  | 55 | + | 
|  | 56 | +### Inline URLs | 
|  | 57 | + | 
|  | 58 | +These are messages of the form: | 
|  | 59 | + | 
|  | 60 | +```markdown | 
|  | 61 | +Look at my picture: | 
|  | 62 | + | 
|  | 63 | +https://example.com/image.png | 
|  | 64 | +``` | 
|  | 65 | + | 
|  | 66 | +Or: | 
|  | 67 | + | 
|  | 68 | +```markdown | 
|  | 69 | +[Look at my picture](https://example.com/image.png) | 
|  | 70 | +``` | 
|  | 71 | + | 
|  | 72 | +That is, a link (implied or explicit) with a URL which ends in an image | 
|  | 73 | +extension, assuming that image previews are enabled on the server and realm. | 
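
A check for "ends in an image extension" could be as simple as the sketch below. The extension list and the function name are assumptions; the check looks only at the URL path, so query strings and fragments are ignored.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical list; this document does not enumerate the extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def looks_like_image_url(url: str) -> bool:
    """Return True if the URL path ends in a known image extension."""
    path = urlsplit(url).path
    return PurePosixPath(path).suffix.lower() in IMAGE_EXTENSIONS
```

This is only a hint that the URL is an image: the fetched `Content-Type` and a successful parse still decide whether anything is inlined.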
|  | 74 | + | 
|  | 75 | +The extension provides a light implication that the URL is an image, which we | 
|  | 76 | +should inline. The above plan for inline image URLs holds, with the exception | 
|  | 77 | +that _nothing_ is inlined upon first message send, and in the event of failure | 
|  | 78 | +or non-image content, the message is not updated in any way[^5]. | 
|  | 79 | + | 
|  | 80 | +The effect of this is that non-explicit image URLs are never retried if they | 
|  | 81 | +initially fail, even when the failure was only intermittent. | 
|  | 82 | + | 
|  | 83 | +[^5]: | 
|  | 84 | +    This means that these messages will grow taller after sending, which is a | 
|  | 85 | +    bad thing? We could also render them as spinners, and update to plain text | 
|  | 86 | +    if the request fails, which means users will be less likely to have vertical | 
|  | 87 | +    movement, but will see less information about the image until thumbnailing | 
|  | 88 | +    completes. | 
|  | 89 | + | 
|  | 90 | +### Inline bare URLs | 
|  | 91 | + | 
|  | 92 | +These are messages of the form: | 
|  | 93 | + | 
|  | 94 | +```markdown | 
|  | 95 | +https://example.com/image.png | 
|  | 96 | +``` | 
|  | 97 | + | 
|  | 98 | +That is, a body entirely of a URL which ends in an image extension, assuming | 
|  | 99 | +that image previews are enabled on the server and realm. | 
|  | 100 | + | 
|  | 101 | +These are treated as "Inline URLs", above, with the additional change that the | 
|  | 102 | +entire content of the message is silently updated with the thumbnailed image, | 
|  | 103 | +should it turn out to actually be an image. | 
|  | 104 | + | 
|  | 105 | +### Opengraph images | 
|  | 106 | + | 
|  | 107 | +These are messages of the form: | 
|  | 108 | + | 
|  | 109 | +```markdown | 
|  | 110 | +https://example.com/ | 
|  | 111 | +``` | 
|  | 112 | + | 
|  | 113 | +...where `example.com` has `og:...` tags which we can preview, assuming that | 
|  | 114 | +`INLINE_URL_EMBED_PREVIEW` is enabled and the realm has URL previews enabled. | 
|  | 115 | + | 
|  | 116 | +Any images from this preview will be treated as "Inline image URLs", above. | 
|  | 117 | + | 
|  | 118 | +## Effects on existing URL endpoints | 
|  | 119 | + | 
|  | 120 | +### `/thumbnail?url=...&size=...` | 
|  | 121 | + | 
|  | 122 | +Existing `/thumbnail` URLs are of the form: | 
|  | 123 | + | 
|  | 124 | +    /thumbnail?url=user_uploads%2F2%2F85%2FXoqF0K7XEOLVGylgdpof80RB%2Fimg.png&size=full | 
|  | 125 | +    /thumbnail?url=user_uploads%2F2%2F85%2FXoqF0K7XEOLVGylgdpof80RB%2Fimg.png&size=thumbnail | 
|  | 126 | + | 
|  | 127 | +    /thumbnail?url=https%3A%2F%2Fwww.example.com%2Fimages%2Ffilename.png&size=full | 
|  | 128 | +    /thumbnail?url=https%3A%2F%2Fwww.example.com%2Fimages%2Ffilename.png&size=thumbnail | 
|  | 129 | + | 
|  | 130 | +These were only generated by `THUMBNAIL_IMAGES = True` servers; they may appear | 
|  | 131 | +in historical messages even if it is not currently set. | 
|  | 132 | + | 
|  | 133 | +The former two are currently redirects to the `/user_uploads/` URL, which | 
|  | 134 | +needlessly forces an extra client round-trip and prevents caching. Of those, | 
|  | 135 | +the `size=full` variant will be rendered as if it were a direct request to | 
|  | 136 | +`/user_uploads/2/85/XoqF0K7XEOLVGylgdpof80RB/img.png`, and the `size=thumbnail` | 
|  | 137 | +will pick some reasonable thumbnailed size/format from the server's supported | 
|  | 138 | +set, based on the `Accept` header, and act like a request for that `/user_uploads/` | 
|  | 139 | +variant. | 
|  | 140 | + | 
|  | 141 | +The latter two are unsigned URLs which accept unauthenticated requests, and | 
|  | 142 | +are not rate-limited; they currently serve a redirect to the equivalent signed | 
|  | 143 | +Camo URL. This means that they can trivially be used as a reflector for | 
|  | 144 | +traffic. We will begin 400'ing such unsigned requests, as a security precaution | 
|  | 145 | +(though we can serve those with a nice static image if we would like to keep | 
|  | 146 | +clients' views prettier). | 
|  | 147 | + | 
|  | 148 | +The endpoint will begin supporting _signed_ URL requests. These sign the `url` | 
|  | 149 | +parameter, and allow the caller to adjust the size and format of the response. | 
|  | 150 | +See "Inline image URLs", above. | 
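
The three cases for this endpoint can be summarized as a small dispatch sketch. The function name, the returned labels, and the `sig` parameter are hypothetical, and signature verification itself is omitted here.

```python
from urllib.parse import parse_qs, urlsplit


def classify_thumbnail_request(path: str) -> str:
    """Sort a /thumbnail request into one of the three cases described above."""
    params = parse_qs(urlsplit(path).query)
    url = params.get("url", [""])[0]
    if url.startswith("user_uploads/"):
        # Historical upload URLs: served as the matching /user_uploads/ variant.
        return "user_upload"
    if "sig" in params:
        # Hypothetical signature parameter; it must still be verified.
        return "signed_remote"
    # Unsigned remote URLs are rejected so they cannot be used as a reflector.
    return "reject_400"
```

Applied to the historical URLs shown earlier, the `user_uploads` forms fall into the first case and the unsigned `https://www.example.com` forms into the last.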
|  | 151 | + | 
|  | 152 | +### `/external_content/...` | 
|  | 153 | + | 
|  | 154 | +Since thumbnailing is performed on all remote images, there is no need for Camo | 
|  | 155 | +for images in new messages anymore; all images are served either through | 
|  | 156 | +`/user_uploads/...` or `/thumbnail?...`. | 
|  | 157 | + | 
|  | 158 | +However, until videos are rendered as their server-side-generated thumbnails, | 
|  | 159 | +videos must continue to go through Camo; previous messages also still encode | 
|  | 160 | +`/external_content/` URLs, which should still be served. | 
|  | 161 | + | 
|  | 162 | +So for backwards-compatibility, the Camo server should be preserved for now, and | 
|  | 163 | +continue to serve `/external_content/` URLs. | 