httr parser missing in httr2 for reading image/png #426

Closed
pmlefeuvre-met opened this issue Jan 16, 2024 · 4 comments

@pmlefeuvre-met
pmlefeuvre-met commented Jan 16, 2024

Hi,

I am trying to download an image/png using the following API request: https://wms.geonorge.no/skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500.

However, the body of the response does not look like a matrix or an image. Could this be an encoding or caching issue controlled by an option I could not find? The same request works with httr.

con <- "https://wms.geonorge.no/skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500"

# ----------------
# With httr, it works
wms <- httr::content(httr::GET(con)) * 255
dim(wms) 
# > [1] 500 500   4
wms
# , , 1
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
#  [1,]  166  173  179  194  232  226  201  179  139   119   113   109   108    99    90    93    99   104
#  [2,]  165  172  178  224  228  204  179  156  122   115   115   108   107   100    93    94    94    97

# To plot
wms <- terra::rast(wms)
terra::plotRGB(wms)

# ----------------
# With httr2
resp <- httr2::request(con) |> httr2::req_perform()
wms2 <- resp |> httr2::resp_body_raw()
length(wms2) 
# > [1] 445029   ---- expected 500*500*4 = 1000000
wms2
#    [1] 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 00 00 01 f4 00 00 01 f4 08 06 00 00 00 cb d6 df 8a 00
#  [35] 00 20 00 49 44 41 54 78 9c 6c bd 59 8f 24 59 92 a5 f7 c9 dd 54 d5 cc dc 3d 22 72 cf 5a ba 32 6b c9 ae
#  [69] cc ac 25 7b 7a 29 b2 81 e6 0c 40 02 04 1f 08 82 0f 04 f8 23 08 be 11 04 08 10 05 f0 8d 3f 84 3f 67 f8
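
The first bytes of the body printed above can be read as a PNG file header. The following is a minimal sketch, assuming the standard PNG layout (an 8-byte signature followed by an IHDR chunk holding width, height, bit depth and colour type). The names `hdr` and `be_int` are chosen here for illustration only.

```r
# First 26 bytes of the httr2 body shown above
hdr <- as.raw(c(
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,  # PNG signature
  0x00, 0x00, 0x00, 0x0d,                          # IHDR data length (13)
  0x49, 0x48, 0x44, 0x52,                          # chunk type "IHDR"
  0x00, 0x00, 0x01, 0xf4,                          # width
  0x00, 0x00, 0x01, 0xf4,                          # height
  0x08, 0x06                                       # bit depth, colour type
))

# Every PNG file starts with these eight bytes
png_signature <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))
is_png <- identical(hdr[1:8], png_signature)

# Interpret four raw bytes as a big-endian unsigned integer
be_int <- function(b) sum(as.integer(b) * 256^(3:0))

chunk_type  <- rawToChar(hdr[13:16])  # "IHDR"
width       <- be_int(hdr[17:20])     # 500
height      <- be_int(hdr[21:24])     # 500
bit_depth   <- as.integer(hdr[25])    # 8
colour_type <- as.integer(hdr[26])    # 6 (RGB with alpha)
```

Read this way, the body looks like a compressed 500 x 500 RGBA PNG file rather than a truncated pixel matrix, which would explain why its length is not 500*500*4.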
@hadley
Member

hadley commented Jan 16, 2024

I can't reproduce the call, but I'd suggest debugging the problem by setting verbosity = 1 and then inspecting the headers. You might also want to check the content-type and content-length headers explicitly.

library(httr2)

url <- "https://wms.geonorge.no/skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500"

resp <- request(url) |> req_perform(verbosity = 1)
#> -> GET /skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500 HTTP/1.1
#> -> Host: wms.geonorge.no
#> -> User-Agent: httr2/1.0.0 r-curl/5.2.0 libcurl/8.4.0
#> -> Accept: */*
#> -> Accept-Encoding: deflate, gzip
#> -> 
#> <- HTTP/1.1 200 OK
#> <- Server: nginx/1.20.1
#> <- Date: Tue, 16 Jan 2024 15:41:29 GMT
#> <- Content-Type: application/vnd.ogc.se_xml;charset=ISO-8859-1
#> <- Content-Length: 407
#> <- Connection: keep-alive
#> <- Set-Cookie: JSESSIONID=6544B553D7A9DF966A4AD2E60C85352E; Path=/skwms1; Secure; HttpOnly
#> <- Access-Control-Allow-Origin: *
#> <- Access-Control-Allow-Methods: GET, POST, OPTIONS
#> <- Access-Control-Allow-Headers: DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range
#> <-

resp |> resp_content_type()
#> [1] "application/vnd.ogc.se_xml"
resp |> resp_header("Content-Length")
#> [1] "407"

cat(strwrap(resp_body_string(resp)), sep = "\n")
#> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes" ?>
#> <ServiceExceptionReport version="1.1.0"> <ServiceException> *** HTTP
#> TCP/IP AUT *** Applikasjon '/skwms1/wms.nib' Bruker kan ikke
#> autentiseres. 'Ticket request' TCP/IP adresse: 76.31.204.113
#> tjenesteid: wms.nib ikke godkjent av autorisasjons tjener: [7] TCP/IP
#> adresse ikke funnet.. </ServiceException> </ServiceExceptionReport>

Created on 2024-01-16 with reprex v2.0.2.9000

If you compare the request headers that httr is sending to those that httr2 is sending, you'll probably figure out why you're getting a different response.
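
One way to make that comparison is sketched below. It assumes `req_dry_run()` is usable in your httr2 installation (some versions may need the httpuv package for it); the httr call is left commented out because, unlike the dry run, it actually performs the request.

```r
library(httr2)

url <- "https://wms.geonorge.no/skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500"
req <- request(url)

# Print the request httr2 would send (method, path, headers)
# without contacting the remote server
req_dry_run(req)

# httr prints the outgoing headers when verbose() is supplied.
# This performs the request, so it is left commented out here:
# httr::GET(url, httr::verbose())
```

Placing the two sets of `->` lines side by side should show any difference in User-Agent, Accept or Accept-Encoding.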

@pmlefeuvre-met
Author

pmlefeuvre-met commented Jan 17, 2024

Below is a URL that is not country-restricted, to make the example reproducible; I see the same behaviour with it. Caching in httr2 might explain the truncated data/image, but I am still unsure why the encoding differs.

con <- "https://imagery.pasda.psu.edu/arcgis/services/pasda/UrbanTreeCanopy_Landcover/MapServer/WmsServer?SERVICE=WMS&version=1.1.1&REQUEST=GetMap&LAYERS=10&STYLES=&BBOX=-77.87304,40.78975,-77.85828,40.80228,-77.85828,40.80228&SRS=EPSG:4326&FORMAT=image/png&WIDTH=500&HEIGHT=500"
# ----------------
# With httr, it works
wms <- httr::content(httr::GET(con)) * 255
dim(wms) 
# > [1] 500 500   3
wms
# , , 1
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
#  [1,]  255  255  255   57   57   57   57   57   57    57    57    57    57    57    57    57   255   255
#  [2,]   57  255  255   57   57   57   57   57   57    57    57    57    57    57    57   255   255   255


# To plot
wms <- terra::rast(wms)
terra::plotRGB(wms)

# ----------------
# With httr2
resp <- httr2::request(con) |> httr2::req_perform()
wms2 <- resp |> httr2::resp_body_raw()
length(wms2) 
# > [1] 55745   ---- while expected length is 500*500*3 = 750000
wms2
#   [1] 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 00 00 01 f4 00 00 01 f4 08 02 00 00 00 44 b4 48 dd 00
#  [35] 00 20 00 49 44 41 54 78 9c ec fd bb 52 de da f2 05 8e b6 7e e5 c4 b9 c9 71 08 d1 aa 22 39 26 df a9 1f

Here is the verbose output of httr2's req_perform() for the earlier URL:

library(httr2)
url <- "https://wms.geonorge.no/skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500"

resp <- request(url) |> req_perform(verbosity = 1)
#> -> GET /skwms1/wms.nib?SERVICE=WMS&VERSION=1.3.0&request=GetMap&FORMAT=image/png&CRS=EPSG:25833&LAYERS=ortofoto&bbox=260867,6652618,261067,6652818&WIDTH=500&HEIGHT=500 HTTP/1.1
#> -> Host: wms.geonorge.no
#> -> User-Agent: httr2/1.0.0 r-curl/5.2.0 libcurl/7.81.0
#> -> Accept: */*
#> -> Accept-Encoding: deflate, gzip, br, zstd
#> -> 
#> <- HTTP/1.1 200 OK
#> <- Server: nginx/1.20.1
#> <- Date: Wed, 17 Jan 2024 07:57:32 GMT
#> <- Content-Type: image/png
#> <- Transfer-Encoding: chunked
#> <- Connection: keep-alive
#> <- Set-Cookie: JSESSIONID=4912088CC422200B03182A384E12742C; Path=/skwms1; Secure; HttpOnly
#> <- X-Content-Type-Options: nosniff
#> <- X-XSS-Protection: 1; mode=block
#> <- Set-Cookie: AGS_ROLES=uYCmzm1opa8A42METOmcixdA3FYW3+qI; Expires=Wed, 17-Jan-2024 07:58:31 GMT; HttpOnly
#> <- Strict-Transport-Security: max-age=15724800; includeSubDomains
#> <- Access-Control-Allow-Origin: *
#> <- Access-Control-Allow-Methods: GET, POST, OPTIONS
#> <- Access-Control-Allow-Headers: DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range
#> <-

resp |> resp_content_type()
#> [1] "image/png"

resp |> resp_header("Content-Length")
#> NULL

cat(strwrap(resp_body_string(resp)), sep = "\n")
#> NA

Created on 2024-01-17 with reprex v2.1.0

I will continue investigating the cache and encoding options in httr2.
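
A note on the `NULL` Content-Length above: the response uses `Transfer-Encoding: chunked`, and chunked responses carry no Content-Length header, so the size has to be taken from the body itself. The sketch below assumes `httr2::response()` is available to build a synthetic response offline; `fake_resp` is a stand-in, not the real server reply.

```r
library(httr2)

# Synthetic stand-in for the chunked response above (no network access needed).
# The body holds only the 8-byte PNG signature, for illustration.
fake_resp <- response(
  headers = list(
    "Content-Type" = "image/png",
    "Transfer-Encoding" = "chunked"
  ),
  body = as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))
)

# The header is absent, so resp_header() returns its default (NULL)
content_length <- resp_header(fake_resp, "Content-Length")

# The number of bytes actually received is the length of the raw body
n_bytes <- length(resp_body_raw(fake_resp))
```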

@pmlefeuvre-met
Author

pmlefeuvre-met commented Jan 17, 2024

The issue is related to PNG/JPEG decompression.
httr's content() relies on parsers, defined in content-parser.R, that decode the image depending on its content type (l.85-95):

# Text formats -----------------------------------------------------------------
parsers$`image/jpeg` <- function(x, type = NULL, encoding = NULL, ...) {
  need_package("jpeg")
  jpeg::readJPEG(x)
}

parsers$`image/png` <- function(x, type = NULL, encoding = NULL, ...) {
  need_package("png")
  png::readPNG(x)
}

With httr2, one needs to add png::readPNG() or jpeg::readJPEG() to decompress the image, such as:

# Decode the PNG bytes into a numeric array and rescale from [0, 1] to [0, 255]
wms2 <- resp |> httr2::resp_body_raw()
wms2 <- png::readPNG(wms2) * 255
wms2 <- terra::rast(wms2)
terra::plotRGB(wms2)
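
The size mismatch seen earlier follows from the same point: the raw body is the compressed file, and only the decoded array has height * width * channels values. Here is a small offline sketch, using a synthetic constant image in place of a download (the names `img`, `body` and `decoded` are illustrative):

```r
# Synthetic 500 x 500 RGB image with values in [0, 1]
img <- array(0.5, dim = c(500, 500, 3))

# Encode to PNG bytes in memory; this plays the role of resp_body_raw(resp)
body <- png::writePNG(img, target = raw())

# Decode the bytes back into a numeric array, as httr's parser did
decoded <- png::readPNG(body)

length(body)     # size of the compressed file in bytes
dim(decoded)     # 500 500 3
length(decoded)  # 750000
```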

pmlefeuvre-met changed the title from "Encoding/cache issue downloading image/png?" to "httr parser missing in httr2 for reading image/png" on Jan 17, 2024
@hadley
Member

hadley commented Jan 17, 2024

Oh yeah, good catch. httr2 doesn't automatically parse content because there are often multiple packages that you might want to use to parse a given file format, and in general, it doesn't save much code compared to doing it yourself.
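
For anyone who wants httr-like behaviour, a small wrapper that dispatches on the content type might look like the sketch below. `resp_body_image()` is a hypothetical helper, not part of httr2, and the choice of the png and jpeg packages is just one option among several. The check at the end uses a synthetic response built with `httr2::response()` so that it runs offline.

```r
library(httr2)

# Hypothetical helper (not part of httr2): decode an image response
# into a numeric array, choosing the decoder from the Content-Type header
resp_body_image <- function(resp) {
  type <- resp_content_type(resp)
  body <- resp_body_raw(resp)
  switch(type,
    "image/png"  = png::readPNG(body),
    "image/jpeg" = jpeg::readJPEG(body),
    stop("Unsupported content type: ", type, call. = FALSE)
  )
}

# Offline check with a tiny synthetic PNG (2 rows, 3 columns, 3 channels)
bytes <- png::writePNG(array(0.5, dim = c(2, 3, 3)), target = raw())
fake_resp <- response(
  headers = list("Content-Type" = "image/png"),
  body = bytes
)
img <- resp_body_image(fake_resp)
dim(img)
```

With a real response, `resp_body_image(resp) * 255` would then give the same kind of array as the httr call at the top of this thread.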

hadley closed this as completed on Jan 17, 2024