Consume BOM in the text() method of fetch bodies (#36192)

In the Fetch spec, the `text()` method of `Body` (an interface mixin
implemented by both `Request` and `Response`) consumes the body with
the Encoding spec's "UTF-8 decode" algorithm, which skips the UTF-8 BOM
if it is present at the beginning of the body. Servo's implementation
did not skip it, so the returned string started with U+FEFF. This patch
strips the BOM before decoding.

Signed-off-by: Andreu Botella <abotella@igalia.com>
Andreu Botella 2025-03-28 20:02:48 +01:00 committed by GitHub
parent 94bcab177e
commit 95c3033456
GPG key ID: B5690EEEBB952194
3 changed files with 8 additions and 61 deletions


@@ -737,8 +737,15 @@ fn run_package_data_algorithm(
 /// <https://fetch.spec.whatwg.org/#ref-for-concept-body-consume-body%E2%91%A4>
 fn run_text_data_algorithm(bytes: Vec<u8>) -> Fallible<FetchedData> {
+    // This implements the Encoding standard's "decode UTF-8", which removes the
+    // BOM if present.
+    let no_bom_bytes = if bytes.starts_with(b"\xEF\xBB\xBF") {
+        &bytes[3..]
+    } else {
+        &bytes
+    };
     Ok(FetchedData::Text(
-        String::from_utf8_lossy(&bytes).into_owned(),
+        String::from_utf8_lossy(no_bom_bytes).into_owned(),
     ))
 }
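
The behavior this hunk introduces can be tried outside Servo with a small standalone sketch. The function name `utf8_decode_lossy` is hypothetical and is not part of Servo; it only assumes the standard library and mirrors the two steps in the patch: drop one leading UTF-8 BOM, then decode lossily.

```rust
/// Hypothetical standalone helper (not Servo code): a sketch of the
/// Encoding standard's "UTF-8 decode" using only the standard library.
fn utf8_decode_lossy(bytes: &[u8]) -> String {
    // The UTF-8 BOM is the byte sequence EF BB BF. Only a single BOM at
    // the very start of the input is skipped.
    let no_bom_bytes = if bytes.starts_with(b"\xEF\xBB\xBF") {
        &bytes[3..]
    } else {
        bytes
    };
    // Malformed sequences are replaced with U+FFFD instead of failing.
    String::from_utf8_lossy(no_bom_bytes).into_owned()
}
```

With this sketch, `utf8_decode_lossy(b"\xEF\xBB\xBFhello")` yields `"hello"`, while a second BOM directly after the first is kept and decodes to U+FEFF.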


@@ -1,30 +1,6 @@
 [text-utf8.any.html]
-  [UTF-8 with BOM with Request.text()]
-    expected: FAIL
-  [UTF-8 with BOM with fetched data (UTF-16 charset)]
-    expected: FAIL
-  [UTF-8 with BOM with fetched data (UTF-8 charset)]
-    expected: FAIL
-  [UTF-8 with BOM with Response.text()]
-    expected: FAIL
 [text-utf8.any.worker.html]
-  [UTF-8 with BOM with Request.text()]
-    expected: FAIL
-  [UTF-8 with BOM with fetched data (UTF-16 charset)]
-    expected: FAIL
-  [UTF-8 with BOM with fetched data (UTF-8 charset)]
-    expected: FAIL
-  [UTF-8 with BOM with Response.text()]
-    expected: FAIL
 [text-utf8.any.serviceworker.html]
   expected: ERROR


@@ -8,36 +8,18 @@
[Fetching a resource from the same origin, but spelled with a trailing dot.]
expected: FAIL
[Fetching a resource from the same origin, but spelled with a trailing dot.: sec-fetch-dest]
expected: FAIL
[Fetching a resource from the same origin, but spelled with a trailing dot.: sec-fetch-mode]
expected: FAIL
[Fetching a resource from the same origin, but spelled with a trailing dot.: sec-fetch-site]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.: sec-fetch-dest]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.: sec-fetch-mode]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.: sec-fetch-site]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.: sec-fetch-dest]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.: sec-fetch-mode]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.: sec-fetch-site]
expected: FAIL
@@ -46,35 +28,17 @@
[Fetching a resource from the same origin, but spelled with a trailing dot.]
expected: FAIL
[Fetching a resource from the same origin, but spelled with a trailing dot.: sec-fetch-dest]
expected: FAIL
[Fetching a resource from the same origin, but spelled with a trailing dot.: sec-fetch-mode]
expected: FAIL
[Fetching a resource from the same origin, but spelled with a trailing dot.: sec-fetch-site]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.: sec-fetch-dest]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.: sec-fetch-mode]
expected: FAIL
[Fetching a resource from the same site, but spelled with a trailing dot.: sec-fetch-site]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.: sec-fetch-dest]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.: sec-fetch-mode]
expected: FAIL
[Fetching a resource from a cross-site host, spelled with a trailing dot.: sec-fetch-site]
expected: FAIL