Resolving relative URIs in Java 8 while working around the JDK bug

I'm in a scenario where I'm stuck on Java 8. I have a program that creates and resolves URIs using java.net.URI. These will generally have an http scheme, but we may receive others that we need to handle.

Code:

import java.net.URI;

class Scratch {
    public static void main(String[] args) throws Exception {
        URI base = new URI("https://example.com/");
        URI other = new URI("path/to/resource?query=hello");
        // Resolve the relative reference against the base
        URI result = base.resolve(other);
        System.out.println(result);
    }
}
This resolves to the correct URI, as expected:

https://example.com/path/to/resource?query=hello

However, because of JDK-8272702, resolution behaves rather unexpectedly when relative URIs are resolved in various other ways.
Examples
Here are some examples of different relative resolutions and their expected results:

Code:

import java.net.URI;

class Scratch {
    // Columns: base, relative part, expected result, actual result (only if it differs)
    private static final String F = "%-20s %-15s %-35s %-35s%n";

    public static void main(String[] args) throws Exception {
        System.out.printf(F, "Base", "Resolve part", "Expected", "Actual (if different)");
        test("https://a.com", ".", "https://a.com/");
        test("https://a.com", "./", "https://a.com/");
        test("https://a.com", "./path", "https://a.com/path");
        test("https://a.com", "path", "https://a.com/path");
        test("https://a.com", "path/", "https://a.com/path/");
        test("https://a.com", "./path/", "https://a.com/path/");
        test("https://a.com", "../", "https://a.com/../");
        test("https://a.com", "../path", "https://a.com/../path");
        test("https://a.com", "../path/", "https://a.com/../path/");

        System.out.println("\nTrailing slash");
        test("https://a.com/", ".", "https://a.com/");
        test("https://a.com/", "./", "https://a.com/");
        test("https://a.com/", "./path", "https://a.com/path");
        test("https://a.com/", "path", "https://a.com/path");
        test("https://a.com/", "path/", "https://a.com/path/");
        test("https://a.com/", "./path/", "https://a.com/path/");
        test("https://a.com/", "../", "https://a.com/../");
        test("https://a.com/", "../path", "https://a.com/../path");
        test("https://a.com/", "../path/", "https://a.com/../path/");
    }

    // Resolves `resolve` against `base` and prints the actual result only when it differs from `expected`
    private static void test(String base, String resolve, String expected) throws Exception {
        URI baseUri = new URI(base);
        URI resolveUri = new URI(resolve);
        URI actual = baseUri.resolve(resolveUri);
        URI expectedUri = new URI(expected);
        String difference = actual.equals(expectedUri) ? "" : actual.toString();
        System.out.printf(F, baseUri, resolveUri, expectedUri, difference);
    }
}
On Java 21, where the linked bug is "fixed" and we assume the output of every resolution is correct, we get the following output:

Code:

Base                 Resolve part    Expected                            Actual (if different)
https://a.com        .               https://a.com/
https://a.com        ./              https://a.com/
https://a.com        ./path          https://a.com/path
https://a.com        path            https://a.com/path
https://a.com        path/           https://a.com/path/
https://a.com        ./path/         https://a.com/path/
https://a.com        ../             https://a.com/../
https://a.com        ../path         https://a.com/../path
https://a.com        ../path/        https://a.com/../path/

Trailing slash
https://a.com/       .               https://a.com/
https://a.com/       ./              https://a.com/
https://a.com/       ./path          https://a.com/path
https://a.com/       path            https://a.com/path
https://a.com/       path/           https://a.com/path/
https://a.com/       ./path/         https://a.com/path/
https://a.com/       ../             https://a.com/../
https://a.com/       ../path         https://a.com/../path
https://a.com/       ../path/        https://a.com/../path/
Running on Java 8
When run on Java 8 (1.8.0_322):

Code:

Base                 Resolve part    Expected                            Actual (if different)
https://a.com        .               https://a.com/                      https://a.com
https://a.com        ./              https://a.com/                      https://a.com
https://a.com        ./path          https://a.com/path                  https://a.compath
https://a.com        path            https://a.com/path                  https://a.compath
https://a.com        path/           https://a.com/path/                 https://a.compath/
https://a.com        ./path/         https://a.com/path/                 https://a.compath/
https://a.com        ../             https://a.com/../                   https://a.com../
https://a.com        ../path         https://a.com/../path               https://a.com../path
https://a.com        ../path/        https://a.com/../path/              https://a.com../path/

Trailing slash
https://a.com/       .               https://a.com/
https://a.com/       ./              https://a.com/
https://a.com/       ./path          https://a.com/path
https://a.com/       path            https://a.com/path
https://a.com/       path/           https://a.com/path/
https://a.com/       ./path/         https://a.com/path/
https://a.com/       ../             https://a.com/../
https://a.com/       ../path         https://a.com/../path
https://a.com/       ../path/        https://a.com/../path/
You can see that the actual result differs from the expected one for every resolution against the base without a trailing slash, while the base with a trailing slash behaves as expected. This is also not a complete set of possible inputs, just a few to illustrate the problem.
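The missing slash is consistent with how the path "merge" step differs between RFC 2396 and RFC 3986. The following is a simplified model of that one step, not the JDK's actual code; the class `MergeModel` and its method names are made up for illustration. RFC 2396 copies the base path up to its last slash and appends the reference path, so an empty base path contributes nothing. RFC 3986 adds a special case: if the base has an authority and an empty path, the result starts with "/".

```java
final class MergeModel {
    private MergeModel() {}

    // RFC 2396, section 5.2, steps 6a-6b: keep the base path up to and
    // including its last '/', then append the reference path.
    static String mergeRfc2396(String basePath, String refPath) {
        int lastSlash = basePath.lastIndexOf('/');
        String prefix = lastSlash >= 0 ? basePath.substring(0, lastSlash + 1) : "";
        return prefix + refPath;
    }

    // RFC 3986, section 5.2.3: same as above, except that a base with an
    // authority and an empty path yields "/" + reference path.
    static String mergeRfc3986(boolean baseHasAuthority, String basePath, String refPath) {
        if (baseHasAuthority && basePath.isEmpty()) {
            return "/" + refPath;
        }
        return mergeRfc2396(basePath, refPath);
    }

    public static void main(String[] args) {
        // Base "https://a.com" has authority "a.com" and an empty path.
        System.out.println("https://a.com" + mergeRfc2396("", "path"));       // https://a.compath
        System.out.println("https://a.com" + mergeRfc3986(true, "", "path")); // https://a.com/path
    }
}
```

With a base path of "/" both variants agree, which matches the "Trailing slash" rows above.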
What does Python do?
As an aside, running the same thing in Python (3.11 here) using urllib.parse produces different output:

Code:

from urllib.parse import urljoin

FORMAT = "{:20s} {:15s} {:35s} {:35s}"


def main():
    print(FORMAT.format("Base", "Resolve part", "Expected", "Actual (if different)"))
    test("https://a.com", ".", "https://a.com/")
    test("https://a.com", "./", "https://a.com/")
    test("https://a.com", "./path", "https://a.com/path")
    test("https://a.com", "path", "https://a.com/path")
    test("https://a.com", "path/", "https://a.com/path/")
    test("https://a.com", "./path/", "https://a.com/path/")
    test("https://a.com", "../", "https://a.com/../")
    test("https://a.com", "../path", "https://a.com/../path")
    test("https://a.com", "../path/", "https://a.com/../path/")

    print("\nTrailing slash")
    test("https://a.com/", ".", "https://a.com/")
    test("https://a.com/", "./", "https://a.com/")
    test("https://a.com/", "./path", "https://a.com/path")
    test("https://a.com/", "path", "https://a.com/path")
    test("https://a.com/", "path/", "https://a.com/path/")
    test("https://a.com/", "./path/", "https://a.com/path/")
    test("https://a.com/", "../", "https://a.com/../")
    test("https://a.com/", "../path", "https://a.com/../path")
    test("https://a.com/", "../path/", "https://a.com/../path/")


def test(base, resolve, expected):
    base_uri = urljoin(base, "")  # Joining with an empty reference returns the base unchanged
    resolve_uri = urljoin("", resolve)  # Joining onto an empty base returns the reference unchanged
    actual = urljoin(base_uri, resolve_uri)
    # Show the actual result only when it differs from the expectation
    difference = "" if actual == expected else actual
    print(FORMAT.format(base_uri, resolve_uri, expected, difference))


main()
Output:

Code:

Base                 Resolve part    Expected                            Actual (if different)
https://a.com        .               https://a.com/
https://a.com        ./              https://a.com/
https://a.com        ./path          https://a.com/path
https://a.com        path            https://a.com/path
https://a.com        path/           https://a.com/path/
https://a.com        ./path/         https://a.com/path/
https://a.com        ../             https://a.com/../                   https://a.com/
https://a.com        ../path         https://a.com/../path               https://a.com/path
https://a.com        ../path/        https://a.com/../path/              https://a.com/path/

Trailing slash
https://a.com/       .               https://a.com/
https://a.com/       ./              https://a.com/
https://a.com/       ./path          https://a.com/path
https://a.com/       path            https://a.com/path
https://a.com/       path/           https://a.com/path/
https://a.com/       ./path/         https://a.com/path/
https://a.com/       ../             https://a.com/../                   https://a.com/
https://a.com/       ../path         https://a.com/../path               https://a.com/path
https://a.com/       ../path/        https://a.com/../path/              https://a.com/path/
This produces slightly different output. That could be down to a difference in the RFCs that Java and Python follow, or to a minor difference in how normalization works, since the only differences are in the double-dot ("..") segments.
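The RFC explanation looks plausible: the java.net.URI documentation describes the class in terms of RFC 2396, which allows an implementation to retain leading ".." segments it cannot resolve, while the Python documentation states that urljoin follows RFC 3986 semantics, whose remove_dot_segments routine (section 5.2.4) silently drops them. Below is a rough Java sketch of that routine to show where Python's results come from; the class and method names are hypothetical, and this is my reading of the RFC rather than code taken from either library.

```java
final class DotSegments {
    private DotSegments() {}

    // Sketch of RFC 3986, section 5.2.4 (remove_dot_segments).
    static String removeDotSegments(String path) {
        String in = path;
        StringBuilder out = new StringBuilder();
        while (!in.isEmpty()) {
            if (in.startsWith("../")) {
                in = in.substring(3);            // A: drop leading "../"
            } else if (in.startsWith("./")) {
                in = in.substring(2);            // A: drop leading "./"
            } else if (in.startsWith("/./")) {
                in = in.substring(2);            // B: "/./" becomes "/"
            } else if (in.equals("/.")) {
                in = "/";                        // B: trailing "/." becomes "/"
            } else if (in.startsWith("/../")) {
                in = in.substring(3);            // C: "/../" becomes "/" ...
                dropLastSegment(out);            //    ... and the previous segment goes away
            } else if (in.equals("/..")) {
                in = "/";                        // C: trailing "/.." becomes "/"
                dropLastSegment(out);
            } else if (in.equals(".") || in.equals("..")) {
                in = "";                         // D: a lone dot segment is removed
            } else {
                // E: move the first segment (with its leading '/', if any) to the output
                int next = in.indexOf('/', 1);
                if (next < 0) {
                    next = in.length();
                }
                out.append(in, 0, next);
                in = in.substring(next);
            }
        }
        return out.toString();
    }

    // Removes the last segment and its preceding '/', if there is one.
    private static void dropLastSegment(StringBuilder out) {
        int lastSlash = out.lastIndexOf("/");
        out.setLength(lastSlash < 0 ? 0 : lastSlash);
    }

    public static void main(String[] args) {
        System.out.println("https://a.com" + removeDotSegments("/../path")); // https://a.com/path
        System.out.println("https://a.com" + removeDotSegments("/../"));     // https://a.com/
    }
}
```

When the output buffer is already empty, step C has nothing to remove, so a ".." that would climb above the root simply disappears. That matches the three differing rows in each half of the Python output.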
Question
How can I create a safe Java 8 resolution method that works for all java.net.URIs despite the bug above?
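One direction that might work, sketched under assumptions rather than verified against every input: since the tables above only go wrong when the base has an authority but an empty path, give such a base the path "/" before calling resolve. The class `SafeResolve` and the helper `withRootPath` are hypothetical names, and this has only been reasoned through for the cases listed in this post.

```java
import java.net.URI;

final class SafeResolve {
    private SafeResolve() {}

    // Resolves ref against base, first turning an authority-only base
    // such as "https://a.com" into "https://a.com/".
    static URI resolve(URI base, URI ref) {
        return withRootPath(base).resolve(ref);
    }

    // Returns base unchanged unless it is hierarchical, has an authority,
    // and has an empty path; in that case the path is set to "/".
    static URI withRootPath(URI base) {
        String path = base.getRawPath();
        boolean emptyPath = path == null || path.isEmpty();
        if (base.isOpaque() || base.getRawAuthority() == null || !emptyPath) {
            return base;
        }
        // Rebuild from the raw components so existing escapes are not re-encoded.
        StringBuilder sb = new StringBuilder();
        if (base.getScheme() != null) {
            sb.append(base.getScheme()).append(':');
        }
        sb.append("//").append(base.getRawAuthority()).append('/');
        if (base.getRawQuery() != null) {
            sb.append('?').append(base.getRawQuery());
        }
        if (base.getRawFragment() != null) {
            sb.append('#').append(base.getRawFragment());
        }
        return URI.create(sb.toString());
    }

    public static void main(String[] args) {
        System.out.println(resolve(URI.create("https://a.com"), URI.create("path")));    // https://a.com/path
        System.out.println(resolve(URI.create("https://a.com"), URI.create("./path/"))); // https://a.com/path/
    }
}
```

This only covers the empty-base-path case. It does not change how java.net.URI treats leading ".." segments, so "../path" would still resolve to "https://a.com/../path" as in the Java tables above rather than to Python's "https://a.com/path". Opaque bases such as mailto: URIs are passed through untouched.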
