Multiframe-ZSTD-Datei: Warum stimmt content_size nicht überein? - Programmiererforum

Multiframe-ZSTD-Datei: Warum stimmt content_size nicht überein? ⇐ Python

Post Reply Previous topic Next topic

1 post • Page 1 of 1

Anonymous

Multiframe-ZSTD-Datei: Warum stimmt content_size nicht überein?

Report
Quote

Post by Anonymous » 13 Jan 2026, 17:41

Ich möchte mehrere ndjson-Dateien mit der Python-Bindung python-zstandard in eine ZSTD-Datei komprimieren. Für die spätere Zufallssuche möchte ich, dass jede Datei unabhängig als Frame komprimiert wird. Zu diesem Zweck verwende ich

Code: Select all

import zstandard as zstd
from pathlib import Path

file_to_compress = [r"E:\Personal Projects\tmp\chunk_0.ndjson",
r"E:\Personal Projects\tmp\chunk_0.ndjson"]
file_to_compress = [Path(p) for p in file_to_compress]

output_file = r"E:\Personal Projects\tmp\dataset.zst"
output_file = Path(output_file)

cctx = zstd.ZstdCompressor(write_content_size=True, threads=5)

with open(output_file, "wb") as f_out:
for src in file_to_compress:
with open(src, "rb") as fin:
cctx.copy_stream(fin, f_out)

frames = []

with open("E:\\Personal Projects\\tmp\\dataset.zst", "rb") as f:
offset = 0

while True:
f.seek(offset)
header = f.read(512)   # enough for any zstd frame header
if not header:
break

params = zstd.get_frame_parameters(header)

frames.append({
"offset": offset,
"content_size": params.content_size,
"window_size": params.window_size,
"dict_id": params.dict_id,
})

# Advance to next frame:
# compressed_size is not known yet, so we must skip by decompressing OR
# rely on external index if available.
#
# For now, break to show that content_size is present.
break

print(f'The file size of "{str(file_to_compress[0])}" is', file_to_compress[0].stat().st_size)

print("Information of the first frame is", frames[0])

Die Ausgabe ist

Code: Select all

The file size of "E:\Personal Projects\tmp\chunk_0.ndjson" is 2147473321
Information of the first frame is {'offset': 0, 'content_size': 18446744073709551615, 'window_size': 2097152, 'dict_id': 0}

Können Sie erklären, warum content_size nicht mit dem von file_to_compress[0].stat().st_size übereinstimmt?
Vielen Dank für Ihre Ausarbeitung.

1768322514

Anonymous

[url=viewtopic.php?t=30561]Ich möchte[/url] mehrere ndjson-Dateien mit der Python-Bindung python-zstandard in eine ZSTD-Datei komprimieren. Für die spätere Zufallssuche möchte ich, dass jede Datei unabhängig als Frame komprimiert wird. Zu diesem Zweck verwende ich
[code]import zstandard as zstd
from pathlib import Path

file_to_compress = [r"E:\Personal Projects\tmp\chunk_0.ndjson",
r"E:\Personal Projects\tmp\chunk_0.ndjson"]
file_to_compress = [Path(p) for p in file_to_compress]

output_file = r"E:\Personal Projects\tmp\dataset.zst"
output_file = Path(output_file)

cctx = zstd.ZstdCompressor(write_content_size=True, threads=5)

with open(output_file, "wb") as f_out:
for src in file_to_compress:
with open(src, "rb") as fin:
cctx.copy_stream(fin, f_out)

frames = []

with open("E:\\Personal Projects\\tmp\\dataset.zst", "rb") as f:
offset = 0

while True:
f.seek(offset)
header = f.read(512)   # enough for any zstd frame header
if not header:
break

params = zstd.get_frame_parameters(header)

frames.append({
"offset": offset,
"content_size": params.content_size,
"window_size": params.window_size,
"dict_id": params.dict_id,
})

# Advance to next frame:
# compressed_size is not known yet, so we must skip by decompressing OR
# rely on external index if available.
#
# For now, break to show that content_size is present.
break

print(f'The file size of "{str(file_to_compress[0])}" is', file_to_compress[0].stat().st_size)

print("Information of the first frame is", frames[0])
[/code]
Die Ausgabe ist
[code]The file size of "E:\Personal Projects\tmp\chunk_0.ndjson" is 2147473321
Information of the first frame is {'offset': 0, 'content_size': 18446744073709551615, 'window_size': 2097152, 'dict_id': 0}
[/code]
Können Sie erklären, warum content_size nicht mit dem von file_to_compress[0].stat().st_size übereinstimmt?
Vielen Dank für Ihre Ausarbeitung.

Post Reply Previous topic Next topic

1 post • Page 1 of 1

Quick Reply

Subject:

Username:

Change Text Case:

Smilies

View more smilies

Similar Topics

Replies

Views

Last post

Multiframe-ZSTD-Datei: Wie erhalte ich Metadaten für jeden Frame?

Last post by Anonymous « 13 Jan 2026, 00:47
Posted in Python

by Anonymous » 13 Jan 2026, 00:47 » in Python

Ich möchte mehrere ndjson-Dateien mit der Python-Bindung python-zstandard in eine ZSTD-Datei komprimieren. Für die spätere Zufallssuche möchte ich, dass jede Datei unabhängig als Frame komprimiert...

0 Replies

1 Views

Last post by Anonymous
13 Jan 2026, 00:47
Verwenden von Breiten- und Höhenattributen mit `content-intrinsic-size" für IMG-Tag

Last post by Anonymous « 09 Sep 2025, 13:09
Posted in HTML

by Anonymous » 09 Sep 2025, 13:09 » in HTML

Ich habe einen Artikel über Frontend -Leistung mit einem aktuellen Beispiel für Bilder gefunden:

/images/hero.avif
alt= Hero
width= 1200 height= 800
fetchpriority= high
decoding= async...

0 Replies

39 Views

Last post by Anonymous
09 Sep 2025, 13:09
Warum stimmt das Objekt, das ich gerade in CUDA erstellt habe, nicht mit dem überein, wie es aussieht, kurz bevor es vom

Last post by Anonymous « 06 Jan 2025, 07:17
Posted in C++

by Anonymous » 06 Jan 2025, 07:17 » in C++

Ich schreibe CUDA alles auf dem Gerät. Ich habe eine Klasse zum Emulieren von Strings, da die String-Klasse nicht auf einer GPU verwendet werden kann. Meine Klasse hat wchar_t * string in data_ und...

0 Replies

66 Views

Last post by Anonymous
06 Jan 2025, 07:17
Dekomprimieren Sie die ZSTD-Datei mit TypeScript

Last post by Anonymous « 04 Nov 2025, 08:15
Posted in JavaScript

by Anonymous » 04 Nov 2025, 08:15 » in JavaScript

Ich versuche, eine .zst-Datei mithilfe von Typescript programmgesteuert zu dekomprimieren. Dies ist der Code, den ich verwende:
import { createReadStream, createWriteStream } from 'node:fs';
import...

0 Replies

16 Views

Last post by Anonymous
04 Nov 2025, 08:15
Was ist der Unterschied zwischen CSS fit-content und max-content?

Last post by Anonymous « 10 Jan 2026, 22:23
Posted in CSS

by Anonymous » 10 Jan 2026, 22:23 » in CSS

Ich folge diesem Artikel um zu verstehen, wie diese Regeln funktionieren.
Ich habe dieses Beispiel:

* {
margin:0;
padding:0;
}
.box {
background: lightgreen;
margin: 0 auto;
width:...

0 Replies

1 Views

Last post by Anonymous
10 Jan 2026, 22:23

Return to “Python”