Why does my Llama 3.1 model act differently between AutoModelForCausalLM and LlamaForCausalLM? - Programmiererforum

Post by Anonymous » 20 Mar 2025, 14:43

I have one set of weights, one tokenizer, the same prompt, and the same generation parameters. Yet when I load the model with AutoModelForCausalLM I get one output, and when I build it manually with LlamaForCausalLM plus the same config and the same state_dict, I get a completely different output.

Code:

import torch
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    LlamaForCausalLM,
    LlamaConfig,
)

# 1) Adjust these as needed
model_name = "meta-llama/Llama-3.1-8B"
prompt = "Hello from Llama 3.1! Tell me something interesting."
dtype = torch.float16  # or torch.float32 if needed

# 2) Get the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)

# Prepare input
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

############################################
# A) Load with AutoModelForCausalLM
############################################

print("=== Loading with AutoModelForCausalLM ===")

model_auto = AutoModelForCausalLM.from_pretrained(
    model_name,
    attn_implementation="eager",  # matches your usage
    torch_dtype=dtype,
).cuda()
model_auto.eval()  # turn off dropout
config = model_auto.config  # kept for reuse in section B
with torch.no_grad():
    out_auto = model_auto(**inputs)
logits_auto = out_auto.logits  # shape: [batch_size, seq_len, vocab_size]

del model_auto
torch.cuda.empty_cache()

############################################
# B) Load with LlamaForCausalLM + config
############################################

print("=== Loading with LlamaForCausalLM + config ===")

# Reuse the config captured from model_auto above
# and build the Llama model directly from it
model_llama = LlamaForCausalLM(config).cuda()
model_llama.eval()

# Load the same weights that AutoModelForCausalLM used
model_auto_temp = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype)
model_llama.load_state_dict(model_auto_temp.state_dict())
del model_auto_temp
torch.cuda.empty_cache()

with torch.no_grad():
    out_llama = model_llama(**inputs)
logits_llama = out_llama.logits

############################################
# C) Compare the Logits
############################################

# Compute maximum absolute difference
max_diff = (logits_auto - logits_llama).abs().max()
print(f"\nMax absolute difference between logits: {max_diff.item()}")

if max_diff < 1e-7:
    print("→ The logits are effectively identical (within floating-point precision).")
else:
    print("→ There is a non-trivial difference in logits!")
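
One thing that may be worth ruling out first is the comparison threshold itself. The script runs with dtype = torch.float16 but tests max_diff < 1e-7, and float16 keeps only 10 fraction bits, so values near 1.0 are spaced about 9.8e-4 apart. The sketch below is standard-library only and uses made-up logit values (they are assumptions, not output from the model); the helper names to_float16 and close_enough are hypothetical. It only illustrates how large a pure rounding difference can be, and it does not by itself explain a completely different output.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Spacing between 1.0 and the next representable float16 value.
FP16_EPS = 2.0 ** -10


def close_enough(a, b, rtol=FP16_EPS, atol=FP16_EPS):
    """Elementwise |a - b| <= atol + rtol * |b|, the same form torch.allclose uses."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


# Hypothetical logit values, chosen only for illustration.
logits_fp32 = [12.3456789, -3.1415926, 0.0012345, 7.7777777]
logits_fp16 = [to_float16(v) for v in logits_fp32]

max_diff = max(abs(a - b) for a, b in zip(logits_fp32, logits_fp16))
print(f"max abs diff after float16 rounding: {max_diff:.6f}")
print("passes the 1e-7 check:", max_diff < 1e-7)
print("passes a float16-scaled check:", close_enough(logits_fp16, logits_fp32))
```

In the script above, the analogous step would be to compare logits_auto and logits_llama with a tolerance scaled to the dtype, and to print next(model.parameters()).dtype for both models before the forward pass to confirm they really run in the same precision. Whether the two models end up in different precisions here is an assumption to verify, not something the post confirms.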
