I have been trying to fine-tune the Llama-2-7B model on the FreedomIntelligence/medical-o1-reasoning-SFT dataset in an instruction-response format. For now I am intentionally overfitting on the dataset just to observe the training flow. However, when I later load the LoRA-adapted model for testing, the output differs from what I saw before. Specifically, the outputs of the test_model function (right after training) and the test_merged_model function (after loading the model later) do not match.

First output (test_model, directly after training):
> [INST] You are a helpful assistant. user A
> patient with psoriasis was treated with systemic steroids, and upon
> discontinuation of the treatment, developed generalized pustules all
> over the body. What is the most likely cause of this condition?
> assistant [/INST] think I'm thinking about this patient with
> psoriasis who's been on systemic steroids. Okay, so psoriasis is this
> chronic skin condition that can be pretty stubborn and flares up now
> and then. They stopped their steroid treatment, and then suddenly
> there are these generalized pustules appearing all over the place.
> What's going on here?
>
> Alright, pustules are like little bumps that are filled with pus—they
> can be a sign of different skin issues, but in someone with psoriasis,
> a major change like stopping treatment can really mess with things.
> Steroids are powerful at tamping down inflammation and managing the
> immune system. When you take someone off steroids, particularly
> suddenly, the body can react in unexpected ways. It's like the immune
> system gets a bit confused and overreacts—a rebound effect.
>
> Oh, right, I remember reading that systemic steroids can sometimes
> mask the severity of psoriasis, keeping it under control. But pulling
> those steroids too fast can allow the psoriasis to come back stronger,
> even shifting into another form. This sounds familiar—there's a type
> called pustular psoriasis. If I recall, that's when the psoriasis
> turns into this more extreme version with pustules.
>
> So connecting the dots: it's likely that stopping the steroids led to
> a rebound and we might be seeing a case of generalized pustular
> psoriasis. That makes sense given the scenario—I've seen this
> mentioned before where physicians need to be cautious about how they
> taper steroids off in psoriasis patients to prevent exactly this kind
> of escalation. It fits pretty well with what I'd expect based on what
> I know about steroid withdrawal and psoriasis flare-ups.
> answer The development of generalized pustules in a patient
> with psoriasis following the discontinuation of systemic steroid
> treatment is most likely due to a condition called generalized
> pustular psoriasis (GPP). This condition can occur as a rebound effect
> when systemic steroids are suddenly stopped or rapidly tapered down.
> In patients with psoriasis, steroids can suppress the immune system
> and mitigate symptoms. When these drugs are withdrawn too quickly, the
> immune system can overreact, leading to more severe symptoms,
> including the transformation of psoriasis into a pustular form.
> Careful management and slow tapering of steroids are critical in
> psoriasis patients to prevent such exacerbations.

Second output (after training, loading the LoRA model):
> [INST]You are a helpful assistant. user A patient
> with psoriasis was treated with systemic steroids, and upon
> discontinuation of the treatment, developed generalized pustules all
> over the body. What is the most likely cause of this condition?
> assistant [/INST]thinkink I'thinkisinkis this patient with
> psoriasis who's been on systemic steroids going on systemic steroid
> treatment, and then. What are pretty cause. They stopped their steroid
> treatment, and then. What's going on now and then. What are these
> generalized pustules are like little bumps that can be a sign of these
> generalized pustules that can be a sign, and then suddenly there are
> these are filled pustules that can be a sign of different skin issues,
> but in someone with psoriasis, a major change like stopping treatment
> can really mess with the immune system. Steroids are powerful at
> tamping down the immune system and keeping it under control. When you
> take someone off steroids, particularly suddenly, the body can react
> in unexpected ways. It's like the immune system gets a bit confused
> and overreacts, a rebound effect. Alright, so connecting the dots:
> it's likely that stopping the steroids led to a rebound effect where
> the psoriasis got worse. This makes sense given the scenario—I've seen
> this mentioned before where physicians need to be cautious about how
> they taper steroids off in psoriasis patients to prevent exactly this
> kind of escalation. It fits pretty well with what I'd expect based on
> the available information. The development of generalized
> pustules in a patient with psoriasis following the discontinuation of
> systemic steroid treatment is most likely due to a condition called
> generalized pustular psoriasis (GPP). This condition can occur as a
> rebound effect when systemic steroids are suddenly stopped or rapidly
> tapered down. In patients with psoriasis, steroids can suppress the
> immune system and mitigate symptoms. When these drugs are withdrawn
> too quickly, the immune system can overreact, leading to more severe
> symptoms, including the transformation of psoriasis into a pustular
> form. Careful management and slow tapering of steroids are critical in
> psoriasis patients to prevent the escalation of symptoms.

I am not sure why the behavior is inconsistent between the two.
"""
Practical Introduction to Llama 2 Fine-Tuning with the Medical-O1-Reasoning Dataset using the Standard Trainer
Fine-tune a 7B parameter Llama 2 model using QLoRA on a T4 GPU with limited VRAM.
This script uses parameter-efficient fine-tuning techniques to enable training
on consumer-grade hardware, using a medical reasoning dataset.
"""
import torch
from datasets import load_dataset
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
BitsAndBytesConfig,
TrainingArguments,
Trainer,
pipeline,
logging,
DataCollatorForLanguageModeling,
LlamaTokenizer
)
from peft import LoraConfig, PeftModel, get_peft_model, prepare_model_for_kbit_training
from tqdm import tqdm
# Configuration
# Model and dataset settings
MODEL_NAME = "meta-llama/Llama-2-7b-hf"
# MODEL_NAME = "NousResearch/Llama-2-7b-hf"
DATASET_NAME = "FreedomIntelligence/medical-o1-reasoning-SFT"
NEW_MODEL_NAME = "llama-2-7b-medical-reasoning"
OUTPUT_DIR = "./results-medical-reasoning"
# QLoRA parameters
LORA_R = 64
LORA_ALPHA = 16
LORA_DROPOUT = 0.1
# Quantization parameters
USE_4BIT = True
BNB_4BIT_COMPUTE_DTYPE = "float16"
BNB_4BIT_QUANT_TYPE = "nf4"
USE_NESTED_QUANT = False
# Training parameters
NUM_TRAIN_EPOCHS = 100
FP16 = True
BF16 = False
PER_DEVICE_TRAIN_BATCH_SIZE = 4
PER_DEVICE_EVAL_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
GRADIENT_CHECKPOINTING = True
MAX_GRAD_NORM = 0.3
LEARNING_RATE = 2e-4
WEIGHT_DECAY = 0.001
OPTIM = "paged_adamw_32bit"
LR_SCHEDULER_TYPE = "constant"
MAX_STEPS = -1 # Override epochs if positive
WARMUP_RATIO = 0.03
GROUP_BY_LENGTH = True
SAVE_STEPS = 25
LOGGING_STEPS = 25
# Sequence parameters
MAX_SEQ_LENGTH = 10000 # Upper bound for tokenization (note: Llama-2's native context is 4096 tokens)
PACKING = False
DEVICE_MAP = {"": 0} # Load on GPU 0
# Generation parameters - consistent across all model tests
GENERATION_CONFIG = {
"max_length": 2000,
"do_sample": False,
"temperature": 0.0,
"num_beams": 1,
"top_p": 1.0,
"top_k": 50,
"repetition_penalty": 1.0,
}
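# Note: with do_sample=False and num_beams=1, decoding is greedy and deterministic,
# so two models holding identical weights should produce identical outputs for the
# same prompt; any divergence points at a difference in weights, tokenizer, or prompt.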
# Prompt formatting functions
def generate_prompt_llama(question, think, answer):
formatted_prompt = "[INST]You are a helpful assistant.\n"
formatted_prompt += f"user\n{question}\nassistant\n[/INST]"
formatted_prompt += f"think\n{think}\nanswer\n{answer}"
return formatted_prompt
def generate_prompt_llama_answer(question, think, answer):
formatted_prompt = f"think\n{think}\nanswer\n{answer}"
return formatted_prompt
def generate_prompt_llama_system(question):
formatted_prompt = "[INST]You are a helpful assistant.\n"
formatted_prompt += f"user\n{question}\nassistant\n[/INST]"
return formatted_prompt
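# For reference, the assembled prompts look roughly like this (illustrative only):
#   training prompt (generate_prompt_llama):
#     [INST]You are a helpful assistant.
#     user
#     {question}
#     assistant
#     [/INST]think
#     {think}
#     answer
#     {answer}
#   inference prompt (generate_prompt_llama_system) stops after [/INST], so the model
#   has to produce the think/answer continuation on its own.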
def main():
"""Main function to run the fine-tuning process."""
print("Starting Llama 2 fine-tuning process with medical reasoning dataset")
# 1. Load dataset
print(f"Loading {DATASET_NAME} dataset...")
dataset = load_dataset(DATASET_NAME, 'en')
dataset = dataset['train'].select(range(3,4)) # For testing purposes
print(f"Dataset loaded: {dataset}")
print(f"Dataset length: {len(dataset)}")
print(f"Dataset structure: {dataset.features}")
# 2. Configure quantization
print("Configuring BitsAndBytes for 4-bit quantization...")
compute_dtype = getattr(torch, BNB_4BIT_COMPUTE_DTYPE)
bnb_config = BitsAndBytesConfig(
load_in_4bit=USE_4BIT,
bnb_4bit_quant_type=BNB_4BIT_QUANT_TYPE,
bnb_4bit_compute_dtype=compute_dtype,
bnb_4bit_use_double_quant=USE_NESTED_QUANT,
)
# Check GPU compatibility with bfloat16
if compute_dtype == torch.float16 and USE_4BIT:
if torch.cuda.is_available():
major, _ = torch.cuda.get_device_capability()
if major >= 8:
print("=" * 80)
print("Your GPU supports bfloat16: accelerate training with bf16=True")
print("=" * 80)
# 3. Load model and tokenizer
print(f"Loading {MODEL_NAME} in 4-bit precision...")
model = AutoModelForCausalLM.from_pretrained(
MODEL_NAME,
quantization_config=bnb_config,
device_map=DEVICE_MAP,
trust_remote_code=True,
use_cache=False # Set here directly
)
# Configure model settings for training
model.config.pretraining_tp = 1
print("Loading tokenizer...")
tokenizer = LlamaTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
num_added_toks = tokenizer.add_tokens(['[INST]', '[/INST]', '', ''], special_tokens=True)
# Use distinct end-of-text and padding tokens
special_tokens_dict = {
'eos_token': '',
'pad_token': ''
}
# Add special tokens and resize model embeddings
num_added_tokens = tokenizer.add_special_tokens(special_tokens_dict)
print(f"Added {num_added_tokens} special tokens to the tokenizer")
# Important: Resize model embeddings to match new tokenizer size
model.resize_token_embeddings(len(tokenizer))
# Set padding side for the tokenizer
tokenizer.padding_side = "right" # Fix overflow issues with fp16 training
# Update model config with token IDs
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.eos_token_id
# Print token information for debugging
print(f"Pad token: {tokenizer.pad_token}, ID: {tokenizer.pad_token_id}")
print(f"EOS token: {tokenizer.eos_token}, ID: {tokenizer.eos_token_id}")
# Prepare model for k-bit training - CRITICAL STEP!
model = prepare_model_for_kbit_training(model)
# 4. Configure LoRA
print("Configuring LoRA...")
peft_config = LoraConfig(
lora_alpha=LORA_ALPHA,
lora_dropout=LORA_DROPOUT,
r=LORA_R,
bias="none",
task_type="CAUSAL_LM",
target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "down_proj", "up_proj"],
)
# Apply LoRA to the model
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
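# Note: embed_tokens and lm_head are not listed in target_modules (and modules_to_save
# is not set), so the embedding rows added by resize_token_embeddings stay frozen and
# are not stored in the adapter checkpoint.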
# 5. Process dataset
print("Processing dataset...")
def preprocess_function(examples):
"""Process a batch of examples."""
full_prompts = [
generate_prompt_llama(q, traj, att)
for q, traj, att in zip(
examples["Question"],
examples["Complex_CoT"],
examples["Response"]
)
]
print("Full prompt example:", full_prompts[0])
# Tokenize inputs with proper padding and truncation
tokenized = tokenizer(
full_prompts,
truncation=True,
max_length=MAX_SEQ_LENGTH,
padding=False, # DataCollator will handle padding
return_tensors=None, # Return python lists, not tensors
)
# For causal language modeling, labels are the same as input_ids
tokenized["labels"] = tokenized["input_ids"].copy()
return tokenized
# Apply preprocessing to dataset
tokenized_dataset = dataset.map(
preprocess_function,
batched=True,
remove_columns=dataset.column_names,
desc="Tokenizing dataset",
)
print("Decoded sample:", tokenizer.decode(tokenized_dataset[0]["input_ids"], skip_special_tokens=False))
print(f"Sample input_ids shape: {len(tokenized_dataset[0]['input_ids'])}")
# 6. Data collator - critical for properly batching sequences
data_collator = DataCollatorForLanguageModeling(
tokenizer=tokenizer,
mlm=False, # We're doing causal language modeling
)
# 7. Set up training arguments
print("Setting up training arguments...")
training_args = TrainingArguments(
output_dir=OUTPUT_DIR,
num_train_epochs=NUM_TRAIN_EPOCHS,
per_device_train_batch_size=PER_DEVICE_TRAIN_BATCH_SIZE,
per_device_eval_batch_size=PER_DEVICE_EVAL_BATCH_SIZE,
gradient_accumulation_steps=GRADIENT_ACCUMULATION_STEPS,
optim=OPTIM,
save_steps=SAVE_STEPS,
logging_steps=LOGGING_STEPS,
learning_rate=LEARNING_RATE,
weight_decay=WEIGHT_DECAY,
fp16=FP16,
bf16=BF16,
max_grad_norm=MAX_GRAD_NORM,
max_steps=MAX_STEPS,
warmup_ratio=WARMUP_RATIO,
group_by_length=GROUP_BY_LENGTH,
lr_scheduler_type=LR_SCHEDULER_TYPE,
report_to="tensorboard",
gradient_checkpointing=GRADIENT_CHECKPOINTING,
remove_unused_columns=False, # Important for LoRA fine-tuning
)
# 8. Initialize standard Trainer
print("Initializing standard Trainer...")
trainer = Trainer(
model=model,
args=training_args,
train_dataset=tokenized_dataset,
data_collator=data_collator,
)
# 9. Train model
print("Starting training...")
trainer.train()
# 10. Save trained model
print(f"Saving model to {NEW_MODEL_NAME}...")
trainer.save_model(NEW_MODEL_NAME)
tokenizer.save_pretrained(NEW_MODEL_NAME)
print("Fine-tuning complete!")
# 11. Test model
test_model(model, tokenizer)
# 12. Merge weights (requires restarting with fresh VRAM)
print("====================================================================:")
print("Note: To merge LoRA weights with the base model:")
print("1. Restart your environment to clear VRAM")
print("2. Run the merge_weights() function")
merge_weights()
def test_model(model, tokenizer):
"""Test the model with a sample question."""
logging.set_verbosity(logging.CRITICAL)
prompt = 'A patient with psoriasis was treated with systemic steroids, and upon discontinuation of the treatment, developed generalized pustules all over the body. What is the most likely cause of this condition?'
# Format for inference
formatted_prompt = generate_prompt_llama_system(prompt)
print("\nTesting model with prompt:", prompt)
# Testing using model.generate()
input_ids = tokenizer(formatted_prompt, return_tensors="pt").input_ids.to(model.device)
model.eval()
with torch.no_grad():
# Use the global generation config
output_ids = model.generate(
input_ids=input_ids,
max_length=GENERATION_CONFIG["max_length"],
do_sample=GENERATION_CONFIG["do_sample"],
temperature=GENERATION_CONFIG["temperature"],
num_beams=GENERATION_CONFIG["num_beams"],
top_p=GENERATION_CONFIG["top_p"],
top_k=GENERATION_CONFIG["top_k"],
repetition_penalty=GENERATION_CONFIG["repetition_penalty"],
pad_token_id=tokenizer.pad_token_id,
eos_token_id=tokenizer.eos_token_id,
)
print("\nGenerated output:")
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
def merge_weights():
"""Merge LoRA weights with base model (run after restarting environment)."""
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
print("Loading tokenizer first...")
tokenizer = AutoTokenizer.from_pretrained(NEW_MODEL_NAME, trust_remote_code=True)
# Calculate the new vocabulary size
new_vocab_size = len(tokenizer)
print(f"New vocabulary size: {new_vocab_size}")
print("Loading base model in FP16...")
base_model = AutoModelForCausalLM.from_pretrained(
MODEL_NAME,
low_cpu_mem_usage=True,
return_dict=True,
torch_dtype=torch.float16,
device_map={"": 0},
)
# Resize model embeddings BEFORE loading LoRA weights
base_model.resize_token_embeddings(len(tokenizer))
print(f"Resized base model embeddings to {len(tokenizer)}")
print(f"Loading LoRA weights from {NEW_MODEL_NAME}...")
model = PeftModel.from_pretrained(base_model, NEW_MODEL_NAME)
print("Merging weights...")
model = model.merge_and_unload()
# Set padding configuration
tokenizer.padding_side = "right"
# Print token info
print(f"Pad token: {tokenizer.pad_token}, ID: {tokenizer.pad_token_id}")
print(f"EOS token: {tokenizer.eos_token}, ID: {tokenizer.eos_token_id}")
# Save merged model
merged_model_name = f"{NEW_MODEL_NAME}-merged"
print(f"Saving merged model to {merged_model_name}...")
model.save_pretrained(merged_model_name)
tokenizer.save_pretrained(merged_model_name)
print("Model weights successfully merged!")
# Test the merged model
test_merged_model(model, tokenizer)
return model, tokenizer
def test_merged_model(model, tokenizer):
"""Test the merged model."""
# Use the exact same test code and parameters as test_model
logging.set_verbosity(logging.CRITICAL)
prompt = 'A patient with psoriasis was treated with systemic steroids, and upon discontinuation of the treatment, developed generalized pustules all over the body. What is the most likely cause of this condition?'
# Format for inference
formatted_prompt = generate_prompt_llama_system(prompt)
print("\nTesting merged model with prompt:", prompt)
# Testing using model.generate()
input_ids = tokenizer(formatted_prompt, return_tensors="pt").input_ids.to(model.device)
model.eval()
with torch.no_grad():
# Use the identical generation config as the original test
output_ids = model.generate(
input_ids=input_ids,
max_length=GENERATION_CONFIG["max_length"],
do_sample=GENERATION_CONFIG["do_sample"],
temperature=GENERATION_CONFIG["temperature"],
num_beams=GENERATION_CONFIG["num_beams"],
top_p=GENERATION_CONFIG["top_p"],
top_k=GENERATION_CONFIG["top_k"],
repetition_penalty=GENERATION_CONFIG["repetition_penalty"],
pad_token_id=tokenizer.pad_token_id,
eos_token_id=tokenizer.eos_token_id,
)
print("\nGenerated output:")
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
if __name__ == "__main__":
main()
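For completeness, here is the minimal sanity check I plan to run next. This is only a sketch, not part of the script above, and it assumes the constants and helpers defined there (MODEL_NAME, NEW_MODEL_NAME, GENERATION_CONFIG, generate_prompt_llama_system). It re-attaches the saved adapter to a freshly loaded 4-bit base, exactly as during training, and generates with the same greedy settings, to see whether the divergence already shows up at this stage or only after merging into the fp16 base.

import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Tokenizer as saved after training (includes the added special tokens)
tokenizer = LlamaTokenizer.from_pretrained(NEW_MODEL_NAME)

# Base model quantized the same way as during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map={"": 0},
)
base.resize_token_embeddings(len(tokenizer))

# Attach the saved LoRA adapter without merging
model = PeftModel.from_pretrained(base, NEW_MODEL_NAME)
model.eval()

prompt = generate_prompt_llama_system(
    "A patient with psoriasis was treated with systemic steroids, and upon "
    "discontinuation of the treatment, developed generalized pustules all over "
    "the body. What is the most likely cause of this condition?"
)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    out = model.generate(
        input_ids=input_ids,
        max_length=GENERATION_CONFIG["max_length"],
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=False))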