Target lora modules identity for Llama 2 7B on Amazon SageMaker Studio


I have the following Q:

  • I trained a few Llama 2 7B models using the "Train" GUI in Amazon SageMaker Studio
  • This was around October/November 2023
  • Back then, the "Target lora modules" hyperparameter did not exist
  • I am trying to understand which matrices of the Llama 2 7B model were adjusted by the LoRA algorithm that was available back then

Currently, the defaults for the Target lora modules hyperparameter are q_proj,v_proj, i.e. W_q and W_v, the projection matrices for queries and values.
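For context, LoRA does not overwrite W_q and W_v in place; it learns a low-rank update ΔW = (alpha/r)·B·A that is added to the frozen base matrix. A minimal pure-Python sketch with illustrative (hypothetical) 2x2 shapes:

```python
# Minimal LoRA update sketch (pure Python, illustrative shapes only).
# W is the frozen base matrix (e.g. q_proj); A (r x d_in) and B (d_out x r)
# are the trained low-rank factors; alpha/r is the usual scaling factor.

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    scale = alpha / r
    delta = matmul(B, A)  # (d_out x d_in) low-rank update
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[0.5, 0.5]]               # r=1, d_in=2
B = [[1.0], [0.0]]             # d_out=2, r=1
W_eff = lora_effective_weight(W, A, B, alpha=2, r=1)
print(W_eff)  # [[2.0, 1.0], [0.0, 1.0]]
```

Only the matrices listed in the target modules get such an update; everything else stays frozen, which is why knowing the old default list matters.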

Is there any way to trace or figure out what the defaults were back in October 2023, given the old list of hyperparameters available?

I tried inspecting my saved models: their parameters and the folder with all the model weights and metadata (after unzipping model.tar.gz), but could not find anything. I also checked the AWS documentation and the training scripts, and looked through online discussions, but it seems impossible to reconstruct which matrices LoRA from AWS SageMaker Studio was modifying.
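One more place worth checking: if the unzipped artifact contains the LoRA adapter itself (an adapter_config.json, or a separate adapter weight file), the target modules can be read straight from it, because the adapter state-dict keys embed the module names. A hedged sketch, assuming the adapter weights can be loaded into an ordinary dict whose keys look like those produced by the PEFT library (the key strings below are hypothetical examples):

```python
import re

def infer_target_modules(state_dict_keys):
    """Extract the set of module names LoRA was applied to from adapter
    state-dict keys such as
    'base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight'."""
    targets = set()
    for key in state_dict_keys:
        m = re.search(r"\.([A-Za-z0-9_]+)\.lora_[AB]\.", key)
        if m:
            targets.add(m.group(1))
    return sorted(targets)

# Hypothetical keys, shaped like those produced by the PEFT library:
keys = [
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight",
    "base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight",
    "base_model.model.model.layers.0.self_attn.v_proj.lora_A.weight",
]
print(infer_target_modules(keys))  # ['q_proj', 'v_proj']
```

If the artifact only contains fully merged weights (no separate adapter files), this approach won't work and you would need to compare against the base checkpoint instead.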

Any help would be appreciated!

Here is a snapshot of the parameters I used back then:

Key: Value

enable_fsdp: True
epoch: 5
instruction_tuned: False
int8_quantization: False
learning_rate: 0.0001
lora_alpha: 32
lora_dropout: 0.05
lora_r: 2
max_input_length: -1
max_train_samples: -1
max_val_samples: -1
per_device_eval_batch_size: 1
per_device_train_batch_size: 4
preprocessing_num_workers: None
sagemaker_container_log_level: 20
sagemaker_job_name: smjs-c-llama-2-7b-uft-textgen-lora-rank-2-20230925-152541
sagemaker_program: transfer_learning.py
sagemaker_region: eu-west-1
sagemaker_submit_directory: /opt/ml/input/data/code/sourcedir.tar.gz
seed: 10
train_data_split_seed: 0
validation_split_ratio: 0.2
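One possible cross-check from these hyperparameters alone: if any log or artifact records the number of trainable parameters, simple arithmetic can tell you which target set is consistent with it. Assuming Llama 2 7B dimensions (32 layers, hidden size 4096, where q_proj and v_proj are both 4096x4096 since 7B uses full multi-head attention), lora_r=2 on q_proj and v_proj implies a specific count:

```python
# Hypothetical cross-check: trainable-parameter count implied by a given
# set of LoRA target modules, assuming Llama 2 7B dimensions
# (32 layers, hidden size 4096; q_proj and v_proj are each 4096x4096).

def lora_param_count(r, targets, n_layers=32, hidden=4096):
    # Each targeted matrix contributes A (r x d_in) plus B (d_out x r)
    # parameters; here d_in = d_out = hidden for q_proj and v_proj.
    per_matrix = r * hidden + hidden * r
    return n_layers * len(targets) * per_matrix

print(lora_param_count(r=2, targets=["q_proj", "v_proj"]))  # 1048576
```

If a recorded trainable-parameter figure matches roughly 1.05M, that is consistent with q_proj,v_proj at rank 2; a larger figure would suggest more modules (e.g. k_proj or the MLP projections) were targeted. Note this assumes square projection matrices, which does not hold for models using grouped-query attention.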
2 Answers

Thank you Giovanni Lauria for your answer.

These are some useful suggestions, but I have tried most of them already and couldn't find the information.

Also, this answer has been clearly copy/pasted directly from an LLM (GPT or maybe the one used internally within Amazon). I did check some suggestions from GPT before posting here, which were the same as the ones suggested here but couldn't get anywhere.

Is there no way to find out the actual Amazon documentation that specifies this information? Or some kind of metadata that I could check?

Thank you.

Alex
answered a month ago

Try the following approaches to investigate the behavior of LoRA in Llama 2 7B models during October/November 2023, even in the absence of explicit documentation or access to historical configurations.

  1. Literature Review: Begin by conducting a thorough literature review on LoRA algorithm implementations, particularly focusing on papers, articles, or official documentation released around the timeframe of October/November 2023. Look for any discussions, publications, or updates related to the integration of LoRA with Llama 2 7B models or similar architectures.

  2. Documentation Analysis: Scrutinize any available documentation from AWS, SageMaker, or related repositories. Pay close attention to release notes, version updates, or changelogs that might provide insights into the introduction or modification of hyperparameters, including the default settings for LoRA-related parameters.

  3. Experiment Replication: Attempt to replicate the training environment and experimental conditions prevalent during October/November 2023. If feasible, access historical training data, compute resources, and training scripts used during that time. Reproduce the training process for Llama 2 7B models and meticulously document any differences or nuances observed compared to current configurations.

  4. Parameter Sensitivity Analysis: Conduct a parameter sensitivity analysis to investigate the impact of varying hyperparameters, including those related to LoRA, on model performance. Systematically modify hyperparameters within a reasonable range and monitor changes in model behavior, such as convergence speed, accuracy, or loss curves. This analysis could help infer the default settings for LoRA in absence of explicit documentation.

  5. Cross-Validation Studies: Employ cross-validation techniques to assess the robustness and generalization performance of Llama 2 7B models trained with different hyperparameter configurations, including variations in LoRA-related settings. Compare model performance metrics across different settings to discern patterns or dependencies that could hint at the default behavior of LoRA.

  6. Statistical Inference: Utilize statistical inference methods to analyze experimental results and draw meaningful conclusions about the effects of hyperparameters on model behavior. Employ hypothesis testing, analysis of variance (ANOVA), or regression analysis to quantify the significance of variations in LoRA-related parameters and their impact on model outcomes.

  7. Peer Review and Collaboration: Engage with peers, researchers, or practitioners in the field of machine learning and natural language processing. Share your findings, methodologies, and insights related to the investigation of LoRA in Llama 2 7B models. Collaborate with others to validate results, exchange perspectives, and collectively advance understanding in this domain.
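As a concrete version of steps 3 through 5 above: if the saved artifact holds fully merged weights rather than a separate adapter, the modules LoRA touched can in principle be inferred by diffing the fine-tuned weights against the original base checkpoint, since only the targeted matrices should differ. A minimal sketch over plain dicts of flattened weight values (the helper and the data are hypothetical, not a SageMaker API):

```python
def modified_modules(base, tuned, tol=1e-6):
    """Return names of weight matrices that differ between the base and
    fine-tuned checkpoints, i.e. the matrices a merged LoRA update touched."""
    changed = []
    for name, w_base in base.items():
        w_tuned = tuned[name]
        diff = max(abs(a - b) for a, b in zip(w_base, w_tuned))
        if diff > tol:
            changed.append(name)
    return changed

# Synthetic 1-D "weights" standing in for flattened matrices:
base  = {"q_proj": [1.0, 2.0], "k_proj": [3.0, 4.0], "v_proj": [5.0, 6.0]}
tuned = {"q_proj": [1.1, 2.0], "k_proj": [3.0, 4.0], "v_proj": [5.0, 6.2]}
print(modified_modules(base, tuned))  # ['q_proj', 'v_proj']
```

In practice you would load both checkpoints with your tensor library of choice and compare per-matrix norms; the point is that a merged LoRA fine-tune leaves every untargeted matrix bit-identical to the base model.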

EXPERT
answered a month ago
