{"id":5790,"date":"2025-06-06T17:25:21","date_gmt":"2025-06-06T13:55:21","guid":{"rendered":"https:\/\/arrahimipour.com\/articles-en\/reducing-resource-consumption-in-training-large-language-models-through-multi-objective-optimization\/"},"modified":"2026-03-28T19:40:42","modified_gmt":"2026-03-28T16:10:42","slug":"reducing-resource-consumption-in-training-large-language-models-through-multi-objective-optimization","status":"publish","type":"post","link":"https:\/\/arrahimipour.com\/en\/articles-en\/reducing-resource-consumption-in-training-large-language-models-through-multi-objective-optimization\/","title":{"rendered":"Reducing Resource Consumption in Training Large Language Models through Multi-Objective Optimization"},"content":{"rendered":"<p>The rapid scaling of large language models (LLMs) has led to unprecedented computational costs and environmental impact. We address the problem of multi-objective optimization for LLM training, balancing model performance against resource usage (training time, energy, GPU-hours, carbon footprint). We survey and evaluate state-of-the-art techniques\u2014model pruning, quantization, knowledge distillation, neural architecture search (NAS), and hyperparameter tuning via evolutionary or reinforcement learning\u2014in terms of their trade-offs between accuracy (or loss) and efficiency. Using recent experimental data from public benchmarks (e.g. BERT fine-tuning on GLUE tasks, GPT-family training), we analyze how each method shapes the Pareto frontier of accuracy vs. cost. For example, static 8-bit quantization has been shown to cut energy use by ~29% with negligible accuracy loss, while structured pruning can speed up inference by ~63% for minor accuracy degradation. Advanced LLM compression methods achieve even larger gains: a multi-objective shift-add reparameterization method achieved over 80% reduction in memory and energy usage compared to full models. 
We include Pareto plots (accuracy vs. energy) to visualize these trade-offs (Figures 1\u20134). Overall, we find that multi-objective search (e.g. Bayesian optimization or genetic algorithms) can systematically identify configurations that lie on the Pareto-optimal front, enabling practitioners to choose models that best fit their constraints. Our paper highlights that, consistent with \u201cGreen AI\u201d principles, moderate sacrifices in accuracy can yield large efficiency gains, and we provide actionable recommendations for training sustainable LLMs.<\/p>\n<p>Article Link:<\/p>\n<p><a href=\"https:\/\/civilica.com\/doc\/2280364\/\">https:\/\/civilica.com\/doc\/2280364\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The rapid scaling of large language models (LLMs) has led to unprecedented computational costs and environmental impact. We address the problem of multi-objective optimization for LLM training, balancing model performance against resource usage (training time, energy, GPU-hours, carbon footprint). 
We survey and evaluate state-of-the-art techniques\u2014model pruning, quantization, knowledge distillation, neural architecture search (NAS), and hyperparameter [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":5736,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[39],"tags":[64,67,68,66],"class_list":["post-5790","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-articles-en","tag-ai-development","tag-ai-optimization","tag-large-language-models","tag-machine-learning"],"acf":[],"_links":{"self":[{"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/posts\/5790","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/comments?post=5790"}],"version-history":[{"count":0,"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/posts\/5790\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/media\/5736"}],"wp:attachment":[{"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/media?parent=5790"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/categories?post=5790"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/arrahimipour.com\/en\/wp-json\/wp\/v2\/tags?post=5790"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}