<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[#icodeformyभाषा: Resources]]></title><description><![CDATA[Collecting reading materials and courses on ML, NLP and Gen AI for the community!]]></description><link>https://www.icodeformybhasa.com/s/resources</link><image><url>https://substackcdn.com/image/fetch/$s_!DMde!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d92160-5f82-40ca-8233-b069f42bbba6_1080x1080.png</url><title>#icodeformyभाषा: Resources</title><link>https://www.icodeformybhasa.com/s/resources</link></image><generator>Substack</generator><lastBuildDate>Fri, 10 Apr 2026 10:49:00 GMT</lastBuildDate><atom:link href="https://www.icodeformybhasa.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[#icodeformyभाषा]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[icodeformybhasa@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[icodeformybhasa@substack.com]]></itunes:email><itunes:name><![CDATA[Shreeya]]></itunes:name></itunes:owner><itunes:author><![CDATA[Shreeya]]></itunes:author><googleplay:owner><![CDATA[icodeformybhasa@substack.com]]></googleplay:owner><googleplay:email><![CDATA[icodeformybhasa@substack.com]]></googleplay:email><googleplay:author><![CDATA[Shreeya]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Large Language Models - A Curated Reading List]]></title><description><![CDATA[While I am working on the blog series on the LLaMA family of models, I have also put together a curated reading list of papers that chart the evolution of large language 
models.]]></description><link>https://www.icodeformybhasa.com/p/large-language-models-a-curated-reading</link><guid isPermaLink="false">https://www.icodeformybhasa.com/p/large-language-models-a-curated-reading</guid><dc:creator><![CDATA[Shreeya]]></dc:creator><pubDate>Sun, 16 Jun 2024 02:31:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/009e2e4d-e3bd-44a3-9cbe-2b8496cac5cf_2500x1500.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>While I am working on the blog series on the LLaMA family of models, I have also put together a curated reading list of papers that chart the evolution of large language models. These papers provide crucial context for understanding the foundations and landscape of the LLMs that are the backbone of systems like meta.ai, ChatGPT, and Claude, among others.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;4b436455-f9ee-4d05-ba5d-2d92c53f5d30&quot;,&quot;caption&quot;:&quot;Since February 2023, Meta has open-sourced three versions of their LLaMA language model. This has enabled thousands of people in the AI and NLP communities to explore and build upon the LLaMA models for their use-cases. 
On April 18, 2024, Meta open-sourced&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The LLaMA Family of Models, Model Architecture, Size, and Scaling Laws &quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:104468357,&quot;name&quot;:&quot;Shreeya Dhakal&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ccf9e7e-b918-482e-b8ed-800ad52e084e_1176x982.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-05-05T21:19:08.993Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff79826e3-d938-417a-8e68-b7a3ed3e7bff_1280x1280.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.icodeformybhasa.com/p/the-llama-family-of-models-model&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:143924377,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;#icodeformy&#2349;&#2366;&#2359;&#2366;&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d92160-5f82-40ca-8233-b069f42bbba6_1080x1080.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>In this post, I have attempted to create a comprehensive reading list, including some papers that are featured in the blog linked above. 
</p><h3>Key research papers on deep learning architectures</h3><ol><li><p><a href="https://arxiv.org/pdf/1409.3215v3">Sequence to Sequence Learning with Neural Networks.</a> Sutskever et al., 2014. Google</p></li><li><p><a href="https://arxiv.org/pdf/1706.03762">Attention is All You Need. </a>Vaswani et al., 2017. Google</p></li><li><p><a href="https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf">Improving Language Understanding by Generative Pre-Training.</a> Radford et al., 2018. OpenAI - GPT-1</p></li><li><p><a href="https://aclanthology.org/N19-1423.pdf">BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.</a> Devlin et al., 2018. Google</p></li><li><p><a href="https://arxiv.org/pdf/1910.13461">BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension.</a> Lewis et al., 2019. Facebook</p></li><li><p><a href="https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf">Language Models are Unsupervised Multitask Learners</a>. Radford et al., 2019. OpenAI - GPT-2</p></li><li><p><a href="https://arxiv.org/pdf/1910.10683">Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. </a>Raffel et al., 2019. Google - T5</p></li><li><p><a href="https://arxiv.org/pdf/2005.14165">Language Models are Few-Shot Learners.</a> Brown et al., 2020. OpenAI - GPT-3</p></li><li><p><a href="https://arxiv.org/abs/2002.04745">On Layer Normalization in the Transformer Architecture.</a> Xiong et al., 2020.</p></li></ol><p><em><strong>Survey Papers</strong></em></p><ol><li><p><a href="https://arxiv.org/abs/2303.18223">A Survey of Large Language Models.</a> Zhao et al., 2023. [<a href="https://github.com/RUCAIBox/LLMSurvey">Github</a>]</p></li><li><p><a href="https://arxiv.org/pdf/2404.04925">Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers.</a> Qin et al., 2024. 
</p></li><li><p><a href="https://arxiv.org/pdf/2402.06196">Large Language Models: A Survey. </a>Minaee et al., 2024</p></li></ol><h3>Efficient pre-training and scaling laws</h3><ol><li><p><a href="https://arxiv.org/pdf/2001.08361">Scaling Laws for Neural Language Models.</a> Kaplan et al., 2020. OpenAI</p></li><li><p><a href="https://arxiv.org/pdf/2112.11446">Scaling Language Models: Methods, Analysis &amp; Insights from Training Gopher.</a> Rae et al., 2021. DeepMind</p></li><li><p><a href="https://arxiv.org/pdf/2203.15556">Training Compute-Optimal Large Language Models. </a>Hoffmann et al., 2022. DeepMind - Chinchilla</p></li><li><p><a href="https://arxiv.org/abs/2205.14135">FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. </a>Dao et al., 2022. Stanford University</p></li><li><p><a href="https://arxiv.org/pdf/2204.02311">PaLM: Scaling Language Modeling with Pathways. </a>Chowdhery et al., 2022. Google</p></li><li><p><a href="https://arxiv.org/pdf/2212.14034">Cramming: Training a Language Model on a Single GPU in One Day.</a> Geiping et al., 2022. University of Maryland, College Park</p></li></ol><p><em><strong>Survey Papers</strong></em></p><ol><li><p><a href="https://arxiv.org/pdf/2009.06732">Efficient Transformers: A Survey. </a>Tay et al., 2020 (Revised 2022). Google</p></li><li><p><a href="https://arxiv.org/pdf/2302.01107">A Survey on Efficient Training of Transformers. </a>Zhuang et al., 2023</p></li></ol><h3>Fine-tuning and parameter-efficient transfer learning</h3><ol><li><p><a href="https://arxiv.org/pdf/1902.00751">Parameter-Efficient Transfer Learning for NLP.</a> Houlsby et al., 2019. Google [<a href="https://github.com/google-research/adapter-bert">Github</a>]</p></li><li><p><a href="https://arxiv.org/pdf/2109.01652">Finetuned Language Models are Zero-Shot Learners. </a>Wei et al., 2021. 
Google</p></li><li><p><a href="https://arxiv.org/pdf/2106.09685">LoRA: Low-Rank Adaptation of Large Language Models.</a> Hu et al., 2021. Microsoft [<a href="https://github.com/microsoft/LoRA">Github</a>] [<a href="https://www.youtube.com/watch?v=DhRoTONcyZE">Video</a>]</p></li><li><p><a href="https://arxiv.org/pdf/2305.14314">QLoRA: Efficient Finetuning of Quantized LLMs.</a> Dettmers et al., 2023. University of Washington [<a href="https://github.com/artidoro/qlora">Github</a>]</p></li></ol><p><em><strong>Survey Papers</strong></em></p><ol><li><p><a href="https://arxiv.org/pdf/2103.13630">A Survey of Quantization Methods for Efficient Neural Network Inference.</a> Gholami et al., 2021. UC Berkeley</p></li><li><p><a href="https://arxiv.org/pdf/2303.15647">Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning. </a>Lialin et al., 2022. UMass Lowell</p></li><li><p><a href="https://arxiv.org/pdf/2304.13712">Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. </a>Yang et al., 2023 [<a href="https://github.com/Mooler0410/LLMsPracticalGuide">Github</a>]</p></li><li><p><a href="https://arxiv.org/abs/2312.00678">The Efficiency Spectrum of Large Language Models: An Algorithmic Survey.</a> Ding et al., 2024. Microsoft</p></li></ol><h3>Aligning LLMs</h3><ol><li><p><a href="https://arxiv.org/pdf/1706.03741">Deep Reinforcement Learning from Human Preferences.</a> Christiano et al., 2017 (Revised 2023). OpenAI and DeepMind</p></li><li><p><a href="https://arxiv.org/pdf/1909.08593">Fine-Tuning Language Models from Human Preferences.</a> Ziegler et al., 2019 (Revised 2020). OpenAI</p></li><li><p><a href="https://arxiv.org/abs/2203.02155">Training Language Models to Follow Instructions with Human Feedback. </a>Ouyang et al., 2022. OpenAI - InstructGPT</p></li><li><p><a href="https://arxiv.org/pdf/2204.05862">Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. </a>Bai et al., 2022. 
Anthropic</p></li><li><p><a href="https://arxiv.org/pdf/2209.07858">Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned. </a>Ganguli et al., 2022. Anthropic</p></li><li><p><a href="https://arxiv.org/pdf/2212.08073">Constitutional AI: Harmlessness from AI Feedback.</a> Bai et al., 2022. Anthropic</p></li></ol><p><em><strong>Survey Papers</strong></em></p><ol><li><p><a href="https://arxiv.org/pdf/2307.12966">Aligning Large Language Models with Human: A Survey.</a> Wang et al., 2023. Huawei Noah&#8217;s Ark Lab</p></li><li><p><a href="https://arxiv.org/pdf/2309.15025">Large Language Model Alignment: A Survey.</a> Shen et al., 2023. Tianjin University</p></li></ol><p><em>Note:</em> this is not an exhaustive reading list; I will keep updating it as I come across new papers while working on #icodeformy&#2349;&#2366;&#2359;&#2366;! There will be more such reading lists in the future, covering other areas within natural language processing.</p>]]></content:encoded></item></channel></rss>