Here is a collection of the best AI articles on different topics.

## Tokenization
This is the preliminary step in LLMs where raw text is broken down into smaller, discrete units called "tokens," which can be words, subwords, or characters. These tokens serve as the fundamental input units that the model then processes and understands.
- Medium: Tokenization - A complete guide

## Embeddings
These are dense vector representations of words, phrases, or even entire documents that capture their semantic relationships and allow LLMs to process textual information numerically; similar meanings are represented by vectors that lie close together in a high-dimensional space.
- Encord: [The Full Guide to Embeddings in Machine Learning](https://encord.com/blog/embeddings-machine-learning/)

## Attention Mechanism (and Self-Attention)
The attention mechanism is the fundamental component within Transformers that allows the model to weigh the importance of different words in an input sequence relative to each other; self-attention applies this within a single sequence, which is crucial for understanding context and the relationships between tokens.

## Transformers
This is the groundbreaking neural network architecture that revolutionized sequence processing in LLMs, primarily through its self-attention mechanism, which allows the model to weigh the importance of different parts of the input sequence when processing each element, enabling highly effective parallelization and long-range dependency capture.
- Datacamp: [How Transformers Work: A Detailed Exploration of Transformer Architecture](https://www.datacamp.com/tutorial/how-transformers-work)

## Efficient fine-tuning (LoRA, QLoRA, PEFT)
These are methods designed to adapt large pre-trained LLMs to specific downstream tasks with significantly fewer computational resources and trainable parameters than full fine-tuning, making the customization process more accessible and faster.

## Underfitting and Overfitting
These terms describe two common challenges in machine learning model training: **Underfitting** occurs when a model is too simple to capture the underlying patterns in the training data, leading to high errors on both training and new data; conversely, **Overfitting** happens when a model learns the training data too well, including its noise and specific quirks, resulting in excellent performance on the training set but poor generalization to unseen data.
- GeeksforGeeks: [ML - Underfitting and overfitting](https://www.geeksforgeeks.org/machine-learning/underfitting-and-overfitting-in-machine-learning/)

## Retrieval-Augmented Generation (RAG)
This technique enhances the generative capabilities of LLMs by enabling them to retrieve relevant information from a separate knowledge base or document store before generating a response, thereby improving accuracy and factuality and reducing hallucinations by grounding the output in external data.
- Prompt Engineering Guide: [Retrieval Augmented Generation (RAG) for LLMs](https://www.promptingguide.ai/research/rag)

## Evaluation Metrics
These are quantitative measures used to assess the performance, quality, and effectiveness of LLMs and other machine learning models, providing objective insight into how well a model achieves its intended purpose on various tasks. Common examples include accuracy, precision, recall, F1-score, BLEU, ROUGE, and perplexity.
- Read more: [[LLM Benchmarking]]
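
To make a couple of these metrics concrete, here is a minimal, self-contained Python sketch (the labels and per-token probabilities are made-up toy values, not from any real model) that computes precision, recall, and F1 for a small classification output, and perplexity from the probabilities a model assigned to the actual next tokens:

```python
import math

# Toy binary classification output: 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)   # of the predicted positives, how many were correct
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Toy perplexity: probabilities the model assigned to each actual next token.
# Perplexity is the exponential of the average negative log-likelihood;
# lower values mean the model is less "surprised" by the text.
token_probs = [0.25, 0.60, 0.10, 0.45]
avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)
print(f"perplexity={perplexity:.2f}")
```

The same quantities are usually computed with library helpers (e.g. a metrics package or an evaluation framework); the point here is only to show what each number means.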