Glossary of Key Generative AI Terms

1. Generative AI

A subset of artificial intelligence focused on creating new content, such as text, images, audio, video, or code. It relies on models trained on vast datasets to identify patterns and produce novel outputs that resemble, but do not simply copy, the training data.

2. Large Language Model (LLM)

An AI model trained to process and generate human-like text. Examples include GPT (Generative Pre-trained Transformer), BERT, and LaMDA, all of which leverage deep learning architectures, specifically transformers.

3. Transformer

A neural network architecture known for its ability to process sequential data, like text. Transformers use self-attention mechanisms, enabling the model to learn relationships between words and capture long-range dependencies efficiently.
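As a rough illustration, the scaled dot-product self-attention at the heart of a transformer can be sketched in a few lines of NumPy; the Q, K, and V matrices below are random stand-ins for the learned projections of an input sequence.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)     # subtract max for numerical stability
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)             # how strongly each token attends to every other token
        weights = softmax(scores, axis=-1)
        return weights @ V                          # weighted sum of value vectors, one per token

    seq_len, d_k = 4, 8                             # toy sizes: 4 tokens, 8-dimensional vectors
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(seq_len, d_k)) for _ in range(3))
    print(self_attention(Q, K, V).shape)            # (4, 8): one contextualized vector per token

In a real transformer this runs in parallel across many attention heads and layers, which is what makes long-range dependencies cheap to capture.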

4. Retrieval-Augmented Generation (RAG)

A hybrid approach combining retrieval-based and generative AI techniques. RAG models retrieve relevant information from external sources (e.g., databases or documents) and then use this information to generate contextually accurate responses, improving the model’s factual accuracy and relevance.
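A minimal sketch of the retrieve-then-generate flow, assuming hypothetical embed() and generate() helpers that wrap whatever embedding model and LLM are actually in use:

    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def rag_answer(question, documents, embed, generate, k=3):
        # 1. Retrieve: rank documents by similarity to the question.
        q_vec = embed(question)
        ranked = sorted(documents, key=lambda d: cosine(embed(d), q_vec), reverse=True)
        context = "\n\n".join(ranked[:k])
        # 2. Augment: place the retrieved passages into the prompt.
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        # 3. Generate: let the LLM produce a grounded response.
        return generate(prompt)

Production systems typically replace the linear scan with a vector database (see term 22) so retrieval stays fast over millions of documents.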

5. LangChain

A framework designed to facilitate the development of applications that rely on LLMs for various NLP tasks. LangChain is particularly useful for chaining together complex workflows where multiple steps or sources of data are involved, such as combining data retrieval, generation, and summarization into a single, coherent process.
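The chaining idea itself can be sketched in plain Python; this is only an illustration of the pattern, not LangChain's actual API, and the step functions are hypothetical stand-ins for real retrieval and LLM calls.

    from functools import reduce

    def chain(*steps):
        # Compose steps left to right: each step's output becomes the next step's input.
        return lambda x: reduce(lambda acc, step: step(acc), steps, x)

    pipeline = chain(
        lambda q: {"question": q, "docs": ["doc A", "doc B"]},          # retrieve
        lambda s: f"Answer to {s['question']!r} using {s['docs']}",     # generate
        lambda text: text[:80],                                         # summarize / post-process
    )
    print(pipeline("What is RAG?"))

LangChain supplies ready-made components for each of these steps, plus memory, tool use, and tracing, so pipelines like this do not have to be hand-rolled.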

6. CrewAI

An emerging framework for building multi-agent applications on top of LLMs. CrewAI lets developers define AI agents with distinct roles, goals, and tools, and orchestrate them as a "crew" that collaborates on a task, splitting work such as research, analysis, and writing among the agents.

7. Pre-training and Fine-tuning

  • Pre-training: Training a model on a large dataset to develop a general understanding of the data.
  • Fine-tuning: Customizing the model for specific tasks by training it on a smaller, task-specific dataset.

8. Prompt and Prompt Engineering

  • Prompt: Text input given to a generative model to guide its response.
  • Prompt Engineering: The craft of designing prompts that elicit the most accurate or creative responses from a model, especially valuable in guiding LLMs toward desired outputs.

9. Few-shot, One-shot, and Zero-shot Learning

These terms describe how much prior information or example data a model is given when asked to perform a new task (a short prompt sketch follows the list).

  • Few-shot learning: The model is given a few examples.
  • One-shot learning: The model is given only one example.
  • Zero-shot learning: The model receives no prior examples and must generalize based on its pre-trained knowledge.
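A few-shot prompt, for example, simply places a handful of worked examples in the input before the new case; a zero-shot prompt would contain only the instruction and the final line.

    few_shot_prompt = """Classify the sentiment of each review as positive or negative.

    Review: "Great battery life and a sharp screen." Sentiment: positive
    Review: "Stopped working after two days." Sentiment: negative
    Review: "The keyboard feels cheap and the keys stick." Sentiment:"""
    # The model is expected to continue with "negative", generalizing from the two examples.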

10. Token

The basic unit of data processed by language models, often representing individual words or subwords. For example, the word “chatbot” might be split into “chat” and “bot.”
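As a sketch, OpenAI's tiktoken library (assuming it is installed; every model family has its own tokenizer) shows how a short sentence breaks into integer token IDs and the text pieces behind them:

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")       # tokenizer used by several OpenAI models
    ids = enc.encode("Chatbots are useful.")
    print(ids)                                       # a short list of integer token IDs
    print([enc.decode([i]) for i in ids])            # the text piece each ID stands for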

11. Temperature and Top-p Sampling

These parameters help control the variability of model outputs (a small sampling sketch follows the list).

  • Temperature: A higher temperature introduces more randomness, while a lower temperature makes the output more deterministic.
  • Top-p (Nucleus Sampling): Limits the model’s choices to a subset of probable outcomes, refining its output quality by considering only top results until a cumulative probability threshold (p) is reached.
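A rough sketch of how both knobs act on a model's raw scores (logits) at each generation step, using made-up numbers for a tiny four-token vocabulary:

    import numpy as np

    def sample_next(logits, temperature=1.0, top_p=1.0, rng=np.random.default_rng(0)):
        # Temperature rescales the logits: values below 1 sharpen the distribution, above 1 flatten it.
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        # Top-p keeps the smallest set of most-probable tokens whose cumulative probability reaches p.
        order = np.argsort(probs)[::-1]
        cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
        keep = order[:cutoff]
        kept = probs[keep] / probs[keep].sum()       # renormalize over the kept tokens
        return int(rng.choice(keep, p=kept))

    logits = np.array([2.0, 1.0, 0.5, -1.0])
    print(sample_next(logits, temperature=0.7, top_p=0.9))

With temperature near zero and a small top_p, the same token is chosen almost every time; raising either value lets less likely tokens through.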

12. GAN (Generative Adversarial Network)

A model architecture in which two networks, a generator and a discriminator, are trained against each other: the generator produces candidate data and the discriminator tries to tell it apart from real data, pushing the generator toward increasingly realistic outputs. GANs are especially prominent in image generation.

13. Diffusion Model

A generative approach used mainly for image creation, where the model generates data by reversing a noise-adding process. Diffusion models have become popular for creating realistic visual outputs in tools like DALL-E and Stable Diffusion.

14. Reinforcement Learning from Human Feedback (RLHF)

A training approach in which human evaluators rank or rate model outputs, and those preferences are used to further fine-tune the model. RLHF makes responses more accurate and better aligned with human values, particularly in conversational AI models.

15. Ethical AI and Responsible AI

Principles guiding the safe, fair, and transparent development of AI systems. This includes addressing issues like bias, privacy, and the potential societal impacts of AI-generated content.

16. Bias and Fairness in AI

Bias in AI can lead to unfair treatment of individuals or groups and is a significant challenge in ensuring responsible AI use. Bias mitigation techniques are essential to avoid harmful stereotypes and enhance fairness.

17. Synthetic Data

Artificially generated data used for training AI models, often when real data is limited, costly, or poses privacy concerns. Synthetic data diversifies training datasets and improves model accuracy in scenarios with insufficient real-world data.

18. Image-to-Image, Text-to-Image, and Text-to-Text Models

These are generative models for converting one type of input to another:

  • Image-to-Image: Converts an image input into a modified version, such as adding color to black-and-white photos.
  • Text-to-Image: Generates images from text prompts (e.g., DALL-E).
  • Text-to-Text: Language models generating text based on an input prompt (e.g., GPT-4, ChatGPT).

19. Inference and Latency

  • Inference: Running a trained model on new data to produce predictions or outputs.
  • Latency: The time delay between input and response, crucial for applications that require real-time interaction (a small timing sketch follows this list).
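A quick way to measure latency per request, assuming a placeholder model.predict call standing in for the real inference step:

    import time

    def timed_inference(model, inputs):
        start = time.perf_counter()
        output = model.predict(inputs)                       # hypothetical inference call
        latency_ms = (time.perf_counter() - start) * 1000
        return output, latency_ms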

20. Multimodal AI

AI that can process and generate multiple types of data (e.g., text, images, audio). Multimodal applications are highly beneficial in tasks where a holistic understanding of diverse data types is required.

21. Hallucination

When a generative model confidently outputs information that is incorrect, fabricated, or nonsensical. Addressing hallucinations is essential for deploying reliable AI, especially in applications where factual accuracy is crucial.

22. Vector Database

A specialized database designed to store vector embeddings, which are numerical representations of data points (such as text or images). Vector databases such as Pinecone, and similarity-search libraries such as FAISS, are essential in retrieval-augmented generation (RAG) workflows, allowing models to efficiently search for relevant information.
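As a sketch, a flat FAISS index (assuming the faiss package is installed; production systems usually use approximate indexes and persistent storage) stores document embeddings and returns the nearest neighbours of a query vector:

    import numpy as np
    import faiss

    dim = 64
    rng = np.random.default_rng(0)
    doc_vectors = rng.normal(size=(1000, dim)).astype("float32")   # embeddings of 1,000 documents

    index = faiss.IndexFlatL2(dim)                 # exact L2 (Euclidean) search
    index.add(doc_vectors)

    query = rng.normal(size=(1, dim)).astype("float32")
    distances, ids = index.search(query, 5)        # ids of the 5 closest documents
    print(ids)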

23. Embeddings

Numerical representations of data that capture semantic meaning, used to measure similarity between different pieces of information. Embeddings are vital in search, recommendation, and clustering applications.
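The usual similarity measure is cosine similarity between embedding vectors; a minimal example with made-up three-dimensional vectors (real embeddings from a model have hundreds or thousands of dimensions):

    import numpy as np

    def cosine_similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    cat = np.array([0.80, 0.10, 0.30])       # toy embeddings, not produced by a real model
    kitten = np.array([0.70, 0.20, 0.35])
    car = np.array([0.10, 0.90, 0.05])

    print(cosine_similarity(cat, kitten))    # close to 1: semantically similar
    print(cosine_similarity(cat, car))       # much lower: semantically distant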

24. Fine-tuning vs. Training from Scratch

  • Fine-tuning: Adapting a pre-trained model to a specific task by training it further on a smaller dataset.
  • Training from Scratch: Training a model from an uninitialized state, usually requiring extensive data and computing power.

25. OpenAI API

OpenAI's hosted API, which provides access to advanced language models such as GPT-4o. It allows developers to build applications that leverage LLM capabilities without training and hosting their own models.
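A minimal call with the official Python client, assuming the openai package (v1 or later) is installed, the OPENAI_API_KEY environment variable is set, and noting that available model names change over time:

    from openai import OpenAI

    client = OpenAI()                                   # reads the key from OPENAI_API_KEY
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Explain RAG in one sentence."}],
    )
    print(response.choices[0].message.content)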

