Posts

Showing posts with the label AI

Stop Overengineering: Why Test IDs Beat AI-Powered Locator Intelligence for UI Automation

  We have all read the blogs. We have all seen the charts showing how Generative AI can "revolutionize" test automation by magically resolving locators, self-healing broken selectors, and interpreting UI changes on the fly. Many articles paint a compelling picture of a future where tests maintain themselves. Cool story. But let’s take a step back. Why are we bending over backward to make tests smart enough to deal with ever-changing DOMs when there's a simpler, far more sustainable answer staring us in the face? Just use Test IDs. That’s it. That’s the post. But since blogs are supposed to be more than one sentence, let’s unpack this a bit. 1. Test IDs Never Lie (or Change) Good automation is about reliability and stability. Test IDs, such as data-testid="submit-button", are predictable. They don’t break when a developer changes the CSS class, updates the layout, or renames an element. You know...
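  To make the stability claim concrete, here is a minimal sketch using only Python's standard library. The two markup snippets and the helper find_tags are hypothetical, invented for illustration; a real suite would use its automation framework's own locator API. The sketch looks up the same button before and after a redesign, once by CSS class and once by test ID.

```python
from html.parser import HTMLParser


class AttrFinder(HTMLParser):
    """Collects the tag names of elements whose attribute contains a given token."""

    def __init__(self, attr, value):
        super().__init__()
        self.attr = attr
        self.value = value
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Split on whitespace so class="btn btn-primary" matches "btn-primary".
        tokens = (dict(attrs).get(self.attr) or "").split()
        if self.value in tokens:
            self.matches.append(tag)


def find_tags(html, attr, value):
    """Hypothetical helper: return tag names of all elements matching attr/value."""
    finder = AttrFinder(attr, value)
    finder.feed(html)
    return finder.matches


# Hypothetical markup before and after a redesign: the class, the layout,
# and the label all change, but the test ID stays the same.
BEFORE = '<form><button class="btn btn-primary" data-testid="submit-button">Send</button></form>'
AFTER = '<form><div class="actions"><button class="cta-large" data-testid="submit-button">Submit</button></div></form>'

class_before = find_tags(BEFORE, "class", "btn-primary")
class_after = find_tags(AFTER, "class", "btn-primary")
testid_before = find_tags(BEFORE, "data-testid", "submit-button")
testid_after = find_tags(AFTER, "data-testid", "submit-button")

print("class locator:  ", class_before, "->", class_after)
print("test-id locator:", testid_before, "->", testid_after)
```

  The class-based lookup finds nothing after the redesign, while the test-ID lookup still finds the button, and no locator "intelligence" was needed to get there.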

Glossary of Key Generative AI Terms

  1. Generative AI: A subset of artificial intelligence focused on creating new content, such as text, images, audio, video, or code. It relies on models trained on vast datasets to identify patterns and generate similar but unique outputs.
  2. Large Language Model (LLM): An AI model trained to process and generate human-like text. Examples include GPT (Generative Pre-trained Transformer), BERT, and LaMDA, all of which leverage deep learning architectures, specifically transformers.
  3. Transformer: A neural network architecture known for its ability to process sequential data, like text. Transformers use self-attention mechanisms, enabling the model to learn relationships between words and capture long-range dependencies efficiently.
  4. Retrieval-Augmented Generation (RAG): A hybrid approach combining retrieval-based and generative AI techniques. RAG models retrieve relevant information from external sources (e.g., databases or documents) and then use this information to generate contex...
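  The self-attention mechanism mentioned in the Transformer entry can be sketched in a few lines of plain Python. This is a simplified illustration, not how any particular model implements it: it assumes the queries, keys, and values are the raw token vectors themselves (real transformers apply learned projections first), and the three toy token vectors are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (a simplification)."""
    d = len(x[0])
    weights = []
    for query in x:
        # Score this token against every token in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        weights.append(softmax(scores))
    # Each output vector is a weighted average of all token vectors.
    output = [
        [sum(w * value[j] for w, value in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return weights, output


# Three made-up 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn_weights, attended = self_attention(tokens)

for row in attn_weights:
    print([round(w, 3) for w in row])
```

  Every token attends to every other token in one step, regardless of distance in the sequence, which is what lets the architecture capture long-range dependencies.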

LlamaParse: Incredibly good at parsing PDFs

  What is LlamaParse? LlamaParse is a proprietary parsing service that is incredibly good at parsing PDFs with complex tables into a well-structured markdown format. It integrates directly with LlamaIndex ingestion and retrieval to let you build retrieval over complex, semi-structured documents. It promises to make it possible to answer complex questions that were previously out of reach. The service is in public preview: it is available to everyone, with a usage limit of 1,000 pages per day and 7,000 free pages per week; beyond that, it costs $0.003 per page ($3 per 1,000 pages). It operates as a standalone service that can also be plugged into the managed ingestion and retrieval API. Currently, LlamaParse primarily supports PDFs with tables, but the team is also building out better support for figures and an expanded set of the most popular document types (.docx, .pptx, .html) as part of the next enhancements. Code Implementation: Install required dependencies: a) Create requirements.txt in t...
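  As a quick sanity check on the pricing quoted above, here is a small cost estimate. It is a sketch under stated assumptions: that the 7,000 free pages reset each week, that only pages beyond that allowance are billed at $0.003, and that the 1,000-pages-per-day cap is ignored. The function name weekly_cost_usd is made up for illustration, and actual billing rules may differ.

```python
from decimal import Decimal

# Figures quoted in the post; how they combine is an assumption.
FREE_PAGES_PER_WEEK = 7_000
PRICE_PER_PAGE_USD = Decimal("0.003")


def weekly_cost_usd(pages_parsed):
    """Estimated cost for one week, assuming only pages past the free allowance are billed."""
    billable_pages = max(0, pages_parsed - FREE_PAGES_PER_WEEK)
    return billable_pages * PRICE_PER_PAGE_USD


for pages in (5_000, 7_000, 8_000, 20_000):
    print(f"{pages:>6} pages -> ${weekly_cost_usd(pages):.2f}")
```

  Decimal is used instead of float so that per-page prices in thousandths of a dollar add up exactly.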