TECH

June 12, 2026

SPRING DATA JPA: STREAMLINING DATA ACCESS

In modern web application development, interacting with a database is a core requirement. Spring Boot integrates seamlessly with Spring Data JPA, providing a robust way to manage data persistence with significantly less code. This article explains what Spring Data JPA is, how it works, how to use it, and what benefits it offers.

I. What is Spring Data JPA?

Spring Data JPA is a framework that helps you:

  • Simplify Database Operations: Perform CRUD (Create, Read, Update, Delete) operations without writing SQL by hand.
  • Eliminate Boilerplate: Remove the need for hand-written DAO (Data Access Object) implementation classes.
  • Generate Queries Automatically: Create database queries simply by defining method names in an interface.
  • Support Pagination and Sorting: Manage large datasets efficiently with built-in tools.

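When you add the spring-boot-starter-data-jpa dependency, Spring Boot auto-configures the data layer for you, including the DataSource, the JPA EntityManagerFactory (backed by Hibernate by default), and a transaction manager.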

II. How Does Spring Data JPA Work?

Spring Data JPA acts as an abstraction layer on top of JPA (Java Persistence API), which is implemented by a provider such as Hibernate (the default in Spring Boot):

  • The Repository Pattern: It uses interfaces to provide a standard way to access data.
  • Proxy Mechanism: At runtime, Spring Data automatically creates a proxy implementation for your repository interfaces.
  • Method Name Parsing: When you define a method like findByEmail(String email), the framework parses the name and derives the corresponding query (equivalent to JPQL, the Java Persistence Query Language), which the JPA provider then translates into SQL.
  • Transaction Management: The built-in repository methods run inside transactions by default, which helps protect data integrity. For operations that span several repository calls, you can declare the boundary yourself with @Transactional.
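To make the method-name parsing idea more concrete, here is a toy sketch written in Python purely for brevity. It is not Spring Data's actual parser: the function name derive_jpql is hypothetical, and it only handles names of the form findBy<Property>, optionally joined with And.

```python
import re


def derive_jpql(method_name: str, entity: str) -> str:
    """Toy translation of a findBy... method name into a JPQL-like string."""
    if not method_name.startswith("findBy"):
        raise ValueError("only findBy... names are handled in this sketch")
    criteria = method_name[len("findBy"):]
    # Split "NameAndPrice" into ["Name", "Price"]
    properties = re.split(r"And(?=[A-Z])", criteria)
    alias = entity[0].lower()
    conditions = [
        f"{alias}.{prop[0].lower() + prop[1:]} = ?{i}"
        for i, prop in enumerate(properties, start=1)
    ]
    return f"SELECT {alias} FROM {entity} {alias} WHERE " + " AND ".join(conditions)


print(derive_jpql("findByEmail", "User"))
# SELECT u FROM User u WHERE u.email = ?1
```

The real framework supports many more keywords (such as Or, Between, and OrderBy) and validates property names against the entity, which this sketch does not attempt.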

III. How to Use Spring Data JPA

1. Add Dependency

Maven:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
  <groupId>com.h2database</groupId>
  <artifactId>h2</artifactId>
  <scope>runtime</scope>
</dependency>

 

Gradle:

implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
runtimeOnly 'com.h2database:h2'

 

2. Define an Entity

An Entity represents a table in your database:

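// jakarta.persistence applies to Spring Boot 3 and later; Spring Boot 2 uses javax.persistence
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;

@Entity
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY) // the database generates the key
    private Long id;

    private String name;

    private Double price;

    // Getters and Setters
}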



3. Create a Repository Interface

By extending JpaRepository, you gain access to all standard CRUD operations.

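import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;

@Repository
public interface ProductRepository extends JpaRepository<Product, Long> {

    // Derived from the method name, roughly: SELECT p FROM Product p WHERE p.name = ?1
    List<Product> findByName(String name);

    // Custom JPQL query using @Query
    @Query("SELECT p FROM Product p WHERE p.price > :minPrice")
    List<Product> findExpensiveProducts(@Param("minPrice") Double minPrice);
}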

 

IV. Example of Spring Data JPA in Action

Managing products in a Service layer:

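import java.util.List;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

    private final ProductRepository productRepository;

    // Constructor injection: Spring passes in the generated repository proxy
    public ProductService(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    public Product saveProduct(Product product) {
        return productRepository.save(product); // Inserts a new entity or updates an existing one
    }

    public List<Product> getProductsByName(String name) {
        return productRepository.findByName(name);
    }
}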


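When you call productRepository.findAll(PageRequest.of(0, 10)), Spring Data JPA returns the first page of ten products as a Page<Product> and handles the SQL paging clauses (such as "LIMIT" and "OFFSET", depending on the database dialect) behind the scenes.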
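The arithmetic behind paging is simple enough to sketch. The snippet below is an illustration in Python, not Spring Data code: the helper names page_to_limit_offset and total_pages are hypothetical, and the exact SQL a database receives depends on its dialect.

```python
def page_to_limit_offset(page: int, size: int) -> tuple[int, int]:
    """Translate a zero-based page index and page size into (LIMIT, OFFSET)."""
    if page < 0 or size < 1:
        raise ValueError("page must be >= 0 and size must be >= 1")
    return size, page * size


def total_pages(total_rows: int, size: int) -> int:
    """Number of pages needed to hold total_rows rows (ceiling division)."""
    return (total_rows + size - 1) // size


limit, offset = page_to_limit_offset(0, 10)
print(f"LIMIT {limit} OFFSET {offset}")  # LIMIT 10 OFFSET 0
print(total_pages(95, 10))               # 10
```

Page indexes are zero-based, as in PageRequest.of(0, 10), so the third page of ten rows starts at offset 20.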

V. Benefits of Spring Data JPA

  • Development Speed: Write only interfaces; let the framework handle the implementation.
  • Reduced Errors: Automatic query generation prevents common syntax errors in SQL.
  • Easy Pagination: Built-in support for Pageable and Sort makes UI integration simple.
  • Vendor Independence: Easily switch between different databases (H2, MySQL, PostgreSQL) with minimal configuration changes.
  • Seamless Integration: Works well with Spring Boot Auto-Configuration and with other Spring modules such as Spring Security for data-level protection.

VI. Conclusion

Spring Data JPA allows you to manage your application's data layer with elegance and efficiency. It removes the burden of repetitive data access code, allowing developers to focus on building features rather than debugging SQL strings.

Whether you need scalable software solutions, expert IT outsourcing, or a long-term development partner, ISB Vietnam is here to deliver. Let’s build something great together. Reach out to us today.



June 12, 2026

Upload CSV to AWS S3 with React/Next.js

Upload CSV to AWS S3 with React: Setup and Code

Many data applications rely on sync jobs, so letting users send CSV files straight to S3 is a sensible design. This setup separates the UI from background tasks, and direct uploads improve both speed and reliability.

This guide shows how direct transfers work, looks at common setups and security rules, and covers CORS configuration and the React code.


June 12, 2026

From AZ-900 to Real Understanding: Understanding Microsoft Azure Benefits

After completing the AZ‑900 certification, I realized that Microsoft Azure is far more than a cloud platform—it’s a well‑designed ecosystem built to support modern applications at scale. In this article, I’ll introduce Azure using the same structured framework I studied for AZ‑900, but with a stronger focus on practical, real‑world understanding from a developer’s perspective.

I. Understanding Cloud Concepts Before Using the Cloud

Before deploying anything, it is essential to understand what cloud computing actually provides.

Microsoft Azure is a cloud computing platform developed by Microsoft that offers on-demand computing resources over the internet. Instead of managing physical servers, developers provision infrastructure and services dynamically.

Service Models

Azure services are generally categorized into three cloud models:

  • IaaS (Infrastructure as a Service)
    Infrastructure as a service gives you maximum control over your cloud environment because you manage almost everything except the physical hardware. The provider handles the datacenter equipment, internet connectivity, and on‑site security, while you take charge of installing and maintaining operating systems, configuring networks, setting up storage and databases, and managing applications. In practice, it’s like renting servers and networking gear in someone else’s datacenter, with full freedom to decide how those resources are used.
    Example: Running a Linux server on Azure Virtual Machines.
  • PaaS (Platform as a Service)
    Platform as a service can be described as a cloud model that sits between IaaS and SaaS, giving you a managed environment for building and running applications without dealing with the underlying system layers. The provider takes care of the physical servers, security, networking, and also the software stack that supports development—such as operating systems, middleware, runtime environments, and analytics tools. Because these layers are handled for you, you don’t need to manage licenses, updates, or patches for the OS or databases.
    Example: Deploying a web application on Azure App Service.
  • SaaS (Software as a Service)
    Software as a service refers to using a fully built, ready‑to‑use application that runs in the cloud. Instead of installing or managing the software yourself, you simply access it; common examples include email services, accounting tools, messaging platforms, and collaboration apps. Because everything is already developed and maintained by the provider, you’re essentially subscribing to a complete product. Even though SaaS offers the least customization, it’s also the simplest and fastest model to adopt. It requires minimal technical skill because all updates, maintenance, and infrastructure responsibilities are handled for you.
    Example: Microsoft 365.

II. Networking and Storage

Azure Networking

Azure’s networking services provide the core connectivity that allows your cloud resources to communicate securely and efficiently. At the foundation are services like Virtual Networks (VNets), Private Link, Azure DNS, Bastion, Route Server, NAT Gateway, and Traffic Manager, which together create a customizable and secure network environment for your applications. These tools let you isolate workloads, manage routing, control inbound and outbound traffic, and connect on‑premises networks to Azure.

Beyond basic connectivity, Azure also offers load balancing and content delivery capabilities—such as Load Balancer, Application Gateway, and Azure Front Door—to distribute traffic, improve performance, and ensure high availability. These services help optimize how applications respond to user requests, whether they’re internal workloads or global web applications.

Security is built into the networking layer through features like network security groups, firewalls, and private endpoints, allowing you to tightly control which resources can communicate and how that communication happens.

Azure Storage

Azure Storage is Microsoft’s cloud‑based platform for storing and managing data at massive scale. It’s designed to be highly available, durable, secure, and globally accessible, making it suitable for everything from simple file storage to large‑scale analytics workloads. Azure Storage supports multiple data types and offers tools that developers and IT teams can use from anywhere via HTTP/HTTPS and REST APIs.

Core Characteristics:

  • Massive scalability — Designed to grow with your data needs, from gigabytes to petabytes.
  • High durability and availability — Multiple copies of your data are stored to protect against failures.
  • Strong security — Encryption, network isolation, and identity-based access controls are built in.
  • Global accessibility — Data can be accessed from anywhere over secure HTTP/HTTPS endpoints.
  • Developer-friendly — Supports REST APIs and client libraries for .NET, Java, Python, JavaScript, C++, and Go.

Azure maintains extra copies of your data to ensure availability and durability, even when failures occur. These failures can include hardware issues, power or network outages, or natural disasters. Choosing a redundancy option is a balance between cost, performance, and resilience.

Redundancy within the primary region

These options keep your data inside a single Azure region.

  • Locally Redundant Storage (LRS): Your data is copied three times within a single datacenter. It’s the most cost‑effective option but doesn’t protect against a full datacenter outage.
  • Zone‑Redundant Storage (ZRS): Your data is stored across three separate availability zones within the same region. This protects against datacenter‑level failures while staying within one region.

Redundancy across regions

These options replicate your data to a geographically distant secondary region.

  • Geo‑Redundant Storage (GRS): Your data is stored three times in the primary region (like LRS) and then copied to a secondary region for disaster recovery.
  • Read‑Access Geo‑Redundant Storage (RA‑GRS): Same as GRS, but you can read from the secondary region. This improves availability during regional outages.

III. Security and Identity

Azure security and identity in the cloud revolve around protecting access to resources through strong authentication, authorization, and continuous threat-aware controls. At the center of this model is Microsoft Entra ID, which provides identity management, single sign‑on, multifactor authentication, and role‑based access control to ensure that only the right people and applications can reach the right resources.

Security Concepts in Azure

Azure security concepts center on protecting identities, data, applications, and infrastructure through a multilayered, defense‑in‑depth approach. This model combines built‑in platform protections, shared responsibility between Microsoft and customers, and advanced security services that detect and respond to threats. Azure emphasizes securing every layer—from physical datacenters to identities, networks, and workloads—because cloud environments face constantly evolving cyber risks.

Defense in depth

Azure applies multiple layers of protection across physical, network, identity, application, and data layers. If one layer is compromised, others continue to protect the environment. This includes secure datacenters, network segmentation, identity controls, encryption, and monitoring.

The Zero Trust model

Zero Trust treats every network—internal or external—as untrusted, so no user or device is assumed safe by default. It follows the idea of “never trust, always verify,” meaning every access request must be authenticated, authorized, and continuously validated before anything is granted.

Data protection

One way to mitigate common cybersecurity threats is to encrypt sensitive or valuable data. Encryption is the process of making data unreadable and unusable to unauthorized viewers. To use or read encrypted data, it must be decrypted, which requires the use of a secret key.

Shared responsibility model

Microsoft secures the physical infrastructure, hosts, and foundational services, while customers secure their identities, data, applications, and configurations. Understanding this division is essential for building a secure cloud environment.

Core Identity Concepts in Azure

Identity management

Identity management ensures that every user, device, or application accessing Azure resources is properly authenticated and authorized. Microsoft Entra ID acts as the cloud identity provider, extending on‑premises Active Directory to the cloud and enabling unified access across thousands of SaaS and on‑premises applications.

Single sign‑on (SSO)

SSO allows users to sign in once and access multiple applications without repeatedly entering credentials. This reduces password fatigue and improves security by minimizing exposed credentials. Entra ID supports SSO for a wide range of cloud and on‑premises apps.

Multifactor authentication (MFA)

MFA adds a second verification step—such as an authenticator app, biometric sign‑in, or security key—to strengthen protection against unauthorized access. It provides a critical extra layer of defense while keeping the sign‑in experience smooth.

Role‑based access control (RBAC)

RBAC assigns permissions based on roles rather than individual accounts, ensuring users only have the access they need. This supports the principle of least privilege and helps reduce accidental or malicious misuse of resources.
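A minimal sketch of the role-based idea, written in Python as an illustration only. The role names echo Azure's built-in Reader, Contributor, and Owner roles, but the permission strings, the users, and the is_allowed helper are hypothetical and do not reflect how Azure stores or evaluates role assignments.

```python
# Hypothetical permissions per role; real Azure roles use their own action strings.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "contributor": {"read", "write"},
    "owner": {"read", "write", "assign_roles"},
}

# Users are assigned roles, never individual permissions.
USER_ROLES = {"alice": {"owner"}, "bob": {"reader"}}


def is_allowed(user: str, action: str) -> bool:
    """Return True if any role assigned to the user grants the action."""
    return any(action in ROLE_PERMISSIONS[role] for role in USER_ROLES.get(user, set()))


print(is_allowed("bob", "read"))   # True
print(is_allowed("bob", "write"))  # False
```

Granting bob write access means changing his role assignment rather than editing a per-user permission list, which is what keeps least privilege manageable as teams grow.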

Additional security capabilities

Azure identity and security services also support modern frameworks such as Zero Trust and conditional access, which evaluate user identity, device health, location, and risk signals before granting access. These approaches help organizations defend against evolving threats.

IV. Pricing Model and Operational Flexibility

Azure Pricing Model

Pay‑as‑you‑go

You are billed based on actual consumption with no upfront commitment. This is ideal for workloads that change frequently because you can scale resources up or down on demand.

Reserved capacity

You commit to using a service (such as virtual machines or databases) for one or three years in exchange for a lower price. This is best for predictable, always‑on workloads.

Spot pricing

You use unused Azure capacity at a steep discount, with the understanding that Azure can reclaim the resources at any time. This works well for batch jobs, testing, or workloads that can tolerate interruptions.
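The trade-off between pay-as-you-go and reserved capacity comes down to simple arithmetic. The sketch below uses made-up hourly rates (they are NOT real Azure prices) and hypothetical helper names, only to show where the break-even point comes from.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month

# Hypothetical hourly rates for illustration, NOT real Azure prices.
PAYG_RATE = 0.10
RESERVED_RATE = 0.06  # billed for every hour of the term, used or not


def monthly_cost(hourly_rate: float, hours: float) -> float:
    """Cost of running for the given number of hours at the given rate."""
    return round(hourly_rate * hours, 2)


def cheaper_option(hours_used: float) -> str:
    """Compare paying only for hours used against paying the reserved rate all month."""
    payg = monthly_cost(PAYG_RATE, hours_used)
    reserved = monthly_cost(RESERVED_RATE, HOURS_PER_MONTH)
    return "pay-as-you-go" if payg < reserved else "reserved"


print(cheaper_option(200))  # pay-as-you-go (a VM that runs a few hours a day)
print(cheaper_option(730))  # reserved (an always-on VM)
```

With these assumed rates, the reservation only pays off once the machine runs for more than about 438 hours a month, which is why reserved capacity suits predictable, always-on workloads.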

What affects cost

  • Service type (compute, storage, networking)
  • Region where the service runs
  • Performance tier
  • Data transfer
  • Duration of usage

These factors allow organizations to tailor spending to their technical and financial goals.

Operational Flexibility in Azure

Elastic scalability

Many Azure resources can be configured to scale automatically based on demand. This prevents overprovisioning and reduces wasted cost.

Global deployment

Azure’s worldwide datacenter network lets you run applications close to users, improving performance and offering redundancy options.

Multiple service tiers

Most Azure services offer different performance levels, allowing you to choose between cost‑optimized or high‑performance configurations.

Cost management tools

Azure provides budgeting, monitoring, and optimization tools to help track spending and identify savings opportunities.

V. What AZ-900 Changed in My Perspective

Studying for AZ-900 was not just about passing an exam. It helped me structure cloud knowledge into:

  1. Concepts
  2. Services
  3. Security
  4. Cost management

More importantly, it shifted my mindset from “managing servers” to “designing scalable systems.”

VI. Conclusion

Microsoft Azure is not just a collection of cloud services. It is a comprehensive ecosystem designed to support modern software architectures — from startups to global enterprises.

For developers transitioning from traditional infrastructure or embedded systems to cloud-native environments, Azure provides a structured path forward.

Earning AZ-900 was only the beginning. The real value comes from applying these concepts in real-world architectures.

Ready to get started?

Contact IVC for a free consultation and discover how we can help your business grow online.




May 29, 2026

My First Steps from Manual to Automation QC

After more than 1.5 years working in Manual Testing, I realized that for large-scale systems with long-term development cycles, Regression Testing is a major bottleneck. Manually re-executing test cases is not only repetitive and time-consuming but also highly prone to human error. That is when I knew Automation Testing was the inevitable next step.

If you are also starting as a Manual QC like me, don't worry—it is actually a huge advantage. Here is my journey and some tips from when I first started learning Automation with Playwright using Python.

1. What to Prepare Before Starting

Test Case Writing Skills (The Foundation)

This is an essential skill that every QC needs, whether you do manual or automation testing.

  • Proactiveness: If you write your own Test Cases, you will have a better grasp of the requirements (specs). When converting them into scripts, you might discover inconsistencies or gaps in the test cases, allowing you to optimize and update them quickly. If you rely solely on existing Test Cases to write scripts, you might spend too much time trying to understand them. Also, you can't be sure if those Test Cases are still accurate according to the design specs. This makes your scripting process quite passive.
  • Know what to automate: Not everything should be automated. Deep business understanding helps in deciding which cases to automate (high repetition, stable) and which to test manually (frequent UI changes, overly complex logic). This saves a lot of wasted effort.

Basic Knowledge

  • Programming Language: You don't need to be an expert developer. Start with the core concepts: variables, data types, operators, loops, conditional statements (if/else), and functions. Additionally, grasping basic Object-Oriented Programming (OOP) concepts (like Classes and Objects) is highly recommended. Having basic knowledge of a programming language will make the learning curve smoother when approaching a new test scripting language. You can start with Python because its syntax is very close to natural language, making it extremely easy for beginners to learn.
  • Web Knowledge: Explore the HTML/DOM structure and various element attributes to craft effective locators (such as CSS Selectors and XPath). Master the use of browser DevTools (F12) to inspect elements, debug, and monitor Network requests. Additionally, understanding how web pages load content asynchronously (Promises and async/await in JavaScript) and element states (e.g., visible, enabled) is crucial for leveraging the Auto-wait feature, ensuring that scripts run stably.
  • Version Control (Git): Automation test code is as important as the application's source code. Knowing how to use a Version Control System is a must. You don't need to memorize advanced commands right away. Instead, just get used to basic daily tasks: cloning a repository, creating branches for new test cases, saving your changes (commit), and syncing (pull/push) with platforms like GitHub or GitLab. Knowing these basics will make sure you can manage your script versions safely, track your own changes, and work smoothly with other QCs and Developers without code conflicts.

2. Learning Playwright: Key Concepts

When I started with Playwright, these were the most important things I learned:

Project Setup Commands

Before writing any code, you need to install the Playwright library and the browsers it uses to run the tests. In Python, you just need to type these two simple commands into the terminal:

  • pip install pytest-playwright: This command installs the Playwright framework along with pytest (a very popular testing tool in Python).

  • playwright install: This command downloads the necessary browser engines (like Chromium, Firefox, and WebKit) so Playwright can open them and simulate user actions.

Test Execution Commands 

After writing your test scripts, you need to run them to see if they Pass or Fail. Here are the commands you might use every day:

  • pytest: This command runs all your test files silently in the background (this is called Headless mode - no UI is shown, and it is the default setting).

  • pytest --headed: This command runs the tests and actually opens the browser window for you to see. This is super helpful for beginners, as it lets you watch the bot clicking and typing on the screen with your own eyes.

  • pytest test_login.py: This command only runs one specific test file (for example, the test_login.py file) instead of making the computer run all the files.

Locators (How to find elements)

In the past, QC engineers heavily relied on XPath or CSS Selectors, which can break easily if the DOM structure changes. While Playwright still supports them, it strongly recommends using User-facing Locators—finding elements based on how a user actually perceives them on the screen:

  • get_by_role: Locates elements by their ARIA role (e.g., button, textbox, link) and accessible name.

    Python example: page.get_by_role("button", name="Submit").click()

  • get_by_text: Locates elements by the text displayed on the screen. By default it matches a substring and ignores case; pass exact=True to require an exact match.

    Python example: page.get_by_text("Forgot password").click()

  • get_by_label: Locates input fields based on their associated Label (for example, an "Email" field).

    Python example: page.get_by_label("Email").fill("test@example.com")

Auto-waiting

This is a "magic" feature. Before performing an action such as a click, Playwright automatically waits until the target element is ready (for example, visible and enabled). You don't have to manually write "wait 5 seconds" anymore, which makes your tests much more stable.
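To see why waiting for a condition beats a fixed sleep, here is a conceptual sketch using only the Python standard library. It is not Playwright's implementation; the helper name wait_until and the simulated "button" are assumptions for illustration.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll condition until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated button that becomes ready 0.3 seconds from now.
ready_at = time.monotonic() + 0.3
print(wait_until(lambda: time.monotonic() >= ready_at))  # True
```

A fixed five-second sleep would always cost five seconds, and would still fail if the page took six. Polling returns as soon as the condition holds and only gives up when the timeout is reached.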

Assertions (Checking the results)

To check if a test Passed or Failed, we use the expect function.

One of the most powerful features here is auto-retrying assertions. You don't need to write complex code to wait for an element to load. Playwright is smart enough to automatically wait and retry the check until the condition becomes true (or until the timeout is reached).

For example:  

  • Checking if a success message is visible on the screen:

    Python example: expect(page.get_by_text("Login Successful")).to_be_visible() 

  • Checking if an input field contains exactly the expected value:

    Python example: expect(page.get_by_label("Email Address")).to_have_value("test@example.com")

  • Checking if a specific element is disabled (cannot be clicked):

    Python example: expect(page.get_by_role("button", name="Submit")).to_be_disabled()

Read more at: https://playwright.dev/python/docs/test-assertions

3. Automation Support Tools - Work Smarter

Below are the tools that will help you work faster when coding automation test scripts:

  • Codegen: This is a "magical" tool for automatic code generation. You simply interact with the web interface, and Playwright automatically records and converts your actions into source code. It is an excellent way to learn syntax during the early stages.
  • AI Tools (like Cursor): AI is a powerful asset that can speed up code writing based on Test Cases. By crafting a precise prompt in your desired format and providing the Test Case for reference, the AI will quickly generate the test script. Your remaining tasks are to review, run it with Pytest, and refine the logic, saving you the time of coding line by line.
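  • UI Mode & Trace Viewer: Don't panic when a script fails. Trace Viewer allows you to replay the "movie" of your test run, inspecting exactly where it failed and seeing the UI state at that moment to find clues for a faster fix. With pytest, you can record a trace by running pytest --tracing on and open it with playwright show-trace. (UI Mode is part of Playwright's Node.js test runner.)
  • Reports: Playwright's Node.js test runner automatically creates an HTML report with full statistics: Passed, Failed, Flaky (intermittent), and so on. With pytest, a plugin such as pytest-html can produce a similar report, and pytest-playwright can capture screenshots for each case if configured (for example, with the --screenshot option).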

4. Next Steps

Once you master these basics, the next challenge is organizing your code. I highly recommend looking into the Page Object Model (POM)—a design pattern that will make your scripts scalable, readable, and much easier to maintain as your project grows.
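As a first taste of the pattern, here is a minimal sketch of a page object. The class name LoginPage and the labels "Email", "Password", and "Log in" are hypothetical; get_by_label and get_by_role are the Playwright locator methods described earlier. The FakePage and FakeLocator classes are stand-ins so the sketch runs without a browser; in a real test you would pass in Playwright's page instead.

```python
class LoginPage:
    """Page object: keeps the locators for the login screen in one place."""

    def __init__(self, page):
        self.page = page

    def login(self, email: str, password: str) -> None:
        self.page.get_by_label("Email").fill(email)
        self.page.get_by_label("Password").fill(password)
        self.page.get_by_role("button", name="Log in").click()


class FakeLocator:
    """Stand-in locator that records the actions performed on it."""

    def __init__(self, actions, target):
        self.actions = actions
        self.target = target

    def fill(self, value):
        self.actions.append(("fill", self.target, value))

    def click(self):
        self.actions.append(("click", self.target))


class FakePage:
    """Stand-in for a Playwright page, used only to make this sketch runnable."""

    def __init__(self):
        self.actions = []

    def get_by_label(self, text):
        return FakeLocator(self.actions, f"label={text}")

    def get_by_role(self, role, name=None):
        return FakeLocator(self.actions, f"role={role}[{name}]")


fake = FakePage()
LoginPage(fake).login("test@example.com", "secret")
print(fake.actions)
```

If the login form changes, only LoginPage needs updating; every test that calls login() keeps working unchanged.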

Final Thoughts

And that’s everything I’ve learned and prepared while getting started with automation testing.

The journey from Manual to Automation isn’t as scary as I first thought. As long as you have a solid foundation in Manual testing, plus a bit of patience with modern tools like Playwright, you’ll find the work becomes much more exciting and rewarding.

Good luck to you (and to me too!) as we push further on this Automation QC path!


April 23, 2026

Understanding AI Terminology (Part 1)

In today’s IT world, we are surrounded by AI talk. Whether you are a developer, a project manager, or an IT translator, understanding these concepts is no longer optional; it is essential. However, the technical jargon can be overwhelming. Let’s break down the most important AI terms into simple, real-world ideas.

The Big Picture: AI, ML, and Deep Learning


To understand how AI is built, imagine a set of Russian Dolls (Matryoshka) where one sits inside the other.

The largest, outermost doll is Artificial Intelligence (AI). This is the broad goal of creating machines that can mimic human intelligence. In the early days, this was done using fixed rules. Think of a chess-playing robot from the 90s; it didn't "learn" anything, it just followed a long list of "If-Then" instructions written by a human.

Inside that is the middle doll: Machine Learning (ML). This is a smarter way to reach the goal of AI. Instead of writing every rule, we give the machine a massive amount of data and let it find patterns on its own. A classic example is a Spam Filter. You show the system 10,000 "Spam" emails and 10,000 "Real" emails. The machine eventually notices that spam often contains words like "FREE" or "WINNER" and starts blocking them automatically without being told exactly what to look for.
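The spam filter idea can be sketched in a few lines. This is a toy word-counting model with four made-up emails, not a production technique; the function names train and predict are assumptions for illustration.

```python
from collections import Counter


def train(examples):
    """Count how often each word appears in spam and in real mail."""
    counts = {"spam": Counter(), "real": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Label text by the class in which its words were seen more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    real_score = sum(counts["real"][w] for w in words)
    return "spam" if spam_score > real_score else "real"


examples = [
    ("FREE prize WINNER click now", "spam"),
    ("you are a WINNER claim FREE gift", "spam"),
    ("meeting notes for the project", "real"),
    ("are you free for lunch tomorrow", "real"),
]
model = train(examples)
print(predict(model, "claim your FREE prize"))     # spam
print(predict(model, "project meeting tomorrow"))  # real
```

Nobody wrote a rule saying "FREE means spam". The counts came from the labeled examples, which is the essence of learning from data.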

Finally, the smallest doll at the center is Deep Learning (DL). This is a more advanced type of ML that uses multi-layered "Neural Networks," loosely inspired by the human brain, to handle very messy data. This technology is what powers the Face-Unlock feature on your phone. To recognize your face, even if you grow a beard, wear glasses, or get older, the AI needs the deep, brain-like power of DL to analyze thousands of tiny details in your features.

Moving to language, we often hear about Large Language Models (LLMs). These are specific types of AI, like ChatGPT or Gemini, that are trained on enormous amounts of text, much of it from the internet. You can think of an LLM as a super-advanced "Auto-complete." If you type "The capital of France is...", it predicts the next word is "Paris" simply because it has seen that pattern millions of times before.
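The "auto-complete" intuition can be shown with a tiny next-word predictor that only counts which word follows which. Real LLMs use neural networks over tokens rather than word counts, so treat this as an analogy; the three-sentence corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count which word follows each word in the training sentences."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    candidates = following.get(word.lower())
    return candidates.most_common(1)[0][0] if candidates else None


corpus = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is beautiful",
]
model = train_bigrams(corpus)
print(predict_next(model, "is"))  # paris
```

The model answers "paris" only because that continuation appeared most often in its training text, not because it knows any geography.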

How AI Goes to School


Before an AI can work, it must undergo a rigorous learning process. This starts with Annotation (or Data Labeling), which we can think of as the "Teacher Phase." AI doesn't inherently understand the world; it needs to be told exactly what it’s looking at. Humans, known as Annotators, must look at thousands of data points and "tag" them.

For example, for a Self-Driving Car to function, humans must manually draw boxes around objects in street photos, labeling them as "Tree," "Stop Sign," or "Pedestrian." This provides the AI with "Ground Truth"—the factual foundation it needs to perceive reality. Without this meticulous human help, the AI is essentially blind.

Once an AI has gained broad general knowledge, we can give it "extra lessons" through a process called Fine-tuning. Instead of building a new model from scratch, we take a pre-trained general AI and show it a specific dataset, such as 5,000 legal contracts. Through this specialization, it stops being a general chatbot and becomes a Legal AI Specialist that handles the complex nuances of law far better. This approach is highly efficient, saving both time and massive computing costs.

However, a major challenge in AI is a problem called Overfitting. This happens when an AI learns the training data too perfectly: it memorizes the specific examples instead of learning the logic.

To understand this, let’s look at a simple example: Teaching an AI to recognize a "Bird."

Imagine you give the AI thousands of photos of birds to study. However, there is a small problem: every bird in your photos is red.

  • Balanced Learning (Correct): The AI looks at the wings, the beak, and the feathers. It understands that a bird is a creature with these specific features.
  • Overfitting (The Trap): The AI looks at the color and concludes: "Anything that is red is a bird."

When you test the AI with a photo of a blue bird, the AI will say: "This is NOT a bird" because it isn't red. The AI failed because it didn't learn the "logic" of what a bird is; it only memorized the "color" from your specific photos.

In the IT world, we want our AI to have Generalization—the ability to handle new, unseen situations correctly, rather than just memorizing old data.
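The red-bird story can be replayed in code. In this sketch the two "models" are hand-written rules standing in for what a trained model might latch onto; nothing here is actually trained, and the data and function names are made up for illustration.

```python
# Each example: (color, has_wings, has_beak, is_bird). Every bird in training is red.
training = [
    ("red", True, True, True),
    ("red", True, True, True),
    ("green", False, False, False),  # a frog
    ("gray", False, False, False),   # a rock
]


def overfit_model(color, has_wings, has_beak):
    """Memorized shortcut: only the colors seen on birds during training count."""
    bird_colors = {c for c, _, _, is_bird in training if is_bird}
    return color in bird_colors


def general_model(color, has_wings, has_beak):
    """Relies on the features that actually define a bird and ignores color."""
    return has_wings and has_beak


blue_bird = ("blue", True, True)
print(overfit_model(*blue_bird))  # False: fails on the unseen blue bird
print(general_model(*blue_bird))  # True: generalizes correctly
```

Both rules score perfectly on the training examples, so the difference only shows up on new data. That is why models are evaluated on examples they have never seen.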

Tokens, Memory, and the Art of "Confident Lying"

Tokens, Context Window, and Hallucination

Once we understand how AI learns, we need to look at how it actually "reads" our instructions. While we see words and sentences, the AI sees the world through Tokens. Think of tokens as the "atoms" of language: small fragments that the AI uses to turn our text into numbers it can calculate. A common word like "apple" might be just one token, but a longer one like "terminology" may get chopped into pieces such as termin, olo, and gy. The exact split depends on the tokenizer each model uses.
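To make the idea concrete, here is a toy tokenizer. It is only a sketch: the four-entry vocabulary is hypothetical, and production tokenizers learn tens of thousands of pieces from data. Even so, greedy longest-match splitting shows why a familiar word stays whole while a rarer one breaks apart.

```python
# Hypothetical vocabulary; real tokenizers learn their pieces from data.
VOCAB = {"apple", "termin", "olo", "gy"}


def tokenize(word, vocab=VOCAB):
    """Greedy longest-match split; unknown characters become single tokens."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until a piece matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # no piece matched: fall back to one character
            i += 1
    return tokens


print(tokenize("apple"))
print(tokenize("terminology"))
```

The more pieces a text breaks into, the more tokens it consumes, which is what the pricing discussion below builds on.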

This isn't just a technical detail; it’s the "currency" of AI. Most AI companies charge you based on how many tokens you send in and how many the AI spits out. Interestingly, for us in the IT world, this means language matters. Because of how AI is built, languages like Vietnamese often require more tokens than English to say the same thing, making the "cost" of a conversation slightly different depending on the language you use.

But these tokens don't just cost money; they also take up space. Every AI model works within what is called a Context Window, which you can think of as its short-term memory. Imagine the AI is a brilliant employee sitting at a very small desk. Every PDF you upload, every old message in the chat, and every instruction you give must fit on that desk for the AI to "see" it.

If your conversation gets too long and exceeds the "desk space," the AI has to start throwing the oldest papers into the trash to make room for new ones. This is exactly why, after a long brainstorming session, the AI might suddenly forget your name or the very first rule you set—it simply ran out of room on its desk.
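The "desk" can be sketched as a function that keeps only the newest messages fitting a token budget. This is one simple trimming strategy under stated assumptions, not how any particular product works: the word count stands in for a real token count, and the chat messages are invented.

```python
from collections import deque


def fit_to_window(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Keep the newest messages whose total token count fits in max_tokens."""
    window = deque()
    used = 0
    # Walk from newest to oldest so the most recent messages survive.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # everything older than this falls off the desk
        window.appendleft(message)
        used += cost
    return list(window)


chat = [
    "My name is Lan",
    "Always answer in English",
    "Summarize this report",
    "Now list three risks",
]
print(fit_to_window(chat, max_tokens=8))
```

With a budget of eight "tokens", the two oldest messages, including the user's name and the first rule, no longer reach the model.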

Nowadays, tech giants are racing to build "giant desks," expanding these context windows so AI can "read" an entire book series in one go. But even with a massive memory, AI has a famous flaw: it loves to tell "confident lies," a phenomenon we call Hallucination.

You see, at its heart, an AI is a high-speed probability engine. It doesn't actually check a "truth database" to see if a fact is real. Instead, it asks itself: "Given the words I've seen so far, what is the most likely next word?" If it doesn't have the exact answer, it won't simply say "I don't know" (unless we tell it to). Instead, it will keep predicting the next word to finish the sentence.
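A tiny word-pair counter shows the same behavior in miniature. It is a deliberately crude stand-in for a language model (real models use neural networks over tokens, not word counts), and the training sentence is invented. Notice that nothing in it checks whether the output is true.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def complete(follows, start, length):
    """Repeatedly append the most frequent next word; never checks facts."""
    words = [start]
    for _ in range(length):
        options = follows.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)


corpus = (
    "the function returns a list . "
    "the function returns a string . "
    "the method returns a list ."
)
model = train_bigrams(corpus)
print(complete(model, "the", 4))
```

The completion is fluent because it follows the most common patterns, which is the same reason a hallucinated answer can sound so convincing.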

This is how you get professional-sounding answers about non-existent coding functions that look perfect but return an undefined error the moment you run them. The AI prioritizes sounding logical and grammatically perfect over being factually true. For those of us working as Developers or PMs, this is the ultimate reminder: never trust numbers or specific names 100%. Always cross-reference and fact-check.

How do we fix this?

To stop these lies, we use RAG (Retrieval-Augmented Generation). AWS provides a great deep dive into this architecture. Think of this as an "Open Book Exam." Instead of letting the AI guess, a RAG system first searches a reliable source—like your company's handbook—and tells the AI: "Only answer based on this text." We also use Prompt Engineering (giving clear, detailed instructions) and Guardrails (safety fences that block dangerous or wrong answers) to keep the AI on the right track.

Quick Advice for IT Professionals

  • Don't trust, verify: Always fact-check names, dates, and code. AI is a master of sounding professional even when it’s wrong.
  • Be specific: Treat AI like a smart intern. The more context you give in your prompt, the better the result.
  • Watch the tokens: Keep your inputs clean to save costs and avoid hitting the memory limit.

Join the AI Revolution with ISB Vietnam

At ISB Vietnam, we are not just watching the AI revolution—we are leading it. We believe that AI is most powerful when handled by experts who understand its strengths and its "hallucinations." That’s why we are constantly training our developers to master AI tools, ensuring that our software solutions are not just fast, but smart and secure.

Whether you need scalable software solutions, expert IT outsourcing, or a long-term development partner, ISB Vietnam is here to deliver.

Are you a tech talent looking to work in an environment that embraces AI? We are always looking for passionate people to join our team and push the boundaries of what's possible.

Let’s build something great together—reach out to us today!

Image source: Generated by Gemini

TECH

April 23, 2026

File Handling in C++ via MapViewOfFile

Memory-mapped files are one of the most powerful features available to Windows C++ developers. At the center of this mechanism is MapViewOfFile, a function that allows you to treat file contents as if they were part of your program’s memory.

In this blog post, we’ll walk through a complete example and explain every handle involved — what it represents, why it exists, and how it fits into the Windows memory model.

What Exactly Is MapViewOfFile?

Memory-mapped files are a core part of the Windows memory management system. Instead of manually reading file data into buffers, Windows allows you to map a file directly into your process’s virtual memory. The function responsible for this is MapViewOfFile.

In simple terms:

MapViewOfFile lets you treat a file on disk as if it were an array in memory.

Once mapped, you can read (or write) file contents using normal pointer operations — no repeated calls to ReadFile, no manual buffering.

This mechanism is part of the Win32 API and works together with:

  • CreateFile
  • CreateFileMapping
  • UnmapViewOfFile

Why Memory-Mapped Files Exist

Traditional file I/O works like this:

  1. Request data from the OS
  2. OS copies file data into a buffer
  3. Your program reads from that buffer

Memory mapping removes extra copying. The operating system:

  • Maps file data into virtual memory
  • Loads pages only when accessed (on demand)
  • Uses the system cache efficiently
  • Allows sharing between processes

This makes memory mapping ideal for:

  • Large files
  • High-performance systems
  • Random file access
  • Inter-process communication

How Mapping a File Works

Mapping a file involves three important objects:

  1. hFile — File Handle
  2. hMapping — File Mapping Handle
  3. pView — Mapped Memory Pointer

Let’s walk through each one conceptually.

hFile — The File Handle

We start by calling CreateFile.

        HANDLE hFile = CreateFile(...);

What It Represents?

hFile is a handle to a file object managed by the Windows kernel. It does not contain the file data. Instead, it represents:

  • A reference to an open file
  • Access permissions
  • File metadata
  • A kernel-managed file object

Think of it as your program’s official permission slip to access the file.

Why It’s Needed?

The file handle tells Windows: “I want to work with this file, and here are my access rights.” Without this handle, you cannot create a file mapping.

When It’s Released?

        CloseHandle(hFile);

Once closed, the program can no longer use this handle to access the file.

hMapping — The File Mapping Handle

Next, we call CreateFileMapping.

        HANDLE hMapping = CreateFileMapping(hFile, ...);

What It Represents?

This handle represents a file mapping object, which is a kernel object describing:

  • How the file will be mapped
  • Protection flags (read-only, read/write, etc.)
  • Maximum size of the mapping

Important:
This still does not map the file into memory. Instead, it creates a blueprint for mapping.

Conceptually:

If hFile is the permission slip to the file, hMapping is the architectural plan for how the file will appear in memory.

Why It Exists Separately?

Windows utilizes a structured approach by separating file access into three distinct layers: the file object, the mapping object (configuration), and the view (the actual memory mapping).

This decoupled architecture provides significant flexibility, enabling developers to create multiple mappings with varying protection levels and facilitate seamless shared memory between processes.

Furthermore, this separation offers granular control over advanced memory management tasks.

When It’s Released?

        CloseHandle(hMapping);

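This releases your handle to the mapping object. Windows destroys the object itself once every handle to it has been closed and every view has been unmapped.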

pView — The Mapped View Pointer

Finally, we call MapViewOfFile.

        LPVOID pView = MapViewOfFile(hMapping, ...);

This is where the file actually becomes accessible as memory.

What It Represents?

This is not a handle. It is a pointer to virtual memory inside your process. This is where the magic happens.

When you invoke the MapViewOfFile function, Windows performs a sophisticated memory orchestration.

First, it reserves a specific range of address space within your process. It then creates a direct link between this space and the file's data.

Rather than loading the entire file at once, the OS intelligently loads data pages into physical memory only when they are accessed—a process known as 'on-demand paging.'

Consequently, the file ceases to be a distant object on the disk and begins to behave like a standard in-memory array.

You can now do:

        char* data = static_cast<char*>(pView);

        std::cout << data[0];

No ReadFile, no buffers — just pointer access.

When It’s Released?

        UnmapViewOfFile(pView);

This removes the view from your process’s address space. After this call, pView must not be dereferenced.

How the OS Delivers the Data

Unlike a ReadFile call, which copies the requested bytes into your buffer right away, mapping a view loads nothing up front. Instead:

  • Accessing memory triggers a page fault
  • The OS loads the required page from disk
  • The system cache keeps it in memory
  • Future accesses are fast

This mechanism is extremely efficient and is one reason memory-mapped files scale well for large datasets.

Cleaning Up Properly

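The conventional approach is to release each object in the reverse order of its creation:

  • UnmapViewOfFile(pView);
  • CloseHandle(hMapping);
  • CloseHandle(hFile);

Why this order?

These elements are interconnected: the view is tied to the mapping object, and the mapping object is tied to the file. According to Microsoft's documentation, Windows keeps the underlying objects alive until the last view is unmapped and the last handle is closed, so the calls themselves may be made in any order. Releasing in reverse order simply mirrors how the objects were built and makes it harder to forget one. What does lead to crashes is touching pView after UnmapViewOfFile, and skipping any of the three calls leaks resources.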
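For readers who want to try the open, map, access, release lifecycle without a Windows toolchain, Python's standard mmap module offers a cross-platform analogue. This is a sketch of the same idea, not Win32 code: there is no separate mapping handle to manage, and the temporary file and its contents are made up for the example.

```python
import mmap
import os
import tempfile

# Create a small temporary file to map (hypothetical sample data).
fd, path = tempfile.mkstemp()
os.write(fd, b"Hello, mapped file!")
os.close(fd)

# The with-blocks release resources in reverse order of acquisition:
# the mapped view is closed first, then the file object.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        first_byte = view[0:1]   # slicing the mapping, no explicit read() call
        greeting = view[:5]
        size = len(view)         # length 0 above means "map the whole file"

os.remove(path)
print(first_byte, greeting, size)
```

The nested with-blocks play the role of the UnmapViewOfFile and CloseHandle calls: cleanup happens automatically, in reverse order, even if an exception is raised inside the block.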

Sharing Memory Between Processes

One powerful feature of file mappings:

If multiple processes open the same-named mapping object, they can share memory.

Instead of mapping a disk file, you can even pass INVALID_HANDLE_VALUE to CreateFileMapping to create shared memory backed by the system paging file.

This is a common IPC (Inter-Process Communication) technique in Win32.
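Python's mmap module has a comparable option: passing -1 instead of a file descriptor creates a mapping that is not backed by a file on disk. The sketch below only demonstrates writing to and reading from such a region inside one process. Actually sharing it with a second process additionally requires a name both sides agree on (as described above for Win32), which is left out here.

```python
import mmap

# -1 instead of a file descriptor: the mapping is backed by anonymous
# memory rather than a file on disk.
with mmap.mmap(-1, 16) as region:
    region[:5] = b"hello"          # write through slice assignment
    received = bytes(region[:5])   # read it back the same way
    size = len(region)

print(received, size)
```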

Conclusion

MapViewOfFile is not just a function — it’s a gateway into Windows’ virtual memory system.

The process involves:

  1. Opening a file (CreateFile)
  2. Creating a mapping object (CreateFileMapping)
  3. Mapping a view into memory (MapViewOfFile)
  4. Accessing file data through a pointer
  5. Cleaning up with UnmapViewOfFile and CloseHandle

While it may feel lower-level compared to C++ standard streams, it provides unmatched control and performance.

If you're building performance-critical Windows applications — such as game engines, database systems, or file-processing tools — understanding memory-mapped files will make you a significantly stronger systems developer.

Reference:

https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-mapviewoffile

https://learn.microsoft.com/en-us/windows/win32/memory/file-mapping

Ready to get started?

Contact IVC for a free consultation and discover how we can help your business grow online.

TECH

April 23, 2026

Mendix & Agile: When Low-code is More Than Just "Drag-and-Drop"

In the development world, Mendix is often discussed as a tool for rapid application building. However, that speed doesn't just come from reducing lines of code; it stems from its perfect integration with the Agile methodology. If you consider Low-code the engine, then Agile is the steering system that keeps the project on track.

1. Breaking the "Waterfall" Prejudice

source: https://academy.mendix.com/index3.html#/lectures/3136

Many software projects fail not because of poor code, but due to the rigidity of the Waterfall process. In a rapidly changing market, fixing the scope at the very beginning is a massive risk.

Agile in Mendix inverts the traditional project management triangle:

  • Fixed Resources and Time: You know exactly what you have and how long a Sprint lasts—typically two weeks.

  • Flexible Scope: Instead of doing everything halfway, the team focuses on completing the most valuable features to deliver a working product after every cycle.

2. Agile Mindset: Living with Change

Being Agile is less about the mechanics and more about the Agile Mindset. For a Mendix Developer, this mindset boils down to three principles:

  • Small and Focused: Breaking work into smaller pieces (User Stories) increases focus and enables quick results.

  • Feedback is a Gift: It is better to fail early and fix early than to receive negative feedback after the project has ended.

  • Ownership: In an Agile team, there is no "micro-manager." Every member takes initiative and responsibility for their own tasks.

3. Team Structure: Core Team and Experts

Mendix optimizes development through cross-functional teams:

source: https://academy.mendix.com/link/modules/390/The-Agile-Methodology

  • Core Team: Consists of the Product Owner (vision manager), Scrum Master (process guardian), and 2-3 Business Engineers (the ones building the app).

  • Subject Matter Experts (SMEs): Experts in UX/UI, Security, or Integration "fly in" when a Sprint requires deep specialized knowledge and leave once the task is complete.

4. Realizing Agile with Mendix Tools

Mendix doesn't just keep Agile on paper; the platform provides powerful execution tools:

source: https://academy.mendix.com/link/modules/390/The-Agile-Methodology

  • Epics & User Stories: Manage the backlog and roadmap directly on the Portal using the standard structure: "As a... I want... so that...".

  • Feedback Widget: This is the direct bridge between users and developers. Feedback is sent straight to the Portal for the PO to evaluate and include in the next Sprint.

  • Lean Thinking: Leveraging Reusable Components reduces waste and allows the team to focus on creating new value.

5. The Roadmap to Digital Execution Success

To maximize the effectiveness of a Mendix project, you should follow a 5-step roadmap:

  1. Understand the Context: Know exactly why the project needs Agile.

  2. Establish the Mindset: Build trust and transparency within the team.

  3. Define Roles Clearly: Ensure everyone understands their authority and responsibilities.

  4. Sprint 0: Prepare infrastructure, design wireframes, and align goals before coding begins.

  5. Execute and Improve: Build while reflecting to optimize performance continuously.

Conclusion: Developing on Mendix without Agile is like having a supercar but driving it on a road full of potholes. Combine the power of Low-code with the flexibility of Agile to create truly breakthrough products.

Whether you need scalable software solutions, expert IT outsourcing, or a long-term development partner, ISB Vietnam is here to deliver. Let’s build something great together: reach out to us today, or explore more of ISB Vietnam's case studies.

TECH

April 23, 2026

ECR, ECS, and EKS: Understanding AWS Containers

In the era of modern software development, packaging applications with Docker is just the beginning. When you have dozens or even hundreds of Microservices that need to run concurrently, auto-recover from failures, and scale in an instant, you need Container Orchestration tools.

In the AWS ecosystem, this challenge is perfectly solved by a trio of services: Amazon ECR (Storage), alongside two orchestration options, Amazon ECS and Amazon EKS. Understanding and choosing the right "conductor" will determine the success of your infrastructure architecture.

1. Amazon ECR (Elastic Container Registry): The Secure "Vault"

Before containers can run, they need a secure place to be stored. ECR is a fully managed container registry by AWS, similar to Docker Hub but tailored for the enterprise ecosystem.

  • Enterprise-Grade Security: Deeply integrated with AWS IAM. You can grant granular permissions down to each repository (e.g., Server A can only "pull", Developer B can "push").

  • Automated Image Scanning: ECR can scan images for software vulnerabilities (CVEs) whenever a new image is pushed, once scan-on-push is enabled for the repository. This supports the security controls expected in healthcare (HIPAA) or financial (PCI-DSS) systems.

  • Speed & Optimization: Thanks to AWS's internal network infrastructure, pulling images from ECR to ECS or EKS within the same region is fast, with low latency.

Once images are ready on ECR, we face a crossroads: Should we choose ECS or EKS to run them?

2. Amazon ECS (Elastic Container Service): Simple, Fast & Optimized for AWS

ECS is the "native" container orchestration solution developed by AWS. The philosophy of ECS is to deliver maximum simplicity for users operating within the AWS ecosystem.

  • Low Learning Curve: If your team lacks Kubernetes experience, ECS is the perfect choice. Concepts like Task Definitions and Services in ECS are very straightforward to grasp.

  • Deep Integration: The biggest strength of ECS is its seamless cohesion with other AWS services (ALB, Route 53, CloudWatch, IAM).

  • The Power of AWS Fargate: Both ECS and EKS support Fargate (Serverless compute for containers), but the Fargate experience on ECS is significantly smoother and more seamless. You simply deploy the container, and AWS handles the entire underlying infrastructure.

3. Amazon EKS (Elastic Kubernetes Service): Unmatched Power & Industry Standard

If ECS is an easy-to-drive automatic car, EKS is an F1 racing car with countless customizable buttons. EKS is a managed Kubernetes (K8s) service—the open-source platform that currently serves as the global "gold standard" for container orchestration.

  • Massive Ecosystem: K8s boasts the largest open-source community. Thousands of tools (Helm, Prometheus, Istio, ArgoCD) are natively designed to run on K8s.

  • No Vendor Lock-in: Because EKS is fundamentally standard Kubernetes, you can easily "lift and shift" your entire system from AWS to Google Cloud (GKE), Azure (AKS), or even run it on physical servers (On-premise) without rewriting extensive configurations.

  • Maximum Flexibility: EKS allows you to deeply customize network configurations (Custom CNI), schedule containers (Advanced Scheduling), and manage complex resources.

4. Comparison Table: ECS vs. EKS

To easily visualize the differences, here is a quick comparison between the two services:

| Criteria           | Amazon ECS                                           | Amazon EKS                                          |
|--------------------|------------------------------------------------------|-----------------------------------------------------|
| Core Technology    | AWS Proprietary                                      | Open-source platform (Kubernetes)                   |
| Complexity         | Low - Easy to learn and operate                      | Very High - Requires specialized DevOps team        |
| Ecosystem          | Integrated with AWS native tools                     | Massive open-source ecosystem (CNCF)                |
| Vendor Lock-in     | High (Hard to migrate to other clouds)               | Low (Easy to migrate across Multi-cloud/On-premise) |
| Control Plane Cost | Free (Pay only for compute resources used)           | ~$73/month per EKS Cluster                          |
| Best Suited For    | Startups, fast-to-market projects, AWS-centric teams | Large enterprises, Hybrid Cloud, Multi-cloud systems |
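The ~$73 figure in the table can be reproduced with simple arithmetic. The sketch below assumes a rate of $0.10 per cluster-hour and an average month of 730 hours. Both numbers are assumptions for illustration, so check current AWS pricing before budgeting.

```python
# Assumed inputs: $0.10 per cluster-hour and an average month of 730 hours.
HOURLY_RATE_USD = 0.10
HOURS_PER_MONTH = 730


def monthly_control_plane_cost(clusters, rate=HOURLY_RATE_USD, hours=HOURS_PER_MONTH):
    """Fixed EKS control-plane fee, before any compute is added."""
    return round(clusters * rate * hours, 2)


print(monthly_control_plane_cost(1))  # a single cluster
print(monthly_control_plane_cost(3))  # e.g. dev, staging, and production
```

Because the fee is charged per cluster, teams that run separate clusters per environment should multiply it accordingly when comparing against ECS.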

 

5. Real-World Scenarios from ISB Vietnam

At ISB Vietnam, choosing an architecture depends entirely on the client's business problem:

  • Scenario 1 (Choosing ECS): An internal Business Management System needs rapid modernization from legacy to Cloud. The client wants the lowest maintenance costs, and their IT team has no K8s experts. Solution: ISB Vietnam consults using ECR + ECS Fargate. The infrastructure is spun up in days, auto-scales during business hours, and scales to zero at night to save costs.

  • Scenario 2 (Choosing EKS): A MedTech corporation needs to build a global wearable device data collection platform. A strict requirement is that the system must run partly on AWS and partly on the hospital's physical Data Center to comply with local data residency laws. Solution: ISB Vietnam utilizes EKS combined with Amazon EKS Anywhere. Kubernetes provides absolute consistency between Cloud and On-premise environments, while allowing the deployment of complex Service Mesh tools to encrypt healthcare data.

Key Takeaways

  • ECR: The secure vault for your Docker Images with built-in vulnerability scanning.

  • ECS: Optimized for speed and simplicity. Choose ECS if you want to focus on application code rather than managing infrastructure.

  • EKS: The industry standard. Choose EKS if your system is highly complex, requires multi-platform capabilities (Multi-cloud), and you have a robust DevOps team.

What's Next?

Both ECS and EKS are powerful, but they truly shine when deployed entirely via automation (Infrastructure as Code). In our next post, we will explore how to use Terraform to spin up entire ECS/EKS clusters from a single, version-controlled configuration.

In your organization, is your technical team leaning towards the "simplicity and ease of management" of ECS, or the "global standardization" of EKS? Share your system challenges in the comments below so we can discuss!

Whether your business needs to deploy a flexible system on ECS or build a complex Enterprise-grade EKS cluster, ISB Vietnam's team of experts is ready to design the perfect solution. Let’s build something great together: reach out to us today, or explore more of ISB Vietnam's case studies.

 

References

[1]. Amazon Elastic Container Registry (ECR) Features. Retrieved from https://docs.aws.amazon.com/AmazonECR/latest/userguide/what-is-ecr.html

[2]. Amazon Elastic Container Service (ECS). Retrieved from https://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html

[3]. Amazon Elastic Kubernetes Service (EKS). Retrieved from https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html

TECH

April 23, 2026

How RAG Works: Lessons from a Junior Developer

Recently, while working on a project, I had the chance to explore how a Retrieval-Augmented Generation (RAG) system works. Before that, I mostly interacted with Large Language Models through APIs, without thinking too much about how they actually retrieve and use external information.

With the rapid development of LLMs, many people have begun asking AI questions rather than searching on Google. This is very convenient, but LLMs have an important limitation: They can only answer questions based on the data they have been trained on.

For example, if a model was trained in 2025, how will it know what happens in 2026? If we want it to respond with information from documents it has never seen before, how will it know?

That's why RAG systems come in. This article is the first part of a short series documenting what I learned while trying to build a RAG system locally.

The series is divided into three parts:

  • Part 1 - Understanding the RAG Pipeline.
  • Part 2 - Running RAG locally.
  • Part 3 - Challenges and lessons learned.

In this first part, we’ll walk through the basic architecture of a RAG system and understand how its main components work together.

Simplified RAG pipeline

Instead of relying only on what the model already knows, RAG allows the model to retrieve relevant information from external documents.

Although many variations of RAG architectures exist today, most of them revolve around three core components: Document ingestion, retrieval, and generation.

Document Ingestion

The first step in a RAG system is preparing the documents so the system can use them.

Document Parsing

The main job of a parser is to extract text from documents such as PDFs, Word files, or HTML pages. Many tools support this today, including Docling (Python), LangChain document loaders (Python/TypeScript), and Apache Tika (Java).

Text chunking

At its most basic, we can segment by chunk size. Why do we need to chunk? LLMs have a limited context window and cannot process an entire document at once, just as we can't remember the entire contents of a file.

For example, a chunk size of 100 means splitting the document into smaller chunks of 100 characters each. More complex methods involve segmenting based on the document's structure and layout.

In practice, chunking strategies may vary depending on the document structure and the context window of the language model.

Document
┌───────────────────────────────┐
Employee Handbook
Employees must reset their passwords every 90 days.
Passwords must contain at least 8 characters.
Two-factor authentication is recommended.
└───────────────────────────────┘

                ↓
            Text Chunking

Chunk 1                  Chunk 2                  Chunk 3
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│Employees must  │   │Passwords must  │   │Two-factor      │
│reset passwords │   │contain at      │   │authentication  │
│every 90 days   │   │least 8 chars   │   │is recommended  │
└────────────────┘   └────────────────┘   └────────────────┘
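Fixed-size chunking, the simplest strategy described above, fits in a few lines. This is a minimal sketch rather than a recommended splitter: the overlap parameter is my own addition for illustration, and cutting purely by character count can split a sentence in the middle, which is why structure-aware methods exist.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Split text into chunks of at most chunk_size characters.

    overlap repeats the tail of each chunk at the start of the next one,
    which helps when a sentence straddles a boundary.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


handbook = (
    "Employees must reset their passwords every 90 days. "
    "Passwords must contain at least 8 characters. "
    "Two-factor authentication is recommended."
)
chunks = chunk_text(handbook, chunk_size=100)
print(len(chunks), [len(c) for c in chunks])
```

Note that a chunk size of 100 does not reproduce the tidy one-sentence-per-chunk picture in the diagram. Getting that result requires splitting on sentence boundaries instead of character counts.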

Embedding

Since machines process numbers rather than raw text, we have to convert letters into numbers for them. Embedding is the process of converting text into vectors using an embedding model. These vectors allow the system to measure semantic similarity between the user query and document chunks.

Chunk 1
"Employees must reset their passwords every 90 days."
↓
Embedding
[0.21, -0.33, 0.81, 0.45, -0.12, ...]
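To show the shape of this step without downloading a model, here is a toy stand-in. It is emphatically not a real embedding model: it only hashes words into a handful of slots, so it captures word overlap rather than meaning. The function name and the dimension of 8 are arbitrary choices for the example.

```python
import hashlib
import math


def toy_embed(text, dim=8):
    """Stand-in for an embedding model: hash each word into one of dim slots."""
    vector = [0.0] * dim
    for word in text.lower().split():
        word = word.strip(".,?!")
        # md5 is used only because it gives the same result on every run.
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vector[digest[0] % dim] += 1.0
    # Normalize to unit length so vectors of different texts are comparable.
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


vec = toy_embed("Employees must reset their passwords every 90 days.")
print(len(vec), [round(v, 2) for v in vec])
```

A real embedding model replaces toy_embed with a neural network, but the contract is the same: text goes in, a fixed-length list of numbers comes out.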

Vector Database

After the embeddings are generated, they are stored in a vector database. Unlike traditional databases that store structured data, vector databases are designed to store vector representations and efficiently perform similarity searches. Once all document chunks are stored in the vector database, the system is ready to retrieve relevant information when a user asks a question.

Retrieval

This is the core of RAG technology. Instead of matching keywords as a traditional text search does, the system looks for the document vectors that are closest to the vector of the user's query.

Query Embedding

Similar to the previously vectorized document chunks, when a user submits a question, that question will also be converted into a vector.

Always remember this: both query embedding and chunk embedding must use the same embedding model.

User Question:
How often should employees reset their passwords?

↓
Embedding
[0.18, -0.41, 0.72, ...]

Vector Search

To put it simply, imagine that all document embeddings are points in a multi-dimensional space. When a user asks a question, the query is also converted into a vector and placed in the same space. The system then searches for document vectors that are closest to the query vector.

But what do we mean by “closest”?

In practice, similarity between vectors is measured using mathematical metrics such as cosine similarity or dot product. These metrics help the system identify document chunks that are semantically similar to the user's question.
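Both metrics are short enough to write by hand. The sketch below uses only the first three numbers from the illustrative vectors shown earlier in this article, so the resulting score is for demonstration only.

```python
import math


def dot(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot(a, b) / (norm_a * norm_b)


query = [0.18, -0.41, 0.72]
chunk = [0.21, -0.33, 0.81]
print(round(cosine_similarity(query, chunk), 3))
```

Cosine similarity ranges from -1 (opposite direction) to 1 (same direction). When all vectors are normalized to length 1, it gives the same ranking as the plain dot product.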

Top-k Relevant Chunks

The top-k retrieved chunks are then combined and sent to the LLM as context. The exact value of k can vary depending on the system and the model’s context window.

In simple terms, the system gives the model relevant pieces of text and asks it to answer the question based on that information.

Query:
"How often should employees reset their passwords?"

↓

Top-k Retrieved Chunks

┌──────────────────────────────┐
Chunk 12
Employees must reset passwords
every 90 days.
└──────────────────────────────┘

┌──────────────────────────────┐
Chunk 27
Passwords must contain at least
8 characters.
└──────────────────────────────┘

┌──────────────────────────────┐
Chunk 35
Two-factor authentication is
recommended for all accounts.
└──────────────────────────────┘

↓

Context sent to LLM

↓

Answer
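The flow in the diagram can be sketched as a brute-force search over an in-memory list. Everything here is an assumption made for illustration: the three-dimensional vectors are hand-written rather than produced by a model, and a real vector database uses indexing so it does not have to compare the query with every chunk.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def top_k(query_vec, store, k):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = ((cosine(query_vec, vec), text) for text, vec in store)
    return heapq.nlargest(k, scored)


# Hypothetical 3-dimensional vectors; real embeddings have hundreds of dimensions.
store = [
    ("Employees must reset passwords every 90 days.", [0.9, 0.1, 0.0]),
    ("Passwords must contain at least 8 characters.", [0.6, 0.6, 0.1]),
    ("The cafeteria opens at 8 a.m.", [0.0, 0.1, 0.9]),
]
query_vec = [1.0, 0.2, 0.0]

results = top_k(query_vec, store, k=2)
context = "\n".join(text for _, text in results)
print(context)
```

The unrelated cafeteria chunk is left out, and the two password chunks become the context passed to the LLM in the next step.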

Generation

Now it's time to ask the AI a question. This step works similarly to copying text from somewhere, giving it to the AI, and asking, "Hey, what's in here?"

Prompt Construction

Besides the retrieved context, we also need to provide the LLM with a clear instruction. A simple prompt structure usually contains the context, the user’s question, and an instruction telling the model to answer based only on the provided information. Something like this:

You are an assistant who answers questions based on the provided context.

Context:
Employees must reset their passwords every 90 days.
Passwords must contain at least 8 characters.
Two-factor authentication is recommended.

Question:
How often should employees reset their passwords?

Answer:

The LLM then continues the text after "Answer:", producing the response.
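Assembling such a prompt is plain string formatting. Below is a minimal sketch: build_prompt is a hypothetical helper, and the extra line telling the model to admit when the context lacks the answer is my own addition, not part of the template above.

```python
def build_prompt(chunks, question):
    """Assemble instruction, retrieved context, and question into one prompt."""
    context = "\n".join(chunks)
    return (
        "You are an assistant who answers questions based on the provided context.\n"
        # Assumption: an explicit fallback instruction, added for this sketch.
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{question}\n\n"
        "Answer:"
    )


retrieved = [
    "Employees must reset their passwords every 90 days.",
    "Passwords must contain at least 8 characters.",
]
prompt = build_prompt(retrieved, "How often should employees reset their passwords?")
print(prompt)
```

The resulting string is what gets sent to the model. Ending it with "Answer:" invites the model to continue from exactly that point.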

Context - Question - Answer Generation

This is the final step of the RAG process. This step is simple: after the LLM receives the prompt and context, it uses that information to answer the question. Grounding the answer in retrieved documents helps reduce the hallucinations commonly seen in LLMs.

However, there is one important thing to note. The accuracy of the answer depends on two factors:

  • Was the previous document search step correct? If you give it the wrong information, of course, it will give the wrong answer.
  • Is the LLM strong enough? Even large models have a limited context window, so the system must carefully choose how many chunks to include.

Key Takeaways

  • Retrieval-Augmented Generation (RAG) enables LLMs to answer questions using external documents rather than relying solely on training data.
  • Documents must first be ingested (parsed and chunked) and encoded as vectors before they can be used.
  • A vector database is where vectors are stored and searched.
  • When a user asks a question, the system retrieves the most relevant document segments within the database.
  • The retrieved chunks are then combined into context and sent to the LLM to generate the final answer.

What’s next?

In this article, we walked through the basic pipeline of a RAG system — from document ingestion to answer generation.

In Part 2 - Running a RAG system locally, I’ll share what happened when I tried to run a RAG system locally, including the tools I used and some practical limitations I encountered during development.

This article is part of a technical blog series from ISB Vietnam, where our engineering team shares practical insights and lessons learned from real-world projects.

References

https://www.mhlw.go.jp/toukei/itiran/roudou/monthly/30/3009p/3009p.html

TECH

April 23, 2026

Why Your AI Is Only As Good As Your Prompts

I. The Trust Gap

     GenAI tools such as ChatGPT, Claude, and Google Gemini have immense potential to improve our quality of life and help with many global issues. Yet there is a strong and widely discussed consensus that GenAI can be inaccurate.

     A recent KPMG survey[1] of over 48,000 people globally shows that even though 66% of them use GenAI regularly, only 46% are willing to trust AI systems. This is because many people rely on GenAI output without evaluating its accuracy, which leads to mistakes in their work.

     In my opinion, the problem is often not the technology itself, but the way it’s being used. GenAI is, without a doubt, incredibly powerful, but the quality of its output depends heavily on the quality of the prompt users provide. Learning how to create a clear, specific, and well-structured command or request when working with GenAI is the key to unlocking these tools' real potential.

 

 

II. What is Prompt Engineering?

     Prompt Engineering is the essential skill of designing and refining prompts to improve the output of GenAI systems. Because GenAI models respond based on the specific input they receive, a well-structured prompt allows you to:

  • Produce more detailed and relevant responses.
  • Significantly improve output accuracy.
  • Control the format and style of the final result.

 

 

III. Prompt Engineering Techniques

     Now that we understand what Prompt Engineering is, let’s break down some widely used techniques for crafting prompts[2].

1. Zero-Shot Prompting:

          This is a technique where you present a task to the AI model without providing any examples or task-specific demonstrations. Its accuracy relies heavily on the strength of the underlying foundation model. The more advanced and capable the foundation model, the more likely the AI is to produce accurate results.

          Zero-shot prompting is often more suitable for straightforward tasks or when a quick response is needed, even though it can still handle more complex tasks with varying reliability.
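
          As a minimal sketch of the idea above, the snippet below builds a zero-shot prompt: it states the task and the input, and nothing else. The function name build_zero_shot_prompt and the sentiment task are hypothetical, chosen only for illustration. No model is called here; the resulting string is what you would send to whichever GenAI tool you use.

```python
def build_zero_shot_prompt(task: str, text: str) -> str:
    """Build a prompt that states the task directly, with no examples.

    Hypothetical helper for illustration; not part of any library.
    """
    return f"{task}\n\nText: {text}\nAnswer:"


prompt = build_zero_shot_prompt(
    "Classify the sentiment of the text as Positive, Negative, or Neutral.",
    "The delivery was late, but the support team fixed it quickly.",
)
print(prompt)
```

          Because the prompt contains no demonstrations, the quality of the answer rests entirely on what the underlying model already knows.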

 

2. Few-Shot Prompting:

          This technique involves providing a few examples within the prompt to guide the AI’s output. Instead of training the model, you guide it during inference by providing specific contextual examples (input-output pairs).

          If you provide the AI with only one example, this technique is also called “Single-Shot” or “One-Shot Prompting”.
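
          The input-output pairs described above can be sketched as follows. This is an illustrative example only: build_few_shot_prompt, the "Text:" and "Sentiment:" labels, and the sample reviews are assumptions, not a fixed format required by any model. Passing a single pair in examples would give a one-shot prompt.

```python
def build_few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Build a prompt that shows input-output pairs before the real input.

    Hypothetical helper for illustration; not part of any library.
    """
    lines = [task, ""]
    for text, label in examples:
        # Each example is one demonstration of the expected input and output.
        lines.append(f"Text: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final entry leaves the output blank for the model to complete.
    lines.append(f"Text: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


examples = [
    ("I love how fast this app is.", "Positive"),
    ("The screen cracked after two days.", "Negative"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each text.",
    examples,
    "The battery life is acceptable.",
)
print(prompt)
```

          Ending the prompt with the same label the examples use nudges the model to answer in the same format.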

 

 

3. Chain-of-Thought (CoT) Prompting:

          This is a technique where you break a task into a sequence of intermediate reasoning steps. This helps the AI model process logic more effectively, leading to more structured and accurate results. You can trigger this by providing examples of step-by-step reasoning or by simply adding instructions like “Break this into steps”.

          However, CoT prompting should be used selectively, mainly for tasks that require multi-step reasoning, where accuracy matters more than speed. In simpler cases, forcing step-by-step reasoning may slow the AI down or introduce unnecessary verbosity.
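
          A small sketch of the instruction-based trigger mentioned above: the helper appends a step-by-step instruction only when asked to, which reflects the advice to use CoT selectively. The name build_cot_prompt, the use_cot flag, and the exact wording of the instruction are assumptions made for this example.

```python
# Wording is an assumption; any clear "reason step by step" instruction plays the same role.
COT_INSTRUCTION = (
    "Break this into steps and show your reasoning before giving the final answer."
)


def build_cot_prompt(question: str, use_cot: bool = True) -> str:
    """Optionally append a step-by-step reasoning instruction to a question.

    Hypothetical helper for illustration; not part of any library.
    """
    if use_cot:
        return f"{question}\n\n{COT_INSTRUCTION}"
    # For simple questions, send the question unchanged to avoid extra verbosity.
    return question


multi_step = build_cot_prompt(
    "A team ships 3 releases per month and each release needs 4 review hours. "
    "How many review hours are needed in a year?"
)
simple = build_cot_prompt("What is the capital of Vietnam?", use_cot=False)
print(multi_step)
print(simple)
```

          Here the multi-step arithmetic question gets the reasoning instruction, while the simple lookup question is sent as-is.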

 

 

IV. Conclusion

     The effectiveness of GenAI depends largely on how it is used. Rather than viewing these tools as unreliable, it’s more productive to see AI as a tool that requires skill and thoughtful interaction.

     Mastering AI communication is becoming an essential skill in today’s digital world. By crafting clear and structured prompts, users can unlock the full potential of GenAI and use it more confidently and responsibly in their work and daily lives.

     If you're eager to take your Prompt Engineering skills to the next level and apply them to impactful, real-world projects, ISB VIETNAM offers an environment where Passions, Teamwork, Innovations, and continuous learning are part of our everyday work. Visit the official website now to learn more about our company's services and how you can become part of a team that values thoughtful, high-quality software engineering.

 

References:

[1] KPMG survey: https://kpmg.com/xx/en/our-insights/ai-and-technology/trust-attitudes-and-use-of-ai.html

[2] Inspired by: https://www.udemy.com/course/aws-ai-practitioner-certified/
