Amazon Bedrock is the bedrock of generative AI applications on AWS. It provides access to foundation models, retrieval-augmented generation (RAG), knowledge bases, vector stores, guardrails, and LLM agents. With Bedrock, you can build and deploy generative AI applications quickly without having to manage the underlying infrastructure.
- bedrock: Manage, deploy, and train models.
- bedrock-runtime: Perform inference (execute prompts, generate embeddings, etc.) against those models. It provides the Converse, ConverseStream, InvokeModel, and InvokeModelWithResponseStream APIs.
- bedrock-agent: Manage, deploy, and train LLM agents and knowledge bases.
- bedrock-agent-runtime: Perform inference against agents and knowledge bases. It provides the InvokeAgent, Retrieve, and RetrieveAndGenerate APIs.
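As a sketch of the bedrock-runtime side, the request for the Converse API can be assembled as plain Python dictionaries before handing it to a boto3 client (the model ID below is only an example; substitute any foundation model you have access to):

```python
import json

# Example model ID -- substitute any Bedrock foundation model you have access to.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

# Converse API message format: a list of turns, each with a role
# and a list of content blocks.
messages = [
    {"role": "user", "content": [{"text": "Summarize what Amazon Bedrock does."}]},
]

inference_config = {"maxTokens": 256, "temperature": 0.5}

# With boto3 this would be sent as:
#   client = boto3.client("bedrock-runtime")
#   response = client.converse(modelId=MODEL_ID,
#                              messages=messages,
#                              inferenceConfig=inference_config)
request = {"modelId": MODEL_ID, "messages": messages, "inferenceConfig": inference_config}
print(json.dumps(request, indent=2))
```

ConverseStream takes the same request shape but returns the response as a stream of events instead of a single payload.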
Bedrock IAM permissions: Bedrock must be used with an IAM user (not the root account), and that user must have permissions to use Bedrock and the underlying models.
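A minimal identity-based policy granting inference permissions might look like the following (the Resource ARN is illustrative; in practice, scope it to the specific models you use):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/*"
    }
  ]
}
```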
Fine-Tuning with Bedrock
Fine-Tuning
Bedrock provides a simple and efficient way to fine-tune foundation models for specific tasks. You can use the Bedrock API to fine-tune a model on your own dataset and then use the fine-tuned model for inference, customizing the model's behavior for your specific use case. The training data must be in JSONL format, where each line is a JSON object representing one training example, using prompt/completion keys:
{"prompt": "The input text for the model.", "completion": "The desired output text for the model."}
Store the file in S3 and provide its S3 URI to the Bedrock API when creating the fine-tuning job. Bedrock then uses this data to fine-tune the model and improve its performance on your specific task.
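A minimal sketch of preparing such a JSONL file in Python, using Bedrock's documented prompt/completion keys for text-to-text fine-tuning (the examples and file name are made up):

```python
import json
import os
import tempfile

# Hypothetical training examples in Bedrock's text-to-text fine-tuning
# format: one JSON object per line with "prompt" and "completion" keys.
examples = [
    {"prompt": "What is Amazon Bedrock?",
     "completion": "A managed service for building generative AI applications."},
    {"prompt": "What format does Bedrock fine-tuning data use?",
     "completion": "JSONL, one training example per line."},
]

# Write one JSON object per line (JSONL). In practice you would then
# upload this file to S3 and pass its URI to the fine-tuning job.
path = os.path.join(tempfile.gettempdir(), "train.jsonl")
with open(path, "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Round-trip check: each line parses back to the original object.
with open(path) as f:
    lines = [json.loads(line) for line in f]
```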
Continued Pre-Training
It is like fine-tuning, but instead of training the model on a specific task, you train it on a larger dataset that is more relevant to your use case. This helps the model learn more about the specific domain or language you are working with. The data is unlabeled, so you don't need to provide input-output pairs.
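Since the data is unlabeled, the JSONL lines carry a single field of raw text instead of prompt/completion pairs (the text below is a placeholder, and the single-key "input" format follows Bedrock's documented continued pre-training layout):

```python
import json

# Continued pre-training data is unlabeled: each JSONL line holds only
# an "input" field with raw text from the target domain.
docs = [
    {"input": "Raw domain text the model should absorb."},
    {"input": "More unlabeled text from the target domain."},
]

# Serialize as JSONL (one object per line), ready to upload to S3.
jsonl = "\n".join(json.dumps(d) for d in docs)
```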
Retrieval-Augmented Generation (RAG)
Upload documents from S3 or other sources (web crawler, Confluence, Salesforce, SharePoint) into a Bedrock knowledge base. Choose the embedding model and dimension, and a vector store such as Amazon OpenSearch to serve the embeddings. We can then use an agent or Bedrock's RAG APIs to retrieve relevant information from the knowledge base and generate responses grounded in it.
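A sketch of the request for the RetrieveAndGenerate API from bedrock-agent-runtime, which does retrieval and answer generation in one call (the knowledge base ID and model ARN below are placeholders):

```python
# Hypothetical identifiers -- substitute your own knowledge base and model.
KB_ID = "EXAMPLEKBID"
MODEL_ARN = ("arn:aws:bedrock:us-east-1::foundation-model/"
             "anthropic.claude-3-haiku-20240307-v1:0")

# RetrieveAndGenerate request: the user query plus a configuration
# pointing at the knowledge base and the model that writes the answer.
request = {
    "input": {"text": "What does our refund policy say?"},
    "retrieveAndGenerateConfiguration": {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": KB_ID,
            "modelArn": MODEL_ARN,
        },
    },
}
# With boto3:
#   boto3.client("bedrock-agent-runtime").retrieve_and_generate(**request)
```

The Retrieve API takes a similar query but returns only the matching chunks, leaving generation to you.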
Amazon Bedrock Guardrails
Content filtering for prompts and responses. It works with text foundation models and supports word filtering, topic filtering, profanity filtering, and PII removal (or masking).
- Contextual grounding check: helps prevent hallucination by measuring "grounding" (how similar the response is to the contextual data received) and relevance (of the response to the query). It can filter out responses that are not grounded or relevant.
- We can configure the “blocked message response”.
We can create a guardrail in Bedrock and define these rules on it.
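Once created, a guardrail is attached to inference calls by its identifier and version; a minimal sketch (the ID below is a placeholder):

```python
# Hypothetical guardrail ID -- substitute the one created in Bedrock.
guardrail_config = {
    "guardrailIdentifier": "gr-example-id",
    "guardrailVersion": "1",
}

# With boto3, the guardrail is applied per request:
#   client = boto3.client("bedrock-runtime")
#   client.converse(modelId=..., messages=...,
#                   guardrailConfig=guardrail_config)
```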
Tools and agents
In Bedrock, tools can be Lambda functions, and prompt engineering is needed to use them. The model decomposes the problem into subproblems, then calls action groups, knowledge bases, and tools to solve them. An action group is a collection of tools.
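Invoking such an agent goes through the InvokeAgent API of bedrock-agent-runtime; a sketch of the request (the agent and alias IDs are placeholders):

```python
import uuid

# Hypothetical identifiers -- substitute your own agent and alias.
AGENT_ID = "EXAMPLEAGENT"
AGENT_ALIAS_ID = "EXAMPLEALIAS"

# InvokeAgent request: the session ID ties multi-turn conversations
# together; the agent decomposes inputText and calls its action groups,
# knowledge bases, and tools as needed.
request = {
    "agentId": AGENT_ID,
    "agentAliasId": AGENT_ALIAS_ID,
    "sessionId": str(uuid.uuid4()),
    "inputText": "Book a meeting room for tomorrow at 10am.",
}
# With boto3:
#   boto3.client("bedrock-agent-runtime").invoke_agent(**request)
```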