Maria VectorDB Powered by GlobalSolutions

MariaDB is one of the most popular open-source relational database servers, and with its built-in vector support it is now a powerful platform for AI-driven applications. MariaDB Vector enables storage and similarity search of high-dimensional vector embeddings directly within the database, making it an excellent choice for Retrieval-Augmented Generation (RAG) pipelines and semantic search.

GlobalSolutions has deep expertise in building end-to-end RAG pipelines and vector content ingestion workflows using MariaDB Vector DB. We can help you design, build, and deploy production-ready pipelines that embed, store, and retrieve content at scale — enabling your AI applications to ground responses in your own data.

MariaDB is fast and scalable with a rich ecosystem of storage engines and plugins, providing a full SQL interface for accessing both relational and vector data side by side.

Note: This image has been hardened against known vulnerabilities.

Why Subscribe to Our Offering in AWS Marketplace

We regularly update the software to the latest version to address security issues.
Customers can kick-start their core work right away with our pre-packaged AMIs.
Production-ready application stacks optimised for vector workloads.
GlobalSolutions expertise available to help build your RAG pipeline from day one.

How to Access Our AMIs from AWS Marketplace

Subscribe: Purchase the Maria VectorDB AMI directly from AWS Marketplace.
Connect via SSH:
- Go to the AWS Console, select your instance, and note the public IP address.
- Make sure port 22 is open in your instance's Security Group.
- Connect using the following command, replacing yourpemfile.pem and <public-ip-of-your-instance> with your values:
```
ssh -i yourpemfile.pem ubuntu@<public-ip-of-your-instance>
```
- Once logged in you will land in the home directory of the ubuntu user.

For more information on connecting to EC2 instances please refer to: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-linux-inst-ssh.html

Installation Location

Category	Package	Version	Location (Ubuntu)
Database	MariaDB Server	11.x	Binary: `/usr/sbin/mysqld`; Data: `/var/lib/mysql`
Config	MariaDB Config	—	`/etc/mysql/mariadb.conf.d/`
Service	systemd service	—	`systemctl status mariadb`
Client	MariaDB Client	—	`/usr/bin/mysql`

MariaDB Login

Username	Password
root	Your EC2 instance ID

Where to find your Instance ID: In the AWS console, select your instance; the instance ID appears with the rest of the instance details in the bottom half of the page. Alternatively, once logged in via SSH, you can run:

curl -s http://169.254.169.254/latest/meta-data/instance-id
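
The curl command above is a plain, token-less metadata request. On instances configured to require IMDSv2, such a request is rejected and a session token has to be obtained first. Whether this AMI enforces IMDSv2 is not stated here, so the following is only a minimal Python sketch of the token flow, using the standard library. The function names are illustrative and not part of this offering.

```python
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 60) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a short-lived session token."""
    return urllib.request.Request(
        f"{METADATA_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_instance_id_request(token: str) -> urllib.request.Request:
    """Build the GET request for the instance ID, carrying the session token."""
    return urllib.request.Request(
        f"{METADATA_BASE}/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_instance_id(timeout: float = 2.0) -> str:
    """Return the instance ID. Only works when run on the EC2 instance itself."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_instance_id_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

Calling fetch_instance_id() from a Python shell on the instance should return the same value that the AWS console shows.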

Getting Started

The GlobalSolutions Maria VectorDB offering comes prepackaged and ready to use. Once you have connected to the instance over SSH, connect to MariaDB using the following command:

mysql -u root -p

When prompted for the password, enter your EC2 Instance ID.

Live RAG Pipeline — Pharmacy Prescription Demo

GlobalSolutions has built and deployed a fully working RAG pipeline on this instance using MariaDB Vector DB as the knowledge store. The pipeline uses the following components:

Database: MariaDB — global_ai_db database, populated with sample pharmacy prescription sales data.
Embedding Model: nomic-embed-text via Ollama, used to vectorize the prescription data. The resulting embeddings are stored directly in MariaDB.
LLM: a model served by Ollama, used to generate natural language answers from the retrieved context.

Running the RAG Pipeline

SSH into the server and navigate to the application directory:

cd /home/ubuntu/gs

Then run the pipeline:

python3 rag.py

You will be prompted to enter a natural language query — for example:

"Which medicine had the highest total sales?"
"How many units of Paracetamol were sold?"
"What antibiotics were prescribed and at what price?"

The pipeline will embed your query using nomic-embed-text, retrieve the most semantically relevant records from MariaDB, and pass them as context to the Ollama LLM to generate a grounded, accurate response.
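
The final generation step described above can be sketched roughly as follows. This is not the shipped rag.py, whose contents are not shown here. It is a minimal illustration that assumes Ollama is listening on its default local port and uses a hypothetical model name (llama3). The prompt wording and function names are also assumptions.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port; assumed for this instance
LLM_MODEL = "llama3"                   # hypothetical model name, replace with the installed one


def build_prompt(question: str, records: list[str]) -> str:
    """Combine the retrieved records and the user's question into one prompt."""
    context = "\n".join(f"- {record}" for record in records)
    return (
        "Answer the question using only the records below.\n"
        f"Records:\n{context}\n"
        f"Question: {question}\n"
    )


def build_generate_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": LLM_MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


def generate_answer(prompt: str) -> str:
    """Send the prompt to Ollama and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Restricting the model to the supplied records is what keeps the answer grounded in the retrieved data rather than in the model's general knowledge.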

Sample Data in MariaDB

The following pharmacy prescription sales records have been loaded into the global_ai_db database and vectorized using the nomic-embed-text embedding model:

Date	Medicine Name	Category	Quantity Sold	Unit Price	Total Sales
Jan 1, 2026	Paracetamol 500mg	Pain Relief	10	$5.50	$55.00
Jan 2, 2026	Vitamin C 1000mg	Vitamins	12	$8.50	$102.00
Jan 3, 2026	Cough Syrup 100ml	Cold & Flu	8	$7.50	$60.00
Jan 4, 2026	Amoxicillin 250mg	Antibiotics	5	$12.00	$60.00
Jan 5, 2026	Paracetamol 500mg	Pain Relief	20	$5.50	$110.00

How it works: Each row above is converted into a text description, embedded into a high-dimensional vector using nomic-embed-text, and stored alongside the original data in MariaDB. When you query rag.py, your question is embedded the same way and MariaDB returns the closest matching records — which are then passed to the Ollama LLM to produce a natural language answer grounded in the actual data.
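
The row-to-text and embedding steps could look something like the sketch below. The sentence template, the dictionary keys, and the function names are assumptions made for illustration. The actual text format used by the demo is not documented here. The embedding call targets Ollama's /api/embeddings endpoint on its default local port.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port; assumed for this instance
EMBED_MODEL = "nomic-embed-text"


def row_to_text(row: dict) -> str:
    """Turn one sales record into a sentence that can be embedded."""
    total = row["quantity"] * row["unit_price"]
    return (
        f"On {row['date']}, {row['quantity']} units of {row['medicine']} "
        f"({row['category']}) were sold at ${row['unit_price']:.2f} each, "
        f"for total sales of ${total:.2f}."
    )


def build_embedding_request(text: str) -> urllib.request.Request:
    """Build the POST request that asks Ollama to embed the given text."""
    payload = {"model": EMBED_MODEL, "prompt": text}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


def embed(text: str) -> list[float]:
    """Return the embedding vector for the text (requires a running Ollama)."""
    with urllib.request.urlopen(build_embedding_request(text)) as resp:
        return json.loads(resp.read())["embedding"]


row = {
    "date": "Jan 1, 2026",
    "medicine": "Paracetamol 500mg",
    "category": "Pain Relief",
    "quantity": 10,
    "unit_price": 5.50,
}
print(row_to_text(row))
# On Jan 1, 2026, 10 units of Paracetamol 500mg (Pain Relief) were sold at $5.50 each, for total sales of $55.00.
```

The same embed function would be applied to the user's question at query time, so that questions and records live in the same vector space.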

Want to build a RAG pipeline like this on your own data? GlobalSolutions has expertise in building end-to-end pipelines — from data ingestion and vectorization to retrieval and LLM integration. Contact us at support@theglobalsolutions.net.

Working with Vector Databases

Creating Vector Databases and Tables

Once connected to MariaDB, create a dedicated database for your vector data:

CREATE DATABASE vectordb;
USE vectordb;

Create a table with a vector column to store your embeddings. The example below uses 768 dimensions, matching the nomic-embed-text model used in this instance:

CREATE TABLE embeddings (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  content     TEXT NOT NULL,
  source      VARCHAR(255),
  created_at  DATETIME DEFAULT NOW(),
  embedding   VECTOR(768) NOT NULL,
  VECTOR INDEX (embedding)
);

Inserting Embedded Content

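To insert a document along with its vector embedding (generated by your pipeline), use the statement below. The vector literal must contain exactly as many values as the column declares (768 for the table above); the trailing ... stands for the remaining values.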

INSERT INTO embeddings (content, source, embedding)
VALUES (
  'Your document text goes here',
  'source-identifier',
  VEC_FromText('[0.012, -0.045, 0.331, ...]')
);
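
When the insert is issued from application code, the embedding has to be serialized into the text form that VEC_FromText expects. The helper below is a minimal sketch of that step. The function names are illustrative, and the %s placeholder style is the one accepted by drivers such as PyMySQL and mysql-connector-python. Other drivers may use a different placeholder.

```python
import math

EXPECTED_DIM = 768  # matches the VECTOR(768) column in the table above

INSERT_SQL = (
    "INSERT INTO embeddings (content, source, embedding) "
    "VALUES (%s, %s, VEC_FromText(%s))"
)


def to_vec_literal(vector, expected_dim: int = EXPECTED_DIM) -> str:
    """Serialize a sequence of numbers as a '[v1,v2,...]' string for VEC_FromText."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(vector)}")
    values = [float(x) for x in vector]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("vector contains NaN or infinite values")
    return "[" + ",".join(repr(v) for v in values) + "]"


def build_insert(content: str, source: str, vector):
    """Return the SQL text and parameter tuple for a parameterized insert."""
    return INSERT_SQL, (content, source, to_vec_literal(vector))
```

With an open cursor, the statement would then be run as cursor.execute(*build_insert(text, source, vector)). Passing the values as parameters rather than formatting them into the SQL string avoids quoting problems in the document text.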

Querying — Similarity Search (RAG Retrieval)

To retrieve the most semantically similar documents to a query embedding:

SELECT id, content, source,
       VEC_DISTANCE_EUCLIDEAN(embedding, VEC_FromText('[0.012, -0.045, ...]')) AS distance
FROM   embeddings
ORDER  BY distance ASC
LIMIT  5;
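
To make the ranking concrete, the sketch below reproduces in plain Python what the query asks for: compute the Euclidean distance from the query vector to every stored vector, sort ascending, and keep the first few rows. It uses toy 3-dimensional vectors instead of 768-dimensional embeddings, and the function names are illustrative. This version is an exact brute-force search, whereas a search served by a vector index may return approximate nearest neighbours.

```python
import math


def euclidean(a, b) -> float:
    """Euclidean (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, rows, k: int = 5):
    """rows is a list of (id, vector); return the k nearest as (id, distance)."""
    scored = [(row_id, euclidean(vec, query)) for row_id, vec in rows]
    return sorted(scored, key=lambda pair: pair[1])[:k]


rows = [
    (1, [0.0, 0.0, 0.0]),
    (2, [1.0, 0.0, 0.0]),
    (3, [3.0, 4.0, 0.0]),
]
print(top_k([0.0, 0.0, 0.0], rows, k=2))
# [(1, 0.0), (2, 1.0)]
```

A smaller distance means a closer match, which is why the SQL query orders by distance in ascending order.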

Tip: GlobalSolutions can help you build the full pipeline — from chunking and embedding your content (using OpenAI, Ollama, or other models) to storing vectors in MariaDB and wiring up retrieval into your RAG application. Contact us at support@theglobalsolutions.net to get started.

Support

Please contact us at support@theglobalsolutions.net for any questions on this offering in AWS Marketplace.