MariaDB is one of the most popular open-source relational database servers, and with its built-in vector support it is now a powerful platform for AI-driven applications. MariaDB Vector enables storage and similarity search of high-dimensional vector embeddings directly within the database, making it an excellent choice for Retrieval-Augmented Generation (RAG) pipelines and semantic search.
GlobalSolutions has deep expertise in building end-to-end RAG pipelines and vector content ingestion workflows using MariaDB Vector DB. We can help you design, build, and deploy production-ready pipelines that embed, store, and retrieve content at scale — enabling your AI applications to ground responses in your own data.
MariaDB is fast and scalable with a rich ecosystem of storage engines and plugins, providing a full SQL interface for accessing both relational and vector data side by side.
Connect to the instance over SSH, replacing yourpemfile.pem and <public-ip-of-your-instance> with your values:
ssh -i yourpemfile.pem ubuntu@<public-ip-of-your-instance>
Log in as the ubuntu user. For more information on connecting to EC2 instances, please refer to: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-linux-inst-ssh.html
| Category | Package | Version | Location (Ubuntu) |
|---|---|---|---|
| Database | MariaDB Server | 11.x | /usr/sbin/mysqld (data: /var/lib/mysql) |
| Config | MariaDB Config | — | /etc/mysql/mariadb.conf.d/ |
| Service | systemd service | — | systemctl status mariadb |
| Client | MariaDB Client | — | /usr/bin/mysql |
| Username | Password |
|---|---|
| root | Instance ID of your EC2 |
Where to find your Instance ID: the instance ID is shown in the AWS console. When you select your instance, its details appear in the bottom half of the page. Alternatively, once logged in via SSH you can run:
curl -s http://169.254.169.254/latest/meta-data/instance-id
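On instances where IMDSv2 is enforced, the plain request above may be rejected with a 401 response, because a session token has to be obtained first. The sketch below shows that token-based flow using only Python's standard library. It is illustrative rather than part of the GlobalSolutions image, it only works when run on the EC2 instance itself, and the function names are hypothetical.

```python
import urllib.request

# Link-local address of the EC2 instance metadata service.
IMDS = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 60) -> urllib.request.Request:
    """Build the IMDSv2 PUT request that returns a short-lived session token."""
    return urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def instance_id_request(token: str) -> urllib.request.Request:
    """Build the GET request for the instance ID, authenticated by the token."""
    return urllib.request.Request(
        f"{IMDS}/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_instance_id(timeout: float = 2.0) -> str:
    """Fetch the instance ID; must be called from the instance itself."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(instance_id_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

Calling `print(fetch_instance_id())` on the instance prints the same value that is used as the MariaDB root password.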
The GlobalSolutions MariaDB Vector offering comes prepackaged and ready to use. Once you have connected to the instance over SSH, connect to MariaDB using the following command:
mysql -u root -p
When prompted for the password, enter your EC2 Instance ID.
GlobalSolutions has built and deployed a fully working RAG pipeline on this instance using MariaDB Vector DB as the knowledge store. The pipeline uses the following components:
- The global_ai_db database, populated with sample pharmacy prescription sales data.
- nomic-embed-text via Ollama, used to vectorize the prescription data; the resulting vectors are stored directly in MariaDB.

SSH into the server and navigate to the application directory:
cd /home/ubuntu/gs
Then run the pipeline:
python3 rag.py
You will be prompted to enter a natural language query.
The pipeline will embed your query using nomic-embed-text, retrieve the most
semantically relevant records from MariaDB, and pass them as context to the Ollama LLM to
generate a grounded, accurate response.
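The source of rag.py is not reproduced in this guide, so the following is only a minimal sketch of the embed, retrieve, generate flow just described. The three callables (`embed`, `search`, `generate`) are hypothetical stand-ins for the Ollama embedding call, the MariaDB similarity query, and the Ollama LLM call; the prompt wording is likewise an assumption.

```python
from typing import Callable, List, Sequence


def build_prompt(question: str, records: Sequence[str]) -> str:
    """Combine the retrieved records and the user's question into one prompt."""
    context = "\n".join(f"- {record}" for record in records)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    embed: Callable[[str], List[float]],
    search: Callable[[List[float], int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 5,
) -> str:
    """Run one RAG round trip: embed the query, retrieve context, generate."""
    query_vector = embed(question)            # e.g. nomic-embed-text via Ollama
    records = search(query_vector, top_k)     # e.g. nearest rows from MariaDB
    return generate(build_prompt(question, records))  # e.g. the Ollama LLM
```

Passing the three steps in as functions keeps the orchestration independent of any particular driver or HTTP client, which also makes it easy to exercise with stubs.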
The following pharmacy prescription sales records have been loaded into the global_ai_db
database and vectorized using the nomic-embed-text embedding model:
| Date | Medicine Name | Category | Quantity Sold | Unit Price | Total Sales |
|---|---|---|---|---|---|
| Jan 1, 2026 | Paracetamol 500mg | Pain Relief | 10 | $5.50 | $55.00 |
| Jan 2, 2026 | Vitamin C 1000mg | Vitamins | 12 | $8.50 | $102.00 |
| Jan 3, 2026 | Cough Syrup 100ml | Cold & Flu | 8 | $7.50 | $60.00 |
| Jan 4, 2026 | Amoxicillin 250mg | Antibiotics | 5 | $12.00 | $60.00 |
| Jan 5, 2026 | Paracetamol 500mg | Pain Relief | 20 | $5.50 | $110.00 |
Each record is embedded using nomic-embed-text and stored alongside the original data in MariaDB. When you query rag.py, your question is embedded the same way and MariaDB returns the closest matching records, which are then passed to the Ollama LLM to produce a natural language answer grounded in the actual data.
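How the sample rows are serialized before embedding is not documented here. One plausible approach, shown below as an assumption rather than as the method rag.py uses, is to render each record as a short sentence so the embedding model sees the medicine name, category, and figures together. The `Sale` class and `sale_to_text` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sale:
    date: str
    medicine: str
    category: str
    quantity: int
    unit_price: float

    @property
    def total(self) -> float:
        """Total sales for the row: quantity sold times unit price."""
        return self.quantity * self.unit_price


def sale_to_text(sale: Sale) -> str:
    """Render one sales record as a sentence suitable for embedding."""
    return (
        f"On {sale.date}, {sale.quantity} units of {sale.medicine} "
        f"({sale.category}) were sold at ${sale.unit_price:.2f} each, "
        f"for total sales of ${sale.total:.2f}."
    )


first_row = Sale("Jan 1, 2026", "Paracetamol 500mg", "Pain Relief", 10, 5.50)
text = sale_to_text(first_row)
```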
Once connected to MariaDB, create a dedicated database for your vector data:
CREATE DATABASE vectordb;
USE vectordb;
Create a table with a vector column to store your embeddings. The example below uses 768 dimensions, matching the nomic-embed-text model used in this instance:
CREATE TABLE embeddings (
  id INT AUTO_INCREMENT PRIMARY KEY,
  content TEXT NOT NULL,
  source VARCHAR(255),
  created_at DATETIME DEFAULT NOW(),
  embedding VECTOR(768) NOT NULL,
  VECTOR INDEX (embedding)
);
To insert a document along with its vector embedding (generated by your pipeline), use:
INSERT INTO embeddings (content, source, embedding)
VALUES (
  'Your document text goes here',
  'source-identifier',
  VEC_FromText('[0.012, -0.045, 0.331, ...]')
);
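VEC_FromText takes the vector as a JSON-style array string. When inserting from Python, the embedding list therefore has to be serialized into that form, and is best passed as a bound parameter rather than spliced into the SQL text. The helper below is a sketch: `to_vec_text` and `insert_params` are hypothetical names, the 768-dimension check mirrors the table definition above, and the `?` placeholder style assumes the MariaDB Connector/Python driver (other drivers, such as PyMySQL, use `%s`).

```python
import json
import math
from typing import Sequence, Tuple

EXPECTED_DIM = 768  # must match the VECTOR(768) column

INSERT_SQL = (
    "INSERT INTO embeddings (content, source, embedding) "
    "VALUES (?, ?, VEC_FromText(?))"
)


def to_vec_text(embedding: Sequence[float], dim: int = EXPECTED_DIM) -> str:
    """Serialize an embedding as the '[x, y, ...]' text VEC_FromText expects."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    values = [float(x) for x in embedding]
    if not all(math.isfinite(x) for x in values):
        raise ValueError("embedding contains NaN or infinite values")
    return json.dumps(values)


def insert_params(content: str, source: str, embedding: Sequence[float]) -> Tuple[str, str, str]:
    """Build the parameter tuple that pairs with INSERT_SQL."""
    return (content, source, to_vec_text(embedding))
```

With an open cursor, the insert would then be `cursor.execute(INSERT_SQL, insert_params(text, "source-identifier", vector))`.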
To retrieve the most semantically similar documents to a query embedding:
SELECT id, content, source,
  VEC_DISTANCE_EUCLIDEAN(embedding, VEC_FromText('[0.012, -0.045, ...]')) AS distance
FROM embeddings
ORDER BY distance ASC
LIMIT 5;
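To make the ranking concrete: VEC_DISTANCE_EUCLIDEAN returns the straight-line (L2) distance between two vectors, and ORDER BY distance ASC with LIMIT keeps the nearest rows. The pure-Python sketch below reproduces that ordering on tiny two-dimensional toy vectors. The data is invented for illustration, and `nearest` is a hypothetical helper; in practice MariaDB does this work, and an index-backed search may be approximate rather than exhaustive.

```python
import math
from typing import List, Sequence, Tuple


def nearest(
    query: Sequence[float],
    rows: Sequence[Tuple[int, Sequence[float]]],
    limit: int = 5,
) -> List[Tuple[int, float]]:
    """Return (id, distance) pairs for the rows closest to the query vector."""
    scored = [(math.dist(query, vector), row_id) for row_id, vector in rows]
    scored.sort()  # ascending distance, like ORDER BY distance ASC
    return [(row_id, distance) for distance, row_id in scored[:limit]]


# Toy data: three rows with 2-dimensional "embeddings".
rows = [(1, [0.0, 0.0]), (2, [3.0, 4.0]), (3, [1.0, 0.0])]
result = nearest([0.0, 0.0], rows, limit=2)
# result == [(1, 0.0), (3, 1.0)]
```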
Our other popular offering is the AWS Cost Optimizer (also known as CloudInsider), available on AWS Marketplace. This service has helped our customers save significantly on AWS and other cloud spending. It is easy to subscribe, and you can see the savings in minutes.
Please contact us at support@theglobalsolutions.net for any questions on this offering in AWS Marketplace.