Setting Up the Test Environment
A GPU-accelerated embedding engine, Redis with vector search, and a live LLM connection — production-ready infrastructure from scratch.
In the previous part I shared my test environment and hypotheses. In this part we put theory into practice: install every component from scratch and verify all connections work.
By the end you'll have a GPU-accelerated embedding engine, a Redis instance capable of vector search, and a real LLM connection.
Installation Order
Order matters. Core tools first, then the database, finally the Python environment. This avoids dependency conflicts.
- 01Homebrewpackage manager
- 02Python 3.11+runtime
- 03Docker Desktopcontainerization
- 04Redis Stackcache + vector DB
- 05Python venvisolation
- 06Librariespip install
- 07Connection Testverification
Open Terminal (Cmd + Space → "Terminal") and check whether it's already installed:
$ brew --versionIf not installed:
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"First check the current version:
$ python3 --versionIf you have Python 3.10 or above, continue. Otherwise:
$ brew install python@3.11We'll run Redis Stack in a container. Check if Docker is installed:
$ docker --versionIf not, download the Apple Silicon version from Docker's official site, open the .dmg and drag it into Applications. Verify:
$ docker run hello-worldRedis Stack adds the RediSearch and RedisJSON modules on top of classic Redis. Both key-value caching and vector similarity search run from a single container.
$ docker run -d \
--name redis-stack \
-p 6379:6379 -p 8001:8001 \
-v redis-data:/data \
redis/redis-stack:latestPort 6379 is where our Python script will connect. Port 8001 is the RedisInsight UI — visit http://localhost:8001 to watch cache entries fill up live.
Verify the container is running:
$ docker exec -it redis-stack redis-cli pingIf you see "PONG", Redis is ready. Useful commands:
$ docker stop redis-stack
$ docker start redis-stack
$ docker logs redis-stack -fCreate the project folder and activate the venv:
$ mkdir ~/semantic-cache-lab && cd ~/semantic-cache-lab
$ python3 -m venv venv
$ source venv/bin/activateWith venv active, create a requirements.txt file:
openai>=1.0.0
sentence-transformers>=2.7.0
torch>=2.0.0
redis>=5.0.0
tiktoken>=0.7.0
pypdf>=4.0.0
langchain>=0.2.0
langchain-community>=0.2.0
pandas>=2.0.0
matplotlib>=3.8.0
python-dotenv>=1.0.0
tqdm>=4.0.0Then install:
$ pip install -r requirements.txtsentence-transformers and torch are large packages — 5-15 minutes depending on your connection. On M-series Macs, MPS support is auto-detected.
Verify MPS is active:
$ python3 -c "import torch; print('MPS:', torch.backends.mps.is_available())"First create the .env file (get your OpenRouter API key from openrouter.ai/keys):
OPENROUTER_API_KEY=sk-or-v1-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
LLM_MODEL=google/gemini-2.0-flash-lite-001
REDIS_HOST=localhost
REDIS_PORT=6379.env
venv/
__pycache__/
data/
results/Then run the test script.
================================================== CONNECTION TEST ================================================== [1/4] Redis connection... ✓ Redis: OK [2/4] Loading embedding model... ✓ Embedding: OK (dim=384, device=mps) [3/4] OpenRouter LLM connection... ✓ OpenRouter: OK Response: Yes! Token usage: 27 [4/4] Token counting... ✓ tiktoken: OK "This is a test sentence." = 6 tokens ================================================== ALL CONNECTIONS SUCCESSFUL ==================================================
Part 4: Building the RAG Pipeline
Infrastructure is ready. In Part 4 we chunk the WEF report, load it into Redis as vectors, and run our first semantic searches.