Luna-2

Real-time guardrails without the big-time bill

Enable low-cost production monitoring and real-time guardrailing for every AI system without the GPT-sized bill.

*Speak with sales to enable these metrics for you.
Luna metrics are only available to customers upon request.

Metrics

Enable production workflows

Enable simultaneous evaluation of multiple metrics with low latency and reduced costs, ideal for real-time, high-scale deployments.
Luna’s fine-tuned SLMs deliver millisecond-level verdicts at pennies per million tokens, which makes them a good fit for always-on evaluation pipelines.
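
To illustrate why running several metrics at once keeps latency low, here is a minimal client-side sketch that fans out metric checks concurrently. The `score_metric` function, the metric names, and the simulated latency are all hypothetical stand-ins; this is not Galileo's API.

```python
import asyncio
import time

# Hypothetical stand-in for a metric scorer. A real deployment would call
# an evaluation service here instead of sleeping.
async def score_metric(name: str, text: str, latency_s: float = 0.05) -> tuple[str, float]:
    await asyncio.sleep(latency_s)  # simulate network and inference latency
    return name, 0.0  # placeholder score, not a real metric computation

async def evaluate_all(text: str, metrics: list[str]) -> dict[str, float]:
    # Fan out every metric at once and wait for all of them together, so
    # total wall-clock time tracks the slowest check, not the sum of all.
    results = await asyncio.gather(*(score_metric(m, text) for m in metrics))
    return dict(results)

if __name__ == "__main__":
    start = time.perf_counter()
    scores = asyncio.run(
        evaluate_all("some model output", ["pii_leak", "prompt_injection", "tool_error_rate"])
    )
    print(scores, f"{time.perf_counter() - start:.2f}s elapsed")
```

Because the checks overlap, adding more metrics in this sketch raises total latency far less than running them one after another would.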

Model                 Cost per 1M tokens  Accuracy  Latency (Avg)  Max tokens
--------------------  ------------------  --------  -------------  ----------
Luna-2                $0.02               0.88      152 ms         128k
GPT 4o                $5.00               0.94      3200 ms        128k
GPT 4o mini           $0.02               0.90      2600 ms        128k
Azure Content Safety  $1.52               0.62      312 ms         3k
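
As a rough worked example, the comparison figures above can be turned into a monthly bill and a relative-cost estimate. The workload of 10 billion tokens per month is an assumption chosen for illustration; the per-token prices and latencies are copied from the comparison.

```python
# Prices and average latencies copied from the comparison figures above.
MODELS = {
    "Luna-2": {"cost_per_1m": 0.02, "latency_ms": 152},
    "GPT 4o": {"cost_per_1m": 5.00, "latency_ms": 3200},
    "Azure Content Safety": {"cost_per_1m": 1.52, "latency_ms": 312},
}

TOKENS_PER_MONTH = 10_000_000_000  # assumed workload, not from the source

def monthly_cost(cost_per_1m: float, tokens_per_month: int) -> float:
    """Cost in dollars for evaluating the given number of tokens."""
    return cost_per_1m * tokens_per_month / 1_000_000

baseline = MODELS["Luna-2"]
for name, m in MODELS.items():
    cost = monthly_cost(m["cost_per_1m"], TOKENS_PER_MONTH)
    cost_ratio = m["cost_per_1m"] / baseline["cost_per_1m"]
    latency_ratio = m["latency_ms"] / baseline["latency_ms"]
    print(f"{name}: ${cost:,.2f}/month, "
          f"{cost_ratio:.0f}x cost, {latency_ratio:.1f}x latency vs Luna-2")
```

Under these assumptions, GPT 4o comes out at 250x the cost and roughly 21x the average latency of Luna-2.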

Comparison

Power agentic workflows

Luna models can power custom LLM metrics tailored to your application, for any production use case.

Compared across Galileo Luna-2, Azure Content Safety, and NVIDIA NeMo:

Category  Metric
--------  ------
Agentic   Tool Error Rate: Detects whether the tool executed successfully (i.e. without errors).
Agentic   Tool Selection Quality: Detects whether the agent selected the appropriate tool.
Agentic   Action Advancement: Detects whether the user successfully accomplished or advanced towards their goal.
Agentic   Action Completion: Detects whether the user successfully accomplished all of their goals.
Safety    PII Leak
Safety    Sexism
Safety    Bias
Safety    Prompt Injection

Luna in action

Open up new use cases

Guardrail your agentic workflows so they behave like reliable partners. Luna catches risky agent actions before tools execute, something safety-only APIs miss.

Number Port Request

Telco Customer Support Chat

User: Please port my dad’s number 998.877.6655 to this bank-supported service.

Agent: Sure, I’ll port the number now.

Tool called: port number tool (number="9988776655")

Galileo guardrail triggered: Tool selection quality - Cross-user action without consent or authority

Tool called: verify_user_identity

Updating Response...

Agent: For security, please share your four-digit PIN or the last four digits of your SSN.

Transaction Over Limit

Financial Services Support Chat

User: Transfer $10,000,000 to my account #10725432.

Agent: Okay, initiating the transfer now.

Tool called: initiate transfer tool (amount=10000000, account="10725432")

Galileo guardrail triggered: Agent Flow - Does not follow the pre-defined allowed agent path

Tool called: check_transfer_limit

Updating Response...

Agent: Transfer amount exceeds policy limit. Contact your bank to initiate transfer.
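
The transfer scenario above can be sketched as a guard that runs before any tool executes. This is a minimal illustration, not Galileo's implementation: the `ToolCall` type, the `REQUIRED_BEFORE` path table, the snake_case tool names, and the 10,000 limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

TRANSFER_LIMIT = 10_000  # assumed policy limit, not from the source

@dataclass
class ToolCall:
    name: str
    args: dict

def check_transfer_limit(amount: int, account: str) -> bool:
    """Hypothetical policy tool: True if the amount is within the limit."""
    return amount <= TRANSFER_LIMIT

def initiate_transfer(amount: int, account: str) -> str:
    """Hypothetical action tool; a real one would move money."""
    return f"Transferred ${amount:,} to account {account}"

TOOLS = {"initiate_transfer": initiate_transfer}

# Allowed agent path: each key may only run after its prerequisite has run.
REQUIRED_BEFORE = {"initiate_transfer": "check_transfer_limit"}

def missing_prerequisite(call: ToolCall, history: list) -> Optional[str]:
    """Return the prerequisite tool that has not run yet, or None."""
    required = REQUIRED_BEFORE.get(call.name)
    if required is not None and required not in history:
        return required
    return None

def guarded_execute(call: ToolCall, history: list) -> str:
    # Guardrail: if the agent skipped the limit check, run it first and
    # stop before the requested tool executes when the check fails.
    if missing_prerequisite(call, history) == "check_transfer_limit":
        history.append("check_transfer_limit")
        if not check_transfer_limit(**call.args):
            return "Transfer amount exceeds policy limit. Contact your bank to initiate transfer."
    history.append(call.name)
    return TOOLS[call.name](**call.args)

if __name__ == "__main__":
    call = ToolCall("initiate_transfer", {"amount": 10_000_000, "account": "10725432"})
    print(guarded_execute(call, history=[]))
```

The key design point is that the check sits between the agent's decision and the tool's side effect, so a blocked call never reaches the transfer function.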

Fine-tuned

Inference optimized across the stack

Adapters on a shared core.

Lightweight adapters let one base model scale to hundreds of metrics with minimal infra overhead.

Millisecond multi-metric scoring.

Even when running 10–20 checks at once, Luna stays under 200 ms on L4 GPUs.

Proprietary engine.

Hosted on Galileo’s optimized inference layer for low-cost, low-latency evaluations at massive scale.
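
The "adapters on a shared core" idea can be illustrated with a toy sketch: one expensive shared computation runs once per input, and each metric adds only a small head on top. Everything here is hypothetical (the feature function, the adapter names, and the weights); it shows the general pattern, not how Luna is actually built.

```python
import math

def shared_core(text: str) -> list[float]:
    """Toy stand-in for the shared base model's (expensive) forward pass."""
    n = max(len(text), 1)
    return [
        sum(c.isdigit() for c in text) / n,   # fraction of digits
        sum(c.isupper() for c in text) / n,   # fraction of uppercase letters
        min(len(text) / 100.0, 1.0),          # normalized length
    ]

# Each hypothetical adapter is just a small weight vector plus a bias.
# Adding a metric means adding one entry here, not another base model.
ADAPTERS = {
    "digit_heavy": ([8.0, 0.0, 0.0], -1.0),
    "shouting": ([0.0, 6.0, 0.0], -2.0),
}

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def score_all(text: str) -> dict[str, float]:
    features = shared_core(text)  # computed once, reused by every adapter
    return {
        name: sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)
        for name, (weights, bias) in ADAPTERS.items()
    }

if __name__ == "__main__":
    print(score_all("Call me at 5551234567"))
```

Each adapter yields a score strictly between 0 and 1, and the per-metric cost is a handful of multiplications on top of the single shared pass.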

[Architecture diagram. Components: User, Application, GalileoLogger, LLM, Scorer, Fine-Tuned SLM, Galileo Inference Engine. Flows: Query / Prompt, Completion. Output: Final score (0, 1).]