
What AI Model Does OurDream.ai & Candy AI Use? (2026 Technical Analysis)

An informed analysis of the AI models OurDream and Candy AI appear to be running: LLMs, image generation, voice synthesis, and memory architecture. It also covers what founders building competing platforms can learn from the inferred tech stacks.

Two of the most-searched questions among founders evaluating the AI companion category are "what AI model does OurDream.ai use?" and "what AI model does Candy AI use?" The questions are usually asked for one of two reasons — either someone is trying to evaluate the platform's technical depth, or they are planning to build something competitive and want to know what they are up against.

Neither platform publishes its exact model stack. Both treat the underlying AI as proprietary infrastructure rather than a marketing point. But based on hands-on testing, output pattern analysis, public job postings, and benchmarks against known models, it is possible to construct a credible analysis of what both platforms appear to be running. This guide breaks down what we have inferred about each platform's likely tech stack and what that means for founders building in the same space.

At NSFW Coders we have built full clones of Candy AI and of platforms in OurDream's category. The technical observations below come from that hands-on work plus extensive analysis of public outputs.

Important Disclaimer

Everything in this guide is informed inference, not confirmed disclosure. Neither Candy AI (EverAI Limited) nor OurDream AI publishes its model stack. The analysis is based on output behaviour, response patterns, latency benchmarks, public information about the broader AI ecosystem, and pattern-matching against models we have used directly. Treat the conclusions below as well-reasoned hypotheses rather than verified facts.

Candy AI's Likely Tech Stack

Candy AI presents the most polished AI companion experience in the category. The output patterns suggest a multi-model orchestration approach rather than a single backbone model.

Conversational LLM

Candy AI's chat responses are too unrestricted to be running on mainstream commercial LLMs (GPT-4, Claude, or Gemini), which all have content policies that would block the kind of explicit roleplay Candy AI handles natively. The response style also does not match the recognisable patterns of those mainstream models.

The most likely setup is a fine-tuned open-source base model. Strong candidates based on output behaviour:

  • Mistral / Mixtral fine-tunes. Mistral's open-source models have become a popular base for adult AI platforms because of their permissive licensing and reasonable inference cost. The conversational flow and response length on Candy AI match Mistral fine-tune patterns reasonably well.
  • Llama-derived fine-tunes. Llama 3 and its descendants are similarly popular bases. The model would be fine-tuned on adult conversation datasets with personality conditioning layered on top.
  • DeepSeek variants. DeepSeek's strong context handling and competitive pricing on inference make it a serious candidate for the platform's memory-aware conversation system.

Realistically, Candy AI may be running multiple LLMs in parallel and routing requests based on context — a "premium" model for paid users, a smaller/cheaper model for free-tier chat, a specialised model for NSFW-heavy roleplay sessions.
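The routing idea above can be sketched in a few lines. This is a minimal illustration, not Candy AI's actual logic: the model names, the `ChatRequest` fields, and the rule order (roleplay check first, then tier) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; the platform's real model names are not public.
PREMIUM_MODEL = "large-finetune"
FREE_MODEL = "small-finetune"
ROLEPLAY_MODEL = "roleplay-finetune"


@dataclass
class ChatRequest:
    user_tier: str       # "free" or "paid" (assumed tiers)
    is_roleplay: bool    # assumed flag set by an upstream classifier


def route_model(req: ChatRequest) -> str:
    """Pick a model name using simple, ordered rules."""
    # Assumption: roleplay sessions go to the specialised model regardless of tier.
    if req.is_roleplay:
        return ROLEPLAY_MODEL
    if req.user_tier == "paid":
        return PREMIUM_MODEL
    return FREE_MODEL
```

In a production system the rule table would more plausibly live in configuration than in code, so that models can be swapped without a deploy.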

Image Generation

Candy AI runs two distinct image engines — the standard generator and the V2 engine available on select premium characters. The V2 engine produces noticeably sharper outputs with better face consistency and lighting.

The likely architecture:

  • Standard engine: SD 1.5 or SDXL with custom fine-tunes, likely combined with LoRA per persona for character consistency.
  • V2 engine: SDXL or SDXL-derivative (Juggernaut-XL or a proprietary fine-tune) with ControlNet conditioning for pose control and dedicated LoRAs for premium character consistency.
  • Story mode (2026 update): Image generation conditioned on chat context — likely a prompt-augmentation pipeline that extracts scene information from recent messages and feeds it into the SDXL pipeline as additional context.
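A prompt-augmentation step of the kind described for story mode might look roughly like the sketch below. The keyword table, the function names, and the five-message window are assumptions for illustration; a real pipeline would more likely use an LLM or a classifier to extract scene details rather than keyword matching.

```python
import re

# Hypothetical keyword-to-scene mapping, used here only as a stand-in extractor.
SCENE_KEYWORDS = {
    "beach": "on a sunny beach",
    "cafe": "inside a cosy cafe",
    "rain": "in the rain at night",
}


def extract_scene(messages, window=5):
    """Return scene phrases found in the last `window` messages, newest first."""
    found = []
    for text in reversed(messages[-window:]):
        for word in re.findall(r"[a-z]+", text.lower()):
            phrase = SCENE_KEYWORDS.get(word)
            if phrase and phrase not in found:
                found.append(phrase)
    return found


def build_image_prompt(persona_prompt, messages):
    """Append extracted scene phrases to the persona's base prompt."""
    return ", ".join([persona_prompt] + extract_scene(messages))
```

The resulting string would then be passed to the image pipeline together with whatever persona LoRA the character uses.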

Voice and Audio

Candy AI's voice messages and real-time calls support multiple voice options per character. The voice quality on the 2026 version is noticeably improved over earlier years, with breathing patterns and emotional inflection.

The likely stack:

  • TTS: ElevenLabs or a high-quality alternative like XTTS for character voices. The voice quality and emotional range suggest commercial-grade TTS rather than basic open-source options.
  • Real-time voice calls: Streaming TTS plus speech-to-text (likely Deepgram or Whisper) with the LLM in the middle generating responses optimised for spoken delivery.
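The call loop described above (speech-to-text, then the LLM, then TTS) can be sketched with the three services passed in as plain callables. Nothing here uses a real vendor API: `transcribe`, `generate_reply`, and `synthesize` are hypothetical stand-ins, and sentence-level chunking is one assumed way to let audio playback start before the whole reply has been synthesised.

```python
import re
from typing import Callable, Iterator


def sentence_chunks(text: str) -> Iterator[str]:
    """Yield sentences one at a time so TTS can start on the first one early."""
    for part in re.split(r"(?<=[.!?])\s+", text.strip()):
        if part:
            yield part


def handle_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],      # hypothetical STT wrapper
    generate_reply: Callable[[str], str],    # hypothetical LLM wrapper
    synthesize: Callable[[str], bytes],      # hypothetical TTS wrapper
) -> Iterator[bytes]:
    """Run one conversational turn and yield audio chunks sentence by sentence."""
    user_text = transcribe(audio)
    reply = generate_reply(user_text)
    for sentence in sentence_chunks(reply):
        yield synthesize(sentence)
```

A real deployment would also stream tokens out of the LLM rather than waiting for the full reply, which this sketch leaves out for brevity.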

Memory System

Candy AI's adaptive memory persists for 90+ days on paid tiers, recalling specific details across sessions. This pattern strongly suggests a vector-database-backed memory layer combined with structured fact extraction.

The likely architecture: PostgreSQL or similar for relational data, a vector database (Pinecone, Weaviate, or pgvector) for conversational memory retrieval, plus a summarisation layer that periodically condenses older conversations into compact memory records.
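To make the retrieval half of that architecture concrete, here is a toy in-memory version. It is a sketch under loud assumptions: the bag-of-words `embed` function stands in for a real embedding model, and the `MemoryStore` class stands in for a vector database such as pgvector. Neither reflects how Candy AI actually implements memory.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class MemoryStore:
    """Hypothetical stand-in for a vector database holding extracted facts."""

    def __init__(self):
        self.records = []  # list of (text, vector) pairs

    def add(self, text):
        self.records.append((text, embed(text)))

    def recall(self, query, k=2):
        """Return the k stored facts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

The recalled facts would be prepended to the LLM prompt for the next turn; the summarisation layer would periodically replace many old records with a few condensed ones.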

OurDream AI's Likely Tech Stack

OurDream's product surface is similar to Candy AI's in some ways and different in others. The feed-first design and image-centric experience suggest different infrastructure priorities.

Conversational LLM

OurDream's chat quality is competent but not standout. Response patterns suggest a fine-tuned open-source model in the same family Candy AI likely uses — Mistral, Llama, or DeepSeek fine-tunes are all credible candidates.

OurDream's chat system appears to have a lighter memory architecture than Candy AI's, suggesting either a smaller vector store or shorter retention windows. This matches the platform's overall product priority: the feed and visual content do more of the engagement work than chat depth.

Image Generation

This is where OurDream invests most heavily. The volume of image content the platform serves (both in chat and in persona feeds) is significantly higher than chat-first competitors, which implies a robust image generation infrastructure.

The likely architecture:

  • Base models: SDXL plus fine-tunes for realistic outputs. Pony Diffusion V6 for anime/stylised characters. LoRA per persona for character consistency.
  • Feed scheduling: A worker queue (Celery or similar) generating persona content in the background on a schedule. This is what keeps each persona's feed fresh between user interactions.
  • Routing logic: Different models for different request types — fast generation for feed posts that are seen many times, slower premium generation for chat-driven image requests.
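The queueing and routing described in the list above could be sketched as a single priority queue. This is an illustration only: the priority values, the "fast" and "premium" pipeline labels, and the choice to serve chat-driven requests ahead of scheduled feed posts are assumptions, and a production system would use a distributed worker queue rather than an in-process heap.

```python
import heapq
import itertools

# Lower number is served first. Assumption: a user waiting in chat outranks
# a background feed post.
CHAT_PRIORITY = 0
FEED_PRIORITY = 1


class ImageJobQueue:
    """In-process stand-in for a worker queue feeding GPU workers."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order per priority

    def submit(self, persona, kind):
        """Queue a job; `kind` is "chat" or "feed" (assumed request types)."""
        priority = CHAT_PRIORITY if kind == "chat" else FEED_PRIORITY
        pipeline = "premium" if kind == "chat" else "fast"
        heapq.heappush(self._heap, (priority, next(self._counter), persona, pipeline))

    def next_job(self):
        """Pop the highest-priority job as (persona, pipeline), or None if empty."""
        if not self._heap:
            return None
        _, _, persona, pipeline = heapq.heappop(self._heap)
        return persona, pipeline
```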

Feed Engine

The feed-first UX is the platform's defining technical investment. The architecture likely includes:

  • A scheduling system that triggers new posts per persona on a configurable cadence
  • Image generation queues prioritised for scheduled feed posts vs on-demand chat images
  • A follower graph linking users to the personas they follow
  • Feed rendering optimised for mobile-first browsing despite the browser-only distribution
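Two of those pieces, the posting cadence and the follower graph, are simple enough to sketch directly. The data shapes below (a dict of last-post times, a dict mapping users to followed personas, and posts as dicts with `persona` and `created` keys) are assumptions chosen for the example, not OurDream's schema.

```python
from datetime import datetime, timedelta


def due_personas(last_posted, cadence, now):
    """Return personas whose most recent post is at least `cadence` old."""
    return sorted(p for p, t in last_posted.items() if now - t >= cadence)


def build_feed(user, follows, posts):
    """Return posts from personas the user follows, newest first."""
    followed = follows.get(user, set())
    return sorted(
        (p for p in posts if p["persona"] in followed),
        key=lambda p: p["created"],
        reverse=True,
    )
```

A scheduler would call `due_personas` on a timer and enqueue one image generation job per returned persona.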

Why Neither Platform Uses GPT-4 or Claude

The single biggest technical constraint for adult AI platforms is content policy. GPT-4, Claude, and Gemini are all governed by usage policies under which they refuse explicit adult content. Building a platform on those models means users constantly hit refusals, which destroys the conversational continuity that drives retention.

Open-source models without those restrictions are the only viable choice. The trade-off is that open-source LLMs typically require more fine-tuning to reach commercial output quality, and operators take on the burden of model hosting and inference optimisation.

The economics also favour open-source at platform scale. Per-message inference on GPT-4 would cost adult AI platforms many times what a self-hosted open-source model costs. With user volumes in the hundreds of thousands, that compounds into significant margin difference.
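The shape of that comparison can be written down as two small functions. No real prices are used here: every input is a placeholder the reader must supply, and the model deliberately ignores engineering time, idle GPU capacity, and storage, so treat it as a way to structure the estimate rather than as a result.

```python
import math


def api_monthly_cost(messages, cost_per_message):
    """Cost of a hosted API billed per message (price is a caller-supplied input)."""
    return messages * cost_per_message


def self_hosted_monthly_cost(messages, gpu_hour_cost, messages_per_gpu_hour):
    """Cost of self-hosting, assuming GPUs are rented by the whole hour."""
    gpu_hours = math.ceil(messages / messages_per_gpu_hour)
    return gpu_hours * gpu_hour_cost


def cost_ratio(messages, cost_per_message, gpu_hour_cost, messages_per_gpu_hour):
    """How many times more the API route costs than self-hosting."""
    hosted = self_hosted_monthly_cost(messages, gpu_hour_cost, messages_per_gpu_hour)
    return api_monthly_cost(messages, cost_per_message) / hosted
```

Plugging in measured throughput and current prices for a specific deployment is what turns "many times" into an actual multiple.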

The Role of Open-Source Models

The adult AI category is one of the strongest beneficiaries of the open-source LLM movement. Models like Mistral, Llama, DeepSeek, and their fine-tuned derivatives provide a foundation that mainstream closed models cannot — both because of content restrictions and because of cost-per-inference economics.

For image generation, Stable Diffusion's open-source ecosystem is even more dominant. SDXL, Pony Diffusion, custom community checkpoints, and the LoRA fine-tuning ecosystem give adult AI operators a level of control and customisation that closed alternatives like DALL-E or Imagen do not offer at any price.

The result is that the adult AI category is, as far as can be inferred from the outside, built on open-source infrastructure. Candy AI, OurDream, DreamGF, and the other serious platforms in the space all appear to run substantially on open-source LLMs and Stable Diffusion. The differentiation between platforms lies in fine-tuning quality, orchestration logic, and product surface, not in unique access to mainstream commercial models.

What This Means for Founders Building Competitors

Three practical takeaways for founders evaluating the build path.

The model layer is not where you compete. Every serious platform in the category uses the same open-source bases (Mistral/Llama for chat, SDXL/Pony for images). The differentiation is in fine-tuning, orchestration, and product. Spending budget trying to find a "secret model" is a waste — the published open-source options are exactly what your competitors use.

Fine-tuning depth is the real moat. The platforms that ship best are the ones with the most thoughtful fine-tuning strategy: LoRAs per persona, custom checkpoints for specific styles, and prompt-engineering pipelines that extract the most from base models. This work is engineering-heavy and not visible from the outside, which is why outside observers tend to underestimate how much work goes into a production platform.

Infrastructure orchestration matters more than model selection. Routing different request types to different models, optimising GPU utilisation, scaling inference horizontally, building moderation pipelines that operate at scale — these are the technical decisions that determine whether a platform survives growth. Model selection is a starting point; orchestration is the long-term competitive advantage.

FAQs

Are Candy AI and OurDream using GPT-4 or Claude under the hood?

Almost certainly not. Both platforms support content that GPT-4 and Claude refuse by policy. The output behaviour also does not match those models. Fine-tuned open-source models (Mistral, Llama, DeepSeek family) are the most likely choice.

Why don't these platforms reveal their model stack?

Three reasons. First, competitive: model selection and fine-tuning approach are among the few real moats. Second, legal: disclosing exact model use can create regulatory exposure or licensing complications. Third, flexibility: keeping the stack proprietary lets them swap components without public commentary.

Could I build a competing platform on the same models?

Yes. The models are open source and publicly available. The competitive question is whether you can match their fine-tuning depth, orchestration sophistication, and product polish. That work takes 30 to 90 days for a serious build and is what NSFW Coders does for our clients.

What's the cheapest viable LLM for an adult AI platform?

Self-hosted Mistral 7B or Llama 3 8B variants are the cheapest viable options. They run on a single GPU with reasonable inference cost. Quality is below the larger models but acceptable for early-stage MVPs.
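The single-GPU claim follows from simple arithmetic on weight size: parameters multiplied by bytes per parameter. The helper below is a rough sketch that counts model weights only; it ignores the KV cache, activations, and framework overhead, so real memory use is higher than the figure it returns.

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    # billions of parameters * bytes per parameter = billions of bytes = GB
    return params_billion * bytes_per_param


# 16-bit weights use 2 bytes per parameter; 4-bit quantisation uses 0.5.
fp16_7b = weight_memory_gb(7, 2)      # 14 GB
fp16_8b = weight_memory_gb(8, 2)      # 16 GB
int4_7b = weight_memory_gb(7, 0.5)    # 3.5 GB
```

On that basis a 7B or 8B model at 16-bit precision fits on a single 24 GB card with room left for context, and a 4-bit quantised version fits on considerably smaller hardware.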

How important is image quality compared to chat quality?

It depends on the product. Feed-first products like OurDream rely on image quality as the primary engagement driver. Chat-first products like Candy AI need both to be strong. The right balance depends on which UX pattern your platform follows.

Conclusion

The exact AI model stack behind Candy AI and OurDream remains proprietary, but the broad architecture is legible to anyone with experience building in the category. Open-source LLMs (Mistral, Llama, DeepSeek family) handle chat. Stable Diffusion variants (SDXL, Pony, fine-tunes) handle images. Commercial TTS handles voice. Vector databases handle memory. The differentiation between platforms lives in fine-tuning quality, orchestration logic, and product design — not in unique access to secret models.

For founders building competing platforms, this is actually good news. The technical stack is accessible. The work that separates winning platforms from forgettable ones is engineering discipline and product taste, not access to closed model APIs.

If you are scoping a competing platform and want a clear technical recommendation on which models to deploy and how to orchestrate them, a 30-minute discovery call with our team gives us enough to map your specific stack.
