Secrets AI Video Generator: How It Works, Quality, and Cost
Video generation from AI companion images is Secrets AI's clearest competitive differentiator. Character.AI doesn't have it. CrushOn AI doesn't have it. Janitor AI doesn't have it. Candy AI has a limited version. Secrets AI has full video generation from a text prompt applied to companion images — and at a 4.1/5 quality rating from independent reviewers, the output is genuinely usable, not a gimmick.
This matters most if you're evaluating Secrets AI specifically because of its visual capabilities, or if you're on an existing plan and trying to understand whether video generation is worth your Moments budget.
For a broader assessment of the platform including all features beyond video, see the complete Secrets AI review.
What Is the Secrets AI Video Generator?
The video generator is a feature that converts existing AI companion images into short animated video clips using a text prompt. You select an image of your companion, describe the motion or scene you want, and the system generates a clip approximately 2 minutes later.
Available on: Lite tier and above. The free tier does not include video generation regardless of Moments balance.
Why it's distinctive: As of 2026, video generation in the AI companion space is genuinely rare. The major platforms (Character.AI, CrushOn AI, Janitor AI) do not offer this feature. Candy AI offers limited video. Secrets AI's implementation is among the most complete in the consumer AI companion market, alongside niche platforms like SweetDream AI and Xotic AI (which offers 4K 15-second clips).
For users who want both conversation depth and visual companion content in one platform, video generation is the feature that most clearly separates Secrets AI from the mainstream alternatives.
How Video Generation Works
The process runs in four steps:
Step 1: Generate or select an existing companion image. You need a source image to animate — you can use one of the 4 auto-generated images from character creation, or generate a new image (25–50 Moments) specifically for video conversion.
Step 2: Add a text prompt describing the desired movement or action. Prompts can describe physical motion ("walking slowly toward the camera," "laughing and tucking hair behind ear"), emotional expression, environmental context, or scenario-specific actions. The more specific the prompt, the more reliably the output reflects the intention.
Step 3: The AI processes the request. Generation takes approximately 2 minutes on average. The system analyzes the source image, interprets the motion prompt, and renders the clip using deep learning-based video synthesis.
Step 4: View and save the completed video clip. Output is delivered directly in the chat interface, and clips can be saved to your device.
Context-awareness: Generated videos reflect the character's established appearance from the source image and the scenario context from the ongoing conversation. A companion styled for a fantasy scenario will generate video consistent with that aesthetic — not a generic output that ignores the established character profile.
Video Quality Assessment
Independent reviewers rate Secrets AI video at 4.1/5 — described as "looks good and moves smoothly most of the time." That qualifier matters: quality is consistently good but not uniformly perfect. Here's the breakdown:
What works well:
- Realistic character movement — limb motion, body language, and weight feel natural
- Facial expressions that match the prompted emotion or action
- Consistent character appearance throughout the clip (the generated figure matches the source image)
- Scene-appropriate lighting and environment in most outputs
Where quality varies:
- Complex or ambiguous prompts produce less predictable outputs than specific, simple ones
- Occasional motion artifacts on rapid movements or unusual body positions
- Quality differences are visible between the base generation model and the Premium generation model — the latter produces noticeably better fine-detail rendering and smoother motion
Generation model matters: The Premium generation model (available on Premium and Ultimate tiers) produces better video quality than the standard model. If quality is a priority, this is a concrete argument for Premium over Plus.
How Much Do Videos Cost in Moments?
| Video Type | Moments Cost |
|---|---|
| Short clip (3 seconds) | ~50 Moments |
| Full/longer clip | ~600 Moments |
This is the most important cost fact to understand before generating videos. A full-length clip at 600 Moments represents 20% of a Plus user's monthly allocation (3,000 Moments) or 7.5% of a Premium user's allocation (8,000 Moments).
Monthly Video Budget by Tier
| Tier | Monthly Moments | Short clips (~50 ea) | Long clips (~600 ea) |
|---|---|---|---|
| Lite | 1,000 | ~20 | ~1 |
| Plus | 3,000 | ~60 | ~5 |
| Premium | 8,000 | ~160 | ~13 |
| Ultimate | 15,000 | ~300 | ~25 |
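The table's figures can be reproduced with integer division. The sketch below is illustrative only: the allocations and per-clip costs are the approximate values quoted in this article, and the function names are made up for the example. It assumes every Moment in the allocation goes to video.

```python
# Monthly allocations and approximate per-clip costs, as quoted in this article.
MONTHLY_MOMENTS = {"Lite": 1_000, "Plus": 3_000, "Premium": 8_000, "Ultimate": 15_000}
SHORT_CLIP_COST = 50   # approximate cost of a 3-second clip
LONG_CLIP_COST = 600   # approximate cost of a full clip


def max_clips(monthly_moments: int, cost_per_clip: int) -> int:
    """Whole clips affordable if every Moment goes to video."""
    return monthly_moments // cost_per_clip


def share_of_allocation(monthly_moments: int, cost: int) -> float:
    """Percentage of the monthly allocation consumed by one purchase."""
    return 100 * cost / monthly_moments


for tier, moments in MONTHLY_MOMENTS.items():
    short = max_clips(moments, SHORT_CLIP_COST)
    long_ = max_clips(moments, LONG_CLIP_COST)
    pct = share_of_allocation(moments, LONG_CLIP_COST)
    print(f"{tier:<9} {short:>3} short | {long_:>2} long | one long clip = {pct:.1f}% of budget")
```

Because the division rounds down, the long-clip column is a ceiling: Lite's 1,000 Moments covers one 600-Moment clip with 400 left over, not two.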
Practical interpretation:
- On Lite: video generation is possible but heavily constrained. By Moments alone, the budget covers one long clip (600 of 1,000 Moments) or ~20 short clips, which leaves little room for images and voice in the same month.
- On Plus: ~5 long clips is the ceiling and uses the entire 3,000-Moment allocation. Three long clips (1,800 Moments) leaves 1,200 for moderate image and voice use. This is the minimum tier for regular video generation without constant Moments anxiety.
- On Premium: ~13 long clips is the ceiling (7,800 of 8,000 Moments). Eight to ten long clips (4,800–6,000 Moments) leaves 2,000–3,200 for generous image and voice use. This is the recommended tier for users who treat video as a primary feature.
- On Ultimate: ~25 long clips is the ceiling and uses the full 15,000 Moments. Around 20 long clips (12,000 Moments) leaves 3,000 for other features. This tier suits heavy video creators and users who want maximum visual output.
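To see how video fits alongside other features, it helps to subtract a planned month of usage from the allocation. The following is a rough sketch: the costs are the approximate figures from this article (image cost uses the upper end of the 25–50 range), and the usage plan is a hypothetical example, not a recommendation from Secrets AI.

```python
# Approximate costs from this article; image cost uses the upper bound of 25-50.
COSTS = {"long_clip": 600, "short_clip": 50, "image": 50, "voice_minute": 100}


def remaining_moments(monthly_moments: int, plan: dict[str, int]) -> int:
    """Moments left after a planned month of usage (negative means over budget)."""
    spent = sum(COSTS[item] * count for item, count in plan.items())
    return monthly_moments - spent


# Hypothetical Plus month: 3 long clips, 10 images, 5 minutes of voice.
plan = {"long_clip": 3, "image": 10, "voice_minute": 5}
print(remaining_moments(3_000, plan))  # 3000 - (1800 + 500 + 500) = 200
```

Under these assumptions, a Plus user who generates the full five long clips has nothing left for images or voice, which is why the per-tier clip counts are best read as upper limits.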
Top-up option: Additional Moments are purchasable starting at $5.99 for 1,980 Moments. Premium subscribers receive a 10% bonus on top-ups; Ultimate subscribers receive 15%. For occasional extra video generation beyond the monthly allocation, top-ups are cost-effective.
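As a rough guide to what a top-up buys, the sketch below converts the base pack into long clips. It makes two assumptions that the pricing above does not spell out: that the subscriber bonus is added as extra Moments rather than taken off the price, and that it is rounded down to a whole Moment. The function names are illustrative.

```python
TOP_UP_PRICE_USD = 5.99
TOP_UP_MOMENTS = 1_980
LONG_CLIP_COST = 600


def top_up_moments(bonus_percent: int) -> int:
    """Moments received from the base pack, assuming the bonus adds Moments."""
    return TOP_UP_MOMENTS * (100 + bonus_percent) // 100


def usd_per_long_clip(bonus_percent: int) -> float:
    """Approximate dollar cost of one long clip bought entirely with top-up Moments."""
    return round(TOP_UP_PRICE_USD * LONG_CLIP_COST / top_up_moments(bonus_percent), 2)


for label, bonus in [("No bonus", 0), ("Premium", 10), ("Ultimate", 15)]:
    print(label, top_up_moments(bonus), usd_per_long_clip(bonus))
```

On these assumptions the base pack covers three long clips at any bonus level, at roughly $1.58 to $1.82 per clip.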
Video vs Images vs Voice — Cost Comparison
| Feature | Cost (Moments) | What You Get |
|---|---|---|
| Text message | 1–2 | Text response |
| Image generation | 25–50 | Single static image |
| Short video (3 sec) | ~50 | Brief motion clip |
| Full video clip | ~600 | Longer motion clip |
| Voice call | 100 per minute | Real-time audio |
| Manual memory save | 10 | Pinned fact |
The 600-Moment tradeoff: For the same Moments cost as one full video clip, you could generate 12–24 images, or have 6 minutes of voice calls, or send approximately 300–600 text messages. Video is unambiguously the most Moments-intensive feature on the platform.
This is not a reason to avoid it — unique features carry real value. But it is a reason to use video generation intentionally rather than habitually. Generating one well-chosen video from a great source image delivers more value than generating several lower-quality clips from rushed prompts.
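The 600-Moment tradeoff can be checked with a few lines of arithmetic. This is a minimal sketch using the approximate unit costs from the comparison table; the `equivalents` helper is a made-up name for illustration.

```python
LONG_CLIP_COST = 600

# (low, high) Moments cost per unit, from the comparison table in this article.
UNIT_COSTS = {
    "text message": (1, 2),
    "image": (25, 50),
    "short video": (50, 50),
    "voice minute": (100, 100),
    "manual memory save": (10, 10),
}


def equivalents(budget: int) -> dict[str, tuple[int, int]]:
    """(fewest, most) units the budget buys, using the high and low unit costs."""
    return {name: (budget // high, budget // low) for name, (low, high) in UNIT_COSTS.items()}


for name, (fewest, most) in equivalents(LONG_CLIP_COST).items():
    print(f"{name}: {fewest}" if fewest == most else f"{name}: {fewest}-{most}")
```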
Tips for Better Video Results
Use strong source images. Video quality is constrained by input image quality. A high-detail image generated with the Advanced model will produce better video than a basic-model image. If video generation is your priority, invest in quality source images first.
Write specific prompts. "Standing in morning light, turning slowly to face the camera with a smile" produces more reliable output than "smiling." The video generator interprets explicit motion and context descriptions better than vague emotional cues.
Start with short clips. Before committing 600 Moments to a long clip, test your prompt with a 3-second clip (~50 Moments). If the motion matches your intention, scale up. If not, refine the prompt without spending the full budget.
Use the Premium generation model. If you're on Premium or Ultimate, the Premium generation model produces meaningfully better video quality. The visual improvement is worth the marginal additional Moments cost.
Save Moments by planning ahead. Rather than generating multiple images just to find one worth converting to video, be intentional: generate 1–2 targeted images for the specific scene you want to animate, then convert the best one.
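The "start with short clips" tip can be put in numbers. The sketch below compares two ways of iterating on a prompt, using the approximate costs quoted in this article. It assumes that a prompt which works in a 3-second test will also work in the long format, which is not guaranteed, and the function names are illustrative.

```python
SHORT_CLIP_COST = 50
LONG_CLIP_COST = 600


def cost_testing_first(prompt_attempts: int) -> int:
    """Total Moments when each attempt is a short test clip, followed by one long clip."""
    return prompt_attempts * SHORT_CLIP_COST + LONG_CLIP_COST


def cost_long_only(prompt_attempts: int) -> int:
    """Total Moments when every attempt is rendered as a long clip."""
    return prompt_attempts * LONG_CLIP_COST


for attempts in (1, 2, 3):
    print(attempts, cost_testing_first(attempts), cost_long_only(attempts))
```

If the prompt works on the first try, testing costs 50 Moments extra (650 versus 600). From the second attempt onward testing is cheaper: three attempts cost 750 Moments with short tests versus 1,800 without.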
Who Should Use the Video Generator?
Worth it if you:
- Value visual companion content alongside conversation
- Want a feature that most direct competitors lack
- Are on Plus, Premium, or Ultimate (Moments allocation makes it sustainable)
- Want to create specific scenes or scenarios in motion rather than static images
Not worth it if you:
- Are primarily a text-based user and don't value visual output
- Are on a tight Moments budget (video is the most expensive per-action feature)
- Are on the Free tier (video is unavailable regardless)
- Are satisfied with static images and don't need motion
Best tier for video: Ultimate ($39.99/mo) for heavy creators (up to ~25 long clips monthly). Premium ($19.99/mo) for moderate use (10–13 long clips monthly). Plus ($9.99/mo) as the minimum sustainable tier for regular but limited video generation.
Competitors with Video Generation
The competitive landscape for AI companion video generation is sparse:
| Platform | Video Generation |
|---|---|
| Secrets AI | Full (3 sec to longer clips) |
| Candy AI | Limited video |
| Character.AI | None |
| CrushOn AI | None |
| Janitor AI | None |
| GirlfriendGPT | None |
| Replika | None |
| SweetDream AI | Yes (limited) |
| Xotic AI | Yes (4K, 15 sec) |
Secrets AI's video generation is not the only implementation in the space, but it is the most accessible option on a companion platform that also provides full conversation, memory, and voice capabilities. Xotic AI's 4K output is technically superior in resolution, but it's a specialized video-focused tool rather than a full companion platform.
FAQ
How long are Secrets AI videos, and what do they cost?
Short clips are approximately 3 seconds and cost around 50 Moments. Full-length clips are longer and cost up to 600 Moments per clip. Lite-tier subscribers can only access the 3-second short clip format; Plus, Premium, and Ultimate unlock full-length video generation.
Can I generate videos on the free tier?
No. Video generation is not available on the Secrets AI free tier, even if you have Moments remaining from your 200 starting grant. You need at minimum a Lite plan ($5.99/month) to access video generation. Upgrading to Plus ($9.99/month) is recommended for regular video use given the Moments allocation.
How many videos can I generate per month?
This depends on your tier and the clip length. On Plus (3,000 Moments): approximately 5 full-length clips (at 600 Moments each) or up to 60 short clips (at 50 Moments each). On Premium (8,000 Moments): ~13 long clips or ~160 short clips. On Ultimate (15,000 Moments): ~25 long clips or ~300 short clips. These numbers assume your entire Moments allocation goes to video. In practice, most users split Moments across images, text, and voice as well.
How good is the video quality?
Video quality is rated 4.1/5 by independent reviewers, who say it "looks good and moves smoothly most of the time." Character appearance is consistent with the source image, movement is natural in most outputs, and facial expressions reflect the prompt. Quality varies with prompt complexity and generation model; the Premium model produces better fine-detail rendering. Occasional motion artifacts appear on complex poses, but typical use cases (character movement, expressions, simple scenarios) produce clean, realistic output.