Databricks · 11 min read · May 2026
Databricks AI and Mosaic AI Consulting in 2026
By Thinklytics Partners, Modern Data Platform Practice
What Mosaic AI does, what it costs, where it wins against Snowflake Cortex, and the Unity Catalog discipline that decides whether agents and model serving survive past the proof of concept. Practitioner notes from inside Databricks AI engagements.
Topics covered
- Databricks Mosaic AI
- Databricks AI
- Foundation Model APIs
- Unity Catalog
- Model Serving
- AI Functions
- Databricks consulting
- Vector Search
Frequently asked questions
What is Databricks Mosaic AI and what does it do?
Mosaic AI is the AI and ML platform inside Databricks, branded after the 2023 acquisition of MosaicML and steadily rebuilt around Unity Catalog. It ships as five surfaces: Foundation Model APIs (pay-per-token Llama, DBRX, Mixtral, and Claude inference), AI Functions (SQL-callable LLM functions inside Databricks SQL), Vector Search (Unity-Catalog-native vector index), Model Serving (production endpoint hosting for custom and foundation models), and Mosaic AI Agent Framework (orchestrated multi-step agents grounded in Unity Catalog). The platform's defining trait is Unity Catalog inheritance:…
What does Mosaic AI cost in 2026?
Mosaic AI is metered on DBU consumption, with separate rates for each surface. Foundation Model APIs are pay-per-token at model-specific rates (the Llama 3 family runs cheaper than Claude). Model Serving for custom models is metered per serving DBU at provisioned-throughput rates. Vector Search is metered per query unit and per GB of index storage. AI Functions roll up into the SQL warehouse DBU bill. Most mid-market Mosaic AI deployments land at $80K to $300K annualized in DBU spend during the first production year. Enterprise deployments with heavy custom model serving and many agents regularly…
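The per-surface metering described above can be turned into a rough annualized estimate. The sketch below is illustrative only: the class names, field names, and every unit price are hypothetical placeholders, not published Databricks rates, and a real estimate should come from sampled workload data and the current price list.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    """Hypothetical monthly usage for one deployment (all fields are assumptions)."""
    fm_api_million_tokens: float      # Foundation Model API tokens, in millions
    serving_dbus: float               # Model Serving DBUs consumed
    vector_search_query_units: float  # Vector Search query units
    vector_index_storage_gb: float    # Vector Search index storage, in GB
    ai_functions_sql_dbus: float      # SQL warehouse DBUs attributable to AI Functions


@dataclass
class Rates:
    """Placeholder unit prices in USD. These are NOT real Databricks rates."""
    usd_per_million_tokens: float
    usd_per_serving_dbu: float
    usd_per_query_unit: float
    usd_per_storage_gb: float
    usd_per_sql_dbu: float


def annualized_spend(usage: MonthlyUsage, rates: Rates) -> dict:
    """Return annualized USD spend per surface plus a 'total' key."""
    monthly = {
        "foundation_model_apis": usage.fm_api_million_tokens * rates.usd_per_million_tokens,
        "model_serving": usage.serving_dbus * rates.usd_per_serving_dbu,
        # Vector Search has two meters: query units and index storage.
        "vector_search": (
            usage.vector_search_query_units * rates.usd_per_query_unit
            + usage.vector_index_storage_gb * rates.usd_per_storage_gb
        ),
        "ai_functions": usage.ai_functions_sql_dbus * rates.usd_per_sql_dbu,
    }
    annual = {surface: cost * 12 for surface, cost in monthly.items()}
    annual["total"] = sum(annual.values())
    return annual


if __name__ == "__main__":
    usage = MonthlyUsage(1000, 8000, 4000, 800, 2000)
    rates = Rates(2.0, 0.5, 0.25, 0.125, 0.5)  # made-up prices for illustration
    for surface, usd in annualized_spend(usage, rates).items():
        print(f"{surface}: ${usd:,.0f}")
```

With these made-up inputs the total comes to $97,200 a year. Keeping the surfaces separate, rather than computing one blended number, makes it easier to see which meter dominates once real usage samples replace the placeholders.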
Databricks Mosaic AI vs Snowflake Cortex: when does Databricks win?
Databricks wins when the team is notebook-first and the workload requires training, fine-tuning, custom model serving, or mixed Python + Spark pipelines. Snowflake Cortex wins when the team is SQL-first and the workload is warehouse-grounded retrieval or analytical Q&A. For RAG on enterprise documents, Databricks Vector Search is competitive with Cortex Search; the choice usually follows where the data lives. For training custom foundation models or running production ML serving at scale, Databricks remains the deeper platform. Many organizations end up with both. The decision is per…
How long does a Mosaic AI implementation take?
Six to ten weeks for a focused first-wave deployment on one use case (typically a RAG application over a Unity-Catalog-governed document corpus, or a Model Serving endpoint for a single inference workload). Four to six months for a multi-surface rollout that includes AI Functions in SQL, Vector Search in production, and one Mosaic AI Agent. Twelve months and longer for enterprise rollouts that include custom foundation-model fine-tuning, multi-tenant agent governance, and integration with downstream applications. The biggest predictor of duration is Unity Catalog readiness. Workspaces still…
What governance has to be in place before Mosaic AI rolls out?
Four things. First, Unity Catalog must be enabled and properly configured across the workspaces where Mosaic AI surfaces will run; otherwise lineage, access control, and audit trails are all degraded. Second, MLflow tracking must be enabled so that every model call and agent step is logged. Third, DBU budget alerts must be in place, because Foundation Model APIs and Model Serving consume DBUs at much higher rates than typical SQL workloads. Fourth, data classification and sensitivity tagging must be applied in Unity Catalog so that agents and RAG applications respect access policies. All four are table stakes; ship without any one of them and the…
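One way to make the four prerequisites enforceable is a simple pre-launch gate. The sketch below is an assumption-heavy illustration: the flags are filled in by hand from an audit, the names are hypothetical, and nothing here calls a Databricks API.

```python
from dataclasses import dataclass, fields


@dataclass
class GovernanceReadiness:
    """Hand-filled audit results; field names are hypothetical."""
    unity_catalog_configured: bool
    mlflow_tracking_enabled: bool
    dbu_budget_alerts_set: bool
    sensitivity_tags_applied: bool


def missing_prerequisites(readiness: GovernanceReadiness) -> list:
    """List the names of prerequisites that are not yet satisfied."""
    return [f.name for f in fields(readiness) if not getattr(readiness, f.name)]


def ready_to_ship(readiness: GovernanceReadiness) -> bool:
    """All four are table stakes, so any single gap blocks the rollout."""
    return not missing_prerequisites(readiness)


if __name__ == "__main__":
    audit = GovernanceReadiness(True, True, False, True)
    print(ready_to_ship(audit))          # False
    print(missing_prerequisites(audit))  # ['dbu_budget_alerts_set']
```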
Should we use Foundation Model APIs or self-host a model on Model Serving?
Foundation Model APIs for almost every starting workload in 2026. Pay-per-token consumption matches the demand pattern of most AI applications, Databricks manages model lifecycle and security, and the model catalog is broad enough to cover most needs (Llama 3.1 and 3.3, DBRX, Mixtral, and Claude). Self-hosting on Model Serving makes sense when the workload has very high sustained QPS where provisioned throughput is cheaper, when latency requirements are stricter than the API can guarantee, or when fine-tuned model weights cannot leave the Databricks workspace for…
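The "sustained QPS where provisioned throughput is cheaper" criterion is a break-even calculation. The sketch below assumes a flat hourly price for provisioned capacity and a single blended per-token price. Both figures, and the function names, are hypothetical. It covers only the cost criterion, not the latency or weight-residency ones.

```python
def breakeven_qps(provisioned_usd_per_hour: float,
                  usd_per_million_tokens: float,
                  avg_tokens_per_request: float) -> float:
    """Sustained requests/second at which pay-per-token cost equals the
    (assumed flat) hourly cost of provisioned throughput."""
    if min(provisioned_usd_per_hour, usd_per_million_tokens, avg_tokens_per_request) <= 0:
        raise ValueError("all inputs must be positive")
    # Requests per hour that the provisioned budget would buy at token prices.
    requests_per_hour = (
        provisioned_usd_per_hour * 1_000_000
        / (usd_per_million_tokens * avg_tokens_per_request)
    )
    return requests_per_hour / 3600


def cheaper_option(sustained_qps: float, **pricing) -> str:
    """Pick the cheaper option on cost alone for a given sustained load."""
    if sustained_qps > breakeven_qps(**pricing):
        return "provisioned throughput"
    return "pay-per-token"


if __name__ == "__main__":
    pricing = dict(provisioned_usd_per_hour=36.0,   # placeholder price
                   usd_per_million_tokens=2.0,      # placeholder price
                   avg_tokens_per_request=1000)
    print(f"break-even: {breakeven_qps(**pricing):.2f} QPS")  # break-even: 5.00 QPS
    print(cheaper_option(1.0, **pricing))    # pay-per-token
    print(cheaper_option(20.0, **pricing))   # provisioned throughput
```

The comparison only holds for sustained load. Bursty traffic that averages below the break-even point usually still favors pay-per-token, because provisioned capacity is paid for whether or not it is used.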
Do you take Databricks commissions on Mosaic AI deployments?
No. Thinklytics is a Databricks-fluent consulting firm that does not take licensing commissions from Databricks, Snowflake, Microsoft, AWS, or any other vendor. That means we have recommended against Databricks Mosaic AI in cases where Snowflake Cortex or a non-warehouse approach was the better answer, and recommended for Mosaic AI in cases where the lakehouse-grounded training, fine-tuning, and notebook-first workflow dominated. The recommendation is decided per engagement.
What are red flags when evaluating Mosaic AI consulting firms?
Five show up consistently. (1) The proposal recommends Mosaic AI in week one without auditing Unity Catalog readiness. (2) DBU consumption is estimated without sampling real workload patterns. (3) Vector Search is described as plug-and-play with no detail on chunking strategy or reindex cadence. (4) Mosaic AI Agents are scoped before guardrails, MLflow tracking, and Unity Catalog access policies are in place. (5) The proposed team has no Databricks Certified ML Engineer or Generative AI credentials. Any two of these together make an overrun nearly certain.