Platform
- You can now use Structured Streaming to stream data from Apache Pulsar on Databricks (requires Databricks Runtime 14.1). For more information: https://docs.databricks.com/en/structured-streaming/pulsar.html
- Databricks Runtime 14.1 and 14.1 ML are now available in Beta
- GPU model serving optimized for LLMs is in public preview in selected regions. For more information: https://docs.databricks.com/en/machine-learning/model-serving/llm-optimized-model-serving.html
- Databricks Terraform provider updated to version 1.27.0
- Databricks CLI updated to version 0.206.0
- Databricks SDK for Go updated to version 0.21.0
- Databricks ODBC driver updated to version 2.7.5 (Timedate function, server-side encryption with customer-provided keys)
- Databricks extension for VS Code updated to version 1.1.3
- Running jobs as a service principal is GA. For more information: https://www.youtube.com/watch?v=Ri9NCLjdJ74&t=1s
- Databricks SDK for Python updated to version 0.9.0
- Databricks Connect for Databricks Runtime 14.0 is available, with full Structured Streaming support and long-running queries
- You can configure Tableau and Power BI OAuth with SAML SSO. For more information: https://docs.databricks.com/en/partners/bi/power-bi.html
- Databricks Connect V2 is in public preview for Scala. For more information: https://www.youtube.com/watch?v=DkzwFTC7WWs&t=145s
- Unified login is in public preview for accounts created before June 21, 2023. For more information: https://docs.databricks.com/en/release-notes/product/2023/june.html#unified-login
- Databricks Runtime 14 is GA
- GitHub App integration in Repos is GA
- Apache Spark 3.5.0 is GA. For more information: https://spark.apache.org/releases/spark-release-3-5-0.html
- With Databricks Runtime 14.0 and above, shared clusters now use Spark Connect with the Spark Driver from the Python REPL by default. Internal Spark APIs are no longer accessible from user code.
- User-defined table functions (UDTFs) allow you to register functions that return tables instead of scalar values. For more information: https://docs.databricks.com/en/udf/python-udtf.html
- Lakehouse Federation is now available on single-user clusters using Databricks Runtime 13.1 and above. Only the connection owner can run queries on federated catalogs.
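
As a rough illustration of the UDTF item above, the sketch below defines a handler class whose `eval` method yields one tuple per output row. The class name `SquareNumbers`, the SQL function name `square_numbers`, and the schema string are hypothetical. The registration helper assumes PySpark 3.5 or later and an existing `spark` session; it is not called here.

```python
from typing import Iterator, Tuple


class SquareNumbers:
    """Hypothetical UDTF handler: emits one (num, squared) row per integer."""

    def eval(self, start: int, end: int) -> Iterator[Tuple[int, int]]:
        # Each yielded tuple becomes one row of the returned table.
        for num in range(start, end + 1):
            yield (num, num * num)


def register_square_numbers(spark):
    """Register the handler for use from SQL (assumes PySpark 3.5+)."""
    # Imported lazily so the handler above stays importable without PySpark.
    from pyspark.sql.functions import udtf

    square_numbers = udtf(SquareNumbers, returnType="num: int, squared: int")
    spark.udtf.register("square_numbers", square_numbers)
    return square_numbers


# The handler is plain Python, so its row logic can be checked without a cluster.
rows = list(SquareNumbers().eval(1, 3))
print(rows)
```

After registration, a query such as `SELECT * FROM square_numbers(1, 3)` would be expected to return the same three rows as a table.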
Workflows
- To prevent runs of Databricks jobs from being skipped because of concurrency limits, you can enable queueing on the job. When queueing is enabled and a concurrency limit is reached, the run is placed in a queue until capacity is available. For more information: https://docs.databricks.com/en/workflows/jobs/create-run-jobs.html#job-queueing
- Databricks Asset Bundles are now in public preview. Bundles let you express end-to-end analytics and ML projects as a collection of source files, which makes it simpler to apply data engineering best practices. For more information: https://docs.databricks.com/en/dev-tools/bundles/index.html
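
For the job queueing item above, here is a minimal sketch of what a job settings payload with queueing enabled might look like. It assumes the Jobs API accepts a `queue` object with an `enabled` flag alongside `max_concurrent_runs`; the helper name and the job name `nightly-etl` are hypothetical, so check the linked documentation for the exact fields before relying on it.

```python
import json


def job_settings_with_queue(name: str, max_concurrent_runs: int = 1) -> dict:
    """Build a partial job settings payload with queueing turned on."""
    return {
        "name": name,
        # Runs beyond this limit would be skipped unless queueing is enabled.
        "max_concurrent_runs": max_concurrent_runs,
        # Assumed field shape: a `queue` object with an `enabled` flag.
        "queue": {"enabled": True},
    }


settings = job_settings_with_queue("nightly-etl")  # hypothetical job name
print(json.dumps(settings, indent=2))
```

The payload is only built and printed here; sending it to a workspace would need a task definition and authentication, which this sketch leaves out.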
Governance
- You can use Structured Streaming to perform streaming reads from views registered with Unity Catalog (requires Databricks Runtime 14.1). For more information: https://docs.databricks.com/en/structured-streaming/views.html
- You can filter sensitive data with row filters and column masks. For more information: https://www.youtube.com/watch?v=Ph32H5HTWlA&t=3s
- System tables now include Marketplace schema and pricing information.
- Data Explorer is now Catalog Explorer
- You can now delegate privileges for managing items in the allowlist for Unity Catalog. For more information: https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/allowlist.html
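
To make the row filter and column mask item above more concrete, the sketch below collects the kind of SQL statements involved and a small helper that runs them through an existing `spark` session. All object names (`main.sales.orders`, `region_filter`, `email_mask`, the `admins` group) are hypothetical, and the helper is not called here because it needs a Unity Catalog-enabled workspace.

```python
# Hypothetical names throughout: catalog, schema, table, functions, and group.
ROW_FILTER_STATEMENTS = [
    # The filter function returns TRUE for rows the caller is allowed to see.
    "CREATE OR REPLACE FUNCTION main.sales.region_filter(region STRING) "
    "RETURN is_account_group_member('admins') OR region = 'US'",
    "ALTER TABLE main.sales.orders "
    "SET ROW FILTER main.sales.region_filter ON (region)",
]

COLUMN_MASK_STATEMENTS = [
    # The mask function returns the value that the caller should see.
    "CREATE OR REPLACE FUNCTION main.sales.email_mask(email STRING) "
    "RETURN CASE WHEN is_account_group_member('admins') "
    "THEN email ELSE '***' END",
    "ALTER TABLE main.sales.orders "
    "ALTER COLUMN email SET MASK main.sales.email_mask",
]


def apply_statements(spark, statements):
    """Run each statement in order; assumes `spark` is an active session."""
    for statement in statements:
        spark.sql(statement)


for statement in ROW_FILTER_STATEMENTS + COLUMN_MASK_STATEMENTS:
    print(statement)
```

With these in place, members of the hypothetical `admins` group would see every row and the real `email` values, while other users would see only `US` rows and a masked column.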
Databricks SQL
- Lakeview Dashboards allow you to share visualizations and datasets using a simplified and unified data governance model. For more information: https://docs.databricks.com/en/dashboards/lakeview.html
- New charts are now available, featuring faster render performance, beautiful colors, and improved interactivity.
- In the graph view of Query Profile, you can now view the join type on any node containing a join in the query plan.
Machine learning
- Inference tables for model serving endpoints are in public preview. You can capture incoming requests and outgoing responses for those endpoints and log them to a Unity Catalog Delta table. For more information: https://docs.databricks.com/en/machine-learning/model-serving/inference-tables.html
- GPU model serving is in public preview.
- On-demand feature computation in Unity Catalog is now in public preview. For more information: https://docs.databricks.com/en/machine-learning/feature-store/on-demand-features.html
Delta Lake
- Row-level concurrency reduces conflicts between concurrent write operations by detecting changes at the row level and automatically resolving competing changes in concurrent writes that update or delete different rows in the same data file. For more information: https://docs.databricks.com/en/optimizations/isolation-level.html#row-level-concurrency
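
The difference between file-level and row-level conflict detection can be shown with a toy model. This is only a conceptual sketch, not Delta Lake's actual implementation; the `Write` class, the two check functions, and the file name are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

RowRef = Tuple[str, int]  # (data file name, row position within that file)


@dataclass(frozen=True)
class Write:
    """The rows a concurrent transaction updates or deletes."""

    touched: FrozenSet[RowRef]

    @property
    def files(self) -> FrozenSet[str]:
        return frozenset(file_name for file_name, _ in self.touched)


def conflicts_file_level(a: Write, b: Write) -> bool:
    # Coarse check: any shared data file counts as a conflict.
    return bool(a.files & b.files)


def conflicts_row_level(a: Write, b: Write) -> bool:
    # Finer check: only writes touching the same row conflict.
    return bool(a.touched & b.touched)


w1 = Write(frozenset({("part-0001.parquet", 3)}))
w2 = Write(frozenset({("part-0001.parquet", 7)}))  # same file, different row
w3 = Write(frozenset({("part-0001.parquet", 3)}))  # same file, same row

file_level = conflicts_file_level(w1, w2)
row_level = conflicts_row_level(w1, w2)
same_row = conflicts_row_level(w1, w3)
print(file_level, row_level, same_row)
```

In this model, the two writes to different rows of the same file conflict under the file-level check but not under the row-level check, while two writes to the same row still conflict.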
Partner Connect
- You can use Partner Connect to connect your Databricks workspace to Snowplow.