What’s new in Databricks for September 2023

Platform

You can now use Structured Streaming to stream data from Apache Pulsar on Databricks (Databricks Runtime 14.1 required). For more information: https://docs.databricks.com/en/structured-streaming/pulsar.html
Databricks Runtime 14.1 and 14.1 ML are now available in Beta
GPU model serving optimized for LLMs is in public preview in selected regions. For more information: https://docs.databricks.com/en/machine-learning/model-serving/llm-optimized-model-serving.html
Databricks Terraform provider updated to version 1.27.0
Databricks CLI updated to version 0.206.0
Databricks SDK for Go updated to version 0.21.0
Databricks ODBC driver updated to version 2.7.5 (Timedate function, server-side encryption with customer-provided keys)
Databricks extension for VS Code updated to version 1.1.3
Running jobs as a service principal is GA. For more information: https://www.youtube.com/watch?v=Ri9NCLjdJ74&t=1s
Databricks SDK for Python updated to version 0.9.0
Databricks Connect for Databricks Runtime 14.0 is available, with full Structured Streaming support and long-running queries
You can configure Tableau and Power BI OAuth with SAML SSO. For more information: https://docs.databricks.com/en/partners/bi/power-bi.html
Databricks Connect V2 is in public preview for Scala. For more information: https://www.youtube.com/watch?v=DkzwFTC7WWs&t=145s
Unified login is in public preview for accounts created before June 21, 2023. For more information: https://docs.databricks.com/en/release-notes/product/2023/june.html#unified-login
Databricks Runtime 14.0 is GA
GitHub App integration in Repos is GA
Spark 3.5.0 is GA. For more information: https://spark.apache.org/releases/spark-release-3-5-0.html
With Databricks Runtime 14.0 and above, shared clusters now use Spark Connect with the Spark Driver from the Python REPL by default. Internal Spark APIs are no longer accessible from user code.
User-defined table functions (UDTFs) let you register functions that return tables instead of scalar values. For more information: https://docs.databricks.com/en/udf/python-udtf.html
Lakehouse Federation is now available on single-user clusters using Databricks Runtime 13.1 and above. Only the connection owner can run queries on federated catalogs.
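
To make the UDTF item above more concrete, here is a minimal sketch of a Python UDTF. The class name SquareNumbers, the function name square_numbers, and the helper register_square_numbers are hypothetical names chosen for illustration; the registration step assumes a Spark 3.5+ session (for example, Databricks Runtime 14.0 or above) and is not needed to run the handler class on its own.

```python
class SquareNumbers:
    """Handler class for a Python UDTF: one eval() call can yield many rows."""

    def eval(self, start: int, end: int):
        # Each yielded tuple becomes one output row (num, squared).
        for num in range(start, end + 1):
            yield (num, num * num)


def register_square_numbers(spark):
    """Register the handler as a SQL-callable table function.

    Assumption: `spark` is a SparkSession on Spark 3.5+.
    The SQL name "square_numbers" is a hypothetical choice.
    """
    # Imported lazily so the handler class above stays plain Python.
    from pyspark.sql.functions import udtf

    square_udtf = udtf(SquareNumbers, returnType="num: int, squared: int")
    spark.udtf.register("square_numbers", square_udtf)
    return square_udtf


# The handler can be exercised without Spark to check its row logic.
rows = list(SquareNumbers().eval(1, 3))
print(rows)
```

Once registered, the function would be called from SQL in the FROM clause, for example SELECT * FROM square_numbers(1, 3).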

Workflows

To prevent runs of Databricks jobs from being skipped because of concurrency limits, you can enable queueing on the job. When queueing is enabled and a concurrency limit is reached, the run is placed in a queue until capacity is available. For more information: https://docs.databricks.com/en/workflows/jobs/create-run-jobs.html#job-queueing
Databricks Asset Bundles are now in public preview. They let you express end-to-end analytics and ML projects as a collection of source files, which makes it simpler to apply data engineering best practices. For more information: https://docs.databricks.com/en/dev-tools/bundles/index.html
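
As an illustration of the job queueing item above, the sketch below builds a job settings payload with queueing switched on. It assumes the Jobs API accepts a queue object with an enabled flag in the job settings; check the linked documentation before relying on it. The job name nightly-etl and the helper with_queueing are hypothetical.

```python
import json


def with_queueing(job_settings: dict, enabled: bool = True) -> dict:
    """Return a copy of the job settings with the queue flag set.

    Assumption: the Jobs API reads {"queue": {"enabled": true}} from the
    job settings; the input dict is left untouched.
    """
    updated = dict(job_settings)
    updated["queue"] = {"enabled": enabled}
    return updated


# Hypothetical job limited to one concurrent run.
settings = {"name": "nightly-etl", "max_concurrent_runs": 1}
payload = with_queueing(settings)

# This JSON body is what would be sent when creating or updating the job.
print(json.dumps(payload, indent=2))
```

With max_concurrent_runs set to 1, a second trigger that arrives while a run is active would wait in the queue instead of being skipped.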

Governance

You can use Structured Streaming to perform streaming reads from views registered with Unity Catalog (Databricks Runtime 14.1 required). For more information: https://docs.databricks.com/en/structured-streaming/views.html
You can filter sensitive data with row filters and column masks. For more information: https://www.youtube.com/watch?v=Ph32H5HTWlA&t=3s
System tables now include Marketplace schema and pricing information.
Data Explorer is now Catalog Explorer.
You can now delegate privileges for managing items in the allowlist for Unity Catalog. For more information: https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/allowlist.html
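
The row filter and column mask item above can be sketched as a pair of SQL statements each: a SQL UDF that encodes the rule, then an ALTER TABLE that attaches it. The helpers below only build the statements as strings. All table, column, function, and group names are hypothetical, and the rules themselves (keep only 'US' rows, replace masked values with '***') are placeholder logic.

```python
from typing import List


def row_filter_statements(table: str, column: str, func: str, group: str) -> List[str]:
    """SQL to create a filter function and attach it to a table as a row filter."""
    return [
        # Members of `group` see every row; everyone else sees only 'US' rows.
        f"CREATE OR REPLACE FUNCTION {func}({column} STRING) "
        f"RETURN IF(is_account_group_member('{group}'), true, {column} = 'US')",
        f"ALTER TABLE {table} SET ROW FILTER {func} ON ({column})",
    ]


def column_mask_statements(table: str, column: str, func: str, group: str) -> List[str]:
    """SQL to create a mask function and attach it to a column."""
    return [
        # Members of `group` see the real value; everyone else sees '***'.
        f"CREATE OR REPLACE FUNCTION {func}({column} STRING) "
        f"RETURN CASE WHEN is_account_group_member('{group}') THEN {column} ELSE '***' END",
        f"ALTER TABLE {table} ALTER COLUMN {column} SET MASK {func}",
    ]


def apply_statements(spark, statements: List[str]) -> None:
    """Run each statement in order; `spark` is assumed to be a SparkSession."""
    for statement in statements:
        spark.sql(statement)


# Hypothetical Unity Catalog objects, printed for review rather than executed.
for stmt in row_filter_statements("main.demo.sales", "region", "main.demo.us_only", "admins"):
    print(stmt)
for stmt in column_mask_statements("main.demo.users", "ssn", "main.demo.ssn_mask", "hr"):
    print(stmt)
```

Printing the statements first makes it easy to review them before passing them to apply_statements on a Unity Catalog-enabled cluster.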

Databricks SQL

Lakeview dashboards let you share visualizations and datasets using a simplified and unified data governance model. For more information: https://docs.databricks.com/en/dashboards/lakeview.html
New charts are now available, featuring faster render performance, beautiful colors, and improved interactivity.
In the graph view of Query Profile, you can now view the Join type on any node containing a join in the query plan.

Machine learning

Inference tables for model serving endpoints are in public preview. You can capture incoming requests and outgoing responses for those endpoints and log them as a Unity Catalog Delta table. For more information: https://docs.databricks.com/en/machine-learning/model-serving/inference-tables.html
GPU model serving is in public preview.
On-demand feature computation in Unity Catalog is now in public preview. For more information: https://docs.databricks.com/en/machine-learning/feature-store/on-demand-features.html

Delta Lake

Row-level concurrency reduces conflicts between concurrent write operations by detecting changes at the row level and automatically resolving competing changes in concurrent writes that update or delete different rows in the same data file. For more information: https://docs.databricks.com/en/optimizations/isolation-level.html#row-level-concurrency

Partner Connect

You can use Partner Connect to connect your Databricks workspace to Snowplow.

La data avec Youssef

Everything you need to know about Databricks