Use notebooks

You use Azure Databricks notebooks to develop data science and machine learning workflows and to collaborate with colleagues across engineering, data science, machine learning, and BI teams. Azure Databricks notebooks provide real-time coauthoring, automatic versioning, and built-in data visualizations.

With Azure Databricks notebooks, you can:

  • Develop using Python, SQL, Scala, and R.
  • Customize your environment with the libraries of your choice.
  • Schedule notebooks to automatically run workflows.
  • Export results and notebooks in .html or .ipynb format.
  • Build and share dashboards.
  • (Experimental) Use advanced editing capabilities.

Configure notebook settings

To configure notebook settings:

  1. Click your username at the top right of the workspace and select User Settings from the drop-down menu.
  2. Click the Notebook Settings tab.

Develop in notebooks

Notebooks use two types of cells: code cells and markdown cells. Code cells contain runnable code. Markdown cells contain markdown code that renders into text and graphics when the cell is executed. You can run each cell individually or run the whole notebook at once.

The notebook toolbar includes menus and icons that you can use to manage and edit the notebook.


Display images

To display images stored in the FileStore, use the syntax:

%md
![test](files/image.png)

For example, suppose you have the Databricks logo image file in FileStore:

dbfs ls dbfs:/FileStore/
databricks-logo-mobile.png

When you include the following code in a Markdown cell:

%md
![test](files/databricks-logo-mobile.png)

the image is rendered in the cell.

Display mathematical equations

Notebooks support KaTeX for displaying mathematical formulas and equations. For example, the following cell demonstrates the supported delimiter styles:

%md
\\(c = \\pm\\sqrt{a^2 + b^2} \\)

\\(A{_i}{_j}=B{_i}{_j}\\)

$$c = \\pm\\sqrt{a^2 + b^2}$$

\\[A{_i}{_j}=B{_i}{_j}\\]

renders the equations in the cell, and

%md
\\[ f(\beta)= -Y_t^T X_t \beta + \sum \log( 1+{e}^{X_t\bullet\beta}) + \frac{1}{2}\delta^t S_t^{-1}\delta \\]

where \\[\delta=(\beta - \mu_{t-1})\\]

renders the formatted equation in the cell.

Include HTML

You can include HTML in a notebook by using the function displayHTML. See HTML, D3, and SVG in notebooks for an example of how to do this.
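For example, a minimal call (the HTML string is illustrative):

displayHTML("<h1>Notebook report</h1><p>Rendered with <b>displayHTML</b>.</p>")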

Note

The displayHTML iframe is served from the domain databricksusercontent.com and the iframe sandbox includes the allow-same-origin attribute. databricksusercontent.com must be accessible from your browser. If it is currently blocked by your corporate network, it must be added to an allowlist.

Command comments

You can have discussions with collaborators using command comments.

To toggle the Comments sidebar, click the Comments icon at the top right of a notebook.

To add a comment to a command:

  1. Highlight the command text and click the comment bubble.

  2. Add your comment and click Comment.

To edit, delete, or reply to a comment, click the comment and choose an action.

Change cell display

There are three display options for notebooks:

  • Standard view: results are displayed immediately after code cells.
  • Results only: only results are displayed.
  • Side-by-side: code and results cells are displayed side by side.

Use the View menu to select a display option.

Show line and command numbers

To show or hide line numbers or command numbers, select Line numbers or Command numbers from the View menu. For line numbers, you can also use the keyboard shortcut Control+L.

If you enable line or command numbers, Databricks saves your preference and shows them in all of your other notebooks for that browser.

Command numbers above cells link to that specific command. If you click the command number for a cell, it updates your URL to be anchored to that command. If you want to link to a specific command in your notebook, right-click the command number and choose copy link address.

Find and replace text

To find and replace text within a notebook, select Edit > Find and Replace. The current match is highlighted in orange and all other matches are highlighted in yellow.

To replace the current match, click Replace. To replace all matches in the notebook, click Replace All.

To move between matches, click the Prev and Next buttons. You can also press shift+enter and enter to go to the previous and next matches, respectively.

To close the find and replace tool, click the close button or press Esc.

Autocomplete

You can use Azure Databricks autocomplete to automatically complete code segments as you type them. Azure Databricks supports two types of autocomplete: local and server.

Local autocomplete completes words that are defined in the notebook. Server autocomplete accesses the cluster for defined types, classes, and objects, as well as SQL database and table names. To activate server autocomplete, attach your notebook to a cluster and run all cells that define completable objects.

Important

Server autocomplete in R notebooks is blocked during command execution.

To trigger autocomplete, press Tab after entering a completable object. For example, after you define and run the cells containing the definitions of MyClass and instance, the methods of instance are completable, and a list of valid completions displays when you press Tab.
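A minimal sketch of that setup; the method bodies below are illustrative, but the flow matches the description above:

# Run this cell first so the completions become available.
class MyClass:
    def greet(self):
        return "hello"

instance = MyClass()

# In a later cell, type `instance.` and press Tab to see greet
# (and other members) in the completion list.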

SQL database and table name completion, type completion, syntax highlighting and SQL autocomplete are available in SQL cells and when you use SQL inside a Python command, such as in a spark.sql command.


In Databricks Runtime 7.4 and above, you can display Python docstring hints by pressing Shift+Tab after entering a completable Python object. The docstrings contain the same information as the help() function for an object.

Format code cells

Azure Databricks provides tools that allow you to format Python and SQL code in notebook cells quickly and easily. These tools reduce the effort to keep your code formatted and help to enforce the same coding standards across your notebooks.

Format Python cells

Starting with Databricks Runtime 11.2, Azure Databricks uses Black to format code within a notebook. The notebook must be attached to a cluster, and Black executes on the cluster that the notebook is attached to.

How to format Python and SQL cells

You must have Can Edit permission on the notebook to format code.

You can trigger the formatter in the following ways:

  • Format a single cell

    • Keyboard shortcut: Press Cmd+Shift+F.
    • Command context menu:
      • Format SQL cell: Select Format SQL in the command context dropdown menu of a SQL cell. This menu item is visible only in SQL notebook cells or those with a %sql language magic.
      • Format Python cell: Select Format Python in the command context dropdown menu of a Python cell. This menu item is visible only in Python notebook cells or those with a %python language magic.
    • Notebook Edit menu: Select a Python or SQL cell, and then select Edit > Format Cell(s).
  • Format multiple cells

    Select multiple cells and then select Edit > Format Cell(s). If you select cells of more than one language, only SQL and Python cells are formatted. This includes those that use %sql and %python.

  • Format all Python and SQL cells in the notebook

    Select Edit > Format Notebook. If your notebook contains more than one language, only SQL and Python cells are formatted. This includes those that use %sql and %python.

Limitations

  • Black enforces PEP 8 (https://peps.python.org/pep-0008/) standards for 4-space indentation. Indentation is not configurable.
  • Formatting embedded Python strings inside a SQL UDF is not supported. Similarly, formatting SQL strings inside a Python UDF is not supported.

View table of contents

To display an automatically generated table of contents, click the arrow at the upper left of the notebook (between the sidebar and the topmost cell). The table of contents is generated from the Markdown headings used in the notebook.

To close the table of contents, click the left-facing arrow.

View notebooks in dark mode

You can choose to display notebooks in dark mode. To turn dark mode on or off, select View > Theme and select Light theme or Dark theme.

Run notebooks

Before you can run any cell in a notebook, you must attach the notebook to a cluster.

To run all the cells in a notebook, select Run All in the notebook toolbar.

Important

Do not use Run All if steps for mount and unmount are in the same notebook. It could lead to a race condition and possibly corrupt the mount points.

To run a single cell, click in the cell and press shift+enter.

To run all cells before or after a cell, click the cell actions menu at the far right of the cell and select Run All Above or Run All Below. Run All Below includes the cell you are in; Run All Above does not.

When a notebook is running, the icon in the notebook tab changes to indicate that execution is in progress. If notifications are enabled in your browser and you navigate to a different tab while a notebook is running, a notification appears when the notebook finishes.

To stop or interrupt a running notebook, click the interrupt button in the notebook toolbar. You can also select Run > Interrupt execution, or use the keyboard shortcut I I.

Attach a notebook to a cluster

To attach a notebook to a cluster, you need the Can Attach To cluster-level permission.

To attach a notebook to a cluster, click the cluster selector in the notebook toolbar and select a cluster from the dropdown menu. The menu shows a selection of clusters that you have used recently or that are currently running.

To select from all available clusters, click More… and select an existing cluster from the dropdown menu in the dialog.

You can also create a new cluster by selecting Create new resource… from the dropdown menu.

Important

An attached notebook has the following Apache Spark variables defined.

Class                      Variable Name
SparkContext               sc
SQLContext/HiveContext     sqlContext
SparkSession (Spark 2.x)   spark

Do not create a SparkSession, SparkContext, or SQLContext. Doing so will lead to inconsistent behavior.

Detach a notebook from a cluster

To detach a notebook from a cluster, click the cluster selector in the notebook toolbar and hover over the attached cluster in the list to display a side menu. From the side menu, select Detach.

You can also detach notebooks from a cluster using the Notebooks tab on the cluster details page.

When you detach a notebook from a cluster, the execution context is removed and all computed variable values are cleared from the notebook.

Tip

Azure Databricks recommends that you detach unused notebooks from a cluster. This frees up memory space on the driver.

View multiple outputs per cell

Python notebooks and %python cells in non-Python notebooks support multiple outputs per cell. For example, the output of the following code includes both the plot and the table:

import pandas as pd
from sklearn.datasets import load_iris

data = load_iris()
iris = pd.DataFrame(data=data.data, columns=data.feature_names)
ax = iris.plot()  # matplotlib Axes for the line plot
print("plot")
display(ax)   # Databricks built-in display function
print("data")
display(iris)

In Databricks Runtime 7.3 LTS, you must enable this feature by setting spark.databricks.workspace.multipleResults.enabled to true.

Python and Scala error highlighting

Python and Scala notebooks support error highlighting. The line of code that threw the error is highlighted in the cell. Additionally, if the error output is a stacktrace, the cell in which the error is thrown is displayed in the stacktrace as a link to the cell. You can click this link to jump to the offending code.

Notifications

Notifications alert you to certain events, such as which command is currently running during a notebook run and which commands are in an error state. When your notebook shows multiple error notifications, the first one includes a link that lets you clear all notifications.

Notebook notifications are enabled by default. You can disable them in user settings.

Background notifications

If you start a notebook run and then navigate away from the tab or window that the notebook is running in, a notification appears when the notebook is completed. You can disable this notification in your browser settings.

Databricks Advisor

Databricks Advisor automatically analyzes commands every time they are run and displays appropriate advice in the notebooks. The advice notices provide information that can assist you in improving the performance of workloads, reducing costs, and avoiding common mistakes.

View advice

A blue box with a lightbulb icon signals that advice is available for a command. The box displays the number of distinct pieces of advice.

Click the lightbulb to expand the box and view the advice. One or more pieces of advice will become visible.

Click the Learn more link to view documentation providing more information related to the advice.

Click the Don’t show me this again link to hide the piece of advice. Advice of this type will no longer be displayed. You can reverse this action in Notebook Settings.

Click the lightbulb again to collapse the advice box.

Advice settings

To enable or disable Databricks Advisor, go to user settings or click the gear icon in the expanded advice box.

Toggle the Turn on Databricks Advisor option to enable or disable advice.

The Reset hidden advice link is displayed if one or more types of advice are currently hidden. Click the link to make that advice type visible again.

Open or run a Delta Live Tables pipeline

For notebooks that are assigned to a Delta Live Tables pipeline, you can open the pipeline details, start a pipeline update, or delete a pipeline using the Delta Live Tables dropdown menu in the notebook toolbar.

To open the pipeline details, click Delta Live Tables and click the pipeline name, or click the menu next to the pipeline name and select View in Pipelines.

To start an update of the pipeline, click Delta Live Tables and click Start next to the pipeline name.

To delete a pipeline the notebook is assigned to, click Delta Live Tables, then click the menu next to the pipeline name and select Delete.

Share code in notebooks

Azure Databricks supports several methods for sharing code among notebooks. Each of these permits you to modularize and share code in a notebook, just as you would with a library.

For more complex interactions between notebooks, see Modularize or link code in notebooks.

Use %run to import a notebook

The %run magic executes all of the commands from another notebook. A typical use is to define helper functions in one notebook that are used by other notebooks.

In the example sketched below, the first notebook defines a helper function, reverse, which is available in the second notebook after you use the %run magic to execute shared-code-notebook.

Because both of these notebooks are in the same directory in the workspace, use the prefix ./ in ./shared-code-notebook to indicate that the path should be resolved relative to the currently running notebook. You can organize notebooks into directories, such as %run ./dir/notebook, or use an absolute path like %run /Users/<username>/directory/notebook.
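A minimal sketch, assuming the shared notebook is named shared-code-notebook as described above. The shared notebook contains the helper:

# Contents of shared-code-notebook
def reverse(s):
    return s[::-1]

The consuming notebook in the same directory runs it in one cell:

%run ./shared-code-notebook

and then, in a separate cell (because %run must be in a cell by itself), calls the helper:

print(reverse("hello"))   # prints "olleh"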

Note

  • %run must be in a cell by itself, because it runs the entire notebook inline.
  • You cannot use %run to run a Python file and import the entities defined in that file into a notebook. To import from a Python file, see Reference source code files using git. Or, package the file into a Python library, create an Azure Databricks library from that Python library, and install the library into the cluster you use to run your notebook.
  • When you use %run to run a notebook that contains widgets, by default the specified notebook runs with the widget’s default values. You can also pass in values to widgets; see Use Databricks widgets with %run.

Reference source code files using git

For notebooks stored in an Azure Databricks repo, you can reference source code files in the repository. The following example uses a Python file rather than a notebook.

Create a new example repo to hold the files described below.

To configure an existing Git repository, see Set up Databricks Repos.

Create two files in the repo:

  1. A Python file with the shared code.
  2. A notebook that uses the shared Python code.

The Python file shared.py contains the helper.
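A minimal sketch of shared.py, reusing the reverse helper from the %run example above (the body is illustrative):

# shared.py
def reverse(s):
    return s[::-1]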

Now, when you open the notebook, you can reference source code files in the repository using common commands like import.
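For example, assuming the shared.py sketched above sits next to the notebook in the repo:

from shared import reverse

print(reverse("hello"))   # prints "olleh"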

For more information on working with files in Git repositories, see Work with files in the UI.

Manage notebook state and outputs

After you attach a notebook to a cluster and run one or more cells, your notebook has state and displays outputs. This section describes how to manage notebook state and outputs.

In this section:

  • Clear notebook state and outputs
  • Show results
  • Download results
  • Hide and show cell content
  • Notebook isolation

Clear notebook state and outputs

To clear the notebook state and outputs, select one of the Clear options at the bottom of the Run menu.

Menu option                Description
Clear all cell outputs     Clears the cell outputs. This is useful if you are sharing the notebook and do not want to include any results.
Clear state                Clears the notebook state, including function and variable definitions, data, and imported libraries.
Clear state and outputs    Clears both cell outputs and the notebook state.
Clear state and run all    Clears the notebook state and starts a new run.

Show results

When a cell is run, Azure Databricks returns a maximum of 1000 rows of a DataFrame. With Databricks Runtime 8.4 and above, if there are more than 1000 rows, you can re-execute the query to show up to 10,000 rows.

Download results

By default downloading results is enabled. To toggle this setting, see Manage the ability to download results from notebooks.

You can download a cell result that contains tabular output to your local machine. Click the three-button menu next to the tab title. The menu options depend on the number of rows in the result and on the Databricks Runtime version. Downloaded results are saved on your local machine as a CSV file named export.csv.

Hide and show cell content

Cell content consists of cell code and the result of running the cell. You can hide and show the cell code and result using the cell actions menu at the top right of the cell.

To hide cell code, click the cell actions menu and select Hide Code.

To hide the cell result, click the cell actions menu and select Hide Result.

To show hidden cell code or results, click the Show links.

See also Collapsible headings.

Notebook isolation

Notebook isolation refers to the visibility of variables and classes between notebooks. Azure Databricks supports two types of isolation:

  • Variable and class isolation
  • Spark session isolation

Note

Azure Databricks manages user isolation using access modes configured on clusters.

  • No isolation shared: Multiple users can use the same cluster. Users share credentials set at the cluster level. No data access controls are enforced.
  • Single user: Only the named user can use the cluster. All commands run with that user’s privileges. Table ACLs in the Hive metastore are not enforced. This access mode supports Unity Catalog.
  • Shared: Multiple users can use the same cluster. Users are fully isolated from one another, and each user runs commands with their own privileges. Table ACLs in the Hive metastore are enforced. This access mode supports Unity Catalog.

Variable and class isolation

Variables and classes are available only in the current notebook. For example, two notebooks attached to the same cluster can define variables and classes with the same name, but these objects are distinct.

To define a class that is visible to all notebooks attached to the same cluster, define the class in a package cell. Then you can access the class by using its fully qualified name, which is the same as accessing a class in an attached Scala or Java library.
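A minimal sketch of a package cell in a Scala notebook (the package and object names are illustrative); the cell contains only the package and its definitions:

package com.example.shared

object StringUtils {
  // Available to all notebooks attached to this cluster via the
  // fully qualified name com.example.shared.StringUtils
  def reverse(s: String): String = s.reverse
}

Another notebook attached to the same cluster could then call com.example.shared.StringUtils.reverse("abc").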

Spark session isolation

Every notebook attached to a cluster running Apache Spark 2.0.0 and above has a pre-defined variable named spark that represents a SparkSession. SparkSession is the entry point for using Spark APIs as well as setting runtime configurations.

Spark session isolation is enabled by default. You can also use global temporary views to share temporary views across notebooks. See CREATE VIEW. To disable Spark session isolation, set spark.databricks.session.share to true in the Spark configuration.
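A minimal sketch of sharing a view across notebooks this way (the view and column names are illustrative):

# Notebook A: register a global temporary view.
df = spark.range(10).withColumnRenamed("id", "n")
df.createGlobalTempView("shared_numbers")

# Notebook B, attached to the same cluster: global temporary views
# live in the reserved global_temp database.
spark.sql("SELECT n FROM global_temp.shared_numbers").show()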

Important

Setting spark.databricks.session.share to true breaks the monitoring used by both streaming notebook cells and streaming jobs. Specifically:

  • The graphs in streaming cells are not displayed.
  • Jobs do not block as long as a stream is running (they just finish “successfully”, stopping the stream).
  • Streams in jobs are not monitored for termination. Instead you must manually call awaitTermination(), as in the sketch after this list.
  • Creating a new visualization on streaming DataFrames doesn’t work.
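A minimal sketch of that manual call, assuming an illustrative rate source and memory sink:

# With spark.databricks.session.share set to true, the job would otherwise
# finish while the stream is still running, so block explicitly.
query = (spark.readStream.format("rate").load()
         .writeStream.format("memory").queryName("rate_sink").start())
query.awaitTermination()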

Cells that trigger commands in other languages (that is, cells using %scala, %python, %r, and %sql) and cells that include other notebooks (that is, cells using %run) are part of the current notebook. Thus, these cells are in the same session as other notebook cells. By contrast, a notebook workflow runs a notebook with an isolated SparkSession, which means temporary views defined in such a notebook are not visible in other notebooks.

Version history

Azure Databricks notebooks maintain a history of notebook versions, allowing you to view and restore previous snapshots of the notebook. You can perform the following actions on versions: add comments, restore and delete versions, and clear version history.

To access notebook versions, click the “Last edit…” message in the toolbar. The notebook versions appear at the right side of the browser tab. You can also select File > Version history.

Add a comment

To add a comment to the latest version:

  1. Click the version.

  2. Click Save now.

  3. In the Save Notebook Revision dialog, enter a comment.

  4. Click Save. The notebook version is saved with the entered comment.

Restore a version

To restore a version:

  1. Click the version.

  2. Click Restore this revision.

  3. Click Confirm. The selected version becomes the latest version of the notebook.

Delete a version

To delete a notebook’s version entry:

  1. Click the version.

  2. Click the trash icon.

  3. Click Yes, erase. The selected version is deleted from the history.

Clear version history

To clear the version history for a notebook:

  1. Select File > Clear version history.

  2. Click Yes, clear. The notebook version history is cleared.

    Warning

    The version history cannot be recovered after it has been cleared.

Version control with Git

To sync your work in Azure Databricks with a remote Git repository, Databricks recommends using Git integration with Databricks Repos.

Azure Databricks also has legacy support for linking a single notebook to Git-based version control tools. See Git version control for notebooks (legacy).

Test notebooks

This section covers several ways to test code in Databricks notebooks. You can use these methods separately or together. See also Unit testing for notebooks.

Many unit testing libraries work directly within the notebook. For example, you can use the built-in Python unittest package to test notebook code.

def reverse(s):
    return s[::-1]

import unittest

class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')

r = unittest.main(argv=[''], verbosity=2, exit=False)
assert r.result.wasSuccessful(), 'Test failed; see logs above'

Test failures appear in the output area of the cell.

You can use widgets to distinguish test invocations from normal invocations in a single notebook, as in the sketch below.
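A minimal sketch, assuming a widget named mode with illustrative values; the test body reuses the unittest pattern shown above and assumes TestHelpers has already been defined in the notebook:

import unittest

dbutils.widgets.dropdown("mode", "run", ["run", "test"])

if dbutils.widgets.get("mode") == "test":
    # Only run the tests when the widget is set to "test".
    r = unittest.main(argv=[""], verbosity=2, exit=False)
    assert r.result.wasSuccessful(), "Test failed; see logs above"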

To hide test code and results, select the associated menu items from the cell dropdown menu. Any errors that occur appear even when results are hidden.

To run tests periodically and automatically, you can use scheduled notebooks. You can configure the job to send notification emails to an address you specify.

Separate test code from the notebook

To separate your test code from the code being tested, see Share code in notebooks.

An example using %run:
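A minimal sketch, assuming the helper lives in shared-code-notebook as in the %run section above. One cell runs the shared notebook:

%run ./shared-code-notebook

and a separate cell tests the imported helper:

import unittest

class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        # reverse is defined by the shared notebook run above
        self.assertEqual(reverse("abc"), "cba")

r = unittest.main(argv=[""], verbosity=2, exit=False)
assert r.result.wasSuccessful(), "Test failed; see logs above"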

For code stored in a Databricks Repo, you can use the web terminal to run tests in source code files just as you would on your local machine.
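For example, assuming a pytest-style test file named test_shared.py in the repo (the file name and the use of pytest are illustrative), you might run from the web terminal:

pip install pytest
pytest test_shared.py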

You can also run this test from a notebook.

For notebooks in a Databricks Repo, you can set up a CI/CD-style workflow by configuring notebook tests to run for each commit. See Databricks GitHub Actions.

