Azure Synapse

Getting Started

Start and Stop Dedicated SQL Pool

  1. Click the Integrate tab.

    Integrate tab

  2. Under Pipelines, click either Start Dedicated SQL Pool or Pause Dedicated SQL Pool. Then click the trigger button to open a menu, and select Trigger now. On the next screen, click OK.

    Start or Stop SQL Pools
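
These pipelines can also be triggered outside the Studio. As one alternative, a dedicated SQL pool can be paused or resumed with the azure-mgmt-synapse Python SDK; the following is a minimal sketch, and the subscription, resource group, workspace, and pool names are all placeholders for your own.

    # Minimal sketch: pause or resume a dedicated SQL pool via the Azure SDK.
    # Assumes: pip install azure-identity azure-mgmt-synapse
    # All angle-bracketed names are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.synapse import SynapseManagementClient

    client = SynapseManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Both calls are long-running operations; .result() blocks until they finish.
    client.sql_pools.begin_pause("<resource-group>", "<workspace-name>", "<pool-name>").result()
    client.sql_pools.begin_resume("<resource-group>", "<workspace-name>", "<pool-name>").result()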

Home

The Home tab is where you start when you first open Azure Synapse Studio.

From here, you can access shortcuts for common tasks such as creating SQL scripts or notebooks by clicking the New dropdown menu button. Recently opened resources are also displayed.

Data

The Data tab is where you can explore everything in your database and linked datasets.

Under the Workspace tab, you can explore the dedicated SQL pool database and any Spark databases.

Under the Linked tab, you can explore external objects (e.g. Data Lake accounts) and create integration datasets from external linked data (e.g. Data Lake, Blob Storage, web services, etc.) to be used in pipelines.

How to Bring in Data from Linked Services

Note: This example shows how to get data from Data Lake, although there are many source types available.

  1. Click the "+" button to add a new resource, then click Integration Dataset.

    Add New Resource

  2. Select Azure Data Lake Storage Gen2 (you may need to search for this), then click Continue.

    Azure Data Lake Storage Gen2

  3. Select the format type, then click Continue.

  4. Enter a name, then click the drop-down menu under Linked service and select your data lake.

    Set properties

  5. Under Connect via integration runtime, ensure that interactive authoring is enabled. If it is not, click the edit button to enable it, then click Apply.

    Enable interactive authoring

  6. Set additional properties as appropriate, then click OK.

How to Explore Data in the Data Lake

Browse to find your dataset file (CSV, Parquet, JSON, Avro, etc.) and right-click it. A menu will open with options to preview the data or create resources such as SQL scripts and notebooks.

Read Data from CSV File
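
For reference, creating a notebook from a CSV file in the data lake generates PySpark code roughly like the sketch below; the storage account, container, and file path are placeholders.

    # Minimal PySpark sketch for reading a CSV file from the data lake.
    # 'spark' is the session that Synapse notebooks provide automatically;
    # the account, container, and path are placeholders.
    df = spark.read.load(
        "abfss://<container>@<account>.dfs.core.windows.net/<path>/<file>.csv",
        format="csv",
        header=True,  # set to False if the file has no header row
    )
    display(df.limit(10))  # display() renders a table/chart preview in Synapse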

How to Explore the Dedicated SQL Pool

Under the Workspace tab, you can explore databases similarly to SQL Server Management Studio. Right-click any table, highlight New SQL script, and click Select TOP 100 rows to create a new query. You can then view the results as either a table or a chart.

Explore Dedicated SQL Pool
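
Tables in the dedicated SQL pool can also be explored from a Spark notebook. A minimal sketch, assuming a Spark 3 pool (where the Azure Synapse Dedicated SQL Pool Connector is preinstalled) and a placeholder three-part table name:

    # Minimal sketch: read a dedicated SQL pool table into a Spark DataFrame.
    # Assumes the built-in dedicated SQL pool connector on a Spark 3 pool;
    # "<database>.<schema>.<table>" is a placeholder.
    df = spark.read.synapsesql("<database>.<schema>.<table>")
    display(df.limit(100))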

Importing Data to the Dedicated SQL Pool

To import data to the dedicated SQL pool, you can either:

  - create a pipeline with a Copy data activity (most efficient for large datasets), or
  - use the Bulk Load Wizard.
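
If the data is already in a Spark DataFrame, the same dedicated SQL pool connector can write it into the pool from a notebook; a minimal sketch, assuming a Spark 3 pool, an existing DataFrame df, and a placeholder table name:

    # Minimal sketch: write a Spark DataFrame into the dedicated SQL pool.
    # Assumes the built-in connector on a Spark 3 pool; the table name is a
    # placeholder, and df is an existing DataFrame.
    df.write.mode("overwrite").synapsesql("<database>.<schema>.<table>")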

Develop

From here, you can create and save resources such as SQL scripts, notebooks, and Power BI reports.

To add a new resource, click the "+" button. A dropdown menu will open.

Add a Resource

To make your changes visible to others, you need to click the Publish button.

SQL Scripts

Be sure to connect to your dedicated SQL pool to run SQL scripts.

SQL Scripts

Notebooks

To run notebook cells, you first need to select your Apache Spark pool.

Notebooks

To change languages for a single cell, you can use the following magic commands: %%pyspark, %%spark, %%csharp, %%sql. You can also change the default language using the Language dropdown menu.

Change language
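
For example, in a notebook whose default language is PySpark, a single cell can be switched to Spark SQL with the %%sql magic (the database and table names below are placeholders):

    %%sql
    -- This cell runs as Spark SQL even though the notebook default is PySpark.
    -- <database>.<table> is a placeholder for one of your Spark tables.
    SELECT * FROM <database>.<table> LIMIT 10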

Dataflows

To add a source to a dataflow, click the "+" button under Source Settings, then select Azure Data Lake Storage Gen2 (you may need to search for this). Click Continue, select the data format, then on the next page, select your Linked Service.

Dataflows

Power BI Reports

You can view and create Power BI reports directly in Azure Synapse. Please contact the Collaborative Analytics Environment support team to validate that a linked service is set up.

Integrate

This is where you can create pipelines for ingesting, preparing and transforming all of your data, like in Azure Data Factory.

Example: Copy Data from External Blob to Data Lake

  1. Click the "+" button to add a new resource, then click Pipeline.

Add a Resource

  2. Under Move & transform, drag and drop Copy data into the window.

Drag and drop Copy data

  3. Click on the Source tab, then click New to add the source dataset (where you want to copy the data from).

Add source dataset

  4. Select Azure Blob Storage, then select the format type (CSV, Parquet, JSON, etc.). Set any additional properties if relevant, then click OK.

  5. Click Sink, then click New to set the sink dataset (where you want the data to be copied to). Choose Azure Data Lake Storage Gen2, then select the format type. Under Linked service, choose your data lake and ensure that interactive authoring is enabled (see How to Bring in Data from Linked Services under Data for more information).

Debugging and Running Pipelines

To run a pipeline in debug mode, click the Debug button at the top of the pipeline window. Results will appear in the Output tab.

Debugging pipelines

To run a pipeline without debugging, click the Add trigger button, then Trigger now.
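
Published pipelines can also be triggered programmatically with the azure-synapse-artifacts Python SDK; the following is a minimal sketch, with the workspace endpoint and pipeline name as placeholders.

    # Minimal sketch: trigger a published pipeline and check its run status.
    # Assumes: pip install azure-identity azure-synapse-artifacts
    # The workspace endpoint and pipeline name are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.synapse.artifacts import ArtifactsClient

    client = ArtifactsClient(
        credential=DefaultAzureCredential(),
        endpoint="https://<workspace-name>.dev.azuresynapse.net",
    )

    run = client.pipeline.create_pipeline_run("<pipeline-name>")
    status = client.pipeline_run.get_pipeline_run(run.run_id)
    print(status.status)  # e.g. InProgress, Succeeded, Failed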

When you are ready to publish your pipelines, click the Validate all button, then click the Publish all button. Note that this publishes everything you currently have open (pipelines, SQL scripts, notebooks, etc.) for all users to see.

Publish pipelines

Monitor

From the Monitor tab, you can monitor live pipeline runs (the inputs and outputs of each activity and any errors) and view historical pipeline runs, trigger runs, SQL requests, etc.

Manage

This is where you can:

  - Add new SQL or Apache Spark pools
  - Add new linked services
  - Grant others access to the workspace
  - Set up git integration

Microsoft Documentation