Azure Data Lake basics for the SQL Server DBA / developer and… for everyone!


The basics

If you’re a Microsoft SQL Server DBA or developer who has not yet been introduced to Microsoft Azure Data Lake, and you would like to understand what it’s all about and how to get started, this article is for YOU.

Azure Data Lake is part of the Big Data Analytics Stack.

The Azure Data Lake has two components:

  1. The Data Lake Store, where you can store data in files of different formats.
  2. Data Lake Analytics, which you use to create jobs that copy data from different data sources to the Data Lake Store or Azure Blob Storage, and that transform, manipulate, or summarize data from source files into destination files using the U-SQL language.

Yes, it’s that simple… The following links, all provided by Microsoft, teach you how to do it step by step. From the Azure Dashboard, you create the resources you need:

  1. Create, configure, or delete the Azure Data Lake Store resource – https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-get-started-portal
  2. Create the Azure Data Lake Analytics resource and create a basic U-SQL job – https://docs.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-get-started-portal
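A basic U-SQL job reads rows from a source file, transforms or summarizes them, and writes the result to a destination file. The sketch below is not U-SQL; it is a small local Python analogue of that extract, summarize, output shape, meant only to make the flow concrete before you try it in the portal. The sample data, column names, and function names are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a tab-separated source file
# that would live in the Data Lake Store.
SOURCE_TSV = "region\tduration\nen-us\t30\nen-gb\t10\nen-us\t20\n"


def extract(text):
    """Read tab-separated rows into typed records (the 'extract' step)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{"region": r["region"], "duration": int(r["duration"])} for r in reader]


def summarize(rows):
    """Total duration per region, similar in spirit to SELECT ... GROUP BY."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["duration"]
    return sorted(totals.items())


def output_csv(pairs):
    """Write the summary as comma-separated text (the 'output' step)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["region", "total_duration"])
    writer.writerows(pairs)
    return buf.getvalue()


result = output_csv(summarize(extract(SOURCE_TSV)))
print(result)
```

In a real job, the source and destination would be file paths in the Data Lake Store or Blob Storage, and the three steps would be written as a single U-SQL script.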

Next steps

There is so much more we can do once we have the Data Lake Store and Data Lake Analytics resources set up!

Here are some examples to start with:

  1. Use an Azure Data Factory pipeline to:
      1. Load data into Azure Data Lake Store – example: https://docs.microsoft.com/en-us/azure/data-factory/load-azure-data-lake-store
      2. Add a U-SQL task to manipulate data from files in the Azure Data Lake Store or Azure Blob Storage.
  2. Run SSIS packages to copy data to Azure Blob Storage or Azure Data Lake Store using the Azure Feature Pack for Integration Services (SSIS) – https://docs.microsoft.com/en-us/sql/integration-services/azure-feature-pack-for-integration-services-ssis?view=sql-server-2017
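To picture the first option, here is a minimal sketch of a pipeline definition in which a copy step loads data into the Data Lake Store and a U-SQL step runs only after the copy succeeds. It builds the definition as a Python dictionary and prints it as JSON. The field names follow my understanding of the Data Factory pipeline JSON format, so check them against the current Microsoft documentation before relying on them; every dataset, linked service, activity, and script name here is hypothetical.

```python
import json


def build_pipeline(copy_name, usql_name, script_path):
    """Return a pipeline definition: a copy activity followed by a U-SQL activity."""
    copy_activity = {
        "name": copy_name,
        "type": "Copy",
        # Hypothetical dataset names; these would be defined separately in the factory.
        "inputs": [{"referenceName": "SourceBlobDataset", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "DataLakeStoreDataset", "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "BlobSource"},
            "sink": {"type": "AzureDataLakeStoreSink"},
        },
    }
    usql_activity = {
        "name": usql_name,
        "type": "DataLakeAnalyticsU-SQL",
        # Run the U-SQL script only after the copy has succeeded.
        "dependsOn": [{"activity": copy_name, "dependencyConditions": ["Succeeded"]}],
        # Hypothetical linked service pointing at the Data Lake Analytics account.
        "linkedServiceName": {
            "referenceName": "DataLakeAnalyticsLinkedService",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "scriptPath": script_path,
            # Hypothetical linked service for the store that holds the script file.
            "scriptLinkedService": {
                "referenceName": "DataLakeStoreLinkedService",
                "type": "LinkedServiceReference",
            },
        },
    }
    return {
        "name": "CopyThenTransform",
        "properties": {"activities": [copy_activity, usql_activity]},
    }


pipeline = build_pipeline("CopyToLake", "SummarizeWithUSql", "scripts/summarize.usql")
print(json.dumps(pipeline, indent=2))
```

The point of the sketch is the `dependsOn` entry: it is what chains the U-SQL task to the load step, so the transformation never runs against a partially copied file.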

More to learn

  1. Learn more about U-SQL
  2. Manage resources from Azure Data Lake Store Explorer (integrate Blob Storage and other data sources with Azure Data Lake)
  3. Comparison between Data Lake Store and Blob Storage
  4. Copy data from Azure Storage Blobs to Data Lake Store with a command-line utility
  5. Best practices for using Azure Data Lake Store
  6. Copy data to or from Azure Data Lake Store by using Azure Data Factory
  7. What is Azure Databricks?
  8. Newest updates and links for Azure Data Lake Analytics and U-SQL

If you’re looking for a comprehensive resource on Data Warehousing with Azure Data Factory, I just had the pleasure of contributing to this new resource. Learn more.

 


About the Author

Michelle has 30 years in IT, and has been working with SQL Server for the past 20 years. She has designed methodologies that consist of documentation, utilities, and scripts to automate architecture, design, and performance tuning initiatives for her clients. Michelle is able to see the wider vision of her clients’ business. She is passionate about solving problems quickly and providing value to her clients. She speaks English, Hebrew, Spanish, and a bit of French.
