On Friday, April 16th, 2021, I will be speaking at Global Azure Norway! Global Azure is a virtual 3-day event where communities from around the world organize live streams that are open for everyone to join. All the live streams add up to one global event with more than 500 speakers and sessions, so you should be able to find something interesting 🤓
On Saturday, January 30th, 2021, I will be speaking at the first Data Toboggan event! This is a free event focusing on Azure Synapse Analytics. There are 14 sessions scheduled in 12 hours, covering topics such as data integration, machine learning, data warehousing, data governance, and more. Join us 🤓
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse Analytics
Do you regularly need to get data for your projects?
Data is at the core of every Business Intelligence, Data Science, and Machine Learning project. You need data to understand what has happened in the past, to predict what may happen in the future, to discover patterns and anomalies, and to gain the insight necessary for making faster and better decisions.
But before you can do any of those things, you need to ingest, store, transform, integrate, and prepare your data. Guess what? You can do all of those things in Azure Synapse Analytics – without having to write any code!
In this session, we will cover the fundamentals of data integration in Azure Synapse Analytics. First, we will go through what Pipelines and Data Flows are. Then, we will quickly build a solution for ingesting and transforming data. Finally, we will look at how to orchestrate and schedule our pipelines, and how to monitor our solution once it has been deployed.
Join us at Data Toboggan!
Check out the Data Toboggan schedule, because I’m guessing that there’s at least a session or two that’s interesting for you. And if this is not your thing, can you please share it with your coworkers and network who might be interested? Register today, follow @datatoboggan and #DataToboggan on Twitter, and I hope to see you there! 😊
In the previous post, we created an Azure Data Factory and navigated to it. In this post, we will take a look around inside it. Let’s explore the Azure Data Factory user interface and the four Azure Data Factory pages.
Azure Data Factory Pages
On the left side of the screen, you will see the main navigation menu. Click on the arrows to expand and collapse the menu:
Once we expand the navigation menu, we see that Azure Data Factory consists of four main pages: Data Factory, Author, Monitor, and Manage:
In Azure Data Factory, you can connect to a Git repository using either GitHub or Azure DevOps. When connecting, you have to specify which collaboration branch to use. In most cases, the default branch is used. Historically, the default branch name in Git repositories has been “master”. This is problematic because it is not inclusive and is very offensive to many people.
The Git project, GitHub, and Azure DevOps are making changes to allow users to specify a different default branch name. GitHub and Azure DevOps will be changing their default branch names to “main” in 2020. I fully support this change and will be doing the same in my projects.
In this post, we will go through how to rename the default branch from “master” to “main” in Azure Data Factory Git repositories hosted in GitHub and Azure DevOps. Then we will reconnect Azure Data Factory and configure it to use the new “main” branch as the collaboration branch.
For these examples, I’m using my personal demo projects. I’m not taking into consideration any branch policies, other users, third-party tools, or external dependencies. As always, keep in mind that this is most likely a larger change, both technically and organizationally, in production and enterprise projects. 😊
The Short Version
Create a new “main” branch in your Git repository
Set the new “main” branch as the default branch in your Git repository
Delete the old “master” branch in your Git repository
Disconnect from your Git repository in Azure Data Factory
Reconnect to your Git repository in Azure Data Factory using the new “main” branch as the collaboration branch
This month’s T-SQL Tuesday is hosted by Jess Pomfret (@jpomfret). She wants to hear about life hacks to make your life easier! In this post, I share two of my most-used keyboard shortcuts. One for moving text lines up and down without copying and pasting, and one for moving windows around without dragging and dropping. I use these all the time :)
Moving text lines up and down
Previously, I was moving text lines up and down in a couple of different ways. Have you ever selected all the text on a line, copied it, then pasted it again? Yeah, I did that all the time. And then I discovered there’s an easier way! Yay :)
There are a couple of different flavors to this keyboard shortcut.
In Office applications like PowerPoint and OneNote, you use Shift+Alt+Up and Shift+Alt+Down:
In other applications like SQL Server Management Studio, Azure Data Studio, and Visual Studio Code, you simply use Alt+Up and Alt+Down.
Moving windows around or between screens
Similarly, I was previously dragging windows between monitors with my mouse. Then I discovered you can use Win+Arrows to move windows around. And then I discovered that you can use Win+Shift+Arrows to immediately move windows to the same position on other monitors. Are you showing a full-screen application while presenting? Just win-shift-arrow it to the extended screen and you look like a total pro. Whaaat! :D
Keyboard all the things!
There you go. Two of my favorite, useful, and timesaving keyboard shortcuts! I use these so much that I don’t think about them anymore – until someone goes “whoa whoa whoa wait what magic did you just do!?” :D
It’s December 31st, 2019. WHAAAAAT? 🤯 I have no idea how we’re almost in 2020, but here we are! Just a few hours left of the year. (Hi to my friends around the world who are already in 2020! 👋🏻) Like many others, I enjoy reflecting on the year that’s almost over. This year, I’ve decided to collect some of my highlights from 2019.
(Warning! There will be lots of tweets and pictures.)
This is a total brag fest that I’m writing solely for myself. It’s my 2019 highlight reel that I can look back on when days get rough and I need a reminder that life is actually pretty awesome and I’m insanely lucky and privileged to be here. And when we get to 2025, future Cathrine can re-read everything and go “oh yeah, I remember that, we were so young and inexperienced back then, awww!” …like I do now with my old posts. It’s fun. You should try it! 😁
Lessons Learned in 2019
I also started writing about some of the more difficult parts of my year and what I learned from them… And in the middle of it, I realized that I’m not quite ready to share those thoughts yet. I still have lots of processing to do before I can turn my struggles into any kind of useful advice for others. I’m hoping to be able to do that in 2020.
After reading that book in 2018 and reflecting on it for all of 2019, I’ve started learning to take responsibility for my own feelings, to set healthy boundaries for myself, and to choose my f*cks wisely.
For the past 25 days, I have written one blog post per day about Azure Data Factory. My goal was to start completely from scratch and cover the fundamentals in casual, bite-sized blog posts. This became the Beginner’s Guide to Azure Data Factory. Today, I will share a bunch of resources to help you continue your own learning journey.
I’ve already seen from your questions and comments that you are ready to jump way ahead and dive into way more advanced topics than I ever intended this series to cover 😉 And as much as I love Azure Data Factory, I can’t cover everything. So a little further down, I will share where and how and from whom you can continue learning about Azure Data Factory.
Congratulations! You’ve made it through my entire Beginner’s Guide to Azure Data Factory 🤓 We’ve gone through the fundamentals in the first 23 posts, and now we just have one more thing to talk about: Pricing.
And today, I’m actually going to talk! You see, in November 2019, I presented a 20-minute session at Microsoft Ignite about understanding Azure Data Factory pricing. And since it was recorded and the recording is available for free for everyone… Well, let’s just say that after 23 posts, I think we could both appreciate a short break from reading and writing 😅
(And as a side note, I’m originally publishing this post on December 24th. Here in Norway, we celebrate Christmas all day today! This is the biggest family day of the year for me, full of food and traditions. So instead of spending a lot of time writing today, I’m going to link to my video and spend the rest of the day with my family. Yay! 🎅🏻🎄🎁)
In the previous post, we looked at foreach loops and how to control them using arrays. But you can also control them using more complex objects! In this post, we will look at lookups. How do they work? What can you use them for? And how do you use the output in later activities, like controlling foreach loops?
Lookups are similar to copy data activities, except that you only get data from lookups. They have a source dataset, but they do not have a sink dataset. (So, like… half a copy data activity? :D) Instead of copying data into a destination, you use lookups to get configuration values that you use in later activities.
And how you use the configuration values in later activities depends on whether you choose to get the first row only or all rows.
But before we dig into that, let’s create the configuration datasets!
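To give you a rough idea of where we’re headed, this is approximately what a lookup looks like in the pipeline JSON. (A sketch only — the dataset name ConfigDataset is a made-up placeholder.)

```json
{
    "name": "Lookup Configuration",
    "type": "Lookup",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "dataset": {
            "referenceName": "ConfigDataset",
            "type": "DatasetReference"
        },
        "firstRowOnly": false
    }
}
```

With firstRowOnly set to true, you reference a single value in later activities as @activity('Lookup Configuration').output.firstRow.SomeColumn. With firstRowOnly set to false, you get @activity('Lookup Configuration').output.value, an array of rows that you can feed straight into a foreach loop.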
In the previous post, we looked at how to use variables in pipelines. We took a sneak peek at working with an array, but we didn’t actually do anything with it. But now, we will! In this post, we will look at how to use arrays to control foreach loops.
You can use foreach loops to execute the same set of activities or pipelines multiple times, with different values each time. A foreach loop iterates over a collection. That collection can be either an array or a more complex object. Inside the loop, you can reference the current value using @item().
Let’s take a look at how this works in Azure Data Factory!
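As a quick preview, a foreach loop looks roughly like this in the pipeline JSON. (A sketch only — the variable TableNames and the pipeline CopyOneTable are made-up placeholders.)

```json
{
    "name": "ForEach Table",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@variables('TableNames')",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "Copy One Table",
                "type": "ExecutePipeline",
                "typeProperties": {
                    "pipeline": {
                        "referenceName": "CopyOneTable",
                        "type": "PipelineReference"
                    },
                    "parameters": {
                        "TableName": "@item()"
                    }
                }
            }
        ]
    }
}
```

The items property holds the collection we loop over, and inside the loop, @item() resolves to the current element of that collection.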