Developing in Containers using Visual Studio Code (T-SQL Tuesday #140)

This month’s T-SQL Tuesday is hosted by Anthony Nocentino (@nocentino). He wants to know what we’ve been up to with containers. Perfect timing, because I have just spent the last couple of weeks learning how to develop in containers using Visual Studio Code! I was planning to write this for myself anyway, but perhaps it can be interesting for others as well 🤓

What is the use case?

One of my clients is using dbt (Data Build Tool) for their data transformations. In short, this means that developers write data transformations in SQL as SELECT statements. All SQL code can be combined with Jinja templates. Inside of these Jinja templates, developers can reference other tables, use control logic, or define common SQL code snippets as reusable macros. Dbt then compiles the SQL+Jinja code into pure SQL.

For example, if a macro looks like this:

{% macro convert_date_to_int(column_name) %}
    CAST(CONVERT(CHAR(8), {{ column_name }}, 112) AS INT)
{% endmacro %}

And a developer writes something like this:

SELECT
    order_id AS OrderID,
    {{ convert_date_to_int('order_date') }} AS OrderDate
FROM {{ ref('stg_orders') }}

Dbt will compile everything into this:

SELECT
    order_id AS OrderID,
    CAST(CONVERT(CHAR(8), order_date, 112) AS INT) AS OrderDate
FROM stg.orders
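
If the CONVERT style 112 in the macro above looks cryptic: in SQL Server, style 112 formats a date as yyyymmdd, and the CAST then turns that string into an integer. Here is a minimal Python sketch of the same idea. The function name simply mirrors the macro, the date is an arbitrary example, and this is only an illustration of the logic, not something dbt generates or runs.

```python
from datetime import date


def convert_date_to_int(value: date) -> int:
    """Mimic CAST(CONVERT(CHAR(8), value, 112) AS INT).

    Style 112 formats the date as yyyymmdd; the result is then cast to an integer.
    """
    return int(value.strftime("%Y%m%d"))


# Arbitrary example date, chosen only for illustration
print(convert_date_to_int(date(2021, 7, 13)))  # 20210713
```

Integer date keys like this sort in the same order as the dates themselves, which is one reason they are popular in data warehouses.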

After that, dbt uses these SELECT statements and turns them into actual tables and views in the data warehouse (or data lakehouse). Dbt can also run code tests, generate documentation, and produce lineage graphs showing dependencies between tables.

Don’t ask me how it does all of those things, though, because the whole point of using dbt is that I don’t have to understand it. I just write SQL code with a sprinkle of Jinja 😂 I do know how to ask dbt to do these things, though. For example:

$ dbt compile

But wait! Where the heck do I type that command?

Continue reading →

Microsoft Data Platform MVP 2021-2022: Vaccinated Edition

It’s July 1st, 2021, and I’m currently sitting here with a slightly sore arm and all the emotions. Why? Because… First, I got my first Pfizer shot. Then, I was renewed as a Microsoft Data Platform MVP 2021-2022! 🥳🤓😭🤩😁

Cathrine Wilhelmsen smiling with a bandaid on her arm after getting vaccinated and being renewed as a Microsoft Data Platform MVP 2021-2022

The Vaccine

I’ll start with the most personal thing… the vaccine. This past year has been difficult, as it has been for so many of us. I’m extremely privileged to live in Norway, my family is safe and healthy, I have a secure job, and I know so very well that I’m one of the incredibly lucky ones. But I also lost most of my life when the pandemic started. I could no longer travel to meet my friends, help others by speaking at events, or do any of the things that have been my most important coping mechanisms for my mental health.

Last spring I was constantly afraid, I struggled to sleep and focus, and I couldn’t keep up with work. The result was that I crashed hard. I took a break from everything and even shut down my website for a while because everything was overwhelming. By the end of summer, I thought that things were slowly getting better. They weren’t. In October, I realized that I had completely burned out, finally asked for help from my boss and my doctor, and ended up on sick leave. I thought I’d be fine after a couple of weeks, while my boss smiled and kindly told me to prepare for a rollercoaster ride that would last for months. He was right, of course. It took me 8 months before I was back working full-time. And I’m still not back to my old self yet.

But now… I feel hopeful. I get to go into the office again in August to see my coworkers, and do things I used to take for granted like go out for a coffee or visit the library. Maybe I get to see friends again before the end of the year. (It’s been two long years without them!) Things are slowly starting to feel more normal, instead of everything being scary and overwhelming. Getting the vaccine is the first step in starting to live again and not just getting through the days, and I am so ready for that!

Phew! That was… a lot 😊

Microsoft Data Platform MVP 2021-2022

On top of all those 👆🏻 emotions, I was renewed as a Microsoft Data Platform MVP 2021-2022! 🥳 I don’t feel like I deserve the award this time because I haven’t been able to do much this past year, but I am so, so, so grateful that Microsoft showed empathy and understanding and decided to give me another chance 💙

I’m excited, and that’s a feeling I haven’t felt for a while. It feels good. I’m looking forward to a sort of kind of new start? Or maybe a refresh? F5. Let’s go with that one 🤓

Ask the Experts: Data Integration in Azure Synapse Analytics (at Data Toboggan 2021)

On Saturday, June 12th, 2021, I will be moderating an Ask the Experts session at Data Toboggan! This is a free event focusing on Azure Synapse Analytics. There are over twenty sessions and lightning talks scheduled, covering topics such as architecture, performance, tools, data integration, machine learning and much more.

If you have any questions about Data Integration in Azure Synapse Analytics (or Azure Data Factory), join us! You don’t want to miss this session 🤓

Data Toboggan logo showing a toboggan (sled) going down a hill, with the text "Ask the Experts: Data Integration in Azure Synapse Analytics"

Ask the Experts from Microsoft and the Community

I’m incredibly thankful and excited that experts from both Microsoft and the community will be participating. It’s a unique opportunity for you to learn more about the technologies and products, as well as from experiences based on real-world projects.

These are the experts who will be participating:

What do you want to know?

When should you use Azure Synapse Analytics vs. Azure Data Factory? How can you optimize your Pipelines and Data Flows? Can Azure Purview bring additional benefits for Data Engineers? How do you set up automated deployment? What are some best practices to follow and pitfalls to avoid?

I’m sure you can think of many other questions! Please submit them through this form before the event. You can of course also ask questions during the session, it’s a live session after all! I just want to make sure that we have some questions queued up from the start to get the most out of our time 😊

(If we don’t get through all the questions during the session, I will try to get them answered in a follow-up blog post.)

Join us at Data Toboggan!

Take a look at the schedule, make sure you’re registered, and submit your questions. If this session is not for you, can you please spread the word to anyone in your network who might be interested? I hope to see you there! 🤓

Speaking at Global Azure Norway 2021

On Friday, April 16th, 2021, I will be speaking at Global Azure Norway! Global Azure is a virtual 3-day event where communities from around the world organize live streams that are open for everyone to join. All the live streams add up to one global event with more than 500 speakers and sessions, so you should be able to find something interesting 🤓

I will be presenting my session called Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse Analytics. Do you want to learn about something else? You can find all the worldwide sessions on the Global Azure website, or the local sessions on the Global Azure Norway website.

Speaker card showing Cathrine Wilhelmsen presenting at Global Azure Norway
Continue reading →

Speaking at Data Toboggan 2021

On Saturday, January 30th, 2021, I will be speaking at the first Data Toboggan event! This is a free event focusing on Azure Synapse Analytics. There are 14 sessions scheduled in 12 hours, covering topics such as data integration, machine learning, data warehousing, data governance, and more. Join us 🤓

I will be presenting a session called Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse Analytics.

Data Toboggan logo showing a toboggan (sled) going down a hill
Data Toboggan: The slope that enables predictive analytics

Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse Analytics

Do you regularly need to get data for your projects?

Yep! 🙋‍♀️

Data is at the core of every Business Intelligence, Data Science, and Machine Learning project. You need data to understand what has happened in the past, to predict what may happen in the future, to discover patterns and anomalies, and to gain the insight necessary for making faster and better decisions.

But before you can do any of those things, you need to ingest, store, transform, integrate, and prepare your data. Guess what? You can do all of those things in Azure Synapse Analytics – without having to write any code!

In this session, we will cover the fundamentals of data integration in Azure Synapse Analytics. First, we will go through what Pipelines and Data Flows are. Then, we will quickly build a solution for ingesting and transforming data. Finally, we will look at how to orchestrate and schedule our pipelines, and how to monitor our solution once it has been deployed.

Join us at Data Toboggan!

Check out the Data Toboggan schedule, because I’m guessing that there’s at least a session or two that’s interesting for you. And if this is not your thing, can you please share it with your coworkers and network who might be interested? Register today, follow @datatoboggan and #DataToboggan on Twitter, and I hope to see you there! 😊

Overview of Azure Data Factory User Interface

This post is part 3 of 26 in the series Beginner's Guide to Azure Data Factory

In the previous post, we started by creating an Azure Data Factory, then we navigated to it. In this post, we will navigate inside the Azure Data Factory. Let’s look at the Azure Data Factory user interface and the four Azure Data Factory pages.

Azure Data Factory Pages

On the left side of the screen, you will see the main navigation menu. Click on the arrows to expand and collapse the menu:

Animation of expanding and collapsing the pages menu in the Azure Data Factory user interface

Once we expand the navigation menu, we see that Azure Data Factory consists of four main pages: Home, Author, Monitor, and Manage:

Screenshot of the Azure Data Factory user interface showing the four main pages: Data Factory, Author, Monitor, and Manage
Continue reading →

Renaming the default branch in Azure Data Factory Git repositories from “master” to “main”

In Azure Data Factory, you can connect to a Git repository using either GitHub or Azure DevOps. When connecting, you have to specify which collaboration branch to use. In most cases, the default branch is used. Historically, the default branch name in Git repositories has been “master”. This is problematic because it is not inclusive and is very offensive to many people.

The Git project, GitHub, and Azure DevOps are making changes to allow users to specify a different default branch name. GitHub and Azure DevOps will be changing their default branch names to “main” in 2020. I fully support this change and will be doing the same in my projects.

In this post, we will go through how to rename the default branch from “master” to “main” in Azure Data Factory Git repositories hosted in GitHub and Azure DevOps. Then we will reconnect Azure Data Factory and configure it to use the new “main” branch as the collaboration branch.

For these examples, I’m using my personal demo projects. I’m not taking into consideration any branch policies, other users, third-party tools, or external dependencies. As always, keep in mind that this is most likely a larger change, both technically and organizationally, in production and enterprise projects. 😊

The Short Version

  1. Create a new “main” branch in your Git repository
  2. Set the new “main” branch as the default branch in your Git repository
  3. Delete the old “master” branch in your Git repository
  4. Disconnect from your Git repository in Azure Data Factory
  5. Reconnect to your Git repository in Azure Data Factory using the new “main” branch as the collaboration branch
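
If you prefer the command line for the Git part, steps 1 and 3 above can be sketched roughly like this. This is a hedged illustration, not the exact procedure from the full post: the helper function names are made up for this example, and it assumes the remote is called origin. Step 2 (changing the default branch) happens in the GitHub or Azure DevOps settings, and steps 4 and 5 happen inside Azure Data Factory, so none of those appear here.

```python
import shlex


def create_branch_commands(old="master", new="main", remote="origin"):
    """Step 1: create the new branch from the old one and publish it.

    Hypothetical helper; the remote name "origin" is an assumption.
    """
    return [
        ["git", "branch", new, old],                     # create "main" pointing at "master"
        ["git", "push", "--set-upstream", remote, new],  # publish it to the remote
    ]


def delete_branch_commands(old="master", remote="origin"):
    """Step 3: delete the old branch on the remote.

    Run this only after step 2, since hosting services typically refuse
    to delete the branch that is currently set as the default.
    """
    return [["git", "push", remote, "--delete", old]]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first
    for command in create_branch_commands() + delete_branch_commands():
        print(shlex.join(command))
```

Printing the commands rather than executing them is deliberate: deleting a branch is not something you want a script to do before you have double-checked the new default branch.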
Continue reading →

Keyboard shortcuts for moving text lines and windows (T-SQL Tuesday #123)

This month’s T-SQL Tuesday is hosted by Jess Pomfret (@jpomfret). She wants to hear about life hacks to make your life easier! In this post, I share two of my most-used keyboard shortcuts. One for moving text lines up and down without copying and pasting, and one for moving windows around without dragging and dropping. I use these all the time :)

Moving text lines up and down

Previously, I was moving text lines up and down in a couple of different ways. Have you ever marked all the text on a line, copied it, then pasted it again? Yeah, I did that all the time. And then I discovered there’s an easier way! Yay :)

There are a couple of different flavors to this keyboard shortcut.

In Office applications like PowerPoint and OneNote, you use Shift+Alt+Up and Shift+Alt+Down:

Recording of my two monitors. The left monitor shows the keystrokes.

In other applications like SQL Server Management Studio, Azure Data Studio, and Visual Studio Code, you simply use Alt+Up and Alt+Down.

Moving windows around or between screens

Similarly, I was previously dragging windows around multiple monitors using my mouse. Then I discovered you can use Win+Arrows to move windows around. And then I discovered that you can use Win+Shift+Arrows to immediately move windows to the same position on other monitors. Are you showing a full-screen application while presenting? Just win-shift-arrow it to the extended screen and you look like a total pro. Whaaat! :D

Recording of my two monitors. The left monitor shows the keystrokes.

Keyboard all the things!

There you go. Two of my favorite, useful, and timesaving keyboard shortcuts! I use these so much that I don’t think about them anymore – until someone goes “whoa whoa whoa wait what magic did you just do!?” :D

What are your favorite keyboard shortcuts?

Personal Highlights from 2019

It’s December 31st, 2019. WHAAAAAT? 🤯 I have no idea how we’re almost in 2020, but here we are! Just a few hours left of the year. (Hi to my friends around the world who are already in 2020! 👋🏻) Like many others, I enjoy reflecting on the year that’s almost over. This year, I’ve decided to collect some of my highlights from 2019.

(Warning! There will be lots of tweets and pictures.)

This is a total brag fest that I’m writing solely for myself. It’s my 2019 highlight reel that I can look back on when days get rough and I need a reminder that life is actually pretty awesome and I’m insanely lucky and privileged to be here. And when we get to 2025, future Cathrine can re-read everything and go “oh yeah, I remember that, we were so young and inexperienced back then, awww!” …like I do now with my old posts. It’s fun. You should try it! 😁

Lessons Learned in 2019

I also started writing about some of the more difficult parts of my year and what I learned from it… And in the middle of it, I realized that I’m not quite ready to share those thoughts yet. I still have lots of processing to do before I can turn my struggles into any kind of useful advice for others. I’m hoping to be able to do that in 2020.

But for now, I’ll share the short version:

Mark Manson taught me The Subtle Art of Not Giving a F*ck and it changed my life.

After reading that book in 2018 and reflecting on it for all of 2019, I’ve started learning to take responsibility for my own feelings, to set healthy boundaries for myself, and to choose my f*cks wisely.

So should you 💙

Continue reading →

Azure Data Factory Resources

This post is part 26 of 26 in the series Beginner's Guide to Azure Data Factory

For the past 25 days, I have written one blog post per day about Azure Data Factory. My goal was to start completely from scratch and cover the fundamentals in casual, bite-sized blog posts. This became the Beginner’s Guide to Azure Data Factory. Today, I will share a bunch of resources to help you continue your own learning journey.

I’ve already seen from your questions and comments that you are ready to jump way ahead and dive into way more advanced topics than I ever intended this series to cover 😉 And as much as I love Azure Data Factory, I can’t cover everything. So a little further down, I will share where, how, and from whom you can continue learning about Azure Data Factory.

But first…

That’s a wrap! Woohoo 🥳

Continue reading →