In favor of the SQL assignment syntax
This post is about aliasing columns using the assignment operator, which is simply =. I always did this on Microsoft’s SQL Server1 and Azure variants. This syntax sugar I really miss on Snowflake. Instead of:
SELECT
calculation AS NEWALIAS
I use:
SELECT
NEWALIAS = calculation
It seems underwhelming, but analytical queries differ from your everyday transactional queries. Transactional queries select a few columns with little data, analytical queries can refer to ten or more tables with very wide selections and filters. Once built, they have to be maintained for a long time, so we better help our future selves and make it easy to find, read and edit SQL statements. I’ll focus on complicated select statements for now.
Let’s start off with something you might find in your average data warehouse:
CREATE OR REPLACE VIEW [common].[Employment] as
WITH DATASET_HRSystem AS (
SELECT
CAST(CONCAT
([Employments].[Emp_code]
,'-',[Employments].[Formation_CostCenter]
,'-',[Employments].[EmploymentNumber]
,'-',[Employments].[Formation_StartDate]
,'-',[Employments].[Formation_OU]
,'-',[Employments].[Formation_Project]
,'-',[Employments].[Formation_Function]
,'-',[Employments].[Formation_sequence])
AS NVARCHAR(200)) AS [BK_Employment]
,CAST([Employments].[Emp_code] AS NVARCHAR(50)) AS [Employee]
,CAST([Employments].[Type_emp_code] AS NVARCHAR(20)) AS [TypeEmployee]
,CAST(CONCAT([Employments].[AdministrationNumber],'-',[Employments].[Formation_CostCenter]) AS NVARCHAR(50)) AS [CostCenter]
,CAST(NULL AS NVARCHAR(50)) AS [CostCarrier]
,CAST([Employments].[EmploymentNumber] AS INT) AS [EmploymentNumber]
,CAST(
CASE
WHEN [Employments].[Formation_BeginDate] < [Employments].[StartDate_Schedule]
THEN [Employments].[StartDate_Schedule]
ELSE [Employments].[Formation_BeginDate]
END
AS DATE) AS [StartDate]
,CAST(
COALESCE(
CASE
WHEN COALESCE([EndDate_Schedule],'2999-12-31') < COALESCE([Formation_EndDate],'2999-12-31')
THEN [EndDate_Schedule]
ELSE [Formation_EndDate]
END
,'2999-12-31')
AS DATE) AS [EndDate]
,CAST([Employments].[Type_contract_desc] AS NVARCHAR(100)) AS [Duration]
,CAST([Employments].[Employ_date__total_duration] AS DATE) AS [EmployedDate]
,CAST(ISNULL([Employments].[Formation_EndDate],'2999-12-31') AS DATE) AS [ExpectedEnddate]
,CAST([Employments].[Date_resigned] AS DATE) AS [DischargeDate]
,CAST(
CASE
WHEN [Type_contract_desc] IN ('Temporary','Temporary without definitive enddate')
THEN 'Temporary Contract'
WHEN [Type_contract_desc] IN ('Permanent','Normal employment')
THEN 'Permanent Contract'
ELSE [Type_contract_desc]
END
AS NVARCHAR(100)) AS [EmploymentType]
,CAST([Employments].[Employment_desc] AS NVARCHAR(100)) AS [EmploymentDescription]
,CAST([Employments].[Employment_code] AS NVARCHAR(20)) AS [EmploymentCode]
,CAST(CASE WHEN [Employments].[Reason_exit_code] IS NULL THEN NULL ELSE CONCAT([Employments].[Reason_exit_code],'-',[Employments].[Employer]) END AS NVARCHAR(50)) AS [ExitReasonCode]
,CAST([Employments].[Type_empl] AS NVARCHAR(100)) AS [TypeEmployeeDescription]
,CAST([Employments].[Employer] AS NVARCHAR(255)) AS [Employer]
,CAST([Employments].[Formatie_OU] AS NVARCHAR(25)) AS [OrganisationUnit]
,CAST(COALESCE([Main_emplmnt],0) AS NVARCHAR(50)) AS [MainEmployment]
,CAST([Employments].[Formation_project] AS NVARCHAR(25)) AS [Project]
,CAST(NULL AS NVARCHAR(25)) AS [Location]
,CAST([Employments].[Formation_Percent] AS NUMERIC(18,4)) AS [Formation_Percent]
,CAST([Employments].[FTE] AS NUMERIC(18,4)) AS [FTE]
,CAST([Employments].[Formation_function] AS NVARCHAR(100)) AS [FunctionCode]
FROM [s_hrsystem].[Employments] AS [Employments]
)
SELECT
[BK_Employment]
,[Employee]
,[TypeEmployee]
,[CostCenter]
,[CostCarrier]
,[EmploymentNumber]
,[StartDate]
,[EndDate]
,[Duration]
,[EmployedDate]
,[ExpectedEnddate]
,[DischargeDate]
,[EmploymentType]
,[EmploymentDescription]
,[EmploymentCode]
,[ExitReasonCode]
,[TypeEmployeeDescription]
,[Employer]
,[OrganisationUnit]
,[MainEmployment]
,[Project]
,[Location]
,[Formation_Percent]
,[FTE]
,[FunctionCode]
FROM DATASET_HRSystem
Why I don’t like to write SQL this way:
- Hidden aliases, even CTRL-F is little help because of repeating terms such as “EndDate”
- Reading from left to right, noise of the calculation has to be processed first
- Some rows are still too long and break through the barrier
- Fixing all rows gets old fast, and columns spaced too far reduce readability even further
- Any change to the code will need fixing the spaces and tabs
- The alias is at the bottom and end of multi-line calculation, which is counterintuitive
- On smaller screens or windows, the aliases might not be visible and require scrolling
The solution is simple, just use the assignment operator (=) and turn it around!
CREATE OR REPLACE VIEW [common].[Employment] as
WITH DATASET_HRSystem AS (
SELECT
[BK_Employment] = CAST(CONCAT
([Employments].[Emp_code]
,'-',[Employments].[Formation_CostCenter]
,'-',[Employments].[EmploymentNumber]
,'-',[Employments].[Formation_StartDate]
,'-',[Employments].[Formation_OU]
,'-',[Employments].[Formation_Project]
,'-',[Employments].[Formation_Function]
,'-',[Employments].[Formation_sequence])
AS NVARCHAR(200))
,[Employee] = CAST([Employments].[Emp_code] AS NVARCHAR(50))
,[TypeEmployee] = CAST([Employments].[Type_emp_code] AS NVARCHAR(20))
,[CostCenter] = CAST(CONCAT([Employments].[AdministrationNumber],'-',[Employments].[Formation_Kostenplaats]) AS NVARCHAR(50))
,[CostCarrier] = CAST(NULL AS NVARCHAR(50))
,[EmploymentNumber] = CAST([Employments].[EmploymentNumber] AS INT)
,[StartDate] = CAST(
CASE
WHEN [Employments].[Formation_BeginDate] < [Employments].[StartDate_Schedule]
THEN [Employments].[StartDate_Schedule]
ELSE [Employments].[Formation_BeginDate]
END
AS DATE)
,[EndDate] = CAST(
COALESCE(
CASE
WHEN COALESCE([EndDate_Schedule],'2999-12-31') < COALESCE([Formation_EndDate],'2999-12-31')
THEN [EndDate_Schedule]
ELSE [Formation_EndDate]
END
,'2999-12-31')
AS DATE)
,[Duration] = CAST([Employments].[Type_contract_desc] AS NVARCHAR(100))
,[EmployedDate] = CAST([Employments].[Employ_date__total_duration] AS DATE)
,[ExpectedEnddate] = CAST(ISNULL([Employments].[Formation_EndDate],'2999-12-31') AS DATE)
,[DischargeDate] = CAST([Employments].[Date_resigned] AS DATE)
,[EmploymentType] = CAST(
CASE
WHEN [Type_contract_desc] IN ('Temporary','Temporary without definitive enddate')
THEN 'Temporary Contract'
WHEN [Type_contract_desc] IN ('Permanent','Normal employment')
THEN 'Permanent Contract'
ELSE [Type_contract_desc]
END
AS NVARCHAR(100))
,[EmploymentDescription] = CAST([Employments].[Employment_desc] AS NVARCHAR(100))
,[EmploymentCode] = CAST([Employments].[Employment_code] AS NVARCHAR(20))
,[ExitReasonCode] = CAST(CASE WHEN [Employments].[Reason_exit_code] IS NULL THEN NULL ELSE CONCAT([Employments].[Reason_exit_code],'-',[Employments].[Employer]) END AS NVARCHAR(50))
,[TypeEmployeeDescription] = CAST([Employments].[Type_empl] AS NVARCHAR(100))
,[Employer] = CAST([Employments].[Employer] AS NVARCHAR(255))
,[OrganisationUnit] = CAST([Employments].[Formatie_OU] AS NVARCHAR(25))
,[MainEmployment] = CAST(COALESCE([Main_emplmnt],0) AS NVARCHAR(50))
,[Project] = CAST([Employments].[Formation_project] AS NVARCHAR(25))
,[Location] = CAST(NULL AS NVARCHAR(25))
,[Formation_Percent] = CAST([Employments].[Formation_Percent] AS NUMERIC(18,4))
,[FTE] = CAST([Employments].[FTE] AS NUMERIC(18,4))
,[FunctionCode] = CAST([Employments].[Formation_function] AS NVARCHAR(100))
FROM [s_hrsystem].[Employments] AS [Employments]
)
SELECT
[BK_Employment]
,[Employee]
,[TypeEmployee]
,[CostCenter]
,[CostCarrier]
,[EmploymentNumber]
,[StartDate]
,[EndDate]
,[Duration]
,[EmployedDate]
,[ExpectedEnddate]
,[DischargeDate]
,[EmploymentType]
,[EmploymentDescription]
,[EmploymentCode]
,[ExitReasonCode]
,[TypeEmployeeDescription]
,[Employer]
,[OrganisationUnit]
,[MainEmployment]
,[Project]
,[Location]
,[Formation_Percent]
,[FTE]
,[FunctionCode]
FROM DATASET_HRSystem
Now we have solved the previous points:
- All column aliases are right in the front
- All reading is from left to right
- All are in the same place, no matter how long the calculation
- Changes to the code have no influence on alias position
- First the alias, then the multi-line calculation is intuitive
- Even small screens and windows can navigate the query easily
This has really helped me structure my data warehouses and navigate my queries a lot easier.