Trendlist: [New post] Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

Thursday, July 29, 2021

[New post] Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

by Paul Andrew

The other day Kamil (@NowinskiK) messaged me on Teams asking if I liked Mermaid's 🙂

I prepared myself for a conversation about which was better; the fish half being on top vs on the bottom. Ha! Of course not!!

Kamil was actually referring to the…

Javascript based diagramming and charting tool that renders Markdown-inspired text definitions to create and modify diagrams dynamically.

GitHub Pages: https://mermaid-js.github.io/mermaid/

GitHub Repo: https://github.com/mermaid-js

I confess until this point, I hadn't encountered the tool.

The conversation that followed, with Kamil, was about creating a pipeline lineage diagram from implementations of the Azure pipeline orchestration framework (AKA: procfwk, AKA: my baby 🙂 ).

Given the bespoke natural of the framework, a lot of traditional data cataloging tools can't simply scrap lineage metadata from Data Factory or Synapse Integration Pipelines directly as the framework means worker pipeline triggering is abstracted way via the control pipelines. They would need to traverse the procfwk metadata model.

To cut a long story short, this traversal is what has now been done. Below you'll find a T-SQL script for the procfwk database that outputs a Mermaid 'flowchart' diagram showing the execution stages for a given batch, with pipelines per stage and across stage pipeline dependencies connected via a dotted line.

To render the graphic we used a Wiki page within Azure DevOps.

I'll apply a few more tweaks to this and then include it in the next release of the procfwk via it's own repository, probably wrapped up as a Stored Procedure and better documented.

T-SQL

SET NOCOUNT ON;  --local variables DECLARE @BatchName VARCHAR(255) = 'Daily'; --set as required  DECLARE @PageContent NVARCHAR(MAX) = ''; DECLARE @BaseData TABLE 	( 	[OrchestratorId] INT NOT NULL, 	[OrchestratorName] NVARCHAR(200) NOT NULL, 	[StageId] INT NOT NULL, 	[StageName] VARCHAR(225) NOT NULL, 	[PipelineId] INT NOT NULL, 	[PipelineName] NVARCHAR(200) NOT NULL 	)   --get reusable metadata INSERT INTO @BaseData SELECT 	o.[OrchestratorId], 	o.[OrchestratorName], 	s.[StageId], 	s.[StageName], 	p.[PipelineId], 	p.[PipelineName] FROM 	[procfwk].[Pipelines] p 	INNER JOIN [procfwk].[Orchestrators] o 		ON p.[OrchestratorId] = o.[OrchestratorId] 	INNER JOIN [procfwk].[Stages] s 		ON p.[StageId] = s.[StageId] 	INNER JOIN [procfwk].[BatchStageLink] bs 		ON s.[StageId] = bs.[StageId] 	INNER JOIN [procfwk].[Batches] b 		ON bs.[BatchId] = b.[BatchId] WHERE 	p.[Enabled] = 1 	AND b.[BatchName] = @BatchName;  --add orchestrator(s) sub graphs ;WITH orchestrators AS 	( 	SELECT DISTINCT 		[OrchestratorId], 		[OrchestratorName], 		'subgraph ' + [OrchestratorName] + CHAR(13) +  		'style ' + [OrchestratorName] + ' fill:#E2F0D9,stroke:#E2F0D9' + CHAR(13) +  		'##o' + CAST([OrchestratorId] * 10000 AS VARCHAR) + '##' + CHAR(13) + 'end' + CHAR(13) 		 AS OrchestratorSubGraphs 	FROM 		@BaseData 	) SELECT 	@PageContent += OrchestratorSubGraphs FROM 	orchestrators;  --add stage sub graphs ;WITH stages AS 	( 	SELECT DISTINCT 		[OrchestratorId], 		[StageName], 		[StageId] 	FROM 		@BaseData 	), 	stageSubs AS 	( 	SELECT 		[OrchestratorId], 		STRING_AGG('subgraph ' + [StageName] + CHAR(13) +  			'style ' + [StageName] + ' fill:#FFF2CC,stroke:#FFF2CC' + CHAR(13) +  			'##s' + CAST([StageId] AS VARCHAR) + '##' + CHAR(13) + 'end', CHAR(13) 			) AS 'StageSubGraphs' 	FROM 		stages 	GROUP BY 		[OrchestratorId] 	) SELECT		 	@PageContent = REPLACE(@PageContent,'##o' + CAST([OrchestratorId] * 10000 AS VARCHAR) + '##',[StageSubGraphs]) FROM 	stageSubs;  --add pipelines within stage ;WITH pipelines AS 	( 	SELECT 		[StageId], 		STRING_AGG( 			CONCAT('p',CAST([PipelineId] * 10 AS VARCHAR),'(',[PipelineName],')',CHAR(13), 			'style ','p',CAST([PipelineId] * 10 AS VARCHAR),' fill:#F2F2F2,stroke:#F2F2F2'),CHAR(13) 			) AS 'PipelinesInStage' 	FROM 		@BaseData 	GROUP BY 		[StageId] 	) SELECT  	@PageContent = REPLACE(@PageContent,'##s' + CAST([StageId] AS VARCHAR) + '##',[PipelinesInStage]) FROM 	pipelines  --add stage nodes ;WITH stageNodes AS 	( 	SELECT DISTINCT 		[StageId], 		's' + CAST([StageId] * 100 AS VARCHAR) + '[' + [StageName] + ']' + CHAR(13) + 		'style s' + CAST([StageId] * 100 AS VARCHAR) + ' fill:#FFF2CC,stroke:#FFF2CC' + CHAR(13) AS StageNode 	FROM 		@BaseData 	) SELECT 	@PageContent = @PageContent + [StageNode] FROM 	stageNodes ORDER BY 	[StageId];  --add stage to pipeline relationships SELECT	 	@PageContent = @PageContent + 's' + CAST([StageId] * 100 AS VARCHAR)  	+ ' --> ' + 'p' + CAST([PipelineId] * 10 AS VARCHAR) + CHAR(13) FROM 	@BaseData;  --add stage to stage relationships ;WITH maxStage AS 	( 	SELECT  		MAX([StageId]) -1 AS maxStageId 	FROM 		@BaseData 	), 	stageToStage AS 	( 	SELECT DISTINCT 		's' + CAST(b.[StageId] * 100 AS VARCHAR)  		+ ' ==> ' + 's' + CAST((b.[StageId] + 1) * 100 AS VARCHAR) + CHAR(13) AS Content 	FROM 		@BaseData b 		CROSS JOIN maxStage 	WHERE 		b.[StageId] <= maxStage.[maxStageId] 	) SELECT 	@PageContent = @PageContent + [Content] FROM 	stageToStage;  --add pipeline to pipeline relationships SELECT  	@PageContent = @PageContent + 'p' + CAST(pd.[PipelineId] * 10 AS VARCHAR)  	+ ' -.- ' + 'p' + CAST(pd.[DependantPipelineId] * 10 AS VARCHAR) + CHAR(13) FROM  	[procfwk].[PipelineDependencies] pd 	INNER JOIN @BaseData b1 		ON pd.[PipelineId] = b1.[PipelineId] 	INNER JOIN @BaseData b2 		ON pd.[DependantPipelineId] = b2.[PipelineId];  --add batch subgraph SELECT 	@PageContent = 'subgraph ' + [BatchName] + CHAR(13) + 	'style ' + @BatchName + ' fill:#DEEBF7,stroke:#DEEBF7' + CHAR(13) + @PageContent FROM 	[procfwk].[Batches] WHERE 	[BatchName] = @BatchName;  SET @PageContent = @PageContent + 'end';  --add mermaid header SELECT 	@PageContent = '::: mermaid' + CHAR(13) + 'graph LR' + CHAR(13) + @PageContent;  --return output PRINT @PageContent

Mermaid (Markdown)

This is an example of the output from the procfwk default metadata.

::: mermaid graph LR subgraph Daily style Daily fill:#DEEBF7,stroke:#DEEBF7 subgraph FrameworkFactory style FrameworkFactory fill:#E2F0D9,stroke:#E2F0D9 subgraph Extract style Extract fill:#FFF2CC,stroke:#FFF2CC p10(Wait 1) style p10 fill:#F2F2F2,stroke:#F2F2F2 p20(Wait 2) style p20 fill:#F2F2F2,stroke:#F2F2F2 p30(Intentional Error) style p30 fill:#F2F2F2,stroke:#F2F2F2 p40(Wait 3) style p40 fill:#F2F2F2,stroke:#F2F2F2 end subgraph Load style Load fill:#FFF2CC,stroke:#FFF2CC p90(Wait 8) style p90 fill:#F2F2F2,stroke:#F2F2F2 p100(Wait 9) style p100 fill:#F2F2F2,stroke:#F2F2F2 end subgraph Transform style Transform fill:#FFF2CC,stroke:#FFF2CC p50(Wait 4) style p50 fill:#F2F2F2,stroke:#F2F2F2 p60(Wait 5) style p60 fill:#F2F2F2,stroke:#F2F2F2 p70(Wait 6) style p70 fill:#F2F2F2,stroke:#F2F2F2 p80(Wait 7) style p80 fill:#F2F2F2,stroke:#F2F2F2 end end subgraph WorkersFactory style WorkersFactory fill:#E2F0D9,stroke:#E2F0D9 subgraph Serve style Serve fill:#FFF2CC,stroke:#FFF2CC p110(Wait 10) style p110 fill:#F2F2F2,stroke:#F2F2F2 end end s100[Extract] style s100 fill:#FFF2CC,stroke:#FFF2CC s200[Transform] style s200 fill:#FFF2CC,stroke:#FFF2CC s300[Load] style s300 fill:#FFF2CC,stroke:#FFF2CC s400[Serve] style s400 fill:#FFF2CC,stroke:#FFF2CC s100 --> p10 s100 --> p20 s100 --> p30 s100 --> p40 s200 --> p50 s200 --> p60 s200 --> p70 s200 --> p80 s300 --> p90 s300 --> p100 s400 --> p110 s100 ==> s200 s200 ==> s300 s300 ==> s400 p30 -.- p60 p30 -.- p70 p70 -.- p100 p100 -.- p110 end

Paul Andrew | July 29, 2021 at 1:59 pm | Categories: Data & AI | URL: https://techdailychronicle.com/?p=142999

Comment

Trendlist

Thursday, July 29, 2021

[New post] Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

T-SQL

Mermaid (Markdown)

No comments:

Post a Comment

Generate a catchy title for a collection of newfangled music by making it your own

Report Abuse

Labels

Thursday, July 29, 2021

[New post] Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

New post on Tech Daily Chronicle

Using Mermaid to Create a ProcFwk Pipeline Lineage Diagram

T-SQL

Mermaid (Markdown)

No comments:

Post a Comment

Generate a catchy title for a collection of newfangled music by making it your own