Whilst notebooks are great, there comes a time and place when you just want to use Python and PySpark in its pure form. Databricks can execute Python jobs for when notebooks don’t feel very enterprise data pipeline ready - %run and widgets just look like schoolboy hacks. Also, the lack of debugging in Databricks is painful at times. By having a PySpark application we can debug locally in our IDE of choice (I’m using VSCode).
We have released a big update to the CI/CD Tools on GitHub today.
These updates are for cluster management within Databricks. They allow you to create or update clusters, and to stop, start, delete and resize them.
Databricks provides some nice connectors for reading and writing data to SQL Server. These are generally what you need, as they act in a distributed fashion and support predicate pushdown and so on. But sometimes you want to execute a stored procedure or a simple statement.
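One possible way to run a single statement from PySpark is to borrow the JDBC `DriverManager` from the driver's JVM. The sketch below is an assumption rather than a documented Databricks API: `execute_statement` is a hypothetical helper, the `jvm` argument is expected to be the Py4J JVM view (for example the private `spark.sparkContext._jvm` attribute), and the connection details and procedure name are placeholders.

```python
def execute_statement(jvm, jdbc_url, user, password, statement):
    """Run one SQL statement (e.g. "EXEC dbo.MyProc") over JDBC on the driver.

    `jvm` is assumed to be a Py4J JVM view that exposes java.sql.DriverManager.
    """
    conn = jvm.java.sql.DriverManager.getConnection(jdbc_url, user, password)
    try:
        stmt = conn.prepareCall(statement)
        try:
            stmt.execute()
        finally:
            # Always release the statement, even if execution fails
            stmt.close()
    finally:
        # Always release the connection
        conn.close()


# Hypothetical usage inside a PySpark session (placeholder values):
# execute_statement(
#     spark.sparkContext._jvm,
#     "jdbc:sqlserver://myserver.database.windows.net;databaseName=mydb",
#     "my_user",
#     "my_password",
#     "EXEC dbo.MyProc",
# )
```

Because this runs only on the driver, it suits small control statements rather than bulk data movement, which is better left to the distributed connectors.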
When you create a Databricks workspace using the Azure portal you obviously specify the Resource Group to create it in. But in the background a second resource group is created.
If you are deploying lots of Linked Services to an environment it would be nice to run a test that proves they connect successfully. This can validate that many things…
Sometimes you find that the Azure PowerShell cmdlets do not offer all of the functionality of the REST API/Portal. In these cases you can fall back to the REST API, which can of course be called from PowerShell.
One thing that bugs me in SQL Server is how hard it is to get information about your tables to analyse usage, indexes and size.
The data required “unpivoting” so that the measures became just three columns for Volume, Retail & Actual, with each original row becoming three rows, one for each of Years 16, 17 & 18.
There are various ways of doing this in Spark, and using Stack is an interesting one. But I find this complex and hard to read.
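For reference, here is a rough sketch of what the Stack approach can look like. Spark SQL's `stack(n, expr1, ..., exprk)` splits its arguments into `n` rows. The wide column names (`Volume16`, `Retail16` and so on), the `Product` key column and the `build_stack_expr` helper are all assumptions made for illustration, since the real column names are not shown here.

```python
def build_stack_expr(years, measures):
    """Build a Spark SQL stack() expression that emits one row per year.

    Assumes the wide columns are named <measure><year>, e.g. Volume16.
    """
    rows = []
    for year in years:
        cols = ", ".join(f"{m}{year}" for m in measures)
        # Each output row starts with the year label, then its measures
        rows.append(f"'{year}', {cols}")
    aliases = ", ".join(["Year"] + list(measures))
    return f"stack({len(years)}, {', '.join(rows)}) as ({aliases})"


expr = build_stack_expr(["16", "17", "18"], ["Volume", "Retail", "Actual"])
print(expr)

# With a DataFrame `df` holding those wide columns (hypothetical):
# long_df = df.selectExpr("Product", expr)
```

Even with a helper generating it, the resulting expression is one long positional list, which goes some way to showing why this approach can be hard to read.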
A while back now I started to create some PowerShell modules for assisting with DevOps CI and CD scenarios.