Does the ‘Quick Sort’ Algorithm Live Up to Its Name in Data Science?

Fastest Sorting Algorithm?

Knock Knock it’s Quick Sort here… I think you may have heard about me in class, or even from your friends or work. Anyway, in case you haven’t, I want to introduce myself.

A lot of people that work with me say that I am the fastest sorting algorithm, with a time complexity in the best case of O(NlogN), and worst-case time complexity of O( N 2). What I mean by this is that when you give me a medium-sized list, I sort it for you very quickly, but when you give me a large array I gotta be honest with you, I am a little lazy and tedious so it takes me a bit longer to sort the list for you compared to my cousin Merge Sort.

Worst case time of O( N 2) usually happens when the pivot element is picked as the smallest or biggest element of your array, so you gotta watch out for that. This basic condition I pose makes me very slow, so be careful. For a better performance, it’s smartest to pick scattered pivots and not the biggest or smallest element of the array.

Oh, how rude of me – don’t let me bore you with the minute details of my life. Instead, let my friend Bianca take you on a trip to one of my clients that is a store owner – she’s going to show you exactly how I sorted his merchandise:

This clothing retail client, Harry, just had a weekend sale, and he wants to find out what were the top most popular brands of shirts that have been sold so he can stock up for a new promotion. Now since Harry has over 25 brands of shirts this is where I shine: Quick Sort algorithm to the rescue

Now you gotta know a little secret about me! I am also a divide and conquer algorithm, and by divide and conquer I mean I split the big problem into smaller ones, solve those first and then put everything together. Below you can see the steps I follow in order to sort any array.

Step 1 − Make any element as pivot

Step 2 − Partition the array on the basis of pivot

Step 3 − Apply quick sort on left partition recursively

Step 4 − Apply quick sort on right partition recursively

So I basically split the array into smaller arrays then swap elements inside the array based on the comparison with the “pivot” element.

But wait, a lot of people ask me… How to choose the pivot element?

The pivot is chosen arbitrarily, you can choose it at the beginning of your array, at the end or middle. For the sake of not making it very complicated I suggest choosing the first element of the array as the pivot so you can further find its sorted position into the array following the steps above.

There is one BIG RULE you need to keep in mind!!!!

The aim of me the quick sort algorithm is to have all elements on the left of the pivot smaller than the pivot and the elements on the right of the pivot bigger than the pivot.

As long as you follow this rule you will be on your way to sorting your array the proper way using the Quick Sort algorithm.

Subscribe

* indicates required
Posted in ,

Bianca Jordan

info@biancadata.com

6 Comments

  1. Vince P. on September 14, 2020 at 10:42 am

    That was actually an entertaining read. I like your style of explaining things. So what sorting algorithm would you say is actually used the most in the real world?

    • Bianca Jordan on September 14, 2020 at 10:52 am

      Hey Vince, thank you for your comment! Well it kind of depends on the problem at hand, for example Quick sort is very useful for Medium arrays and Merge sort is performing best on long arrays.

  2. jonah sobistan on September 14, 2020 at 10:47 am

    hi Bianca thank you for the post… but I’m still wondering… in your opinion does ‘quick sort’ live up to its name or not? Should they instead call it ‘tedious sort’?

  3. John Eubanks on September 14, 2020 at 12:24 pm

    Bianca,

    Great post on sorting. However, wouldn’t it be much easier to do the sorting at the data layer using SQL? Just my thoughts as I rely a LOT on SQL for complex queries, including sorting and arranging data.

    • Bianca Jordan on September 14, 2020 at 1:33 pm

      Hi John, thank you for your comment! While SQL is an amazing querying language for selecting, sorting and classifying data, if you would like to pursue Data Science, sorting algorithms like Quick sort are necessary especially when working with Big Data which makes it faster than SQL in practice…of course this depending on the type of Big Data system you are using 🙂

      • John Eubanks on October 4, 2020 at 7:06 pm

        Bianca,

        Many thanks for the response. Although I use SQL a LOT, I can see something such as Python being a possibility to negate super complex queries. And, actually, make the processing quicker and more efficient as it would take a large load off of the server instance…especially, when hundreds of users are hitting the database and we are trying to create reporting tools to graph the data. Does that make sense?

        That being said, I am beginning to pick up on Python, bit by bit, when I have the time now (now that I am building out real-time data dashboards). I appreciate what you are doing, very much, and look forward to using some of your tutorials in real-world scenarios.

Leave a Comment





Subscribe

* indicates required