Remove duplicates from an array using C programming language.

Remove duplicates from an array using C programming language.
Posted on 28-08-2023

Remove duplicates from an array using C programming language.

We can remove duplicates from an array in C using various methods. One common approach is to iterate through the array and keep track of the unique elements encountered so far. Here's a sample code that demonstrates how to remove duplicates from an array:

Code:

#include <stdio.h>

void removeDuplicates(int arr[], int *size) {
if (*size <= 1)
return;

int newSize = 1; // New size of the array with duplicates removed

for (int i = 1; i < *size; i++) {
int isDuplicate = 0;

// Check if the current element is a duplicate of any previous element
for (int j = 0; j < newSize; j++) {
if (arr[i] == arr[j]) {
isDuplicate = 1;
break;
}
}

// If not a duplicate, add it to the new array and increase the new size
if (!isDuplicate) {
arr[newSize] = arr[i];
newSize++;
}
}

*size = newSize; // Update the size of the array
}

int main() {
int arr[] = {2, 4, 6, 4, 8, 10, 2, 6};
int size = sizeof(arr) / sizeof(arr[0]);

printf("Original array: ");
for (int i = 0; i < size; i++) {
printf("%d ", arr[i]);
}
printf("\n");

removeDuplicates(arr, &size);

printf("Array after removing duplicates: ");
for (int i = 0; i < size; i++) {
printf("%d ", arr[i]);
}
printf("\n");

return 0;
}

 

Explanation:

  1. #include <stdio.h>: This includes a header file that allows you to use functions for input and output in C.

  2. void removeDuplicates(int arr[], int *size): This is a function named removeDuplicates that takes an array arr and a pointer to an integer size as its arguments. The function doesn't return anything (void).

  3. if (*size <= 1) return;: This checks if the size of the array is 1 or less. If it is, there's no need to remove duplicates, so the function simply exits.

  4. int newSize = 1;: This initializes a variable newSize to keep track of the number of unique elements found so far. It's initially set to 1 because the first element is always unique.

  5. for (int i = 1; i < *size; i++) {: This loop starts from the second element (index 1) and goes up to the last element of the array.

  6. Inside this loop:

    • int isDuplicate = 0;: This initializes a flag called isDuplicate to 0, indicating that the current element is not a duplicate by default.

    • Nested loop: for (int j = 0; j < newSize; j++) {: This loop checks if the current element at index i is the same as any of the previous unique elements.

      • if (arr[i] == arr[j]) {: If a match is found, it means the current element is a duplicate, so we set isDuplicate to 1 and break out of this inner loop.

      • break;: This statement ends the inner loop prematurely if a duplicate is found, since there's no need to check further.

    • if (!isDuplicate) {: After the inner loop, if isDuplicate is still 0, it means the current element is not a duplicate.

      • arr[newSize] = arr[i];: So, we copy this non-duplicate element to the position indicated by newSize in the array arr.

      • newSize++;: We increment newSize by 1 to reflect that we've added a new unique element to the array.

  7. After the loop completes:

    • *size = newSize;: We update the original size of the array with the value of newSize.
  8. int main() {: This is the main function where the program execution starts.

  9. int arr[] = {2, 4, 6, 4, 8, 10, 2, 6};: An example array is defined with some duplicate elements.

  10. Printing the original array using a loop.

  11. removeDuplicates(arr, &size);: This function call removes duplicates from the array arr and updates the size using a pointer to size.

  12. Printing the modified array after duplicates are removed.

  13. return 0;: The main function ends and returns 0 to indicate successful execution.

The code effectively iterates through the array, identifies duplicates, and creates a new array without duplicates while updating the array size.

 

Algorithm:

Algorithm: Remove Duplicates from Array

  1. Start

  2. Include the necessary header file for input and output operations.

  3. Define a function named "removeDuplicates" that takes an array 'arr' and a pointer to an integer 'size':

    • If the size of the array is 1 or less, return.

    • Initialize a variable 'newSize' to 1.

  4. Iterate over the array from the second element (index 1) to the last element:

    • Initialize a flag 'isDuplicate' as 0 to indicate the current element is not a duplicate.

    • Loop over the array elements seen so far (up to 'newSize'):

      • If the current element is equal to any previous unique element, set 'isDuplicate' to 1 and break the loop.

    • If 'isDuplicate' is still 0, copy the current element to the position indicated by 'newSize' in the array. Increment 'newSize' by 1.

  5. Update the original 'size' with the 'newSize'.

  6. In the main function:

    • Define an example array with duplicate elements.

    • Print the original array.

    • Call the "removeDuplicates" function to remove duplicates and update the size.

    • Print the modified array without duplicates.

  7. Return 0 to indicate successful execution.

  8. End

 

Time Complexity:

The time complexity of the above program for removing duplicates from an array depends on the size of the array and the number of unique elements in it. Let's break it down:

  1. The outer loop iterates through each element in the array exactly once. In the worst case, it iterates through all the elements of the array (n times, where n is the number of elements in the array).

  2. For each element, the inner loop iterates through the unique elements seen so far, which is at most the number of unique elements found (let's call it "u"). In the worst case, if all elements are unique, this inner loop iterates up to "n" times.

Considering both loops, the worst-case time complexity of the program is O(n^2), where "n" is the number of elements in the array. This worst-case scenario occurs when all elements are unique, leading to the maximum number of iterations in the inner loop for each element.

However, the average-case time complexity is generally lower, especially if the number of unique elements is significantly smaller than the total number of elements. In practice, the program's actual performance will depend on the distribution of duplicate elements in the array. If duplicates are rare, the program will perform relatively quickly; if duplicates are common, the performance may degrade closer to the worst-case scenario.

 

Space Complexity:

The space complexity of the above program for removing duplicates from an array is primarily determined by the additional space used by the algorithm, apart from the input array. Let's break down the space complexity:

  1. Input Array: The input array itself occupies space in memory. Its space complexity is O(n), where "n" is the number of elements in the array.

  2. Additional Space: The algorithm uses a constant amount of additional space to store variables like loop counters, flags, and temporary values. These variables do not depend on the size of the input array and have a space complexity of O(1).

  3. New Array: The program removes duplicates by modifying the input array in-place, without using an additional array to store the unique elements. Therefore, the space used by the new array is constant, and its space complexity is O(1).

Considering all these factors, the overall space complexity of the program is O(n) because the input array's space complexity dominates, and the additional space used by the algorithm is relatively small and constant regardless of the input size.

Thank You