Arrays

Status

This note is complete, reviewed, and considered stable.

An array is a linear data structure used to store multiple values of the same data type in a contiguous block of memory.

The defining idea of an array is predictable memory layout.

Key properties:

  • Linear structure: Elements are arranged one after another conceptually.
  • Homogeneous data type: All elements have the same size, which is critical for address calculation.
  • Contiguous memory allocation: Every element is placed next to the previous one in memory.
  • Index-based access: Each element is accessed using an integer index.
  • Zero-based indexing (mostly): The first element is usually at index 0.

Example:

arr = [10, 20, 30, 40]
index:  0   1   2   3

Why this matters: Because all elements are the same size and contiguous, the computer can directly compute where any element lives in memory, without scanning.

Why Arrays Exist

Arrays solve two core problems:

  • Group related data under one variable
  • Fast random access using index-based addressing

Without arrays, we would need separate variables:

a, b, c, d

Arrays allow:

arr[0], arr[1], arr[2], arr[3]

Array Dimensions

One-Dimensional Array (1D)

A 1D array is the simplest form: a linear list of elements.

Example:

int arr[5] = {1, 2, 3, 4, 5};

Each element:

  • Has the same size
  • Is stored next to the previous one

Important mental model: A 1D array is just a pointer to the first element, plus index arithmetic.

Two-Dimensional Array (2D)

A 2D array represents data in rows and columns.

Example:

int matrix[2][3] = {
    {1, 2, 3},
    {4, 5, 6}
};

Memory Reality (Critical)

Despite looking like a grid, 2D arrays are stored linearly.

Row-major order:

1 | 2 | 3 | 4 | 5 | 6

Address calculation:

address = base + ((row * total_columns) + col) * element_size

This formula is non-negotiable knowledge for understanding performance and cache behavior.
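
To make the formula concrete, here is a small Python sketch using the 2×3 matrix above (variable names are illustrative) that checks the row-major flat index against the grid view:

# Row-major layout: the 2x3 matrix stored as one linear list.
matrix = [[1, 2, 3],
          [4, 5, 6]]
total_columns = 3

flat = [x for row in matrix for x in row]    # [1, 2, 3, 4, 5, 6]

row, col = 1, 2                              # the element 6
flat_index = row * total_columns + col       # 1 * 3 + 2 = 5
print(matrix[row][col] == flat[flat_index])  # True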

Three-Dimensional Array (3D)

A 3D array is an extension of the same idea.

Example:

int arr[2][3][4];

Interpretation:

  • 2 blocks
  • Each block has 3 rows
  • Each row has 4 columns

Conceptually:

  • Array of matrices
  • Matrix of rows
  • Row of elements

Memory reality:

  • Still one long contiguous block
  • Indexed using multi-level arithmetic
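
A minimal Python sketch of that multi-level arithmetic, assuming the 2 × 3 × 4 shape above (names and values are illustrative):

# Flat index for a contiguous 3D array: ((block * rows) + row) * cols + col
blocks, rows, cols = 2, 3, 4

# Fill each cell with its own flat index so the check below is easy to read.
arr3d = [[[(b * rows + r) * cols + c for c in range(cols)]
          for r in range(rows)] for b in range(blocks)]

flat = [x for block in arr3d for row in block for x in row]

b, r, c = 1, 2, 3
flat_index = (b * rows + r) * cols + c       # (1 * 3 + 2) * 4 + 3 = 23
print(arr3d[b][r][c] == flat[flat_index])    # True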

Static vs Dynamic Arrays

Static Arrays

A static array has a size fixed at compile time.

Example:

int arr[10];

Characteristics:

  • Memory typically allocated on the stack (for local variables) or in static storage
  • Size known at compile time
  • Extremely fast access
  • No resizing overhead

Advantages:

  • Predictable memory usage
  • Very low overhead
  • Cache-friendly

Limitations:

  • Cannot grow or shrink
  • Risk of stack overflow for large sizes
  • Wasted memory if over-allocated

Static arrays are common in:

  • Embedded systems
  • Performance-critical code
  • Low-level programming

Dynamic Arrays

A dynamic array can grow or shrink at runtime.

Examples:

  • Python list
  • Java ArrayList
  • C++ vector

Key idea:

A dynamic array is still backed by a static array internally.

When capacity is reached:

  1. Allocate a larger array
  2. Copy existing elements
  3. Free old memory
  4. Update reference
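
A toy Python sketch of this grow-and-copy cycle (the class name and doubling growth factor are illustrative, not how any particular library implements it):

class DynamicArray:
    """Minimal dynamic array backed by a fixed-capacity list."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity      # the fixed-size backing store

    def append(self, value):
        if self._size == self._capacity:          # capacity exhausted -> resize
            self._resize(self._capacity * 2)      # 1. allocate a larger array
        self._data[self._size] = value
        self._size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self._size):               # 2. copy existing elements
            new_data[i] = self._data[i]
        self._data = new_data                     # 3. old block becomes garbage
        self._capacity = new_capacity             # 4. update reference/capacity

arr = DynamicArray()
for v in [10, 20, 30, 40, 50]:
    arr.append(v)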

Amortized Analysis

Occasionally, appending to a dynamic array triggers a resize, which costs O(n) because all existing elements must be copied to a new memory block.

However, this cost does not happen on every append.

What actually happens is:

  • The array grows by allocating extra unused capacity
  • Most append operations simply place the element into already-allocated space
  • Resizing happens infrequently (typically when capacity is exhausted)

When the total cost of many appends is averaged over all operations, the expensive resize cost gets spread out.

Average cost per append = O(1) amortized

This is why dynamic arrays scale efficiently and are widely used in real-world systems despite occasional expensive operations.
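
In CPython you can watch this over-allocation happen: the list's memory footprint jumps only at the occasional resize, not on every append (exact sizes vary by version and platform):

import sys

lst = []
last_size = sys.getsizeof(lst)
for i in range(32):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:                 # a resize just happened
        print(f"resized at length {len(lst)}: {last_size} -> {size} bytes")
        last_size = size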

How Arrays Work Internally

Memory Addressing

The defining strength of arrays comes from direct address computation, not from any form of searching.

Given:

  • Base address of the array = B
  • Index being accessed = i
  • Size of each element = S

Address calculation:

address(arr[i]) = B + (i × S)

What this implies internally:

  • No iteration over elements
  • No branching or decision-making
  • Only a fixed amount of arithmetic

The CPU computes the target address and reads it directly.

Because this computation always takes the same amount of time, array access has constant time complexity: O(1).
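
Python's built-in array module makes this visible: buffer_info() returns the base address B, and itemsize gives S, so the address of element i is just B + i * S (a demonstration only; normal code should never rely on raw addresses):

from array import array

a = array('i', [10, 20, 30, 40])      # contiguous block of C ints
base, _length = a.buffer_info()       # (base address, number of elements)
S = a.itemsize                        # size of each element in bytes

for i in range(len(a)):
    address = base + i * S            # address(arr[i]) = B + (i * S)
    print(f"a[{i}] = {a[i]} at {hex(address)}")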

Contiguous Memory Requirement

Arrays must occupy a continuous memory region.

Consequences:

  • Insertion at beginning → shift all elements
  • Insertion in the middle → shift half the array
  • Deletion → shifting required
  • Resizing → full copy

This is the fundamental tradeoff of arrays.

Fast reads, expensive structural changes.
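
A quick (machine-dependent) timing sketch makes the tradeoff concrete: appending at the end needs no shifting, while inserting at index 0 shifts every existing element each time:

import timeit

n = 20_000
append_time = timeit.timeit("lst.append(0)", setup="lst = []", number=n)
insert_front_time = timeit.timeit("lst.insert(0, 0)", setup="lst = []", number=n)

print(f"append at end   : {append_time:.4f}s")
print(f"insert at front : {insert_front_time:.4f}s")   # much slower: O(n) shifts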

Common Array Operations & Complexity

Operation             Time Complexity   Space Complexity
Access by index       O(1)              O(1)
Update                O(1)              O(1)
Traversal             O(n)              O(1)
Insert at end         O(1)*             O(1)*
Insert at beginning   O(n)              O(1)
Insert at middle      O(n)              O(1)
Delete at end         O(1)              O(1)
Delete at beginning   O(n)              O(1)
Delete at middle      O(n)              O(1)
Search (unsorted)     O(n)              O(1)
Search (sorted)       O(log n)          O(1)

Note: For dynamic arrays, "Insert at end" is amortized O(1). The actual complexity is O(n) when resizing occurs, but averaged over many operations, it behaves as O(1). Static arrays always have O(1) insertion at the end (if capacity allows).

Common Array Algorithms & Patterns

These patterns show up repeatedly across DSA problems. Once internalized, most array questions reduce to recognizing which pattern applies.

Traversal

Process each element exactly once, in order.

Typical use cases:

  • Computing sum, max, min
  • Frequency counting
  • Validations and checks

General pattern:

for i in range(n):
    process(arr[i])

Python example (sum of array):

arr = [3, 1, 4, 1, 5]

total = 0
for x in arr:
    total += x

print(total)

Time: O(n) Space: O(1)

Searching

Linear Search

  • Works on any array
  • Checks elements one by one until found

Python example:

def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

Time: O(n) Space: O(1)

Binary Search

  • Requires the array to be sorted
  • Repeatedly halves the search space

Python example:

def binary_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right:
        mid = (left + right) // 2

        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1

    return -1

Time: O(log n) Space: O(1) (iterative)

Prefix Sum

Precompute cumulative sums so that range queries become fast.

Definition:

prefix[i] = arr[0] + arr[1] + ... + arr[i]

Python example:

arr = [2, 4, 6, 8]
prefix = [0] * len(arr)

prefix[0] = arr[0]
for i in range(1, len(arr)):
    prefix[i] = prefix[i - 1] + arr[i]

Range sum query [l, r]:

def range_sum(prefix, l, r):
    if l == 0:
        return prefix[r]
    return prefix[r] - prefix[l - 1]

Why this matters:

  • Converts repeated O(n) range queries into O(1)

Used in:

  • Range sum problems
  • Subarray sum questions
  • Difference array techniques

Sliding Window

Used when working with contiguous subarrays.

Instead of recomputing sums from scratch:

  • Add the incoming element
  • Remove the outgoing element

Python example (max sum of subarray of size k):

def max_subarray_sum(arr, k):
    window_sum = sum(arr[:k])
    max_sum = window_sum

    for i in range(k, len(arr)):
        window_sum += arr[i]
        window_sum -= arr[i - k]
        max_sum = max(max_sum, window_sum)

    return max_sum

Time: O(n) Space: O(1)

Two Pointer Technique

Uses two indices moving through the array in a controlled way.

Common forms:

  • Left / Right pointers
  • Slow / Fast pointers

Python example (check if pair with given sum exists in sorted array):

def has_pair_with_sum(arr, target):
    left, right = 0, len(arr) - 1

    while left < right:
        s = arr[left] + arr[right]

        if s == target:
            return True
        elif s < target:
            left += 1
        else:
            right -= 1

    return False

Used for:

  • Pair sum problems
  • Removing duplicates
  • Reversing arrays
  • Partitioning

Time: O(n) Space: O(1)

Sorting Arrays

Sorting is often used as a preprocessing step to unlock faster algorithms (binary search, two pointers, greedy logic).

Algorithm        Time Complexity   Extra Space
Bubble Sort      O(n²)             O(1)
Selection Sort   O(n²)             O(1)
Insertion Sort   O(n²)             O(1)
Merge Sort       O(n log n)        O(n)
Quick Sort       O(n log n) avg    O(log n)

Python built-in sort (Timsort):

arr = [5, 2, 9, 1, 3]
arr.sort() # In-place
# or
sorted_arr = sorted(arr)

Built-in sorting is stable, highly optimized, and usually the best practical choice.
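
As an example of sorting as a preprocessing step, sorting once unlocks O(log n) lookups through the standard bisect module:

from bisect import bisect_left

arr = [5, 2, 9, 1, 3]
arr.sort()                            # one-time O(n log n) preprocessing

def contains(sorted_arr, target):
    i = bisect_left(sorted_arr, target)
    return i < len(sorted_arr) and sorted_arr[i] == target

print(contains(arr, 9))   # True
print(contains(arr, 4))   # False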

Final Mental Model

Most array problems reduce to one of these:

  • Single pass → Traversal
  • Lookup → Search
  • Range queries → Prefix sum
  • Fixed-size subarrays → Sliding window
  • Ordered data → Two pointers / Binary search
  • Optimization → Sorting

Recognizing the pattern is usually harder than writing the code.

When to Use Arrays

Use arrays when:

  • Fast random access is critical
  • Data is dense
  • Memory locality matters
  • Iteration performance is important

Avoid arrays when:

  • Frequent middle insertions/deletions
  • Highly dynamic size changes
  • Sparse data representation is needed

Key Takeaways

  • Arrays are memory + math
  • O(1) access comes from address calculation
  • Contiguous memory is both strength and weakness
  • Dynamic arrays hide resizing but cannot avoid copying
  • Most data structures are built on top of arrays