Computer Networks

  

This section includes Computer Networks Q&A commonly asked in Amazon interviews, explained clearly with real-world examples and interview-focused insights to help you build strong fundamentals and answer confidently.

 

1. Amazon Prime video buffers frequently on poor networks. Would you choose TCP or UDP, and how would you mitigate packet loss?

 

Video streaming cares more about smooth playback than perfect delivery. With TCP, every lost packet must be retransmitted in order, which increases buffering and delay. On poor networks, this causes the video to pause often. UDP avoids this because it does not wait for retransmissions, so the video can keep playing even if a few packets are lost.

Packet loss is handled at the application level instead of the transport layer. Techniques like adaptive bitrate streaming automatically lower video quality when the network is weak. Buffering ahead, forward error correction (FEC), and sending key frames more frequently also help recover from lost packets without stopping playback.

Real-life example:
When your internet slows down and the video quality drops from HD to SD but keeps playing without stopping, that’s UDP-style behavior combined with smart application-level handling to deal with packet loss.
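Adaptive bitrate selection can be sketched in a few lines. The quality ladder and the pick_bitrate function below are illustrative, not Amazon's actual player logic:

```python
# Hypothetical quality ladder in kbps, from SD up to HD.
BITRATES_KBPS = [400, 1200, 3000, 6000]

def pick_bitrate(measured_kbps, safety=0.8):
    """Pick the highest rung that fits within a safety margin of bandwidth."""
    usable = measured_kbps * safety
    candidates = [b for b in BITRATES_KBPS if b <= usable]
    # on a very poor network, fall back to the lowest rung and keep playing
    return candidates[-1] if candidates else BITRATES_KBPS[0]

print(pick_bitrate(5000))  # 3000 (drops below HD to avoid rebuffering)
print(pick_bitrate(300))   # 400 (lowest rung, playback keeps going)
```

The safety margin is the key design choice: streaming slightly below the measured bandwidth leaves headroom for jitter, which matters more than squeezing out maximum quality.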

2. Amazon uses QUIC over UDP in some services — what problems does it solve compared to TCP?

 

QUIC combines the transport and TLS handshakes into a single step, so connections start faster. It also runs multiple independent streams over one connection, so a single lost packet stalls only its own stream instead of blocking all data, avoiding TCP's head-of-line blocking problem. QUIC can also migrate a connection when a device changes networks, keeping a session alive across a Wi-Fi-to-data switch.

Real-life example:
When a video or page starts loading instantly on a mobile network, even while switching between Wi-Fi and data, QUIC helps keep the connection smooth and fast.

3. Difference between flow control and congestion control.

Flow Control manages data transfer between the sender and receiver.
It ensures the sender does not send data faster than the receiver can handle,
preventing buffer overflow.

Congestion Control manages data traffic inside the network.
It prevents the network from becoming overloaded when too many packets are sent at once.

 

Basis               | Flow Control                                   | Congestion Control
--------------------|------------------------------------------------|----------------------------------------------
Purpose             | Controls data flow between sender and receiver | Controls data flow inside the network
Focus               | Receiver's capacity                            | Network traffic load
Main goal           | Prevents receiver buffer overflow              | Prevents network congestion
Problem occurs when | The receiver is slow                           | Too many packets are in the network
Example             | Sender waits if the receiver buffer is full    | Sender slows down when the network is crowded
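Flow control can be sketched with a toy sender loop (a simplification, not real TCP): the sender never has more unacknowledged data outstanding than the receiver's advertised window.

```python
def send_all(data, recv_window):
    """Deliver data without ever exceeding the receiver's advertised window."""
    sent = acked = 0
    in_flight = []
    while acked < len(data):
        # flow control: never more unacked data than the receiver allows
        while sent < len(data) and sent - acked < recv_window:
            in_flight.append(data[sent])
            sent += 1
        # the receiver drains its buffer and acknowledges everything in flight
        acked += len(in_flight)
        in_flight.clear()
    return acked

print(send_all("hello world", recv_window=4))  # 11 characters delivered
```

Congestion control would add a second, dynamic limit (a congestion window that shrinks on packet loss); the effective limit is the minimum of the two windows.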

 

4. An Amazon API has high latency only for certain regions — how do you debug at HTTP level?

 

When an Amazon API is slow only in some regions, the first step is to check the HTTP request and response details.
At the HTTP level, we focus on timing, headers, and network behavior to understand where the delay is happening.

How to Debug at HTTP Level

  • Check DNS lookup time, connection time, and response time using curl or the browser Network tab
  • Compare request and response headers from fast regions and slow regions
  • Observe HTTP status codes, response size, and retry behavior

Real-Life Example

If users in India experience slow API responses while users in the US get fast responses,
the issue could be related to CDN routing or longer network paths.
By comparing HTTP timing values, we can identify whether the delay occurs before the request
reaches the server or during the server’s response.
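The timing breakdown above can be reproduced with Python's standard library. The sketch below runs against a throwaway local server (not a real Amazon endpoint) and splits a request into DNS, connect, and response phases, the same way curl or a browser's Network tab does:

```python
import http.client
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass

# throwaway local server so the example is self-contained
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

t0 = time.perf_counter()
socket.getaddrinfo("localhost", port)      # DNS lookup phase
t1 = time.perf_counter()
conn = http.client.HTTPConnection(host, port, timeout=5)
conn.connect()                             # TCP connect phase
t2 = time.perf_counter()
conn.request("GET", "/")
resp = conn.getresponse()
resp.read()                                # full response phase
t3 = time.perf_counter()

print(f"dns={(t1 - t0) * 1000:.1f}ms "
      f"connect={(t2 - t1) * 1000:.1f}ms "
      f"response={(t3 - t2) * 1000:.1f}ms "
      f"status={resp.status}")
server.shutdown()
```

Comparing these three numbers from a fast region and a slow region tells you whether the delay is DNS, the network path, or the server itself.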

 

5. Difference between A, AAAA, CNAME, ALIAS records.

 

A Record
Points a domain name to an IPv4 address.
Example: example.com → 93.184.216.34

AAAA Record
Points a domain name to an IPv6 address.
Example: example.com → 2606:2800:220:1:248:1893:25c8:1946

CNAME Record
Points one domain name to another domain name, not an IP address.
Example: www.example.com → example.com

ALIAS Record
Works like a CNAME but can be used on the root domain.
Example: example.com → loadbalancer.aws.com

 

6. What is the difference between HTTPS, TLS, and SSL?

 

SSL (Secure Sockets Layer)
SSL was the original technology used to secure data between a user and a server.
It is now outdated and no longer considered safe for modern websites.

TLS (Transport Layer Security)
TLS is the improved and secure version of SSL used today.
It encrypts data during transfer so attackers cannot read it.

HTTPS (HyperText Transfer Protocol Secure)
HTTPS is the secure version of HTTP that uses TLS.
It protects login details, payments, and personal information on websites.

Comparison Table

Basis      | SSL                    | TLS                      | HTTPS
-----------|------------------------|--------------------------|------------------------------
What it is | Old security protocol  | Modern security protocol | Secure web protocol
Status     | Deprecated             | Actively used            | Used on secure websites
Purpose    | Legacy data encryption | Secure data encryption   | Secure website communication
Used by    | Older systems          | Modern applications      | Browsers and web servers
Example    | Legacy SSL certificate | TLS encryption           | https://amazon.com

 

7. Where does Amazon use WebSockets, and where does it avoid them?

 

Amazon uses WebSockets where real-time updates are required.
Examples include live order tracking, customer chat support, instant notifications,
and dashboards where data must update without refreshing the page.
WebSockets keep a continuous connection, making updates fast and smooth.

Amazon avoids WebSockets for normal API operations like browsing products,
searching items, or loading pages. These actions do not need a constant connection,
so regular HTTP requests are simpler, cheaper, and easier to handle at large scale.

Real-life example:
Live order status updates may use WebSockets, but product search or adding items
to the cart uses normal HTTP requests.

 

8. What happens when a DNS record is misconfigured in production?

 

When a DNS record is misconfigured in production, users may not be able to access the application or may be redirected to the wrong location. Since DNS decides how a domain connects to servers, even a small mistake can affect the entire system.

What usually happens:

  • Website or API becomes unreachable (loading errors or timeouts)
  • Users are sent to the wrong server or an outdated IP address
  • Emails stop working due to incorrect MX records
  • SSL warnings appear if the domain points to the wrong service

Real-life example:
If an e-commerce website changes its server IP but forgets to update the DNS A record, users may see the site as “down” even though the server is running properly. This can cause lost sales, failed payments, and customer complaints until the DNS issue is fixed.

9. How would you debug a DNS issue where the site works in the US but not in India?

 

When a website works in the US but not in India, the issue is usually related to
DNS resolution or regional routing. Since DNS servers can return
different results based on location, it’s important to check how the domain is resolving
in both regions.

How to Debug It:

  • Check DNS resolution from India and US using tools like nslookup or online DNS checkers to see if both regions get the same IP address.
  • Verify DNS propagation to confirm recent changes have updated globally.
  • Check CDN or Geo-routing settings if the site uses services like Cloudflare or AWS Route 53.
  • Look at TTL (Time to Live) to see if old DNS records are still cached in India.
  • Test connectivity using ping or traceroute from an Indian server.

Real-life Example:

If a company updates its server IP but DNS has not fully propagated in India,
users there may still reach the old IP address. This makes the website fail
in India, while users in the US see it working normally.

Python

  

Welcome to the Python Interview Questions section. This section covers commonly asked Python questions in Amazon interviews, explained in a simple and easy-to-understand manner to help you clearly understand each concept.

 

1. What problem does reference counting fail to solve, and how does Python fix it?

Reference counting works by keeping track of how many variables are pointing to an object. When the count becomes zero, the object is removed from memory. This method is simple and fast, but it has one major problem – it cannot handle circular references.

A circular reference happens when two or more objects refer to each other. For example, Object A refers to Object B, and Object B refers back to Object A. Even if no other part of the program is using them, their reference count never becomes zero. Because of that, memory is not freed, and this creates a memory leak.

Python solves this problem by adding something called a garbage collector. In addition to reference counting, Python runs a background process that looks for groups of objects that are only referencing each other and are not being used anywhere else in the program. When it finds such unused circular objects, it safely removes them from memory.

So in short, reference counting alone cannot clean circular references, but Python fixes this by combining reference counting with an automatic garbage collection system.
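A small demonstration of the cycle collector at work (the Node class is just illustrative):

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

a, b = Node(), Node()
a.partner, b.partner = b, a   # circular reference: a -> b -> a
del a, b                      # reference counts never reach zero on their own

collected = gc.collect()      # the cycle detector finds and frees the pair
print(collected >= 2)         # True: at least the two Node objects were freed
```

gc.collect() returns the number of unreachable objects it found; in normal operation the collector runs automatically in the background, so you rarely need to call it yourself.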

2. What happens internally when you do: a = b = []

When you write:

a = b = []

Python does not create two different lists. It creates only one empty list in memory. Then both variables a and b are made to point to that same list.

Internally, Python first creates the list object. After that, it assigns the reference of that list to b, and then assigns the same reference to a. So both variables are pointing to the same memory location.

Because of this, if you change the list using one variable, the change will also appear in the other variable.

Example Copy Code
a = b = []

a.append(10)

print("a:", a)   # a: [10]
print("b:", b)   # b: [10]  (same list object)

3. Explain Python’s object model (everything is an object) and its impact.

In Python, everything is an object. Numbers, strings, lists, functions, and even classes are all objects stored in memory. When you create a variable, you are not storing the value directly – you are storing a reference to that object.

Every object has three things: identity (its memory location), type (like int, str, list), and value (actual data). For example, when you write a = 10, Python creates an integer object 10 and a points to it.

This model makes Python very flexible. You can pass functions as arguments, return them from other functions, and store them inside variables because they are also objects. It also affects how variables behave – especially with mutable objects like lists, where multiple variables can point to the same object and changes reflect everywhere.

Because of this design, Python stays simple, consistent, and powerful, but you must understand references to avoid confusion.
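A few lines make the model concrete (the names below are just illustrative):

```python
def greet():
    return "hello"

x = 42
# every object carries an identity, a type, and a value
print(type(x).__name__, type(greet).__name__)  # int function

f = greet            # functions are objects: assignable, passable
print(f())           # hello

nums = [1, 2]
alias = nums         # two names, one list object
alias.append(3)
print(nums)          # [1, 2, 3]  (the change shows through both names)
```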

4. Why Are Sets Faster Than Lists for Membership Checks?

 

When we check if an item exists inside a list, Python goes through each element one by one until it finds a match.
This means if the list is large, it may take more time because it has to scan many elements.

Sets work differently. A set uses a special structure called a hash table.
Instead of checking every item, Python directly calculates where the value should be stored using a hash function.
So when you check membership in a set, it can usually find the item almost instantly.

Key Difference

  • List: Searches items one by one (linear search)
  • Set: Uses hashing to directly locate the item

Example

If you have 1 million numbers:

  • Checking 999999 in my_list may take time because Python may scan many values.
  • Checking 999999 in my_set is much faster because Python jumps directly to the location using hashing.

That’s why sets are preferred when you need fast membership checks.
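A quick, rough benchmark (absolute timings vary by machine, but the ordering holds):

```python
import timeit

nums_list = list(range(1_000_000))
nums_set = set(nums_list)

# membership check near the end of the list forces a near-full scan
t_list = timeit.timeit(lambda: 999_999 in nums_list, number=20)
# the set computes the hash and jumps straight to the bucket
t_set = timeit.timeit(lambda: 999_999 in nums_set, number=20)

print(t_set < t_list)  # True: set membership is dramatically faster
```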

 

5. How would you read a very large file in Python without loading the whole file into memory?

 

When a file is very large, loading the whole file into memory at once can slow down your program or even crash it. So instead of reading everything together, we can read it step by step using a generator.

A generator reads one line at a time and gives it back only when needed. This way, memory usage stays low and the program runs smoothly even with very big files.

What happens here:

  • The file is opened safely using with (so it closes automatically).
  • It reads one line at a time.
  • yield sends one line back instead of storing all lines in memory.
Here is a simple generator to read a large file safely:
def read_large_file(file_path):
    with open(file_path, "r") as file:
        for line in file:
            yield line.strip()

for line in read_large_file("bigfile.txt"):
    print(line)

6. How does method resolution order (MRO) work?

 

When a class inherits from more than one parent class, Python needs to decide
which method to run first if the same method exists in multiple parent classes.
This order is called Method Resolution Order (MRO).

Python follows a specific rule called the C3 linearization algorithm.
In simple words, it checks methods in this order:

  • First, the current class
  • Then its parent classes (from left to right)
  • Then their parent classes
  • It continues upward until it reaches the base object class

Python also ensures the order stays consistent and that a parent class
is never checked before its own parent. This prevents confusion and
keeps multiple inheritance predictable.

 

Example:
class A:
    def show(self):
        print("A")

class B(A):
    def show(self):
        print("B")

class C(A):
    def show(self):
        print("C")

class D(B, C):
    pass

obj = D()
obj.show()  # prints "B" (MRO: D -> B -> C -> A -> object)

7. Difference between __new__ and __init__


__new__ and __init__ are both used when creating objects in Python, but they do different jobs.

__new__ is responsible for creating the object.
It is called first and returns a new instance of the class. This method is mainly used when you need control over object creation, such as in immutable objects.

__init__ is responsible for initializing the object.
It runs after __new__ and sets or modifies the object’s attributes.

Example:
class Example:
    def __new__(cls):
        print("__new__ called")
        return super().__new__(cls)

    def __init__(self):
        print("__init__ called")


obj = Example()  # prints "__new__ called" then "__init__ called"

8. How would you debug a memory leak in Python production?

When a Python app keeps using more memory and does not release it, it may have a memory leak. The first step is to monitor memory usage using tools like top or cloud dashboards. If memory keeps increasing even with normal traffic, something is not being freed.

Then use tools like tracemalloc to see which part of the code is allocating memory. You can also check which objects are increasing over time.

Common reasons include growing global variables, caches that never clear, large lists or dictionaries that keep adding data, and circular references. If needed, check the garbage collector using the gc module.

In practice, reproduce the issue in staging, track memory usage step by step, find what keeps growing, and fix the root cause.
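A minimal tracemalloc session, with a deliberately leaky list standing in for a real cache, might look like this:

```python
import tracemalloc

tracemalloc.start()

cache = []                             # simulated leak: a cache that never clears
for i in range(10_000):
    cache.append("payload-" + str(i))  # this line keeps allocating

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")

# the biggest allocator should be the line that grows the cache
print(top_stats[0].size > 100_000)  # True: hundreds of KB attributed to one line
```

In production you would take two snapshots some time apart and use snapshot.compare_to() to see which lines keep growing between them.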

9. What is __slots__ and why would Amazon use it?

 

In Python, every object normally stores its attributes inside a dictionary called
__dict__. This makes objects flexible, but it also uses more memory.

__slots__ is a special feature that tells Python to create a fixed set
of attributes for a class instead of using a dictionary. This reduces memory usage
and can make attribute access slightly faster.

Why Big Companies Like Amazon Would Use It

Companies like Amazon handle millions of objects in memory at the same time —
for example, product records, user sessions, logs, or API response objects.
If each object uses extra memory because of __dict__,
the total memory cost becomes very high.

Using __slots__ helps:

  • Reduce memory usage when creating large numbers of similar objects
  • Improve performance in high-traffic systems
  • Prevent accidental creation of unwanted attributes
  • Keep data models more controlled and predictable
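A small comparison makes the difference visible (the class names are illustrative):

```python
class PlainUser:
    def __init__(self, uid, name):
        self.uid, self.name = uid, name

class SlimUser:
    __slots__ = ("uid", "name")       # fixed attribute set, no per-object dict
    def __init__(self, uid, name):
        self.uid, self.name = uid, name

p = PlainUser(1, "a")
s = SlimUser(1, "a")

print(hasattr(p, "__dict__"))  # True: attributes live in a per-object dict
print(hasattr(s, "__dict__"))  # False: slots remove the dict overhead

try:
    s.email = "x@example.com"  # not declared in __slots__
except AttributeError:
    print("unwanted attribute rejected")
```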

 

10. How can you make a custom class behave like a dictionary?

 

When we use a dictionary in Python, we access values using square brackets like data[key].
This works because dictionaries internally use special methods like __getitem__ and __setitem__.
We can create our own class and implement these methods to make it behave like a dictionary.
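A minimal sketch of such a dict-like class (the Config name and keys are illustrative):

```python
class Config:
    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):   # called by c[key] = value
        self._data[key] = value

    def __getitem__(self, key):          # called by c[key]
        return self._data[key]

c = Config()
c["host"] = "localhost"   # goes through __setitem__
print(c["host"])          # localhost, via __getitem__
```

Adding __delitem__, __contains__, and __len__ as well would make the class support del, the in operator, and len(), completing the mapping protocol.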

DSA

  

Welcome to the DSA in C++ Interview Questions section. This section covers commonly asked Data Structures and Algorithms questions in C++, frequently seen in top tech company interviews, explained in a simple and easy-to-understand way to help you clearly understand the logic and approach behind each problem.

 

1. Given a string, write a program to find the length of the longest substring that does not contain any repeating characters.

To solve this problem, we use the sliding window technique.
We keep two pointers to track the current substring and use an array (or map) to remember the last position of each character.
If we see a repeated character, we move the left pointer forward to remove the duplicate from the window.

This way, we scan the string only once and keep updating the maximum length.

Example:
#include <iostream>
#include <vector>
#include <string>
using namespace std;

int lengthOfLongestSubstring(string s) {
    vector<int> lastIndex(256, -1);  // store last position of characters
    int maxLength = 0;
    int start = 0;  // left pointer of window

    for (int i = 0; i < s.length(); i++) {
        unsigned char c = s[i];  // cast so extended characters index safely
        if (lastIndex[c] >= start) {
            start = lastIndex[c] + 1;  // move start after duplicate
        }

        lastIndex[c] = i;  // update last seen index
        maxLength = max(maxLength, i - start + 1);
    }

    return maxLength;
}

int main() {
    string s = "abcabcbb";
    cout << "Length of longest substring: " 
         << lengthOfLongestSubstring(s) << endl;
    return 0;
}

2. Check if Parentheses Are Valid

To solve this, we use a stack.

When we see an opening bracket, we push it into the stack.
When we see a closing bracket, we check the top of the stack:

  • If it matches, we remove (pop) it

  • If it doesn’t match or the stack is empty, the string is not valid

At the end, if the stack is empty, it means all brackets were matched correctly.

Example:
#include <iostream>
#include <stack>
#include <string>
using namespace std;

bool isValid(string s) {
    stack<char> st;

    for (char ch : s) {
        if (ch == '(' || ch == '{' || ch == '[') {
            st.push(ch);
        } else {
            if (st.empty()) return false;

            char top = st.top();
            st.pop();

            if ((ch == ')' && top != '(') ||
                (ch == '}' && top != '{') ||
                (ch == ']' && top != '[')) {
                return false;
            }
        }
    }

    return st.empty();
}

int main() {
    string s = "{[()]}";
    
    if (isValid(s))
        cout << "Valid Parentheses" << endl;
    else
        cout << "Invalid Parentheses" << endl;

    return 0;
}

3. You are given two linked lists. Both lists are already sorted in increasing order. Write a function to merge them into one single sorted linked list and return the head of the new list.

Since both lists are already sorted, we don’t need to sort again.
We just compare the current nodes of both lists and always pick the smaller value. Then we move forward in that list.

We keep doing this until one list becomes empty. After that, we attach the remaining part of the other list.

This way, we build a new sorted list step by step.

Example:
#include <iostream>
using namespace std;

struct ListNode {
    int val;
    ListNode* next;
    ListNode(int x) : val(x), next(NULL) {}
};

ListNode* mergeTwoLists(ListNode* l1, ListNode* l2) {
    ListNode dummy(0);
    ListNode* tail = &dummy;

    while (l1 != NULL && l2 != NULL) {
        if (l1->val < l2->val) {
            tail->next = l1;
            l1 = l1->next;
        } else {
            tail->next = l2;
            l2 = l2->next;
        }
        tail = tail->next;
    }

    if (l1 != NULL)
        tail->next = l1;
    else
        tail->next = l2;

    return dummy.next;
}

// Helper function to print list
void printList(ListNode* head) {
    while (head != NULL) {
        cout << head->val << " ";
        head = head->next;
    }
}

int main() {
    // First list: 1 -> 3 -> 5
    ListNode* l1 = new ListNode(1);
    l1->next = new ListNode(3);
    l1->next->next = new ListNode(5);

    // Second list: 2 -> 4 -> 6
    ListNode* l2 = new ListNode(2);
    l2->next = new ListNode(4);
    l2->next->next = new ListNode(6);

    ListNode* merged = mergeTwoLists(l1, l2);

    printList(merged);

    return 0;
}

4. You are given a list of intervals, where each interval has a start and an end value. If any intervals overlap, merge them into a single interval and return the final list. Intervals that do not overlap should remain as they are.

 

First, we sort the intervals based on their starting values.

Then we go through them one by one and compare the current interval with the last merged interval.

  • If they overlap, we merge them.
  • If they don’t overlap, we add the interval as it is.

This way, all overlapping intervals are combined efficiently in a single pass.

 

Example:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

vector<vector<int>> mergeIntervals(vector<vector<int>>& intervals) {
    vector<vector<int>> result;

    if (intervals.empty())
        return result;

    sort(intervals.begin(), intervals.end());

    result.push_back(intervals[0]);

    for (int i = 1; i < intervals.size(); i++) {
        if (intervals[i][0] <= result.back()[1]) {
            result.back()[1] = max(result.back()[1], intervals[i][1]);
        } else {
            result.push_back(intervals[i]);
        }
    }

    return result;
}

int main() {
    vector<vector<int>> intervals = {{1,3}, {2,6}, {8,10}, {15,18}};

    vector<vector<int>> output = mergeIntervals(intervals);

    cout << "Input Intervals: ";
    for (auto &i : intervals)
        cout << "[" << i[0] << "," << i[1] << "] ";

    cout << "\nMerged Intervals: ";
    for (auto &i : output)
        cout << "[" << i[0] << "," << i[1] << "] ";

    return 0;
}

5. Find the Lowest Common Ancestor in a Binary Tree

In a binary tree, the Lowest Common Ancestor (LCA) of two nodes is the deepest node that has both of them in its subtree (a node counts as an ancestor of itself).

The idea is simple:

  • If the current node is null, return null.

  • If the current node matches one of the given nodes, return it.

  • Recursively check the left and right subtree.

  • If both left and right return non-null values, then the current node is the LCA.

  • If only one side returns a value, pass that value upward.

This works because we search both sides of the tree and stop at the first point where both nodes are found under the same parent.

Example:
#include <iostream>
using namespace std;

struct TreeNode {
    int val;
    TreeNode* left;
    TreeNode* right;
    TreeNode(int x) : val(x), left(NULL), right(NULL) {}
};

TreeNode* lowestCommonAncestor(TreeNode* root, TreeNode* p, TreeNode* q) {
    if (root == NULL || root == p || root == q)
        return root;

    TreeNode* left = lowestCommonAncestor(root->left, p, q);
    TreeNode* right = lowestCommonAncestor(root->right, p, q);

    if (left != NULL && right != NULL)
        return root;

    return (left != NULL) ? left : right;
}

int main() {
    // Creating tree manually
    /*
            3
           / \
          5   1
         / \
        6   2
    */

    TreeNode* root = new TreeNode(3);
    root->left = new TreeNode(5);
    root->right = new TreeNode(1);
    root->left->left = new TreeNode(6);
    root->left->right = new TreeNode(2);

    // Choose the two query nodes directly: 6 and 2
    TreeNode* p = root->left->left;   // node 6
    TreeNode* q = root->left->right;  // node 2

    TreeNode* lca = lowestCommonAncestor(root, p, q);

    if (lca != NULL)
        cout << "Lowest Common Ancestor: " << lca->val << endl;
    else
        cout << "Nodes not found in tree" << endl;

    return 0;
}

6. You are given a main string and a pattern string; find all starting indices in the main string where a substring is an anagram of the pattern. For example, if the string is "cbaebabacd" and the pattern is "abc", the output is [0, 6] because "cba" and "bac" are valid anagrams of "abc".

 

An anagram means the same characters with the same frequency but in a different order.
We slide a window of the same length as the pattern string across the main string and compare character counts.
Whenever the counts match, we store the starting index.

This approach is fast because we reuse previous calculations instead of checking from scratch every time.

Example:
#include <iostream>
#include <vector>
#include <string>
using namespace std;

vector<int> findAnagrams(string s, string p) {
    vector<int> result;
    if (s.size() < p.size()) return result;

    vector<int> countP(26, 0), countS(26, 0);

    for (char c : p)
        countP[c - 'a']++;

    int windowSize = p.size();

    for (int i = 0; i < s.size(); i++) {
        countS[s[i] - 'a']++;

        if (i >= windowSize)
            countS[s[i - windowSize] - 'a']--;

        if (countS == countP)
            result.push_back(i - windowSize + 1);
    }

    return result;
}

int main() {
    string s = "cbaebabacd";
    string p = "abc";

    vector<int> output = findAnagrams(s, p);

    cout << "Input String: " << s << endl;
    cout << "Pattern: " << p << endl;
    cout << "Output Indices: ";

    for (int index : output)
        cout << index << " ";

    return 0;
}

7. You are given an array of numbers and a target value. Find two different indices such that the numbers at those indices add up to the target, and return the indices.

 

Instead of checking every pair, we store numbers in a hash map while scanning the array.
For each number, we check if the remaining value (target − current number) already exists.
This gives a fast solution in one pass.

 

Example:
#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;

vector<int> twoSum(vector<int>& nums, int target) {
    unordered_map<int, int> seen;

    for (int i = 0; i < nums.size(); i++) {
        int need = target - nums[i];
        if (seen.count(need))
            return {seen[need], i};
        seen[nums[i]] = i;
    }
    return {};
}

int main() {
    vector<int> nums = {2, 7, 11, 15};
    int target = 9;

    vector<int> result = twoSum(nums, target);

    cout << "Output: [" << result[0] << ", " << result[1] << "]";
    return 0;
}

8. You are given an array of integers and a number k. Count how many continuous subarrays have a total sum equal to k.

We use a prefix sum and a hash map.
As we move through the array, we store how many times a prefix sum has appeared.
If currentSum - k exists in the map, it means a subarray with sum k is found.

This avoids checking all subarrays and works in one pass.

Example:
#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;

int subarraySum(vector<int>& nums, int k) {
    unordered_map<int, int> prefixCount;
    prefixCount[0] = 1;

    int sum = 0, count = 0;

    for (int num : nums) {
        sum += num;

        if (prefixCount.count(sum - k))
            count += prefixCount[sum - k];

        prefixCount[sum]++;
    }

    return count;
}

int main() {
    vector<int> nums = {1, 1, 1};
    int k = 2;

    cout << "Output: " << subarraySum(nums, k);
    return 0;
}

9. You are given an unsorted array of integers. Find the length of the longest sequence of consecutive numbers that appear in the array. The solution must run in O(n) time.

 

We put all numbers into a hash set so lookup is fast.
Then we only start counting when a number has no previous consecutive number (number − 1).
From there, we keep checking the next numbers until the sequence breaks.

Each number is processed once, so the solution stays linear.

Example:
#include <iostream>
#include <vector>
#include <unordered_set>
using namespace std;

int longestConsecutive(vector<int>& nums) {
    unordered_set<int> s(nums.begin(), nums.end());
    int longest = 0;

    for (int num : s) {
        if (!s.count(num - 1)) {
            int current = num;
            int length = 1;

            while (s.count(current + 1)) {
                current++;
                length++;
            }

            longest = max(longest, length);
        }
    }
    return longest;
}

int main() {
    vector<int> nums = {100, 4, 200, 1, 3, 2};

    cout << "Output: " << longestConsecutive(nums);
    return 0;
}

Database

  

This section covers commonly asked database interview questions that are frequently seen in top tech company interviews. All topics are explained in a clear and simple way so you can understand core database concepts like SQL, indexing, normalization, transactions, joins, and performance tuning – along with how they are used in real-world applications.

 

1. How can you find duplicate orders placed by the same user within 5 minutes?

 

To solve this, we compare each order with the previous order of the same user.
If the time difference between two orders is less than or equal to 5 minutes,
we treat it as a possible duplicate.

SELECT user_id,
       order_id,
       order_time,
       previous_order_time
FROM (
    SELECT user_id,
           order_id,
           order_time,
           LAG(order_time) OVER (PARTITION BY user_id ORDER BY order_time) AS previous_order_time
    FROM orders
) AS t
WHERE TIMESTAMPDIFF(MINUTE, previous_order_time, order_time) <= 5;

This query checks each user’s previous order using LAG()
and calculates the time difference. If it is within 5 minutes,
those orders are flagged.

 

2. How do you find orders where shipping time was longer than the average shipping time?

 

First, calculate the average shipping time for all orders.
Then compare each order’s shipping duration with that average.
If it is higher, it means the order was delayed.

SELECT *
FROM orders
WHERE DATEDIFF(shipped_date, order_date) >
      (SELECT AVG(DATEDIFF(shipped_date, order_date)) FROM orders);

This query finds orders whose shipping time is greater than the overall
average shipping time.

 

3. When would a subquery perform better than a JOIN?

 

In most real cases, JOIN is faster and more optimized. But sometimes a subquery
can perform better, especially when you only need a small filtered result
instead of combining full tables.

A subquery can be useful when:

  • You only need a single value like MAX(), COUNT(), or AVG().
  • You want to filter records first before comparing them.
  • The inner query returns a small result set.
  • The database optimizer handles the subquery efficiently.

Example:

SELECT *
FROM orders
WHERE user_id IN (
    SELECT user_id 
    FROM users 
    WHERE status = 'active'
);

Here, the subquery first finds active users. Then the outer query fetches
their orders. If the active user list is small, this can work efficiently.

 

4. Explain how indexes affect INSERT performance.

Indexes make SELECT queries faster, but they slow down INSERT operations.
This happens because every time you insert a new row, the database must also
update all related indexes.

What happens during INSERT:

  • The new row is added to the table.
  • Every index on that table must also be updated.
  • If many indexes exist, more work is required.

So more indexes = slower INSERT speed.

Example:
If a table has 5 indexes and you insert 10,000 rows, the database updates
all 5 indexes for each row. This increases write time.

That’s why in high-write systems, we keep only necessary indexes.
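The effect can be demonstrated with SQLite standing in for a production engine (the same principle applies to any database): inserting identical rows into a table with three indexes takes measurably longer than into an unindexed one.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plain   (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("CREATE TABLE indexed (a INTEGER, b INTEGER, c INTEGER)")
for col in "abc":
    conn.execute(f"CREATE INDEX ix_{col} ON indexed({col})")

rows = [(i, i * 2, i * 3) for i in range(100_000)]

t0 = time.perf_counter()
conn.executemany("INSERT INTO plain VALUES (?,?,?)", rows)    # no index upkeep
t1 = time.perf_counter()
conn.executemany("INSERT INTO indexed VALUES (?,?,?)", rows)  # 3 indexes updated per row
t2 = time.perf_counter()

print((t2 - t1) > (t1 - t0))  # True: indexed inserts are slower
```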

 

5. What is a deadlock in SQL? Give an example and explain how to fix it.

 

A deadlock happens when two transactions each hold a lock the other one needs,
so neither can move forward. The database detects this cycle and automatically
rolls back one transaction (the victim) so the other can continue.

Simple Example:

-- Transaction 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- locks row 1

-- Transaction 2
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;   -- locks row 2

-- Transaction 1 now needs row 2, which Transaction 2 holds:
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- waits

-- Transaction 2 now needs row 1, which Transaction 1 holds:
UPDATE accounts SET balance = balance + 50 WHERE id = 1;   -- waits

-- Both wait on each other → Deadlock

Here, each transaction holds a lock on one row and waits for the other row.
Since both are waiting, nothing moves.

How to Fix It:

  • Always access tables and rows in the same order.
  • Keep transactions short.
  • Avoid unnecessary locks.
  • Use proper indexing to reduce lock time.
  • Let the application retry automatically if a deadlock occurs.

In real systems like banking or e-commerce, consistent ordering of updates is the most common and practical fix.
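The consistent-ordering fix can be sketched with the same accounts table: every transaction touches the lower id first, so no transaction can hold row 2 while waiting for row 1.

```sql
-- Transaction 1 (transfer 100 from account 1 to account 2)
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- lower id first
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Transaction 2 (transfer 50 from account 2 to account 1)
BEGIN;
UPDATE accounts SET balance = balance + 50 WHERE id = 1;   -- lower id first
UPDATE accounts SET balance = balance - 50 WHERE id = 2;
COMMIT;
```

With this ordering, whichever transaction locks row 1 first runs to completion while the other simply waits; the circular wait can never form.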

 

6. What is the difference between Optimistic Locking and Pessimistic Locking?

 

Both locking methods are used to control data conflicts when multiple users
try to update the same record.

Pessimistic Locking:

In pessimistic locking, the system assumes conflict will happen.
So it locks the row immediately when someone reads it for update.
Other users must wait until the lock is released.

  • Safer but can slow down performance.
  • Used in banking or payment systems.
  • Example: SELECT … FOR UPDATE

Optimistic Locking:

In optimistic locking, the system assumes conflict is rare.
It does not lock the row. Instead, it checks before updating
whether someone else has modified the data.

  • Better for high-traffic applications.
  • Uses a version number or timestamp column.
  • If version changes, update fails and must retry.

Simple Real Example:

Think of editing a Google Doc. Multiple people can edit at the same time.
If conflict happens, the system handles it later. That is optimistic locking.
But in a bank transaction, the system locks your balance while updating it.
That is pessimistic locking.
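Optimistic locking with a version column can be sketched like this (assuming a hypothetical products table that has stock and version columns):

```sql
-- 1. Read the row and remember its version.
SELECT id, stock, version FROM products WHERE id = 42;
-- Suppose this returns version = 7.

-- 2. Update only if nobody changed the row in the meantime.
UPDATE products
SET    stock   = stock - 1,
       version = version + 1
WHERE  id = 42 AND version = 7;

-- 3. If the UPDATE reports 0 affected rows, another transaction
--    modified the row first: re-read and retry.
```

No lock is held between steps 1 and 2; the version check in the WHERE clause is what detects a conflicting update.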

 

7. What is the difference between Strong Consistency and Eventual Consistency? Where would Amazon use each?

 

Strong consistency means once data is updated, every user immediately sees the latest value.
There is no delay. What you write is exactly what others read.

Eventual consistency means the update may take a short time to reflect everywhere.
Different users might temporarily see slightly different data, but after some time, all copies become the same.

Where Amazon Uses Strong Consistency:

  • Payments and checkout systems
  • Inventory updates (to avoid overselling)
  • User account balance or wallet systems

In these cases, showing outdated data can cause serious problems like double payments or wrong stock counts.

Where Amazon Uses Eventual Consistency:

  • Product reviews
  • Product view counts
  • Recommendation systems

For example, if a review takes 2–3 seconds to appear for everyone, it’s not a big issue.
So eventual consistency is faster and more scalable for such features.

 

8. How does Amazon manage very high numbers of write operations every second?

 

Amazon handles massive write traffic by designing systems that can scale horizontally.
Instead of one big server, they use many smaller servers working together.

Key Techniques:

  • Database Sharding: Data is split across multiple servers so writes are distributed.
  • Partitioning: Large tables are divided into smaller pieces.
  • Use of NoSQL Databases: Systems like DynamoDB are built for high write speed.
  • Asynchronous Processing: Some writes are processed in background queues.
  • Caching: Reduces repeated database hits.

For example, when millions of users add products to cart during a sale,
Amazon does not store all writes in one database. The load is spread across
many partitions and regions so the system stays fast.

This combination of distributed systems and smart architecture allows Amazon to handle traffic spikes without slowing down.
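Hash partitioning, one of the techniques listed above, can be sketched with MySQL-style DDL for a hypothetical cart table (managed services like DynamoDB do the equivalent internally via the partition key):

```sql
-- Spread cart writes across 8 partitions by hashing user_id,
-- so no single partition absorbs all the write traffic.
CREATE TABLE cart_items (
    user_id    BIGINT NOT NULL,
    product_id BIGINT NOT NULL,
    quantity   INT    NOT NULL,
    PRIMARY KEY (user_id, product_id)
)
PARTITION BY HASH(user_id)
PARTITIONS 8;
```

Sharding applies the same idea across separate servers instead of partitions within one server; the routing key (here user_id) decides where each write lands.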

 
