Crafting APIs

In this post, we explore the main types of web APIs and walk through best practices for crafting API requests, including pagination, sorting, filtering, caching, and rate limiting. Let's look at how APIs work and how to make the most of them.

Understanding Web API Types (HTTP/HTTPS APIs)

APIs come in different flavours, each with its own characteristics and use cases. The three primary types of APIs we'll explore are SOAP, REST, and GraphQL.

SOAP (Simple Object Access Protocol)

SOAP is one of the older API types. It relies on XML for data exchange and offers robust security features through standards such as WS-Security. However, its verbose XML envelopes add overhead, making it less suitable for lightweight applications.

REST (Representational State Transfer)

REST, short for Representational State Transfer, has gained popularity for its simplicity and flexibility. It can exchange data in various formats, most commonly JSON but also XML or YAML, and it follows loose guidelines for structuring API endpoints. A REST API consists of several key components:

Headers: Metadata about the request or response.
Body: The data being sent or received.
Query Parameters: Additional information passed in the URL.
Methods: Common HTTP methods like GET, POST, DELETE, PUT, and PATCH.
Status: HTTP status codes to indicate the result of the request.
Versioning: For maintaining backward compatibility with older clients.
Error Handling: How errors are communicated back to the client.
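
To make these components concrete, here is a minimal sketch in Python that assembles a request without sending it. The host api.example.com, the /v1/orders path, and the field names are hypothetical, chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; "v1" in the path illustrates URL-based versioning.
BASE_URL = "https://api.example.com/v1/orders"


def build_create_order_request(item_id: int, quantity: int) -> Request:
    """Assemble (but do not send) a POST request showing each component."""
    # Query parameters: extra information appended to the URL.
    query = urlencode({"notify": "true"})
    # Body: the data being sent, serialized as JSON.
    body = json.dumps({"item_id": item_id, "quantity": quantity}).encode("utf-8")
    # Headers: metadata describing the request.
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    # Method: POST, because the request creates a resource.
    return Request(f"{BASE_URL}?{query}", data=body, headers=headers, method="POST")


request = build_create_order_request(item_id=42, quantity=2)
print(request.get_method())  # POST
print(request.full_url)      # https://api.example.com/v1/orders?notify=true
```

Sending the request (for example with urllib.request.urlopen) would return the remaining pieces from the list: a status code and, on failure, an error payload.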

GraphQL

GraphQL is a relatively new API type that addresses some of the limitations of REST. It excels in preventing common problems like overfetching and underfetching data. Here are some key features of GraphQL:

Preventing Overfetching: GraphQL allows clients to specify precisely what data they need, reducing unnecessary data transfer.
N+1 Request Problem: Instead of making one request for a list and then one more request per item to fetch related data, a client can ask GraphQL for everything in a single query.
Nested Queries: GraphQL can perform nested queries, populating related data (for example, records linked by foreign keys) in the same response.
Payload Queries: GraphQL queries are usually sent as an HTTP POST with the query in the request payload, even when they only read data.
Setup Overhead: The main drawback of GraphQL is the initial setup overhead (defining a schema and resolvers), which tends to pay off as the API grows.
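
As a rough illustration, the sketch below builds a GraphQL request in Python without sending it. The endpoint URL and the schema (user, orders, id, total) are hypothetical; the point is that the client names exactly the fields it wants, nests related data, and ships the query in a POST payload.

```python
import json
from urllib.request import Request

# Hypothetical GraphQL endpoint and schema, used only for illustration.
GRAPHQL_URL = "https://api.example.com/graphql"

# The client asks for a user's name plus the id and total of each order,
# and nothing else, in a single nested query.
QUERY = """
query UserWithOrders($id: ID!) {
  user(id: $id) {
    name
    orders {
      id
      total
    }
  }
}
"""


def build_graphql_request(user_id: str) -> Request:
    """Wrap the query and its variables in a JSON payload sent via POST."""
    payload = json.dumps({"query": QUERY, "variables": {"id": user_id}})
    return Request(
        GRAPHQL_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request("123")
print(request.get_method())  # POST
```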

Crafting API Requests like a Pro

When interacting with APIs, it's essential to make efficient requests. Let's explore some best practices for pagination, sorting, and filtering in your API requests.

Documentation

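Document your API so clients know which endpoints, parameters, and responses to expect. The OpenAPI Specification (formerly known as Swagger) is a common way to do this, and Swagger tooling can render it as interactive documentation.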

Pagination

Pagination is crucial when dealing with large datasets. To request specific pages of data, you typically need to set the following parameters:

Page Size/Limit: Specifies how many items should be on each page.
Offset: Indicates where the current page starts in the dataset.
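
Here is a minimal sketch of limit/offset pagination in Python, applied to an in-memory list. The function name, the response keys (data, total, next_offset), and the sample data are assumptions made for this example; a real API would usually push the limit and offset down to the database query.

```python
from typing import Any


def paginate(items: list[Any], limit: int, offset: int) -> dict[str, Any]:
    """Return one page of `items` plus the metadata a client needs to continue."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset must be non-negative")
    page = items[offset:offset + limit]
    # next_offset is None once the last page has been reached.
    next_offset = offset + limit if offset + limit < len(items) else None
    return {"data": page, "total": len(items), "next_offset": next_offset}


products = [f"product-{i}" for i in range(1, 26)]  # 25 sample records
second_page = paginate(products, limit=10, offset=10)
print(second_page["data"][0])      # product-11
print(second_page["next_offset"])  # 20
```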

Sorting

Sorting allows you to order your API response according to specific criteria. You can include sorting instructions in your request to receive data in the desired order.
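
On the server side, handling a sort instruction might look like the sketch below. The sort_by and order parameter names, the allowed fields, and the sample records are assumptions for this example, not a fixed convention.

```python
from urllib.parse import parse_qs

# Only allow sorting on known fields, so clients cannot sort on arbitrary keys.
ALLOWED_SORT_FIELDS = {"name", "price"}


def sort_items(items: list[dict], query_string: str) -> list[dict]:
    """Sort `items` according to `sort_by` and `order` in the query string."""
    params = parse_qs(query_string)
    sort_by = params.get("sort_by", ["name"])[0]
    order = params.get("order", ["asc"])[0]
    if sort_by not in ALLOWED_SORT_FIELDS or order not in {"asc", "desc"}:
        raise ValueError("unsupported sort parameters")
    return sorted(items, key=lambda item: item[sort_by], reverse=(order == "desc"))


shirts = [
    {"name": "Oxford", "price": 40},
    {"name": "Linen", "price": 55},
    {"name": "Poplin", "price": 35},
]
result = sort_items(shirts, "sort_by=price&order=desc")
print([s["name"] for s in result])  # ['Linen', 'Oxford', 'Poplin']
```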

Filtering

Filtering helps narrow down the results you receive. You can include filtering criteria in your API request, either through query parameters or headers. For example, you can request all men's formal shirts in blue with a URL like /men-formal-shirts?f=Color%3ABlue_0074D9. Logical and comparison operators can also be used to refine filters.

Remember that these functionalities can often be achieved via headers as well as query parameters, depending on your API's design.
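
Filtering through query parameters can be sketched in a similar way. The parameter names (color, size, max_price) and the sample data are hypothetical; max_price stands in for a simple comparison operator.

```python
from urllib.parse import parse_qs

# Fields that may be matched exactly (case-insensitively) via query parameters.
FILTERABLE_FIELDS = {"color", "size"}


def filter_items(items: list[dict], query_string: str) -> list[dict]:
    """Keep items that satisfy every recognized filter in the query string."""
    params = parse_qs(query_string)
    criteria = {k: v[0] for k, v in params.items() if k in FILTERABLE_FIELDS}
    max_price = float(params["max_price"][0]) if "max_price" in params else None

    def matches(item: dict) -> bool:
        # All equality filters must match (logical AND).
        if any(str(item.get(k)).lower() != v.lower() for k, v in criteria.items()):
            return False
        # Comparison filter: price must not exceed max_price, if given.
        return max_price is None or item["price"] <= max_price

    return [item for item in items if matches(item)]


shirts = [
    {"name": "Oxford", "color": "Blue", "size": "M", "price": 40},
    {"name": "Linen", "color": "White", "size": "M", "price": 55},
    {"name": "Poplin", "color": "Blue", "size": "L", "price": 65},
]
result = filter_items(shirts, "color=blue&max_price=50")
print([s["name"] for s in result])  # ['Oxford']
```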

Caching in API Requests (304)

It's essential to touch on caching in API requests. HTTP inherently supports caching, making it possible to reduce unnecessary data transfer between the client and server. Here's how it works:

The server sends an ETag (Entity Tag) response header to the client, which serves as an identifier for a specific version of a resource.
The ETag is commonly generated by hashing the content of the response, so it changes whenever the resource changes.
On later requests, the client sends the stored ETag back in an If-None-Match header. If it still matches the current version, the server responds with a status code of 304 (Not Modified) and no body, indicating that the data hasn't changed since the client's last request.

Incorporating caching into your API design can significantly improve performance and reduce the load on your servers.
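
The ETag exchange can be sketched in Python as follows. This assumes the tag is a SHA-256 hash of the response body, which is one common choice rather than a requirement, and that the client returns it in the standard If-None-Match header. The function names and sample resource are hypothetical.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (ETag values are quoted strings)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> tuple[int, dict, bytes]:
    """Return (status, headers, body); send 304 with no body when the ETag matches."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is current, skip the payload
    return 200, headers, body


resource = b'{"id": 1, "name": "Blue shirt"}'

# First request: the client has nothing cached, so it gets the full body.
status, headers, _ = respond(resource, None)
print(status)  # 200

# Second request: the client echoes the ETag and gets an empty 304.
status, _, payload = respond(resource, headers["ETag"])
print(status, payload)  # 304 b''
```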

Understanding Rate Limiting/Throttling

Rate limiting is a critical technique used to restrict the number of requests a client can make to an API within a specified time frame, for example, three requests per second (3 req/sec) per client.

Just as caching can happen at different levels of a system, rate limiting can also be applied in multiple places, such as an API gateway, a load balancer, or the application itself.

Types of Rate Limiting

There are several ways to implement rate limiting, depending on your specific needs. Two common types are:

IP-based Rate Limiting: This method limits requests based on the IP address of the client. It's effective for preventing abuse from a specific IP but may not be suitable for identifying individual users, since many users can share one address.
User ID-based Rate Limiting: For applications with user authentication, this method restricts requests based on user IDs. It's more granular than IP-based limiting and allows you to set different rate limits for different users.
A combination of IP-based and user ID-based limiting is also possible.

Implementing Rate Limiting - Fixed Window Algorithm - A Basic Approach

One straightforward approach to rate limiting is the Fixed Window Algorithm. Here's how it works:

Divide time into fixed windows of equal length (e.g., one second each).
Use a hash table to keep track of the request count for each client (identified by IP or user ID) in the current window.
When a request is made, increment the corresponding counter; when a new window starts, reset it to zero.
If a client exceeds the defined rate limit (e.g., 3 req/sec), reject the request, typically with HTTP status 429 (Too Many Requests) and a message like "Too many requests from this IP, please try again later."

While this approach is simple and cheap, it doesn't spread requests evenly over time. A client can use its whole quota at the very end of one window and again at the very start of the next, so up to twice the limit can arrive in a short burst around the window boundary, potentially causing congestion.
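
Below is a minimal sketch of a fixed window counter in Python. The class name, its parameters, and the injectable clock are assumptions made for this example; the clock is injectable only so that the demo is deterministic. A production version would also expire counters from old windows and would likely live in a shared store rather than in process memory.

```python
import time
from collections import defaultdict
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Hash table: (client_id, window_index) -> request count.
        self.counts = defaultdict(int)

    def allow(self, client_id: str) -> bool:
        # Requests in the same window share an index, so a new window
        # starts from a fresh (zero) counter.
        window_index = int(self.clock() // self.window_seconds)
        key = (client_id, window_index)
        if self.counts[key] >= self.limit:
            return False  # the caller would answer with HTTP 429
        self.counts[key] += 1
        return True


now = [0.9]  # fake clock value, in seconds
limiter = FixedWindowLimiter(limit=3, window_seconds=1.0, clock=lambda: now[0])

first = [limiter.allow("10.0.0.1") for _ in range(4)]
print(first)   # [True, True, True, False]

now[0] = 1.1   # a new window has started
second = [limiter.allow("10.0.0.1") for _ in range(3)]
print(second)  # [True, True, True]
```

In this run, six requests are accepted within 0.2 seconds even though the limit is three per second, because the counter starts over as soon as the window index changes.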

Implementing Rate Limiting - A More Efficient Solution: Sliding Window Algorithm

To address the inefficiencies of the Fixed Window Algorithm, we introduce the Sliding Window Algorithm. Here's how it differs:

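Maintain a sorted list of request timestamps for each client, stored in a hash table keyed by client.
When a request arrives, first discard that client's timestamps that are older than the window (e.g., more than one second old).
If the number of remaining timestamps is below the limit, record the new timestamp and process the request; otherwise, reject it.

Because the window moves with each request instead of resetting at fixed boundaries, the limit holds over any interval of the window's length. This makes it a more accurate and fair method of rate limiting, at the cost of storing a timestamp per request.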
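
A sliding window limiter can be sketched in Python with a per-client queue of timestamps. As before, the class name, parameters, and injectable clock are assumptions for this example. Because timestamps arrive in order, a deque stays sorted without extra work.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any window-length interval."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Hash table: client_id -> timestamps of accepted requests, oldest first.
        self.timestamps = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        log = self.timestamps[client_id]
        # Drop timestamps that have slid out of the window.
        while log and log[0] <= now - self.window_seconds:
            log.popleft()
        if len(log) >= self.limit:
            return False  # the caller would answer with HTTP 429
        log.append(now)
        return True


now = [0.9]  # fake clock value, in seconds
limiter = SlidingWindowLimiter(limit=3, window_seconds=1.0, clock=lambda: now[0])

first = [limiter.allow("10.0.0.1") for _ in range(4)]
print(first)  # [True, True, True, False]

now[0] = 1.1
print(limiter.allow("10.0.0.1"))  # False, the requests from t=0.9 still count

now[0] = 1.95
print(limiter.allow("10.0.0.1"))  # True, the earlier requests have expired
```

Unlike a counter that resets at fixed boundaries, this limiter keeps rejecting requests at t=1.1 because the three requests made at t=0.9 are still inside the one-second window.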

API Design Principles

1. Clear and Consistent Naming

Choose intuitive names for your endpoints, methods, and parameters. Clarity in naming enhances usability and simplifies integration.

2. Well-Defined Schemas

Define consistent data schemas for requests and responses, typically in JSON or XML, to ensure uniformity and interoperability.

3. Pagination for Large Data Sets

When dealing with extensive data, implement pagination to divide results into manageable chunks. Include parameters like page and page_size in your API for easy navigation.

4. Rate Limiting

To prevent abuse and ensure fair usage, enforce rate limiting. Set request limits per user or application, and provide informative error responses when limits are exceeded.

5. Filtering and Sorting

Allow clients to filter and sort results based on specific criteria. Common parameters include filter, sort_by, and order.

6. API Versioning

As your API evolves, introduce versioning to maintain backward compatibility. Include version information in the URL or headers.
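
One way a server might read the version from either place is sketched below. The /vN/ path prefix, the X-API-Version header name, and the set of supported versions are all hypothetical choices for this example.

```python
import re

SUPPORTED_VERSIONS = {"1", "2"}  # hypothetical versions this server understands
DEFAULT_VERSION = "1"            # used when the client does not specify one


def resolve_version(path: str, headers: dict[str, str]) -> str:
    """Pick the API version from a /vN/ path prefix, else from a header."""
    match = re.match(r"^/v(\d+)/", path)
    if match:
        version = match.group(1)
    else:
        version = headers.get("X-API-Version", DEFAULT_VERSION)
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version: {version}")
    return version


print(resolve_version("/v2/users", {}))                   # 2
print(resolve_version("/users", {"X-API-Version": "1"}))  # 1
```

Falling back to a default version keeps older clients working when they send no version at all.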

Real-World Examples

1. GitHub API

Endpoint: https://api.github.com/users/{username}/repos

Pagination: GitHub's API uses page and per_page parameters for pagination. For example, ?page=2&per_page=10 retrieves the second page of repositories with ten items per page.
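
As a small sketch, the paginated URL can be built in Python like this. The helper name is hypothetical, "octocat" is just a sample username, and the example only constructs the URL without calling the API.

```python
from urllib.parse import urlencode


def repos_url(username: str, page: int, per_page: int) -> str:
    """Build the repository listing URL for one page of results."""
    query = urlencode({"page": page, "per_page": per_page})
    return f"https://api.github.com/users/{username}/repos?{query}"


print(repos_url("octocat", page=2, per_page=10))
# https://api.github.com/users/octocat/repos?page=2&per_page=10
```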

Rate Limiting: GitHub enforces rate limits based on the user's authentication level. Unauthenticated users have lower limits than authenticated ones.

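Filtering and Sorting: This endpoint lets you filter repositories with ?type=owner and sort them with ?sort=updated&direction=desc. Filtering by language or sorting by stars is available through GitHub's separate repository search endpoint instead.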

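Versioning: GitHub's REST API does not put the version in the URL. Clients select a version with a request header instead, for example X-GitHub-Api-Version: 2022-11-28; the older v3 API was selected with Accept: application/vnd.github.v3+json.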

2. Twitter API

Endpoint: https://api.twitter.com/1.1/statuses/user_timeline.json

Pagination: Twitter's API utilizes count and max_id parameters for pagination. For instance, ?count=20&max_id=1234567890 retrieves up to 20 tweets with IDs less than or equal to 1234567890, that is, that tweet and older ones.
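
Walking backwards through a timeline with max_id can be sketched as below. The helper name and the sample tweet IDs are made up for this example, and no request is sent; the function only computes the parameters for the next page from the page just received.

```python
from typing import Optional


def next_page_params(tweets: list[dict], count: int = 20) -> Optional[dict]:
    """Return query parameters for the next (older) page, or None when done."""
    if not tweets:
        return None
    oldest_id = min(tweet["id"] for tweet in tweets)
    # max_id is inclusive, so subtract 1 to avoid receiving the oldest tweet again.
    return {"count": count, "max_id": oldest_id - 1}


page = [{"id": 1234567890}, {"id": 1234567801}, {"id": 1234567755}]
print(next_page_params(page))  # {'count': 20, 'max_id': 1234567754}
```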

Rate Limiting: Twitter enforces rate limits per endpoint in 15-minute windows, and the allowed number of requests varies by endpoint and by authentication type.

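Filtering and Sorting: The user timeline endpoint supports filters such as ?exclude_replies=true and ?include_rts=false, and always returns tweets newest first; it has no sort parameter. Searching by hashtag is done through the separate search endpoint, with the # character URL-encoded, as in ?q=%23hashtag.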

Versioning: Twitter includes the version in the URL, e.g., https://api.twitter.com/1.1/statuses/user_timeline.json