REST API - A Comprehensive Guide

In his 2000 doctoral dissertation, Roy Fielding introduced REST (Representational State Transfer), an architectural style that standardizes how different systems communicate over the web using HTTP. Since then, thanks to its simplicity and expressiveness, REST has spread like wildfire through the programming community and has become the de facto standard for building web services.

What is a REST API

A REST API is an interface that allows two software applications to communicate with each other using HTTP requests to perform various operations (e.g. retrieving, creating, modifying, or deleting data). A REST API conforms to the following characteristics:

Client-Server Architecture - The client (e.g., a web browser or mobile app) sends requests to the server, which processes them and returns a response.
Stateless communication - Each request carries all the information needed to process it, so the server does not have to retain any session information between requests.
Resourceful - “resources” are objects or entities that can be accessed or manipulated such as users, posts, orders, etc. These resources are identified by URLs, and standard HTTP methods like GET, POST, PUT, and DELETE are used to perform actions on them.

Why is REST Popular?

Prior to REST, there was SOAP (Simple Object Access Protocol) - an RPC-style protocol that exchanges data in XML format. Unlike SOAP, REST emphasizes simplicity and flexibility, which quickly made it popular among programmers. Its main advantages are:

Simplicity - REST APIs use standard HTTP protocols, making them simple and intuitive for developers.
Flexibility - REST can handle different types of data formats, though JSON is the most commonly used format for REST APIs.
Stateless - Each request contains all necessary information, making REST highly scalable since the server doesn’t need to track sessions.
Decoupling - REST promotes a client-server architecture, allowing independent evolution of client and server applications.

Best Practices

Now that you have a basic understanding of REST APIs, let's dive into some best practices for designing a clean and effective REST API.

1. Know your purpose

When designing an API, it is important to ask why you need it. Identify who the clients are and how they will interact with the API server. In general, there are two patterns to choose from: a general-purpose API gateway and Backend-for-Frontend. When your API needs to serve multiple types of clients (web app, mobile app, external integrators), a general API gateway that offers common operations such as CRUD on the entities is preferable. On the other hand, if the API serves only one type of client, the Backend-for-Frontend pattern is more suitable.

Both have their pros and cons. The gateway pattern is easier to maintain but offers limited customization: you don’t want to implement an endpoint that performs a custom workflow just to satisfy one type of client, so clients usually have to handle the complexity of aggregating results from multiple endpoints themselves. In contrast, the Backend-for-Frontend pattern hides this complexity by providing a single endpoint tailored to the client; however, it becomes harder to maintain as the number of such customized endpoints grows.

2. Use clear and consistent resource naming

When defining your API endpoints, clarity and consistency are key. Resources (like users, orders, products) should be represented by nouns (avoid verbs, as in GET /users/getUsers) and should always follow a predictable pattern. It is also important to be consistent with pluralization. Common patterns include:

Pattern 1: Use singular nouns

GET /user/123   -- retrieve specific user with id
GET /user -- retrieve a collection of users
POST /user -- create new user
PUT /user/123 -- update specific user with id
DELETE /user/123 -- remove specific user with id

Pattern 2: Use plural nouns (recommended)

GET /users/123   -- retrieve specific user with id
GET /users -- retrieve a collection of users
POST /users -- create new user
PUT /users/123 -- update specific user with id
DELETE /users/123 -- remove specific user with id

Pattern 3: Use plural for operations affecting multiple entities, otherwise singular

GET /user/123   -- retrieve specific user with id
GET /users -- retrieve a collection of users
POST /user -- create new user
PUT /user/123 -- update specific user with id
DELETE /user/123 -- remove specific user with id

With this pattern, POST /users can be reserved for bulk creation.

3. Stick to standard HTTP methods

Comply with the standard HTTP methods to perform operations on your resources:

GET - to retrieve resources
POST - to create new resources
PUT - to update/to replace an existing resource
PATCH - to apply partial updates to an existing resource
DELETE - to remove a resource

Also, avoid sending data in the request body of GET and DELETE requests. Although the HTTP specification does not strictly forbid it, some clients, servers, and proxies ignore or reject such bodies, and it is more common to pass data as query parameters for these methods.
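
The naming and method conventions above can be sketched as a tiny dispatcher. This is only an illustration under stated assumptions: the USERS store and the handle function are hypothetical, and a real service would sit behind a web framework rather than a bare function.

```python
from typing import Any, Optional

# Hypothetical in-memory store standing in for a real database.
USERS: dict[int, dict[str, Any]] = {1: {"id": 1, "name": "Ada"}}


def handle(method: str, path: str, body: Optional[dict[str, Any]] = None) -> tuple[int, Any]:
    """Dispatch a request on /users or /users/<id>; return (status, payload)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "users" or len(parts) > 2:
        return 404, None

    if len(parts) == 1:  # collection: /users
        if method == "GET":
            return 200, list(USERS.values())
        if method == "POST":
            new_id = max(USERS, default=0) + 1
            USERS[new_id] = {**(body or {}), "id": new_id}
            return 201, USERS[new_id]
        return 405, None  # method not allowed on the collection

    if not parts[1].isdigit() or int(parts[1]) not in USERS:
        return 404, None
    user_id = int(parts[1])  # single resource: /users/<id>
    if method == "GET":
        return 200, USERS[user_id]
    if method == "PUT":  # full replacement of the resource
        USERS[user_id] = {**(body or {}), "id": user_id}
        return 200, USERS[user_id]
    if method == "PATCH":  # partial update: merge into the existing fields
        USERS[user_id] = {**USERS[user_id], **(body or {}), "id": user_id}
        return 200, USERS[user_id]
    if method == "DELETE":
        del USERS[user_id]
        return 204, None
    return 405, None
```

Note how the URL only ever names the resource, while the HTTP method alone decides which operation is performed.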

4. Use proper status codes

HTTP status codes should be used to convey the result of an API request. Common status codes include:

200 OK: Successful GET, PUT, or POST requests
201 Created: When a new resource is created (typically after a POST)
204 No Content: Successful request without returning any content (e.g., after DELETE)
400 Bad Request: When the client sends invalid data
401 Unauthorized: Authentication is required
403 Forbidden: Authorization failure
404 Not Found: Resource not found
500 Internal Server Error: Generic server error

You may find that more than one status code fits the same use case. For example, a creation endpoint (POST) can reasonably return either 200 or 201. Whichever you choose, make sure it is consistent across the API: you don’t want POST /users to return 200 while POST /orders returns 201, as that would confuse the clients that interact with your API.
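
One way to keep status codes consistent is to decide them in a single place. The sketch below assumes a set of hypothetical outcome labels (they are not part of any standard) and maps each one to a code from Python's standard http.HTTPStatus enum.

```python
from http import HTTPStatus

# Hypothetical outcome labels; a real service would derive them from its own logic.
OUTCOME_TO_STATUS = {
    "fetched": HTTPStatus.OK,
    "created": HTTPStatus.CREATED,
    "deleted": HTTPStatus.NO_CONTENT,
    "invalid_input": HTTPStatus.BAD_REQUEST,
    "not_authenticated": HTTPStatus.UNAUTHORIZED,
    "not_permitted": HTTPStatus.FORBIDDEN,
    "missing": HTTPStatus.NOT_FOUND,
}


def status_for(outcome: str) -> HTTPStatus:
    """Return the status code for an outcome; unknown outcomes map to 500."""
    return OUTCOME_TO_STATUS.get(outcome, HTTPStatus.INTERNAL_SERVER_ERROR)
```

Because every endpoint goes through the same table, POST /users and POST /orders cannot drift apart in what they return for a successful creation.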

5. Filtering, sorting, pagination

APIs should be designed to handle large datasets efficiently. Keep the response size in check: a response that is too large can make your server unresponsive. Filtering, sorting, and pagination are common features that you should have in place whenever you deal with a large dataset.

Filtering

Filtering allows clients to scope down the search criteria. For example:

// restrict posts to those owned by userId = 1
GET /posts?userId=1
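
On the server side, filtering usually comes down to reading known query parameters and narrowing the result set. A minimal sketch, assuming a hypothetical in-memory POSTS list and a whitelist of filterable fields (a real API would translate these into database conditions):

```python
from typing import Any
from urllib.parse import parse_qsl, urlsplit

# Hypothetical sample data.
POSTS: list[dict[str, Any]] = [
    {"id": 1, "userId": 1, "status": "available"},
    {"id": 2, "userId": 2, "status": "available"},
    {"id": 3, "userId": 1, "status": "archived"},
]

# Only these fields may be filtered on; each maps to a converter for the raw string.
ALLOWED_FILTERS = {"userId": int, "status": str}


def filter_posts(url: str) -> list[dict[str, Any]]:
    """Apply every whitelisted query parameter as an equality filter."""
    result = POSTS
    for key, raw in parse_qsl(urlsplit(url).query):
        if key not in ALLOWED_FILTERS:
            continue  # ignore unknown parameters (an API could also reject them)
        value = ALLOWED_FILTERS[key](raw)  # int() raises ValueError on bad input
        result = [post for post in result if post[key] == value]
    return result
```

A ValueError raised by a converter would typically be turned into a 400 Bad Request response.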

Sorting

Sorting allows the response to be returned in a defined order, either ascending or descending. There are multiple patterns for implementing sorting; here are a few examples:

Example 1: dual keys (order_by + ordering)

GET /posts?order_by=created_at&ordering=asc

Example 2: single key (order_by:ordering)

GET /posts?sort=created_at:asc

Example 3: sort by multiple fields

GET /posts?sort[0]=created_at:desc&sort[1]=id:asc
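
Whichever query syntax you pick, the server ends up with a list of "field:direction" specs. The sketch below shows one way to apply them; the apply_sort function and the ALLOWED_SORT_FIELDS whitelist are assumptions for illustration, and in practice the ordering is usually pushed down to the database.

```python
from typing import Any

ALLOWED_SORT_FIELDS = {"id", "created_at"}  # hypothetical whitelist


def apply_sort(rows: list[dict[str, Any]], specs: list[str]) -> list[dict[str, Any]]:
    """Order rows by specs such as ["created_at:desc", "id:asc"]."""
    parsed = []
    for spec in specs:
        field, _, direction = spec.partition(":")
        direction = direction or "asc"  # direction is optional, ascending by default
        if field not in ALLOWED_SORT_FIELDS or direction not in ("asc", "desc"):
            raise ValueError(f"unsupported sort spec: {spec!r}")
        parsed.append((field, direction == "desc"))

    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-field ordering.
    result = list(rows)
    for field, descending in reversed(parsed):
        result.sort(key=lambda row, f=field: row[f], reverse=descending)
    return result
```

Validating field names against a whitelist matters here: passing client-supplied names straight into a query is a common source of injection bugs.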

Pagination

Pagination slices the result set into small chunks and returns them to clients one chunk at a time. This is similar to how “skip” and “take” (or OFFSET and LIMIT) work in databases. The key idea is to let clients choose which page to return and how large each page is. Page numbers can start at 0 or 1 (1 is more common); it is entirely up to how your application works under the hood. Again, just make it clear to your clients and be consistent across all endpoints.

Example: return page number 1 with page size of 10.

GET /posts?page=1&per_page=10

It is also common for filtering, sorting, and pagination to be available on the same endpoint, for example:

GET /posts?status=available&sort=created_at:asc&page=1&per_page=10

Keep in mind that sorting and pagination usually need default values for when clients omit them, for instance sorting by created_at descending with page=1 and per_page=10.
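
A minimal sketch of 1-based pagination with defaults follows. The paginate function and the max_per_page cap are assumptions added for illustration; the pagination field names mirror the page and per_page parameters used above.

```python
import math
from typing import Any


def paginate(items: list[Any], page: int = 1, per_page: int = 10,
             max_per_page: int = 100) -> dict[str, Any]:
    """Return one page of items plus pagination info (pages start at 1)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    per_page = min(per_page, max_per_page)  # cap the page size to protect the server
    total_records = len(items)
    start = (page - 1) * per_page  # same idea as "skip" in a database query
    return {
        "data": items[start:start + per_page],  # "take" per_page records
        "pagination": {
            "page": page,
            "per_page": per_page,
            "total_page": math.ceil(total_records / per_page),
            "total_records": total_records,
        },
    }
```

Calling paginate(items) with no arguments falls back to page 1 with 10 records, which is the default behavior described above.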

6. API Versioning

Once an API is rolled out, it is your responsibility to maintain its availability and integrity. However, there will be times when you need to add new features or change the request/response body to meet clients’ requirements without affecting existing clients. This is where API versioning comes in. A good API is versioned, just like your codebase. There are several ways to implement versioning; some common approaches are:

Approach 1: Custom field in request headers

GET /posts
Accept: application/json
Api-Version: 1

Approach 2: Embedded as part of URL

GET /v1/posts

Approach 3: In query parameters

GET /posts?v=1
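
The three approaches can be sketched in a single resolver. The resolve_version function, its precedence order (header, then URL, then query parameter), and the default of version 1 are all assumptions; a real API would normally commit to just one approach.

```python
import re
from urllib.parse import parse_qs, urlsplit


def resolve_version(target: str, headers: dict[str, str], default: int = 1) -> int:
    """Work out the requested API version from the header, URL, or query string."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    if "api-version" in normalized:  # Approach 1: custom request header
        return int(normalized["api-version"])

    parts = urlsplit(target)
    match = re.match(r"^/v(\d+)(?:/|$)", parts.path)
    if match:  # Approach 2: version embedded in the URL
        return int(match.group(1))

    query = parse_qs(parts.query)
    if "v" in query:  # Approach 3: version in a query parameter
        return int(query["v"][0])

    return default  # no version given: fall back to the default
```

Falling back to a default keeps existing clients working when they send no version at all.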

7. Standardize requests and responses

Although there is no restriction on the content type for request/response, JSON appears to be the most widely adopted format because of its readability and compatibility across platforms.

In terms of field naming, .NET, Java, and JavaScript developers tend to prefer camelCase, while Python and Ruby developers tend to prefer snake_case. There is no right or wrong here as long as the style is consistent across all endpoints.

camelCase example:

{
   "userId": "User001",
   "createdAt": "2024-09-20T12:34:56Z"
}

snake_case example:

{
   "user_id": "User001",
   "created_at": "2024-09-20T12:34:56Z"
}

Passing arrays in query parameters is another tricky task. There are several conventions; for example, a lookup for ids in [1,2] can be represented in any of the following patterns:

Pattern 1:

GET /posts?ids=1&ids=2

Pattern 2:

GET /posts?ids[]=1&ids[]=2

Pattern 3:

GET /posts?ids[0]=1&ids[1]=2

Pattern 4:

GET /posts?ids=1,2
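
For illustration, the four patterns can all be read with the standard urllib.parse module. The parse_ids helper below is an assumption, not a standard function, and a real API would normally accept only the one convention it documents.

```python
import re
from urllib.parse import parse_qsl


def parse_ids(query: str) -> list[int]:
    """Collect ids from a query string written in any of the four patterns."""
    ids: list[int] = []
    for key, value in parse_qsl(query):
        # Matches "ids", "ids[]" and "ids[0]", "ids[1]", ...
        if re.fullmatch(r"ids(\[\d*\])?", key):
            # Pattern 4 packs several values into one comma-separated parameter.
            ids.extend(int(part) for part in value.split(",") if part)
    # Values are kept in order of appearance; explicit indexes are not re-sorted.
    return ids
```

All four forms yield the same list, which shows that the choice is purely one of convention.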

Next, ensure your API responses are properly formatted and return a consistent structure across all endpoints. A common approach is to wrap the response body in a predefined envelope. Most endpoints care about three things: data, error, and pagination info, which can be represented as follows:

// response body
{
   "data": {
       "foo": "bar"
   },
   "error": {
       "code": "bad_request",
       "verbose": "user_id must be defined"
   },
   "pagination": {
       "page": 1,
       "per_page": 10,
       "total_page": 10,
       "total_records": 98
   }
}
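
A pair of small helpers is one way to make sure every endpoint produces this envelope. The ok and fail function names are hypothetical, and the choice to set unused sections to null instead of omitting them is an assumption.

```python
import json
from typing import Any, Optional


def ok(data: Any, pagination: Optional[dict[str, int]] = None) -> str:
    """Serialize a successful response in the shared envelope."""
    return json.dumps({"data": data, "error": None, "pagination": pagination})


def fail(code: str, verbose: str) -> str:
    """Serialize an error response in the shared envelope."""
    error = {"code": code, "verbose": verbose}
    return json.dumps({"data": None, "error": error, "pagination": None})
```

Clients can then rely on the same three top-level keys being present in every response, whether it succeeded or failed.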

8. Handling dates

Another thing to take note of is date handling: consistency in the format is crucial. The most widely accepted format for dates in APIs is ISO 8601, which follows this structure: YYYY-MM-DDTHH:MM:SSZ

{
   "created_at": "2024-09-20T12:34:56Z"
}

This format is readable, sortable, and compatible across systems. It includes the date, the time, and the time zone (the trailing Z denotes UTC), so there is no ambiguity when dealing with records from different regions. Unless you are following the Backend-for-Frontend pattern and your clients reside in a single region, you would normally leave it to the client to render the date and time in its preferred format.
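
As a small sketch of this convention, the helpers below convert between timezone-aware datetimes and the UTC form shown above, using only the standard datetime module. The function names are assumptions.

```python
from datetime import datetime, timezone


def to_iso_utc(moment: datetime) -> str:
    """Format an aware datetime as YYYY-MM-DDTHH:MM:SSZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a time zone first")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def from_iso_utc(text: str) -> datetime:
    """Parse YYYY-MM-DDTHH:MM:SSZ back into an aware UTC datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
```

Rejecting naive datetimes at the boundary is a design choice: it forces the caller to state the time zone instead of letting the server guess.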

9. Handling files

Handling file uploads and downloads in a REST API introduces additional complexity and considerations compared to dealing with standard JSON data. Files such as images, documents, videos, and others are often an essential part of modern applications, so it’s crucial to handle them efficiently and securely. Here are key aspects and best practices for handling files in REST APIs.

File upload

There are multiple ways to upload files via REST APIs. The most common methods include:

a. Multipart Form Data (most common)

When uploading files, the HTTP POST request method along with multipart/form-data encoding is typically used. This allows for sending files along with additional metadata like form fields. Example:

curl -X POST http://example.com/upload \
  -F "file=@path_to_file" \
  -F "description=Sample file"

This method is widely supported and allows uploading both the file and form data in a single request.

b. Base64 encoding

Another option is to encode the file as a base64 string and send it in the request body. However, this method is less efficient because base64-encoded files are about 33% larger than their binary equivalents. Example:

{ 
   "file": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAA...", 
   "description": "Sample file" 
}

While base64 encoding can work in some situations, it's less performant and more resource-intensive than using multipart form data.
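
The size overhead is easy to verify: base64 turns every 3 bytes of input into 4 characters of output. A quick check with the standard base64 module (the base64_overhead helper is just for illustration):

```python
import base64


def base64_overhead(size_bytes: int) -> float:
    """Return the relative size increase after base64-encoding size_bytes bytes."""
    raw = bytes(size_bytes)  # zero-filled payload; the content does not affect the size
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw) - 1
```

For a 3000-byte payload the encoded form is 4000 characters, an increase of one third, before counting the JSON quoting around it.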

c. Pre-signed URLs (recommended for large files)

For large file uploads, it's often better to generate pre-signed URLs that allow clients to upload files directly to a storage service like AWS S3. This offloads the handling of large files to a dedicated file storage service, reducing the load on your server.

The general flow:

The client requests an upload URL from your API.
Your server generates a pre-signed URL (with a time-limited expiration) and returns it.
The client uploads the file directly to the cloud storage (e.g., S3) using that URL.

This method minimizes server bandwidth usage, avoids timeouts, and offers better scalability.
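
To show the idea behind a pre-signed URL, here is a simplified HMAC-based sketch. It is NOT the signing algorithm used by AWS S3 or any other provider; the SECRET, the storage.example.com host, and the presign and verify functions are all hypothetical. In practice you would call your storage provider's SDK to generate the URL.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # hypothetical key shared by the API and the storage service


def _sign(path: str, expires: int) -> str:
    message = f"PUT\n{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """API side: issue a time-limited upload URL for the given path."""
    current = time.time() if now is None else now
    expires = int(current) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return f"https://storage.example.com{path}?{query}"


def verify(url: str, now: Optional[float] = None) -> bool:
    """Storage side: accept the upload only if the URL is untampered and unexpired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(signature, _sign(parts.path, expires)) and current <= expires
```

The file bytes never pass through your API server: it only signs a short string, and the storage service checks that signature when the upload arrives.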

File downloads

Downloading files is a relatively straightforward process, where the server responds with the file in its binary format.

a. Using Content-Disposition Header

When a file is returned by the server, it’s common to include the Content-Disposition header to control how the file is handled by the browser. This header can force the browser to display a download dialog.

Example:

Content-Disposition: attachment; filename="report.pdf" 
Content-Type: application/pdf

This allows clients to download the file and have it named appropriately based on the filename provided.

b. Supporting Range Requests for Large Files

For large file downloads, you should implement HTTP Range Requests, which allow clients to download the file in chunks. This is especially useful for video streaming, resuming interrupted downloads, or saving bandwidth for large files.
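
A simplified sketch of range handling is shown below. It supports a single byte range only (not multi-range requests), treats a malformed header as if it were absent, and works on in-memory bytes; the serve_range function is an assumption, and most web servers and frameworks already provide this behavior for static files.

```python
import re
from typing import Optional


def serve_range(content: bytes, range_header: Optional[str]) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for an optional 'Range: bytes=...' header."""
    total = len(content)
    full = (200, {"Accept-Ranges": "bytes", "Content-Length": str(total)}, content)
    if not range_header:
        return full
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or match.groups() == ("", ""):
        return full  # malformed or unsupported range: ignore the header

    first, last = match.groups()
    if first == "":  # suffix form "bytes=-N": the last N bytes
        start, end = max(total - int(last), 0), total - 1
    else:  # "bytes=A-B" or open-ended "bytes=A-"
        start = int(first)
        end = min(int(last), total - 1) if last else total - 1

    if start >= total or start > end:
        # Range cannot be satisfied: tell the client the real size.
        return 416, {"Content-Range": f"bytes */{total}"}, b""

    body = content[start:end + 1]  # the end position is inclusive
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

A client resuming a download that stopped after 90 bytes would send "Range: bytes=90-" and receive a 206 Partial Content response with only the remaining bytes.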

There are also a few things you need to consider when dealing with files, such as:

a. File extension validation - Check the MIME type and file extension to ensure only allowed file types (e.g., .jpg, .png for images) are uploaded.

b. Size limitation - Set a maximum file size limit to prevent excessive storage usage or attacks through massive files.

c. File name sanitization - It is common practice to rename the file when saving it to file storage to prevent name collisions. In addition, never trust user-supplied file names directly, as they can be used to attempt path traversal attacks such as “../../etc/passwd”. Sanitize file names by removing any dangerous characters, or generate random file names and store those.

d. Metadata handling - You will often need to store additional information about the file to make it easier to query, retrieve, and manage. Such information includes the file’s name, upload date, uploader, file type, and size.
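
The extension, size, and file name checks can be combined into one small function. This is a sketch under assumptions: the allowed extensions, the 5 MB limit, and the safe_storage_name function are illustrative, and checking the actual MIME type would additionally require inspecting the file content, which is not shown.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".jpg", ".png"}  # hypothetical whitelist
MAX_SIZE_BYTES = 5 * 1024 * 1024  # hypothetical 5 MB limit


def safe_storage_name(original_name: str, size_bytes: int) -> str:
    """Validate an upload and return a random name to store it under."""
    # Drop any directory components, whichever path separator was used.
    base = original_name.replace("\\", "/").rsplit("/", 1)[-1]
    extension = os.path.splitext(base)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"file type not allowed: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        raise ValueError("file exceeds the maximum allowed size")
    # A random name avoids collisions and never reuses user-controlled text.
    return f"{uuid.uuid4().hex}{extension}"
```

The original file name can still be kept as metadata for display purposes, as long as it is never used to build a path on disk.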

10. Authentication and authorization

Secure your API by implementing proper authentication and authorization mechanisms. The most common methods include:

OAuth 2.0: Standard framework for delegated authorization, commonly used together with OpenID Connect for authentication
API keys: Simpler authentication, often used for server-to-server communication
JWT (JSON Web Token): Popular choice for stateless authentication

Covering each mechanism in detail would make this article too long; I will cover them in a future post if possible.

11. Optimization

Performance is critical for any API. Some ways to optimize include:

Caching - Implement HTTP caching using response headers (ETag, Cache-Control) to reduce load on the server.
Compression - Compress responses using gzip or brotli to reduce response size.
Rate Limiting - To prevent abuse, apply rate limiting (e.g., 100 requests per minute) to protect your API from overuse.
Request Idempotency - A mechanism that prevents the same intended operation from being performed multiple times. For example, a client that accidentally clicks the submit button twice may create two identical entities. This can be solved by having the client send a unique token (often called an idempotency key) with the request. If the server sees the same token again within a given time frame, only the first request is processed and the rest are ignored.
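
The idempotency mechanism can be sketched with an in-memory store. The IdempotencyStore class is hypothetical, and it makes one design choice that differs slightly from simply ignoring duplicates: a repeated token gets the first request's result replayed, so the client still receives a meaningful response. A production setup would typically use a shared store such as a database or cache instead of a local dict.

```python
import time
from typing import Any, Callable


class IdempotencyStore:
    """Remember results by token so a repeated request does not run twice."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: dict[str, tuple[float, Any]] = {}  # token -> (expiry, result)

    def run_once(self, token: str, operation: Callable[[], Any]) -> Any:
        now = self._clock()
        # Forget tokens whose time frame has passed.
        self._seen = {t: entry for t, entry in self._seen.items() if entry[0] > now}
        if token in self._seen:
            return self._seen[token][1]  # duplicate: replay the first result
        result = operation()  # first time this token is seen: do the work
        self._seen[token] = (now + self._ttl, result)
        return result
```

With this in place, two identical submissions carrying the same token create only one entity, while a new token (or the same token after the time frame) is processed normally.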

12. Documentation

Clear and thorough documentation is vital for developers using your API. Use tools like Swagger/OpenAPI to create interactive documentation that describes endpoints, request/response structures, parameters, and error codes.

Good documentation tools:

Some examples of good API documentation:

Conclusion

Designing an effective REST API involves balancing simplicity, performance, and security. By following these best practices, you can ensure your API is clean, scalable, and developer-friendly. A well-structured REST API empowers clients to build powerful applications on top of your services while ensuring maintainability and adaptability over time.