Before digging into AWS's Simple Queue Service (SQS), it helps to have a brief understanding of queues in general.
A queue is a data structure that holds items in a linear order and follows the First-In-First-Out (FIFO) principle: the item that comes in first is the one that goes out first. The concept comes from real-world queues, such as the line outside an ATM or the numbered appointment tokens handed out at a doctor's office.
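As a minimal sketch of the FIFO principle, here is a plain in-memory queue built on Python's collections.deque. It has nothing to do with SQS itself; the customer names are made up for illustration.

```python
from collections import deque

# A queue: items leave in the same order in which they arrived (FIFO).
line = deque()
for customer in ["alice", "bob", "carol"]:
    line.append(customer)          # enqueue at the back

# Dequeue from the front until the queue is empty.
served = [line.popleft() for _ in range(len(line))]
print(served)
```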
The Simple Queue Service offered by AWS takes inspiration from the general principle of the queue and helps in building decoupled applications. It's important to be clear at the outset that an SQS queue is not always FIFO-based.
SQS is most useful when we want to build two components of an application (hereafter called the producer and the consumer) that need not communicate synchronously. The producer's job is to send a message. The consumer's job is to receive that message and process it. The producer is not concerned with the consumer's processing or its response. This is in stark contrast to HTTP-based synchronous communication, where the client sends a request and waits for the server's response before it can continue its own processing.
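The decoupling described above can be sketched in-process with Python's queue.Queue and a worker thread standing in for SQS and a separate consumer service. This is only an analogy under that assumption; the None sentinel and the upper-casing "processing" step are hypothetical choices for the demo.

```python
import queue
import threading

def producer(q, orders):
    """Send each order and move on without waiting for any reply."""
    for order in orders:
        q.put(order)
    q.put(None)  # hypothetical end-of-stream marker, used only by this demo

def consumer(q, processed):
    """Take messages at its own pace and process them."""
    while True:
        order = q.get()
        if order is None:
            break
        processed.append(order.upper())  # stand-in for real processing

def run_demo(orders):
    q = queue.Queue()
    processed = []
    worker = threading.Thread(target=consumer, args=(q, processed))
    worker.start()
    producer(q, orders)   # returns as soon as the messages are queued
    worker.join()
    return processed
```

The producer never sees a response from the consumer; the only thing the two share is the queue.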
SQS-based architectures
These are distributed systems comprising three pillars:
Distributed components (producers and consumers): They communicate with SQS via three main APIs, namely SendMessage, ReceiveMessage, and DeleteMessage.
SQS queues: The queues themselves are distributed across AWS servers.
Message: AWS maintains message redundancy. So, a single message will have multiple copies residing across multiple servers.
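The three APIs map onto a client object roughly as follows. This is a sketch that assumes sqs is a boto3 SQS client (for example boto3.client("sqs")) and that queue_url points at an existing queue; the function name and the handler callback are hypothetical.

```python
def send_and_consume(sqs, queue_url, body, handler):
    """Send one message, then receive, process, and delete whatever arrives."""
    # Producer side: SendMessage.
    sqs.send_message(QueueUrl=queue_url, MessageBody=body)

    # Consumer side: ReceiveMessage.
    response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)

    processed = 0
    for message in response.get("Messages", []):   # key is absent if nothing arrived
        handler(message["Body"])
        # Consumer side: DeleteMessage, identified by the receipt handle.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        processed += 1
    return processed
```

In a real system the producer and the consumer halves would live in different components; they are shown together here only to keep the sketch short.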
The whole message lifecycle shown in the diagram below will help you understand how SQS and the architecture based on it work.
Points to note:
Message A is sent using the SendMessage API. Copies of Message A are stored on multiple servers. Since SQS queues themselves are distributed, this redundancy helps in serving messages from multiple servers to multiple consumers.
Component 2 calls the ReceiveMessage API to consume the message. As soon as Component 2 receives Message A, the message becomes invisible to other consumers for the duration of the visibility timeout. This keeps other consumers from picking up the same message while it is being processed.
Once the message is successfully processed, Component 2 calls the DeleteMessage API, which deletes all copies of Message A from SQS. If Component 2 does not delete the message before the visibility timeout expires, the message becomes visible again and can be received by another consumer.
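The lifecycle above can be illustrated with a small in-memory toy model. It is not how SQS is implemented: the ToyQueue class, its method signatures, and the explicit now argument (used instead of a real clock so the behaviour is easy to follow) are all assumptions made for this sketch.

```python
import itertools

class ToyQueue:
    """Toy model of send / receive / delete with a visibility timeout."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}             # message_id -> [body, invisible_until]
        self._handles = {}              # receipt_handle -> message_id
        self._ids = itertools.count(1)

    def send_message(self, body):
        message_id = f"msg-{next(self._ids)}"
        self._messages[message_id] = [body, 0.0]   # visible immediately
        return message_id

    def receive_message(self, now):
        """Return (receipt_handle, body) for one visible message, or None."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message from other consumers for the timeout period.
                entry[1] = now + self.visibility_timeout
                handle = f"rh-{next(self._ids)}"
                self._handles[handle] = message_id
                return handle, entry[0]
        return None

    def delete_message(self, receipt_handle):
        """Remove the message that this receipt handle refers to."""
        message_id = self._handles.pop(receipt_handle)
        self._messages.pop(message_id, None)
```

With a 30-second timeout, a message received at time 0 is hidden at time 10, reappears at time 31 if it was never deleted, and is gone for good once delete_message is called with a handle from a receive.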
Types of Queues
SQS supports two types of queues:
Standard: This is the default one. This queue guarantees:
At-Least-Once Delivery: A message will be delivered at least once. On rare occasions a message may be delivered more than once (duplication), which is a consequence of the redundant and distributed nature of the SQS architecture.
Best-Effort Ordering: Messages are generally delivered in the order in which they were sent, but this is not guaranteed.
FIFO: This queue guarantees:
No Duplication: Uses deduplicationId to make sure no message is duplicated.
Ordered delivery of messages: Uses messageGroupId to achieve this.
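Because a standard queue may deliver the same message more than once, consumers of standard queues are usually written to tolerate duplicates. One common approach, sketched below, is to remember which message IDs have already been handled. The function name is hypothetical, the messages are assumed to be dictionaries with MessageId and Body keys (the shape of entries in a ReceiveMessage response), and a real consumer would keep seen_ids in durable storage rather than in memory.

```python
def process_once(messages, handler, seen_ids):
    """Run handler for each message unless its MessageId was already handled."""
    handled = 0
    for message in messages:
        message_id = message["MessageId"]
        if message_id in seen_ids:
            continue                  # duplicate delivery: skip the side effect
        handler(message["Body"])
        seen_ids.add(message_id)      # remember it only after handling succeeded
        handled += 1
    return handled
```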
Important Configs for SQS
Visibility Timeout: The length of time that a message received from a queue, by one consumer, won't be visible to the other message consumers. Typically, you should set the visibility timeout to the maximum time that it takes your application to process and delete a message from the queue.
Delivery Delay: The length of time for which a message won't be visible to any message consumer after it is sent to the queue. This is useful when your application needs some time before it is ready to start consuming a message.
Receive message wait time: The maximum amount of time that Amazon SQS waits for messages to become available after the queue gets a receive request.
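These three settings correspond to the queue attributes VisibilityTimeout, DelaySeconds, and ReceiveMessageWaitTimeSeconds. The helper below is a sketch that builds such an attribute map, with values as strings, and checks each value against the limits SQS documents (0 to 12 hours, 0 to 15 minutes, and 0 to 20 seconds respectively). The function name and its defaults are illustrative; the result would typically be passed as the Attributes argument when creating or updating a queue.

```python
def build_queue_attributes(visibility_timeout=30, delivery_delay=0, receive_wait_time=0):
    """Return an SQS attribute map for the three timing settings (all in seconds)."""
    limits = {
        "VisibilityTimeout": (visibility_timeout, 0, 43200),          # up to 12 hours
        "DelaySeconds": (delivery_delay, 0, 900),                     # up to 15 minutes
        "ReceiveMessageWaitTimeSeconds": (receive_wait_time, 0, 20),  # up to 20 seconds
    }
    attributes = {}
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high} seconds")
        attributes[name] = str(value)   # attribute values are sent as strings
    return attributes
```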
To fully understand SQS, it's important to understand some related concepts.
Identifiers
MessageId: Each message receives a system-assigned messageId that SQS returns to you in SendMessageResponse.
Receipt Handle: Every time you receive a message from a queue, you receive a receipt handle for that message. This handle is associated with the action of receiving the message, not with the message itself. To delete the message or to change the message visibility, you must provide the receipt handle (not the message ID). Thus you must always receive a message before you can delete it. A producer that has only sent a message therefore cannot delete it, because it holds no receipt handle for it.
DeduplicationId: If a message is sent successfully to a FIFO queue, any subsequent message sent with the same deduplicationId within the deduplication interval (5 minutes) is accepted by the queue but not delivered, because it is considered a duplicate.
MessageGroupId: Each message sent to a FIFO queue carries a messageGroupId. All messages that belong to a group are delivered in the same order as they were received by the FIFO queue. This is implemented by partitions in the FIFO queue: each message is assigned to a particular partition based on its messageGroupId, and all messages in that partition are ordered and thus delivered in the same order. Note that messages belonging to different groups may not be delivered in order relative to one another.
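The deduplication and grouping rules can be modelled with a toy in-memory class. This is a sketch of the observable behaviour only, not of how SQS partitions work internally; the ToyFifoQueue class, its methods, and the explicit now argument are assumptions made for illustration. In the real API the two identifiers are sent as the MessageDeduplicationId and MessageGroupId parameters of SendMessage.

```python
from collections import defaultdict, deque

DEDUP_INTERVAL_SECONDS = 300  # the 5-minute deduplication interval

class ToyFifoQueue:
    """Toy model: per-group ordering plus time-limited deduplication."""

    def __init__(self):
        self._groups = defaultdict(deque)   # group id -> bodies in arrival order
        self._accepted_at = {}              # deduplication id -> time it was stored

    def send(self, body, group_id, dedup_id, now):
        """Return True if the message is stored, False if treated as a duplicate."""
        first_seen = self._accepted_at.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_INTERVAL_SECONDS:
            return False                    # accepted by the call, but never delivered
        self._accepted_at[dedup_id] = now
        self._groups[group_id].append(body)
        return True

    def receive(self, group_id):
        """Return the oldest undelivered body in the group, or None."""
        group = self._groups[group_id]
        return group.popleft() if group else None
```

Messages within group "A" come out in the order they went in, regardless of what was sent to group "B" in between, and a repeated deduplication ID is ignored until the interval has passed.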
Polling
Polling refers to actively sampling the status of an external device by a client program. Consumers actively poll for messages from the SQS queue. AWS provides two types of polling:
Short Polling (Default): The request queries only a subset of servers and returns the messages found on those servers, so a response can be empty even when the queue holds messages. Servers that were not part of the subset get sampled in subsequent poll requests.
Long Polling: All servers are queried for messages. Long polling is in effect when the wait time for the ReceiveMessage API action is greater than 0. It helps reduce the cost of using Amazon SQS by cutting down the number of empty responses (no messages are available) and false empty responses (messages are available but were not included in the response).
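The difference between the two polling modes comes down to a single request parameter. The sketch below assumes sqs is a boto3 SQS client and uses its receive_message call; the poll_once name and its defaults are hypothetical.

```python
def poll_once(sqs, queue_url, wait_seconds=20, max_messages=10):
    """Make one ReceiveMessage call; wait_seconds > 0 makes it a long poll."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,   # SQS allows 1 to 10 per call
        WaitTimeSeconds=wait_seconds,       # 0 = short poll, up to 20 = long poll
    )
    # The "Messages" key is absent when no message arrived in time.
    return response.get("Messages", [])
```

A consumer would normally call this in a loop; with long polling each empty iteration blocks for up to wait_seconds instead of returning immediately, which is where the cost saving comes from.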
Dead Letter Queue
A dead-letter (DL) queue handles the lifecycle of messages that could not be consumed. When a message is not processed successfully, it is sent to the DL queue. The redrive policy specifies the source queue, the dead-letter queue, and the conditions under which Amazon SQS moves messages from the former to the latter: a message is moved if the consumer of the source queue fails to process it a specified number of times, as defined by maxReceiveCount.
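A redrive policy is stored on the source queue as a JSON string under the RedrivePolicy attribute, with the keys deadLetterTargetArn and maxReceiveCount. The helper below is a sketch that builds that attribute; the function name, the default of 5, and the ARN in the usage note are made up for illustration.

```python
import json

def build_redrive_attributes(dead_letter_queue_arn, max_receive_count=5):
    """Return the attribute map that attaches a dead-letter queue to a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # The policy itself is serialized to JSON inside the attribute value.
    return {"RedrivePolicy": json.dumps(policy)}
```

For example, build_redrive_attributes("arn:aws:sqs:us-east-1:123456789012:orders-dlq", 3) yields an attribute map that could be applied to the source queue with a SetQueueAttributes call.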
Data Protection
Data protection refers to protecting data in two ways:
Messages in Transit: This includes the protection of messages as they travel to and from Amazon SQS. You can protect data in transit using Secure Sockets Layer (SSL) or client-side encryption.
Messages at Rest: This includes messages while they are stored on disks in Amazon SQS data centers. You can protect data at rest by requesting Amazon SQS to encrypt your messages before saving them to disk in its data centers and then decrypt them when the messages are received.
SQS provides server-side encryption (SSE) to protect messages at rest. SSE encrypts messages as soon as Amazon SQS receives them. Amazon SQS stores messages in encrypted form and decrypts them only when sending them to an authorized consumer.
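Server-side encryption is switched on through queue attributes. The sketch below builds the attribute map for the two options SQS offers: SQS-managed keys (SqsManagedSseEnabled) or a KMS key (KmsMasterKeyId with KmsDataKeyReusePeriodSeconds, which SQS documents as 60 to 86400 seconds). The function name and defaults are hypothetical.

```python
def build_sse_attributes(kms_key_id=None, data_key_reuse_seconds=300):
    """Return queue attributes enabling SSE with SQS-managed keys or a KMS key."""
    if kms_key_id is None:
        # No KMS key given: let SQS manage the encryption keys.
        return {"SqsManagedSseEnabled": "true"}
    if not 60 <= data_key_reuse_seconds <= 86400:
        raise ValueError("data_key_reuse_seconds must be between 60 and 86400")
    return {
        "KmsMasterKeyId": kms_key_id,   # key ID, ARN, or alias such as "alias/aws/sqs"
        "KmsDataKeyReusePeriodSeconds": str(data_key_reuse_seconds),
    }
```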
Conclusion
SQS is a widely used service. Few modern products get by without SQS, or queues in general, somewhere in their architecture. This is because we are increasingly moving away from monolithic architectures in favor of microservices. Microservices are, by their very nature, distributed and built around the Single Responsibility Principle, which leads to the whole architecture being divided into multiple components. Queues form the basis of asynchronous communication between these components in modern applications.