  • 4 Key Design Principles and Guarantees of Streaming Databases

    Avital Trifsik
    Streaming databases are often a good fit for building real-time apps on top of large-scale data. Designing one is a complex task because of the constraints involved in handling streaming data. What are the 4 principles for building an efficient streaming database for your app?

    1. Auto recovery: Streaming databases are used in highly regulated domains like healthcare and financial systems. Such systems cannot fail, because if they do, the outcomes can be fatal. Since failure is not an option, auto recovery is one of the most critical design principles here: it is what ensures a streaming database can continue to accept queries even if one or more nodes in the cluster fail.

    2. Exactly-once semantics: For streaming databases, auto recovery on its own is not enough. Unlike traditional database design, streaming database design must ensure that results lost or duplicated during a failure do not affect the downstream consumer. Exactly-once semantics guarantees that the effect of each message shows up in the results exactly once, even if the message is redelivered after a failure, so the consumer is oblivious to the incident.

    3. Handling out-of-order records: Good database design treats out-of-order records as a critical concern. Since streaming databases are used in highly sensitive applications where the order of processing matters, they must handle such records gracefully. Common causes of out-of-order records are network delays, unreliable producers, and unsynchronized clocks.

    4. Consistent query results: Streaming databases have to balance data staleness against the level of consistency in order to optimize performance. Consistency on writes is tough to achieve because of the distributed nature of these systems. There are 2 ways to get consistent query results:
       - At the write level: a write is not confirmed until all queries originating from that stream have completed.
       - At the query engine level: the query engine delays the results of the specific queries that require strong consistency guarantees until all the writes are confirmed.
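To make principle 2 more concrete, here is a minimal sketch of one common way to get exactly-once *effects*: remember which message IDs have already been applied and skip redeliveries. The `Account` class, the message IDs, and the amounts are all made up for illustration; a real streaming database would also persist the state change and the ID together atomically.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Toy state store: a balance plus the IDs of messages already applied."""
    balance: int = 0
    applied_ids: set = field(default_factory=set)

    def apply(self, message_id: str, amount: int) -> bool:
        """Apply a deposit once; return False for a redelivered duplicate."""
        if message_id in self.applied_ids:
            return False  # already applied before the failure, so skip it
        # Assumption: in a real system these two updates would be committed
        # together (for example in one transaction), not as separate steps.
        self.balance += amount
        self.applied_ids.add(message_id)
        return True


account = Account()
# "m2" arrives twice, as can happen when a producer retries after a crash.
deliveries = [("m1", 100), ("m2", 50), ("m2", 50), ("m3", 25)]
results = [account.apply(mid, amt) for mid, amt in deliveries]
print(account.balance, results)
```

The duplicate delivery of `m2` is ignored, so the downstream balance is the same as if the failure and retry had never happened.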
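For principle 3, one widely used approach (not necessarily what any particular streaming database does) is to buffer records for a bounded time and release them in event-time order once a watermark has passed them. The sketch below assumes integer event times and a hypothetical `max_delay` tolerance; `ReorderBuffer` is an illustrative name, not a library class.

```python
import heapq


class ReorderBuffer:
    """Buffers events and releases them in event-time order.

    The watermark is (largest event time seen) - max_delay. Events at or
    below the watermark are emitted; events arriving below it are late.
    """

    def __init__(self, max_delay: int):
        self.max_delay = max_delay
        self.max_seen = None
        self.heap = []   # min-heap of (event_time, arrival_seq, payload)
        self.seq = 0     # tie-breaker so equal timestamps keep arrival order
        self.late = []   # events that arrived after the watermark passed them

    def push(self, event_time: int, payload: str) -> list:
        """Add one event; return the events that are now safe to emit."""
        if self.max_seen is not None and event_time < self.max_seen - self.max_delay:
            self.late.append((event_time, payload))
            return []
        heapq.heappush(self.heap, (event_time, self.seq, payload))
        self.seq += 1
        if self.max_seen is None or event_time > self.max_seen:
            self.max_seen = event_time
        watermark = self.max_seen - self.max_delay
        ready = []
        while self.heap and self.heap[0][0] <= watermark:
            t, _, p = heapq.heappop(self.heap)
            ready.append((t, p))
        return ready

    def flush(self) -> list:
        """Release everything still buffered, in order (end of stream)."""
        ready = []
        while self.heap:
            t, _, p = heapq.heappop(self.heap)
            ready.append((t, p))
        return ready


buf = ReorderBuffer(max_delay=5)
arrivals = [(1, "a"), (3, "c"), (2, "b"), (9, "d"), (2, "too-late"), (15, "e")]
emitted = []
for event_time, payload in arrivals:
    emitted.extend(buf.push(event_time, payload))
emitted.extend(buf.flush())
print(emitted)
print(buf.late)
```

The trade-off is the usual one: a larger `max_delay` tolerates more disorder but adds latency, and anything that arrives later than that has to be handled separately (dropped, logged, or used to correct earlier results).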
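The query-engine approach from principle 4 can be sketched roughly as follows: a query that asks for strong consistency blocks until every write accepted so far has been confirmed, while other queries return immediately and may see stale data. `StreamView`, its method names, and the offset counters are assumptions made for this example only.

```python
import threading


class StreamView:
    """Toy materialized view that tracks accepted vs. confirmed writes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._accepted = 0   # number of writes accepted from the stream
        self._confirmed = 0  # number of writes applied to the view
        self._total = 0      # the query result: a running sum

    def accept_write(self) -> int:
        """Register an incoming write that has not been applied yet."""
        with self._cond:
            self._accepted += 1
            return self._accepted

    def confirm_write(self, value: int) -> None:
        """Apply a write to the view and wake up any waiting queries."""
        with self._cond:
            self._total += value
            self._confirmed += 1
            self._cond.notify_all()

    def query(self, strong: bool, timeout: float = 1.0) -> int:
        """Return the sum; strong queries wait for all accepted writes."""
        with self._cond:
            if strong:
                target = self._accepted
                caught_up = self._cond.wait_for(
                    lambda: self._confirmed >= target, timeout
                )
                if not caught_up:
                    raise TimeoutError("writes were not confirmed in time")
            return self._total


view = StreamView()
view.accept_write()
stale = view.query(strong=False)  # returns immediately, may be stale
# Confirm the pending write shortly afterwards from another thread.
threading.Timer(0.05, view.confirm_write, args=(10,)).start()
fresh = view.query(strong=True)   # blocks until the write is confirmed
print(stale, fresh)
```

Only the queries that need strong consistency pay the extra latency, which is the staleness-versus-consistency balance described in the text.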